egegungordu/jaime
A headless Japanese IME (Input Method Editor) engine for Zig projects that provides:
Romaji to hiragana/katakana conversion
eiennni → えいえんに
Full-width character conversion
abc123 → ａｂｃ１２３
Dictionary-based word conversion
かんじ → 漢字
Built-in cursor and buffer management
Based on Google 日本語入力 behavior.
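To make the full-width conversion listed above concrete, here is a minimal sketch in JavaScript, the language used for the web bindings later in this README. It is not jaime's implementation and `toFullWidth` is a hypothetical name. It relies only on a property of Unicode: the printable ASCII characters U+0021 to U+007E have full-width counterparts exactly 0xFEE0 code points higher.

```javascript
// Illustrative only: not taken from jaime's source.
// Maps printable ASCII (U+0021..U+007E) to the Fullwidth Forms block
// (U+FF01..U+FF5E) by adding a fixed offset; other characters pass through.
function toFullWidth(text) {
  const OFFSET = 0xfee0;
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    return cp >= 0x21 && cp <= 0x7e ? String.fromCodePoint(cp + OFFSET) : ch;
  }).join("");
}

console.log(toFullWidth("abc123")); // ａｂｃ１２３
```

Whether the library also widens spaces or other characters is not stated here, so the sketch leaves them untouched.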
Demos: on the terminal with libvaxis, and on the web with WebAssembly.
The minimum Zig version required is 0.13.0.
This project includes the IPADIC dictionary, which is provided under the license terms stated in the accompanying COPYING file. The IPADIC license imposes additional restrictions and requirements on its usage and redistribution. If your application cannot comply with the terms of the IPADIC license, consider using the ime_core module with a custom dictionary implementation instead.
You can add jaime as a dependency in your build.zig.zon file in two ways:
# Get the latest development version from main branch
zig fetch --save git+https://github.com/egegungordu/jaime
# Get a specific release version (replace x.y.z with desired version)
zig fetch --save https://github.com/egegungordu/jaime/archive/refs/tags/vx.y.z.tar.gz
Then instantiate the dependency in your build.zig:
const jaime = b.dependency("jaime", .{});
exe.root_module.addImport("kana", jaime.module("kana")); // For simple kana conversion
exe.root_module.addImport("ime_core", jaime.module("ime_core")); // For IME without dictionary
exe.root_module.addImport("ime_ipadic", jaime.module("ime_ipadic")); // For IME with IPADIC dictionary
The library provides three modules for different use cases:
For simple romaji to hiragana conversions without IME functionality:
const std = @import("std");
const kana = @import("kana");
// Using a provided buffer (no allocations)
var buf: [100]u8 = undefined;
const result = try kana.convertBuf(&buf, "konnnichiha");
try std.testing.expectEqualStrings("こんにちは", result);
// Using an allocator (returns owned slice)
const result2 = try kana.convert(allocator, "konnnichiha");
defer allocator.free(result2);
try std.testing.expectEqualStrings("こんにちは", result2);
For applications that want to use the full-featured IME with the IPADIC dictionary:
const std = @import("std");
const ime_ipadic = @import("ime_ipadic");
// Using owned buffer (with allocator)
var ime = ime_ipadic.Ime(.owned).init(allocator);
defer ime.deinit();
// Alternatively, using borrowed buffer (fixed size, no allocations); pick one of the two
var buf: [100]u8 = undefined;
var ime = ime_ipadic.Ime(.borrowed).init(&buf);
// Common IME operations
const result = try ime.insert("k");
const result2 = try ime.insert("o");
const result3 = try ime.insert("n");
try std.testing.expectEqualStrings("こん", ime.input.buf.items());
// Dictionary Matches
if (ime.getMatches()) |matches| {
// Get suggested conversions from the dictionary
// Returns []WordEntry containing possible word matches
}
try ime.applyMatch(); // Apply the best dictionary match to the current input
// Cursor Movement and Editing
ime.moveCursorBack(1); // Move cursor left n positions
ime.moveCursorForward(1); // Move cursor right n positions
_ = try ime.insert("y"); // Insert at cursor position (result discarded explicitly)
ime.clear(); // Clear the input buffer
try ime.deleteBack(); // Delete one character before cursor
try ime.deleteForward(); // Delete one character after cursor
WARNING
The IPADIC dictionary is subject to its own license terms. If you need to use a different dictionary or want to avoid IPADIC's license requirements, use the ime_core module with your own dictionary implementation.
For applications that want to use IME functionality with their own dictionary implementation:
const std = @import("std");
const ime_core = @import("ime_core");
// Create your own dictionary loader that implements the required interface
const MyDictLoader = struct {
pub fn loadDictionary(allocator: std.mem.Allocator) !Dictionary {
// Your dictionary loading logic here
}
pub fn freeDictionary(dict: *Dictionary) void {
// Your dictionary cleanup logic here
}
};
// Use the IME with your custom dictionary
var ime = ime_core.Ime(MyDictLoader).init(allocator);
defer ime.deinit();
For web applications, you can build the WebAssembly bindings:
# Build the WebAssembly library
zig build
The WebAssembly library uses the IPADIC dictionary by default. For a complete example of how to use the WebAssembly bindings in a web application, check out the web example.
The WebAssembly library provides the following functions:
// Initialize the IME
init();
// Get pointer to input buffer for writing input text
getInputBufferPointer();
// Insert text at current position
// length: number of bytes to read from input buffer
insert(length);
// Get information about the last insertion
getDeletedCodepoints(); // Number of codepoints deleted
getInsertedTextLength(); // Length of inserted text in bytes
getInsertedTextPointer(); // Pointer to inserted text
// Cursor movement and editing
deleteBack(); // Delete character before cursor
deleteForward(); // Delete character after cursor
moveCursorBack(n); // Move cursor back n positions
moveCursorForward(n); // Move cursor forward n positions
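The functions above follow a write, insert, read-back pattern, which can be wrapped in a small helper. The sketch below is an assumption-laden illustration, not part of the library: `createImeClient` is a hypothetical name, `exports` is assumed to be the instantiated module's export object (including its `memory`), and the 64-byte input capacity is borrowed from the usage example in this README rather than from a documented limit.

```javascript
// Hypothetical helper: `exports` is assumed to expose `memory` plus the
// functions listed above. The default capacity of 64 bytes is an assumption.
function createImeClient(exports, inputCapacity = 64) {
  const encoder = new TextEncoder();
  const decoder = new TextDecoder();

  // Build a fresh view on each call: memory.buffer is replaced whenever
  // the WebAssembly memory grows, which would invalidate a cached view.
  const view = (ptr, len) => new Uint8Array(exports.memory.buffer, ptr, len);

  return {
    // Writes `text` into the input buffer, inserts it at the cursor,
    // and returns what the engine reports about the insertion.
    insert(text) {
      const bytes = encoder.encode(text);
      if (bytes.length > inputCapacity) {
        throw new RangeError(
          `input is ${bytes.length} bytes, capacity is ${inputCapacity}`
        );
      }
      view(exports.getInputBufferPointer(), bytes.length).set(bytes);
      exports.insert(bytes.length);
      const inserted = decoder.decode(
        view(exports.getInsertedTextPointer(), exports.getInsertedTextLength())
      );
      return { inserted, deleted: exports.getDeletedCodepoints() };
    },
  };
}
```

With a real module this would be used as `const ime = createImeClient(instance.exports)` after calling `init()`, followed by `ime.insert("k")` for each keystroke.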
Example usage in JavaScript:
// Initialize
init();
// Get input buffer (`memory` is the WebAssembly.Memory backing the module)
const inputPtr = getInputBufferPointer();
const inputBuffer = new Uint8Array(memory.buffer, inputPtr, 64);
// Write and insert characters one by one
const text = "ka";
for (const char of text) {
// Write single character to buffer
const bytes = new TextEncoder().encode(char);
inputBuffer.set(bytes);
// Insert and get result
insert(bytes.length);
// Get the inserted text
const insertedLength = getInsertedTextLength();
const insertedPtr = getInsertedTextPointer();
const insertedText = new TextDecoder().decode(
new Uint8Array(memory.buffer, insertedPtr, insertedLength)
);
// Check if any characters were deleted
const deletedCount = getDeletedCodepoints();
console.log({
inserted: insertedText,
deleted: deletedCount,
});
}
// Final result is "か"
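The loop above logs each insertion separately, so a host application still has to combine those reports into the text it displays. The sketch below shows one way to do that. It assumes the deleted codepoints sit immediately before the cursor and that the inserted text replaces them, which is an interpretation of the API and not something this README states. `applyEdit` is a hypothetical name, and the two reports in the usage lines are made-up values for typing "k" and then "a".

```javascript
// Assumption: `deleted` codepoints are removed just before the cursor,
// then `inserted` is placed at that position. Cursor counts codepoints.
function applyEdit(state, edit) {
  const codepoints = Array.from(state.text); // split by codepoint, not UTF-16 unit
  const start = Math.max(0, state.cursor - edit.deleted);
  const insertedCodepoints = Array.from(edit.inserted);
  codepoints.splice(start, state.cursor - start, ...insertedCodepoints);
  return {
    text: codepoints.join(""),
    cursor: start + insertedCodepoints.length,
  };
}

let state = { text: "", cursor: 0 };
// Hypothetical reports; real values come from the WebAssembly module.
state = applyEdit(state, { deleted: 0, inserted: "k" });
state = applyEdit(state, { deleted: 1, inserted: "か" });
console.log(state.text); // か
```

Keeping the mirrored state as a plain object makes each step easy to log or undo, at the cost of re-splitting the string on every edit.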
To run the test suite:
zig build test --summary all
Contributions are welcome! Please feel free to open an issue or submit a Pull Request.
For those interested in the data structures and algorithms used in this project, or looking to implement similar functionality, the following academic papers provide excellent background: