alac / instant_jp_vocab

Monitor the clipboard and generate vocabulary lists for Japanese sentences

instant_jp_vocab

Monitor the clipboard and generate vocabulary lists for Japanese sentences. Useful with a text hooking utility like Textractor.

"Recording of the live translation behavior"
Live translations, vocabulary lists.

"Recording of the ai questioning behavior"
Ask the AI questions, with the previous lines supplied as context. The shortcut is Ctrl+Enter.

Setup

  1. Install the dependencies by running pip install -r requirements.txt, then download the UniDic dictionary by running python -m unidic download.
  2. Edit settings.toml to enable what you want.
  3. For AI behaviors, either set a Google Gemini API key in the settings file,
    or install Oobabooga's Text-Generation-WebUI, enable its API, and configure it in the settings file.
  4. Start the app by entering python -m jp_vocab_monitor_ui [story_name].

Configuration

You can also configure the program by creating a user.toml in the root directory. Settings are loaded from settings.toml first, and any overlapping values are then overridden by user.toml.

Per-story configuration is also possible by adding a [story_name].toml in the settings folder.
In particular, you can add a synopsis to guide the AI with the ai_translation_context key.

Suggested Models

I've seen decent translation quality with the following local models:
Command-R
Mixtral-8x7B-instruct-cosmopedia-japanese20k

If you plan to ask the AI questions about Japanese, I'd recommend using Google's Gemini Pro via the API. Gemini 1.5's accuracy is great, and the free-tier rate limits should be fine for reading.

Credits

For definitions and katakana readings in non-ML mode, we use a modified Jitendex, which is distributed under a Creative Commons Attribution-ShareAlike license.

For breaking sentences into words, we use fugashi.
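
For readers unfamiliar with fugashi, here is a minimal sketch of how a sentence could be split into words and reduced to a vocabulary list. It is not the project's actual code: the Token type, the tokenize and vocab_list helpers, and the choice to skip particles, auxiliary verbs, and punctuation are all assumptions made for illustration. It relies only on fugashi's Tagger and the surface, lemma, and pos1 fields that UniDic provides.

```python
from typing import Iterable, NamedTuple


class Token(NamedTuple):
    surface: str  # the word as it appears in the sentence
    lemma: str    # dictionary form
    pos: str      # top-level UniDic part of speech, e.g. 名詞 or 動詞


def tokenize(text: str) -> list[Token]:
    """Split a sentence into tokens with fugashi (needs a UniDic dictionary)."""
    from fugashi import Tagger  # third-party; imported lazily

    tagger = Tagger()
    tokens = []
    for word in tagger(text):
        # Unknown words may lack a lemma, so fall back to the surface form.
        lemma = getattr(word.feature, "lemma", None) or word.surface
        pos = getattr(word.feature, "pos1", None) or ""
        tokens.append(Token(word.surface, lemma, pos))
    return tokens


def vocab_list(
    tokens: Iterable[Token],
    skip_pos: tuple[str, ...] = ("助詞", "助動詞", "補助記号"),
) -> list[str]:
    """Return unique lemmas in order of appearance, skipping function words."""
    seen: set[str] = set()
    result = []
    for token in tokens:
        if token.pos in skip_pos or token.lemma in seen:
            continue
        seen.add(token.lemma)
        result.append(token.lemma)
    return result
```

With the dependencies from Setup installed, vocab_list(tokenize(sentence)) would give the dictionary forms to look up for a given sentence.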

License:MIT License


Languages

Language:Python 100.0%