Converts a set of Wiktionary entries to a .mobi dictionary usable by a Kindle.
- A Wiktionary dump is downloaded.
- JWKTL is used to parse the downloaded XML and to create a database of the results.
- Some Java code iterates over the selected entries and generates a text file in which each line has the following format:
word<TAB>definition
- tab2opf is used to convert the text file into a set of OPF and HTML files.
- KindleGen is used to convert the above OPF and HTML files to a MOBI eBook that can be used as a dictionary by a Kindle.
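The tab-separated line format described above can be sketched in a few lines of Java. This is only an illustration, not the project's actual code: the class name `LexiconLine` and the choice to collapse embedded tabs and line breaks into spaces are assumptions, made so that each entry stays on exactly one line.

```java
// Hypothetical helper, not part of the project: formats one dictionary line.
final class LexiconLine {

    // Tabs and line breaks inside a field would break the one-entry-per-line,
    // tab-separated layout, so runs of them are collapsed into a single space.
    static String sanitize(String field) {
        return field.replaceAll("[\\t\\r\\n]+", " ").trim();
    }

    // Builds a line of the form: word<TAB>definition
    static String format(String word, String definition) {
        return sanitize(word) + "\t" + sanitize(definition);
    }

    public static void main(String[] args) {
        System.out.println(format("cat", "a small domesticated feline"));
        System.out.println(format("dog", "a domesticated canine\nused as a pet"));
    }
}
```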
- Clone the repo as well as its submodule.
git clone https://github.com/nyg/wiktionary-to-kindle.git
cd wiktionary-to-kindle
git submodule update --init --recursive
- Build the Java project. Apache Maven is required.
mvn package
- Download the latest English Wiktionary dump. In the following command, the en and latest arguments are the defaults, so they can be omitted. Note that the specified language must be one that JWKTL can parse (currently it only supports en, de, ru). To specify another date, use the YYYYMMDD format. The dump downloaded is pages-articles.xml.bz2.
java -jar target/wiktionary-to-kindle-1.0.0.jar download en latest
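To make the language and date arguments concrete, here is a minimal sketch of how they could map to a dump URL. It assumes the dump is fetched from the public Wikimedia dump server and follows that server's naming layout; whether the tool builds its URL this way is an assumption, and the class `DumpUrl` is hypothetical.

```java
// Hypothetical helper, not part of the project: maps the download arguments
// to a pages-articles dump URL on the public Wikimedia dump server.
final class DumpUrl {

    static String of(String lang, String date) {
        // The date is either the literal "latest" or a YYYYMMDD string.
        if (!date.equals("latest") && !date.matches("\\d{8}")) {
            throw new IllegalArgumentException("date must be 'latest' or YYYYMMDD: " + date);
        }
        String wiki = lang + "wiktionary";
        return "https://dumps.wikimedia.org/" + wiki + "/" + date + "/"
                + wiki + "-" + date + "-pages-articles.xml.bz2";
    }

    public static void main(String[] args) {
        // Same defaults as the command above: en and latest.
        String lang = args.length > 0 ? args[0] : "en";
        String date = args.length > 1 ? args[1] : "latest";
        System.out.println(of(lang, date));
    }
}
```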
- The dump must now be parsed using the following command (as mentioned above, en and latest can be omitted).
java -jar target/wiktionary-to-kindle-1.0.0.jar parse en latest
- Now generate the dictionary text file. As before, the default language is en, but here it is possible to select only the entries of a particular language. For example, to keep only the Greek entries (el) of the English Wiktionary, use the following command:
java -jar target/wiktionary-to-kindle-1.0.0.jar generate el
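The selection step can be pictured as a simple filter over the parsed entries. The sketch below deliberately avoids JWKTL's real API: `Entry` is a hypothetical stand-in for a parsed entry, and the field names are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the project's code: keeps only the entries whose
// language code matches and turns them into word<TAB>definition lines.
final class LanguageFilter {

    // Stand-in for a parsed Wiktionary entry; JWKTL's actual types differ.
    static final class Entry {
        final String word;
        final String lang;
        final String definition;

        Entry(String word, String lang, String definition) {
            this.word = word;
            this.lang = lang;
            this.definition = definition;
        }
    }

    static List<String> lexiconLines(List<Entry> entries, String lang) {
        List<String> lines = new ArrayList<>();
        for (Entry e : entries) {
            if (e.lang.equals(lang)) {
                lines.add(e.word + "\t" + e.definition);
            }
        }
        return lines;
    }
}
```

Calling `lexiconLines(entries, "el")` would then correspond to the `generate el` invocation above, under the stated assumptions.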
- The dictionary file has been generated in dictionaries/lexicon.txt. To convert it into an OPF file, execute the commands below. Python 3 is required. The -s and -t options are the source and target languages respectively.
cd dictionaries
python ../scripts/tab2opf/tab2opf.py -s el -t en lexicon.txt
- Convert the OPF file into a MOBI eBook using KindleGen. Replace <OS> with linux, mac or win. On Windows, the name of the executable is kindlegen.exe.
../scripts/kindlegen_<OS>/kindlegen lexicon.opf
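If this step is scripted, the OS-specific path can be derived rather than typed by hand. The following sketch is not part of the project: the class `KindleGenPath` is hypothetical, and treating every system that is neither Windows nor macOS as linux is an assumption.

```java
import java.util.Locale;

// Hypothetical helper, not part of the project: picks the KindleGen
// executable path for the current operating system.
final class KindleGenPath {

    // Maps a Java "os.name" value to the directory suffix used above.
    static String osDir(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("mac")) {
            return "mac";
        }
        if (os.startsWith("windows")) {
            return "win";
        }
        return "linux"; // assumption: anything else is treated as Linux
    }

    static String path(String osName) {
        String dir = osDir(osName);
        // On Windows the executable is named kindlegen.exe.
        String exe = dir.equals("win") ? "kindlegen.exe" : "kindlegen";
        return "../scripts/kindlegen_" + dir + "/" + exe;
    }

    public static void main(String[] args) {
        System.out.println(path(System.getProperty("os.name")));
    }
}
```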
- If all went well, you should now have the lexicon.mobi file. You can either send it to your Kindle via its Kindle email address, or drag and drop it as you would any other eBook.
- Improving JWKTL's parsing abilities (template support, etc.).
- Finding how to properly structure the OPF entries.