victor-pavlychko / WebsterParser

Convert Webster's Unabridged 1913 dictionary into a more usable format


WebsterParser

A better dictionary for your Mac

In a blog post titled [You’re probably using the wrong dictionary](http://jsomers.net/blog/dictionary), James Somers proposes using Webster’s Unabridged Dictionary, as it provides more evocative and accurate definitions than most modern dictionaries.

The text of the 1913 version has been digitized and can be found on Project Gutenberg. Unfortunately, the text files are in a very arcane format. Because they were created before UTF-8 was commonly used, they rely on a lot of non-standard entities to encode the various accents and special symbols.
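As an illustration of what decoding such entities can look like, here is a minimal sketch. It is not this project's actual code: the `<name/` entity syntax, the entity names in the table, and the `decodeEntities` helper are all assumptions made for the example.

```javascript
// Hypothetical entity table. The names and the `<name/` syntax are
// assumptions for illustration, not taken from this project's source.
const ENTITIES = {
  eacute: '\u00e9', // e with acute accent
  egrave: '\u00e8', // e with grave accent
  ae: '\u00e6',     // ae ligature
  mdash: '\u2014',  // em dash
};

// Replace every known `<name/` entity with its Unicode character.
// Unknown entities are left untouched so they can be spotted later.
function decodeEntities(text) {
  return text.replace(/<([a-z]+)\//gi, (match, name) =>
    Object.prototype.hasOwnProperty.call(ENTITIES, name) ? ENTITIES[name] : match
  );
}

console.log(decodeEntities('caf<eacute/ and <unknown/'));
```

Leaving unknown entities in place rather than dropping them makes it easier to find gaps in the table by searching the output.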

This project parses the original text files and creates a reasonably clean UTF-8 XML version, which can be converted into a Mac dictionary file with Apple’s Dictionary Kit.
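To give a rough idea of the XML side, the sketch below builds a single entry in the shape Apple's dictionary source format expects (`d:entry` with a `d:index` child). The `buildEntry` and `escapeXml` helpers, the id scheme, and the sample text are hypothetical; the project's real output may be structured differently.

```javascript
// Escape characters that are special in XML text and attribute values.
function escapeXml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one dictionary entry. `id` must be unique within the dictionary;
// the id scheme used here is an assumption for the example.
function buildEntry(id, headword, definition) {
  return [
    `<d:entry id="${escapeXml(id)}" d:title="${escapeXml(headword)}">`,
    `  <d:index d:value="${escapeXml(headword)}"/>`,
    `  <h1>${escapeXml(headword)}</h1>`,
    `  <p>${escapeXml(definition)}</p>`,
    `</d:entry>`,
  ].join('\n');
}

console.log(buildEntry('ardor_1', 'Ardor', 'Heat & warmth of passion.'));
```

Entries like this would sit inside a `d:dictionary` root element that declares the `d:` namespace.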

Screenshot of the dictionary

How to build

With Node.js installed, run:

npm install
node index.js

Building the dictionary might take a while (around three minutes on my machine).

Just want the .dictionary file?

Download it from the releases section of this project.

Using it on iOS?

You can use the dictionary file on your iDevice if it is jailbroken. SSH into your device and navigate to /private/var/mobile/Library/Assets/com_apple_MobileAsset_DictionaryServices_dictionary2. On your iDevice, download any stock dictionary that you don't need (select a word -> Define -> Manage -> Download). In your SSH browser, find out which folder was just added and navigate to folderwithcrypticnumber/AssetData. Replace the .dictionary folder there with the webster.dictionary folder, but keep the original folder's name. You should now be able to look up words.

I don't know how to change the name of the dictionary in the list; pointers are welcome.


License: GNU General Public License v3.0

