Imaginatio / langdetect

Statistical language detection with 50 profiles. Forked from http://code.google.com/p/language-detection/

Source of language datasets

DonaldTsang opened this issue:

Where is the source text dataset for the n-gram profiles of those 50 languages? I would like to see whether it differs from the UDHR text used in wooorm/franc#78, and whether it gives more accurate results.