Source of language datasets
DonaldTsang opened this issue
Don Tsang commented
Where is the source text dataset used to build the n-grams for those 50 languages? I would like to see whether it differs from the use of UDHR discussed in wooorm/franc#78, and whether it gives more accurate results.