Nolbir01 / Uyghur-Wordlist

Uyghur Word List

This repository contains various Word Lists and related information.

Files

  • wordlist-Internet-[version].zip

    A list of more than 2 million unique words automatically extracted from the HTML content of many popular Uyghur websites as well as Wikipedia. This word list contains the majority of Uyghur words used on the Internet. Note that it contains many misspelled or erroneous words. The list also includes numerical statistics for each word, such as raw frequency and document frequency. Each line of the file consists of three fields separated by commas:

[Word],[Raw frequency],[Document frequency]

where Raw frequency is the number of times a word occurs across all documents (web pages), and Document frequency is the number of documents that contain the word.
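
The three-field format described above could be read along these lines. This is only a minimal sketch: the names `WordEntry` and `parse_wordlist` are hypothetical, the sample lines are made-up placeholders rather than real entries from the list, and it assumes the extracted file is plain text with one entry per line.

```python
from typing import Iterable, Iterator, NamedTuple


class WordEntry(NamedTuple):
    """One line of the word list (hypothetical helper type)."""
    word: str
    raw_frequency: int       # occurrences across all documents
    document_frequency: int  # number of documents containing the word


def parse_wordlist(lines: Iterable[str]) -> Iterator[WordEntry]:
    """Yield entries from lines shaped like [Word],[Raw frequency],[Document frequency]."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split from the right so that a comma inside the word field,
        # should one occur, does not shift the numeric fields.
        parts = line.rsplit(",", 2)
        if len(parts) != 3:
            continue  # skip lines that do not have three fields
        word, raw, doc = parts
        try:
            yield WordEntry(word, int(raw), int(doc))
        except ValueError:
            continue  # skip lines whose frequency fields are not integers


# Made-up sample lines for illustration only; not taken from the real list.
sample = ["alma,120,45", "kitab,300,80", "", "broken line"]
entries = list(parse_wordlist(sample))
top = max(entries, key=lambda e: e.raw_frequency)
print(top.word, top.raw_frequency, top.document_frequency)
```

Because the list is known to contain misspelled or erroneous words, a consumer might also filter on a minimum document frequency before using the entries; the right threshold depends on the application.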

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License.
