nachocab / words-by-frequency

A repository of words in multiple languages sorted by their frequency

Words by Frequency of Use

Word frequencies come from this website.

Italian

# freq  word    pronunciation
521174  un      (un)
291383  sono    (ˈsono)
204090  cosa    (ˈkɔsa)
186605  come    (ˈkome)
170403  io      (ˈio)
149049  questo  (ˈkwesto)
140200  hai     (ˈai)
140019  bene    (ˈbɛne)
138657  sei     (ˈsɛi)
...
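
As a rough illustration of how lines in this layout could be read, here is a minimal Python sketch. It assumes whitespace-separated columns exactly as in the excerpt above (frequency, word, optional pronunciation in parentheses); the `Entry` type and `parse_line` function are hypothetical names, not part of this repository.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One row of a frequency list (hypothetical helper type)."""
    freq: int
    word: str
    pronunciation: Optional[str]


def parse_line(line: str) -> Optional[Entry]:
    """Parse 'freq word [pronunciation]'; return None for headers, blanks, and '...'."""
    line = line.strip()
    if not line or line.startswith("#") or line == "...":
        return None
    # Split into at most three fields so a pronunciation containing spaces stays whole.
    parts = line.split(None, 2)
    if len(parts) < 2 or not parts[0].isdigit():
        return None
    pron = parts[2].strip() if len(parts) == 3 else None
    # Assumption: the Italian list wraps the pronunciation in parentheses.
    if pron and pron.startswith("(") and pron.endswith(")"):
        pron = pron[1:-1]
    return Entry(int(parts[0]), parts[1], pron)


sample = """\
# freq  word    pronunciation
521174  un      (un)
291383  sono    (ˈsono)
...
"""
entries = [e for e in map(parse_line, sample.splitlines()) if e is not None]
for entry in entries:
    print(entry.freq, entry.word, entry.pronunciation)
```

The same function also accepts the two-column lists (frequency and word only), in which case `pronunciation` is `None`.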

English

The original source is the Carnegie Mellon University Pronouncing Dictionary. Instead of IPA, it uses its own ARPAbet-based phoneme symbols, separated by spaces. The table explaining what each symbol means is on their website.

#  freq  word  pronunciation
6281002  you   Y UW
5685306  i     AY
4768490  the   DH AH
3453407  to    T UW
3048287  a     AH
2879962  it    IH T
2127187  and   AH N D
2030642  that  DH AE T
1847884  of    AH V
1554103  in    IH N
...
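
Because each pronunciation here is a space-separated sequence of phoneme symbols, it can be split into a list for further processing. The sketch below is only an illustration under that assumption; `parse_cmu_line` is a hypothetical helper, and the sample rows are copied from the excerpt above.

```python
from collections import Counter
from typing import List, Optional, Tuple


def parse_cmu_line(line: str) -> Optional[Tuple[int, str, List[str]]]:
    """Return (freq, word, phonemes), or None for header and '...' lines."""
    parts = line.split(None, 2)
    if len(parts) != 3 or not parts[0].isdigit():
        return None
    # The third field holds the phonemes, e.g. "AH N D" -> ["AH", "N", "D"].
    return int(parts[0]), parts[1], parts[2].split()


sample = [
    "#  freq  word  pronunciation",
    "6281002  you   Y UW",
    "3453407  to    T UW",
    "2127187  and   AH N D",
    "...",
]

rows = [r for r in map(parse_cmu_line, sample) if r is not None]

# Count how many of the sampled words contain each phoneme symbol.
phoneme_counts = Counter(p for _, _, phonemes in rows for p in phonemes)
print(phoneme_counts.most_common(3))
```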

French

#  freq word
1622928 de
1622619 je
1348809 est
1128894 pas
1093232 le
1043411 vous
992154  la
927396  tu
909177  que
853927  un
...

Spanish

#  freq word
1109867 de
677127  la
517925  que
514187  y
498562  el
455194  en
358662  a
303229  los
232670  se
204272  las
...

Catalan

The original source is Softcatalà.

#  freq word
6010951 de
4994785 la
4657836 el
4004551 i
3896615 que
3374723 a
3284365 un
2373551 l
2140801 en
1981974 va
...
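
Since every list is sorted by descending frequency, a quick sanity check before sharing a new list might look like the sketch below. It assumes the two-column "freq word" layout shown above; `read_counts` and `is_sorted_by_frequency` are hypothetical helper names, not scripts from this repository.

```python
from typing import List, Tuple


def read_counts(text: str) -> List[Tuple[str, int]]:
    """Collect (word, freq) pairs, skipping the header and '...' lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit():
            rows.append((fields[1], int(fields[0])))
    return rows


def is_sorted_by_frequency(rows: List[Tuple[str, int]]) -> bool:
    """True when no row has a higher frequency than the row before it."""
    return all(a[1] >= b[1] for a, b in zip(rows, rows[1:]))


catalan = """\
#  freq word
6010951 de
4994785 la
4657836 el
...
"""
rows = read_counts(catalan)
print(is_sorted_by_frequency(rows))  # True
```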

If you would like to contribute another language, feel free to send me a message or open a pull request.

License: GNU General Public License v2.0