Perl scripts to rank Wikipedia page titles
This collection of Perl scripts creates files of ranked Wikipedia pages, along the lines of those at http://crosswordnexus.com/wiki. To use:
- Download and extract https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
- Download and extract https://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2
- Run perl WikiExtract.pl to create WikiMonYr.storable
- Run perl WiktionaryExtract.pl to create WiktionaryMonYr.storable
- Run perl final_rankings.pl WikiMonYr.storable to create RankedWiki.txt and FamousNames.txt
- Run perl wiktionary_final_rankings.pl WiktionaryMonYr.storable to create RankedWiktionaryNoInflections.txt
- Run python WiktionaryInflect.py to create RankedWiktionary.txt
- Run perl combine_wiki_wikt.pl to create RankedWikiWikt.txt
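
The run steps above can be chained with a small driver script. The following is only a sketch, not part of this repository: it assumes both dumps are already downloaded and extracted where the scripts expect them, that it is run from the directory containing the scripts, and that `perl` and `python` are on the PATH. The names `STEPS` and `run_pipeline` are hypothetical, and `MonYr` is kept exactly as written above, so substitute the actual file names the extract scripts produce.

```python
import subprocess
import sys

# "MonYr" is copied verbatim from the steps above; replace these with the
# real .storable file names written by the extract scripts (assumption).
WIKI_STORABLE = "WikiMonYr.storable"
WIKT_STORABLE = "WiktionaryMonYr.storable"

# The six run steps, in the order listed in this README.
STEPS = [
    ["perl", "WikiExtract.pl"],
    ["perl", "WiktionaryExtract.pl"],
    ["perl", "final_rankings.pl", WIKI_STORABLE],
    ["perl", "wiktionary_final_rankings.pl", WIKT_STORABLE],
    ["python", "WiktionaryInflect.py"],
    ["perl", "combine_wiki_wikt.pl"],
]


def run_pipeline(steps=STEPS, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Returns the list of command lines as strings. With dry_run=False,
    a failing step raises CalledProcessError and stops the pipeline.
    """
    shown = []
    for cmd in steps:
        line = " ".join(cmd)
        print(line)
        shown.append(line)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return shown


if __name__ == "__main__":
    # Dry run by default; pass --run to actually execute the steps.
    run_pipeline(dry_run="--run" not in sys.argv)
```

Saved under a name of your choosing, the script prints the commands when run without arguments and executes them in order when given `--run`. Stopping at the first failure is deliberate, since each later step reads a file created by an earlier one.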