rogersmarin / wikipedia-extractor

Extracts and cleans text from a Wikipedia database dump and stores the output in a number of similarly sized files in a given directory. This is a mirror of the script by Giuseppe Attardi.
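
As an illustration of the "files of similar size" behaviour described above, here is a minimal sketch of size-based output splitting. It is not taken from the mirrored script: the class name `SizeLimitedWriter`, the helper `split_articles`, the `part_NNNN.txt` naming scheme, and the byte limit are all hypothetical choices made for this example.

```python
import os
import tempfile


class SizeLimitedWriter:
    """Write text to numbered files, starting a new file once a byte limit is reached.

    Hypothetical helper for illustration; not the extractor's actual code.
    """

    def __init__(self, out_dir, max_bytes):
        self.out_dir = out_dir
        self.max_bytes = max_bytes
        self.index = 0        # number of the next file to open
        self.written = 0      # bytes written to the current file
        self.handle = None
        os.makedirs(out_dir, exist_ok=True)

    def _open_next(self):
        if self.handle is not None:
            self.handle.close()
        path = os.path.join(self.out_dir, "part_%04d.txt" % self.index)
        self.handle = open(path, "w", encoding="utf-8")
        self.index += 1
        self.written = 0

    def write(self, text):
        size = len(text.encode("utf-8"))
        # Rotate before writing if this article would push a non-empty file past the limit.
        if self.handle is None or (self.written > 0 and self.written + size > self.max_bytes):
            self._open_next()
        self.handle.write(text)
        self.written += size

    def close(self):
        if self.handle is not None:
            self.handle.close()
            self.handle = None


def split_articles(articles, out_dir, max_bytes):
    """Write each cleaned article to out_dir and return the sorted output file names."""
    writer = SizeLimitedWriter(out_dir, max_bytes)
    for article in articles:
        writer.write(article)
    writer.close()
    return sorted(os.listdir(out_dir))


if __name__ == "__main__":
    # Five made-up "articles" of 41 bytes each, split with a 100-byte limit.
    out_dir = os.path.join(tempfile.mkdtemp(), "extracted")
    print(split_articles(["a" * 40 + "\n"] * 5, out_dir, 100))
```

Articles are kept whole in this sketch, so a file can exceed the limit only when a single article is larger than the limit itself.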

Home Page: http://medialab.di.unipi.it/wiki/Wikipedia_Extractor
