vsinha / wiki_redirect_parser

Parse Wikipedia XML dumps to find all pages that redirect to a given page title.

Wikipedia XML Dump Redirect Parser

In which we parse massive Wikipedia data dumps to find all pages that redirect to a given page.

Usage:

$ python wikiparse.py <file.xml> "Search Query"

XML files can be found here: https://dumps.wikimedia.org/enwiki/latest/

(I used "enwiki-latest-abstract.xml")
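
For readers curious how a search like this can work without loading a multi-gigabyte file into memory, here is a minimal sketch. It is not the code in wikiparse.py. It assumes the MediaWiki export layout used by the "pages-articles" dumps, where each `<page>` has a `<title>` and redirect pages carry a `<redirect title="..."/>` element; other dump files, including the abstract dump mentioned above, may be laid out differently. The names `find_redirects`, `local_name`, and `SAMPLE` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def find_redirects(source, target):
    """Yield titles of pages whose <redirect title="..."> equals target.

    source may be a filename or a binary file-like object.
    Assumes the <page>/<title>/<redirect> layout described above.
    """
    title = None
    redirect_to = None
    # iterparse reads the file incrementally instead of building the whole tree first.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "redirect":
            redirect_to = elem.get("title")
        elif name == "page":
            if redirect_to == target and title is not None:
                yield title
            # Reset state and discard the finished page's contents to limit memory use.
            title = None
            redirect_to = None
            elem.clear()


# Tiny made-up sample in the assumed layout, for illustration only.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>UK</title><redirect title="United Kingdom" /></page>
  <page><title>United Kingdom</title></page>
  <page><title>Britain</title><redirect title="United Kingdom" /></page>
  <page><title>USA</title><redirect title="United States" /></page>
</mediawiki>"""

print(list(find_redirects(io.BytesIO(SAMPLE), "United Kingdom")))
# prints ['UK', 'Britain']
```

Matching here is exact and case-sensitive; a real search would probably want to normalize the query first.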

Happy parsing!
