A Wikipedia search engine
- PyStemmer: fast stemming module
- xml.sax: scalable XML parser
- nltk.corpus: provides the English stopword list
- index.sh: entry-point script that runs the indexing code
- indexer.py: main Python file that controls the indexing flow
- handler.py: file handler that reads the given input file paths and writes the index files
- wikiProcessor.py: Python file that processes a single Wikipedia page
- textProcessor.py: Python file for simple text processing
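
The pieces listed above can be illustrated with a minimal sketch of the indexing flow: stream pages with xml.sax, drop stopwords, and build an inverted index. This is not the code from indexer.py or wikiProcessor.py. The names `PageHandler`, `tokenize`, `build_index`, and `SAMPLE` are hypothetical, the stopword set is a tiny hard-coded stand-in for the nltk.corpus list, and stemming is left out so the example runs on the standard library alone.

```python
import re
import xml.sax
from collections import defaultdict

# Tiny illustrative stopword set (assumption); the project uses nltk.corpus.
STOPWORDS = {"a", "an", "the", "is", "of", "and", "in", "to"}


class PageHandler(xml.sax.ContentHandler):
    """Collects (title, text) pairs from <page> elements as they stream by."""

    def __init__(self):
        super().__init__()
        self.pages = []
        self._tag = None
        self._title = []
        self._text = []

    def startElement(self, name, attrs):
        self._tag = name
        if name == "page":
            # A new page starts: reset the per-page buffers.
            self._title, self._text = [], []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate.
        if self._tag == "title":
            self._title.append(content)
        elif self._tag == "text":
            self._text.append(content)

    def endElement(self, name):
        if name == "page":
            title = "".join(self._title).strip()
            self.pages.append((title, "".join(self._text)))
        self._tag = None


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(pages):
    """Map each token to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, (title, text) in enumerate(pages):
        for token in tokenize(title + " " + text):
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return index


# Hypothetical two-page sample standing in for a Wikipedia dump.
SAMPLE = """<mediawiki>
<page><title>Python</title><text>Python is a programming language.</text></page>
<page><title>Snake</title><text>The python is a snake.</text></page>
</mediawiki>"""

handler = PageHandler()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
index = build_index(handler.pages)
```

Because SAX parsing is event-driven, pages can be processed one at a time without loading the whole dump into memory, which is why it suits large Wikipedia dumps. Looking up `index["python"]` in this sketch gives the ids of the pages containing the word along with their term counts.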