djinn-anthrope / Wiki_Search_Engine

Search Engine for Wikipedia Dump

Wikipedia Search Engine

To run the search engine, please run the following commands:

index.sh is run as follows: ./index.sh <path_to_dump> <path_to_index_folder>

search.sh is run as follows: ./search.sh <path_to_index_folder> <path_to_input_query_file> <path_to_output_file>

Please note that the Wikipedia data dump is not included in this repository.

This project relies on inverted index construction, hash maps, tries, and page ranking based on TF-IDF.
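As a rough illustration of how an inverted index and TF-IDF ranking fit together, here is a minimal in-memory sketch. It is not the code from this repository: the function names (tokenize, build_index, search), the log-scaled term frequency, and the dict-based index are all assumptions made for the example, and it leaves out the trie and any on-disk index layout.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}.

    `docs` is a hypothetical dict mapping a document id to its text.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, num_docs, query, top_k=10):
    """Rank documents by the sum of (1 + log tf) * idf over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Rare terms get a higher weight than common ones.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += (1 + math.log(tf)) * idf
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


if __name__ == "__main__":
    docs = {
        1: "Python is a programming language",
        2: "The python is a large snake",
        3: "Java is a programming language",
    }
    index = build_index(docs)
    print(search(index, len(docs), "python snake"))
```

In this toy corpus, document 2 ranks first for the query "python snake" because it matches both terms and "snake" is the rarer one. A real Wikipedia dump would not fit in memory this way, so the index would have to be written to disk in pieces and merged.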

Languages

Python 99.1%, Shell 0.9%