pengnam / SearchEngine

A simple search engine that supports ranked retrieval, boolean retrieval, and query expansion.

SearchEngine

How to re-build dataset

The dataset exceeds GitHub's 50 MB file size limit, so it cannot be committed as a single file.

I have split it into 10 MB chunks using split -b 10m dataset.csv

To re-build the dataset, run this:

cat dataset/* > dataset/dataset.csv
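
Since the project is written in Python, the same step can also be done without a shell, for example on a system that has no cat. The following is only a minimal sketch, not part of the repository: the function name rebuild_dataset is hypothetical, and it assumes the chunks sit directly in the dataset directory and that their names sort in the order they were written (as the default xaa, xab, ... names produced by split do).

```python
import shutil
from pathlib import Path


def rebuild_dataset(chunk_dir="dataset", output_name="dataset.csv"):
    """Concatenate every chunk in chunk_dir into chunk_dir/output_name.

    Hypothetical helper: assumes chunk names sort in their original order.
    """
    chunk_dir = Path(chunk_dir)
    output_path = chunk_dir / output_name

    # Collect the chunks before the output file is opened. Skip hidden files
    # (as the shell glob does) and any output left over from an earlier run.
    chunks = sorted(
        p for p in chunk_dir.iterdir()
        if p.is_file() and p.name != output_name and not p.name.startswith(".")
    )

    # Copy in binary mode so the bytes are reproduced exactly as split wrote them.
    with open(output_path, "wb") as out:
        for chunk in chunks:
            with open(chunk, "rb") as src:
                shutil.copyfileobj(src, out)

    return output_path
```

Calling rebuild_dataset() from the repository root writes dataset/dataset.csv. Unlike the shell command, this sketch leaves an existing dataset.csv out of the input, so it can be run again safely.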

License:GNU General Public License v3.0


Languages

Language:Python 100.0%