xadityax / Search-engine-TF-IDF

Offline search engine for any corpus. Uses TF-IDF scores for ranking.

Search Engine

Uses TF-IDF scores and dot products to compute the similarity between the user's query and each indexed document in the corpus.

Ranks documents by similarity score and displays the top K most similar ones.
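
The scoring and ranking described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (`tokenize`, `build_index`, `tfidf_vector`, `dot`, `search`), the whitespace tokenizer, and the particular weighting (relative term frequency times `log(N / df)`) are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector as {term: weight}; terms unseen in the corpus are dropped.
    counts = Counter(tokens)
    total = len(tokens)
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def build_index(docs):
    # Compute IDF over the corpus, then one TF-IDF vector per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # count each term once per document
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [tfidf_vector(tokens, idf) for tokens in tokenized]
    return idf, vectors


def dot(a, b):
    # Dot product of two sparse vectors stored as dicts.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def search(query, idf, vectors, k=3):
    # Score every document against the query and return the top k
    # as (document index, score), skipping documents with zero similarity.
    q = tfidf_vector(tokenize(query), idf)
    scores = [(dot(q, v), i) for i, v in enumerate(vectors)]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scores[:k] if score > 0]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "dogs chase the cat",
        "stock markets fell sharply",
    ]
    idf, vectors = build_index(docs)
    for doc_id, score in search("cat mat", idf, vectors, k=2):
        print(doc_id, round(score, 4), docs[doc_id])
```

With this weighting, a term that occurs in every document gets an IDF of zero and does not contribute to the score. A plain dot product also leaves vector length uncorrected; dividing by the vector norms would turn it into cosine similarity, which is a different design choice from the one stated here.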

The query can be any text that is expected to appear in the corpus.

The corpus can be any collection of documents.


Languages

Python 65.4%, HTML 34.6%