Dushanthimadhushika3 / TF-IDF

TF-IDF stands for “Term Frequency-Inverse Document Frequency”. It is a technique for quantifying how important a word is: each word is given a weight that reflects its importance within a document and across the corpus. Here I use TF-IDF on Sinhala documents to measure the similarity between two sets of documents. For each query document, the final output shows the query document, its highest similarity score against the given documents, and the number of the most similar document.
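
The idea can be illustrated with a minimal, standard-library-only sketch. This is not the notebook's actual code: the function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `most_similar`), the whitespace tokenizer, the smoothed IDF formula `log((1 + N) / (1 + df)) + 1`, and the use of cosine similarity as the similarity measure are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization. Real Sinhala text may
    # need a dedicated tokenizer and stop-word list.
    return text.lower().split()


def build_idf(corpus_tokens):
    # Document frequency: in how many documents each term appears.
    n_docs = len(corpus_tokens)
    df = Counter()
    for tokens in corpus_tokens:
        df.update(set(tokens))
    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    # Term frequency (count / document length) times IDF.
    # Terms never seen in the corpus are ignored.
    counts = Counter(tokens)
    total = len(tokens)
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, documents):
    # Returns (index of the most similar document, its similarity score).
    if not documents:
        raise ValueError("documents must not be empty")
    corpus_tokens = [tokenize(d) for d in documents]
    idf = build_idf(corpus_tokens)
    doc_vectors = [tfidf_vector(t, idf) for t in corpus_tokens]
    query_vector = tfidf_vector(tokenize(query), idf)
    scores = [cosine(query_vector, v) for v in doc_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Placeholder English strings; Sinhala strings are handled the same
    # way under the whitespace-tokenization assumption.
    documents = ["cat sat mat", "dog barked loudly", "stock market fell"]
    index, score = most_similar("dog barked", documents)
    print("most similar document number:", index, "similarity:", round(score, 4))
```

Sparse dictionaries keep the sketch dependency-free; for a larger corpus, a library vectorizer would likely be a better fit.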

This repository is not active


Languages

Language:Jupyter Notebook 100.0%