Dushanthimadhushika3 / TF-IDF

TF-IDF stands for "Term Frequency-Inverse Document Frequency". It is a technique for quantifying the words in a set of documents: each word is assigned a weight that signifies its importance within a document and across the corpus. This repository applies TF-IDF to Sinhala documents to measure the similarity between two sets of documents. The final output shows the query document, its highest similarity score against the given documents, and the number of the most similar document.
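The idea described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the notebook's actual code: the function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `most_similar`) are hypothetical, whitespace tokenization is assumed to be adequate for Sinhala text, the IDF uses a smoothed variant, and cosine similarity is assumed as the similarity measure since the description does not name one.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: splitting on whitespace is good enough for this sketch.
    return text.split()


def build_idf(corpus_tokens):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n_docs = len(corpus_tokens)
    doc_freq = Counter()
    for tokens in corpus_tokens:
        doc_freq.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0
            for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    # Term frequency (relative count) times IDF; unseen terms are ignored.
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, documents):
    # Returns (0-based index of the best match, its similarity score).
    doc_tokens = [tokenize(d) for d in documents]
    idf = build_idf(doc_tokens)
    doc_vectors = [tfidf_vector(t, idf) for t in doc_tokens]
    query_vector = tfidf_vector(tokenize(query), idf)
    scores = [cosine(query_vector, v) for v in doc_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

Calling `most_similar(query_text, list_of_documents)` for each query document would give the pieces the description mentions: the query, its highest similarity score, and the position of the matching document (0-based here, so add 1 for a document number).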

nlp python3 nltk sinhala-language tf-idf

This repository is not active

Languages

Language: Jupyter Notebook 100.0%