Harshpatel44 / Cosine-Text-Similarity-Algorithm

This repository contains a cosine text similarity algorithm for comparing two documents.

Cosine-Text-Similarity-Algorithm

This repository contains a cosine text similarity algorithm for comparing two documents.

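Suppose we want to compare two files, 'file1' and 'file2', and we have a third file, 'term_list', containing all the tokens.
The algorithm uses the formula: (a . b) / (||a|| * ||b||)
Here a and b are the term-frequency vectors of 'file1' and 'file2' respectively, with one entry per token in 'term_list'. The numerator a . b is their dot product, and ||a|| and ||b|| are their Euclidean norms.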
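
The formula above can be sketched in Python as follows. This is a minimal illustration and not necessarily the code in this repository: the function names term_frequencies and cosine_similarity, the whitespace tokenization, the lowercasing, and the sample strings are all assumptions made for the example.

```python
import math
from collections import Counter


def term_frequencies(text, term_list):
    """Count how often each term in term_list occurs in text.

    Assumption: tokens are lowercase words separated by whitespace.
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in term_list]


def cosine_similarity(a, b):
    """Return (a . b) / (||a|| * ||b||) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: a document with none of the terms is treated as similarity 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for the contents of 'term_list', 'file1' and 'file2'.
term_list = ["apple", "banana", "cherry"]
a = term_frequencies("apple banana apple", term_list)    # [2, 1, 0]
b = term_frequencies("banana cherry banana", term_list)  # [0, 2, 1]
score = cosine_similarity(a, b)
print(score)  # approximately 0.4
```

With these sample vectors the dot product is 2 and each norm is the square root of 5, so the similarity is about 0.4. Because term frequencies are never negative, the score always falls between 0 (no shared terms) and 1 (identical term proportions).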

Languages

Language:Python 100.0%