jaskier07 / DocumentComparator

Compares PDF documents and visualizes their similarity as a graph. Each document is represented as a TF-IDF vector, and the similarity between two documents is measured with cosine similarity. Visualization is done using the Python library Dash.
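The TF-IDF and cosine similarity steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's actual code: the function names, the whitespace tokenizer, the unsmoothed IDF formula, and the sample texts are all assumptions made for the example. scikit-learn's TfidfVectorizer and cosine_similarity offer a comparable, though differently smoothed, weighting out of the box.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (plain TF-IDF, no smoothing)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical texts standing in for the extracted PDF contents.
texts = [
    "graph theory and network analysis",
    "network analysis with graph algorithms",
    "cooking pasta with tomato sauce",
]
vectors = tfidf_vectors(texts)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # share several terms
sim_02 = cosine_similarity(vectors[0], vectors[2])  # share no terms
```

Documents that share distinctive terms score closer to 1, while documents with no terms in common score 0. Pairwise scores like these are the kind of values that could be used as edge weights in the similarity graph.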

Needed modules:
pip install pdfminer
pip install nltk
pip install numpy
pip install scikit-learn
pip install dash

License: MIT License


Languages

Python 96.5%, CSS 3.5%