kennedyCzar / NLP-PROJECT-BOOK-INSIGHTS-WITH-PLOTLY

A Plotly Dash NLP project. It measures document similarity using Latent Dirichlet Allocation and principal component analysis, followed by K-Means clustering, and presents the results through interactive visualizations.

NLP-PROJECT-BOOK-INSIGHTS-WITH-PLOTLY

The project is hosted live on Heroku.

This project implements a machine learning pipeline for Natural Language Processing (NLP), with visualization done in Plotly Dash. Hovering over a data point shows the book's properties (metadata) and similarity score, a horizontal bar chart of its most frequent words, and the book imprint. The books are processed to extract tokenized and lemmatized features, principal component analysis reduces the dimensionality, and K-Means clustering visualizes relationships among the books.
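As a rough illustration of the tokenization and word-frequency step that feeds the bar chart, here is a minimal sketch using only the Python standard library. The stop-word list, the function names `tokenize` and `top_words`, and the sample text are assumptions made for this example; the project's own script may use a dedicated NLP library and also applies lemmatization, which is not shown here.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only.
STOP_WORDS = {"le", "la", "les", "de", "des", "du", "un", "une", "et", "à", "en"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (accents kept)."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def top_words(text, n=5):
    """Return the n most frequent tokens as (word, count) pairs, highest count first."""
    return Counter(tokenize(text)).most_common(n)


# Placeholder text standing in for the content of one book.
sample = "Le chat dort. Le chien dort et le chat mange."
print(top_words(sample, n=3))
```

The `(word, count)` pairs are already sorted by count, so they could be passed directly to a horizontal bar chart.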

PROJECT WORKFLOW

  • Import and preprocess all 148 French books
  • Stemming and lemmatization of extracted tokens
  • Visualize the most frequent words on hover, returned as an ordered bar plot
  • TF-IDF model
  • Document similarity using cosine distance of book content
    • Principal component analysis
    • K-Means clustering
  • Topic models
    • Latent Dirichlet Allocation
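
The modelling steps in the workflow above can be sketched with scikit-learn as follows. This is a minimal sketch and not the project's actual script: it assumes scikit-learn and NumPy are installed, and the four short documents, the number of clusters and the number of topics are placeholders chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA, LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder corpus (assumption); the real project loads 148 French books.
docs = [
    "le chat dort sur le canapé",
    "le chien dort dans le jardin",
    "la bourse monte et les actions grimpent",
    "les actions chutent quand la bourse baisse",
]

# TF-IDF features: one row per document, one column per term.
tfidf = TfidfVectorizer().fit_transform(docs)

# Pairwise cosine similarity between documents; cosine distance is 1 - similarity.
similarity = cosine_similarity(tfidf)
distance = 1.0 - similarity

# PCA expects a dense array, so the sparse TF-IDF matrix is converted first.
coords = PCA(n_components=2, random_state=0).fit_transform(tfidf.toarray())

# Cluster the documents in the reduced 2-D space.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(coords)

# LDA is fitted on raw term counts rather than TF-IDF weights.
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics = lda.fit_transform(counts)  # per-document topic proportions

print(np.round(similarity, 2))
print(labels)
```

The `coords` and `labels` arrays are the kind of data a scatter plot would use, one point per book colored by cluster, with the similarity scores available for hover text.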

HOW TO USE

git clone https://github.com/kennedyCzar/NLP-PROJECT-BOOK-INSIGHTS-WITH-PLOTLY

Open the script folder in your terminal and run the following command:

python mplot_script.py

Then navigate to http://127.0.0.1:8050/ in your browser.

License: MIT License


Languages

Python 100.0%, Procfile 0.0%