kennedyCzar / Social-Mining-Recommendation-System

Datamining for Big Data

Social-Mining-Recommendation-System

Data mining for Big Data, hosted live on Heroku.

About Project

Content-based recommendation system built on graph mining and unsupervised machine learning methods. The application visualizes how similar different publications are to one another in terms of content.

Using advanced unsupervised machine learning models, we recommend research papers that are worth reading or consulting, based on the content of a prior paper. We use cosine similarity as the metric to build the similarity matrix between the different publications.
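The similarity step can be illustrated with a minimal sketch. It assumes each publication has already been tokenized into a list of words, uses a smoothed TF-IDF weighting, and relies only on the standard library. The function names (`tfidf_vectors`, `cosine_similarity`, `similarity_matrix`, `recommend`) and the toy documents are hypothetical and are not taken from the project code.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(docs):
    """Pairwise cosine similarity between all documents."""
    vectors = tfidf_vectors(docs)
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


def recommend(sim, index, top_n=3):
    """Indices of the top_n documents most similar to document `index`."""
    others = [j for j in range(len(sim)) if j != index]
    others.sort(key=lambda j: sim[index][j], reverse=True)
    return others[:top_n]


# Toy corpus (hypothetical): documents 0 and 1 share most of their vocabulary.
docs = [
    ["graph", "mining", "community", "detection"],
    ["graph", "community", "louvain", "detection"],
    ["protein", "folding", "simulation"],
]
sim = similarity_matrix(docs)
print(recommend(sim, 0, top_n=1))  # [1]
```

For a corpus of the size used here, a library vectorizer would likely be the more practical choice; the hand-written version only makes the arithmetic visible.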

We also use graph mining to find the communities in which a publication can be grouped with highly related papers. The graph modeling approach places highly related research papers in the same community.
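One way to picture the graph step, as a sketch under stated assumptions: publications become nodes, an edge is added whenever the cosine similarity of two publications reaches a threshold, and communities are then found with NetworkX's greedy modularity algorithm. The threshold value, the helper names (`build_similarity_graph`, `detect_communities`), and the toy similarity matrix are assumptions made for illustration and may differ from what the project does.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities


def build_similarity_graph(sim, threshold=0.5):
    """Build an undirected graph with an edge for every pair at or above the threshold."""
    n = len(sim)
    graph = nx.Graph()
    graph.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                graph.add_edge(i, j, weight=sim[i][j])
    return graph


def detect_communities(graph):
    """Return communities as sorted lists of node indices (greedy modularity)."""
    communities = greedy_modularity_communities(graph, weight="weight")
    return sorted(sorted(c) for c in communities)


# Toy similarity matrix (hypothetical): two groups of three closely related papers.
toy_sim = [
    [1.0, 0.9, 0.8, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1, 0.1],
    [0.8, 0.9, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9, 0.8],
    [0.1, 0.1, 0.1, 0.9, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.8, 0.9, 1.0],
]
graph = build_similarity_graph(toy_sim, threshold=0.5)
communities = detect_communities(graph)
print(communities)
```

Papers that land in the same community as a given paper are natural recommendation candidates. Recent NetworkX versions also provide `louvain_communities` and `k_clique_communities` in the same module, which could be swapped in for the Louvain and clique-based variants.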

Project Workflow

  • Import and preprocess all 1269 French books
    • Translate all English research papers to French
    • Update/replace the translated papers with their French versions
  • Stemming & Lemmatization of extracted tokens from each research paper
  • Visualize most frequent words on hover and return output in barplot on Web App
  • TF-IDF model for vectorizing documents into numeric vectors
  • Document Similarity using Cosine distance of paper content
  • Clustering based Recommender System
    • Kernel Principal component analysis
    • Kernel KMeans clustering
  • Graph based Recommender System
    • Greedy approach
    • Louvain Algorithm
    • Clique Based Approach
  • Web GUI Visualization
    • t-SNE 2D Visualization
    • Hover over data points to see
    • Hover over data to see top recommendation based
    • Bar chart for the 15 most frequent words in a research paper
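The clustering branch of the workflow can be sketched roughly as follows. This is an assumption-laden illustration and not the project's code: it projects document features with scikit-learn's `KernelPCA` (RBF kernel) and then runs ordinary `KMeans` on the embedding as a stand-in for Kernel KMeans. The helper names, the `gamma` value, and the toy feature matrix are hypothetical; in the real pipeline the features would be the TF-IDF vectors.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import KernelPCA


def cluster_documents(features, n_clusters=2, gamma=0.5, random_state=0):
    """Embed documents with RBF Kernel PCA, then cluster the embedding with KMeans."""
    embedding = KernelPCA(n_components=2, kernel="rbf", gamma=gamma).fit_transform(features)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit_predict(embedding)
    return embedding, labels


def recommend_from_cluster(labels, index):
    """Recommend every other document that shares a cluster with document `index`."""
    return [j for j, label in enumerate(labels) if label == labels[index] and j != index]


# Toy feature matrix (hypothetical): two well-separated groups of three documents.
features = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 5.2], [5.2, 5.1],
])
embedding, labels = cluster_documents(features, n_clusters=2)
print(recommend_from_cluster(labels, 0))
```

The two-dimensional embedding is also the kind of array a scatter plot in the web GUI could draw, with cluster labels used for coloring.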

How to use

git clone https://github.com/beteko/Social-Mining-Recommendation-System
Change the project directory paths to match your system's directory format.

Open the script folder in your terminal and run the following command:

python mainapp.py
Then navigate to http://127.0.0.1:8050/ in your browser.

Live Demo

A live demo of the application is available on Heroku:

https://graphminerbigdata.herokuapp.com/

Image 1

Challenges

Due to limited server hardware resources, the graph network may take time to load or may crash. This is because Heroku provides limited free space and uptime.

Languages

Jupyter Notebook 87.4%, Python 12.6%