Data Mining for Big Data, hosted live on Heroku
Content-based recommendation system built on graph mining and unsupervised machine learning. The application visualizes how similar different publications are to one another in terms of their content.
Using advanced unsupervised machine learning models, we recommend research papers that are worth reading or consulting, based on the content of a paper the reader has already chosen. We use the cosine measure as the metric to build the similarity matrix between documents.
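The similarity step can be illustrated with a minimal sketch in plain Python. This is not the project's implementation: the whitespace tokenizer, the smoothed IDF formula, the helper names (`tfidf_vectors`, `cosine_similarity`, `similarity_matrix`, `recommend`) and the three toy English titles are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(docs):
    """Square matrix of pairwise cosine similarities."""
    vectors = tfidf_vectors(docs)
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


def recommend(sim, index, top_n=3):
    """Indices of the top_n most similar documents, excluding the query itself."""
    ranked = sorted((j for j in range(len(sim)) if j != index),
                    key=lambda j: sim[index][j], reverse=True)
    return ranked[:top_n]


# Hypothetical toy corpus; the real project works on full paper texts.
papers = [
    "graph mining community detection in social networks",
    "community detection with the louvain algorithm on graphs",
    "kernel methods for clustering text documents",
]
sim = similarity_matrix(papers)
print(recommend(sim, 0, top_n=1))
```

Storing each vector as a dict keeps the sketch dependency-free; on a corpus of real size a sparse-matrix library would likely be a better fit.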
We also use graph mining to find communities of highly related papers. Papers whose content is closely related are grouped into the same community by the graph model.
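One way to picture the graph step is to treat each paper as a node, connect two papers when their similarity reaches a threshold, and then read groups off the graph. The sketch below uses maximal cliques, found with the basic Bron-Kerbosch recursion, as the grouping rule. The threshold of 0.5, the toy similarity matrix and the function names are assumptions for illustration, and the project's own graph construction may differ.

```python
def build_graph(sim, threshold):
    """Adjacency sets: connect i and j when their similarity reaches the threshold."""
    n = len(sim)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            # Nothing can extend the current set, so it is a maximal clique.
            cliques.append(sorted(current))
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & adjacency[node],
                   excluded & adjacency[node])
            # Mark the node as processed so the same clique is not reported twice.
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return sorted(cliques)


# Hypothetical similarity matrix for five papers (values invented for illustration).
toy_sim = [
    [1.0, 0.8, 0.7, 0.1, 0.0],
    [0.8, 1.0, 0.6, 0.2, 0.1],
    [0.7, 0.6, 1.0, 0.1, 0.2],
    [0.1, 0.2, 0.1, 1.0, 0.9],
    [0.0, 0.1, 0.2, 0.9, 1.0],
]
graph = build_graph(toy_sim, threshold=0.5)
groups = maximal_cliques(graph)
print(groups)
```

Cliques can overlap, so a paper may belong to more than one group. Modularity-based methods such as Louvain instead assign each paper to exactly one community.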
- Import and preprocess all 1269 French books
- Convert all English research papers to French
- Update/replace translated papers with French papers
- Stemming and lemmatization of the tokens extracted from each research paper
- Show the most frequent words on hover and render them as a bar plot in the web app
- TF-IDF model for vectorizing documents into numeric features
- Document similarity using the cosine distance between paper contents
- Clustering-based recommender system
  - Kernel principal component analysis
  - Kernel KMeans clustering
- Graph-based recommender system
  - Greedy approach
  - Louvain algorithm
  - Clique-based approach
- Web GUI visualization
  - t-SNE 2D visualization
  - Hover over data points to see
  - Hover over data to see top recommendation based
  - Bar chart of the 15 most frequent words in a research paper
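The Kernel KMeans item in the list above can be sketched using only a kernel matrix. This is a hedged illustration rather than the project's code: the RBF kernel, the `gamma` value, the six toy 2D points and the hand-picked initial labels are all assumptions, and the Kernel PCA step is not shown.

```python
import math


def rbf_kernel_matrix(points, gamma):
    """Pairwise RBF kernel: exp(-gamma * squared Euclidean distance)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q))) for q in points]
        for p in points
    ]


def kernel_kmeans(kernel, initial_labels, n_clusters, max_iter=50):
    """Lloyd-style kernel k-means that works only from the kernel matrix."""
    n = len(kernel)
    labels = list(initial_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(n_clusters)]
        # Mean kernel value inside each cluster (squared norm of the implicit centroid).
        within = [
            sum(kernel[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_dist = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                # Squared feature-space distance from point i to the centroid of cluster c.
                dist = kernel[i][i] - 2 * sum(kernel[i][j] for j in m) / len(m) + within[c]
                if dist < best_dist:
                    best, best_dist = c, dist
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels


# Hypothetical 2D points standing in for vectorized papers.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
kernel = rbf_kernel_matrix(points, gamma=0.5)
# Deliberately rough starting guess: point 2 begins in the wrong cluster.
labels = kernel_kmeans(kernel, initial_labels=[0, 0, 1, 1, 1, 1], n_clusters=2)
print(labels)
```

Kernel k-means can stall in a poor local optimum, so real runs usually try several random initializations and keep the best result. Papers that end up with the same label are then natural candidates to recommend to one another.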
git clone https://github.com/beteko/Social-Mining-Recommendation-System
Adjust the project's directory paths to match your operating system's path format.
Open the script folder in your terminal and run the following command:
python mainapp.py
Navigate to http://127.0.0.1:8050/ in your browser.
A live demo of the application is available on Heroku:
https://graphminerbigdata.herokuapp.com/
Because of limited server hardware resources, the graph network may take a while to load or may crash. This is because Heroku provides limited free space and uptime.