raklugrin01 / Book-Recommendation-with-EDA

This repository contains an exploratory data analysis of the Book-Crossing dataset along with a machine learning implementation.

Project Description

In this project we analyse and preprocess the Book-Crossing dataset collected by Cai-Nicolas Ziegler and apply machine learning to recommend books similar to one you have already read. All of the code is written in Python using various libraries. The open-source library SciPy is used for preprocessing and Scikit-Learn is used for creating the model.

Project Contents

  1. Exploratory Data Analysis
  2. Different ways of building Recommendation system
  3. Model and Flask API

Resources Used

1. Exploratory Data Analysis

  • Visualising explicit rating counts (for rating values 1-10)

  • Visualising the top 30 most-read books

  • Visualising the top 30 most-read books with their average ratings

  • Visualising the top 30 years with the most books published

  • Visualising the top 30 authors with the most books

  • Visualising the age distribution of the users

  • Extra analysis

    • Some plots and word clouds that aren't shown here can be found in the notebook

2. Different ways of building Recommendation system

  1. Popularity-based

    These systems simply recommend the most popular items to users. Popularity-based systems are the simplest of all and have minimal computational requirements. However, because they do not make personalized recommendations based on a specific user's likes and behavior, they tend to be less accurate than content-based or collaborative filtering systems. This type of recommendation is performed in the notebook, where the output is the 10 most popular books.
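A minimal sketch of the popularity-based idea, assuming that "popular" means "rated by the most users" (the notebook may define popularity differently). The sample rows and the `most_popular` helper are hypothetical and only illustrate the technique:

```python
from collections import Counter

# Hypothetical (user_id, book_title, rating) rows, invented for illustration;
# the notebook works on the Book-Crossing ratings table instead.
ratings = [
    (1, "Book A", 8),
    (2, "Book A", 9),
    (3, "Book A", 7),
    (1, "Book B", 10),
    (2, "Book B", 6),
    (3, "Book C", 5),
]


def most_popular(rows, n=10):
    """Return the n titles with the most ratings, most-rated first."""
    counts = Counter(title for _, title, _ in rows)
    return [title for title, _ in counts.most_common(n)]


# Every user gets the same list, which is why this approach is not personalized.
top_books = most_popular(ratings, n=2)
print(top_books)
```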
  2. Content-based

    Content-based systems depend on external information to create user and item profiles, and this information might not be easily available. They also do not take users' behavioral information into account and ignore the fact that a user's interests and preferences may change over time.

  3. Collaborative Filtering

    • Memory-based/ Neighborhood-based

      Memory-based recommendation systems fall into two categories, user-based and item-based. Both are easy to implement: a similarity measure such as cosine similarity or Pearson correlation is used to find the most similar users or items in the data.

    • Model-based/Matrix Factorization

      The model-based collaborative filtering approach employs dimensionality reduction techniques such as matrix factorization (Singular Value Decomposition (SVD), Principal Component Analysis (PCA), and latent factor models) to discover hidden concepts and their relationship with users and items.

    • Hybrid Approach

      Memory-based and model-based collaborative filtering approaches can be combined in practice to exploit the benefits each approach provides. Content-based and collaborative filtering approaches can also be combined in various ways to achieve greater synergy between them.
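The two collaborative filtering flavours described above can be sketched side by side with NumPy. This is an illustration under stated assumptions, not the notebook's code: the rating matrix `R` is invented, and the helper names `item_cosine_similarity` and `low_rank_approximation` are hypothetical. The first helper is the memory-based, item-based variant using cosine similarity; the second is a model-based variant using a truncated SVD.

```python
import numpy as np

# Hypothetical user-by-book rating matrix (rows: users, columns: books).
# 0 means "not rated"; the values are invented for illustration only.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])


def item_cosine_similarity(matrix):
    """Memory-based: cosine similarity between every pair of book columns."""
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for unrated books
    unit = matrix / norms
    return unit.T @ unit


def low_rank_approximation(matrix, k):
    """Model-based: rank-k reconstruction of the matrix via truncated SVD."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


item_sim = item_cosine_similarity(R)

# Most similar *other* book to book 0: mask the diagonal, then take the max.
masked = item_sim.copy()
np.fill_diagonal(masked, -np.inf)
nearest_to_first = int(np.argmax(masked[0]))

# Keep two latent factors; entries that were 0 in R now hold predicted scores.
R_hat = low_rank_approximation(R, k=2)
print(nearest_to_first)
print(np.round(R_hat, 2))
```

Pearson similarity would follow the same pattern as the cosine helper after subtracting each column's mean. A hybrid system could, for example, blend the scores from both helpers.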

3. Model and Flask API

  • Model :-

    Scikit-Learn's Nearest Neighbors model is built using the collaborative filtering approach. SciPy is used to create a compressed sparse row (CSR) matrix from the pivot table, and the model is fit on that matrix with the brute-force algorithm and cosine distance as the metric.

  • Flask API :-

    1. Clone the project, download Book_names_with_urlM.csv from the output section, and place it in the directory containing the model

        git clone https://github.com/raklugrin01/Book-Recommendation-with-EDA

    2. Install Flask

        pip install flask

    3. Run the Python file

        python api.py
      
  • Testing result :-

    Given a book title as input, the API returns 10 books as recommendations.
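The model described in this section (pivot table, CSR matrix, brute-force nearest neighbours with the cosine metric) can be sketched roughly as follows. The sample data, the column names `user_id`, `title` and `rating`, and the `recommend` helper are assumptions made for illustration; the notebook and `api.py` may be organised differently.

```python
import pandas as pd
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Hypothetical ratings; the column names are assumptions, not the notebook's.
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "title": ["Book A", "Book B", "Book A", "Book B",
              "Book C", "Book D", "Book C", "Book D"],
    "rating": [9, 8, 8, 9, 7, 10, 10, 7],
})

# Pivot table: one row per book, one column per user, 0 for "not rated".
pivot = ratings.pivot(index="title", columns="user_id", values="rating").fillna(0)

# Compressed sparse row matrix built from the pivot table.
matrix = csr_matrix(pivot.values)

# Brute-force nearest neighbours using cosine distance.
model = NearestNeighbors(metric="cosine", algorithm="brute")
model.fit(matrix)


def recommend(title, n=10):
    """Return up to n titles closest to `title`, excluding the title itself."""
    row = pivot.index.get_loc(title)
    k = min(n + 1, len(pivot))  # +1 because the query book is its own neighbour
    _, indices = model.kneighbors(matrix[row], n_neighbors=k)
    return [pivot.index[i] for i in indices.flatten() if i != row][:n]


print(recommend("Book A", n=1))
```

An API endpoint could call a helper like `recommend` with the submitted book title and return the resulting list.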

References

Please ⭐ the repository if it helped you in any way.

License: Apache License 2.0

Languages

Jupyter Notebook 99.9%, Python 0.1%