doguilmak / Book-Recommendation-with-Collaborative-Filtering

This project builds a book recommendation algorithm that uses the Pearson correlation between users' ratings to recommend books.

Book Recommendation with Collaborative Filtering


Picture Source: Jessica Stillman


Introduction

In the realm of book recommendation with collaborative filtering, Pearson correlation is a fundamental statistical measure employed to quantify the similarity between the preferences of different users. Collaborative filtering, the core technique of this project, aims to predict a user's book interests by leveraging the preferences and behaviors of users with similar tastes.

The Pearson correlation coefficient ($\rho$) is a statistical measure that quantifies the linear relationship between two variables, $X$ and $Y$. The formula for calculating Pearson correlation is as follows:


$$ \rho = \frac{\sum{(X_i - \bar{X})(Y_i - \bar{Y})}}{\sqrt{\sum{(X_i - \bar{X})^2} \sum{(Y_i - \bar{Y})^2}}} $$


Here's a breakdown of the terms in the formula:

  • $\rho$: Pearson correlation coefficient.

  • $X_i$ and $Y_i$: Individual data points in the datasets X and Y.

  • $\bar{X}$ and $\bar{Y}$: Mean (average) of the respective datasets X and Y.

The numerator is the sum, over all data points, of the product of each pair's deviations from their respective means. The denominator is the square root of the product of the two sums of squared deviations, which normalizes the result so that it does not depend on the scale of either dataset.
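As a minimal sketch, the formula can be translated term by term into NumPy. The function name `pearson` and the two rating lists are hypothetical and chosen only for illustration; they do not come from the project's notebook.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation computed directly from the formula above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of X from its mean
    dy = y - y.mean()  # deviations of Y from its mean
    denom = np.sqrt((dx ** 2).sum() * (dy ** 2).sum())
    if denom == 0:
        # The coefficient is undefined when either variable is constant.
        return float("nan")
    return float((dx * dy).sum() / denom)


# Hypothetical ratings given by two users to the same five books.
user_a = [8, 6, 9, 4, 7]
user_b = [7, 5, 10, 3, 6]
rho = pearson(user_a, user_b)
```

The result should agree with `np.corrcoef(user_a, user_b)[0, 1]`, which is a convenient cross-check when the formula is written by hand.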

The resulting Pearson correlation coefficient ranges from -1 to 1:

  • $\rho = 1$: Perfect positive correlation.

  • $\rho = -1$: Perfect negative correlation.

  • $\rho = 0$: No linear correlation.


In collaborative filtering for book recommendations, Pearson correlation is commonly used to measure the similarity between user preferences based on their ratings. A positive correlation suggests similar tastes, while a negative correlation implies dissimilar preferences.
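One way to obtain user-to-user similarities is to pivot the ratings into a book-by-user matrix and let pandas compute pairwise correlations. The sketch below assumes the column names 'User-ID', 'Book-Title', and 'Book-Rating'; the ratings themselves are made up, and the `min_periods=2` threshold is an illustrative choice rather than the project's setting.

```python
import pandas as pd

# Hypothetical ratings; user 1 also rated book "D", which nobody else did.
ratings = pd.DataFrame(
    {
        "User-ID": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
        "Book-Title": ["A", "B", "C", "D", "A", "B", "C", "A", "B", "C"],
        "Book-Rating": [8, 6, 9, 5, 7, 5, 10, 3, 9, 2],
    }
)

# Rows are books, columns are users; unrated books become NaN.
matrix = ratings.pivot_table(
    index="Book-Title", columns="User-ID", values="Book-Rating"
)

# DataFrame.corr compares columns pairwise and ignores rows where either
# user has no rating, so each coefficient uses only co-rated books.
user_sim = matrix.corr(method="pearson", min_periods=2)
```

In this toy data, users 1 and 2 rank the shared books in a similar order and get a positive coefficient, while users 1 and 3 disagree and get a negative one.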


Getting Started

To get started, import the essential libraries (Pandas, NumPy, and warnings) and load the books and ratings datasets with Pandas. In the data cleaning phase, select the relevant columns (e.g., 'ISBN', 'Book-Title', 'Book-Author', 'Book-Rating') and remove duplicate book titles to improve data quality.
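The loading and cleaning steps might look like the sketch below. To keep it runnable without the real files, it reads two tiny in-memory CSV strings that stand in for the books and ratings datasets. The sample rows, the extra `Year` column, and the choice of an inner merge on 'ISBN' are assumptions for illustration, not details taken from the project.

```python
import io

import pandas as pd

# In-memory stand-ins for the books and ratings files (hypothetical data).
books_csv = io.StringIO(
    "ISBN,Book-Title,Book-Author,Year\n"
    "111,Dune,Frank Herbert,1965\n"
    "222,Dune,Frank Herbert,1990\n"
    "333,Emma,Jane Austen,1815\n"
)
ratings_csv = io.StringIO(
    "User-ID,ISBN,Book-Rating\n"
    "1,111,9\n"
    "2,333,7\n"
    "2,222,8\n"
)

# Read ISBN as text so leading zeros and check characters are preserved.
books = pd.read_csv(books_csv, dtype={"ISBN": str})
ratings = pd.read_csv(ratings_csv, dtype={"ISBN": str})

# Keep only the relevant book columns.
books = books[["ISBN", "Book-Title", "Book-Author"]]

# Remove duplicate titles, keeping the first edition encountered.
books = books.drop_duplicates(subset="Book-Title", keep="first")

# Attach title and author to each rating.
data = ratings.merge(books, on="ISBN", how="inner")
data = data[["User-ID", "ISBN", "Book-Title", "Book-Author", "Book-Rating"]]
```

Note that with this ordering, ratings attached to a dropped duplicate edition (ISBN 222 here) fall out of the merged table. If those ratings matter, merging first and deduplicating afterwards is an alternative worth considering.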

For collaborative filtering, first implement User-Based Collaborative Filtering: group the data by 'User-ID', sort by book title, calculate Pearson correlation coefficients between users, and select the top correlated users. Then move on to Item-Based Collaborative Filtering: aggregate ratings, generate recommendations based on weighted scores, and display the top book recommendations. Finally, evaluate the collaborative filtering models with performance metrics and present the top recommended books to users.
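The sketch below is one plausible reading of the "top correlated users" and "weighted scores" steps, not the notebook's actual code. It picks the users most positively correlated with a target user, then scores each book the target has not rated by a similarity-weighted average of those neighbours' ratings. The function name `recommend`, its parameters, and the sample ratings are all hypothetical.

```python
import pandas as pd

# Hypothetical ratings as (user, book, rating) tuples.
ratings = pd.DataFrame(
    [
        (1, "A", 9), (1, "B", 4), (1, "C", 8),
        (2, "A", 8), (2, "B", 3), (2, "C", 9), (2, "D", 10), (2, "E", 2),
        (3, "A", 10), (3, "B", 5), (3, "C", 7), (3, "D", 8),
        (4, "A", 2), (4, "B", 9), (4, "C", 3), (4, "E", 10),
    ],
    columns=["User-ID", "Book-Title", "Book-Rating"],
)


def recommend(ratings, target, top_n_users=2, top_n_books=5):
    """Score unrated books by similarity-weighted neighbour ratings."""
    matrix = ratings.pivot_table(
        index="Book-Title", columns="User-ID", values="Book-Rating"
    )
    # Pearson correlation of every user with the target, on co-rated books.
    sims = matrix.corr(min_periods=2)[target].drop(target)
    # Keep only the most positively correlated users.
    neighbours = sims[sims > 0].nlargest(top_n_users)
    # Candidate books: those the target user has not rated yet.
    cand = matrix.loc[matrix[target].isna(), neighbours.index]
    # Weighted sum of neighbour ratings (missing ratings are skipped).
    weighted = cand.mul(neighbours, axis=1).sum(axis=1)
    # Total similarity of the neighbours who actually rated each book.
    sim_total = cand.notna().mul(neighbours, axis=1).sum(axis=1)
    scores = (weighted / sim_total).dropna()
    return scores.sort_values(ascending=False).head(top_n_books)


recs = recommend(ratings, target=1)
```

Here users 2 and 3 are the closest neighbours of user 1, so book "D", which both of them rated highly, ranks above book "E". Dividing by the summed similarity keeps the scores on the original rating scale, which makes them easier to compare across books rated by different numbers of neighbours.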


Usage

  1. Clone the repository to your local machine.
  2. Load and preprocess the book and rating datasets.
  3. Implement collaborative filtering algorithms to generate book recommendations.
  4. Evaluate the performance and present the results.

Contribution Guidelines

Contributions to this project are encouraged. Feel free to contribute by optimizing algorithms, improving data preprocessing, or enhancing the recommendation performance.


Contact Me

If you have questions or feedback, please contact me:

  • Twitter: Doguilmak
  • Mail address: doguilmak@gmail.com
