tabaraei / Recommender-System

Implementing a Recommender system using Matrix Factorization Collaborative Filtering


Recommender System


In this project, our goal is to recommend the top 5 movies to a user with Matrix Factorization, using the MovieLens 20M dataset. You can download the dataset from Kaggle. The project has four steps, and the corresponding .py files should be run in this order:

  • Preprocess the data (preprocess.py)
  • Data analysis (analyzie.py)
  • Create model (learning.py)
  • Predict user rating (predict.py)

1- Preprocess:

Since processing 20 million ratings takes a long time, we use a subset of the dataset. The first step is to shrink the data to a manageable size by keeping only the most common users and movies (those with the most ratings). Next, an id correction remaps the remaining user and movie identifiers to consecutive integers from 0 to N-1. Finally, we shuffle the data and split it into training and test sets. The result is shown below:

[Figure: preprocessing output]
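The three preprocessing steps described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the code in preprocess.py: the function name `preprocess`, its parameters, the 80/20 split, and the toy ratings are all assumptions made for the example.

```python
import random
from collections import Counter


def preprocess(ratings, n_users, n_movies, test_fraction=0.2, seed=0):
    """Shrink, re-index, shuffle and split (user_id, movie_id, rating) tuples.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    # 1) Keep only the users and movies that appear most often.
    user_counts = Counter(u for u, _, _ in ratings)
    movie_counts = Counter(m for _, m, _ in ratings)
    top_users = [u for u, _ in user_counts.most_common(n_users)]
    top_movies = [m for m, _ in movie_counts.most_common(n_movies)]

    # 2) Id correction: map the surviving ids onto 0..N-1.
    user_index = {u: i for i, u in enumerate(top_users)}
    movie_index = {m: j for j, m in enumerate(top_movies)}
    subset = [
        (user_index[u], movie_index[m], r)
        for u, m, r in ratings
        if u in user_index and m in movie_index
    ]

    # 3) Shuffle, then split into training and test sets.
    rng = random.Random(seed)
    rng.shuffle(subset)
    cut = int(len(subset) * (1 - test_fraction))
    return subset[:cut], subset[cut:]


# Toy data (made up for illustration): (userId, movieId, rating)
toy = [(10, 500, 4.0), (10, 501, 3.5), (22, 500, 5.0), (22, 501, 2.0), (35, 777, 1.0)]
train, test = preprocess(toy, n_users=2, n_movies=2)
```

On the toy data, the user and the movie that appear only once are dropped, and the remaining ids are remapped to 0 and 1.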

2- Data analysis:

To understand the data better, we plot the distributions of key attributes: ratings, movie genres, and movie release years.

[Figures: distribution of release year, distribution of ratings, distribution of genres]
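The counts behind such distribution plots can be computed as in the sketch below. It assumes the MovieLens conventions that genres are stored as a `|`-separated string and that the release year appears in parentheses at the end of the title. The helper names and the sample rows are illustrative, and the plotting itself is left out.

```python
import re
from collections import Counter

# Matches a four-digit year in parentheses at the end of a title.
YEAR_PATTERN = re.compile(r"\((\d{4})\)\s*$")


def genre_counts(genre_strings):
    """Count how many movies carry each genre label."""
    counts = Counter()
    for genres in genre_strings:
        counts.update(genres.split("|"))
    return counts


def year_counts(titles):
    """Count movies per release year; titles without a year are skipped."""
    counts = Counter()
    for title in titles:
        match = YEAR_PATTERN.search(title)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Sample rows in the MovieLens movies.csv style: (title, genres)
movies = [
    ("Toy Story (1995)", "Adventure|Animation|Children|Comedy|Fantasy"),
    ("Jumanji (1995)", "Adventure|Children|Fantasy"),
    ("Heat (1995)", "Action|Crime|Thriller"),
]
by_genre = genre_counts(g for _, g in movies)
by_year = year_counts(t for t, _ in movies)

# The rating distribution is a plain frequency count of rating values.
by_rating = Counter([4.0, 3.5, 4.0, 5.0])
```

Each `Counter` can then be passed to any plotting library as a bar chart.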

3- Create model:

In this step, the model is created and trained for 25 epochs. We then plot the loss function, which is Mean Squared Error (MSE) in this project, over the epochs.

[Figure: loss per epoch]

After the 25th epoch:

[Figure: MSE after the 25th epoch]
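To illustrate the idea behind this step, the sketch below factorizes the ratings into one latent vector per user and one per movie, and minimizes squared error with plain stochastic gradient descent. It is not the model in learning.py: the optimizer, the global-mean offset, the L2 regularization, the hyperparameters, and the function name `train_mf` are all assumptions made for the example.

```python
import random


def train_mf(train, n_users, n_movies, k=8, epochs=25, lr=0.01, reg=0.02, seed=0):
    """Matrix factorization trained with SGD on squared error.

    train: list of (user, movie, rating) with ids in 0..N-1.
    Returns user factors P, movie factors Q, the global mean mu,
    and the per-epoch training MSE.
    """
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_movies)]
    mu = sum(r for _, _, r in train) / len(train)  # global mean rating

    data = list(train)  # copy so the caller's list is not reordered
    history = []
    for _ in range(epochs):
        rng.shuffle(data)
        squared_error = 0.0
        for u, m, r in data:
            pred = mu + sum(p * q for p, q in zip(P[u], Q[m]))
            err = r - pred
            squared_error += err * err
            # Gradient step on both factor vectors, with L2 shrinkage.
            for f in range(k):
                pu, qm = P[u][f], Q[m][f]
                P[u][f] += lr * (err * qm - reg * pu)
                Q[m][f] += lr * (err * pu - reg * qm)
        history.append(squared_error / len(data))
    return P, Q, mu, history


# Toy ratings (made up): two users with opposite tastes on two movies.
toy_train = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 5.0)]
P, Q, mu, history = train_mf(toy_train, n_users=2, n_movies=2, k=2, lr=0.05)
```

`history` holds one MSE value per epoch, which is the quantity a loss-per-epoch plot would show.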

4- Predict user rating:

For a given user, the model predicts ratings for the movies that user has not seen yet. We then recommend the 5 movies with the highest predicted ratings.

[Figures: recommended movies, user taste]
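The recommendation step can be sketched as scoring every unseen movie and keeping the five best. The example assumes a factorization model in which a predicted rating is a global mean plus the dot product of a user vector and a movie vector. The function name `recommend` and all the numbers below are hypothetical.

```python
def recommend(user_vec, movie_vecs, seen, mu=0.0, top_n=5):
    """Return the top_n unseen movies as (movie_id, predicted_rating) pairs."""
    scored = []
    for movie_id, movie_vec in enumerate(movie_vecs):
        if movie_id in seen:
            continue  # only movies the user has not rated are candidates
        pred = mu + sum(p * q for p, q in zip(user_vec, movie_vec))
        scored.append((pred, movie_id))
    scored.sort(reverse=True)  # highest predicted rating first
    return [(movie_id, pred) for pred, movie_id in scored[:top_n]]


# Made-up latent factors for one user and seven movies.
user_vec = [1.0, 0.0]
movie_vecs = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
    [-0.3, 0.4],
    [0.7, 0.0],
    [0.1, 0.1],
    [0.6, 0.2],
]
recs = recommend(user_vec, movie_vecs, seen={0}, mu=3.5)
top_ids = [movie_id for movie_id, _ in recs]
```

Movie 0 would score highest here, but it is excluded because the user has already seen it.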


License: MIT License

