Movie-Recommender

Recommender systems are used in many services, such as Netflix and Amazon Prime. We often rate products online, and the preferences we express and the data we share (explicitly or not) are used by recommender systems to generate recommendations. The two main types of recommender systems are:

Collaborative filters
Content-based filters

Content-Based Filter

This type of filter does not involve other users, only ourselves. Based on what we like, the algorithm picks items with similar content to recommend to us. The recommendations will be less diverse, but the filter works whether or not the user rates things.

Dataset: the 250 top-rated movies from IMDB, i.e. 250 observations (the movies) and 38 columns. However, I want my recommender to be based only on the movie director, main actors, genre and plot, so these are the only columns I considered in the modeling.
Similarity Function: cosine similarity, i.e. the similarity between two vectors u and v is the ratio of their dot product to the product of their magnitudes.
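The cosine similarity described above can be sketched in a few lines of plain Python. This is only an illustration, not the notebook's code: the three movies, their tokens, and the helper names cosine_similarity and recommend are made up, and it assumes the director, actors, genre and plot keywords have already been joined into one string per movie and are compared as simple word counts.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their magnitudes.

    u and v are sparse count vectors stored as {term: count} mappings.
    """
    dot = sum(count * v.get(term, 0) for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up toy rows (assumption): director, actors, genre and plot keywords
# already merged into a single bag of words per movie.
movies = {
    "Movie A": "director_x actor_p actor_q crime drama heist",
    "Movie B": "director_x actor_p crime thriller heist",
    "Movie C": "director_y actor_r animation family adventure",
}

# One word-count vector per movie.
vectors = {title: Counter(text.split()) for title, text in movies.items()}


def recommend(title, top_n=2):
    """Return the top_n other movies, most similar first, with their scores."""
    scores = [
        (other, cosine_similarity(vectors[title], vec))
        for other, vec in vectors.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


print(recommend("Movie A"))
```

Here "Movie B" ranks first for "Movie A" because the two share a director, an actor and two keywords, while "Movie C" shares nothing and scores 0.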

Collaborative Filter

This type of filter is based on users' ratings. It recommends movies that we haven't watched yet but that users similar to us have watched and liked. To determine whether two users are similar, the filter considers the movies both of them watched and how they rated them. By looking at the items in common, the algorithm predicts the rating a user would give a movie they haven't watched yet, based on similar users' ratings.
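The prediction step described above can be sketched as a similarity-weighted average. This is a minimal illustration under stated assumptions, not the notebook's method: the users and ratings are made up, user_similarity and predict_rating are hypothetical helpers, and cosine similarity over co-rated movies is just one possible way to compare two users.

```python
import math

# Made-up ratings (assumption): user -> {movie: rating}.
ratings = {
    "alice": {"Movie A": 5.0, "Movie B": 4.0},
    "bob": {"Movie A": 5.0, "Movie B": 4.0, "Movie C": 4.5},
    "carol": {"Movie A": 1.0, "Movie B": 5.0, "Movie C": 2.0},
}


def user_similarity(a, b):
    """Cosine similarity computed only over movies both users rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][m] * ratings[b][m] for m in common)
    norm_a = math.sqrt(sum(ratings[a][m] ** 2 for m in common))
    norm_b = math.sqrt(sum(ratings[b][m] ** 2 for m in common))
    return dot / (norm_a * norm_b)


def predict_rating(user, movie):
    """Average other users' ratings of the movie, weighted by similarity."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = user_similarity(user, other)
        weighted_sum += sim * other_ratings[movie]
        weight_total += sim
    # None signals that no similar user has rated this movie.
    return weighted_sum / weight_total if weight_total > 0 else None


print(predict_rating("alice", "Movie C"))
```

Because alice's ratings match bob's more closely than carol's, the predicted rating for "Movie C" lands nearer to bob's 4.5 than to carol's 2.0.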

Dataset: MovieLens datasets. It contains 27,753,444 ratings and 1,108,997 tag applications across 58,098 movies. These data were created by 283,228 users between January 09, 1995 and September 26, 2018. The ratings are on a 5-star scale with half-star increments (0.5 to 5.0). I used only two files from the MovieLens datasets: ratings.csv and movies.csv.
Similarity Function: K-Nearest Neighbours, used to find similar movies.
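A nearest-neighbour search over movies can be sketched as follows. This is a hedged illustration, not the notebook's implementation: the rows are made up (they only mimic the userId, movieId and rating columns of ratings.csv), nearest_movies and cosine_distance are hypothetical helpers, and the choice of cosine distance with missing ratings treated as 0 is an assumption.

```python
import math
from collections import defaultdict

# Made-up (userId, movieId, rating) rows (assumption), shaped like ratings.csv.
rows = [
    (1, 10, 5.0), (1, 20, 4.5), (1, 30, 1.0),
    (2, 10, 4.0), (2, 20, 4.0), (2, 30, 2.0),
    (3, 10, 1.0), (3, 20, 1.5), (3, 30, 5.0),
]

# Represent each movie as a sparse vector of the ratings it received.
movie_vectors = defaultdict(dict)
for user_id, movie_id, rating in rows:
    movie_vectors[movie_id][user_id] = rating


def cosine_distance(u, v):
    """1 - cosine similarity; a user missing from a vector counts as 0."""
    dot = sum(r * v.get(user, 0.0) for user, r in u.items())
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def nearest_movies(movie_id, k=1):
    """Return the k movies closest to movie_id as (movieId, distance) pairs."""
    target = movie_vectors[movie_id]
    distances = [
        (other, cosine_distance(target, vec))
        for other, vec in movie_vectors.items()
        if other != movie_id
    ]
    return sorted(distances, key=lambda pair: pair[1])[:k]


print(nearest_movies(10, k=1))
```

In this toy data, movie 20 is the nearest neighbour of movie 10 because the same users rated both highly. The movieId values could then be mapped to titles through movies.csv.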

About

https://bisariautkarsh.github.io/Movie-Recommender/
