deep-learning machine-learning nextjs numpy react tensorflow typescript

Neural Network Content-Based Recommendation Model

Live version


The model is trained on user content (e.g., a preference score from 0 to 5 for each of the defined genres) and movie content (e.g., a binary attribute set to 1.0 when a movie belongs to a given genre) to predict movie recommendations. In this case specifically, since we are not going to have users, the focus is on how similar a movie is to other movies.
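To make the two kinds of content concrete, here is a minimal sketch of how such vectors could be built. The genre names, the function names, and the choice of a per-genre average rating as the user preference are assumptions for illustration, not necessarily what the notebook does.

```python
# Hypothetical subset of genres; the dataset described here has 14.
GENRES = ["Action", "Comedy", "Drama"]


def movie_vector(genres):
    """Binary vector: 1.0 where the movie belongs to the genre, else 0.0."""
    return [1.0 if g in genres else 0.0 for g in GENRES]


def user_vector(rated):
    """Per-genre mean rating (0-5) for one user; 0.0 for unrated genres.

    rated: list of (genres_of_movie, rating) pairs (assumed input shape).
    """
    totals = [0.0] * len(GENRES)
    counts = [0] * len(GENRES)
    for genres, rating in rated:
        for idx, flag in enumerate(movie_vector(genres)):
            if flag:
                totals[idx] += rating
                counts[idx] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]
```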

The dataset contains:

847 movies
397 users
14 genres
25,521 ratings

Feature engineering is applied by repeating some ratings to boost underrepresented genres.
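One possible way to repeat ratings for underrepresented genres is sketched below. The function name, the `target_share` threshold, and the rule of adding exactly one extra copy are all assumptions; the notebook may use a different oversampling rule.

```python
from collections import Counter


def boost_underrepresented(ratings, movie_genres, target_share=0.5):
    """Return the ratings plus one extra copy of each rating on a rare genre.

    ratings: list of (user_id, movie_id, rating) tuples (assumed shape).
    movie_genres: dict mapping movie_id -> list of genre names.
    A genre counts as rare when it has fewer ratings than
    target_share * (ratings of the most common genre).
    """
    counts = Counter(g for _, movie_id, _ in ratings for g in movie_genres[movie_id])
    if not counts:
        return list(ratings)
    threshold = target_share * max(counts.values())
    rare = {g for g, c in counts.items() if c < threshold}

    boosted = list(ratings)
    for row in ratings:
        if rare.intersection(movie_genres[row[1]]):
            boosted.append(row)  # hypothetical rule: duplicate once
    return boosted
```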

Goal

Compute the distance between movies using the feature vectors learned by the neural network model. The distances are stored in a 2D matrix, so the distance between two movies is a simple lookup:

i: Movie_i index in the matrix

j: Movie_j index in the matrix

distance: matrix[i][j]
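A minimal sketch of how such a matrix could be built from the learned feature vectors is shown below. It assumes Euclidean distance and a plain list-of-lists layout; the notebook may use a different metric (for example squared distance) or NumPy arrays.

```python
import math


def distance_matrix(features):
    """Build a symmetric matrix where matrix[i][j] is the distance
    between the feature vectors of movie i and movie j."""
    n = len(features)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(features[i], features[j])  # Euclidean (assumed)
            matrix[i][j] = d
            matrix[j][i] = d  # distance is symmetric
    return matrix
```

Because the matrix is symmetric with a zero diagonal, only the upper triangle needs to be computed.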

To improve the accuracy of the recommendation, instead of just picking the movie closest to movie_i, say movie_j, we can take a list of movies and, for each movie in the dataset, sum its distances to the movies in the list. The movie with the smallest sum is then recommended.
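The summed-distance selection could look roughly like the sketch below. The function name and the `k` parameter are hypothetical, and skipping the selected movies themselves is an assumption (otherwise a selected movie, at distance 0 from itself, would tend to win).

```python
def recommend(matrix, selected, k=1):
    """Return the k movie indices with the smallest summed distance
    to the movies in `selected`.

    matrix: 2D distance matrix, matrix[i][j] = distance(movie_i, movie_j).
    selected: list of movie indices picked by the user.
    """
    selected_set = set(selected)
    totals = []
    for candidate in range(len(matrix)):
        if candidate in selected_set:
            continue  # assumed: never recommend an already selected movie
        total = sum(matrix[s][candidate] for s in selected)
        totals.append((total, candidate))
    totals.sort()  # smallest summed distance first
    return [candidate for _, candidate in totals[:k]]
```

For example, with three selected movies this returns the movie whose combined distance to all three is lowest.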

One disadvantage of this approach is that it might give inaccurate predictions when the selected movies are too dissimilar from each other. Otherwise, it can be an effective way of finding a similar movie with more accuracy.

The UI implementation follows the exact method defined above, allowing the user to pick 3 movies.

Results

You can find the recommendation UI deployed at https://movies-recommender-tau.vercel.app.

In the output folder you can find:

recommendations_movies_data.json: The data for all movies used for training and for prediction. This is generated by the scripts/populate-recommendations.js file.
recommendations.json: The recommendations matrix used for generating recommendations. This is the result from the training notebook recommender.ipynb.
top_movies.json: The list of IDs of the most-rated movies, i.e., the most popular ones. This is generated by the scripts/top-most-rated.js file. These are also the movies displayed in the UI.

More details can be found in the recommender notebook.

Credits

Most of the resources come from assignments on the Machine Learning Specialization.

The dataset is derived from the MovieLens ml-latest-small dataset.

F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4: 19:1–19:19.

About

🔥 A machine learning recommendation application.

https://movies-recommender-tau.vercel.app
