There are 2 repositories under movielens-data-analysis topic.
Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
Movie Recommender based on the MovieLens Dataset (ml-100k) using item-item collaborative filtering.
Exploratory Data Analysis (EDA) will be uploaded to this repository. Libraries such as Pandas, Matplotlib, Seaborn and Plotly will be used for data analysis.
This repository contains analysis of IMDb data from multiple sources, covering movies, cast, box office revenues, movie brands and franchises.
Building a movie recommender system with factorization machines on Amazon SageMaker.
Some code to analyse MovieLens datasets.
MovieLens dataset analysis using Hadoop and PySpark.
Implementation of Spotify's Generalist-Specialist score on the MovieLens dataset.
Data analysis on Big Data. Uses various datasets from 1M to 100M records, including the MovieLens dataset. Covers basic and advanced MapReduce using MongoDB.
Spark MLlib: Collaborative Filtering Movie Recommendation System
Analysis of MovieLens Dataset in Python
Movie recommendation system based on collaborative filtering using Apache Spark
Data analysis on Big Data. Uses various datasets from 1M to 100M records, including the MovieLens dataset. Covers basic and advanced MapReduce using Hadoop.
Analytics on movie data: filters movies in the drama genre and finds the most-rated movies, the number of users, and the average rating for each age range, etc. Visualizes the count and age of moviegoers with the Matplotlib library and shows movies, age range, and average rating.
Recommendation systems are well-known machine learning systems that use data to predict and suggest items, helping users choose from the huge number of items offered to them.
This is one of my final projects for the HarvardX Data Science Professional Certificate Program. As the title suggests, it is on the GroupLens database colloquially known as MovieLens. The goal of the project is to predict ratings with an RMSE below 0.86490. I was able to surpass the goal with 3 different models. Happy reading!
This is a project made as part of my data science master's program to analyze and draw inferences from MovieLens data.
Created visualizations of the MovieLens data set using matrix factorization http://www.yisongyue.com/courses/cs155/2018_winter/assignments/project2.pdf
A recommendation algorithm capable of accurately predicting how a user will rate a movie they have not yet viewed, based on their historical preferences. The models and EDA are based on the MovieLens 1M dataset.
Contains my custom implementation of various machine learning models and analysis.
Data analysis and movie recommendation on the OpenMovie dataset using the shell, Python, the cosine similarity algorithm, PySpark, and Apache Hadoop.
Project to determine the ratings for a movie using both the Spark and Hadoop ecosystems.
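Several of the repositories listed above mention item-item collaborative filtering, cosine similarity, and RMSE as an evaluation target. As a rough illustration of how those pieces fit together, here is a minimal sketch in Python with NumPy. It is not taken from any of the listed repositories: the function names (`item_similarity`, `predict_rating`, `rmse`) and the tiny ratings matrix are hypothetical, and a rating of 0 is assumed to mean "not rated".

```python
import numpy as np


def item_similarity(ratings):
    """Cosine similarity between the item columns of a users x items matrix.

    Assumption: 0 means "not rated", so unrated entries add nothing to the dot product.
    """
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for items nobody rated
    normalized = ratings / norms
    sim = normalized.T @ normalized
    np.fill_diagonal(sim, 0.0)  # an item should not vote for itself
    return sim


def predict_rating(ratings, sim, user, item):
    """Similarity-weighted average of the ratings this user gave to other items."""
    rated = ratings[user] > 0
    weights = sim[item, rated]
    total = weights.sum()
    if total == 0:
        return None  # no similar rated items to base a prediction on
    return float(weights @ ratings[user, rated] / total)


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


# Hypothetical toy data: 4 users x 4 movies, ratings on a 1-5 scale.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

sim = item_similarity(ratings)
pred = predict_rating(ratings, sim, user=0, item=2)
print(f"Predicted rating for user 0 on movie 2: {pred:.3f}")
```

On real MovieLens data the ratings matrix is large and sparse, so a practical implementation would likely use sparse matrices or a distributed engine such as Spark rather than a dense NumPy array.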