vishank94 / Movie-Recommendation-System

Movie Recommendation System

Table of Contents

  1. Introduction
  2. Approach
  3. Dependencies
  4. Running the Code
  5. Directory Structure

Introduction

This repository contains a solution to the recommendation-system coding challenge.

Approach

  1. This recommendation system uses data from the IMDB/MovieLens dataset.
  2. Concepts used: PageRank, content-based recommendation, and collaborative filtering.
  3. Machine learning terminology you will come across: one-hot encoding, cross-validation, and the R-squared metric.
  4. The notebook compares Linear Regression and Decision Tree models for predicting users' movie ratings.
  5. ML pipeline: data wrangling -> exploratory data analysis -> feature engineering -> baseline model -> best model.
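The terminology in item 3 can be illustrated with a small standard-library sketch. It is not the notebook's code: the genre labels and ratings are made-up toy data, and all function names are hypothetical. The "model" here is a per-category mean, which is what a least-squares fit on one-hot columns (without an intercept) reduces to for a single categorical feature; the notebook itself uses Linear Regression and Decision Trees models.

```python
from statistics import mean


def one_hot(values):
    """Map each categorical value to a 0/1 indicator row."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def k_fold(n, k):
    """Yield (train_idx, test_idx); fold i holds indices i, i+k, i+2k, ..."""
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test


def fit_one_hot_means(rows, y):
    """Least-squares weights for one-hot rows: the mean target per column."""
    weights = []
    for c in range(len(rows[0])):
        members = [y[i] for i, row in enumerate(rows) if row[c] == 1]
        weights.append(mean(members) if members else 0.0)
    return weights


def predict(rows, weights):
    """Dot product of each one-hot row with the fitted weights."""
    return [sum(x * w for x, w in zip(row, weights)) for row in rows]


# Toy data (hypothetical): one genre label and one rating per row.
genres = ["Action", "Comedy", "Drama"] * 4
ratings = [2, 3, 5, 2, 3, 5, 3, 4, 4, 3, 4, 4]

_, X = one_hot(genres)

# 2-fold cross-validation: fit on one half, score R-squared on the other.
scores = []
for train_idx, test_idx in k_fold(len(ratings), 2):
    w = fit_one_hot_means([X[i] for i in train_idx], [ratings[i] for i in train_idx])
    preds = predict([X[i] for i in test_idx], w)
    scores.append(r_squared([ratings[i] for i in test_idx], preds))

print("R-squared per fold:", scores)
```

An R-squared of 1 means perfect predictions, 0 means no better than predicting the mean rating, and negative values mean worse than that baseline.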

Dependencies

Python libraries: re, ast, time, heapq, decimal, operator, subprocess, numpy, scipy, pandas, seaborn, networkx, rpy2, itertools, matplotlib, datetime, collections, sklearn, surprise

R libraries: doMC, Kmisc, igraph, data.table

Running the Code

  1. The Jupyter notebook recommendationSystem.ipynb (Python kernel) is the master file.
  2. It makes use of:
     • wd_um_graph.txt, generated by weightedDirectedUserGraph.ipynb (R kernel)
     • wu_movie_graph.txt, generated by weightedUndirectedMovieGraph.ipynb (R kernel)
  3. The repository directory structure given below must be maintained for the code to run successfully.
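PageRank is listed among the concepts under Approach, and the two graph files above are presumably its input. Their on-disk format is not described in this README, so the following is only a minimal sketch of weighted PageRank by power iteration over in-memory `(source, target, weight)` tuples. The function name, the damping factor of 0.85, and the example edges are assumptions, not the notebook's code.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration.

    edges: iterable of (source, target, weight) with non-negative weights.
    Returns a dict mapping each node to its rank; ranks sum to 1.
    """
    edges = list(edges)
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    n = len(nodes)

    # Total outgoing weight per node, used to normalise each edge.
    out_weight = {node: 0.0 for node in nodes}
    for s, _, w in edges:
        out_weight[s] += w

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes without outgoing weight spread their rank uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for s, t, w in edges:
            if out_weight[s] > 0:
                new_rank[t] += damping * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank


# Hypothetical toy graph: "c" receives most of the weighted links.
example_edges = [("a", "b", 1.0), ("a", "c", 3.0), ("b", "c", 2.0), ("c", "a", 1.0)]
ranks = pagerank(example_edges)
print(sorted(ranks, key=ranks.get, reverse=True))
```

For an undirected graph such as the movie graph, one option is to add each edge in both directions before calling the function.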

Directory Structure

The directory structure for my repo is as follows:

├── README.md
├── Data
│   ├── u.data
│   ├── u.genre
│   ├── u.info
│   ├── u.item
│   ├── u.occupation
│   └── u.user
├── Files
│   └── *
└── Scripts
    ├── recommendationSystem.ipynb
    ├── weightedDirectedUserGraph.ipynb
    └── weightedUndirectedMovieGraph.ipynb
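As a quick check that the layout is in place, the ratings file can be parsed with the standard library alone. This sketch assumes Data/u.data is the unmodified MovieLens 100K file, which is tab-separated with the columns user id, item id, rating, and timestamp. The `Rating` tuple, the `parse_ratings` helper, and the sample rows are illustrative and not part of the repository.

```python
import csv
from collections import namedtuple

# Column layout assumed from the MovieLens 100K distribution.
Rating = namedtuple("Rating", "user_id item_id rating timestamp")


def parse_ratings(lines):
    """Parse tab-separated rating rows from any iterable of text lines."""
    ratings = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row:  # skip blank lines
            continue
        user_id, item_id, rating, timestamp = row
        ratings.append(Rating(int(user_id), int(item_id), int(rating), int(timestamp)))
    return ratings


# Illustrative rows in the assumed layout.
sample = ["196\t242\t3\t881250949\n", "186\t302\t3\t891717742\n"]
parsed = parse_ratings(sample)
print(parsed[0])

# With the repository checked out, the same helper reads the real file
# (path relative to the repository root):
# with open("Data/u.data", newline="") as f:
#     all_ratings = parse_ratings(f)
```

Because the helper accepts any iterable of lines, it works equally on an open file handle or on an in-memory list.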