MovieLens Dataset and Neo4j

config/: contains the Docker Compose script
cypher/: contains the Cypher code
data/: contains the MovieLens dataset (CSV files)
data_preprocessing.py: data preprocessing Python script
requirements.txt: Python libraries used
utils.py: utilities used in this project

Data Setup

Create a data/ folder in the repository root.
Download the CSV files from https://www.kaggle.com/rounakbanik/the-movies-dataset and place them in the data/ folder.

Data preprocessing

To preprocess the MovieLens dataset, run the following command:

python data_preprocessing.py
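
As an illustration of the kind of cleanup this step involves: in the Kaggle CSV files, columns such as genres hold a stringified list of dictionaries rather than plain values. The sketch below shows one way such a cell could be turned into a list of names. The helper name parse_names is hypothetical, and this is not necessarily how data_preprocessing.py does it.

```python
import ast


def parse_names(cell):
    """Return the 'name' values from a stringified list of dicts.

    Hypothetical helper for illustration only; malformed or empty
    cells yield an empty list instead of raising.
    """
    if not isinstance(cell, str) or not cell.strip():
        return []
    try:
        # literal_eval safely parses Python literals without executing code
        items = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return []
    if not isinstance(items, list):
        return []
    return [item["name"] for item in items
            if isinstance(item, dict) and "name" in item]


# Example cell in the format used by the genres column
example = "[{'id': 9648, 'name': 'Mystery'}, {'id': 878, 'name': 'Science Fiction'}]"
print(parse_names(example))  # ['Mystery', 'Science Fiction']
```

The resulting names map naturally onto Genre nodes connected to each Movie node.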

API to receive results from Neo4j queries

Run the API with the following command:

python -m flask run --reload

To run a Cypher query, send it in the JSON body of a POST request to /query:

POST /query HTTP/1.1
Host: localhost:5000
Content-Type: application/json

{
    "query": "MATCH (m:Movie)-[:HAS_GENRE]->(g:Genre {name: 'Mystery'}) WHERE m.budget > 60000000 RETURN m"
}
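
The same request can be sent from Python. The following is a minimal sketch using only the standard library. The function names build_query_request and run_query are hypothetical, and it assumes the API answers with a JSON body, which the request above does not confirm.

```python
import json
import urllib.request

# Host and port taken from the example request above
API_URL = "http://localhost:5000/query"


def build_query_request(cypher, url=API_URL):
    """Build (but do not send) a POST request carrying the Cypher query."""
    body = json.dumps({"query": cypher}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(cypher, url=API_URL):
    """Send the query; assumes the API replies with JSON."""
    with urllib.request.urlopen(build_query_request(cypher, url)) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_query_request(
    "MATCH (m:Movie)-[:HAS_GENRE]->(g:Genre {name: 'Mystery'}) "
    "WHERE m.budget > 60000000 RETURN m"
)
```

With the API running, calling run_query with the same Cypher string sends the request and returns the decoded response.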

Get similar movies

POST /jaccard/similarity HTTP/1.1
Host: localhost:5000
Content-Type: application/json

{
    "title": "Cloud Atlas"
}
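
The endpoint name suggests that similarity is measured with the Jaccard index, which for two sets A and B is the size of their intersection divided by the size of their union. The sketch below illustrates that measure on two made-up genre sets. It is an assumption about the general technique, not the project's actual implementation, and the choice of which movie attributes to compare (genres, keywords, cast) is left open.

```python
def jaccard(a, b):
    """Jaccard index of two collections: |A & B| / |A | B|.

    Returns 0.0 when both collections are empty, to avoid dividing by zero.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical genre sets for two movies (illustrative values only)
movie_a = {"Drama", "Science Fiction", "Mystery"}
movie_b = {"Science Fiction", "Mystery", "Thriller"}

print(jaccard(movie_a, movie_b))  # 2 shared / 4 distinct = 0.5
```

A score of 1.0 means the two sets are identical and 0.0 means they share nothing, so ranking candidates by descending score yields the most similar movies first.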

About

This repository processes movie data (cast, crew, movies, genres, keywords) and loads the processed output into Neo4j.
