kapsali29 / MovieLensNeo4j

This repository processes movie data (cast, crew, movies, genres, keywords) and loads the processed output into Neo4j.


Movie Lens Dataset and Neo4j

  • config/: contains the Docker Compose script
  • cypher/: contains the Cypher code
  • data/: contains the Movie Lens dataset
  • data_preprocessing.py: Python script for data preprocessing
  • requirements.txt: Python libraries used
  • utils.py: contains utilities used in this project

Data Setup

  1. Create a data/ folder
  2. Download the CSV files from https://www.kaggle.com/rounakbanik/the-movies-dataset and place them in the data/ folder

Data preprocessing

To preprocess the Movie Lens dataset, run the following command:

python data_preprocessing.py
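
As an illustration of the kind of work this step involves, here is a minimal sketch of flattening one nested column. It assumes that columns such as genres store a stringified list of dictionaries with a name key, as in the Kaggle CSV files. The helper parse_names is hypothetical and is not necessarily how data_preprocessing.py does it.

```python
import ast


def parse_names(cell):
    """Return the 'name' values from a stringified list of dicts.

    Hypothetical helper: assumes cells look like
    "[{'id': 16, 'name': 'Animation'}]". Missing or malformed
    cells yield an empty list instead of raising.
    """
    if not isinstance(cell, str) or not cell.strip():
        return []
    try:
        items = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return []
    if not isinstance(items, list):
        return []
    return [
        item["name"]
        for item in items
        if isinstance(item, dict) and "name" in item
    ]
```

A list of names in this form maps naturally onto one Genre or Keyword node per name, with a relationship back to the movie.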

API to receive results from Neo4j Queries

Run the API using this command: python -m flask run --reload

POST /query HTTP/1.1
Host: localhost:5000
Content-Type: application/json

{
    "query": "MATCH (m:Movie)-[:HAS_GENRE]->(g:Genre {name: 'Mystery'}) WHERE m.budget > 60000000 RETURN m"
}
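
The same request can be sent from Python. The sketch below uses only the standard library and assumes the API is running locally on port 5000, as in the request above. The function names are illustrative, and the response format is not documented here, so the body is returned as raw text.

```python
import json
import urllib.request

# Host and port as shown in the raw HTTP request above.
API_BASE = "http://localhost:5000"


def build_query_request(cypher, base_url=API_BASE):
    """Build a POST /query request carrying a Cypher statement as JSON."""
    body = json.dumps({"query": cypher}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(cypher, base_url=API_BASE):
    """Send the request and return the raw response body as text."""
    request = build_query_request(cypher, base_url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the API running, calling run_query with the Cypher statement from the request above returns whatever the /query endpoint responds with.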

Get similar movies

POST /jaccard/similarity HTTP/1.1
Host: localhost:5000
Content-Type: application/json

{
    "title": "Cloud Atlas"
}
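
For reference, Jaccard similarity between two sets is the size of their intersection divided by the size of their union. The sketch below shows the measure in plain Python. Which movie attributes the endpoint compares (genres, keywords, cast) is an assumption, and the titles and feature sets are made-up placeholders, not data from the dataset.

```python
def jaccard_similarity(a, b):
    """Return |a ∩ b| / |a ∪ b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def most_similar(title, features, top_n=3):
    """Rank the other titles by Jaccard similarity to the given title.

    `features` maps a title to a set of attributes (hypothetical shape).
    Ties are broken alphabetically so the ordering is deterministic.
    """
    target = features[title]
    scores = [
        (other, jaccard_similarity(target, attrs))
        for other, attrs in features.items()
        if other != title
    ]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Placeholder data for illustration only.
example_features = {
    "Movie A": {"Drama", "Mystery", "Science Fiction"},
    "Movie B": {"Drama", "Mystery"},
    "Movie C": {"Comedy"},
}
```

In this placeholder data, Movie B ranks above Movie C for Movie A because it shares two of three attributes while Movie C shares none.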


