cj2001 / spooky_movie_graph

Kaggle Challenge: Is a Movie Really Cursed???

This repository supports the Kaggle Cursed Movie challenge. The goal of the challenge is to predict whether a movie is cursed, based on a variety of graph features within Neo4j.

Contact Info

Name: Clair J. Sullivan, Data Science Advocate at Neo4j

Email: clair.sullivan@neo4j.com

Twitter: @CJLovesData1

Contents

  • ./cypher_queries/

    • create_rdf_data.cql: Cypher query that creates the RDF data used in this challenge. You do not need to run it, since the data is already provided (see below); the script only shows how the data file was created.
    • graph_import.cql: You will create one graph database populated with two different datasets (see below). The first three queries import the RDF data. The final query imports the CSV data.
  • ./data/

    • movies-small.nt: The starting movie list, collected from Wikidata and stored as RDF in N-Triples format.
    • curse_data_mined.csv: Hand-collected supplemental information about the movies above.
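
If you want to look at the RDF data before importing it, the N-Triples format is simple enough to inspect line by line: each line holds one subject, predicate, and object, terminated by a period. The snippet below is a minimal sketch and is not part of this repository. The `parse_ntriples` helper, the simplified regular expression, and the sample triples are all assumptions made for illustration; the pattern only handles IRI subjects and does not cover the full N-Triples grammar.

```python
import re

# Simplified N-Triples line pattern: <subject> <predicate> object .
# Assumption: subjects are IRIs (no blank nodes). This is a sketch,
# not a complete N-Triples parser.
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def parse_ntriples(lines):
    """Yield (subject, predicate, object) tuples from N-Triples lines."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith('#'):
            continue
        match = TRIPLE_RE.match(line)
        if match:
            yield match.groups()


# Hypothetical sample data for illustration; not taken from movies-small.nt.
sample = [
    '<http://example.org/movie/1> <http://example.org/title> "Example Movie" .',
    '<http://example.org/movie/1> <http://example.org/director> <http://example.org/person/7> .',
    '# a comment line',
]

triples = list(parse_ntriples(sample))
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

To try this on the real file, pass an open file handle instead of the sample list, for example `parse_ntriples(open("data/movies-small.nt", encoding="utf-8"))`. For anything beyond a quick look, a dedicated RDF library or the Neo4j import in graph_import.cql is the safer route.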

References

Please create an issue in this repo if you find any errors!

About

License: MIT License