zixuan-x / neo4j-data-processing

Analyze movie-related data using Neo4j and more!



Neo4j Data Processing

Analyze movie-related data using Neo4j and more!

Table of Contents
  1. About The Project
  2. Usage
  3. Contributors
  4. Contributing
  5. License
  6. Contact
  7. Acknowledgements

About The Project

This big data project uses the Neo4j graph database to analyze movie data from IMDb and provide insight into movies, actors, and customers.

We ask questions like:

  • What is the max/min/avg of the MovieLens ratings for “Avatar”?
  • How many directors have Christian Bale and Michael Caine worked with?
  • Based on all the movies a specific customer has rated, which actors or actresses does that customer probably like the most, so that we can recommend them in the future?
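
To make the first and third questions concrete, here is a minimal Python sketch of the underlying logic on in-memory data. It is not the project's Cypher implementation: the function names, the dictionary shapes, and the sample titles and actors are all made up for illustration, and "likes the most" is assumed to mean the highest average rating across the movies an actor appears in.

```python
from collections import defaultdict
from statistics import mean


def rating_summary(ratings):
    """Return (max, min, avg) for a non-empty list of numeric ratings."""
    return max(ratings), min(ratings), mean(ratings)


def favourite_actors(customer_ratings, cast_by_movie, top_n=3):
    """Rank actors by one customer's average rating of the movies they appear in.

    customer_ratings: {movie_title: rating} for a single customer (hypothetical shape)
    cast_by_movie:    {movie_title: [actor_name, ...]} (hypothetical shape)
    """
    scores = defaultdict(list)
    for movie, rating in customer_ratings.items():
        for actor in cast_by_movie.get(movie, []):
            scores[actor].append(rating)
    # Sort by average rating (descending), then by number of rated movies,
    # then by name so the ordering is deterministic.
    ranked = sorted(
        scores.items(),
        key=lambda item: (-mean(item[1]), -len(item[1]), item[0]),
    )
    return [(actor, mean(values)) for actor, values in ranked[:top_n]]


# Made-up sample data, for illustration only.
ratings = {"Movie A": 5.0, "Movie B": 4.0, "Movie C": 2.0}
cast = {
    "Movie A": ["Actor X", "Actor Y"],
    "Movie B": ["Actor X"],
    "Movie C": ["Actor Z"],
}
print(rating_summary(list(ratings.values())))
print(favourite_actors(ratings, cast))
```

In a graph database the same idea would be expressed as a traversal from a customer through rated movies to their cast, followed by an aggregation per actor.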

Here's the project report and a visual representation of a small fraction of data in the database: Graph Database Visual

Built With

Usage

Use the following instructions to achieve the same database state.

Prerequisites

  • python3 is used for data cleansing and validation.
    brew install python
  • pipenv is used for modern dependency management.
    brew install pipenv

Instructions

  1. Download the required dataset.

  2. Cleanse and validate the above dataset with the provided python3 code (or download the validated data directly from here).

  3. Copy the datasets into Neo4j import folder and open Neo4j browser.

  4. Follow the comments and run the Cypher code inside the load-csv-v2.cypher file to import the data into the Neo4j database.

  5. Execute the Cypher queries in the business-questions-v2.cypher file or write your own queries.
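
As an illustration of what the cleansing and validation in step 2 might involve, here is a minimal Python sketch. It is not the repository's actual code: the column names (userId, movieId, rating), the accepted rating range of 0.5 to 5.0, and the rule of keeping only the first rating per user and movie are all assumptions chosen for the example.

```python
import csv
import io


def clean_ratings(rows, min_rating=0.5, max_rating=5.0):
    """Yield validated rating rows, skipping incomplete, malformed, or duplicate ones."""
    seen = set()
    for row in rows:
        user_id = (row.get("userId") or "").strip()
        movie_id = (row.get("movieId") or "").strip()
        raw_rating = (row.get("rating") or "").strip()
        if not (user_id and movie_id and raw_rating):
            continue  # a required field is missing
        try:
            rating = float(raw_rating)
        except ValueError:
            continue  # rating is not a number
        if not min_rating <= rating <= max_rating:
            continue  # outside the assumed rating range
        key = (user_id, movie_id)
        if key in seen:
            continue  # keep only the first rating per (user, movie) pair
        seen.add(key)
        yield {"userId": user_id, "movieId": movie_id, "rating": rating}


# Made-up sample input standing in for a downloaded CSV file.
sample = io.StringIO(
    "userId,movieId,rating\n"
    "1,10,4.5\n"
    "1,10,3.0\n"
    "2,,5.0\n"
    "3,11,abc\n"
    "4,12,9.0\n"
    "5,13,2.0\n"
)
cleaned = list(clean_ratings(csv.DictReader(sample)))
print(cleaned)
```

Validating before import keeps malformed rows out of the graph, which is generally easier than repairing nodes and relationships after they have been loaded.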

Contributors

This project is a collective effort of the following members:

  • @Karashan
  • @StrongWeiUMN
  • @LuyaoZhang5380
  • @zixuanzhang98

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Zixuan Zhang - zixuanzhang.x@gmail.com

Project Link: https://github.com/zixuanzhang98/neo4j-data-processing

Acknowledgements


Languages

Jupyter Notebook 98.6%, Python 1.3%, Shell 0.1%