hassansahhin / Genres-Analysis---EDA-Clustring

Building a movies recommendation system based on genres column in the kaggle TMDB data set

Genres-Analysis---EDA-Clustring

TMDB data set

  • Kaggle data set: the movies database contains over 10,000 movies, with several pieces of information about each. New columns:

    homepage id original_title overview popularity production_companies production_countries release_date spoken_languages status tagline vote_average

  • The CSV files can be found here.

Overview

  • The notebook is built mainly around the genres column, which is used to create polished EDA graphs with the matplotlib package
    import matplotlib.pyplot as plt

  • The second task is to create a movie recommendation system based on the movies' genres using the sklearn KMeans algorithm
    from sklearn.cluster import KMeans
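
As a rough illustration of the genre-based clustering idea, the sketch below one-hot encodes genre lists, counts how often each genre appears (the kind of figure an EDA bar chart would plot), and groups movies with KMeans. It is not the notebook's actual code: the titles and genre lists are made-up toy data, the genres are assumed to be already parsed into Python lists (how to parse them depends on the format of the CSV column), and `recommend` is a hypothetical helper that simply returns the other movies in the same cluster.

```python
from sklearn.cluster import KMeans
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data: titles with already-parsed genre lists (assumption).
titles = ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E", "Movie F"]
genres = [
    ["Action", "Adventure"],
    ["Action", "Thriller"],
    ["Action", "Adventure", "Thriller"],
    ["Comedy", "Romance"],
    ["Romance", "Drama"],
    ["Comedy", "Romance", "Drama"],
]

# One-hot encode the genres: one row per movie, one column per genre.
mlb = MultiLabelBinarizer()
X = mlb.fit_transform(genres)

# Number of movies per genre, e.g. as input for a bar chart.
genre_counts = dict(zip(mlb.classes_, X.sum(axis=0).tolist()))

# Cluster the movies on their genre vectors; k=2 is chosen only for this toy data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)


def recommend(title):
    """Return the other titles that fall in the same cluster as `title`."""
    idx = titles.index(title)
    return [t for t, lab in zip(titles, labels) if lab == labels[idx] and t != title]


print(genre_counts)
print(recommend("Movie A"))
```

On a real data set the number of clusters would need tuning (for example with an elbow plot of `kmeans.inertia_`), and recommendations within a cluster could be ranked further, for instance by popularity or vote_average.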

Contribution

You are most welcome to fork my notebook and update my code. Below are some ideas that could be worked on:

  • Can you categorize the films by type, such as animated or not? We don't have explicit labels for this, but it should be possible to build them from the crew's job titles.
  • How sharp is the divide between major film studios and the independents? Do those two groups fall naturally out of a clustering analysis or is something more complicated going on?

Original notebook

The original notebook was built in a Kaggle kernel. Please visit my notebook here and upvote if you find something useful.
