realnitinworks/IMDB-1000-movies-EDA

data-science data-visualization data-analysis exploratory-data-analysis data-cleaning data-wrangling pandas matplotlib seaborn

Project

This project explores a dataset of the 1000 most popular movies in the IMDB database from 2006 to 2016. The project is divided into 6 main components:

1. Problem Statement
2. Data Wrangling
3. Questions and EDA
4. Conclusion
5. Actionable Insights
6. Communicate

Components 1 through 5 are captured in the Jupyter Notebook. Component 6 is delivered through a Presentation and a Voice Overlay of the Presentation (you will have to download these from the links).

Components

Problem Statement

After collecting some initial questions, I came up with a hypothetical problem: in 2017, a production company, ABC, decides to produce movies that will perform best in terms of revenue, popularity and acclaim. The company approaches an agency, XYZ, and asks it to identify the characteristics of movies that will help ABC achieve this goal.

Data Wrangling

I gathered the data, then examined and cleaned it to make it ready for EDA.
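
As an illustration only, a minimal sketch of this kind of cleaning step in pandas might look like the following. The sample rows, the column names, and the `wrangle` helper are hypothetical and not taken from the notebook; filling missing revenue with the median is just one possible choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
# Column names here are assumptions, not confirmed by the notebook.
raw = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie C"],
    "Year": [2006, 2010, 2016, 2016],
    "Rating": [7.1, 8.0, 6.5, 6.5],
    "Revenue (Millions)": [120.5, np.nan, 45.0, 45.0],
})


def wrangle(df):
    """Drop duplicate rows, normalize column names, fill missing revenue."""
    cleaned = df.drop_duplicates().copy()
    # "Revenue (Millions)" becomes "revenue_millions", "Title" becomes "title".
    cleaned.columns = [
        c.strip().lower().replace(" ", "_").replace("(", "").replace(")", "")
        for c in cleaned.columns
    ]
    # Assumption: replace missing revenue with the median of the known values.
    median_revenue = cleaned["revenue_millions"].median()
    cleaned["revenue_millions"] = cleaned["revenue_millions"].fillna(median_revenue)
    return cleaned.reset_index(drop=True)


movies = wrangle(raw)
print(movies)
```

Inspecting `movies.info()` and `movies.isna().sum()` before and after such a step is a quick way to confirm what the cleaning actually changed.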

Questions and EDA

Then I added more questions that align with the Problem Statement. I used these questions to explore the data through descriptive statistics and visualization, and noted down my findings from the exploration.
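
To give a flavour of the descriptive-statistics side of this step, here is a small sketch under stated assumptions. The data values and column names are made up for illustration, and the questions shown (average revenue per year, and how rating relates to revenue) are examples rather than the notebook's actual questions.

```python
import pandas as pd

# Hypothetical, already-cleaned sample data; not the real IMDB rows.
movies = pd.DataFrame({
    "year": [2006, 2006, 2016, 2016],
    "rating": [7.0, 8.0, 6.0, 9.0],
    "revenue_millions": [100.0, 200.0, 50.0, 350.0],
})

# Descriptive statistics (count, mean, std, quartiles) for numeric columns.
summary = movies[["rating", "revenue_millions"]].describe()

# Example question: how does average revenue differ by release year?
by_year = movies.groupby("year")["revenue_millions"].mean()

# Example question: do higher-rated movies tend to earn more?
# Series.corr computes the Pearson correlation by default.
corr = movies["rating"].corr(movies["revenue_millions"])

print(summary)
print(by_year)
print(f"rating vs revenue correlation: {corr:.2f}")
```

The same grouped series could then be passed to matplotlib or seaborn for the visualization half of the exploration.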

Conclusion

I drew conclusions from my data exploration in this section.

Actionable Insights

In this section, I derived actionable insights from the exploration and conclusions to address the Problem Statement.

Communicate

Finally, I communicated my results through a Presentation and a Voice Overlay.

About

This project explores the 1000 most popular IMDB movies between 2006 and 2016. At the end of the project, actionable insights are drawn from the exploration.

Languages

Jupyter Notebook 55.2%, HTML 44.8%