Exploratory Data Analysis of the Titanic Dataset
The Titanic challenge on Kaggle is a competition in which the goal is to predict whether a given passenger survived or died, based on a set of variables describing them, such as their age, their sex, or their passenger class on board.
I have been playing around with the data, and this notebook details the exploratory data analysis (EDA) steps:
- Data extraction: I will load the dataset and have a first look at it.
- Data cleaning: I will fill in missing values.
- Plotting: I will create charts (bar, histogram, and scatter) and use them to spot correlations in the data.
The main libraries I used are:
- NumPy for multidimensional array computing
- Pandas for data manipulation
- Matplotlib and Seaborn for data visualization
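As a rough preview, the three steps outlined above might look like the sketch below. It is a minimal illustration under a few assumptions, not the exact code used later in this notebook: the column names (`Survived`, `Pclass`, `Sex`, `Age`, `Fare`) are assumed to follow Kaggle's `train.csv`, the rows in `TOY_CSV` are made up so that the example runs without the real file, and filling ages with the median is only one possible cleaning strategy.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; these are NOT real passenger records.
# Column names are assumed to match Kaggle's train.csv.
TOY_CSV = """Survived,Pclass,Sex,Age,Fare
0,3,male,22,7.25
1,1,female,38,71.28
1,3,female,,7.92
1,2,male,35,13.00
0,3,male,,8.05
1,2,female,14,30.07
"""

# 1. Data extraction: load the data and have a first look at it.
# With the real file this would be something like pd.read_csv("train.csv").
df = pd.read_csv(io.StringIO(TOY_CSV))
print(df.head())
print(df.isna().sum())  # number of missing values per column

# 2. Data cleaning: fill missing ages with the median age (an assumed strategy).
df["Age"] = df["Age"].fillna(df["Age"].median())

# 3. Plotting: bar chart of the survival rate per sex.
survival_by_sex = df.groupby("Sex")["Survived"].mean()
ax = survival_by_sex.plot(kind="bar")
ax.set_ylabel("Survival rate")
plt.tight_layout()
```

In a notebook the chart is rendered inline after the last line; in a plain script, a call to `plt.show()` or `plt.savefig(...)` would be needed to see it.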