Titanic Survivor Prediction
Introduction
This project, built in a Jupyter Notebook on the Kaggle Titanic dataset, analyzes the passenger data in detail and predicts which passengers survived the sinking of the Titanic.
Data description
pclass: A proxy for socio-economic status (SES)
1st = Upper
2nd = Middle
3rd = Lower
age: Age is fractional if less than 1. If the age is estimated, it is in the form xx.5.
sibsp: Number of siblings or spouses aboard. The dataset defines these family relations as follows:
Sibling = brother, sister, stepbrother, stepsister
Spouse = husband, wife (mistresses and fiancés were ignored)
parch: Number of parents or children aboard. The dataset defines these family relations as follows:
Parent = mother, father
Child = daughter, son, stepdaughter, stepson
Some children travelled only with a nanny, therefore parch=0 for them.
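The field definitions above can be turned into derived columns. The following is only a minimal sketch on a few hypothetical rows, not the notebook's actual code: the lowercase column names follow this description (the Kaggle CSV may capitalize them), and the names `ses`, `age_estimated` and `family_size` are assumptions introduced for illustration.

```python
import pandas as pd

# Hypothetical toy rows, using the column names from the description above.
passengers = pd.DataFrame({
    "pclass": [1, 3, 2, 3],
    "age": [38.0, 0.75, 28.5, 4.0],
    "sibsp": [1, 0, 0, 0],
    "parch": [0, 1, 0, 0],
})

# pclass is a proxy for socio-economic status.
ses_labels = {1: "Upper", 2: "Middle", 3: "Lower"}
passengers["ses"] = passengers["pclass"].map(ses_labels)

# Estimated ages have the form xx.5. Ages below 1 are genuinely fractional,
# so they are excluded from the check (an assumption of this sketch).
passengers["age_estimated"] = (passengers["age"] >= 1) & (passengers["age"] % 1 == 0.5)

# Family members aboard, counting the passenger: siblings/spouses + parents/children + 1.
passengers["family_size"] = passengers["sibsp"] + passengers["parch"] + 1

print(passengers)
```

The last row shows the nanny case: a 4-year-old with sibsp=0 and parch=0 gets a family size of 1 even though the child was not travelling alone, so such a feature should be interpreted with care.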
Methods used
- Cleaning Data
- Statistical Inference
- Exploratory Data Analysis (EDA)
- Data Visualization
- Supervised Machine Learning Algorithms: Logistic Regression, Random Forest, Naive Bayes, K-nearest Neighbors, SVC
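One way the five listed classifiers could be compared is with cross-validated accuracy. The sketch below is illustrative only and is not taken from the notebook: it uses synthetic data from `make_classification` as a stand-in for the cleaned Titanic features, and the scaling step, the 5-fold split and all hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the cleaned features and survival labels (assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "K-nearest Neighbors": KNeighborsClassifier(),
    "SVC": SVC(),
}

mean_accuracy = {}
for name, model in models.items():
    # Scale inside the pipeline so each fold is standardized on its own training split.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
    mean_accuracy[name] = scores.mean()

# Report the models from highest to lowest mean accuracy.
for name, score in sorted(mean_accuracy.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Scaling mainly matters for K-nearest Neighbors and SVC, which are distance-based; it is applied to every model here only to keep the loop uniform.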
Technologies used
- Python 3.8.8
- Pandas 1.2.4
- Matplotlib 3.3.4
- Seaborn 0.11.1
- scikit-learn 0.24.1