This repository contains the Python code used to predict Titanic survivors.
titanic_EDA_18june2021.ipynb
Highlights:
This notebook contains the exploratory data analysis (EDA) of the Titanic dataset.
titanic_prediction_with_score_0.78708_21july2021.ipynb
This is my best Titanic prediction so far, with a score of 0.78708 (78.708%).
Highlights:
- Family feature created by summing Parch and SibSp
- Isalone feature created to flag passengers where Family = 0
- Title feature created by extracting titles from Name and grouping them
- Missing values of Embarked filled with the mode
- Missing values of Age filled with random numbers generated from the mean and standard deviation of Age
- Bins created for Age and Fare
- Columns dropped: Cabin, Ticket, Age, Name, Fare
- Used Random Forest, Logistic Regression and XGBoost for prediction
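
The feature steps listed above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the notebook's actual code. The bin edges (`AGE_EDGES`, `FARE_EDGES`), the title grouping (`TITLE_GROUPS`, `COMMON_TITLES`), the helper names, and the sample rows are all assumptions made for the example. The age imputation draws uniformly from mean ± one standard deviation, which is only one possible reading of "generated from the mean and standard deviation".

```python
import random
import re
import statistics
from collections import Counter

# Hypothetical title grouping and bin edges; the notebook may use different ones.
TITLE_GROUPS = {"Mlle": "Miss", "Ms": "Miss", "Mme": "Mrs"}
COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}
AGE_EDGES = (16, 32, 48, 64)
FARE_EDGES = (7.91, 14.454, 31.0)


def extract_title(name):
    """Pull the title out of a name such as 'Braund, Mr. Owen Harris'."""
    match = re.search(r",\s*([^.]+)\.", name)
    title = match.group(1).strip() if match else "Unknown"
    title = TITLE_GROUPS.get(title, title)
    # Anything outside the common titles is grouped as "Rare" (assumption).
    return title if title in COMMON_TITLES else "Rare"


def bin_value(value, edges):
    """Return the index of the first bin whose upper edge is >= value."""
    for index, upper in enumerate(edges):
        if value <= upper:
            return index
    return len(edges)


def engineer_features(passengers, seed=0):
    """Build Family, Isalone, Title, Embarked, AgeBin and FareBin per passenger."""
    rng = random.Random(seed)
    known_ages = [p["Age"] for p in passengers if p["Age"] is not None]
    age_mean = statistics.mean(known_ages)
    age_std = statistics.stdev(known_ages)
    embarked_mode = Counter(
        p["Embarked"] for p in passengers if p["Embarked"]
    ).most_common(1)[0][0]

    rows = []
    for p in passengers:
        family = p["SibSp"] + p["Parch"]
        age = p["Age"]
        if age is None:
            # Assumed imputation: uniform draw within one std of the mean.
            age = max(0.0, rng.uniform(age_mean - age_std, age_mean + age_std))
        rows.append({
            "Family": family,
            "Isalone": int(family == 0),
            "Title": extract_title(p["Name"]),
            "Embarked": p["Embarked"] or embarked_mode,
            "AgeBin": bin_value(age, AGE_EDGES),
            "FareBin": bin_value(p["Fare"], FARE_EDGES),
        })
    return rows


# Made-up sample rows in the shape of the Titanic columns used above.
sample = [
    {"Name": "Braund, Mr. Owen Harris", "SibSp": 1, "Parch": 0,
     "Age": 22.0, "Fare": 7.25, "Embarked": "S"},
    {"Name": "Cumings, Mrs. John Bradley", "SibSp": 1, "Parch": 0,
     "Age": 38.0, "Fare": 71.2833, "Embarked": "C"},
    {"Name": "Heikkinen, Miss. Laina", "SibSp": 0, "Parch": 0,
     "Age": 26.0, "Fare": 7.925, "Embarked": "S"},
    {"Name": "Moran, Mr. James", "SibSp": 0, "Parch": 0,
     "Age": None, "Fare": 8.4583, "Embarked": "Q"},
    {"Name": "Icard, Miss. Amelie", "SibSp": 0, "Parch": 0,
     "Age": 38.0, "Fare": 80.0, "Embarked": None},
]
features = engineer_features(sample)
for row in features:
    print(row)
```

Because the raw Age, Fare and Name values are replaced by the binned and derived features, the original columns can then be dropped, as in the list above. In a pandas workflow the same steps would typically be done column-wise rather than row by row.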
Future efforts:
- Changing the missing value imputation for Age
- Keeping the cabin and ticket features
- Feature selection
- Stacking results from different models