faisal-irzal / titanic_survival_pred

This repo guides you through building predictive models of Titanic survival, including data visualization and pre-processing, feature analysis, model building, and performance evaluation.

Titanic Survival Prediction

This is part of the Kaggle Machine Learning competition, where contestants are asked to predict survival in the Titanic disaster using ML method(s) of their own choice. In this repository, I explain how I built predictive models that placed me in the top 10% out of almost 20,000 contestants.

The following steps are used:

  1. Data visualization & pre-processing: the given datasets are summarized with descriptive statistics, missing values are handled by imputation, and the resulting data are visualized and analysed.
  2. Feature analysis: all features and the correlations between them are described, and the datasets are prepared for building the predictive models.
  3. Predictive analysis using machine learning techniques: selected machine learning techniques are explained, classifiers are trained on the training dataset, and each model's performance is evaluated using K-fold cross-validation on the training dataset.
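As a rough illustration of the imputation and K-fold cross-validation mentioned in the steps above, here is a minimal sketch using only the Python standard library. It is not the code from the notebooks: the helper names (`impute_median`, `k_fold_indices`, `cross_validate`), the majority-class baseline, and the toy values are all assumptions made up for this example.

```python
import random
from statistics import mean, median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(fit, predict, X, y, k=5, seed=0):
    """Train on k-1 folds, score accuracy on the held-out fold, repeat k times."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return scores


# Hypothetical stand-in classifier: always predicts the majority training label.
def fit_majority(X, y):
    return int(2 * sum(y) >= len(y))


def predict_majority(model, x):
    return model


# Toy values invented for this sketch (not taken from the Kaggle data).
ages = impute_median([22, 38, None, 35, None, 54, 2, 27, 14, None])
survived = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0]

fold_scores = cross_validate(fit_majority, predict_majority, ages, survived, k=5)
print("mean accuracy over folds:", mean(fold_scores))
```

In practice a library such as scikit-learn would likely handle both the splitting and the scoring; the point here is only that every sample is held out exactly once and that the reported performance is the average over the folds.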

The best classifier is chosen based on the performance analysis in step 3. This classifier is then used to predict survival for the test dataset.
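The selection step can be sketched as picking the candidate with the highest mean cross-validation score. The function name `select_best`, the model names, and the scores below are hypothetical placeholders for illustration only; they are not results from this repository.

```python
from statistics import mean


def select_best(cv_scores):
    """Return (name, mean_score) for the candidate with the highest mean fold score."""
    means = {name: mean(scores) for name, scores in cv_scores.items()}
    best = max(means, key=means.get)
    return best, means[best]


# Placeholder fold accuracies, invented for this example.
placeholder_scores = {
    "model_a": [0.80, 0.78, 0.82],
    "model_b": [0.81, 0.84, 0.83],
}

best_name, best_mean = select_best(placeholder_scores)
print(best_name, round(best_mean, 3))
```

The chosen model would then typically be refit on the full training set before predicting on the test set, since cross-validation only estimates how well each candidate generalizes.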

Titanic_1st_entry

Visit the following link to get a lite version of the report.

Titanic Survival Prediction Report


Languages

Language:Jupyter Notebook 100.0%