hariprasath-v / Zindi_UmojaHack-India-Income-Prediction-Challenge

Create a machine learning model to predict whether an individual earns above 50,000 in a specific currency or not.

Zindi_UmojaHack-India-Income-Prediction-Challenge

Competition hosted on zindi.africa

About

Create a machine learning model to predict whether an individual earns above 50,000 in a specific currency or not.

Final score: 0.613205338

Leaderboard rank: 30

Evaluation metric: F1 score
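For reference, the F1 score is the harmonic mean of precision and recall. The form below is the standard binary definition, with TP, FP and FN counting true positives, false positives and false negatives for the positive class (assumed here to be "earns above 50,000"). The README does not say which averaging variant the competition applied, so treat this as the usual definition rather than the exact leaderboard formula.

```latex
F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}
    = \frac{2\,TP}{2\,TP + FP + FN}
```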

File information

  • zindi-income-prediction-challenge-umojahack-eda.ipynb

    Basic exploratory data analysis.

    Packages used:

     * seaborn
     * pandas
     * NumPy
     * Matplotlib
    
  • zindi-income-prediction-challenge-umojahack-model.ipynb

    Data pre-processing and modelling.

    Packages used:

      * scikit-learn
      * pandas
      * NumPy
      * Matplotlib
      * CatBoost

    Trained a CatBoost classifier and evaluated it on the validation data using the F1 score.

[Figure: CatBoost model optimal threshold (0.3862)]
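The README does not describe how the 0.3862 threshold was found. One common approach, sketched here as an assumption rather than the notebook's actual method, is to scan candidate thresholds over the predicted probabilities and keep the one with the highest F1. The helper names `f1_score` and `best_threshold` and the toy data are hypothetical and written in plain Python so the example runs without extra packages.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, y_prob):
    """Return (threshold, f1) maximizing F1 over the distinct predicted probabilities."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(y_prob)):
        # Predict the positive class when the probability reaches the threshold.
        y_pred = [1 if p >= t else 0 for p in y_prob]
        score = f1_score(y_true, y_pred)
        if score > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_t, best_f1 = t, score
    return best_t, best_f1


# Toy labels and probabilities, made up for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.35, 0.40, 0.45, 0.55, 0.80, 0.20, 0.38]

threshold, score = best_threshold(y_true, y_prob)
print(f"best threshold={threshold:.2f}, F1={score:.3f}")
```

On imbalanced data the F1-optimal threshold often falls below the default 0.5, which is consistent with the value reported above. The threshold should be chosen on validation data, not on the test set.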

Based on the optimal threshold:

[Figure: train data classification report]

[Figure: validation data classification report]
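To show what a classification report at a custom threshold contains, here is a minimal sketch that computes per-class precision, recall, F1 and support. The function name `report_at_threshold` and the toy data are hypothetical. The notebook most likely relied on a library routine for this, and the numbers below are not the author's results.

```python
def report_at_threshold(y_true, y_prob, threshold):
    """Per-class precision, recall, F1 and support after thresholding probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    report = {}
    for label in (0, 1):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of true samples of this class
        }
    return report


# Toy labels and probabilities, made up for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.35, 0.40, 0.45, 0.55, 0.80, 0.20, 0.38]

# 0.3862 is the threshold quoted in this README.
report = report_at_threshold(y_true, y_prob, threshold=0.3862)
for label, metrics in report.items():
    print(label, metrics)
```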

[Figure: CatBoost SHAP feature importances]

[Figure: CatBoost SHAP top feature impact]

[Figure: SHAP feature impact for a single observation (class 0)]

[Figure: SHAP feature impact for a single observation (class 1)]
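As a reading aid for the SHAP plots listed above: SHAP attributes each prediction to the features additively, and a common global importance measure is the mean absolute SHAP value per feature. The notation is introduced here for illustration only: $\phi_{ij}$ is the SHAP value of feature $j$ for observation $i$, $\phi_0$ is the base value, $M$ is the number of features and $N$ the number of observations. The README does not state which plot types or output scale the notebook used, so this is a sketch under those assumptions.

```latex
f(x_i) = \phi_0 + \sum_{j=1}^{M} \phi_{ij},
\qquad
I_j = \frac{1}{N} \sum_{i=1}^{N} \lvert \phi_{ij} \rvert
```

The first identity is what a single-observation plot visualizes: each feature pushes the output up or down from the base value. For a tree classifier, $f$ is usually the raw log-odds output, not the probability.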

License: Apache License 2.0

Languages: HTML 60.9%, Jupyter Notebook 39.1%