hariprasath-v / DOCEREE_machine-learning-hackathon_round_1

Create a model that can accurately predict, from server logs, whether a user belongs to the HCP (Healthcare Professional) category.

DOCEREE_machine-learning-hackathon_round_1

Leaderboard

  • Rank: 4
  • Score: 99.8824

Competition hosted on Techgig

Problem

Create a model that can accurately predict, from server logs, whether a user belongs to the HCP (Healthcare Professional) category.

Evaluation

The evaluation metric for this competition is accuracy.
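
For reference, accuracy is the fraction of predictions that match the true labels. A minimal sketch in plain Python follows; the label values are made up for illustration, and the encoding (1 = HCP, 0 = non-HCP) is an assumption rather than something the competition data is known to use.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions equal to the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Illustrative labels only (1 = HCP, 0 = non-HCP is an assumed encoding).
y_true = [1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 4 of the 5 predictions are correct
```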

Solution:

Exploratory Data Analysis

The basic exploratory data analysis covers:

  • Target distribution
  • Categorical column level count

The analysis was done using:

  • pandas
  • numpy
  • seaborn
  • matplotlib
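
The two checks listed above can be sketched with pandas roughly as follows. This is not the notebook's code: the tiny data frame and its column names (`device_type`, `platform`, `is_hcp`) are hypothetical stand-ins for the competition data.

```python
import pandas as pd

# Tiny made-up frame standing in for the competition data.
# All column names here are hypothetical.
df = pd.DataFrame(
    {
        "device_type": ["desktop", "mobile", "desktop", "tablet", "desktop"],
        "platform": ["web", "web", "app", "web", "app"],
        "is_hcp": [1, 0, 0, 1, 0],
    }
)
TARGET = "is_hcp"
CATEGORICAL_COLS = ["device_type", "platform"]

# Target distribution: share of rows in each class.
target_share = df[TARGET].value_counts(normalize=True)

# Categorical column level count: number of distinct levels per column.
level_counts = df[CATEGORICAL_COLS].nunique()

print(target_share)
print(level_counts)
```

The same two series can be passed to seaborn or matplotlib bar plots if a visual summary is preferred.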

Model

A CatBoost classifier was trained, and its hyperparameters were tuned with the Optuna framework. The model was evaluated by accuracy.

Packages used:

  * Sklearn
  * Pandas
  * Numpy
  * Matplotlib
  * catboost
  * optuna
  * shap

Catboost Model Feature Importances

[Figure: CatBoost feature importance plot]

SHAP Catboost Model Feature Importances

[Figure: SHAP feature importance plot for the CatBoost model]

Catboost Model train and validation accuracy

[Figure: CatBoost train and validation accuracy]

Catboost Model validation data confusion matrix

[Figure: confusion matrix on the validation data]
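
A confusion matrix of this kind can be computed with scikit-learn. The sketch below uses made-up labels and predictions (1 = HCP, 0 = non-HCP is an assumed encoding) and also shows how accuracy is recovered from the matrix diagonal.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Illustrative validation labels and predictions, not competition data.
y_valid = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]

# Rows are true classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_valid, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()
print(cm)

# Accuracy is the share of samples on the diagonal.
accuracy = (tn + tp) / cm.sum()
print(accuracy, accuracy_score(y_valid, y_pred))
```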

File information

doceree-machine-learning-hackathon-1-eda.ipynb (Open in Kaggle)

doceree-machine-learning-hackathon-1-model.ipynb (Open in Kaggle)

License: Apache License 2.0


Languages

  • HTML: 54.0%
  • Jupyter Notebook: 46.0%