Humana Mays Case Competition 2020

About the Competition:

Intro:

With the rapid pace of social development, we humans have enjoyed an easier way of living and commuting. Meanwhile, we are exposed to a growing number of health-threatening substances and sites, which makes it harder for the medicare industry to evaluate health and its cost. Fortunately, the introduction of Humana’s integrated value-based health ecosystem based on Social Determinants of Health brings a promising prospect to this industry. Among all those challenges to health, transportation challenge stands out to be the most common one since it involves daily life of a person’s commuting, schooling, and living. The pain point lies in the aspect of prediction and optimal choices of model fitting. With a large amount of robust data provided by Humana, this analysis identifies MediCare members who are most likely to experience Transportation Challenges and propose viable solutions. This analysis is intended to find the interconnection between our socio-economic and community environments and lifestyle behaviours, based on which employees, patients, clients are better medically assisted.

For a detailed analysis, please refer to the report.

-- Project Status: [Completed]

-- Competition Status: [Top 50 teams]

Methods Used:

Exploratory Data Analysis (EDA)
Missing value imputation
Categorical variable encoding
Feature engineering
Hyperparameter tunning

Models Used:

Logistic Regression Classifier
Random Forest Classifier
XGBoost Classifier

Model Performance Comparison:

Model	Average AUC
Logistic Regression Classifier	0.739
Random Forest Classifier	0.720
XGBoost Classifier	0.736

The values presented above are calculated using 10-folds cross-validation on the training set.

Tools/Packages Used:

Tools:

R for preliminary data inspection
Interactive Python in Jupyter Notebook

Packages:

Pandas for data wrangling and manipulation
NumPy for arithmetic operation
matplotlib for data visualization
Category Encoders for encoding categorical variables
scikit-learn for building Random Forest & Logistic Regression classifiers, selecting features, tunning hyperparameters and evaluating performance
XGBoost for building XGBoost classier

Project Member:

Yicheng Huang: @Yicheng Huang
Michael Tang: @Michael Tang
Shangwen Yan: @Shangwen Yan

yichenghuang980 / Humana-Mays-Case-Competition