This is a Python implementation of two different types of machine learning models (listed below) applied to the "Home Credit Default Risk" task.
Dataset: Home Credit Default Risk, from Kaggle Competitions
!kaggle competitions download home-credit-default-risk
- Checking Missing Values (the data contains many null values, which need to be cleaned or replaced using imputation techniques)
- Checking Duplicate Data (number of duplicates found in the data: 0)
- Data Visualization
- Feature Engineering on the Application Train Data
- Merging all 6 Datasets - Key = SK_ID_CURR
- Imputing Categorical & Numerical Data (SimpleImputer)
- Scaling Numerical Data (StandardScaler)
- Encoding Categorical Data (OneHotEncoder)
- Class Balancing (RandomOverSampler)
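
The data checks, the merge on SK_ID_CURR, and the impute/scale/encode steps listed above might look roughly like the sketch below. It is illustrative only: the two small frames are toy stand-ins for the real CSV files, `BUREAU_CREDIT_MEAN` is a hypothetical feature name, and the mean aggregation and the `median` / `most_frequent` imputation strategies are assumptions rather than the settings used in this project.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-ins for the application table and one secondary table (made-up values).
app = pd.DataFrame({
    "SK_ID_CURR": [1, 2, 3, 4],
    "AMT_INCOME_TOTAL": [100000.0, np.nan, 250000.0, 180000.0],
    "NAME_CONTRACT_TYPE": ["Cash loans", "Revolving loans", np.nan, "Cash loans"],
    "TARGET": [0, 1, 0, 0],
})
bureau = pd.DataFrame({
    "SK_ID_CURR": [1, 1, 3],
    "AMT_CREDIT_SUM": [5000.0, 7000.0, 12000.0],
})

# Missing-value and duplicate checks.
missing_per_column = app.isnull().sum()
n_duplicates = int(app.duplicated().sum())

# Aggregate the secondary table to one row per applicant (assumed: mean),
# then left-merge so every applicant in the main table is kept.
bureau_agg = (
    bureau.groupby("SK_ID_CURR", as_index=False)["AMT_CREDIT_SUM"]
    .mean()
    .rename(columns={"AMT_CREDIT_SUM": "BUREAU_CREDIT_MEAN"})  # hypothetical name
)
merged = app.merge(bureau_agg, on="SK_ID_CURR", how="left")

numeric_cols = ["AMT_INCOME_TOTAL", "BUREAU_CREDIT_MEAN"]
categorical_cols = ["NAME_CONTRACT_TYPE"]

# Impute then scale numeric columns; impute then one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

X = preprocess.fit_transform(merged[numeric_cols + categorical_cols])
y = merged["TARGET"].to_numpy()
```

Wrapping the steps in a `ColumnTransformer` keeps the imputation and scaling statistics tied to the data they were fitted on, so the same fitted object can later be applied to the test table with `preprocess.transform(...)`.
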
Model Used - LGBMClassifier