steggie3 / loan-default-prediction

Loan Default Prediction Machine Learning Project

Repository from Github https://github.comsteggie3/loan-default-predictionRepository from Github https://github.comsteggie3/loan-default-prediction

Loan Default Prediction Machine Learning Project

This is an exploratory project for me to apply and compare different ML models and techniques, including:

  • Feature Engineering
    • One-hot encoding for categorical features
    • Normalization/standardization
    • Imputation
    • Feature expansion
    • Feature reduction:
      • Feature hashing
      • Feature selection
      • Principal component analysis
    • Feature discretization with Decision Trees or Random Forests
  • Machine Learning Models:
    • Logistic Regression
    • Decision Trees, Random Forest, Gradient-Boosted Decision Trees
    • K-Nearest-Neighbor
    • Support Vector Machines
    • Neural Networks

The data is from a Kaggle competition Loan Default Prediction.

Dependencies

Python 3, numpy, pandas, scikit-learn, matplotlib, xgboost, tensorflow, keras.

Usage

Download train_v2.csv from https://www.kaggle.com/c/loan-default-prediction/data and put in the loan-default-prediction directory. Running the first Jupyter notebook, LDP 01 - Data Preprocessing.ipynb will give you a few processed CSV files, which are used in subsequent notebooks as the training data. The rest of the notebooks do not have strong dependencies.

About

Loan Default Prediction Machine Learning Project


Languages

Language:Jupyter Notebook 99.9%Language:Python 0.1%