Lsuantah / Credit_Risk_Analysis

The objective of this analysis was to use machine learning models to accurately predict credit risk.

Credit Risk Analysis

Overview of the analysis:

The objective of this project is to use machine learning techniques such as resampling, SMOTEENN, and ensemble classifiers to analyze LoanStats data and identify the model that best predicts credit risk.

Results:

Random Oversampling

  • Balanced Accuracy Score: 63.39%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 69%
  • Recall Score - Low Risk: 61%
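
As a rough illustration of what random oversampling does, the sketch below duplicates randomly chosen minority-class rows until every class is as large as the largest one. It uses only the Python standard library and is not the notebook's actual code: the helper name `random_oversample` and the toy data are hypothetical. In imbalanced-learn, the `RandomOverSampler` class provides this resampling in library form.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=1):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_res, y_res = list(X), list(y)
    for label, n in counts.items():
        # Rows that belong to this class in the original data.
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Append random copies until the class reaches the target size.
        for _ in range(target - n):
            X_res.append(rng.choice(rows))
            y_res.append(label)
    return X_res, y_res


# Hypothetical toy data: 8 low-risk loans and 2 high-risk loans.
X = [[i, i * 2] for i in range(10)]
y = ["low_risk"] * 8 + ["high_risk"] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

Only the training split should be resampled this way; the test split keeps its original class balance so the scores reflect real-world proportions.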

SMOTE Oversampling

  • Balanced Accuracy Score: 63.07%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 60%
  • Recall Score - Low Risk: 66%
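
Unlike random oversampling, SMOTE creates new minority-class points instead of copying existing ones. The sketch below shows the core interpolation step under simplifying assumptions: the helper name `smote_sample`, the value of `k`, and the toy points are all hypothetical, and this is not the notebook's code. In imbalanced-learn, the `SMOTE` class implements the full method.

```python
import math
import random


def smote_sample(minority, n_new, k=2, seed=1):
    """Create n_new synthetic points, each placed on the line segment
    between a minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority points.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical high-risk loans described by two numeric features.
high_risk = [[1.0, 1.0], [2.0, 1.5], [1.5, 3.0], [3.0, 2.5]]
new_points = smote_sample(high_risk, n_new=6)
print(len(new_points))
```

Because each synthetic point is a blend of two real minority points, it always falls inside the region the minority class already occupies.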

Cluster Centroids

  • Balanced Accuracy Score: 52.95%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 61%
  • Recall Score - Low Risk: 45%

SMOTEENN

  • Balanced Accuracy Score: 63.76%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 70%
  • Recall Score - Low Risk: 57%

Random Forest Classifier

  • Balanced Accuracy Score: 78.37%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 67%
  • Recall Score - Low Risk: 89%

Easy Ensemble Classifier

  • Balanced Accuracy Score: 91.78%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 89%
  • Recall Score - Low Risk: 94%

Summary:

Of all the machine learning models tested, the Easy Ensemble Classifier was the most accurate, with the best predictions for loans at all risk levels. The Random Forest Classifier was second best.

  1. Easy Ensemble Classifier

       • Balanced Accuracy Score: 91.78%
       • Recall Score - High Risk: 89%
       • Recall Score - Low Risk: 94%

  2. Random Forest Classifier

       • Balanced Accuracy Score: 78.37%
       • Recall Score - High Risk: 67%
       • Recall Score - Low Risk: 89%
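
Balanced accuracy is the mean of the per-class recall scores, so the figures for the two leading models can be cross-checked against each other. The short sketch below does this with the values reported in this README; the dictionary layout and variable names are only illustrative.

```python
# Scores as reported above (recalls are rounded to two decimals).
reported = {
    "Easy Ensemble Classifier": {"high": 0.89, "low": 0.94, "balanced_accuracy": 0.9178},
    "Random Forest Classifier": {"high": 0.67, "low": 0.89, "balanced_accuracy": 0.7837},
}

differences = {}
for name, m in reported.items():
    # Mean of the high-risk and low-risk recall scores.
    mean_recall = (m["high"] + m["low"]) / 2
    differences[name] = abs(mean_recall - m["balanced_accuracy"])
    print(f"{name}: mean recall {mean_recall:.3f}, reported {m['balanced_accuracy']:.4f}")
```

The small gaps between the mean recall and the reported balanced accuracy are consistent with the recall scores having been rounded to two decimals.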

I would recommend the Easy Ensemble Classifier, as it has the highest balanced accuracy (91.78%) and the highest recall for both high-risk (89%) and low-risk (94%) loans of all the models.
