nustalgic / Logistic-Regression-and-Imbalanced-Learning

Used a dataset of historical lending activity from a peer-to-peer lending services company to build a model that can identify the creditworthiness of borrowers. Using my knowledge of the imbalanced-learn library, I used a logistic regression model to compare two versions of the dataset.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Imbalanced-Learn & Logistic Regression

Module 12 Challenge

Screen-Shot-2022-03-27-at-8-39-09-PM.png

Background

Credit risk poses a classification problem that’s inherently imbalanced. This is because healthy loans easily outnumber risky loans. In this Challenge, I’ll use various techniques to train and evaluate models with imbalanced classes. I’ll use a dataset of historical lending activity from a peer-to-peer lending services company to build a model that can identify the creditworthiness of borrowers.


Technologies

The data we're analyzing comes from a jupyter notebook that we'll create and import files to. We'll be using Python to run and read our data.


Installation Guide

In order for us to get the data we need we must import pandas, plots and the csv files we want to observe.

# Import the modules
import numpy as np
import pandas as pd
from pathlib import Path
from sklearn.metrics import balanced_accuracy_score
from sklearn.metrics import confusion_matrix
from imblearn.metrics import classification_report_imbalanced
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import RandomOverSampler

import warnings
warnings.filterwarnings('ignore')

Usage


Evaluating the model’s performance by doing the following:

  • Calculate the accuracy score of the model.

  • Generate a confusion matrix.

  • Print the classification report.

# Print the balanced_accuracy score of the model 
accuracy = balanced_accuracy_score(y_test, y_pred_test2)
print(accuracy)

Screen-Shot-2022-03-27-at-8-34-39-PM.png


# Generate a confusion matrix for the model
training_matrix2 = confusion_matrix(y_test, y_pred_test2)
training_matrix2

Screen-Shot-2022-03-27-at-8-34-21-PM.png


# Print the classification report for the model
training_report2 = classification_report_imbalanced(y_test, y_pred_test2)
print(training_report2)

Screen-Shot-2022-03-27-at-8-34-08-PM.png


Contributors

Brought to you by Elgin Braggs Jr.


License

MIT

About

Used a dataset of historical lending activity from a peer-to-peer lending services company to build a model that can identify the creditworthiness of borrowers. Using my knowledge of the imbalanced-learn library, I used a logistic regression model to compare two versions of the dataset.

License:MIT License


Languages

Language:Jupyter Notebook 100.0%