YamanAlBochi / Bank-Predictive-Machine-Learning-Model

A bank wants to use machine learning to assess an applicant's creditworthiness with a model that predicts whether the potential borrower will default on the loan, so that applicants receive a response immediately after completing their application.

Bank-Predictive-Machine-Learning-Model

Background:

As part of its digital transformation, the bank plans to use modern technology to offer its clients a full range of services on their mobile devices. The bank, which is Canada's largest lender by assets, wants to improve its current process for home loan applications. Today, loan officers process each application manually, which takes 2 to 3 days; only then is the applicant told whether the loan has been granted for the requested amount. The bank wants to use machine learning to evaluate an applicant's creditworthiness with a model that predicts whether a potential borrower will default on the loan, streamlining the process so that the applicant hears back right after submitting the application.

Context:

  • Understand the Cross-Industry Standard Process for Data Mining (CRISP-DM) and the business needs.

  • Understand the data.

  • Prepare for modelling.

  • Train a model.

In this project, I will use both automated machine learning and traditional machine learning. The stakeholders want to know a few things about the data and to understand what machine learning is in the context of this particular use case. I will use Python and its extensive collection of libraries to derive insights from the data, prepare the data, and train machine learning models, both the old-fashioned way and in newer, automated ways.
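As a rough illustration of the "old-fashioned" route (prepare the data, then train a model), here is a minimal sketch using pandas and scikit-learn. It is not the notebook's actual model: the toy data, the column names (`ApplicantIncome`, `LoanAmount`, `Credit_History`, `Self_Employed`, `Defaulted`), and the choice of logistic regression are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy history; column names and values are illustrative assumptions.
history = pd.DataFrame({
    "ApplicantIncome": [5000, 3000, 4000, 6000, 2500, 5200, 2800, 6100],
    "LoanAmount": [130.0, 100.0, None, 150.0, 90.0, 140.0, 95.0, 160.0],
    "Credit_History": [1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    "Self_Employed": ["No", "Yes", "No", "No", "Yes", "No", "No", "No"],
    "Defaulted": [0, 1, 0, 0, 1, 0, 1, 0],  # 1 = borrower defaulted
})

numeric = ["ApplicantIncome", "LoanAmount", "Credit_History"]
categorical = ["Self_Employed"]

# Preparation: fill missing numeric values with the median and scale them,
# and one-hot encode the categorical field.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# Training: a plain logistic regression on the prepared features.
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(history[numeric + categorical], history["Defaulted"])

# Scoring two hypothetical new applications as soon as they are submitted.
applicants = pd.DataFrame({
    "ApplicantIncome": [2600, 5500],
    "LoanAmount": [92.0, 145.0],
    "Credit_History": [0.0, 1.0],
    "Self_Employed": ["Yes", "No"],
})
default_probability = model.predict_proba(applicants)[:, 1]
print(default_probability)
```

Bundling preparation and training into one `Pipeline` means a new application goes through exactly the same transformations as the training data. With real data, the model should be evaluated on a held-out split before its predictions are trusted.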

Tasks:

The Department manager wants to know the following:

  1. An overview of the data. (HINT: Provide the number of records, the fields and their data types. Do this for both train and test.)

  2. What data quality issues exist in both train and test? (HINT: Comment on any missing values and duplicates.)

  3. How do the loan statuses compare? i.e. what is the distribution of each?

  4. How do women and men compare when it comes to defaulting on loans in the historical dataset?

  5. How many of the loan applicants have dependents based on the historical dataset?

  6. How do the incomes of those who are employed compare to those who are self-employed based on the historical dataset?

  7. Are applicants with a credit history more likely to default than those who do not have one?

  8. Is there a correlation between the applicant's income and the loan amount they applied for?
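The questions above map fairly directly onto pandas operations. The sketch below runs each one against a small made-up frame. The column names (`Gender`, `Dependents`, `Self_Employed`, `ApplicantIncome`, `LoanAmount`, `Credit_History`, `Loan_Status`), the `"Default"`/`"Paid"` labels, and every value are assumptions for illustration, not the bank's data.

```python
import pandas as pd

# Hypothetical toy stand-in for the historical (train) data.
# Column names, labels, and values are illustrative assumptions only.
train = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Male", "Male"],
    "Dependents": ["0", "1", "0", "2", "3+", "0"],
    "Self_Employed": ["No", "Yes", "No", "No", "Yes", "No"],
    "ApplicantIncome": [5000, 3000, 4000, 6000, 2500, 5000],
    "LoanAmount": [130.0, 100.0, None, 150.0, 90.0, 130.0],
    "Credit_History": [1.0, 0.0, 1.0, 1.0, 0.0, 1.0],
    "Loan_Status": ["Paid", "Default", "Paid", "Paid", "Default", "Paid"],
})
defaulted = train["Loan_Status"] == "Default"

# 1. Overview: number of records, number of fields, and data types.
n_records, n_fields = train.shape
dtypes = train.dtypes

# 2. Data quality: missing values per field and fully duplicated rows.
missing_per_field = train.isna().sum()
n_duplicates = int(train.duplicated().sum())

# 3. Distribution of loan statuses (as shares of all records).
status_share = train["Loan_Status"].value_counts(normalize=True)

# 4. Default rate for women and men.
default_rate_by_gender = defaulted.groupby(train["Gender"]).mean()

# 5. Applicants with at least one dependent (missing values are not counted).
has_dependents = train["Dependents"].notna() & (train["Dependents"] != "0")
n_with_dependents = int(has_dependents.sum())

# 6. Income of employed ("No") versus self-employed ("Yes") applicants.
income_by_employment = train.groupby("Self_Employed")["ApplicantIncome"].agg(
    ["count", "mean", "median"]
)

# 7. Default rate with (1.0) and without (0.0) a credit history.
default_rate_by_credit = defaulted.groupby(train["Credit_History"]).mean()

# 8. Pearson correlation between income and loan amount
#    (rows with a missing value in either field are ignored).
income_loan_corr = train["ApplicantIncome"].corr(train["LoanAmount"])

print(n_records, n_fields, n_duplicates, n_with_dependents)
print(status_share, default_rate_by_gender, default_rate_by_credit, sep="\n")
print(income_by_employment)
print(round(income_loan_corr, 3))
```

The same calls for questions 1 and 2 can be repeated on the test set. A correlation coefficient only measures linear association, so a scatter plot of income against loan amount is a useful companion to question 8.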

Languages

Language: Jupyter Notebook 100.0%