SenZmaKi / NyakaMwizi

A credit card fraud detection machine learning model

Home Page: https://youtu.be/dQw4w9WgXcQ

Introduction

NyakaMwizi is a machine learning model built to detect potentially fraudulent credit card transactions.

The dataset used contains 1.3M instances and 23 features.

Table of Contents

  1. How to test out the model
  2. Visual Insights
  3. Final Model Performance

How to test out the model

Ensure you have Python 3.11 and Git installed.

Open a terminal and run the following commands.

  1. Set everything up.
  • Linux/Mac
git clone https://github.com/SenZmaKi/NyakaMwizi && cd NyakaMwizi && python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
  • Windows (Command Prompt)
git clone https://github.com/SenZmaKi/NyakaMwizi && cd NyakaMwizi && python -m venv .venv && .venv\Scripts\activate && pip install -r requirements.txt
  2. Test the model.
python test_model.py

Visual Insights

These are insights I gained while exploring the dataset with graphs and computations.

They are listed in hierarchical order.

Time

  • The time bracket in which the most fraudulent transactions occurred is between 10:00 PM and 4:00 AM

Graph for frauds

image

Graph for non frauds

image
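One way to check the time observation above is to bucket fraudulent transactions by hour of day and measure the share that falls inside the 10:00 PM to 4:00 AM window. The sketch below is a minimal illustration using only the standard library. The row layout (a timestamp string plus a 0/1 fraud flag), the function names and the sample rows are assumptions made for this example and are not taken from the repository or its dataset.

```python
from collections import Counter
from datetime import datetime


def is_night(hour: int, start: int = 22, end: int = 4) -> bool:
    """Return True if `hour` lies in a window that wraps past midnight.

    The window is half-open: 22:00 is included, 04:00 is not.
    """
    return hour >= start or hour < end


def fraud_share_at_night(rows) -> float:
    """Share of fraudulent transactions that happened at night.

    `rows` is an iterable of (timestamp_string, is_fraud) pairs
    (a hypothetical layout chosen for this sketch).
    """
    fraud_hours = Counter(
        datetime.fromisoformat(ts).hour for ts, is_fraud in rows if is_fraud
    )
    total = sum(fraud_hours.values())
    night = sum(n for hour, n in fraud_hours.items() if is_night(hour))
    return night / total if total else 0.0


# Made-up rows for illustration only, not real data.
sample = [
    ("2020-01-01 23:15:00", 1),
    ("2020-01-02 02:40:00", 1),
    ("2020-01-02 13:05:00", 1),
    ("2020-01-02 03:59:00", 0),
]
print(fraud_share_at_night(sample))
```

Running the same function on the non-fraud rows gives a baseline to compare against, which mirrors the two graphs above.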

Amount

  • Contrary to what you'd expect, most fraudulent transactions didn't involve exorbitant amounts of money
  • Instead, they involved both reasonably large amounts (e.g. 30k) and average amounts

Graph for frauds

image

Graph for non frauds

image
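A simple numeric companion to the amount graphs is to compare the median and an upper percentile of the amounts in each class. The sketch below assumes the amounts are available as plain lists of numbers. The `summarize` helper and the sample values are hypothetical and only illustrate the comparison.

```python
from statistics import median, quantiles


def summarize(amounts):
    """Return the median and 95th percentile of a list of amounts.

    method="inclusive" treats the list as the full population, so the
    percentile never exceeds the largest observed amount.
    """
    cuts = quantiles(amounts, n=20, method="inclusive")  # 19 cut points
    return {"median": median(amounts), "p95": cuts[-1]}


# Made-up amounts for illustration only, not real data.
fraud_amounts = [120.0, 300.0, 450.0, 800.0, 30000.0]
legit_amounts = [5.0, 20.0, 45.0, 60.0, 95.0]

print("fraud:", summarize(fraud_amounts))
print("non-fraud:", summarize(legit_amounts))
```

A median close to ordinary transaction sizes combined with a much higher 95th percentile would match the pattern described above: mostly average amounts, with some reasonably large ones.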

Categories

  • Certain transaction categories appeared far more prone to fraud, specifically categories 4 and 11

Graph for frauds

image

Graph for non frauds

image

Age

  • The age bracket with the most fraudulent transactions is 30 to 70
  • But the same can be said for non-fraudulent transactions, so this insight may be a misinterpretation

Graph for frauds

image

Graph for non frauds

image
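The caveat above arises because raw counts mostly reflect how many transactions each age bracket makes overall. Comparing the fraud rate per bracket (frauds divided by all transactions in that bracket) removes that effect. The sketch below is one way to compute it. The `(age, is_fraud)` row layout, the 10-year bracket width and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def fraud_rate_by_bracket(rows, width: int = 10):
    """Map each age bracket (keyed by its lower bound) to its fraud rate.

    `rows` is an iterable of (age, is_fraud) pairs, a hypothetical layout.
    """
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for age, is_fraud in rows:
        bracket = (age // width) * width  # e.g. 34 -> 30
        totals[bracket] += 1
        frauds[bracket] += int(bool(is_fraud))
    return {b: frauds[b] / totals[b] for b in sorted(totals)}


# Made-up rows for illustration only, not real data.
sample = [(25, 0), (34, 1), (36, 0), (38, 0), (52, 1), (57, 0), (75, 1)]
print(fraud_rate_by_bracket(sample))
```

If the rates turn out roughly equal across brackets, age carries little signal on its own, even though the counts peak between 30 and 70. The same check applies to any grouping where frauds and non-frauds show a similar shape.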

Longitude and latitude

  • Some areas on the scatter matrix seemed to experience more fraudulent transactions

Scatter matrix for frauds

image

Scatter matrix for non frauds

image

Job

  • Specific jobs experienced more fraudulent transactions, e.g. job 300
  • But this behaviour is in line with what is observed for non-fraudulent transactions, so it may be another misinterpretation

Graph for frauds

image

Graph for non frauds

image

Final Model Performance

  • Model: DecisionTreeClassifier
  • Precision: 82.88%
  • Recall: 17.12%
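For readers unfamiliar with the two metrics: precision is the fraction of transactions flagged as fraud that really are fraud, and recall is the fraction of actual frauds that get flagged. The sketch below computes both from label lists in plain Python. It assumes fraud is the positive class (label 1), which the README does not state explicitly, and the sample labels are made up. It is not the repository's evaluation code.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall), treating label 1 (fraud) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # frauds caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # frauds missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels for illustration only, not model output.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
print(precision_recall(y_true, y_pred))
```

Read this way, the figures above describe a model whose fraud flags are usually correct but which misses most fraudulent transactions.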

Languages

Jupyter Notebook 99.4%, Python 0.6%