Brunopaes / mooncake

Mooncake: Improving predictors for fraudulent transactions

Bachelor's final project.

This project is optimised for Python 3+.

This data science project analyzes several techniques for handling False Negatives (frauds classified as non-frauds) in an extremely imbalanced financial fraud dataset, with the goal of improving the predictors' performance.
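To make the target metric concrete, here is a minimal sketch of how false negatives and recall can be counted for a binary fraud classifier. The labels and the `confusion_counts` helper are hypothetical, made up for illustration, and are not taken from the project's code or data.

```python
# Hypothetical labels: 1 = fraud, 0 = non-fraud (illustrative values only).
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the fraud class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # frauds missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


tp, fp, fn, tn = confusion_counts(y_true, y_pred)

# Recall is the share of real frauds that were caught.
recall = tp / (tp + fn) if (tp + fn) else 0.0
accuracy = (tp + tn) / len(y_true)

print(f"false negatives: {fn}, recall: {recall:.2f}, accuracy: {accuracy:.2f}")
```

In this toy case accuracy looks acceptable while two of the three frauds are missed, which is why accuracy alone can be misleading on an imbalanced dataset.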


Dependencies

For developers, the Python requirements can be found in requirements.txt at the project's root. To install them, run the following command in your venv or Anaconda env:

pip install -r requirements.txt


Project's Structure

.
└── mooncake
    ├── data
    │   ├── entropy
    │   │   ├── tree_entropy_2020-03-04.png
    │   │   └── tree_entropy_2020-03-05.png
    │   ├── datasource.csv
    │   ├── validation-2020-02-27.csv
    │   └── validation-2020-03-24.csv
    ├── docs
    │   ├── confusion-matrix.xlsx
    │   ├── Ficha.pdf
    │   ├── PGT 04.17.docx
    │   ├── PGT 05.11.docx
    │   ├── PGT 05.23.docx
    │   ├── PGT 05.26.docx
    │   ├── PGT 05.28.docx
    │   ├── PGT 05.29.docx
    │   └── PGT 05.29.pdf
    ├── mooncake
    │   ├── __init__.py
    │   ├── helpers.py
    │   ├── models.py
    │   ├── plotting.ipynb
    │   ├── ros.py
    │   ├── smote.ipynb
    │   └── smote.py
    ├── tests
    │   └── unittests
    │       ├── __init__.py
    │       ├── ...
    │       └── test_helpers.py
    ├── .gitignore
    ├── LICENSE
    ├── README.md
    ├── requirements.txt
    └── setup.py
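
The file names ros.py and smote.py suggest that the project experiments with random oversampling (ROS) and SMOTE to rebalance the classes. That reading is an assumption based on the names alone. The sketch below shows plain random oversampling using only the standard library; the `random_oversample` function and the sample rows are hypothetical and are not the repository's implementation.

```python
import random


def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate randomly chosen minority rows until both classes have equal size.

    Hypothetical helper for illustration; assumes binary labels.
    """
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    majority_count = sum(1 for y in labels if y != minority)
    deficit = majority_count - len(minority_rows)

    # Nothing to do if there is no minority sample or the classes are balanced.
    if not minority_rows or deficit <= 0:
        return list(rows), list(labels)

    # Sample with replacement from the minority class.
    extra = [rng.choice(minority_rows) for _ in range(deficit)]
    return list(rows) + extra, list(labels) + [minority] * deficit


# Made-up transactions: one feature (amount), label 1 = fraud.
rows = [[10.0], [12.5], [9.9], [11.2], [950.0]]
labels = [0, 0, 0, 0, 1]

new_rows, new_labels = random_oversample(rows, labels)
print(new_labels.count(0), new_labels.count(1))
```

Oversampling is normally applied to the training split only, so that duplicated frauds do not leak into the validation data.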


License: MIT License


Languages

Jupyter Notebook 73.6%, Python 26.4%