HasnaeTalibi / Credit_card_fraud_detection_ULB_with_Apache_Spark_on_Databricks

A credit card fraud detection model built with Spark and LightGBMClassifier in the Databricks runtime environment, using the dataset provided by the Machine Learning Group at Université libre de Bruxelles (ULB).


Building a credit card fraud detection model using Spark on Databricks with the dataset provided by ULB

In this project, I implemented a credit card fraud detection model using Spark and LightGBMClassifier in the Databricks runtime environment, using the dataset provided by the Machine Learning Group at Université libre de Bruxelles (ULB). The dataset has roughly 300,000 rows (284,807 transactions in the public release) and 31 variables describing transactions made by European credit card holders. Of these, 28 are numeric variables derived by applying Principal Component Analysis to undisclosed original features. The remaining three are the transaction amount, the time of the transaction in seconds relative to the first transaction, and the class of the transaction, which indicates whether it is genuine or fraudulent.
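The column layout described above can be sketched in PySpark as follows. This is a minimal sketch and not the notebook's actual code. The column names (`Time`, `V1` to `V28`, `Amount`, `Class`) follow the public CSV release of the dataset. The helper names `build_pipeline` and `train`, the 80/20 split, and the use of the SynapseML package for `LightGBMClassifier` are assumptions made for illustration.

```python
# Column layout of the public ULB credit card fraud CSV:
# Time, V1..V28 (PCA components), Amount, and the Class label.
PCA_COLUMNS = [f"V{i}" for i in range(1, 29)]
FEATURE_COLUMNS = ["Time"] + PCA_COLUMNS + ["Amount"]
LABEL_COLUMN = "Class"  # 1 = fraudulent, 0 = genuine


def build_pipeline():
    """Assemble the 30 features into a vector and attach a LightGBM classifier.

    Imports are deferred so the column definitions above can be used
    on a machine without Spark installed.
    """
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    # Assumption: SynapseML is installed on the cluster. Older runtimes
    # expose the same class as mmlspark.lightgbm.LightGBMClassifier.
    from synapse.ml.lightgbm import LightGBMClassifier

    assembler = VectorAssembler(inputCols=FEATURE_COLUMNS, outputCol="features")
    classifier = LightGBMClassifier(
        objective="binary",
        featuresCol="features",
        labelCol=LABEL_COLUMN,
    )
    return Pipeline(stages=[assembler, classifier])


def train(spark, csv_path):
    """Read the CSV, hold out 20% of the rows, and fit the pipeline.

    Hypothetical helper: the split ratio and seed are illustrative only.
    """
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    train_df, test_df = df.randomSplit([0.8, 0.2], seed=42)
    model = build_pipeline().fit(train_df)
    return model, model.transform(test_df)
```

On Databricks, `train(spark, path_to_csv)` would be called with the notebook's built-in `spark` session and the location where the CSV was uploaded.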

You may also refer to the accompanying notebook on Kaggle, which was used to find optimal hyperparameter values for the LightGBMClassifier that trains the model inside Spark in this project.
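If the Kaggle notebook tunes the native LightGBM Python package, its parameter names are snake_case (`num_leaves`, `learning_rate`), while the Spark estimator uses camelCase (`numLeaves`, `learningRate`). The sketch below shows one way to carry tuned values over. The helper `snake_to_camel` is hypothetical, the values are placeholders rather than the tuned values from the notebook, and not every native parameter has a Spark counterpart with a mechanically derived name, so each converted name should be checked against the estimator's documentation.

```python
def snake_to_camel(name):
    """Convert a snake_case parameter name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


# Placeholder values only, NOT the tuned values from the Kaggle notebook.
tuned_params = {"num_leaves": 31, "learning_rate": 0.1, "lambda_l1": 0.0}

# Rename the keys so they match the Spark estimator's parameter names.
spark_params = {snake_to_camel(key): value for key, value in tuned_params.items()}
```

The resulting dictionary can then be unpacked into the estimator, for example `LightGBMClassifier(objective="binary", **spark_params)`.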



Languages

Language:Jupyter Notebook 100.0%