AML End to End Example

Project description

This project demonstrates an end-to-end pipeline for training a binary anti-money-laundering (AML) classifier based on Generative Adversarial Networks (GANs) and graph embeddings. The proposed solution consists of the following components:

Data ingestion - We use a sample of transaction data generated by AMLSim.
Feature store - We use the Hopsworks Feature Store to compute features, organize them into feature groups, and store them for downstream use, such as creating training datasets for model training and retrieving the features again.
Graph embeddings - We use the StellarGraph library to compute graph embeddings.
Anomaly detection model - We use a Keras implementation of adversarial anomaly detection that was adapted to tabular data.
Hyperparameter tuning - We use Maggy to run hyperparameter tuning experiments.
Model serving - We use the Hopsworks model server to predict anomalous transactions.
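
As a rough illustration of what the graph embeddings step operates on, the sketch below aggregates transactions into a directed, weighted graph using only the Python standard library. It assumes that accounts act as nodes and transactions as edges. The tuple layout (sender, receiver, amount) and the function name build_transaction_graph are hypothetical and do not come from AMLSim, StellarGraph, or the notebooks in this repository.

```python
from collections import defaultdict


def build_transaction_graph(transactions):
    """Aggregate transactions into a directed, weighted graph.

    Each transaction is a (sender, receiver, amount) tuple. This layout is
    a hypothetical simplification, not the actual AMLSim column schema.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sender, receiver, amount in transactions:
        # Sum the amounts of repeated transfers between the same pair.
        graph[sender][receiver] += amount
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {sender: dict(targets) for sender, targets in graph.items()}


# Made-up sample data for illustration only.
sample = [("A", "B", 100.0), ("A", "B", 50.0), ("B", "C", 75.0)]
graph = build_transaction_graph(sample)
print(graph)
```

An embedding library would then map each node of such a graph to a numeric vector; that step is not shown here.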

Demo dataset

A sample of transaction data is provided in the folder ./demodata. It includes the files alert_transactions.csv, party.csv and transactions.csv.
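
For a quick look at the demo files outside Hopsworks, a minimal sketch using only the standard library might look like the following. The helper name load_demo_tables is hypothetical, and the sketch makes no assumption about column names: it simply uses whatever header row each file has.

```python
import csv
from pathlib import Path

# File names as listed in this README.
DEMO_FILES = ("alert_transactions.csv", "party.csv", "transactions.csv")


def load_demo_tables(folder="./demodata"):
    """Read each demo CSV into a list of dicts keyed by its header row."""
    tables = {}
    for name in DEMO_FILES:
        with open(Path(folder) / name, newline="") as handle:
            # Key each table by the file name without its extension.
            tables[Path(name).stem] = list(csv.DictReader(handle))
    return tables
```

Calling load_demo_tables() from the repository root returns a dictionary with the keys alert_transactions, party and transactions. The notebooks themselves ingest the data through the Hopsworks Feature Store rather than through a helper like this.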

Anomaly detection model

A Keras implementation of adversarial anomaly detection is provided in the folder ./adversarialaml. To use it, install it as a Python library from https://github.com/logicalclocks/AMLend2end.git.
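
One common way to install a Python library from a Git URL is pip's git+ prefix. The sketch below builds such a command for the URL above. It assumes the repository is packaged so that pip can install it directly, which this README does not confirm, and the helper name pip_install_command is hypothetical.

```python
import sys

REPO_URL = "https://github.com/logicalclocks/AMLend2end.git"


def pip_install_command(repo_url=REPO_URL):
    """Build a pip command that installs a package from a Git URL."""
    # Using sys.executable targets the interpreter that runs this script.
    return [sys.executable, "-m", "pip", "install", f"git+{repo_url}"]


command = pip_install_command()
print(" ".join(command))
# To actually run the install (requires network access and git):
#   import subprocess
#   subprocess.check_call(command)
```

Running pip through sys.executable keeps the installation in the same environment as the notebook kernel or script that calls it.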

End to End pipeline

To complete this tutorial, clone this repository into your Hopsworks project.

Jupyter notebooks step by step

Run the Jupyter notebooks in the following order:

1_transaction_feature_engineering_ingestion.ipynb
2_prep_training_dataset_for_embeddings.ipynb
3_maggy_node_embeddings.ipynb
4_compute_node_embeddings.ipynb
5_predict_and_create_node_embeddings_fg.ipynb
6_create_anomaly_detection_td.ipynb
7_maggy_adversarial_aml.ipynb
8_train_adversarial_aml.ipynb
9_aml_model_server.ipynb
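
The numeric prefix of each notebook encodes its position in the pipeline. If you list the files programmatically, sort on that prefix as an integer, because plain alphabetical sorting would place a two-digit step before step 2. The sketch below shows one way to do this. The helper name notebook_order is hypothetical, and 10_extra_step.ipynb is a made-up file name used only to show the difference; it is not part of this repository.

```python
import re


def notebook_order(filenames):
    """Return notebook file names sorted by their leading integer prefix."""

    def step(name):
        match = re.match(r"(\d+)_", name)
        if match is None:
            raise ValueError(f"no numeric prefix in {name!r}")
        return int(match.group(1))

    return sorted(filenames, key=step)


names = [
    "10_extra_step.ipynb",  # hypothetical, not in this repository
    "9_aml_model_server.ipynb",
    "1_transaction_feature_engineering_ingestion.ipynb",
    "2_prep_training_dataset_for_embeddings.ipynb",
]
ordered = notebook_order(names)
print(ordered)
```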
