yunusulucay / supervised_anomaly_detection

Supervised anomaly detection. 9 models implemented.

SUPERVISED ANOMALY DETECTION

Anomaly detection is usually divided into three settings: supervised, semi-supervised and unsupervised. In this project I examined supervised anomaly detection, which is essentially a classification problem. There are some differences, though, such as the distance metric used (Mahalanobis distance instead of Euclidean distance) [1] and unbalanced data. Because the dataset in this project is balanced, I did not address those issues. The project has 3 steps: exploratory data analysis, feature engineering and modelling. I implemented 9 models: Logistic Regression, Stochastic Gradient Descent Classifier, Passive-Aggressive Algorithms, LightGBM, Extra Trees, Neural Networks, KNN, Naive Bayes and XGBOD (XGBClassifier from PyOD).

Figure: Difference between Euclidean distance and Mahalanobis distance depending on correlation [1]
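To make the distinction concrete, here is a minimal sketch that compares the two distances on correlated data. The data is synthetic and the helper names (`euclidean`, `mahalanobis`) are made up for illustration; none of it is taken from the project's notebooks.

```python
import numpy as np

# Hypothetical 2-D data with strongly correlated features
# (a synthetic stand-in, not the project's dataset).
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.9], [0.9, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=5000)

# Estimate the centre and the inverse covariance from the data.
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))


def euclidean(x, center):
    """Plain straight-line distance; ignores feature correlation."""
    return float(np.linalg.norm(x - center))


def mahalanobis(x, center, cov_inv):
    """Distance scaled by the inverse covariance of the data."""
    d = x - center
    return float(np.sqrt(d @ cov_inv @ d))


# Two points at (nearly) the same Euclidean distance from the centre.
along = np.array([2.0, 2.0])     # follows the correlation direction
against = np.array([2.0, -2.0])  # goes against the correlation direction

for name, p in [("along", along), ("against", against)]:
    print(name, round(euclidean(p, mu), 2), round(mahalanobis(p, mu, cov_inv), 2))
```

Both points are about equally far from the centre in Euclidean terms, but the point that goes against the correlation gets a much larger Mahalanobis distance, which is why it would be flagged as the more anomalous one.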

Based on the comparison of the models, the winner is XGBClassifier.
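A comparison of this kind can be sketched with scikit-learn as follows. This is only an illustration under assumptions: it uses a synthetic balanced dataset from `make_classification`, a subset of the listed models (only those that ship with scikit-learn), default-ish hyperparameters, and 5-fold cross-validated F1 as the score. The project's actual data, settings and metric may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, balanced stand-in for the project's dataset (an assumption).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with a StandardScaler.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SGD Classifier": make_pipeline(StandardScaler(), SGDClassifier(random_state=0)),
    "Extra Trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Naive Bayes": GaussianNB(),
}

# Mean 5-fold cross-validated F1 score per model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    for name, model in models.items()
}

# Print the models from best to worst.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")

best = max(scores, key=scores.get)
print("Best model on this synthetic data:", best)
```

The ranking printed here applies to the synthetic data only and says nothing about which model wins on the project's dataset.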

Imbalanced classification. Binary Classification. Sources.

If you have imbalanced data (for example, 90000 samples labeled 1 and 100 samples labeled 0), you can look at the links below.
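The class counts from the example above show why plain accuracy is misleading on such data. The sketch below uses a hypothetical "always predict the majority class" model, and the class-weight formula is the same heuristic scikit-learn applies for `class_weight="balanced"`. It is one possible remedy among several, not something this project uses.

```python
# Class counts from the example in the text.
n_label_1, n_label_0 = 90_000, 100
y_true = [1] * n_label_1 + [0] * n_label_0

# Hypothetical trivial model: always predict the majority class (1).
y_pred = [1] * len(y_true)

# Accuracy looks excellent even though the model learned nothing.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Per-class recall exposes the problem: no sample of class 0 is found.
recall_1 = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / n_label_1
recall_0 = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred)) / n_label_0
balanced_accuracy = (recall_0 + recall_1) / 2

# "Balanced" class weights: n_samples / (n_classes * n_samples_in_class).
n_samples, n_classes = len(y_true), 2
class_weight = {
    0: n_samples / (n_classes * n_label_0),
    1: n_samples / (n_classes * n_label_1),
}

print(f"accuracy          = {accuracy:.4f}")
print(f"recall (class 0)  = {recall_0:.4f}")
print(f"balanced accuracy = {balanced_accuracy:.4f}")
print(f"class weights     = {class_weight}")
```

Weights like these can be passed to many classifiers so that errors on the rare class cost more during training.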

Languages

Jupyter Notebook 99.7%, Python 0.3%