OrigamiSL / AFMF

AFMF: Time Series Anomaly Detection Framework With Modified Forecasting

Python 3.8 PyTorch 1.11.0 License CC BY-NC-SA

This is the original PyTorch implementation of AFMF from the following paper: [AFMF: Time Series Anomaly Detection Framework With Modified Forecasting], which has been submitted to Knowledge-Based Systems.

Components of AFMF



Figure 1. The overall algorithm flow of forecasting-based anomaly detection with AFMF. All components changed or added by AFMF are emphasized in red. $ \oplus $ refers to the concatenation operation.

Local Instance Normalization (LIN)

We propose Local Instance Normalization (LIN), which normalizes each window $ x_{w_1:w_2}^N $ separately (Equation 1). It replaces the conventional preprocessing procedure, which normalizes all data points with statistics of the entire dataset (Equation 2). $ [w_1,w_2] $ denotes the time span of a given window and $ [u_1,u_2] $ denotes the time span of the train subset. $ x_{w_1:w_2}^i $ refers to the observations of the $ i $-th variate within this window, and $ \hat{x}_{w_1:w_2}^i $ refers to the same observations after normalization.
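The contrast between the two preprocessing schemes can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's code: it assumes that both schemes are z-score normalizations (mean and standard deviation), which may differ from the exact statistics used in Equations 1 and 2. The function names, the `eps` constant and the toy series are hypothetical.

```python
import numpy as np

def global_normalize(data, train, eps=1e-8):
    """Conventional scheme (assumed z-score): statistics come from the train subset."""
    mean = train.mean(axis=0, keepdims=True)
    std = train.std(axis=0, keepdims=True)
    return (data - mean) / (std + eps)

def local_instance_normalize(window, eps=1e-8):
    """LIN-style scheme (assumed z-score): statistics come from the window itself.

    `window` has shape (length, N); each of the N variates is normalized
    with its own mean and standard deviation over the window.
    """
    mean = window.mean(axis=0, keepdims=True)
    std = window.std(axis=0, keepdims=True)
    return (window - mean) / (std + eps)

# Toy drifting series with N = 3 variates (hypothetical data).
rng = np.random.default_rng(0)
series = rng.normal(size=(200, 3)).cumsum(axis=0)

w1, w2 = 150, 170                      # time span [w_1, w_2] of one window
window = series[w1:w2 + 1]
normed = local_instance_normalize(window)
print(normed.shape)                    # (21, 3)
```

Because the statistics are recomputed per window, a slow drift in the series shifts the window mean rather than the normalized values, which is the property the per-window scheme is meant to provide.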



Lopsided Forecasting (LF)

As shown in Figure 2, the inputs and outputs of Lopsided Forecasting (LF) differ from the canonical forecasting format: (1) Inputs. Continuous variates are masked at the prediction timestamp; discrete variates are input in full, including their values at the prediction timestamp. (2) Outputs. Only continuous variates are predicted. Because the whole window, covering both the input and prediction parts, is already known in anomaly detection tasks, it is reasonable to use the values of discrete variates at the prediction timestamp as inputs. When combining LF with LIN, we do not apply LIN to discrete variates, since normalization would damage their implicit information.



Figure 2. Lopsided forecasting forecasts only the continuous variates $ \{C_{{w_2}}\}^{N_C} $, and its inputs additionally include the discrete variates at the prediction timestamp $ \{D_{{w_2}}\}^{N_D} $. The additional zeros $ \{0\}^{N_C} $ (red asterisk) concatenated with the continuous inputs $ \{C_{{w_1:w_2-1}}\}^{N_C} $ ensure the same length as the discrete inputs $ \{D_{{w_1:w_2}}\}^{N_D} $.
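The input and output layout described for LF can be sketched as follows. This is an illustrative sketch under stated assumptions, not the repository's data loader: the function name `build_lf_sample`, the index lists and the column order of the concatenated input are hypothetical.

```python
import numpy as np

def build_lf_sample(window, cont_idx, disc_idx):
    """Split one window of shape (L, N) into an LF input and target.

    The window covers timestamps w_1..w_2; its last row is the
    prediction timestamp w_2.
    """
    cont = window[:, cont_idx]          # continuous variates, shape (L, N_C)
    disc = window[:, disc_idx]          # discrete variates, shape (L, N_D)

    target = cont[-1].copy()            # only continuous variates are predicted
    cont_in = cont.copy()
    cont_in[-1] = 0.0                   # zeros replace continuous values at w_2
    # Continuous and discrete parts now share the same length L.
    inputs = np.concatenate([cont_in, disc], axis=1)
    return inputs, target

# Toy window: L = 5 timestamps, N = 4 variates (2 continuous, 2 discrete).
window = np.arange(20, dtype=float).reshape(5, 4)
inputs, target = build_lf_sample(window, cont_idx=[0, 1], disc_idx=[2, 3])
print(inputs[-1])   # [ 0.  0. 18. 19.]
print(target)       # [16. 17.]
```

The last input row keeps the discrete values at the prediction timestamp while the continuous values are zeroed, matching the zero padding marked by the red asterisk in Figure 2.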

Progressive Adjacent Masking (PAM)

In practice, our proposed Progressive Adjacent Masking (PAM) is a progressive algorithm that avoids having to decide the number of adjacent elements to mask. Its pseudo-code is shown in Algorithm 1. $ N $ refers to the number of variates. $ A_{w_2} $ is an anomaly detected at $ w_2 $ from the prediction error ($ S_{w_2}^N $) and the anomaly threshold ($ \alpha $) given by a certain forecasting network. $ \beta^N $ is the hyper-parameter that initializes the decline ratio, which determines whether the drop in prediction error is distinct enough to confirm that the new prediction error after masking is more reliable.



Requirements

  • python == 3.8.8
  • numpy == 1.20.1
  • pandas == 1.2.4
  • scipy == 1.9.0
  • scikit_learn == 0.24.1
  • torch == 1.11.0
  • torch-cluster == 1.6.0
  • torch-geometric == 2.1.0.post1
  • torch-scatter == 2.0.9
  • torch-sparse == 0.6.15
  • torch-spline-conv == 1.2.1

Dependencies can be installed using the following command:

pip install -r requirements.txt

Data

SMD, MSL, SMAP and PSM datasets were acquired at datasets, and SWaT and WADI can be requested at iTrust. MBA, UCR and NAB were acquired at TranAD datasets, and MSDS can be requested at zenodo. Pruned and remedied {SMD, MSL, SMAP} were acquired at TranAD datasets.

Data Preparation

There are several versions of SWaT and WADI. For SWaT, we use the version SWaT.A1 & A2_Dec 2015; the train subset is SWaT_Dataset_Normal_v1.xlsx and the test subset is SWaT_Dataset_Attack_v0.xlsx. For WADI, we use the version WADI.A2_19 Nov 2019; the train subset is WADI_14days_new.csv and the test subset is WADI_attackdataLABLE.csv. After you acquire the raw data of all datasets, place each of them in its corresponding folder under ./AFMF/data. You will then get the folder tree shown below:

|-data
| | preprocess.py
| | data_loader.py
| |-MBA
| | |-labels.xlsx
| | |-test.xlsx
| | |-train.xlsx
| |
| |-MSDS
| | |-concurrent_data
| | | |-logs
| | | |-logs_aggregated_concurrent.csv
| | | |-metrics
| | | | |-wally113_metrics_concurrent.csv
| |
| |-MSL
| | |-test
| | | |-C-1.npy
| | |-train
| | | |-C-1.npy
| | |-labeled_anomalies.csv
| | |-MSL_test.npy
| | |-MSL_test_label.npy
| | |-MSL_train.npy
| |
| |-NAB
| | |-ec2_request_latency_system_failure.csv
| | |-labels.json
| |
| |-PSM
| | |-test.csv
| | |-test_label.csv
| | |-train.csv
| |
| |-SMAP
| | |-test
| | | |-P-1.npy
| | |-train
| | | |-P-1.npy
| | |-labeled_anomalies.csv
| | |-SMAP_test.npy
| | |-SMAP_test_label.npy
| | |-SMAP_train.npy
| |
| |-SMD
| | |-labels
| | | |-machine-1-1.txt
| | |-test
| | | |-machine-1-1.txt
| | |-train
| | | |-machine-1-1.txt
| | |-SMD_test.npy
| | |-SMD_test_label.npy
| | |-SMD_train.npy
| |
| |-SWaT
| | |-SWaT_Dataset_Attack_v0.xlsx
| | |-SWaT_Dataset_Normal_v1.xlsx
| |
| |-UCR
| | |-137_UCR_Anomaly_InternalBleeding18_2300_4485_4587.txt
| |
| |-WADI
| | |-WADI_14days_new.csv
| | |-WADI_attackdataLABLE.csv
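
Before running the preprocessing script, the layout above can be checked with a short script. The sketch below is not part of the repository; the helper name `missing_raw_files` is hypothetical, and it only checks the SWaT and WADI files named in this section.

```python
from pathlib import Path

# Raw files named in this section, relative to the data folder.
EXPECTED_RAW_FILES = [
    "SWaT/SWaT_Dataset_Normal_v1.xlsx",
    "SWaT/SWaT_Dataset_Attack_v0.xlsx",
    "WADI/WADI_14days_new.csv",
    "WADI/WADI_attackdataLABLE.csv",
]

def missing_raw_files(data_root, expected=EXPECTED_RAW_FILES):
    """Return the expected raw files that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).is_file()]

# Report what is still missing under the data folder used in this README.
missing = missing_raw_files("./AFMF/data")
if missing:
    print("Missing raw files:")
    for rel in missing:
        print("  " + rel)
else:
    print("All expected SWaT/WADI raw files are in place.")
```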

Then run ./AFMF/data/preprocess.py to preprocess the raw data. Only the raw data of SWaT, WADI and MSDS are preprocessed. We do not change any of their values; we only remove useless information, e.g., blanks. Following TranAD, we remove the variates {'load.cpucore', 'load.min1', 'load.min5', 'load.min15'} from MSDS. Variates are renamed to make variate classification in Lopsided Forecasting (LF) more convenient. After ./AFMF/data/preprocess.py runs successfully, you will obtain the following folder tree:

|-data
| | preprocess.py
| | data_loader.py
| |-MBA
| | |-labels.xlsx
| | |-test.xlsx
| | |-train.xlsx
| |
| |-MSDS
| | |-concurrent_data
| | | |-logs
| | | |-logs_aggregated_concurrent.csv
| | | |-metrics
| | | | |-wally113_metrics_concurrent.csv
| | |-labels.csv
| | |-test.csv
| | |-train.csv
| |
| |-MSL
| | |-labels
| | | |-C-1.npy
| | |-test
| | | |-C-1.npy
| | |-train
| | | |-C-1.npy
| | |-labeled_anomalies.csv
| | |-MSL_test.npy
| | |-MSL_test_label.npy
| | |-MSL_train.npy
| |
| |-NAB
| | |-ec2_request_latency_system_failure.csv
| | |-labels.json
| |
| |-PSM
| | |-test.csv
| | |-test_label.csv
| | |-train.csv
| |
| |-SMAP
| | |-labels
| | | |-P-1.npy
| | |-test
| | | |-P-1.npy
| | |-train
| | | |-P-1.npy
| | |-labeled_anomalies.csv
| | |-SMAP_test.npy
| | |-SMAP_test_label.npy
| | |-SMAP_train.npy
| |
| |-SMD
| | |-labels
| | | |-machine-1-1.txt
| | |-test
| | | |-machine-1-1.txt
| | |-train
| | | |-machine-1-1.txt
| | |-SMD_test.npy
| | |-SMD_test_label.npy
| | |-SMD_train.npy
| |
| |-SWaT
| | |-Attack.csv
| | |-Normal.csv
| | |-SWaT_Dataset_Attack_v0.xlsx
| | |-SWaT_Dataset_Normal_v1.xlsx
| |
| |-UCR
| | |-137_UCR_Anomaly_InternalBleeding18_2300_4485_4587.txt
| |
| |-WADI
| | |-Attack.csv
| | |-Normal.csv
| | |-WADI_14days_new.csv
| | |-WADI_attackdataLABLE.csv

You may manually delete raw data of SWaT, WADI and MSDS if you want.

Baseline

We reran all experiments for the baselines using their default experiment settings. The only change to their projects is that we replaced their threshold selection approach with that of Anomaly Transformer. The origins of their source code are given below:

Baseline | Window | Source Code Origin
MERLIN | {10, 50, 100} | MERLIN
DAGMM | 5 | DAGMM
GANF | 60 | GANF
DeepSVDD | 100 | DeepSVDD
DGHL | 64 | DGHL
COUTA | 100 | COUTA
Anomaly Transformer | 100 | Anomaly Transformer
TranAD | 10 | TranAD
CAE-M | 5 | CAE-M
MTAD-GAT | 100 | MTAD
GDN | 128 | GDN
GTA | 60 | GTA
CAT | 64 | CAT

Usage

Commands for training and testing models combined with AFMF on all datasets are in ./scripts/<model>.sh.

For more information on the parameters, please refer to main.py.

We provide a complete command for training and testing models combined with AFMF:

python -u main.py --model <model> --data <data> --root_path <root_path> --input_len <input_len> --variate <variate> --out_variate <out_variate> --learning_rate <learning_rate> --dropout <dropout> --batch_size <batch_size> --train_epochs <train_epochs> --itr <itr> --anomaly_ratio <anomaly_ratio> --retrain --detection_adjustment --drop <drop> --thresh <thresh> --data_process --LIN

The parameters used for training and testing the model are described in more detail below:

Parameter name | Description of parameter
model | The model used in the experiment, combined with AFMF
data | The dataset name
root_path | The root path of the data file
data_path | The data file name
checkpoints | Location of model checkpoints
input_len | Input sequence length of the model
variate | Number of input variates
out_variate | Number of output variates
dropout | Dropout
itr | Number of times the experiment is repeated
train_epochs | Train epochs
batch_size | The batch size of training input data
patience | Early stopping patience
learning_rate | Optimizer learning rate
anomaly_ratio | The proportion for threshold determination
retrain | Whether to train the model
partial_train | Whether to use a partial train subset
partial_ratio | The proportion of the train subset used
detection_adjustment | Whether to use detection_adjustment
adjust_k | The proportion of threshold used in the adjustment strategy PA%K
drop | Loop variate k
thresh | Decline ratio
data_process | Whether to preprocess data
LIN | Whether to use local instance normalization
load_anomaly | Whether to load anomaly
save_predictions | Whether to save prediction results
save_mses | Whether to save mses
reproducible | Whether to make results reproducible
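
To make the roles of anomaly_ratio and detection_adjustment more concrete, here is a minimal sketch of how such options are commonly implemented. It assumes that anomaly_ratio is a percentage used in a percentile rule, as in Anomaly Transformer's threshold selection, and that detection_adjustment refers to the widely used point-adjustment strategy; main.py may implement either differently. The function names and toy arrays are hypothetical.

```python
import numpy as np

def threshold_from_ratio(train_scores, test_scores, anomaly_ratio):
    """Assumed percentile rule: about anomaly_ratio % of scores exceed the threshold."""
    combined = np.concatenate([train_scores, test_scores])
    return np.percentile(combined, 100 - anomaly_ratio)

def point_adjust(pred, label):
    """Assumed point adjustment: if any point of a true anomaly segment
    is detected, the whole segment counts as detected."""
    pred = pred.copy()
    n = len(label)
    i = 0
    while i < n:
        if label[i] == 1:
            j = i
            while j < n and label[j] == 1:   # find the end of this segment
                j += 1
            if pred[i:j].any():
                pred[i:j] = 1
            i = j
        else:
            i += 1
    return pred

# Toy anomaly scores (hypothetical): 100 evenly spaced values.
train_scores = np.arange(50.0)
test_scores = np.arange(50.0, 100.0)
thr = threshold_from_ratio(train_scores, test_scores, anomaly_ratio=1.0)

label = np.array([0, 1, 1, 1, 0, 0, 1, 1, 0])
pred = np.array([0, 0, 1, 0, 0, 1, 0, 0, 0])
adjusted = point_adjust(pred, label)
print(adjusted)   # [0 1 1 1 0 1 0 0 0]
```

In the toy example the first true segment is fully credited because one of its points was flagged, the second segment stays undetected, and the false positive at index 5 is left unchanged.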

Results

The experiment parameters of each model on each dataset are given in the <model>.sh files in the directory ./scripts/. You can use these parameters as a reference for your experiments, and you can also adjust them to obtain better results.



Figure 3. Quantitative results, i.e., P, R, F1 and AUC (as %), under five flawed benchmarks



Figure 4. Results of five forecasting networks without ('-wo-') and with ('-w-') AFMF

About

License:Apache License 2.0

