weiwei-liu / anomaly_detection

Detect anomalies in an embedded system log using an RNN with an attention layer.

Anomaly Detection Neural Network with Attention

This README has four main parts that briefly describe the workflow of training a recurrent neural network (RNN) with an attention layer to classify anomalous events in a sequence-based embedded software log: Data Loading and Preprocessing, Model Building, Model Training, and Results Analysis and Visualization. See the Python notebook for details.

Data loading and preprocessing

  • Load all 15 .csv data files and store them as pandas DataFrames.

Overview of the data

  • Group the dataframe by the class and event columns to count how often each event occurs under each class in every file.

| class | event | clean-01 | clean-02 | clean-03 | clean-04 | clean-05 | clean-06 | clean-07 | clean-08 | clean-09 | clean-10 | fifo-ls-01 | fifo-ls-02 | fifo-ls-sporadic | full-while | half-while |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| COMM | MSG_ERROR | 6.0 | 6.0 | 8.0 | 6.0 | 6.0 | 6.0 | 6.0 | 6.0 | 6.0 | 6.0 | 5715 | 5589.0 | 6006.0 | 509.0 | 422.0 |
| COMM | REC_MESSAGE | 17968.0 | 17969.0 | 17963.0 | 18135.0 | 18134.0 | 18147.0 | 18216.0 | 18213.0 | 18260.0 | 18347.0 | 65232 | 66973.0 | 65666.0 | 44802.0 | 45072.0 |
| COMM | REC_PULSE | 24710.0 | 24226.0 | 24173.0 | 24871.0 | 24849.0 | 24358.0 | 24390.0 | 24397.0 | 24442.0 | 24644.0 | 28312 | 28349.0 | 25631.0 | 39342.0 | 39529.0 |
| COMM | REPLY_MESSAGE | 17947.0 | 17950.0 | 17938.0 | 18098.0 | 18103.0 | 18131.0 | 18190.0 | 18180.0 | 18248.0 | 18329.0 | 59477 | 61336.0 | 59627.0 | 44202.0 | 44565.0 |
| COMM | SIGNAL | NaN | 1.0 | 2.0 | NaN | NaN | 1.0 | NaN | 1.0 | 1.0 | 2.0 | 37 | 36.0 | 39.0 | NaN | 1.0 |
| COMM | SND_MESSAGE | 18089.0 | 18077.0 | 18073.0 | 18234.0 | 18235.0 | 18247.0 | 18300.0 | 18286.0 | 18373.0 | 18447.0 | 65378 | 67122.0 | 65808.0 | 45149.0 | 45426.0 |
| COMM | SND_PULSE | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 882 | 884.0 | 943.0 | 11226.0 | 11289.0 |
| COMM | SND_PULSE_DIS | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 877 | 880.0 | 958.0 | NaN | NaN |
| COMM | SND_PULSE_EXE | 36701.0 | 48219.0 | 60157.0 | 72854.0 | 84809.0 | 96321.0 | 108339.0 | 120360.0 | 132390.0 | 144575.0 | 181312 | 193406.0 | 202078.0 | 172297.0 | 160365.0 |
| CONTROL | BUFFER | 2161.0 | 2158.0 | 2170.0 | 2224.0 | 2242.0 | 2243.0 | 2264.0 | 2278.0 | 2298.0 | 2329.0 | 4845 | 4938.0 | 4751.0 | 4326.0 | 4334.0 |

The table above shows that the clean and anomalous files differ considerably in the occurrence counts of several events. For example, the event COMM-SND_MESSAGE occurs about 18,000 times in each clean file, but about 45,000 to 67,000 times in the anomalous files. Comparing counts is not in itself an effective way to detect anomalous activity, but it gives a general picture of where in the data the anomalies may reside.
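The counting step can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the helper name `event_counts` and the toy dataframes are hypothetical, and only the column names `class` and `event` come from the table above.

```python
import pandas as pd

def event_counts(frames):
    """Count (class, event) occurrences per file, one column per file.

    `frames` maps a file label to a dataframe that has `class` and
    `event` columns (hypothetical helper, for illustration only).
    """
    counts = {
        name: df.groupby(["class", "event"]).size()
        for name, df in frames.items()
    }
    # Aligning the per-file counts on the (class, event) index leaves NaN
    # wherever an event never occurs in a file.
    return pd.concat(counts, axis=1)

# Toy stand-ins for two of the loaded log files.
frames = {
    "clean-01": pd.DataFrame({
        "class": ["COMM", "COMM", "CONTROL"],
        "event": ["SND_MESSAGE", "REC_MESSAGE", "BUFFER"],
    }),
    "full-while": pd.DataFrame({
        "class": ["COMM", "COMM", "COMM"],
        "event": ["SND_MESSAGE", "SND_MESSAGE", "SND_PULSE"],
    }),
}
table = event_counts(frames)
print(table)
```

An event that is missing from a file appears as NaN, and a column containing NaN is stored as floating point, which would explain why most counts in the table carry a trailing `.0`.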

Model Building

  • Load the encoder and decoder models

The architecture of this model is:

input event sequence --> encoder (GRU unit) --> attention layer --> decoder (GRU unit) --> output layer
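To make the role of the attention layer concrete, here is a minimal NumPy sketch of one attention step between encoder states and a decoder state. It is an assumption-laden illustration: the README does not say which scoring function the model uses, so dot-product scoring is assumed here, and the function name, dimensions, and random inputs are hypothetical. The actual layers are defined in model.py.

```python
import numpy as np

def dot_product_attention(encoder_states, decoder_state):
    """One attention step (dot-product scoring is an assumption).

    encoder_states: array of shape (Tx, d), one hidden state per input event.
    decoder_state:  array of shape (d,), the current decoder hidden state.
    Returns the context vector (d,) and the attention weights (Tx,).
    """
    scores = encoder_states @ decoder_state      # one score per input position
    scores = scores - scores.max()               # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over positions
    context = weights @ encoder_states           # weighted sum of encoder states
    return context, weights

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 8))  # Tx = 5 events, hidden size 8 (hypothetical)
decoder_state = rng.normal(size=8)
context, weights = dot_product_attention(encoder_states, decoder_state)
```

The weights sum to 1 over the 5 input positions, so the context vector is a weighted average of the encoder states that the decoder can use when predicting each output event.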

The input is a small segment of the log file, in this case 5 consecutive events, and the target output is the 5 consecutive events that immediately follow the input. The general idea is to train the proposed model to predict the events that follow a given input segment. Assuming that the event sequence patterns of the clean and anomalous logs differ, the prediction (test) accuracy should also differ when the same model and trained weights are applied to both.
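The input/target construction described above can be sketched as a sliding window over the encoded event sequence. This is only an illustration under stated assumptions: the function name `make_windows` is hypothetical, events are assumed to be already encoded as integer ids, and the default stride of 2 is inferred from the remark in the Results section that testing uses a stride of 5 "instead of 2".

```python
import numpy as np

def make_windows(events, tx=5, stride=2):
    """Split an event-id sequence into (input, target) pairs.

    Each input holds `tx` consecutive events and its target holds the
    next `tx` events. Hypothetical helper, not the repository's code.
    """
    events = np.asarray(events)
    if len(events) < 2 * tx:
        raise ValueError("need at least 2 * tx events to build one pair")
    starts = range(0, len(events) - 2 * tx + 1, stride)
    inputs = np.stack([events[s:s + tx] for s in starts])
    targets = np.stack([events[s + tx:s + 2 * tx] for s in starts])
    return inputs, targets

events = np.arange(20)  # toy stand-in for encoded event ids
x_train, y_train = make_windows(events, tx=5, stride=2)
x_test, y_test = make_windows(events, tx=5, stride=5)
```

With a stride smaller than the window length the training windows overlap, which yields more training pairs from the same log; a stride equal to the window length gives non-overlapping inputs.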

See model.py for the details of the encoder, attention, and decoder models.

Model Training

Check anomaly_detection_NN_train.ipynb for details.

Results

The next step is to predict results using the above model and the trained weights of each layer (saved in the sumitmodel_checkpoint folder).

The test inputs use the same event sequence length as the training data, Tx = 5, but a stride of 5 instead of 2.

All predicted results are saved to .npy files for further analysis.

  • Set anomaly criteria

As mentioned above, assuming that the event sequence patterns of the clean and anomalous logs differ, the prediction (test) accuracy should also differ when the same model and trained weights are applied to both.

In the code for this step, I use a sequence of 1000 events as one input sample, use the trained model to predict the output, and then compare the predicted output with the target values to get the misclassification rate.

After predicting outputs for all 10 clean files, I calculate the mean and standard deviation of the misclassification rate and set the criterion to (mean + 3 * standard_deviation).

Any 1000-event sequence with a misclassification rate higher than the criterion is deemed an anomalous segment.

In this case, any segment with a misclassification rate higher than 0.365 is classified as anomalous.
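The thresholding rule above can be sketched as follows. The function names and toy arrays are hypothetical, and the sketch uses NumPy's default population standard deviation (`ddof=0`); the README does not say which variant the notebook uses, so the exact threshold value may differ slightly.

```python
import numpy as np

def segment_error_rates(predicted, target, segment_len=1000):
    """Misclassification rate of each `segment_len`-event segment."""
    predicted = np.asarray(predicted).ravel()
    target = np.asarray(target).ravel()
    if predicted.shape != target.shape:
        raise ValueError("predicted and target must have the same length")
    n_segments = len(target) // segment_len
    usable = n_segments * segment_len          # drop the incomplete tail
    wrong = predicted[:usable] != target[:usable]
    return wrong.reshape(n_segments, segment_len).mean(axis=1)

def anomaly_threshold(clean_rates):
    """mean + 3 * standard deviation of the clean-file error rates."""
    clean_rates = np.asarray(clean_rates, dtype=float)
    return clean_rates.mean() + 3.0 * clean_rates.std()  # ddof=0 is an assumption

def flag_anomalous(rates, threshold):
    """True for every segment whose error rate exceeds the threshold."""
    return np.asarray(rates) > threshold

# Toy data: 3 segments of 1000 events with 100, 0 and 500 wrong predictions.
target = np.zeros(3000, dtype=int)
predicted = target.copy()
predicted[:100] = 1
predicted[2000:2500] = 1

rates = segment_error_rates(predicted, target)
threshold = anomaly_threshold([0.1, 0.2, 0.3])  # made-up clean-file rates
flags = flag_anomalous(rates, threshold)
```

In the toy data only the third segment exceeds the threshold. For the real logs, the clean-file rates would come from the 10 clean files, which according to the text above gives a threshold of 0.365.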

  • Visualize anomalous events

[Figure: Normal sequences]

[Figure: Abnormal sequences A]

[Figure: Abnormal sequences B]
