Adversarial attacks are techniques that apply small, targeted perturbations to input samples in order to change a model's predictions and disrupt its operation. The topic is most popular in computer vision, but it also transfers to the time-series domain. Different machine learning models differ in their sensitivity to adversarial attacks, a property known as robustness.
In this project we compare the robustness of models on a binary time-series classification task against three adversarial strategies: DeepFool, SimBA, and BIM. Specifically, we train three state-of-the-art neural networks (LSTM, CNN, Transformer) with custom architectures on the FordA classification dataset, subject them to the attacks, and compare their robustness.
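To make the attack setting concrete, below is a minimal sketch of BIM (the Basic Iterative Method, i.e. iterated FGSM) on a toy differentiable model. The logistic "model" with weights `w`, `b` is a hypothetical stand-in written so the gradient can be computed by hand; in this repo the attacks target trained LSTM/CNN/Transformer networks via their framework gradients.

```python
import numpy as np

# Toy differentiable "model": logistic regression over a length-16 series.
# w and b are hypothetical stand-in parameters, not weights from this repo.
rng = np.random.default_rng(0)
w = rng.normal(size=16)
b = 0.1

def predict_proba(x):
    """Probability of the positive class."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def bim_attack(x, y, eps=0.1, alpha=0.02, n_iter=10):
    """BIM: repeated signed-gradient ascent steps on the loss,
    clipped to an L-infinity ball of radius eps around the input."""
    x_adv = x.copy()
    for _ in range(n_iter):
        p = predict_proba(x_adv)
        # Gradient of binary cross-entropy w.r.t. the input is (p - y) * w.
        grad = (p - y) * w
        x_adv = x_adv + alpha * np.sign(grad)     # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within eps of x
    return x_adv

x = rng.normal(size=16)
x_adv = bim_attack(x, y=1.0)
# The perturbation is bounded: max |x_adv - x| <= eps,
# and the true-class probability drops.
```

The same eps-ball clipping carries over unchanged when the gradient comes from a deep network instead of this closed-form expression.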
- Download the repo:
```shell
git clone https://github.com/ocenandor/ts_robustness.git
cd ts_robustness
```
- Build the Docker image:
```shell
docker build . -t ts_robustness
```
- Run the image (don't forget to use a GPU if available):
```shell
docker run --gpus device=0 -it --rm ts_robustness
```
- Run `main.py` to download the data, train the models, and report statistics on the models' robustness (~40 min on an L40 GPU):
```shell
python main.py
```
- `models` – directory with models' weights. To download our weights run:
```shell
bash models/download_weights.sh
```
- `configs` – directory with models' configuration files.
- `data` – directory with the dataset. To download the FordA dataset from source run:
```shell
bash data/downloadFordA.sh
```
tools:
- `train.py` – train a model from a config (positional argument: path to the config). For wandb logging, change the entity and project in the file:
```shell
python tools/train.py --dir models/ --no-wandb --data data/FordA configs/cnn_500.json
```
- `attack.py` – base script to test a model against an attack (positional arguments: config, weights, type of attack):
```shell
python tools/attack.py -s 0.5 --max_iter 50 --data data/FordA --scale 0.5 configs/cnn_500.json demo/cnn.pt deepfool
```
- `hypersearch.py` – script for tuning models' hyperparameters with wandb sweeps (change the entity and project in the file). Support for CLI arguments will be added in the future:
```shell
python tools/hypersearch.py
```
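Of the three attacks, SimBA is the only gradient-free one: it probes random orthogonal directions and keeps a perturbation only if it lowers the model's confidence in the true class. A minimal sketch, assuming a generic `predict_proba` callable (the linear toy model below is a hypothetical stand-in, not the repo's attack implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=16)  # hypothetical stand-in classifier weights

def predict_proba(x):
    """Stand-in for a trained model's probability of the true class."""
    return 1.0 / (1.0 + np.exp(-(x @ w)))

def simba_attack(x, eps=0.2, n_iter=100):
    """SimBA: for random coordinate directions q, try x +/- eps*q and
    keep the step only if the true-class probability decreases."""
    x_adv = x.copy()
    p_best = predict_proba(x_adv)
    dims = rng.permutation(len(x))  # random order of basis directions
    for i in range(min(n_iter, len(x))):
        q = np.zeros_like(x)
        q[dims[i]] = eps
        for delta in (q, -q):
            p_new = predict_proba(x_adv + delta)
            if p_new < p_best:  # keep only confidence-reducing steps
                x_adv, p_best = x_adv + delta, p_new
                break
    return x_adv

x = rng.normal(size=16)
x_adv = simba_attack(x)
# predict_proba(x_adv) <= predict_proba(x) by construction
```

Because no gradients are needed, the same loop applies unchanged to any black-box time-series classifier.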
Model / Dataset | FordA (length=500) |
---|---|
CNN | 86.80 (95% CI [85.94; 87.43]) |
LSTM | 79.56 (95% CI [79.50; 79.61]) |
Transformer | 75.96 (95% CI [75.76; 76.15]) |
Model / Dataset | FordA (length=500) | FordA (length=150) |
---|---|---|
CNN | 90.45 | 89.48 |
LSTM | 81.36 | 88.09 |
Transformer | 77.80 | 89.11 |
Model / Attack | DeepFool | SimBA | IFGSM (BIM) |
---|---|---|---|
CNN | ✔️ | ✔️ | ✔️ |
LSTM | ✔️ | ✔️ | ✔️ |
Transformer | ✔️ | ✔️ | ✔️ |
**FordA500**
Attack / Model | CNN | LSTM | Transformer |
---|---|---|---|
IFGSM (BIM) | 0.083 | 0.105 | 0.076 |
DeepFool | 0.059 | 0.179 | 0.047 |
SimBA | 1.066 | 1.037 | 1.011 |
**FordA150**
Attack / Model | CNN | LSTM | Transformer |
---|---|---|---|
IFGSM (BIM) | 0.119 | 0.156 | 0.045 |
DeepFool | 0.096 | 2.599e8 | 0.030 |
SimBA | 1.047 | 1.110 | 1.048 |