FGSM Adversarial Attack on Malware Detection
- NTUST 1111 ML and Application for Cybersecurity, Final Project
- Members:
- M11115015 廖唯任
- M11115035 康帷晟
- M11152025 陳彥合
$adv\_x = x + \epsilon \cdot \operatorname{sign}\left(\nabla_x J(\theta, x, y)\right)$
- Taking an input image.
- Making predictions on the image using a trained model.
- Computing the loss of the prediction based on the true class label.
- Calculating the gradients of the loss with respect to the input image.
- Computing the sign of the gradient.
- Using the signed gradient to construct the output adversarial image.
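The steps above can be sketched with a minimal NumPy example. The tiny logistic-regression "detector", its random weights, and the flattened input are stand-ins for illustration only, not the project's actual CNN:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against a logistic-regression 'detector'.
    x: flattened input image in [0, 1]; y: true label (0/1); w, b: model params."""
    p = sigmoid(w @ x + b)                  # prediction on the input
    # Gradient of the binary cross-entropy loss w.r.t. the input x:
    # dJ/dx = (p - y) * w
    grad = (p - y) * w
    # Perturb in the direction of the sign of the gradient, keep pixels valid.
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

rng = np.random.default_rng(0)
w = rng.normal(size=16)
b = 0.0
x = rng.uniform(size=16)                    # stand-in for a grayscale image
adv = fgsm(x, y=1, w=w, b=b, eps=0.1)
print(np.max(np.abs(adv - x)))              # per-pixel perturbation <= eps
```

The clipping step mirrors the image setting: the adversarial example must stay a valid image while each pixel moves by at most ε.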
Steps for the whole final project
- Train a malware detector (binary classifier)
  - CNN model
  - Siamese network
- Generate adversarial examples (AEs) for each model
- Attack the models
  - input: AEs
  - result: accuracy declines
Malware Detector Training
- CNN model
  - Dataset
    - Training: 50/50
    - Testing: 50/50
  - Feature engineering
- Siamese network
  - Dataset
    - Training: 10/10
    - Testing: 90/90
  - Feature engineering
Adversarial Attack on CNN Model
- method: FGSM
- example
[Figure: ORIGIN_API_IMG + ε × noise = adversarial example]
Remain White Area AE Generation
- method: FGSM, while keeping the white area of the image unchanged
- example
[Figure: ORIGIN_API_IMG + ε × noise = adversarial example]
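A minimal sketch of the white-area-preserving variant, again with a toy gradient standing in for the real CNN's; the 0.9 threshold for deciding which pixels count as "white" is an assumption for illustration:

```python
import numpy as np

def fgsm_keep_white(x, grad, eps, white_thresh=0.9):
    """FGSM that leaves near-white pixels untouched.
    x: image in [0, 1]; grad: loss gradient w.r.t. x (supplied by the model)."""
    mask = x < white_thresh                 # perturb only non-white pixels
    noise = eps * np.sign(grad) * mask
    return np.clip(x + noise, 0.0, 1.0)

rng = np.random.default_rng(1)
x = rng.uniform(size=(8, 8))
x[0, :] = 1.0                               # a fully white row should stay fixed
grad = rng.normal(size=(8, 8))
adv = fgsm_keep_white(x, grad, eps=0.2)
print(np.allclose(adv[0, :], 1.0))          # → True: white area preserved
```

Masking the noise this way restricts the perturbation to the informative (dark) regions, which is consistent with the smaller accuracy drops reported for the "remain white area" attack.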
Experiment (attack CNN model)

| Epsilon | Accuracy (whole image) | Accuracy (remain white area) |
|---------|------------------------|------------------------------|
| 0.01    | 1                      | 1                            |
| 0.05    | 0.92                   | 1                            |
| 0.075   | 0.84                   | 0.96                         |
| 0.1     | 0.76                   | 0.93                         |
| 0.15    | 0.68                   | 0.92                         |
| 0.2     | 0.36                   | 0.82                         |
| 0.25    | 0                      | 0.76                         |
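The epsilon sweep behind the table can be sketched as a simple loop. Everything here is a toy stand-in (a linear model with perfectly separable labels, not the project's CNN); only the list of epsilon values is taken from the table:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
n, d = 200, 16
w = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = (X @ w > 0).astype(float)       # labels this linear model fits exactly

epsilons = [0.01, 0.05, 0.075, 0.1, 0.15, 0.2, 0.25]
accs = []
for eps in epsilons:
    p = sigmoid(X @ w)
    grad = (p - y)[:, None] * w     # logistic-loss gradient w.r.t. each input
    X_adv = X + eps * np.sign(grad) # FGSM perturbation, one step per sample
    acc = np.mean((sigmoid(X_adv @ w) > 0.5) == (y == 1))
    accs.append(acc)
print([round(float(a), 2) for a in accs])   # accuracy falls as eps grows
```

Even in this toy setup, accuracy decreases monotonically with ε, matching the trend in the table above.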
Adversarial Attack on Siamese network
- method: FGSM
- example
[Figure: ORIGIN_API_IMG + ε × noise = adversarial example]
Remain White Area AE Generation
- method: FGSM, while keeping the white area of the image unchanged
- example
[Figure: ORIGIN_API_IMG + ε × noise = adversarial example]
Experiment (attack Siamese network)

| Epsilon | Accuracy (whole image) | Accuracy (remain white area) |
|---------|------------------------|------------------------------|
| 0.01    | 1                      | 1                            |
| 0.05    | 0.98                   | 1                            |
| 0.075   | 0.74                   | 0.97                         |
| 0.1     | 0.14                   | 0.96                         |
| 0.15    | 0                      | 0.37                         |
| 0.2     | 0                      | 0.13                         |
| 0.25    | 0                      | 0.10                         |
| Model           | Without AEs | AEs  | Remain White Area AEs |
|-----------------|-------------|------|-----------------------|
| CNN             | 1           | 0.84 | 0.96                  |
| Siamese Network | 1           | 0.74 | 0.97                  |
- In our experiments, the CNN is more robust than the Siamese network against adversarial attacks.
- We were able to construct a feature-space adversarial attack on the malware classification models.
- In future work, we could extend the attack from feature space to problem space.