the loss function is not useful in the experiment?
xiajun112233 opened this issue · comments
Hello, I'm very interested in this paper. When I looked at main.py, I noticed that the three at_loss terms use .detach(), which takes them out of PyTorch's computation graph, so I deleted at1_loss, at2_loss, and at3_loss from the loss function. But when I run the changed code, the ASR is still very low, so I think the at_loss terms have no effect in the code. The training dataset in main.py is the clean dataset, not the backdoor dataset, so the NAD ASR is very low. However, train_badnets.py trains on the backdoor dataset, so the baseline ASR is high. I changed the training dataset in main.py to the backdoor dataset; unfortunately, NAD is not effective on the backdoor dataset.
Hi, thanks for your interest in our work. To verify the effectiveness of NAD, you could finetune the backdoored student with and without the NAD loss, i.e., setting at1_loss, at2_loss, and at3_loss all to be non-zero or all to zero, and compare the ASR under the two settings.
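To make the comparison concrete, here is a minimal sketch of how attention-distillation terms are typically combined with the CE loss, and how zeroing the beta weights disables them. This is an illustration of the general NAD-style formulation, not the repo's exact code; the tensor shapes, the `p=2` power, and the helper names are assumptions:

```python
import torch
import torch.nn.functional as F

def attention_map(fm, p=2):
    # Collapse the channel dimension of a feature map (B, C, H, W) into a
    # flattened spatial attention map (B, H*W), then L2-normalize it.
    am = fm.pow(p).mean(dim=1).flatten(1)
    return F.normalize(am, dim=1)

def at_loss(student_fm, teacher_fm):
    # Distance between student and teacher attention maps. The teacher map
    # is detached so gradients flow only through the student; the student
    # map is NOT detached, so the AT terms still train the student.
    return (attention_map(student_fm) - attention_map(teacher_fm).detach()).pow(2).mean()

def total_loss(ce, at_losses, betas):
    # Setting all betas to 0 recovers plain finetuning with CE loss only.
    return ce + sum(b * l for b, l in zip(betas, at_losses))
```

Note that `.detach()` on the teacher's maps is expected and does not make the loss useless; it only stops gradients from reaching the (frozen) teacher.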
Thanks for providing the screenshot. It is clear that the NAD loss achieves a better erasing result (ASR decreases to 3.78%, compared to the result without NAD loss). By the way, the choice of trigger type, teacher model, and data augmentation techniques also leads to different erasing effects in distillation.
But when I run the training code without the NAD loss, the ASR results are also good, so I think the CE loss on the clean dataset gives random results; you can see the pictures below. Is retraining the backdoored model on the clean dataset alone good enough to defend against the backdoor attack? Thank you.
To be honest, it is not surprising that fine-tuning can effectively erase the BadNets attack; the erasing effect is probably attributable to the data augmentation techniques, i.e., padding, flipping, and Cutout, as they are highly related to the original trigger pattern. You can change the Cutout parameters to 1 hole with a smaller size of 9 or 4 to verify this observation. By the way, I think the adaptive attacks shown in Appendix K (Table 9) of our paper will be beneficial to your understanding of NAD.
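For reference, a sketch of a standard Cutout transform with the parameters suggested above. The parameter names `n_holes` and `length` follow the widely used Cutout reference implementation; check the repo's own utilities for the exact class it uses:

```python
import numpy as np

class Cutout:
    """Randomly zero out square patches of an image given as a (C, H, W)
    array. Sketch of the common Cutout implementation, not the repo's code."""
    def __init__(self, n_holes=1, length=9):
        self.n_holes = n_holes  # number of patches to mask
        self.length = length    # side length of each square patch

    def __call__(self, img):
        c, h, w = img.shape
        mask = np.ones((h, w), dtype=img.dtype)
        for _ in range(self.n_holes):
            # Pick a random center; clip the patch to the image borders.
            y, x = np.random.randint(h), np.random.randint(w)
            y1, y2 = max(0, y - self.length // 2), min(h, y + self.length // 2)
            x1, x2 = max(0, x - self.length // 2), min(w, x + self.length // 2)
            mask[y1:y2, x1:x2] = 0
        return img * mask  # mask broadcasts over the channel dimension
```

Shrinking `length` from a value comparable to the trigger size down to 9 or 4 tests whether the erasing effect comes from Cutout occasionally covering the trigger region.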
OK, thank you. Which parameters in the code should I change to use the adaptive attacks?
The simplest case is to change the location of the backdoor trigger (i.e., the BadNets trigger) from the bottom-right corner to the center of the image.
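The trigger-relocation idea above can be sketched as follows. This is an illustrative stamping function with hypothetical parameter names (`trigger_size`, `location`), not the repo's actual data-poisoning code, and it assumes a plain white-square BadNets trigger on an (H, W, C) uint8 image:

```python
import numpy as np

def add_badnets_trigger(img, trigger_size=3, location="bottom_right"):
    # Stamp a white square trigger onto a copy of the image, either in the
    # standard bottom-right corner or in the center (the simple adaptive
    # attack discussed above).
    img = img.copy()
    h, w = img.shape[:2]
    if location == "bottom_right":
        y0, x0 = h - trigger_size, w - trigger_size
    elif location == "center":
        y0, x0 = (h - trigger_size) // 2, (w - trigger_size) // 2
    else:
        raise ValueError(f"unknown location: {location}")
    img[y0:y0 + trigger_size, x0:x0 + trigger_size] = 255
    return img
```

Retraining the attack with the center location and then running the defense shows how sensitive the erasing effect is to where the trigger sits relative to the augmentations.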