Harry24k / adversarial-attacks-pytorch

PyTorch implementation of adversarial attacks [torchattacks]

Home Page:https://adversarial-attacks-pytorch.readthedocs.io/en/latest/index.html

[QUESTION] output format

cestwc opened this issue

Currently, the output is a single tensor (the adversarial images). However, we often need more information from the lengthy search process, e.g., at which PGD iteration an input first becomes adversarial.

More information can be useful, such as the loss suggested in #86. Running an attack can take a long time, and currently we have to run it again to see what happens if we change the settings slightly.

I understand that we can't endlessly add customized details that could slow down the program, but something like the model output objects from Hugging Face could be a good choice. Different attack methods, e.g., CW or APGD, could have different attributes in the output format, while always keeping the adversarial images as the first element.
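
The bookkeeping requested above can be sketched with a plain L-inf PGD loop that records, for each input, the first step at which the model misclassifies it, plus the loss at each step. This is a minimal sketch under stated assumptions, not the torchattacks implementation: `pgd_with_history` and its return values are hypothetical names, inputs are assumed to lie in [0, 1], and the extra forward pass per step adds runtime.

```python
import torch
import torch.nn as nn


def pgd_with_history(model, images, labels, eps=8/255, alpha=2/255, steps=10):
    """L-inf PGD that also returns per-step losses and first-success steps.

    Hypothetical helper for illustration; not part of torchattacks.
    """
    images = images.clone().detach()
    labels = labels.clone().detach()
    loss_fn = nn.CrossEntropyLoss()

    # Random start inside the eps-ball, clipped to the valid pixel range.
    adv = images + torch.empty_like(images).uniform_(-eps, eps)
    adv = torch.clamp(adv, 0, 1).detach()

    # -1 means "never misclassified within the step budget".
    first_success = torch.full(
        (images.shape[0],), -1, dtype=torch.long, device=images.device
    )
    losses = []  # loss measured at the start of each step

    for step in range(steps):
        adv.requires_grad_(True)
        loss = loss_fn(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]

        # Gradient-sign ascent step, then projection back into the eps-ball.
        adv = adv.detach() + alpha * grad.sign()
        delta = torch.clamp(adv - images, -eps, eps)
        adv = torch.clamp(images + delta, 0, 1).detach()

        # Extra forward pass: which inputs are misclassified after this step?
        with torch.no_grad():
            fooled = model(adv).argmax(dim=1) != labels
        newly_fooled = fooled & (first_success == -1)
        first_success[newly_fooled] = step
        losses.append(loss.item())

    return adv, first_success, losses
```

Returning the adversarial images first keeps the existing single-tensor usage easy to recover with `adv, *_ = pgd_with_history(...)`.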

Thank you for providing this good library.

Hi @cestwc, I don't quite understand what kind of output you are trying to get. Is it the progression from the original images to the adversarial images, or something else? 😀

commented

To clarify, the current output format as of April 11th, 2023, is a torch tensor of size [batch size, channels, H, W]. A suggested output format could carry additional information alongside this tensor, for example the adversarial images at each step, depending on the attack used.

To illustrate, consider the following use case examples:

Current:

adv_images = attack(images, labels)
print(adv_images.shape)

Suggested:

outputs = attack(images, labels)
print(outputs.adv_images.shape)
print(len(outputs.adv_images_at_each_step))

or

print(outputs['adv images'].shape)
print(len(outputs['adv images at each step']))

With the suggested format, the outputs object can carry more information, such as the adversarial images at each step.
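
One way such an output object could support both access styles shown above is a small dataclass. This is only a sketch: `AttackOutput` and its field names are hypothetical, the keys use underscores rather than the spaces in the example above, and the fields are typed as `Any` so the block runs without PyTorch.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional


@dataclass
class AttackOutput:
    """Hypothetical container; adv_images is always the first field."""
    adv_images: Any
    adv_images_at_each_step: Optional[List[Any]] = None
    losses: Optional[List[float]] = None

    def __getitem__(self, key):
        # String keys behave like a dict lookup; integer keys index the
        # non-None fields, so outputs[0] is always adv_images.
        if isinstance(key, str):
            return getattr(self, key)
        return self.to_tuple()[key]

    def to_tuple(self):
        # Skip fields the attack did not fill in.
        values = (getattr(self, f.name) for f in fields(self))
        return tuple(v for v in values if v is not None)


# Stand-in values instead of real tensors.
outputs = AttackOutput(adv_images=[[0.1, 0.2]], losses=[2.3, 1.7])
print(outputs.adv_images is outputs["adv_images"])
print(len(outputs.to_tuple()))
```

An attack-specific subclass (for CW or APGD, say) could add its own optional fields after these without changing what `outputs[0]` returns.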

Hi @Harry24k, what do you think of this feature?
I'm not sure whether this extra operation would slow down adversarial example generation or increase GPU memory usage.
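
The memory side of that question can be estimated with simple arithmetic: keeping one copy of the batch per step costs steps × batch size × C × H × W × bytes per value. The sketch below assumes float32 storage and, purely as an example, a batch of 128 CIFAR-sized images; `stored_steps_bytes` is a hypothetical helper, and it says nothing about the speed cost.

```python
def stored_steps_bytes(steps, batch_size, channels, height, width, bytes_per_value=4):
    """Bytes needed to keep one adversarial batch per step (float32 by default)."""
    return steps * batch_size * channels * height * width * bytes_per_value


# Assumed example: 10 PGD steps, 128 images of shape 3 x 32 x 32, float32.
extra = stored_steps_bytes(steps=10, batch_size=128, channels=3, height=32, width=32)
print(f"{extra / 2**20:.1f} MiB")  # 15.0 MiB
```

Moving each step's result to the CPU would keep this off the GPU, at the cost of a device-to-host copy per step.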

And @cestwc, if I implement it, it could work this way. What do you think?

atk = PGD(model, eps=8/255, alpha=2/255, steps=10, random_start=True)
atk.set_store_each_step_result()
adv_result = atk(images, labels)

The adv_result shape is [steps, batch_size, C, H, W].
The last entry, adv_result[-1], is the final batch of generated adversarial examples.
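
The proposed result layout can be illustrated independently of any attack: stacking one [N, C, H, W] tensor per step along a new leading dimension gives [steps, N, C, H, W]. The sketch below uses random stand-in tensors rather than real PGD iterates, and `stack_step_results` is a hypothetical helper, not part of torchattacks.

```python
import torch


def stack_step_results(per_step_images):
    """Stack a list of [N, C, H, W] tensors into [steps, N, C, H, W].

    Each tensor is detached and moved to the CPU first, so the stored
    history holds no autograd graph and no GPU memory.
    """
    return torch.stack([x.detach().cpu() for x in per_step_images], dim=0)


# Stand-in data: 10 steps of a batch of 4 RGB 32x32 images.
per_step = [torch.rand(4, 3, 32, 32) for _ in range(10)]
adv_result = stack_step_results(per_step)

print(adv_result.shape)     # torch.Size([10, 4, 3, 32, 32])
final_adv = adv_result[-1]  # same values as per_step[-1]
```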