lukemelas / EfficientNet-PyTorch

A PyTorch implementation of EfficientNet

efficientnet-b8 and AdvProp

seefun opened this issue

With AdvProp, EfficientNet achieves higher scores on ImageNet. Would you update the repo with the new checkpoints?
The paper: https://arxiv.org/pdf/1911.09665.pdf
https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet

Awesome, thanks for posting this. It's on the way.

@lukemelas maybe your current code in the "tf_to_pytorch" dir will work?

@lukemelas any update on that?

Apologies for the delay on this (I had final exams this past week). Coming soon.

Hi @lukemelas , just wondering if you had the chance to work on that one.

Sorry this took forever. It should be in now :)

Let me know if you have any issues.

Closing this, but feel free to re-open it if you have any issues/questions.

Dear all,

I am using https://colab.research.google.com/drive/1Jw28xZ1NJq4Cja4jLe6tJ6_F5lCzElb4
Why is the efficientnet-b0 result with advprop=True so bad?
Thank you.

Suryadi

Loaded pretrained weights for efficientnet-b0, advprop=False
-----
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca           (83.44%)
brown bear, bruin, Ursus arctos                                             (0.62%)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens         (0.60%)
ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus                 (0.44%)
Arctic fox, white fox, Alopex lagopus                                       (0.34%)

Loaded pretrained weights for efficientnet-b0, advprop=True
-----
wombat                                                                      (3.34%)
candle, taper, wax light                                                    (2.00%)
Angora, Angora rabbit                                                       (1.87%)
schipperke                                                                  (1.86%)
hog, pig, grunter, squealer, Sus scrofa                                     (1.60%)

Did you use the advprop image preprocessing or the usual preprocessing? See https://github.com/lukemelas/EfficientNet-PyTorch/blob/master/examples/imagenet/main.py#L211. That's the reason advprop is not enabled by default. Let me know if it still doesn't work and I can look into it.
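
For reference, the advprop checkpoints expect inputs scaled to [-1, 1] instead of the usual ImageNet mean/std normalization. A minimal sketch of how the preprocessing can be selected (paraphrasing the linked main.py; treat the exact transforms as an assumption and check your version):

from torchvision import transforms

advprop = True  # must match the weights you loaded

if advprop:
    # AdvProp checkpoints were trained with inputs scaled to [-1, 1]
    normalize = transforms.Lambda(lambda img: img * 2.0 - 1.0)
else:
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])

Using the standard normalization with advprop weights produces near-random predictions like the ones above.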

Hi @lukemelas,

Thanks for the repo. Do you know the reason for using a different normalization for advprop? If I am training a new model with advprop, why should I use it rather than the ImageNet mean and std?

One question: if I'm getting the gist of the paper right, it seems like AdvProp uses two batchnorm layers (one for standard data and the other for adversarial data). However, in the code I don't see where that other batchnorm layer is implemented. Am I misunderstanding the paper, or is the code not providing it?

@ooodragon94 I think that part is not in this repo. But it is easy to implement.

import torch.nn as nn

class EfficientNet(nn.Module):
    def __init__(self, advprop=False, **kwargs):
        super().__init__()
        self.somelayers = nn.Identity()          # stand-in for the usual conv/MBConv stack
        self.norm = nn.BatchNorm2d(32)           # main BN (channel count is illustrative)
        if advprop:
            self.aux_norm = nn.BatchNorm2d(32)   # auxiliary BN for adversarial examples

    def forward(self, x, advprop=False):
        x = self.somelayers(x)
        # adversarial examples go through the auxiliary BN, clean ones through the main BN
        if advprop:
            x = self.aux_norm(x)
        else:
            x = self.norm(x)
        return x

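For illustration, a minimal way to exercise this dual-BN sketch (shapes and channel count are made up to match the placeholder layers above):

import torch

model = EfficientNet(advprop=True)
clean = torch.randn(4, 32, 56, 56)               # "clean" minibatch
adv = clean + 0.01 * torch.randn_like(clean)     # stand-in for adversarial examples

out_clean = model(clean)                # routed through self.norm
out_adv = model(adv, advprop=True)      # routed through self.aux_norm
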
@shijianjian
You are definitely right (and your code is compact and beautiful!).
I'm just afraid that if I use a pretrained model and fine-tune it, the performance might degrade, since I wouldn't be loading aux_norm's parameters.

@ooodragon94
I'm not entirely confident, but if you want to fine-tune with adversarial examples, I think the easiest way is to freeze the whole model apart from the auxiliary normalization layer (see the sketch below). In the initial epochs, you train only the auxiliary mean and std. Then you can save the whole model and fine-tune it as normal.
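
A rough sketch of that freezing step in PyTorch (the aux_norm name follows the sketch earlier in this thread and is an assumption, not something the repo provides):

import torch.nn as nn

def freeze_except_aux_norm(model: nn.Module):
    # Freeze every parameter except those belonging to the auxiliary BN.
    for name, param in model.named_parameters():
        param.requires_grad = "aux_norm" in name
    # BN running mean/var are buffers that only update in train() mode,
    # so keep the auxiliary BN in training mode and everything else in eval().
    model.eval()
    for module_name, module in model.named_modules():
        if "aux_norm" in module_name:
            module.train()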

@shijianjian
I'm trying to implement AdvProp.
I totally agree with your comment on the "different normalization".
I also don't see where the adversarial samples are generated...

When using the advprop pretrained weights with the advprop normalization, training becomes very unstable and the accuracy also decreases.

@feiwofeifeixiaowo If you only loaded the advprop weights without actually implementing AdvProp, then training accuracy will definitely suffer.