Validation accuracy for large model low, mistakes in model

Question

rwightman opened this issue 5 years ago

As with #5, the validation accuracy for the large model is also well below the stated accuracy. I was curious because the stated result, beating the official one with 1.4M fewer parameters, would be impressive.

I only get: Prec@1 70.788 (29.212) Prec@5 89.410 (10.590)

Several things to fix in the model:

Squeeze-excite (SE) layers should reduce the spatial dims with either a mean across the spatial dims or an avg pool. You have the avg pool in there but aren't using it.
There should be no BN in the SE module.
The SE module should be applied between the 3x3 DW conv and the 1x1 PWL, not after the PWL.
As per the paper, the reduction for the SE layer in MobileNetV3 should be applied to the expanded width.
There were mistakes in the last block of 5x5 convs in the paper. Those have been fixed in a new version: the location of the last stride 2 changed, and one of the 672 expansions should be 960.
There should be no batch norm after the linear layer before the classifier layer.
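
For reference, the SE-related points above can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the repo's code or the official implementation. The class names `SqueezeExcite` and `InvertedResidualSE`, the plain `channels // reduction` rounding, and the use of ReLU for every block activation are assumptions made for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SqueezeExcite(nn.Module):
    """SE gate: spatial mean -> 1x1 reduce -> ReLU -> 1x1 expand -> hard sigmoid.

    No BatchNorm anywhere inside the module.
    """

    def __init__(self, channels, reduction=4):
        super().__init__()
        # Reduction is relative to the width the gate sees (the expanded width).
        # Plain integer division is an assumption; real models may round differently.
        reduced = max(1, channels // reduction)
        self.reduce = nn.Conv2d(channels, reduced, kernel_size=1)
        self.expand = nn.Conv2d(reduced, channels, kernel_size=1)

    def forward(self, x):
        # Squeeze: mean over H and W, so the gate has shape (N, C, 1, 1).
        s = x.mean(dim=(2, 3), keepdim=True)
        s = F.relu(self.reduce(s))
        s = F.relu6(self.expand(s) + 3.0) / 6.0  # hard sigmoid
        return x * s


class InvertedResidualSE(nn.Module):
    """1x1 expand -> depthwise conv -> SE -> 1x1 linear projection (PWL)."""

    def __init__(self, in_ch, exp_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        self.use_residual = stride == 1 and in_ch == out_ch
        self.pw = nn.Sequential(
            nn.Conv2d(in_ch, exp_ch, 1, bias=False),
            nn.BatchNorm2d(exp_ch),
            nn.ReLU(inplace=True),
        )
        self.dw = nn.Sequential(
            nn.Conv2d(exp_ch, exp_ch, kernel_size, stride,
                      kernel_size // 2, groups=exp_ch, bias=False),
            nn.BatchNorm2d(exp_ch),
            nn.ReLU(inplace=True),
        )
        # SE sits after the depthwise conv and is built on the expanded width.
        self.se = SqueezeExcite(exp_ch)
        # Linear projection: BatchNorm but no activation.
        self.pwl = nn.Sequential(
            nn.Conv2d(exp_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.pwl(self.se(self.dw(self.pw(x))))
        return x + out if self.use_residual else out
```

With these definitions, `InvertedResidualSE(16, 64, 16)` builds its SE gate on 64 channels (the expanded width) rather than on the 16 output channels.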

laishenqi · Answer 1 · Sun May 26 2019 13:17:49 GMT+0800 (China Standard Time)

I added some tricks. Important ones like warmup and a cosine learning rate are really useful. Besides, I use DALI by Nvidia to load the data.
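
As an illustration of what warmup combined with a cosine learning rate usually means, here is a minimal sketch of such a schedule. The function name `warmup_cosine_lr`, the linear shape of the warmup, and all parameter values are assumptions for the example, not the settings used in this repo.

```python
import math


def warmup_cosine_lr(step, total_steps, base_lr, warmup_steps, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine from base_lr down to min_lr.
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

The returned value would typically be written into each optimizer parameter group's `lr` before every step (or every epoch, if `step` counts epochs).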

laishenqi · Answer 2 · Sun May 26 2019 13:50:50 GMT+0800 (China Standard Time)

I think the main cause is the dataloader. I will reproduce the model using the dataloader in PyTorch instead of DALI.

JTzhuang · Answer 3 · Tue Oct 22 2019 17:22:17 GMT+0800 (China Standard Time)

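After loading the model, the validation accuracy drops sharply in the first epoch and returns to normal from the second epoch onward. What is the cause?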

Ross Wightman · Answer 4 · Fri Nov 29 2019 03:58:53 GMT+0800 (China Standard Time)

Revisiting this. Google finally released their official version of MobileNet-V3 a few weeks ago now. It confirmed the known issues mentioned here and several more: https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet

I have also validated my own version of MobileNet-V3 in PyTorch. I trained from scratch back in May and reproduced the paper accuracy with a standard PyTorch data loader and preprocessing configuration. With the official TensorFlow release, I noticed a few small differences and have updated mine to include a TensorFlow-compatible version with weights from the official version. https://github.com/rwightman/gen-efficientnet-pytorch

Yangshen⚡Deng · Answer 5 · Fri Nov 29 2019 04:21:17 GMT+0800 (China Standard Time)

Wow, thanks a lot. I tried to find it in the official repo, but I only saw mobilenetv2 at that time. You and your repo are a really great help.

Rebeen Ali Hamad · Answer 6 · Fri Nov 29 2019 04:58:34 GMT+0800 (China Standard Time)

I have been looking at some implementations of mobilenetv3, but I have not seen the AutoML part in the code. How does this work with mobilenetv3?

Ross Wightman · Answer 7 · Sat Nov 30 2019 03:57:11 GMT+0800 (China Standard Time)

@rebeen I have not seen a full implementation of the MobileNetV3 AutoML search (platform-aware NAS (MnasNet) + NetAdapt) that would reproduce these networks. The platform-aware NAS is a reinforcement-learning-based method. Those are generally quite expensive to run, even with constraints on the architecture.

However, there are other search algorithms and bits and pieces out there that work with the same building blocks:

Rebeen Ali Hamad · Answer 8 · Sun Dec 01 2019 04:53:39 GMT+0800 (China Standard Time)

@rwightman Thank you very much for your detailed explanation. I will check the links you have provided.