xiaolai-sqlai / mobilenetv3

MobileNetV3 in PyTorch, with pre-trained models provided

Validation accuracy for large model low, mistakes in model

rwightman opened this issue

As with #5, the validation accuracy for the large model is also well below the stated value. I was curious because the stated result, beating the official one with 1.4M fewer parameters, would be impressive.

I only get: Prec@1 70.788 (29.212) Prec@5 89.410 (10.590)

Several things to fix in the model:

  • Squeeze-excite (SE) layers should reduce the spatial dims with either a mean across the spatial dims or an avg pool. You have the avg pool in there but aren't using it.
  • There should be no BN in the SE module.
  • The SE module should be applied between the 3x3 DW conv and the 1x1 PWL, not after the PWL.
  • As per the paper, the reduction for the SE layer in MobileNet-V3 should be applied to the expanded width.
  • There were mistakes in the last block of 5x5 convs in the paper. Those have been fixed in a new version: the location of the last stride 2 changed, and one of the 672 expansions should be 960.
  • There should be no batch norm after the linear layer before the classifier layer.
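
To make the SE points above concrete, here is a minimal PyTorch sketch of a block laid out that way. It is an illustration, not the code from this repo or from the official release. The class names, the reduction factor of 4, the hard-sigmoid gate, and the plain ReLU activations are my own assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SqueezeExcite(nn.Module):
    """SE gate: global avg pool -> 1x1 reduce -> ReLU -> 1x1 expand -> hard sigmoid.

    Note there is no BatchNorm anywhere in this module.
    """

    def __init__(self, channels, reduction=4):
        super().__init__()
        # Assumption: reduce by a fixed factor of the (expanded) input width.
        reduced = max(1, channels // reduction)
        self.pool = nn.AdaptiveAvgPool2d(1)  # collapses H x W to 1 x 1
        self.reduce = nn.Conv2d(channels, reduced, kernel_size=1)
        self.expand = nn.Conv2d(reduced, channels, kernel_size=1)

    def forward(self, x):
        s = self.pool(x)                          # actually use the avg pool
        s = F.relu(self.reduce(s))
        s = F.relu6(self.expand(s) + 3.0) / 6.0   # hard sigmoid gate
        return x * s                              # per-channel rescaling


class InvertedResidualSE(nn.Module):
    """Hypothetical block: 1x1 expand -> DW conv -> SE -> 1x1 linear projection."""

    def __init__(self, in_ch, exp_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        self.use_residual = stride == 1 and in_ch == out_ch
        self.expand = nn.Sequential(
            nn.Conv2d(in_ch, exp_ch, 1, bias=False),
            nn.BatchNorm2d(exp_ch),
            nn.ReLU(inplace=True),  # simplified; the real net mixes activations
        )
        self.depthwise = nn.Sequential(
            nn.Conv2d(exp_ch, exp_ch, kernel_size, stride,
                      padding=kernel_size // 2, groups=exp_ch, bias=False),
            nn.BatchNorm2d(exp_ch),
            nn.ReLU(inplace=True),
        )
        # SE sees the expanded width, so its reduction is relative to exp_ch.
        self.se = SqueezeExcite(exp_ch)
        self.project = nn.Sequential(
            nn.Conv2d(exp_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),  # linear bottleneck: no activation here
        )

    def forward(self, x):
        out = self.expand(x)
        out = self.depthwise(out)
        out = self.se(out)        # between the DW conv and the PWL
        out = self.project(out)
        return x + out if self.use_residual else out
```

For example, `InvertedResidualSE(40, 120, 40, kernel_size=5)` keeps a `(N, 40, H, W)` input at the same shape, and its SE bottleneck has 120 // 4 = 30 channels because the reduction is taken from the expanded width rather than the block input width.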

I added some tricks. Important ones like warmup and a cosine learning rate schedule are really useful. Besides, I use DALI by NVIDIA to load the data.
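
For readers unfamiliar with the two schedule tricks mentioned here, a minimal sketch of linear warmup followed by cosine decay could look like the following. This is not the training code from this repo. The function name, its parameters, and the numbers in the usage note are hypothetical and chosen only for illustration.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Learning rate with linear warmup followed by cosine decay.

    Hypothetical helper: steps are 0-indexed, warmup ramps linearly up to
    base_lr, then the rate follows half a cosine period down to min_lr.
    """
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup schedule completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With `total_steps=100`, `warmup_steps=10` and `base_lr=0.1`, the rate starts at 0.01, reaches 0.1 at the end of warmup, passes through 0.05 halfway through the decay phase, and ends at 0.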

I think the main cause is the dataloader. I will reproduce the result using the PyTorch dataloader instead of DALI.

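After loading the model, the validation accuracy drops sharply in the first epoch and returns to normal from the second epoch onward. What is the cause?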

Revisiting this. Google finally released their official version of MobileNet-V3 a few weeks ago now. It confirmed the known issues mentioned here and several more: https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet

I have also validated my own version of MobileNet-V3 in PyTorch. I trained from scratch back in May and reproduced the paper accuracy with a standard PyTorch data loader and preprocessing configuration. With the official TensorFlow release, I noticed a few small differences and have updated mine to include a TensorFlow-compatible version with weights from the official version. https://github.com/rwightman/gen-efficientnet-pytorch

Wow, thanks a lot. I tried to find it in the official repo, but I only saw MobileNetV2 at that time. You and your repo are a really great help.

I have been looking at some implementations of MobileNetV3, but I have not seen the AutoML part in the code. How does it work with MobileNetV3?

@rebeen I have not seen a full implementation of the MobileNetV3 AutoML search (platform-aware NAS (MnasNet) + NetAdapt) that would reproduce these networks. The platform-aware NAS is a reinforcement-learning-based method, and those are generally quite expensive to run, even with constraints on the architecture.

However, there are other search algorithms and bits and pieces out there that work with the same building blocks:

@rwightman Thank you very much for your detailed explanation. I will check the links you have provided.