When using your ImageNet21K pretrained ResNet50 model in Detectron2, performance degrades

miznchimaki opened this issue 2 years ago

Thanks for your great work!
I have a question about using your ResNet50 model as the pretrained backbone of Faster R-CNN in Detectron2: your 21K pretrained weights give about 8 points lower mAP than the MSRA 1K pretrained weights. Before loading your 21K pretrained weights into Faster R-CNN, I noticed that your ResNet50 was trained on inputs with values between 0 and 1 (your code divides each pixel by 255), whereas Detectron2 normalizes the input by subtracting the ImageNet pixel mean and dividing by the pixel std. So I set the pixel mean to 0 and the pixel std to 255 in Detectron2. Even with these changes, Faster R-CNN with your 21K pretrained model still lags far behind the one with MSRA's 1K pretrained model. Is there something I have overlooked?
Looking forward to your response!
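
For reference, the normalization change described above can be sketched in plain Python. This is only an illustration of the arithmetic `(x - mean) / std`, not Detectron2's own code; the helper names `normalize` and `normalize_pixel` are hypothetical, and the constants mirror the values given in the question.

```python
def normalize(pixel_value, mean, std):
    """Per-channel normalization: (x - mean) / std."""
    return (pixel_value - mean) / std


# Values from the question: mean 0 and std 255 for every channel,
# so a raw 0..255 pixel is mapped into the 0..1 range.
PIXEL_MEAN = [0.0, 0.0, 0.0]
PIXEL_STD = [255.0, 255.0, 255.0]


def normalize_pixel(channels):
    """Normalize one pixel given as a list of three raw channel values."""
    return [
        normalize(value, mean, std)
        for value, mean, std in zip(channels, PIXEL_MEAN, PIXEL_STD)
    ]


# One pixel with a minimum, an intermediate, and a maximum channel value.
scaled = normalize_pixel([0, 51, 255])
print(scaled)
```

With these settings the value range matches the 0 to 1 inputs mentioned in the question. Channel order is a separate thing worth checking: Detectron2's default `INPUT.FORMAT` is `BGR`, so a checkpoint trained on RGB images would presumably also need that setting changed.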

Hang Zhang · Answer 1 · Fri Jun 02 2023 08:28:22 GMT+0800 (China Standard Time)

Not sure if you've already solved the issue. Detectron2's ResNet is Caffe-style, which is a slightly different architecture from the TorchVision version.
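
To make the architectural difference concrete, here is a minimal sketch of where the stride sits in a ResNet bottleneck block under each style. The function `bottleneck_strides` is a hypothetical helper, not a Detectron2 API. The `overrides` dictionary lists Detectron2 config keys that would likely need changing, under the assumption (not confirmed in this thread) that the 21K checkpoint follows the TorchVision layout and was trained on RGB inputs scaled to 0..1.

```python
def bottleneck_strides(stride, stride_in_1x1):
    """Return the strides of the (1x1, 3x3, 1x1) convs in a bottleneck block.

    Caffe/MSRA-style ResNets downsample in the first 1x1 conv;
    TorchVision-style ResNets downsample in the 3x3 conv.
    """
    if stride_in_1x1:
        return (stride, 1, 1)
    return (1, stride, 1)


caffe_style = bottleneck_strides(2, stride_in_1x1=True)
torchvision_style = bottleneck_strides(2, stride_in_1x1=False)
print("caffe:", caffe_style, "torchvision:", torchvision_style)

# Assumed config overrides for a TorchVision-style, RGB, 0..1-input checkpoint.
# The keys are Detectron2 config names; the values are assumptions to verify
# against how the checkpoint was actually trained.
overrides = {
    "MODEL.RESNETS.STRIDE_IN_1X1": False,
    "MODEL.PIXEL_MEAN": [0.0, 0.0, 0.0],
    "MODEL.PIXEL_STD": [255.0, 255.0, 255.0],
    "INPUT.FORMAT": "RGB",
}
```

Because the weight tensors have the same shapes in both styles, loading a TorchVision-style checkpoint into a Caffe-style backbone can succeed without any error while still applying the pretrained filters at the wrong stride, which may hurt accuracy.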