kkhoot / PAA

A PyTorch implementation of the paper `Probabilistic Anchor Assignment with IoU Prediction for Object Detection` (ECCV 2020, https://arxiv.org/abs/2007.08103)


Why freeze BN in stem and bottleneck?

feiyuhuahuo opened this issue and commented:

Is that because the training batch size of 2 per GPU is too small for batch normalization (and you didn't use sync-BN)? I found that an 11GB GPU can hold at most a batch size of 6. In that situation, wouldn't using regular BN also be a reasonable choice?

Hi @feiyuhuahuo, the frozen BN follows the common practice of previous works for a fair performance comparison. There are many ways to improve performance, e.g. using sync-BN instead of frozen BN, adding more deformable convs (the PAA models only apply dconvs to C4-C5 for the ResNeXt models), or increasing the training/testing input sizes (the usual multi-scale training range is 640-800, and widening this range, e.g. to 480-960 as in FreeAnchor, gives better performance). I intentionally kept all training/testing configurations the same as previous works such as ATSS and FCOS for comparison purposes. You can change these settings to train better-performing models. I think changing frozen BN to sync-BN can give a significant AP improvement.
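For context, "freezing" BN means replacing each `nn.BatchNorm2d` with a module that applies the pretrained affine transform and running statistics as fixed constants, so neither gradients nor batch statistics ever update them. Below is a minimal sketch of this common pattern (as used in maskrcnn-benchmark-style codebases); the exact module in this repo may differ.

```python
import torch
import torch.nn as nn

class FrozenBatchNorm2d(nn.Module):
    """BatchNorm2d with all parameters and running stats fixed.

    A minimal sketch of the common frozen-BN pattern; not necessarily
    the exact module used in this repository.
    """
    def __init__(self, num_features):
        super().__init__()
        # Registered as buffers, not parameters: the optimizer never sees
        # them, and no batch statistics are accumulated in forward().
        self.register_buffer("weight", torch.ones(num_features))
        self.register_buffer("bias", torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        # Fold the frozen statistics into a per-channel scale and shift,
        # equivalent to BN in eval mode with these stats.
        scale = self.weight * (self.running_var + 1e-5).rsqrt()
        shift = self.bias - self.running_mean * scale
        return x * scale.reshape(1, -1, 1, 1) + shift.reshape(1, -1, 1, 1)
```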

Also, BN with 6 images per GPU usually performs worse than frozen BN. In my experience, the minimum per-GPU batch size for stable BN statistics is around 16, so I would recommend sync-BN over plain BN.
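If you want to try sync-BN, PyTorch ships a converter that swaps regular BatchNorm modules for `nn.SyncBatchNorm`. A rough sketch follows; the toy model is a placeholder for the real detector, and note that custom frozen-BN modules like the one above are plain `nn.Module`s that this call would not convert, so the backbone would first need its `nn.BatchNorm2d` layers restored.

```python
import torch
import torch.nn as nn

# Toy stand-in for the detector; in practice this would be the model
# built from the repo's config (illustrative only).
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)

# Replace every nn.BatchNorm* module with nn.SyncBatchNorm.
# Custom frozen-BN modules are left untouched by this call.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# Sync-BN only synchronizes statistics across a distributed process group
# (one process per GPU), so training must run under DDP, e.g.:
#   torch.distributed.init_process_group("nccl")
#   model = nn.parallel.DistributedDataParallel(model.cuda(), device_ids=[rank])
```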