frank-xwang / RIDE-LongTailRecognition

[ICLR 2021 Spotlight] Code release for "Long-tailed Recognition by Routing Diverse Distribution-Aware Experts."

Question on the function of the normalized linear layer

PeiqinZhuang opened this issue

Hi, I just read the released code and found the implementation of the normalized linear layer and the scale factor. Is there a specific purpose behind this design, such as training stability?

Hi @PeiqinZhuang,

Thanks for your question. We use the normalized linear layer and the scale factor to simplify the margin calculation in the LDAM component of our framework. The normalized linear layer comes from LDAM; since we mainly demonstrated our method with LDAM, we use its normalized linear layer. If you combine our method with decoupling methods (cRT, τ-norm, LWS), you don't need the normalized linear layer. We refer you to the LDAM code for more details: https://github.com/kaidic/LDAM-DRW/blob/master/cifar_train.py#L106.
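
For readers who want to see the idea concretely, here is a minimal PyTorch sketch of a normalized linear layer followed by an LDAM-style margin and scale. It is an illustration under assumptions, not the exact code from either repository: the helper `ldam_style_logits`, the margin values, and the scale of 30 are chosen for the example only. Because both the features and the class weights are L2-normalized, every output is a cosine similarity in [-1, 1], so a fixed per-class margin has the same meaning for every sample, and the scale factor stretches the logits back to a range where softmax cross-entropy gives useful gradients.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NormedLinear(nn.Module):
    """Linear layer whose outputs are cosine similarities in [-1, 1]."""

    def __init__(self, in_features, out_features):
        super().__init__()
        # One column of weights per class; no bias term.
        self.weight = nn.Parameter(torch.empty(in_features, out_features))
        nn.init.uniform_(self.weight, -1.0, 1.0)

    def forward(self, x):
        # L2-normalize each feature row and each class-weight column,
        # so the matrix product is a cosine similarity.
        return F.normalize(x, dim=1) @ F.normalize(self.weight, dim=0)


def ldam_style_logits(cosine, target, margins, scale=30.0):
    """Hypothetical helper: subtract the target class's margin, then scale."""
    one_hot = F.one_hot(target, num_classes=cosine.size(1)).to(cosine.dtype)
    # Only the ground-truth class entry of each row is reduced by its margin.
    return scale * (cosine - one_hot * margins[target].unsqueeze(1))


torch.manual_seed(0)
layer = NormedLinear(in_features=16, out_features=4)
features = torch.randn(8, 16)                   # stand-in for backbone features
target = torch.randint(0, 4, (8,))              # stand-in labels
margins = torch.tensor([0.1, 0.2, 0.3, 0.5])    # made-up per-class margins

cosine = layer(features)
logits = ldam_style_logits(cosine, target, margins)
loss = F.cross_entropy(logits, target)
```

With a plain `nn.Linear` classifier the logits have no fixed range, so a margin would have to be tuned against the feature and weight norms; with a decoupling-style classifier there is no margin at all, which is why the normalized layer is unnecessary there.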