Difference on SE from paper
triangleCZH opened this issue
Kevin Chen Zihao commented
- I am just curious: why did you add batch normalization inside SeModule? Is there a reference for doing so?
- Please correct me if I'm wrong: I think SeModule should be added between dw and pw-linear, but your code seems to add it after pw-linear, right before the residual connection.
- Do you think it's necessary to handle expand_ratio = 1 separately? When expand_channel == output_channel, I feel that the pw might be redundant, since the shape won't change at all after it.
Thank you!
yehao commented
- In EfficientNet, there is no BN in SE.
- SE should be added between the 3x3 dw and the 1x1 pw.
- Yes, it is redundant. You can refer to MobileNet v2.
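
To make the three points above concrete, here is a minimal sketch in plain Python of the layer order they imply. It is not code from this repository: `plan_block` and the tuple layout `(name, in_channels, out_channels)` are hypothetical names chosen for illustration, and the residual rule (stride 1 and matching channels) is assumed from the usual inverted-residual design.

```python
def plan_block(in_ch, out_ch, expand_ratio, stride=1, use_se=True):
    """Return (layers, use_residual) for one inverted-residual block.

    Each layer is a tuple (name, in_channels, out_channels).
    Hypothetical helper for illustration only.
    """
    hidden = in_ch * expand_ratio
    layers = []

    # Point 3: with expand_ratio == 1 the expansion pw would map
    # in_ch -> in_ch, so it is skipped entirely.
    if expand_ratio != 1:
        layers.append(("pw_expand", in_ch, hidden))
        layers.append(("bn", hidden, hidden))
        layers.append(("act", hidden, hidden))

    # 3x3 depthwise convolution; this is where the stride is applied.
    layers.append(("dw3x3", hidden, hidden))
    layers.append(("bn", hidden, hidden))
    layers.append(("act", hidden, hidden))

    # Points 1 and 2: SE sits between dw and pw-linear, and is a single
    # shape-preserving unit with no BN inside it.
    if use_se:
        layers.append(("se", hidden, hidden))

    # 1x1 pointwise linear projection (BN, no activation).
    layers.append(("pw_linear", hidden, out_ch))
    layers.append(("bn", out_ch, out_ch))

    # Assumed rule: residual only when input and output shapes match.
    use_residual = stride == 1 and in_ch == out_ch
    return layers, use_residual


if __name__ == "__main__":
    layers, use_residual = plan_block(16, 16, expand_ratio=1)
    print([name for name, _, _ in layers])
    # ['dw3x3', 'bn', 'act', 'se', 'pw_linear', 'bn']
    print(use_residual)  # True
```

With `expand_ratio=1` the plan starts directly at the depthwise convolution, and `se` always appears before `pw_linear`, so the residual add follows the projection rather than the SE step.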