d-li14 / mobilenetv3.pytorch

74.3% MobileNetV3-Large and 67.2% MobileNetV3-Small model on ImageNet

Home Page: https://arxiv.org/abs/1905.02244

Placement of Activation *after* SEBlock

bernardomig opened this issue · comments

Hi. In your code, you place the activation (hard-swish) after the SE block:

self.conv = nn.Sequential(
    # pw
    nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
    nn.BatchNorm2d(hidden_dim),
    h_swish() if use_hs else nn.ReLU(inplace=True),
    # dw
    nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim, bias=False),
    nn.BatchNorm2d(hidden_dim),
    # Squeeze-and-Excite
    SELayer(hidden_dim) if use_se else nn.Identity(),
    h_swish() if use_hs else nn.ReLU(inplace=True),  # <-- HERE!!!
    # pw-linear
    nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
    nn.BatchNorm2d(oup),
)

Is it correct, or should it be placed before the SE Block?

I think it's correct; please refer to Figure 4 of the paper. In my own experiments, though, placing h-swish before the SE block seems to bring a marginal benefit.
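
For anyone who wants to compare the two orderings themselves, here is a minimal sketch of just the depthwise stage with a switch for where the activation goes. It is not the repository's code: `HSigmoid`, `HSwish`, `SimpleSE`, `depthwise_stage` and the `act_before_se` flag are hypothetical names made up for this example, and `SimpleSE` is only a rough stand-in for `SELayer`, whose definition is not shown in this thread.

```python
import torch
import torch.nn.functional as F
from torch import nn


class HSigmoid(nn.Module):
    """Hard sigmoid: relu6(x + 3) / 6."""

    def forward(self, x):
        return F.relu6(x + 3.0) / 6.0


class HSwish(nn.Module):
    """Hard swish: x * relu6(x + 3) / 6."""

    def forward(self, x):
        return x * F.relu6(x + 3.0) / 6.0


class SimpleSE(nn.Module):
    """Hypothetical minimal squeeze-and-excite layer (stand-in for SELayer)."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        squeezed = max(channels // reduction, 1)
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average pool
        self.fc = nn.Sequential(             # excite: per-channel gate in [0, 1]
            nn.Conv2d(channels, squeezed, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(squeezed, channels, 1),
            HSigmoid(),
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))


def depthwise_stage(hidden_dim, kernel_size=3, stride=1, act_before_se=False):
    """Depthwise conv + BN, followed by SE and h-swish in the chosen order."""
    conv = nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride,
                     (kernel_size - 1) // 2, groups=hidden_dim, bias=False)
    bn = nn.BatchNorm2d(hidden_dim)
    se, act = SimpleSE(hidden_dim), HSwish()
    # act_before_se=False matches the ordering quoted in this issue (SE, then activation).
    tail = [act, se] if act_before_se else [se, act]
    return nn.Sequential(conv, bn, *tail)


x = torch.randn(2, 16, 8, 8)
after_se = depthwise_stage(16, act_before_se=False).eval()
before_se = depthwise_stage(16, act_before_se=True).eval()
with torch.no_grad():
    out_after = after_se(x)
    out_before = before_se(x)
```

Both variants have the same parameter count and output shape, so swapping the order is a drop-in change. Whether it helps accuracy would have to be checked by training both; this sketch says nothing about that.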