gpleiss / efficient_densenet_pytorch

A memory-efficient implementation of DenseNets

dropout missing after the 1x1 convolutional layer

lizhenstat opened this issue

Hi, thanks for your work.
I have a question about dropout. In densenet.py, the forward function of the dense layer is defined as follows:

    def forward(self, *prev_features):
        bn_function = _bn_function_factory(self.norm1, self.relu1, self.conv1)
        if self.efficient and any(prev_feature.requires_grad for prev_feature in prev_features):
            bottleneck_output = cp.checkpoint(bn_function, *prev_features)
        else:
            bottleneck_output = bn_function(*prev_features)
        new_features = self.conv2(self.relu2(self.norm2(bottleneck_output)))
        if self.drop_rate > 0:
            new_features = F.dropout(new_features, p=self.drop_rate, training=self.training)
        return new_features
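
If one wanted to match the Torch behaviour, I believe the second dropout would go right after the bottleneck output, something like the sketch below (my own illustration, not code from this repository; it reuses the existing `drop_rate` attribute and is applied outside the checkpointed `bn_function`, so checkpointing is unaffected):

        if self.drop_rate > 0:
            # hypothetical extra dropout after the 1x1 bottleneck conv (sketch only,
            # not in the repository), mirroring the Torch implementation
            bottleneck_output = F.dropout(bottleneck_output, p=self.drop_rate,
                                          training=self.training)
        new_features = self.conv2(self.relu2(self.norm2(bottleneck_output)))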

whereas in the original Torch implementation of DenseNet, dropout is added after both the 1x1 and the 3x3 convolutions:

function DenseConnectLayerStandard(nChannels, opt)
   local net = nn.Sequential()

   net:add(ShareGradInput(cudnn.SpatialBatchNormalization(nChannels), 'first'))
   net:add(cudnn.ReLU(true))   
   if opt.bottleneck then
      net:add(cudnn.SpatialConvolution(nChannels, 4 * opt.growthRate, 1, 1, 1, 1, 0, 0))
      nChannels = 4 * opt.growthRate
      if opt.dropRate > 0 then net:add(nn.Dropout(opt.dropRate)) end
      net:add(cudnn.SpatialBatchNormalization(nChannels))
      net:add(cudnn.ReLU(true))      
   end
   net:add(cudnn.SpatialConvolution(nChannels, opt.growthRate, 3, 3, 1, 1, 1, 1))
   if opt.dropRate > 0 then net:add(nn.Dropout(opt.dropRate)) end

   return nn.Sequential()
      :add(nn.Concat(2)
         :add(nn.Identity())
         :add(net))  
end
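
For comparison, a rough PyTorch translation of the Torch layer above could look like this (my own sketch; the function name and arguments are illustrative, and the concatenation with the layer input is omitted for brevity). It shows dropout after both convolutions:

    import torch.nn as nn

    def dense_layer_with_double_dropout(n_channels, growth_rate, drop_rate, bottleneck=True):
        # Rough PyTorch analogue of DenseConnectLayerStandard:
        # BN-ReLU-Conv(1x1)-Dropout followed by BN-ReLU-Conv(3x3)-Dropout,
        # i.e. dropout after *both* convolutions as in the Torch code.
        layers = [
            nn.BatchNorm2d(n_channels),
            nn.ReLU(inplace=True),
        ]
        if bottleneck:
            layers.append(nn.Conv2d(n_channels, 4 * growth_rate, kernel_size=1, stride=1))
            if drop_rate > 0:
                layers.append(nn.Dropout(drop_rate))
            n_channels = 4 * growth_rate
            layers += [
                nn.BatchNorm2d(n_channels),
                nn.ReLU(inplace=True),
            ]
        layers.append(nn.Conv2d(n_channels, growth_rate, kernel_size=3, stride=1, padding=1))
        if drop_rate > 0:
            layers.append(nn.Dropout(drop_rate))
        return nn.Sequential(*layers)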

Is there a particular reason for not adding a dropout layer after the 1x1 (bottleneck) convolution as well?
Thanks in advance!