ClementPinard / FlowNetPytorch

Pytorch implementation of FlowNet by Dosovitskiy et al.

Questions about bias in conv function

xiaobeikankan opened this issue · comments

I have some questions about the three functions conv, deconv, and predict_flow in the /models/utils file, where bias is set to False. Can you give a reasonable explanation for this?

It's only when BatchNorm is applied.
Since spatial batchnorm applies a normalization (subtract the mean and divide by the std), the bias would be automatically cancelled.
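For reference, the conv helper looks roughly like this (a paraphrase, the exact code in models/utils may differ slightly); bias is only disabled in the BatchNorm branch:

import torch.nn as nn

def conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1):
    if batchNorm:
        # the BatchNorm right after the convolution would cancel any bias anyway
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size,
                      stride=stride, padding=(kernel_size - 1) // 2, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.LeakyReLU(0.1, inplace=True)
        )
    else:
        # without BatchNorm, the convolution keeps its bias
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size,
                      stride=stride, padding=(kernel_size - 1) // 2, bias=True),
            nn.LeakyReLU(0.1, inplace=True)
        )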

Keeping a bias there would actually be problematic: since it is cancelled, it could drift without effect during training (depending on the regularization you apply), while during inference the batchnorm subtracts a value learned during training. Even though weight decay usually prevents this kind of problem, it could potentially lead to worse results compared to having no bias at all.

Instead, what acts as the bias is the beta parameter of the batchnorm, which is learned.

More info on batchnorm and bias here: https://twitter.com/karpathy/status/1013245864570073090
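As a quick sanity check, here is a minimal sketch (toy shapes chosen arbitrarily, not repo code) showing that in training mode the normalization makes the conv bias irrelevant:

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 3, 16, 16)

conv_bias = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=True)
conv_nobias = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
conv_nobias.weight.data.copy_(conv_bias.weight.data)  # same weights, bias is the only difference

bn = nn.BatchNorm2d(8).train()  # normalize with batch statistics, as during training

out_bias = bn(conv_bias(x))
out_nobias = bn(conv_nobias(x))
print(torch.allclose(out_bias, out_nobias, atol=1e-5))  # True: the bias is cancelled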

Thanks for your explanation!
Maybe there are two errors in models/utils, then:

def predict_flow(in_planes):
    return nn.Conv2d(in_planes, 2, kernel_size=3, stride=1, padding=1, bias=False)

def deconv(in_planes, out_planes):
    return nn.Sequential(
        nn.ConvTranspose2d(in_planes, out_planes, kernel_size=4, stride=2, padding=1, bias=False),
        nn.LeakyReLU(0.1, inplace=True)
    )

These are for compliance with the original paper, whose implementation was made with Caffe.

The rationale behind not biasing predict_flow is that optical flow is more or less zero-centered (even more so if we flip images randomly). As such, a bias would make the network output have a non-zero mean, which is not what we want.
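To make that concrete, here is a toy sketch (hypothetical numbers, not repo code) of how a horizontal flip negates the horizontal flow component, so that on average the target is zero-mean:

import torch

# toy flow field: channel 0 = horizontal component (u), channel 1 = vertical (v),
# with a deliberate +3 px horizontal drift
flow = torch.randn(2, 64, 64) + torch.tensor([3.0, 0.0]).view(2, 1, 1)

# horizontally flipping the image pair mirrors the scene, which also
# flips the flow map and negates its horizontal component
flow_flipped = torch.flip(flow, dims=[2])
flow_flipped[0] = -flow_flipped[0]

print(flow[0].mean().item())          # ~ +3
print(flow_flipped[0].mean().item())  # ~ -3, so the average over both is ~0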

The rationale behind deconv is probably only data-driven: it worked better without it 🤷‍♂️