lmb-freiburg / flownet2

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Home Page: https://lmb.informatik.uni-freiburg.de/Publications/2017/IMKDB17/


weight distribution for deconv layer

r0drigor opened this issue

This might be a question that is more about Caffe than FlowNet, but I'll ask here anyway.
I'm doing some work with FlowNet2-S (the simple architecture from the first iteration of the project), using MATLAB to read the weight values of each layer, and I can't understand why the deconvolution layers' weights use a different structure than the regular convolutions.
(I'm reading them with net.params('conv1').get_data().)

For regular convolutions the weight layout is (h, w, c, n), while for deconv layers it is (h, w, n, c). Why is this the case?
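For reference, reading the same blobs from Python (pycaffe) would look roughly like this; the file names are placeholders for local copies, and 'deconv5' is just one of the deconvolution layers in the prototxt:

```python
import caffe

# Placeholder paths for a local FlowNetS deploy prototxt and trained weights.
net = caffe.Net('FlowNetS_deploy.prototxt', 'FlowNetS.caffemodel', caffe.TEST)

w_conv = net.params['conv1'][0].data      # a regular convolution
w_deconv = net.params['deconv5'][0].data  # a deconvolution ('deconv5' assumed)

# pycaffe exposes blobs in Caffe's native row-major order, so conv prints
# (num_output, channels, kh, kw) while deconv prints (channels, num_output,
# kh, kw); MATLAB shows the same blobs with all of the axes reversed.
print(w_conv.shape, w_deconv.shape)
```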

This is just a quick guess, but "deconvolution" is implemented as transposed convolution (both layers directly use the cublasSgemm matrix multiplication).
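To make that concrete, here is a toy numpy sketch of the two operations (not FlowNet's actual code; stride only, no padding or groups). It shows why the deconv blob naturally stores (input channels, output channels, h, w): the forward pass of a deconvolution is the backward pass of a convolution, so the channel contraction runs over the first weight axis instead of the second.

```python
import numpy as np

def conv2d(x, w, stride=1):
    """x: (c_in, H, W); w: (c_out, c_in, kh, kw) -- Caffe's conv layout."""
    c_out, c_in, kh, kw = w.shape
    H, W = x.shape[1:]
    oh, ow = (H - kh) // stride + 1, (W - kw) // stride + 1
    y = np.zeros((c_out, oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[:, i * stride:i * stride + kh, j * stride:j * stride + kw]
            # contract over (c_in, kh, kw): the *second* weight axis is c_in
            y[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return y

def deconv2d(y, w, stride=1):
    """y: (c_in, H, W); w: (c_in, c_out, kh, kw) -- Caffe's deconv layout."""
    c_in, c_out, kh, kw = w.shape
    H, W = y.shape[1:]
    x = np.zeros((c_out, (H - 1) * stride + kh, (W - 1) * stride + kw))
    for i in range(H):
        for j in range(W):
            # scatter-add each input pixel through the kernel; the contraction
            # now runs over the *first* weight axis, hence the swapped layout
            x[:, i * stride:i * stride + kh, j * stride:j * stride + kw] += \
                np.tensordot(y[:, i, j], w, axes=([0], [0]))
    return x

# Adjointness check: reusing the very same array for both ops, the deconv is
# exactly the transpose of the conv -- <conv(x, w), g> == <x, deconv(g, w)>.
rng = np.random.default_rng(0)
w = rng.standard_normal((3, 2, 4, 4))  # read as (c_out, c_in, .) by conv2d
x = rng.standard_normal((2, 8, 8))     # and as (c_in, c_out, .) by deconv2d
g = rng.standard_normal((3, 3, 3))
assert np.isclose(np.sum(conv2d(x, w, stride=2) * g),
                  np.sum(x * deconv2d(g, w, stride=2)))
```

In other words, the same (3, 2, 4, 4) array is a valid weight for both operations; only the interpretation of the first two axes flips.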

I can't find any documentation on any of this, but maybe I'm just not looking hard enough.
There's probably some kind of transposition of the height and width parameters involved, right?
I'm having some difficulty getting reasonable values, and this might be the explanation.

Thank you for the fast response.

(After your answer you can lock this issue)

I think it's common to implement these "deconvolutions" like that; see e.g. http://deeplearning.net/software/theano_versions/dev/tutorial/conv_arithmetic.html#transposed-convolution-arithmetic
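The arithmetic in that tutorial also predicts the deconv output sizes. A quick sanity check (the k=4, s=2, p=1 values are what FlowNet's upsampling deconvs use, if I remember the prototxt correctly):

```python
def deconv_out_size(i, k, s, p):
    # transposed-convolution output size from the linked tutorial:
    # o = s * (i - 1) + k - 2 * p
    return s * (i - 1) + k - 2 * p

# With the FlowNet-style parameters (assumed: k=4, s=2, p=1), each
# deconv exactly doubles the spatial resolution:
print(deconv_out_size(16, k=4, s=2, p=1))  # -> 32
```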

You can check the Forward_cpu implementations in the conv and deconv layers; they should match the method described in the link.

I can't tell you why exactly the parameters are structured differently; I can only assume that the transposed convolution plays a role.
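If you just want the deconv weights in the familiar conv ordering, swapping the two channel axes should do it (an untested guess consistent with the above, in numpy terms; in MATLAB's reversed view it would be the last two axes):

```python
import numpy as np

# w_deconv stands in for a deconv blob read in Caffe's native order,
# (channels_in, num_output, kh, kw); a real one would come from the net.
w_deconv = np.zeros((128, 256, 4, 4))

# Swapping the two channel axes yields the conv-style layout
# (num_output, channels_in, kh, kw).
w_conv_style = np.transpose(w_deconv, (1, 0, 2, 3))
print(w_conv_style.shape)  # (256, 128, 4, 4)
```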

(closed due to inactivity)