weight distribution for deconv layer
r0drigor opened this issue · comments
This might be a question that is more about caffe than FlowNet but I'll ask here anyway.
I'm doing some work with FlowNet2-s (the first iteration of the project), using MATLAB to read the weight values of each layer (via net.params('conv1').get_data()), and I can't understand why the deconvolution layers' weights use a different structure than the regular convolutions.
For regular convolutions the weight layout is (h, w, c, n), while for deconv layers it's (h, w, n, c). Why is this the case?
This is just a quick guess, but "deconvolution" is implemented as transposed convolution (both layers directly use the cublasSgemm matrix multiplication).
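To illustrate what the transpose does to the channel axes (this is my own numpy sketch, not the actual Caffe code): in the 1x1-kernel case a convolution is just a matrix multiply over channels, and the transposed operation applies the same weight matrix with the input/output channel axes swapped.

```python
import numpy as np

# Hypothetical sizes: C_in input channels, N output channels.
C_in, N = 3, 2
rng = np.random.default_rng(0)

W = rng.standard_normal((N, C_in))  # conv weight: one row per output channel
x = rng.standard_normal(C_in)       # one pixel's channel vector

y = W @ x        # 1x1 convolution: maps C_in channels -> N channels
x_back = W.T @ y # transposed ("deconv") op: maps N channels -> C_in channels
                 # W.T has shape (C_in, N), i.e. the channel axes swapped

assert y.shape == (N,)
assert x_back.shape == (C_in,)
```

So a layer that stores the transposed operation's weights naturally ends up with the input/output channel dimensions in the opposite order from a plain convolution.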
I can't find any documentation on any of this, but maybe I'm just not looking hard enough.
There's probably some kind of transposition on the height and width parameters, right?
I'm having some difficulty getting reasonable values, and that might be the explanation.
Thank you for the fast response.
(After your answer you can lock this issue)
I think it's common to implement these "deconvolutions" like that; see e.g. http://deeplearning.net/software/theano_versions/dev/tutorial/conv_arithmetic.html#transposed-convolution-arithmetic
You can check the Forward_cpu implementations in the conv and deconv layers; they should match the method in the link.
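To make the "transpose" concrete, here is a small numpy sketch (mine, not taken from the Caffe source): if you write a 1-D valid cross-correlation as a matrix A, then multiplying by A.T is exactly a full convolution with the same kernel, which is the operation a transposed-convolution layer computes.

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])  # input signal
w = np.array([1., 0., -1.])         # kernel
n, k = len(x), len(w)

# Matrix form of a 'valid' cross-correlation (what a conv layer does):
# row i holds the kernel shifted to position i.
A = np.zeros((n - k + 1, n))
for i in range(n - k + 1):
    A[i, i:i + k] = w

y = A @ x
assert np.allclose(y, np.correlate(x, w, mode='valid'))

# "Deconvolution" = multiplying by the transpose of the same matrix,
# which works out to a full convolution with the kernel.
z = A.T @ y
assert np.allclose(z, np.convolve(y, w, mode='full'))
```

Since the forward pass of the deconv layer is the transpose of a convolution's forward pass, it makes sense that its weight blob is laid out the way the transposed operation consumes it.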
I can't tell you why exactly the parameters are structured differently; I can only assume that the transposed convolution plays a role.
(closed due to inactivity)