vdumoulin / conv_arithmetic

A technical report on convolution arithmetic in the context of deep learning


clarify the kernel

junshi15 opened this issue · comments

Thanks for the great tutorial. The paper is informative and the animation is intuitive.

I would like to point out that the kernel used in the deconvolution is not the same as the one used in the corresponding convolution: the former is a double-flipped version of the latter. For example, if the kernel used in the convolution is the 2x3 matrix [1,2,3; 4,5,6], i.e. first row [1,2,3] and second row [4,5,6], then the kernel used in the deconvolution is the double-flipped 2x3 matrix obtained by first flipping along the row axis and then along the column axis. This gives [6,5,4; 3,2,1], i.e. first row [6,5,4] and second row [3,2,1].
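For concreteness, the double flip described above can be checked with NumPy (the kernel values are just the ones from the example):

```python
import numpy as np

# The 2x3 kernel from the example above.
k = np.array([[1, 2, 3],
              [4, 5, 6]])

# "Double flip": flip along the row axis, then along the column axis.
# np.flip with no axis argument flips along every axis at once.
k_flipped = np.flip(k)

print(k_flipped)
# [[6 5 4]
#  [3 2 1]]
```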

If you agree with this, it would be nice to state it in the paper so that readers won't be confused.

Again, this is a great tutorial and I really appreciate it.

Thank you for your appreciation of the tutorial and for taking the time to open an issue.

We deliberately avoided the terms deconvolution and inverse convolution, because formally (from a signal processing point of view) a convolution is a multiplication in the Fourier domain and its inverse is a division in the Fourier domain. Therefore, to compute an (approximate) inverse convolution (when the stride is 1), one would need to divide the FFT of the convolved image by the FFT of the kernel and then apply an inverse FFT to recover the original image.
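As a sketch of this Fourier-domain inversion, here is a 1-D toy example using a circular convolution, for which the FFT relationship is exact (the signal and kernel values are made up for illustration; division is only valid when the kernel's FFT has no zeros):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=16)          # original signal (illustrative)
h = np.array([1.0, 0.5, 0.25])   # convolution kernel, stride 1

# Zero-pad the kernel to the signal length for the circular convolution.
h_padded = np.zeros_like(x)
h_padded[:h.size] = h

# Convolution is multiplication in the Fourier domain...
y = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h_padded)).real

# ...so the inverse convolution is division in the Fourier domain,
# followed by an inverse FFT to recover the original signal.
x_rec = np.fft.ifft(np.fft.fft(y) / np.fft.fft(h_padded)).real

print(np.allclose(x, x_rec))  # True
```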

What we focus on is the transposed convolution, i.e. a convolutional operation that recovers the original image shape with a connectivity pattern similar to that of the inverse convolution. The filter of this operation is learned, so in theory it should be possible to learn to approximately invert the convolution.
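A minimal sketch of what "transposed" means here, for a 1-D convolution (cross-correlation, as used in deep learning) with kernel size 3, stride 1, no padding, and input length 5; the kernel and input values are illustrative only:

```python
import numpy as np

w = np.array([1.0, 2.0, 3.0])   # kernel
x = np.arange(5.0)              # input, length 5 -> output length 3

# The convolution written as a 3x5 matrix C, so that y = C @ x.
C = np.array([[w[0], w[1], w[2], 0,    0],
              [0,    w[0], w[1], w[2], 0],
              [0,    0,    w[0], w[1], w[2]]])
y = C @ x

# The transposed convolution applies C.T, mapping the length-3 output
# shape back to the length-5 input shape. Its connectivity pattern is
# that of a full convolution with the double-flipped kernel, which is
# exactly what np.convolve (a true convolution) computes.
z = C.T @ y
assert np.allclose(z, np.convolve(y, w, mode="full"))
print(z.shape)  # (5,)
```

Note that C.T recovers the input *shape*, not the input values: inverting the values is what the learned filter is expected to approximate.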

I don't know if this answers your question. If it does, please close the issue; otherwise, if you can point me to the section of the text you are referring to, I will surely look into it. Thank you!

Thanks for your answer. I had it in mind that transposed convolution is deconvolution, as the term has been widely (mis)used in the deep learning community, and I failed to notice that your definition is different. I am now fine with your explanation.

It is easy to misinterpret things in this field: the same operation is called by different names and sometimes - quite unfortunately - different operations are called by the same name. I am glad that my answer helped clarify your doubt.