Why is CapsLayer version 2 equivalent to version 1?
AlexHex7 opened this issue
Alex Hex commented
For the input feature map (batch_size, 20, 20, 256), the Conv of version 1 applies a 256x32x9x9 convolution at each point in the feature map, then concatenates the 8 resulting output feature maps. The Conv of version 2 applies a single 256x(32x8)x9x9 convolution at each point. That is to say, in version 1 the result at each point of the input feature map is affected by only 32 kernels, but in version 2 it is affected by 32*8 kernels.
Alex Hex commented
My understanding is wrong.
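For anyone landing here with the same question, here is a minimal sketch of why the two layouts give the same result. It assumes version 1 runs 8 independent convolutions of 32 filters each and concatenates their outputs along the channel axis, as described above. The sizes are shrunk (4 input channels, 3x3 kernels, 3 filters per group, stride 1) so that plain Python can run it, and `conv2d_valid` and `random_filter` are illustrative helpers, not functions from the repository. The key point is that each output channel is computed from exactly one filter, so 8 groups of filters applied separately and one stacked bank of 8 times as many filters produce the same channels.

```python
import random


def conv2d_valid(image, kernels):
    """Stride-1 'valid' convolution on nested lists.

    image:   [H][W][C_in]
    kernels: list of filters, each shaped [k][k][C_in]
    returns: [H-k+1][W-k+1][len(kernels)]
    """
    k = len(kernels[0])
    height, width = len(image), len(image[0])
    out = []
    for i in range(height - k + 1):
        row = []
        for j in range(width - k + 1):
            # Each output channel depends on exactly one filter.
            row.append([
                sum(
                    image[i + di][j + dj][c] * filt[di][dj][c]
                    for di in range(k)
                    for dj in range(k)
                    for c in range(len(filt[di][dj]))
                )
                for filt in kernels
            ])
        out.append(row)
    return out


def random_filter(rng, k, c_in):
    """Hypothetical helper: one random [k][k][c_in] filter."""
    return [[[rng.uniform(-1, 1) for _ in range(c_in)]
             for _ in range(k)] for _ in range(k)]


rng = random.Random(0)
H, W, C_IN, K = 5, 5, 4, 3      # stand-ins for 20, 20, 256, 9
N_GROUPS, N_FILTERS = 8, 3      # stand-ins for 8 and 32

image = [[[rng.uniform(-1, 1) for _ in range(C_IN)]
          for _ in range(W)] for _ in range(H)]
groups = [[random_filter(rng, K, C_IN) for _ in range(N_FILTERS)]
          for _ in range(N_GROUPS)]

# Version-1 style: one convolution per group, then concatenate channels.
per_group = [conv2d_valid(image, g) for g in groups]
out_v1 = [
    [sum((per_group[g][i][j] for g in range(N_GROUPS)), [])
     for j in range(W - K + 1)]
    for i in range(H - K + 1)
]

# Version-2 style: a single convolution with all filters stacked.
all_filters = [f for g in groups for f in g]
out_v2 = conv2d_valid(image, all_filters)

# Largest element-wise difference between the two outputs.
max_diff = max(
    abs(a - b)
    for rows in zip(out_v1, out_v2)
    for pixels in zip(*rows)
    for a, b in zip(*pixels)
)
print(len(out_v2[0][0]), max_diff)
```

In both layouts every input point feeds into all 8 x 32 output channels; the only difference is whether the filters are stored as 8 separate banks or as one. The same argument holds for any stride, since the stride only changes which output positions are evaluated.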