Why is the number of output channels of the first convolution in the first layer 32?

niranjantdesai opened this issue 6 years ago

Niranjan Thakurdesai commented 6 years ago

According to Appendix A in the paper, for the CIFAR datasets, the number of output channels of the three scales is set to 6, 12 and 24 respectively. However, num_channels is set to 32 in msdnet.py. This means that the number of output channels in the first layer for the three scales is 32, 64 and 128 respectively according to the default growth rate 1-2-4-4. Why is there a difference between the implementation details in the paper and the code?
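
The arithmetic in the question can be checked with a short sketch. The helper name channels_per_scale is hypothetical, and the multipliers 1, 2, 4 are the first three entries of the default 1-2-4-4 setting mentioned above; this is not the code from msdnet.py.

```python
def channels_per_scale(base_channels, factors):
    """Multiply a base channel count by each per-scale factor."""
    return [base_channels * f for f in factors]

# Three scales for CIFAR, so only the first three factors of 1-2-4-4 apply.
factors = [1, 2, 4]

code_widths = channels_per_scale(32, factors)   # num_channels = 32 in the code
paper_widths = channels_per_scale(6, factors)   # base of 6 from Appendix A

print(code_widths)   # [32, 64, 128]
print(paper_widths)  # [6, 12, 24]
```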

Aviram Bar Haim · Answer 1 · Sat Jun 30 2018 06:32:28 GMT+0800 (China Standard Time)

Hi @niranjantdesai, sorry for the late reply.
If I understand correctly, the original implementation sets the width of the first layer to initChannels, and the growth rate channels (each later layer's output) are then concatenated onto these.
Note that the number of initChannels has been changed in the original implementation 11 days ago:
gaohuang/MSDNet@fc14920.
I didn't find a description of initChannels in the original paper, so please let me know if you understand this differently.
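
A minimal sketch of the concatenation described above, assuming DenseNet-style growth in which every layer appends a fixed number of new channels to its input. The function name width_after_layers and the layer counts are illustrative only and are not taken from either implementation.

```python
def width_after_layers(init_channels, growth_rate, num_layers):
    """Channel count after num_layers layers, each concatenating growth_rate new channels."""
    width = init_channels
    for _ in range(num_layers):
        width += growth_rate  # concatenation adds channels, it does not replace them
    return width

# Finest scale, assuming an initial width of 32 and a growth rate of 6.
widths = [width_after_layers(32, 6, n) for n in range(4)]
print(widths)  # [32, 38, 44, 50]
```

Under this reading, 32 is the starting width of the first layer and 6 is what each later layer contributes, so the two numbers describe different things.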

Niranjan Thakurdesai · Answer 2 · Wed Jul 04 2018 18:17:12 GMT+0800 (China Standard Time)

@avirambh You're right. In this discussion, the original author says that the first layer has a slightly different structure: it is usually set to twice the width of the subsequent layers, following the design of DenseNet. This is not explicitly mentioned in the paper.