gpleiss / efficient_densenet_pytorch

A memory-efficient implementation of DenseNets

What is bn_size?

w32zhong opened this issue · comments


Hi, I looked through the code but I failed to understand the purpose of the bn_size parameter. To my understanding, each layer adds an additional k channels (and no more than that) to the "dense layer" — it should be exactly the growth rate — but according to this implementation, it adds bn_size * growth_rate. Why?

Sorry for my confusion.

bn_size stands for "bottleneck size." Each "dense layer" consists of two convolutional layers. The first "bottlenecks" the features down to bn_size * growth_rate channels. The second goes from bn_size * growth_rate channels to growth_rate channels, and this is the new feature map that gets concatenated to the other features.

See page 4 of the DenseNet paper.
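A minimal sketch of such a bottlenecked dense layer in PyTorch (the class name and layer ordering here are illustrative, modeled on the paper's BN-ReLU-Conv pattern, not copied from this repository):

```python
import torch
import torch.nn as nn

class BottleneckDenseLayer(nn.Module):
    """Hypothetical sketch: 1x1 conv bottlenecks to bn_size * growth_rate
    channels, then a 3x3 conv produces the growth_rate new features."""

    def __init__(self, in_channels, growth_rate, bn_size=4):
        super().__init__()
        inter_channels = bn_size * growth_rate  # the "bottleneck" width
        self.norm1 = nn.BatchNorm2d(in_channels)
        self.conv1 = nn.Conv2d(in_channels, inter_channels,
                               kernel_size=1, bias=False)
        self.norm2 = nn.BatchNorm2d(inter_channels)
        self.conv2 = nn.Conv2d(inter_channels, growth_rate,
                               kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(torch.relu(self.norm1(x)))   # in -> bn_size * k
        out = self.conv2(torch.relu(self.norm2(out))) # bn_size * k -> k
        return torch.cat([x, out], dim=1)             # concat the k new features

x = torch.randn(1, 64, 8, 8)
layer = BottleneckDenseLayer(64, growth_rate=32, bn_size=4)
y = layer(x)
# output has 64 + 32 channels; spatial size is unchanged
```

Note that only growth_rate (= 32 here) new channels leave the layer; the bn_size * growth_rate width exists only inside it.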


@gpleiss Thank you for your answer. Now I have a much better understanding. However, if the "bottleneck size" is fixed at 4, then the dense layer goes init_C -> 4k -> k channels (with spatial size unchanged through the sublayers). Does it require 4k to be smaller than init_C (given the name "bottleneck")? And is the internal "4k" designed to gradually shrink the number of channels down to k? Is that the purpose?

does it require 4k to be smaller than init_C

No, it does not, though it usually is.

And is the internal "4k" designed for gradually shrinking down the number of channels to k?

It's just a way to get more non-linearities, and therefore more capacity, from the network without using too many parameters. It's a trick used by other networks (e.g. ResNets).
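The parameter savings can be checked with a quick back-of-the-envelope count (plain Python; the channel numbers are illustrative, not taken from a specific model, and BatchNorm parameters are ignored):

```python
# Compare parameters of a bottlenecked dense layer (1x1 conv to 4k
# channels, then 3x3 conv to k) against a single 3x3 conv straight to k.

def bottleneck_params(in_c, k, bn_size=4):
    inter = bn_size * k
    # 1x1 conv: in_c * inter weights; 3x3 conv: inter * k * 9 weights
    return in_c * inter + inter * k * 3 * 3

def plain_params(in_c, k):
    # single 3x3 conv from in_c channels straight to k channels
    return in_c * k * 3 * 3

print(bottleneck_params(256, 32))  # 69632
print(plain_params(256, 32))       # 73728
print(bottleneck_params(512, 32))  # 102400
print(plain_params(512, 32))       # 147456
```

With in_c = 256 the bottleneck version is already slightly cheaper despite having two convolutions (and an extra non-linearity), and the savings grow as concatenation drives in_c up deeper in the dense block, since the 3x3 conv always sees only 4k input channels.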