liuzhuang13 / DenseNet

Densely Connected Convolutional Networks, In CVPR 2017 (Best Paper Award).

About a tensorflow implementation

jh-jeong opened this issue:

I followed one of the TensorFlow implementations of DenseNet (https://github.com/ikhlestov/vision_networks) to reproduce DenseNet-BC-100-12.
The TensorFlow implementation looks nearly equivalent to the one in this repo,
but I couldn't reach ~4.5% error (my best result was ~4.8%).
Could you suggest any reasons for the gap? I have already compared the two codebases very carefully, but couldn't find a difference.

@jh-jeong In my "Much more efficient caffe implementation", I also reach about 4.8% for DenseNet-BC-100-12. I am curious about the cause, which seems to be common to both Caffe and TensorFlow.

@Tongcheng I was finally able to get 4.5% in TensorFlow. What I changed is as follows:

  1. Changing the momentum in each BN layer. In TensorFlow, batch normalization uses 0.999 as the default value, but Torch uses 0.9.
  2. Applying weight decay to all trainable variables, as fb.resnet.torch does, including the beta/gamma variables in BN and all biases.
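
For anyone who wants to see why these two changes matter, here is a rough, framework-free sketch in plain Python. It is not the code from either repository. The function names, the variable names in the dictionary, and the naming rule "ends with `kernel`" are all hypothetical, and the update rule assumes TensorFlow's convention for the BN moving average (`moving = decay * moving + (1 - decay) * batch_value`). The L2 term is written as `weight_decay * 0.5 * sum(v^2)`, which is assumed here to match adding `weight_decay * v` to the gradient.

```python
def update_moving_average(moving, batch_value, decay):
    """One running-statistic update, TensorFlow-style (assumed convention)."""
    return decay * moving + (1.0 - decay) * batch_value


def steps_to_converge(decay, target=1.0, tol=0.05):
    """Count updates until the moving average is within tol of a constant target."""
    moving, steps = 0.0, 0
    while abs(target - moving) > tol:
        moving = update_moving_average(moving, target, decay)
        steps += 1
    return steps


def l2_penalty(variables, weight_decay, include_bn_and_bias=True):
    """weight_decay * 0.5 * sum of squares over the selected variables.

    variables: dict mapping a (hypothetical) variable name to a flat list of floats.
    If include_bn_and_bias is False, only names ending in "kernel" are penalized.
    """
    total = 0.0
    for name, values in variables.items():
        if include_bn_and_bias or name.endswith("kernel"):
            total += 0.5 * sum(v * v for v in values)
    return weight_decay * total


if __name__ == "__main__":
    # Change 1: a decay of 0.999 tracks the batch statistics far more slowly than 0.9.
    print("steps with decay 0.9:  ", steps_to_converge(0.9))
    print("steps with decay 0.999:", steps_to_converge(0.999))

    # Change 2: penalizing every trainable variable vs. convolution/FC kernels only.
    toy_vars = {
        "conv1/kernel": [1.0, 2.0],
        "bn1/gamma": [1.0],
        "bn1/beta": [0.0],
        "fc/bias": [2.0],
    }
    print("all variables:", l2_penalty(toy_vars, 1e-4, include_bn_and_bias=True))
    print("kernels only: ", l2_penalty(toy_vars, 1e-4, include_bn_and_bias=False))
```

With a decay of 0.999 the running mean and variance used at test time lag far behind the batch statistics, which may hurt evaluation accuracy when training runs for a limited number of steps. The second function only shows which variables enter the penalty; in a real TensorFlow model the equivalent would be summing an L2 loss over every trainable variable instead of filtering by name.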

@jh-jeong Can you share your TensorFlow code?