affinelayer / pix2pix-tensorflow

Tensorflow port of Image-to-Image Translation with Conditional Adversarial Nets https://phillipi.github.io/pix2pix/

momentum in batchnorm function is wrong. It should be 0.9

dishank-b opened this issue · comments

I understand that you have copied the parameter values from the original Torch implementation. But note that in TensorFlow the equivalent value is momentum = 1 - 0.1 = 0.9, because Torch and TensorFlow use opposite conventions for the batchnorm moving average.
For more detail, see https://stackoverflow.com/questions/48345857/batchnorm-momentum-convention-pytorch

This can significantly degrade results at test time.
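To make the convention difference concrete, here is a small illustrative sketch (not code from this repo): Torch-style momentum weights the new batch statistic, while TensorFlow-style momentum weights the old running statistic, so the two agree when tf_momentum = 1 - torch_momentum.

```python
# Hypothetical illustration of the two moving-average conventions
# for a single batchnorm statistic (not code from pix2pix-tensorflow).
running_torch = 0.0
running_tf = 0.0
batch_mean = 5.0  # pretend every batch produces this mean

torch_momentum = 0.1                 # Torch/PyTorch default
tf_momentum = 1.0 - torch_momentum   # equivalent TensorFlow value: 0.9

for _ in range(10):
    # Torch convention: momentum weights the NEW batch statistic.
    running_torch = (1 - torch_momentum) * running_torch + torch_momentum * batch_mean
    # TensorFlow convention: momentum weights the OLD running statistic.
    running_tf = tf_momentum * running_tf + (1 - tf_momentum) * batch_mean

# With tf_momentum = 1 - torch_momentum, both updates are identical.
print(abs(running_torch - running_tf) < 1e-12)  # True
```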

For what it's worth, I ran the facade example with:

  • momentum=0.1
  • momentum=0.9
  • no batchnorm at all

[results image: batchnorm comparison]


@99991 @dishank-b It seems momentum=0.1 works well

@99991: Thanks for sharing the results. But I think testing on one dataset won't be enough. It may be that these images are very similar, so the moving average converged easily even with momentum=0.1. On other datasets it may converge slowly, or not converge to the correct mean and variance at all. It would be good if you could re-run the experiment on a larger dataset. And just to make sure: are these from the test set and not the training set?

Thank you for mentioning this! I have been working with Cityscapes and have had very bad generator results so far. Perhaps this is one of the reasons. I'll test it out!

Nice observation!
Have people tested this out?

Default parameters for tf.layers.batch_normalization in TensorFlow 1.14, i.e. momentum=0.99:

def batch_normalization(inputs,
                        axis=-1,
                        momentum=0.99,
                        epsilon=1e-3,
                        center=True,
                        scale=True,
                        beta_initializer=init_ops.zeros_initializer(),
                        gamma_initializer=init_ops.ones_initializer(),
                        moving_mean_initializer=init_ops.zeros_initializer(),
                        moving_variance_initializer=init_ops.ones_initializer(),
                        beta_regularizer=None,
                        gamma_regularizer=None,
                        beta_constraint=None,
                        gamma_constraint=None,
                        training=False,
                        trainable=True,
                        name=None,
                        reuse=None,
                        renorm=False,
                        renorm_clipping=None,
                        renorm_momentum=0.99,
                        fused=None,
                        virtual_batch_size=None,
                        adjustment=None):
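Note that the TensorFlow default of momentum=0.99 makes the moving statistics adapt very slowly, which is one plausible explanation for the bad test-time results reported above. A rough back-of-the-envelope sketch (assumed setup, not repo code): starting from 0, the moving mean reaches roughly 1 - momentum^steps of a constant batch mean.

```python
# Sketch of how fast a TensorFlow-style moving mean approaches a constant
# batch mean of 1.0, for different momentum values (illustrative only).
def moving_mean_after(steps, momentum, batch_mean=1.0, start=0.0):
    m = start
    for _ in range(steps):
        # TensorFlow convention: momentum weights the old running value.
        m = momentum * m + (1 - momentum) * batch_mean
    return m

for momentum in (0.9, 0.99):
    print(momentum, moving_mean_after(200, momentum))
```

After 200 updates, momentum=0.9 has essentially converged to the true mean, while momentum=0.99 is still noticeably short of it, so a short training run can leave the stored statistics badly off.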