affinelayer / pix2pix-tensorflow

Tensorflow port of Image-to-Image Translation with Conditional Adversarial Nets https://phillipi.github.io/pix2pix/

momentum in batchnorm function is wrong. It should be 0.9

dishank-b opened this issue · comments

I understand that you have copied the parameter values from the original Torch implementation. But note that in TensorFlow the equivalent value is momentum = 1 - 0.1 = 0.9, because Torch and TensorFlow use opposite conventions for the batchnorm moving average.
For more detail, see https://stackoverflow.com/questions/48345857/batchnorm-momentum-convention-pytorch

This can significantly degrade results at test time.
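To make the convention difference concrete, here is a small illustrative sketch (not code from this repo): Torch-style momentum weights the new batch statistic, while TensorFlow-style momentum weights the old running statistic, so the two agree when tf_momentum = 1 - torch_momentum.

```python
# Hypothetical illustration of the two moving-average conventions
# for a single batchnorm statistic (not code from pix2pix-tensorflow).
running_torch = 0.0
running_tf = 0.0
batch_mean = 5.0  # pretend every batch produces this mean

torch_momentum = 0.1                 # Torch/PyTorch default
tf_momentum = 1.0 - torch_momentum   # equivalent TensorFlow value: 0.9

for _ in range(10):
    # Torch convention: momentum weights the NEW batch statistic.
    running_torch = (1 - torch_momentum) * running_torch + torch_momentum * batch_mean
    # TensorFlow convention: momentum weights the OLD running statistic.
    running_tf = tf_momentum * running_tf + (1 - tf_momentum) * batch_mean

# With tf_momentum = 1 - torch_momentum, both updates are identical.
print(abs(running_torch - running_tf) < 1e-12)  # True
```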

For what it's worth, I ran the facade example with:

  • momentum=0.1
  • momentum=0.9
  • no batchnorm at all

[results image: batchnorm comparison]


@99991 @dishank-b It seems momentum=0.1 works well

@99991: Thanks for sharing the results. But I think testing on one dataset won't be enough. It may be that these images are very similar, so the moving average converged easily even with momentum=0.1. On other datasets it may converge slowly, or not converge to the correct mean and variance at all. It would be good if you could re-run the experiment on a larger dataset. And just to make sure: are these from the test set and not the training set?

Thank you for mentioning this! I have been working with Cityscapes and have had very bad generator results so far. Perhaps this is one of the reasons. I'll test it out!

Nice observation!
Have people tested this out?

Default parameters for tf.layers.batch_normalization in TensorFlow 1.14, i.e. momentum=0.99:

def batch_normalization(inputs,
                        axis=-1,
                        momentum=0.99,
                        epsilon=1e-3,
                        center=True,
                        scale=True,
                        beta_initializer=init_ops.zeros_initializer(),
                        gamma_initializer=init_ops.ones_initializer(),
                        moving_mean_initializer=init_ops.zeros_initializer(),
                        moving_variance_initializer=init_ops.ones_initializer(),
                        beta_regularizer=None,
                        gamma_regularizer=None,
                        beta_constraint=None,
                        gamma_constraint=None,
                        training=False,
                        trainable=True,
                        name=None,
                        reuse=None,
                        renorm=False,
                        renorm_clipping=None,
                        renorm_momentum=0.99,
                        fused=None,
                        virtual_batch_size=None,
                        adjustment=None):
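Note that the TensorFlow default of momentum=0.99 makes the moving statistics adapt very slowly, which is one plausible explanation for the bad test-time results reported above. A rough back-of-the-envelope sketch (assumed setup, not repo code): starting from 0, the moving mean reaches roughly 1 - momentum^steps of a constant batch mean.

```python
# Sketch of how fast a TensorFlow-style moving mean approaches a constant
# batch mean of 1.0, for different momentum values (illustrative only).
def moving_mean_after(steps, momentum, batch_mean=1.0, start=0.0):
    m = start
    for _ in range(steps):
        # TensorFlow convention: momentum weights the old running value.
        m = momentum * m + (1 - momentum) * batch_mean
    return m

for momentum in (0.9, 0.99):
    print(momentum, moving_mean_after(200, momentum))
```

After 200 updates, momentum=0.9 has essentially converged to the true mean, while momentum=0.99 is still noticeably short of it, so a short training run can leave the stored statistics badly off.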