mingyuliutw / UNIT

Unsupervised Image-to-Image Translation

Difference between results with/without shared weights

Peony-Wang opened this issue · comments

Hi, thank you for your awesome work!

I noticed that you removed the shared weights in the generator and discriminator in the latest implementation, which is organized in the same format as the MUNIT repo.

I'm wondering why this removal was necessary and how it affects the final results, especially for photorealistic transfer (e.g., summer to winter, day to night), and whether it influences the sharpness of the generated images. I assume the change from the original NIPS implementation to one similar to MUNIT brings some extra benefit, since you mentioned in a closed issue that in many cases the shared weights work well: "I found for some tasks, discriminator weight sharing is quite useful. For example, for the SVHN to MNIST domain adaptation, the two adversarial discriminators share weights for several layers. I also found that for the face image translation, discriminator weight-sharing is helpful too (the yaml file I released actually uses this setting). But when the domains are quite different and a patch-based discriminator is used, which often only has a few layers, discriminator weight sharing could hurt the performance."
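For reference, the discriminator weight-sharing described in that quote can be sketched roughly as two domain discriminators reusing the same first few layers. This is a minimal, illustrative PyTorch sketch; the module names, layer sizes, and the number of shared layers are assumptions, not the released UNIT configuration.

```python
import torch.nn as nn

class SharedFrontDiscriminator(nn.Module):
    """Discriminator whose early layers are shared with the other domain."""
    def __init__(self, shared_front: nn.Module):
        super().__init__()
        self.shared_front = shared_front          # same module instance for both domains
        self.private_tail = nn.Sequential(        # domain-specific layers
            nn.Conv2d(128, 256, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 1, 1),                 # patch-level real/fake scores
        )

    def forward(self, x):
        return self.private_tail(self.shared_front(x))

# The shared front is built once and passed to both discriminators,
# so its weights receive gradients from images of both domains.
shared_front = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
)
dis_a = SharedFrontDiscriminator(shared_front)
dis_b = SharedFrontDiscriminator(shared_front)
```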

Hi, I wonder what the difference is between the current version and CycleGAN. Please reply if you know. Thank you.

@Peony-Wang

Weight-sharing is an inductive bias. For faces, where the two domains are very similar, using more weight-sharing layers should help performance. But when the two domains require some shape deformation, like dogs to cats, it hurts. In the original implementation, I made the number of weight-sharing layers a parameter that the user can specify. I meant to redo this in the new repos but unfortunately haven't found the time.
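One way to expose the number of weight-sharing layers as a user parameter is sketched below, with the two domain encoders sharing their last `n_shared` blocks (the layers closest to the latent code). The block definition, sizes, and function names are placeholders, not the actual repo code.

```python
import torch.nn as nn

def conv_block(dim=256):
    # Placeholder building block; the real encoders use residual blocks.
    return nn.Sequential(
        nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(dim, dim, 3, padding=1),
    )

def build_encoders(n_blocks=4, n_shared=1):
    # The shared tail is a single module reused by both encoders.
    shared = nn.Sequential(*[conv_block() for _ in range(n_shared)])
    private_a = nn.Sequential(*[conv_block() for _ in range(n_blocks - n_shared)])
    private_b = nn.Sequential(*[conv_block() for _ in range(n_blocks - n_shared)])
    enc_a = nn.Sequential(private_a, shared)   # domain A: private layers, then shared
    enc_b = nn.Sequential(private_b, shared)   # domain B reuses the same shared tail
    return enc_a, enc_b

# n_shared=0 removes weight sharing entirely; larger values impose a stronger
# shared-latent inductive bias (useful for very similar domains such as faces).
enc_a, enc_b = build_encoders(n_blocks=4, n_shared=2)
```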

@ikourbane

Even when we don't do any weight-sharing, the model is still different from CycleGAN because of the autoencoding loss. We reconstruct the input using the same domain's encoder and decoder, so it is still based on the shared-latent-space constraint.
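As an illustration of that point, the within-domain autoencoding term (which CycleGAN does not have) can be sketched as follows. `enc_a`/`dec_a`/`enc_b`/`dec_b` are assumed to be the domain-specific encoders and decoders (possibly with some shared layers); the function name and loss weights are illustrative.

```python
import torch.nn.functional as F

def autoencoding_loss(x_a, x_b, enc_a, dec_a, enc_b, dec_b):
    # Encode and decode each image within its own domain and compare to the input.
    z_a = enc_a(x_a)                      # latent code of the domain-A image
    z_b = enc_b(x_b)                      # latent code of the domain-B image
    x_a_recon = dec_a(z_a)                # reconstruct within domain A
    x_b_recon = dec_b(z_b)                # reconstruct within domain B
    return F.l1_loss(x_a_recon, x_a) + F.l1_loss(x_b_recon, x_b)

# Cross-domain translation still goes through the same latent space,
# e.g. dec_b(enc_a(x_a)), which is what the shared-latent-space constraint refers to.
```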