mingyuliutw / UNIT

Unsupervised Image-to-Image Translation

Several questions regarding the VGG Loss

sanchitsgupta opened this issue · comments

Hello,

First of all, thank you for making the code publicly available. Your work on image-to-image translation is really amazing.

I am trying to achieve domain translation between two datasets. For this I am working with the master branch of UNIT, and I noticed a difference between it and the version_02 branch: the former includes the VGG loss while the latter does not. I have several questions about this. (I went through the closed issues and apologize in advance if any of these have already been answered.)

  1. What is the reason for adding a VGG loss to the master branch but not to version_02? Is it because the latter has explicit shared layers in the generator, or is it due to some other reason?

  2. How did you decide the weights of the generator's various losses? Why is the default VGG loss weight so low compared to the cyclic and reconstruction loss weights?

  3. In my experiments I noticed that, without the VGG loss, objects start to disappear in the translated images. Why is this so? I ask because the VGG loss seems to have a huge impact on the quality of the generated images despite its low weight.

  4. In further experiments on a custom dataset, I noticed that the VGG loss for both generators first decreases and then increases by a large amount. Moreover, vgg_a is much higher than vgg_b. Note that my datasets were imbalanced, with a ratio of 7:1. Is this the reason behind the increase in the loss values? If so, could you explain why? Or do I need to change the optimizer, learning rate, step size, etc.?

Thank you so much for taking your time to read this.
Any help is greatly appreciated :)

@sanchit199617

Regarding question 1

This is not the standard VGG loss. It is a variant we call the domain-invariant VGG loss. The details are described in our follow-up work on MUNIT (paper: https://arxiv.org/abs/1804.04732; code: https://github.com/NVlabs/MUNIT). Basically, we apply a normalization technique to the VGG features to make the loss less sensitive to the domain change. Since minimizing the VGG loss is a regression problem, it tends to stabilize training for high-resolution images.
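For reference, here is a minimal sketch of how such a domain-invariant perceptual loss can be implemented, modeled on the `compute_vgg_loss` routine in the MUNIT code: the VGG features of the two images are instance-normalized before taking the mean-squared error, which discards per-image feature statistics such as overall color and contrast. The layer choice (VGG16 up to relu4_3) is illustrative, and the repo's `vgg_preprocess` step is omitted for brevity.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class DomainInvariantVGGLoss(nn.Module):
    """MSE between instance-normalized VGG features of two images.

    Instance normalization removes per-image, per-channel feature
    statistics, so the loss is less sensitive to global appearance
    differences (color, contrast) between the two domains.
    """

    def __init__(self):
        super().__init__()
        # Frozen VGG16 feature extractor up to relu4_3 (illustrative choice).
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.features = vgg.features[:23].eval()
        for p in self.features.parameters():
            p.requires_grad = False
        # affine=False: pure whitening of the 512 feature channels at relu4_3.
        self.instancenorm = nn.InstanceNorm2d(512, affine=False)

    def forward(self, img, target):
        f_img = self.instancenorm(self.features(img))
        f_target = self.instancenorm(self.features(target))
        return torch.mean((f_img - f_target) ** 2)
```

In a trainer this would be applied between an input image and its translation (e.g. between `x_a` and `x_ab`) and added to the generator objective with the VGG loss weight.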

Regarding question 2
We do not have a principled way to determine the VGG loss weight. We tested several values; the default works reasonably well for our test cases.
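To make the role of the weight concrete, here is a schematic sketch of how the individual terms combine into the generator objective. The weight names mirror the keys used in the repo's YAML configs, but the values below are placeholders rather than the actual defaults:

```python
import torch

def generator_objective(loss_adv, loss_recon, loss_cyc, loss_vgg,
                        gan_w=1.0, recon_x_w=10.0,
                        recon_x_cyc_w=10.0, vgg_w=1.0):
    """Weighted sum of generator loss terms.

    The weights are placeholders for illustration; the actual defaults
    live in the experiment configs.
    """
    return (gan_w * loss_adv
            + recon_x_w * loss_recon
            + recon_x_cyc_w * loss_cyc
            + vgg_w * loss_vgg)

# Example with dummy scalar losses:
total = generator_objective(torch.tensor(0.7), torch.tensor(0.05),
                            torch.tensor(0.04), torch.tensor(0.3))
```

Note that the weights only set relative importance; the raw magnitudes of the individual terms differ, which is why a nominally small VGG weight can still have a visible effect.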

Regarding questions 3 and 4
Scene complexity affects image-to-image translation training a lot. Unfortunately, I am not aware of a general recipe for dealing with all cases.

@mingyuliutw Thank you very much for the prompt response.