Several questions regarding the VGG Loss
sanchitsgupta opened this issue · comments
Hello,
First of all, thank you for making the code publicly available. Your work on image-to-image translation is really amazing.
I am trying to achieve domain translation between two datasets. For this I am working with the master branch of UNIT, and I noticed a difference between it and the version_02 branch: the latter lacks the VGG loss while the former includes it. I have several questions about this. (I tried going through the closed issues and apologize in advance if any of these have been answered before.)
1. What is the reason for adding a VGG loss to the master branch but not to version_02? Is it because the latter has explicit shared layers in the generator, or is there some other reason?
2. How did you decide the weights of the generator's various losses? Why is the default VGG loss weight so low compared to the cyclic and reconstruction losses?
3. In my experiments, I noticed that without the VGG loss, objects start to disappear in the translated images. Why is this? I ask because the VGG loss seems to have a huge impact on the quality of the generated images despite its low weight.
4. In further experiments on a custom dataset, I noticed that the VGG loss for both generators first decreases and then increases by a huge amount. Moreover, vgg_a is much higher than vgg_b. It should be noted that my datasets were imbalanced, with a ratio of 7:1. Is this the reason behind the increase in loss values? If so, could you please explain why? Or do I need to change the optimizer, learning rate, step size, etc.?
Thank you so much for taking your time to read this.
Any help is greatly appreciated :)
@sanchit199617
Regarding question 1
This is not the standard VGG loss. It is a variant called the domain-invariant VGG loss. The details are described in our follow-up work on MUNIT (paper: https://arxiv.org/abs/1804.04732; code: https://github.com/NVlabs/MUNIT). Basically, we use a normalization technique to make the VGG loss less sensitive to domain changes. Since minimizing the VGG loss amounts to a regression problem, it tends to stabilize training for high-resolution images.
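The normalization idea can be illustrated outside any framework: normalize the per-channel statistics of the feature maps before comparing them, so that global appearance shifts between domains mostly cancel out. The NumPy sketch below uses random arrays as a stand-in for real VGG features; it demonstrates the principle only, and the actual MUNIT implementation may differ in details.

```python
import numpy as np

def instance_norm(feat, eps=1e-5):
    """Normalize each channel of a (C, H, W) feature map to zero mean, unit variance."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    return (feat - mean) / (std + eps)

def domain_invariant_perceptual_loss(feat_a, feat_b):
    """MSE between instance-normalized feature maps.

    Removing per-channel first- and second-order statistics makes the loss
    insensitive to global, domain-specific appearance shifts, which is the
    idea behind the domain-invariant VGG loss.
    """
    return np.mean((instance_norm(feat_a) - instance_norm(feat_b)) ** 2)

# A global contrast/brightness shift (a crude stand-in for a domain gap)
# leaves the normalized loss near 0, while the plain MSE stays large.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))   # pretend these are VGG features
shifted = 2.0 * feat + 1.0                # same content, different "domain"

plain_mse = np.mean((feat - shifted) ** 2)
dip_loss = domain_invariant_perceptual_loss(feat, shifted)
print(plain_mse > 1.0, dip_loss < 1e-6)
```

In other words, the loss ignores the global statistics that separate the two domains and penalizes only structural differences, which is why it can supervise cross-domain translation without fighting the domain gap itself.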
Regarding question 2
We do not have a principled way to determine the VGG loss weight. We tested several values, and the default one works reasonably well for our test cases.
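For concreteness, the generator objective is a weighted sum of its individual loss terms, so the weights only control the relative influence of each term. The names and values below are hypothetical placeholders, not the actual config defaults:

```python
# Hypothetical loss weights, in the spirit of the weight entries in UNIT's
# configs (GAN, reconstruction, cycle, VGG); the real defaults vary by config.
weights = {"gan": 1.0, "recon": 10.0, "cycle": 10.0, "vgg": 0.1}

def total_generator_loss(losses, weights):
    """Weighted sum of the individual generator loss terms."""
    return sum(weights[name] * value for name, value in losses.items())

# Example raw loss values (made up): a small VGG weight can still matter
# because the raw VGG term is often much larger than the other terms.
losses = {"gan": 0.7, "recon": 0.05, "cycle": 0.08, "vgg": 2.0}
print(total_generator_loss(losses, weights))  # ~2.2
```

Note that a "low" weight does not imply low influence: the gradient the VGG term contributes depends on the product of its weight and the scale of the raw loss.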
Regarding questions 3 and 4,
Scene complexity affects image-to-image translation training a lot. Unfortunately, I am not aware of a general recipe for dealing with all cases.
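For the 7:1 imbalance specifically, one common mitigation (a suggestion, not something from the original reply) is to oversample the smaller domain so each training epoch draws equally from both. A minimal stdlib sketch:

```python
import random

def balanced_indices(len_a, len_b, seed=0):
    """Draw equally many sample indices from both domains per epoch.

    Indices for the smaller dataset are drawn with replacement, so a 7:1
    size imbalance does not translate into a 7:1 sampling imbalance.
    """
    rng = random.Random(seed)
    n = max(len_a, len_b)
    idx_a = [rng.randrange(len_a) for _ in range(n)]
    idx_b = [rng.randrange(len_b) for _ in range(n)]
    return list(zip(idx_a, idx_b))

# e.g. 700 images in domain A, 100 in domain B
pairs = balanced_indices(700, 100)
print(len(pairs))  # one index pair per step; domain B repeats samples
```

In a PyTorch training loop the same effect is usually achieved with a weighted or resampling data sampler rather than explicit index lists.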
@mingyuliutw Thank you very much for the prompt response.