AlamiMejjati / Unsupervised-Attention-guided-Image-to-Image-Translation

Unsupervised Attention-Guided Image to Image Translation


FID

daishuolove opened this issue · comments

Could I ask how you use FID for evaluation?

I ran the code provided by the CycleGAN authors and evaluated it with the official PyTorch FID implementation, without changing its source code. The default layer is pool3. The resulting FID between the generated zebra images (120 in total) and the real zebra images is 130.5, whereas the result reported in your paper is 151.52.
How many generated zebra images did you use for evaluation in your test?
Is the pool3 layer of the Inception network different from the last hidden layer of the Inception architecture?

In our work, we want to measure the ability to perform image-to-image translation only on the foreground of the input image, without altering its background. Consequently, we wish the foreground of our generated image to share the visual properties of the target domain, while its background should share the visual properties of the source domain.

As such, the FID value we report is the mean of two values: the FID between the generated images and the source domain, and the FID between the generated images and the target domain.

In the case of horse => zebra, for example, we first calculate the FID between the generated zebra images (120 images) and the real zebra images (140 images), then the FID between the generated zebra images (120) and the real horse images (120), using the test sets in both cases. Finally, we take the mean of these two values.

N.B.: We use the TensorFlow FID implementation from https://github.com/bioinf-jku/TTUR
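For clarity, here is a minimal sketch of how that reported number could be assembled. This is not the authors' evaluation script; `compute_fid` is a placeholder that shells out to pytorch-fid (the TTUR TensorFlow script above could be used instead), and the directory paths are hypothetical.

```python
import subprocess

def compute_fid(dir_a: str, dir_b: str) -> float:
    """Placeholder FID backend: compare two image folders with pytorch-fid.

    Swap in the TTUR TensorFlow script (or any other implementation) if preferred.
    """
    out = subprocess.run(
        ["python", "-m", "pytorch_fid", dir_a, dir_b],
        capture_output=True, text=True, check=True,
    )
    # pytorch-fid prints a line ending in the FID value; parse the number.
    return float(out.stdout.strip().split()[-1])

# Hypothetical directory layout for the horse => zebra experiment:
generated_zebras = "results/horse2zebra/fake_B"   # generated zebra images (test set)
real_zebras      = "datasets/horse2zebra/testB"   # real zebra images (target domain)
real_horses      = "datasets/horse2zebra/testA"   # real horse images (source domain)

fid_target = compute_fid(generated_zebras, real_zebras)  # foreground should match target
fid_source = compute_fid(generated_zebras, real_horses)  # background should match source
reported_fid = (fid_target + fid_source) / 2.0

print(f"FID vs target: {fid_target:.2f}, FID vs source: {fid_source:.2f}, "
      f"reported (mean): {reported_fid:.2f}")
```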

I understand what you mean: you wish the foreground of the generated image to share the visual properties of the target domain, while its background should share the visual properties of the source domain, and that is why you compute the FID this way. But is this method actually a good measure of that goal, and what is its theoretical basis?

Unfortunately, we do not have a theoretical study showing that this method is consistent with what we wish to measure, but it at least gives us an indication.

If we had ground-truth masks, one option would be to measure the FID only between the foreground pixels and the target domain, and only between the background pixels and the source domain. Unfortunately, we do not have such ground-truth masks, so we took the mean, which we believe is consistent with the visual results we obtained.
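To make that hypothetical masked variant concrete, here is a rough sketch assuming ground-truth masks existed (they do not for this dataset). The names `split_by_mask` and `compute_fid_on_arrays` are illustrative only, and zeroing out masked pixels is just one possible way to feed the regions to Inception.

```python
import numpy as np

def split_by_mask(image: np.ndarray, mask: np.ndarray):
    """Split an HxWx3 image into foreground and background using a binary HxW mask.

    Illustration of the idea above; masked-out pixels are simply zeroed here.
    """
    mask3 = mask[..., None].astype(image.dtype)   # broadcast mask to 3 channels
    foreground = image * mask3                    # keep only the translated object
    background = image * (1.0 - mask3)            # keep only the untouched scene
    return foreground, background

# Hypothetical evaluation (compute_fid_on_arrays is a stand-in for an FID
# implementation that accepts in-memory image batches):
# fg_batch = [split_by_mask(img, m)[0] for img, m in zip(generated, masks)]
# bg_batch = [split_by_mask(img, m)[1] for img, m in zip(generated, masks)]
# fid_fg = compute_fid_on_arrays(fg_batch, real_target_images)  # foreground vs target
# fid_bg = compute_fid_on_arrays(bg_batch, real_source_images)  # background vs source
```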

Thank you