google / compare_gan

Compare GAN code.


How do you compute FID on Fashion-MNIST?

HuMingqi opened this issue · comments

commented

Fashion-MNIST images are grayscale, of shape 28x28x1, but Inception-v3 requires 3-channel input. So do you transform the Fashion-MNIST images somehow? If so, where is that done in your code?
Thanks.

commented

For channels, we tile:

# In case we use a 1-channel dataset (like mnist) - convert it to 3 channel.

For resolution, it's resized by the tfgan library:

fn=tfgan_eval.preprocess_image,
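For illustration, the channel tiling described above can be sketched in NumPy (a stand-in for the repo's TensorFlow code; `grayscale_to_rgb` is a hypothetical helper name, not from compare_gan):

```python
import numpy as np

def grayscale_to_rgb(images):
    """Convert a batch of 1-channel images to 3 channels by tiling,
    so the same pixel value is repeated in R, G, and B."""
    assert images.shape[-1] == 1, "expects NHWC input with C == 1"
    return np.tile(images, [1, 1, 1, 3])

batch = np.random.rand(8, 28, 28, 1)
rgb = grayscale_to_rgb(batch)
print(rgb.shape)  # (8, 28, 28, 3)
```

Since the three channels are identical copies, no color information is invented; Inception simply sees a gray RGB image.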

commented

Thanks for the details. So the grayscale images are converted to RGB by repeating the channel. And you mean the Fashion-MNIST images (28x28) are also resized? I don't think that's necessary.

commented

The input to Inception is not of size 28x28, so the images need to be resized. How would you compute the embedding otherwise?
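The resize step being discussed can be illustrated with a simple nearest-neighbor resize in NumPy (tfgan's `preprocess_image` uses TensorFlow's own bilinear resize internally; this is only an assumption-flagged stand-in to show the shape change):

```python
import numpy as np

def nearest_resize(images, out_h, out_w):
    """Nearest-neighbor resize for a batch of NHWC images."""
    n, h, w, c = images.shape
    # Map each output row/column back to the nearest source index.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return images[:, rows][:, :, cols]

batch = np.random.rand(4, 28, 28, 3).astype(np.float32)
resized = nearest_resize(batch, 299, 299)
print(resized.shape)  # (4, 299, 299, 3)
```

The point of contention in this thread is exactly this step: whether such an upsampling to Inception's native resolution is required, and whether it distorts the data distribution.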

commented

Inception doesn't limit the input size; it only requires that the input have three channels. I got the Inception Score without a resize op on Fashion-MNIST.
https://github.com/openai/improved-gan/blob/0a09faccb45088695228bbf50435ee71e94eb2ce/inception_score/model.py#L77
input_tensor = tf.placeholder(tf.float32, shape=[None, None, None, 3])
The FID is the Wasserstein-2 distance between two Gaussian distributions estimated from the Inception pool3 embeddings; the embedding dimensions are consistent as long as the input size is the same for real and fake images. Moreover, if the images are resized, the data distribution will be distorted.
And I don't find a resize op in the authors' library:
https://github.com/bioinf-jku/TTUR
...Maybe I'm wrong given my shallow knowledge, thanks.
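For context, the distance mentioned above is the squared Wasserstein-2 (Fréchet) distance between the two Gaussians fitted to the pool3 embeddings: FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2(C_r C_g)^(1/2)). A minimal sketch of that formula (not the compare_gan or TTUR implementation; SciPy's `sqrtm` is assumed available):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Squared Wasserstein-2 distance between two Gaussians
    N(mu1, sigma1) and N(mu2, sigma2)."""
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        # sqrtm can return tiny imaginary parts from numerical noise.
        covmean = covmean.real
    diff = mu1 - mu2
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)

# Fitting both Gaussians to the same embeddings gives (numerically) zero.
rng = np.random.default_rng(0)
emb = rng.normal(size=(500, 4))
mu, sigma = emb.mean(axis=0), np.cov(emb, rowvar=False)
print(abs(frechet_distance(mu, sigma, mu, sigma)) < 1e-6)
```

The formula itself only needs the means and covariances to have matching dimensions, which is why the commenter's point about keeping real and fake inputs the same size matters for the pool3 embeddings.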

commented

The OpenAI and tf.gan code differ in where they feed the data into Inception (before or after preprocessing). tf.gan uses the input node 'Mul:0' while OpenAI uses the input node 'ExpandDims:0'.

As far as I remember, this explains the difference, because the resizing to 299x299 happens between those two nodes.