Where are images resized?

Question

unixpickle opened this issue 4 years ago

It appears that the code never resizes images to be the correct 299x299 for the inception model. Is it the case that all of the results on 64x64 images are obtained by feeding smaller images into the convolutional network and simply assuming that the outputs are meaningful? Or is there a resize somewhere I'm not seeing?

I also observed that resolution mattered immensely when comparing to the precomputed npz matrices in this repository. In particular, if the images were not 64x64, the FID was extremely high, so I'm assuming those npz matrices were computed by feeding 64x64 images directly into the inception graph.

Bahjat Kawar · Answer 1 · Sat Aug 28 2021 15:16:09 GMT+0800 (China Standard Time)

It looks like the model resizes the images to 299x299.

I am also facing the issue of extremely high FID when comparing to the precomputed npz matrices. Did you find a solution for this? Is there perhaps a different set of precomputed statistics for comparison on other resolutions (128, 256, 512)?

Martin Heusel · Answer 2 · Sat Aug 28 2021 15:57:33 GMT+0800 (China Standard Time)

Hi,
other datasets, and this includes the same images at different resolutions, produce different activations in the coding layer (e.g. the last pooling layer of the Inception network), and therefore different statistics. You need to precompute the reference statistics for your dataset yourself. HTH
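
To make the advice above concrete, here is a minimal sketch of what "precomputing the reference statistics" amounts to, assuming you have already extracted the pooling-layer activations for your own dataset at your own resolution as an (N, D) NumPy array. The function names, the `mu`/`sigma` key names in the saved file, and the random stand-in activations are illustrative assumptions, not this repository's API. The trace term is computed from eigenvalues with plain NumPy rather than with a matrix square root routine, so small numerical differences from other implementations are possible.

```python
import numpy as np


def activation_statistics(activations):
    """Return the mean vector and covariance matrix of (N, D) activations."""
    acts = np.asarray(activations, dtype=np.float64)
    mu = acts.mean(axis=0)
    sigma = np.cov(acts, rowvar=False)  # columns are features, so rowvar=False
    return mu, sigma


def save_statistics(path, mu, sigma):
    """Store the statistics in an .npz file (key names are an assumption)."""
    np.savez(path, mu=mu, sigma=sigma)


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between two Gaussians N(mu1, sigma1) and N(mu2, sigma2).

    Uses tr(sqrt(sigma1 @ sigma2)) = sum of square roots of the eigenvalues
    of sigma1 @ sigma2, which are real and non-negative for covariance
    matrices (tiny negative or imaginary parts are numerical noise).
    """
    diff = mu1 - mu2
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)


# Stand-in data: random vectors in place of real Inception activations.
rng = np.random.default_rng(0)
reference_acts = rng.normal(size=(500, 8))
generated_acts = reference_acts + 0.5  # same covariance, shifted mean

mu_ref, sigma_ref = activation_statistics(reference_acts)
mu_gen, sigma_gen = activation_statistics(generated_acts)
fid_value = frechet_distance(mu_ref, sigma_ref, mu_gen, sigma_gen)
```

The reference and generated activations must come from images at the same resolution and with the same preprocessing; comparing against statistics computed at another resolution is what produces the very high values described in this thread.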