LTH14 / mage

A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

More information about VQGAN

gaopengpjlab opened this issue

Can you release a pretrained VQGAN with more parameters and higher resolution? By the way, can you share the FID score of your pretrained VQGAN?

We do not use VQGANs with different architectures in our paper, and the VQGAN is trained on ImageNet 256x256. You can get the pre-trained VQGAN here. The FID of the VQGAN's reconstructed images is around 3.
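
For reference, a minimal sketch of the reconstruction round trip that this FID measures, assuming a taming-transformers-style `VQModel` interface (encode returns quantized latents plus codebook indices; decode maps latents back to pixels). `load_vqgan` and the file names are hypothetical placeholders, not this repo's actual API:

```python
import torch
from PIL import Image
import torchvision.transforms as T

# load_vqgan is a hypothetical helper standing in for however the repo
# builds its VQModel from the released checkpoint.
vqgan = load_vqgan("vqgan_checkpoint.pth").eval()

tf = T.Compose([T.Resize(256), T.CenterCrop(256), T.ToTensor()])
x = tf(Image.open("example.jpg").convert("RGB")).unsqueeze(0)  # (1, 3, 256, 256)
x = 2.0 * x - 1.0  # taming-style models expect inputs in [-1, 1]

with torch.no_grad():
    # taming convention: encode -> (quantized latents, emb loss, info tuple)
    quant, _, (_, _, indices) = vqgan.encode(x)
    recon = vqgan.decode(quant)  # back to pixel space, still in [-1, 1]

recon = (recon.clamp(-1, 1) + 1) / 2  # rescale to [0, 1] for viewing/saving
```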

Can you release a stronger tokenizer and a high-resolution tokenizer?

Such as a VQGAN trained on ImageNet 512x512?

The MAGE paper only contains results on ImageNet 256x256. We did not train a tokenizer at a resolution of 512x512.

For the "stronger" tokenizer, can you specify which one you refer to? We only have two tokenizers. One is trained with "strong augmentation", which is the tokenizer we used in this repo. The other one is trained with "weak augmentation". That one is in JAX and you can get the pre-trained weights here.

"Stronger" means a tokenizer with a resolution of 512x512. I am planning to scale MAGE to a larger resolution, namely 512x512.

I see. Unfortunately, we did not train a tokenizer on 512x512. You can also check MaskGIT; in the MaskGIT repo, they release a tokenizer trained on ImageNet 512x512.
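
One practical note on the 512x512 plan: with the usual stride-16 tokenizer, doubling the input resolution quadruples the token sequence length, which is the main cost of the move. A quick back-of-envelope check:

```python
# Token count vs. input resolution for a stride-16 tokenizer.
def num_tokens(resolution: int, stride: int = 16) -> int:
    side = resolution // stride
    return side * side

print(num_tokens(256))  # 256  tokens (16x16 grid, as in MAGE)
print(num_tokens(512))  # 1024 tokens (32x32 grid, a 4x longer sequence)
```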

Thank you so much for your kind reply.

The FID of the VQGAN's reconstructed images is around 3.

Is the FID score reported on the ImageNet train split or the val split?

All FID scores in the paper are reported on the ImageNet val split.
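
For anyone reproducing this number: reconstruction FID compares the VQGAN's reconstructions of the val images against the val images themselves. Below is a minimal sketch using torchmetrics' FrechetInceptionDistance, one FID implementation among several; exact numbers vary with the implementation and preprocessing. `val_loader` and `reconstruct` are hypothetical stand-ins:

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# val_loader: hypothetical DataLoader yielding uint8 (N, 3, 256, 256) batches.
# reconstruct: hypothetical wrapper around the encode/decode round trip
# sketched earlier, returning floats in [0, 1].
fid = FrechetInceptionDistance(feature=2048)

for real_uint8 in val_loader:
    fid.update(real_uint8, real=True)
    recon = reconstruct(real_uint8)
    fid.update((recon * 255).to(torch.uint8), real=False)

print(fid.compute())  # the paper reports ~3; exact value is implementation-dependent
```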

Thank you very much. I asked because the original VQGAN paper reports both train and val FID scores.

No worries