bryandlee / stylegan2-encoder-pytorch

PyTorch Implementation of In-Domain GAN Inversion for StyleGAN2

About the latent.shape

ruanjiyang opened this issue · comments

Hi bryandlee,

Thanks for your great work!

I tried your Domain-Guided Encoder and got a latent (z0) with shape [1, 14, 512],

but when I tried the official stylegan2-pytorch project.py, I got a latent with shape [1, 512].

So I'm really confused. From my understanding, the latent shape is always [1, 512], so where does the 14 come from in your encoder? Thanks!

Hi, the [1, 512]-shaped latent corresponds to the original W space, while the [1, 14, 512]-shaped latent corresponds to the expanded W+ space. W+ uses a different w vector for each conv layer for more representational power. The stylegan2-pytorch project.py has a --w_plus flag for it.
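To make the relationship concrete, here is a small sketch (the layer count of 14 is assumed for a 256x256 model): a W latent is just the special case of a W+ latent whose per-layer rows are all identical.

```python
import torch

n_layers = 14                    # per-layer style inputs for a 256x256 model
w = torch.randn(1, 512)          # a single W-space latent

# W+ allows a different w per conv layer; broadcasting the same w to
# every layer recovers the W-space behaviour inside the W+ shape.
w_plus = w.unsqueeze(1).repeat(1, n_layers, 1)

print(w.shape)       # torch.Size([1, 512])
print(w_plus.shape)  # torch.Size([1, 14, 512])
```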

Thanks very much for your feedback.

According to my understanding, the stylegan2-pytorch project.py has a --w_plus flag for it, and the resulting shape is [1, 18, 512], not [1, 14, 512].

Is it possible to generate a normal [1, 512] latent from your stylegan2-encoder-pytorch?

Thanks again!

The exact shape of the W+ latent depends on the number of layers, which is determined by the image size. I think [1, 18, 512] corresponds to 1024x1024 images.
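For reference, a quick sketch of the layer-count rule (assuming the convention used in rosinality's stylegan2-pytorch, where the synthesis network takes 2*log2(size) - 2 style inputs):

```python
import math

def num_style_layers(image_size):
    # Assumed StyleGAN2 convention: two style inputs per resolution
    # block starting from 4x4, i.e. 2*log2(size) - 2 layers in total.
    return int(2 * math.log2(image_size) - 2)

print(num_style_layers(256))   # 14 -> W+ shape [1, 14, 512]
print(num_style_layers(1024))  # 18 -> W+ shape [1, 18, 512]
```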

You can modify the output shape of the encoder to [1, 512] in model.py, but the encoder would then need to be retrained.
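If you only need a rough single vector without retraining, one crude post-hoc workaround (my own sketch, not part of this repo) is to average the per-layer codes. Note the averaged vector is not guaranteed to lie on the W manifold, which is why retraining with a [1, 512] output head is the proper fix.

```python
import torch

w_plus = torch.randn(1, 14, 512)  # encoder output in W+ space

# Collapse the per-layer axis by averaging: [1, 14, 512] -> [1, 512].
# This is only an approximation of a true W-space code.
w = w_plus.mean(dim=1)

print(w.shape)  # torch.Size([1, 512])
```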

I would suggest checking out other recent models (psp, e4e) as well, although these also use w+ latents instead of the original w.

Hope this helps!