eladrich / pixel2style2pixel

Official Implementation for "Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation" (CVPR 2021) presenting the pixel2style2pixel (pSp) framework

Home Page: https://eladrich.github.io/pixel2style2pixel/

Question about encoding images

Eric07110904 opened this issue

Thanks for your awesome work! I have a question about GAN inversion.
I used pSp to do GAN inversion in the anime domain (512x512, 300k images), together with a pre-trained anime StyleGAN2 (512x512).
After training for 100,000 iterations with batch_size=4, I observed two problems.

  1. The detailed structure of the anime face is lost (it seems that my model didn't capture the mouth or winking eyes).
  2. The output is blurred.

Do you have any suggestions for solving these two problems?
I am wondering if some parameter is set wrong, or whether I should train for more iterations or add the w_norm loss.
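For context, my understanding is that the w_norm loss simply penalizes the distance of the predicted latents from the generator's average latent. A minimal sketch of that idea (illustrative, not the repo's exact implementation):

```python
import torch

def w_norm_loss(w, w_avg):
    # w:     (batch, n_styles, 512) latents predicted by the encoder
    # w_avg: (512,) average latent of the pre-trained StyleGAN2 generator
    # Penalizing the squared distance from the average latent keeps the
    # inversion close to the well-behaved region of W+.
    return torch.mean((w - w_avg) ** 2)
```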
Thanks for your reply!

Using the ID loss is a bit strange here, since it was trained on real face images and your anime dataset is out of domain for it.
Other than that, it could be that pSp is not able to fully capture all the details here. You could try more advanced encoders such as ReStyle and HyperStyle, or optimization-based approaches like PTI.
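For what it's worth, disabling the ID loss should just be a matter of setting --id_lambda=0 in the training options; the repo also exposes a MoCo-based similarity loss (--moco_lambda) as a more domain-agnostic substitute. Conceptually, such a loss is a cosine distance in a generic feature space rather than a face-identity embedding. A minimal sketch of the idea, with feat_extractor standing in for the MoCo backbone (illustrative, not the repo's exact implementation):

```python
import torch.nn.functional as F

def similarity_loss(feat_extractor, y_hat, y):
    # feat_extractor: any pre-trained feature network (here a stand-in
    # for the MoCo backbone); y_hat: reconstructions, y: targets
    f_hat = F.normalize(feat_extractor(y_hat), dim=-1)
    f = F.normalize(feat_extractor(y), dim=-1)
    # 1 - cosine similarity, averaged over the batch
    return (1.0 - (f_hat * f).sum(dim=-1)).mean()
```

Combined with a nonzero --w_norm_lambda, this may help with the blurriness and missing details without pulling the encoder toward real-face identities.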