PDillis / stylegan3-fun

Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!

Multi-Modal Based Truncation

49xxy opened this issue · comments

commented

Hi! Thank you for your excellent work and summary!

I would like to know how to use the multi-modal truncation.

Thank you! The multi-modal truncation code is a skeleton for now (I still have to make it nicer so you can use it with your own models), but it works like so (there's a setup sketch right after this list):

  • network_pkl is the path or URL to the saved model you want to use
  • description names the subfolder that will be created inside outdir, where the images will be saved
  • num_clusters can be changed to whatever you want, but the paper says 64 is good enough
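For concreteness, here is a rough setup sketch rather than the script's actual code: it only shows how those three options plus outdir could be wired up, using the standard StyleGAN3 loaders (dnnlib, legacy) that this repo is built on. The remaining names (device, G, run_dir) and the example pickle name are just placeholders of mine.

```python
import os

import torch

import dnnlib
import legacy

network_pkl = 'stylegan3-r-ffhq-1024x1024.pkl'  # path or URL to the model you want to use (placeholder)
description = 'multimodal-truncation'           # name of the subfolder created inside outdir
num_clusters = 64                               # 64 is what the paper recommends
outdir = 'out'

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
with dnnlib.util.open_url(network_pkl) as f:
    G = legacy.load_network_pkl(f)['G_ema'].to(device)  # load the generator

run_dir = os.path.join(outdir, description)
os.makedirs(run_dir, exist_ok=True)
```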

So basically we generate 60,000 random dlatents, standardize them, and cluster them with KMeans. I have verbose=1 in the KMeans call, but you can set it to 0 to stop it from printing progress. This will take a few minutes since it runs on the CPU, but perhaps the GPU can be used for it in the future.
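Continuing the sketch above (reusing G, device, num_clusters and run_dir), the clustering step might look roughly like this. The batching, the per-batch seeds, the use of scikit-learn's StandardScaler/KMeans, and the output filenames are my assumptions about the details, not the script's exact code.

```python
import os

import numpy as np
import torch
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

n_samples, batch_size = 60_000, 1_000
ws = []
with torch.no_grad():
    for start in range(0, n_samples, batch_size):
        z = torch.from_numpy(
            np.random.RandomState(start).randn(batch_size, G.z_dim).astype(np.float32)
        ).to(device)
        w = G.mapping(z, None)                # [batch, num_ws, w_dim], no truncation applied
        ws.append(w[:, 0, :].cpu().numpy())   # all layers are identical here, so keep one
ws = np.concatenate(ws)                       # [60000, w_dim]

scaler = StandardScaler().fit(ws)             # standardize the dlatents
kmeans = KMeans(n_clusters=num_clusters, verbose=1).fit(scaler.transform(ws))  # verbose=0 to silence
centers = scaler.inverse_transform(kmeans.cluster_centers_)  # centers back in W space

for idx, center in enumerate(centers):
    np.save(os.path.join(run_dir, f'center{idx:02d}.npy'), center)  # hypothetical filenames
```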

In the end, 3 images are saved:

  • A grid with: the pure dlatent generated from the seed set at the beginning, the global average image (w_avg), and the pure dlatent truncated towards this global average
  • A grid of all 64 cluster centers (or however many num_clusters you used) found in the latent space W
  • The pure dlatent truncated towards each of these 64 new centers

Also, the 64 new centers are saved as .npy files, so you can use them in other parts of the code. For example, generate.py lets you use one of them instead of w_avg by passing --new-center followed by the path to any of these 64 new centers/clusters.
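As a rough illustration of what truncating towards a new center amounts to (I'm not reproducing generate.py's actual CLI beyond the --new-center option mentioned above; the file name and psi value below are placeholders, and the code reuses G, device and run_dir from the sketches above):

```python
# Hypothetical: truncate a freshly sampled dlatent towards one of the saved centers
new_center = torch.from_numpy(np.load(os.path.join(run_dir, 'center00.npy'))).float().to(device)
psi = 0.7                                      # truncation strength (placeholder value)

z = torch.randn(1, G.z_dim, device=device)
w = G.mapping(z, None)                         # pure dlatent, [1, num_ws, w_dim]
w_trunc = new_center + psi * (w - new_center)  # pull w towards the new center instead of w_avg
img = G.synthesis(w_trunc, noise_mode='const') # image tensor with values in [-1, 1]
```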

Let me know if you have more questions; this code still needs to be redone/made prettier so it's easier to use and clearer as well.