PDillis / stylegan3-fun

Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!

Multi-Modal Based Truncation

49xxy opened this issue · comments

commented

Hi! Thank you for your excellent work and summary!

I would like to know how to use the multi-modal truncation.

Thank you! The multi-modal truncation code is a skeleton for now (I still have to make it nicer so you can use it with your own models), but it works like so (there's a setup sketch right after this list):

  • network_pkl is the path or URL to the saved model you want to use
  • description names the subfolder that will be created inside outdir, where the images will be saved
  • num_clusters can be changed to whatever you want, but the paper says 64 is good enough
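For concreteness, here is a rough setup sketch rather than the script's actual code: it only shows how those three options plus outdir could be wired up, using the standard StyleGAN3 loaders (dnnlib, legacy) that this repo is built on. The remaining names (device, G, run_dir) and the example pickle name are just placeholders of mine.

```python
import os

import torch

import dnnlib
import legacy

network_pkl = 'stylegan3-r-ffhq-1024x1024.pkl'  # path or URL to the model you want to use (placeholder)
description = 'multimodal-truncation'           # name of the subfolder created inside outdir
num_clusters = 64                               # 64 is what the paper recommends
outdir = 'out'

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
with dnnlib.util.open_url(network_pkl) as f:
    G = legacy.load_network_pkl(f)['G_ema'].to(device)  # load the generator

run_dir = os.path.join(outdir, description)
os.makedirs(run_dir, exist_ok=True)
```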

So basically we generate 60,000 random dlatents, standardize them, and cluster them with KMeans. I have verbose=1 in the KMeans call, but you can set it to 0 to stop it from printing progress. This will take a few minutes since it runs on the CPU, but perhaps the GPU can be used for it in the future.
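Continuing the sketch above (reusing G, device, num_clusters and run_dir), the clustering step might look roughly like this. The batching, the per-batch seeds, the use of scikit-learn's StandardScaler/KMeans, and the output filenames are my assumptions about the details, not the script's exact code.

```python
import os

import numpy as np
import torch
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

n_samples, batch_size = 60_000, 1_000
ws = []
with torch.no_grad():
    for start in range(0, n_samples, batch_size):
        z = torch.from_numpy(
            np.random.RandomState(start).randn(batch_size, G.z_dim).astype(np.float32)
        ).to(device)
        w = G.mapping(z, None)                # [batch, num_ws, w_dim], no truncation applied
        ws.append(w[:, 0, :].cpu().numpy())   # all layers are identical here, so keep one
ws = np.concatenate(ws)                       # [60000, w_dim]

scaler = StandardScaler().fit(ws)             # standardize the dlatents
kmeans = KMeans(n_clusters=num_clusters, verbose=1).fit(scaler.transform(ws))  # verbose=0 to silence
centers = scaler.inverse_transform(kmeans.cluster_centers_)  # centers back in W space

for idx, center in enumerate(centers):
    np.save(os.path.join(run_dir, f'center{idx:02d}.npy'), center)  # hypothetical filenames
```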

In the end, 3 images are saved:

  • A grid with: the pure dlatent generated from the seed set at the beginning, the global average image (w_avg), and the pure dlatent truncated towards this global average
  • A grid of all 64 cluster centers (or however many num_clusters you used) found in the latent space W
  • The pure dlatent truncated towards each of these 64 new centers

Also, the 64 new centers are saved as .npy files, so you can use them in other parts of the code. For example, generate.py lets you use one of them instead of w_avg by passing --new-center followed by the path to any of these 64 new centers/clusters.
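As a rough illustration of what truncating towards a new center amounts to (I'm not reproducing generate.py's actual CLI beyond the --new-center option mentioned above; the file name and psi value below are placeholders, and the code reuses G, device and run_dir from the sketches above):

```python
# Hypothetical: truncate a freshly sampled dlatent towards one of the saved centers
new_center = torch.from_numpy(np.load(os.path.join(run_dir, 'center00.npy'))).float().to(device)
psi = 0.7                                      # truncation strength (placeholder value)

z = torch.randn(1, G.z_dim, device=device)
w = G.mapping(z, None)                         # pure dlatent, [1, num_ws, w_dim]
w_trunc = new_center + psi * (w - new_center)  # pull w towards the new center instead of w_avg
img = G.synthesis(w_trunc, noise_mode='const') # image tensor with values in [-1, 1]
```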

Let me know if you have more questions; this code still needs to be redone/made prettier so it's easier to use and clearer as well.