PDillis / stylegan3-fun

Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!


Start from pretrained at different resolution

EnricoBeltramo opened this issue · comments

Is your feature request related to a problem? Please describe.
Is it possible to load a pretrained model at a different resolution? I have a model pretrained at 512x512, and I would like to start from it to train a new one at 256x256.

Describe the solution you'd like
Automatic recovery of previously trained layers, wherever they match.

Describe alternatives you've considered
Resizing the images, but training at 512 requires too much time.

This is an interesting problem, but not too straightforward, I'm afraid. What I think could be done is to start a new network, copy over the weights at the resolutions where they match, leave the rest (if any) randomly initialized, and then continue training from there. Note that this might not work in StyleGAN3, as both the number of channels and the shapes of layers at the same 'resolution' will change depending on the final output resolution. For example, using FFHQU, the layers and shapes at 256x256 are (python generate.py images --network=ffhqu256 --cfg=stylegan3-r --seeds=0 --available-layers):

Name: input => Channels: 1024 => Size: [36, 36]
Name: L0_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L1_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L2_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L3_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L4_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L5_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L6_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L7_148_724 => Channels: 724 => Size: [148, 148]
Name: L8_148_512 => Channels: 512 => Size: [148, 148]
Name: L9_148_362 => Channels: 362 => Size: [148, 148]
Name: L10_276_256 => Channels: 256 => Size: [276, 276]
Name: L11_276_181 => Channels: 181 => Size: [276, 276]
Name: L12_276_128 => Channels: 128 => Size: [276, 276]
Name: L13_256_128 => Channels: 128 => Size: [256, 256]
Name: L14_256_3 => Channels: 3 => Size: [256, 256]
Name: output => Channels: 3 => Size: [256, 256]

At 1024x1024 resolution, these are (python generate.py images --network=ffhqu1024 --cfg=stylegan3-r --seeds=0 --available-layers):

Name: input => Channels: 1024 => Size: [36, 36]
Name: L0_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L1_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L2_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L3_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L4_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L5_148_1024 => Channels: 1024 => Size: [148, 148]
Name: L6_148_1024 => Channels: 1024 => Size: [148, 148]
Name: L7_276_645 => Channels: 645 => Size: [276, 276]
Name: L8_276_406 => Channels: 406 => Size: [276, 276]
Name: L9_532_256 => Channels: 256 => Size: [532, 532]
Name: L10_1044_161 => Channels: 161 => Size: [1044, 1044]
Name: L11_1044_102 => Channels: 102 => Size: [1044, 1044]
Name: L12_1044_64 => Channels: 64 => Size: [1044, 1044]
Name: L13_1024_64 => Channels: 64 => Size: [1024, 1024]
Name: L14_1024_3 => Channels: 3 => Size: [1024, 1024]
Name: output => Channels: 3 => Size: [1024, 1024]
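To make the mismatch concrete, here is a small sketch that compares the two lists printed above by parsing the LN_size_channels naming convention of the layer names (the lists are copied verbatim from the output; nothing else is assumed):

```python
# Compare the two layer lists above by parsing the
# "L{idx}_{size}_{channels}" naming convention of StyleGAN3-R layers.

def parse(name):
    """Split e.g. 'L7_148_724' into (spatial size, channels)."""
    _, size, channels = name.split('_')
    return int(size), int(channels)

layers_256 = """L0_36_1024 L1_36_1024 L2_36_1024 L3_52_1024 L4_52_1024
L5_84_1024 L6_84_1024 L7_148_724 L8_148_512 L9_148_362 L10_276_256
L11_276_181 L12_276_128 L13_256_128 L14_256_3""".split()

layers_1024 = """L0_36_1024 L1_36_1024 L2_52_1024 L3_52_1024 L4_84_1024
L5_148_1024 L6_148_1024 L7_276_645 L8_276_406 L9_532_256 L10_1044_161
L11_1044_102 L12_1044_64 L13_1024_64 L14_1024_3""".split()

# Print every layer index where the two configurations disagree
for i, (a, b) in enumerate(zip(layers_256, layers_1024)):
    (sa, ca), (sb, cb) = parse(a), parse(b)
    if (sa, ca) != (sb, cb):
        print(f'L{i}: size {sa} vs {sb}, channels {ca} vs {cb}')
```

The spatial sizes already diverge at L2 (36 vs. 52), and the channel counts diverge from L7 onward.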

So already at layer 7, the number of channels differs by 79 (724 vs. 645, i.e., the smaller model needs 79 more channels there), not to mention the spatial shapes (148x148 vs. 276x276). So perhaps this is easier to do in --cfg=stylegan2 (as each block is at a power-of-2 resolution and has a torgb operation), which is why @aydao has it in his repo here. I'd be happy to port it over to PyTorch, but let me know if this is what you want before I delve into it.
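For the StyleGAN2 case, the copy-what-matches idea could look something like the following. This is a minimal, hypothetical sketch (not @aydao's actual code, nor the eventual port): it copies every state_dict entry whose name and shape agree between the pretrained and the fresh generator, and leaves everything else at its random initialization.

```python
import torch

def copy_matching_weights(src_state, dst_state):
    """Copy tensors from src_state into dst_state wherever the parameter
    name exists in both and the shapes agree; every other entry in
    dst_state keeps its (random) initialization. Returns the lists of
    copied and skipped parameter names."""
    copied, skipped = [], []
    for name, dst_param in dst_state.items():
        src_param = src_state.get(name)
        if src_param is not None and src_param.shape == dst_param.shape:
            dst_state[name] = src_param.clone()
            copied.append(name)
        else:
            skipped.append(name)
    return copied, skipped

# Hypothetical usage with a pretrained G_512 and a freshly built G_256:
#   dst = G_256.state_dict()
#   copied, skipped = copy_matching_weights(G_512.state_dict(), dst)
#   G_256.load_state_dict(dst)
```

Whether the surviving layers give a useful initialization at the new resolution would still need to be verified by training.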

Originally I was planning to use StyleGAN3, but I have the same model pretrained for StyleGAN2, so I guess I could use that too! It would be very helpful, in my opinion, if you could add that feature. Thank you!