PDillis / stylegan3-fun

Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!


Start from pretrained at different resolution

EnricoBeltramo opened this issue · comments

Is your feature request related to a problem? Please describe.
Is it possible to load a pretrained model at a different resolution? I have a model pretrained at 512x512, and I would like to start from it to train a new one at 256x256.

Describe the solution you'd like
Automatic recovery of previously trained layers, wherever they match.

Describe alternatives you've considered
Resizing the images, but training at 512 requires too much time.

This is an interesting problem, but not too straightforward, I'm afraid. What I think could be done is to start a new network, copy over the weights at the resolutions where they match, leave the rest (if any) randomly initialized, and then continue training from there. Note that this might not work in StyleGAN3, as both the number of channels and the shapes of layers at the same 'resolution' will change depending on the final output resolution. For example, using FFHQU, the layers and shapes at 256x256 are (python generate.py images --network=ffhqu256 --cfg=stylegan3-r --seeds=0 --available-layers):

Name: input => Channels: 1024 => Size: [36, 36]
Name: L0_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L1_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L2_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L3_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L4_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L5_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L6_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L7_148_724 => Channels: 724 => Size: [148, 148]
Name: L8_148_512 => Channels: 512 => Size: [148, 148]
Name: L9_148_362 => Channels: 362 => Size: [148, 148]
Name: L10_276_256 => Channels: 256 => Size: [276, 276]
Name: L11_276_181 => Channels: 181 => Size: [276, 276]
Name: L12_276_128 => Channels: 128 => Size: [276, 276]
Name: L13_256_128 => Channels: 128 => Size: [256, 256]
Name: L14_256_3 => Channels: 3 => Size: [256, 256]
Name: output => Channels: 3 => Size: [256, 256]

At 1024x1024 resolution, these are (python generate.py images --network=ffhqu1024 --cfg=stylegan3-r --seeds=0 --available-layers):

Name: input => Channels: 1024 => Size: [36, 36]
Name: L0_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L1_36_1024 => Channels: 1024 => Size: [36, 36]
Name: L2_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L3_52_1024 => Channels: 1024 => Size: [52, 52]
Name: L4_84_1024 => Channels: 1024 => Size: [84, 84]
Name: L5_148_1024 => Channels: 1024 => Size: [148, 148]
Name: L6_148_1024 => Channels: 1024 => Size: [148, 148]
Name: L7_276_645 => Channels: 645 => Size: [276, 276]
Name: L8_276_406 => Channels: 406 => Size: [276, 276]
Name: L9_532_256 => Channels: 256 => Size: [532, 532]
Name: L10_1044_161 => Channels: 161 => Size: [1044, 1044]
Name: L11_1044_102 => Channels: 102 => Size: [1044, 1044]
Name: L12_1044_64 => Channels: 64 => Size: [1044, 1044]
Name: L13_1024_64 => Channels: 64 => Size: [1024, 1024]
Name: L14_1024_3 => Channels: 3 => Size: [1024, 1024]
Name: output => Channels: 3 => Size: [1024, 1024]
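To make the mismatch concrete, here is a small sketch that compares the two lists printed above by parsing the LN_size_channels naming convention of the layer names (the lists are copied verbatim from the output; nothing else is assumed):

```python
# Compare the two layer lists above by parsing the
# "L{idx}_{size}_{channels}" naming convention of StyleGAN3-R layers.

def parse(name):
    """Split e.g. 'L7_148_724' into (spatial size, channels)."""
    _, size, channels = name.split('_')
    return int(size), int(channels)

layers_256 = """L0_36_1024 L1_36_1024 L2_36_1024 L3_52_1024 L4_52_1024
L5_84_1024 L6_84_1024 L7_148_724 L8_148_512 L9_148_362 L10_276_256
L11_276_181 L12_276_128 L13_256_128 L14_256_3""".split()

layers_1024 = """L0_36_1024 L1_36_1024 L2_52_1024 L3_52_1024 L4_84_1024
L5_148_1024 L6_148_1024 L7_276_645 L8_276_406 L9_532_256 L10_1044_161
L11_1044_102 L12_1044_64 L13_1024_64 L14_1024_3""".split()

# Print every layer index where the two configurations disagree
for i, (a, b) in enumerate(zip(layers_256, layers_1024)):
    (sa, ca), (sb, cb) = parse(a), parse(b)
    if (sa, ca) != (sb, cb):
        print(f'L{i}: size {sa} vs {sb}, channels {ca} vs {cb}')
```

The spatial sizes already diverge at L2 (36 vs. 52), and the channel counts diverge from L7 onward.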

So already at layer 7, the number of channels differs by 79 (724 vs. 645, i.e., the smaller model needs 79 more channels there), not to mention the spatial shapes (148x148 vs. 276x276). So perhaps this is easier to do in --cfg=stylegan2 (as each block is at a power-of-2 resolution and has a torgb operation), which is why @aydao has it in his repo here. I'd be happy to port it over to PyTorch, but let me know if this is what you want before I delve into it.
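For the StyleGAN2 case, the copy-what-matches idea could look something like the following. This is a minimal, hypothetical sketch (not @aydao's actual code, nor the eventual port): it copies every state_dict entry whose name and shape agree between the pretrained and the fresh generator, and leaves everything else at its random initialization.

```python
import torch

def copy_matching_weights(src_state, dst_state):
    """Copy tensors from src_state into dst_state wherever the parameter
    name exists in both and the shapes agree; every other entry in
    dst_state keeps its (random) initialization. Returns the lists of
    copied and skipped parameter names."""
    copied, skipped = [], []
    for name, dst_param in dst_state.items():
        src_param = src_state.get(name)
        if src_param is not None and src_param.shape == dst_param.shape:
            dst_state[name] = src_param.clone()
            copied.append(name)
        else:
            skipped.append(name)
    return copied, skipped

# Hypothetical usage with a pretrained G_512 and a freshly built G_256:
#   dst = G_256.state_dict()
#   copied, skipped = copy_matching_weights(G_512.state_dict(), dst)
#   G_256.load_state_dict(dst)
```

Whether the surviving layers give a useful initialization at the new resolution would still need to be verified by training.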

Originally I was planning to use StyleGAN3, but I have the same model pretrained for StyleGAN2, so I guess I could use that too! It would be very helpful, in my opinion, if you could add that feature. Thank you!