akanimax / pro_gan_pytorch

Unofficial PyTorch implementation of the paper "Progressive Growing of GANs for Improved Quality, Stability, and Variation"

Help restarting after GPU out-of-memory error at 3 days

djproc opened this issue

Hi, I'm too much of a noob to figure this out myself, and would love to know if there is a simple answer.

I was training for 3 days on my own dataset and was loving the results at 256x256. Unfortunately, as soon as training progressed to the next resolution, the batch size was too large for my GPU to handle. I guess I'll have to make it bs=2 or 1 (currently 4).

Is there a way to restart the training from the end point of 256x256? I don't want to start all over again...

THANK YOU!

Djproc

P.S. This is some fantastic work you have done, and I'm really appreciative that you've made it so easy to get started!

@djproc,

Yes, in order to restart training from 256 x 256, you need to do the following (a rough sketch is shown after the list):
1.) Set the start depth to 7.
2.) Provide all five .pth files to the training script, viz. generator_weights, discriminator_weights, stable_generator_weights, generator_optimizer and discriminator_optimizer.
3.) Ensure that the fade-in alpha is working properly.
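
Roughly, resuming could look like the minimal sketch below. Treat it as a sketch under assumptions, not a definitive recipe: the checkpoint filenames, the trainer attributes (`gen`, `dis`, `gen_shadow`, `gen_optim`, `dis_optim`), the dataset construction, and the `train(..., start_depth=...)` arguments are assumptions here, so match them against the version of the training script you are actually running.

```python
import torch
import pro_gan_pytorch.PRO_GAN as pg
from torchvision import datasets, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# depth=8 covers resolutions 4x4 ... 512x512; adjust to your setup.
pro_gan = pg.ProGAN(depth=8, latent_size=512, device=device)

# Load all five .pth files saved at the end of the 256x256 stage.
# (These filenames are examples; use whatever your run actually wrote out.)
pro_gan.gen.load_state_dict(torch.load("models/GAN_GEN_6.pth", map_location=device))
pro_gan.dis.load_state_dict(torch.load("models/GAN_DIS_6.pth", map_location=device))
pro_gan.gen_shadow.load_state_dict(torch.load("models/GAN_GEN_SHADOW_6.pth", map_location=device))
pro_gan.gen_optim.load_state_dict(torch.load("models/GAN_GEN_OPTIM_6.pth", map_location=device))
pro_gan.dis_optim.load_state_dict(torch.load("models/GAN_DIS_OPTIM_6.pth", map_location=device))

# Rebuild the same dataset you trained on originally (path is a placeholder;
# your script may use the repo's own data tools instead of torchvision).
dataset = datasets.ImageFolder(
    root="path/to/your/dataset",
    transform=transforms.Compose([
        transforms.Resize(512),
        transforms.CenterCrop(512),
        transforms.ToTensor(),
    ]),
)

# Resume at start_depth=7, with smaller batches at the high resolutions so
# the GPU does not run out of memory again; one list entry per depth.
pro_gan.train(
    dataset=dataset,
    epochs=[20] * 8,
    batch_sizes=[32, 32, 16, 16, 8, 8, 4, 2],
    fade_in_percentage=[50] * 8,   # drives the fade-in alpha per depth
    start_depth=7,
)
```

Since the optimizer state dicts are restored along with the weights, Adam's moment estimates carry over and training continues roughly where it left off instead of re-warming from scratch.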

Please let me know if you are facing any more problems.

Cheers 🍻!
@akanimax