Training on Colab - CUDA out of memory
Mancio98 opened this issue · comments
Hi, I would like to ask if anyone has tried to train the model on Colab. Yesterday I tried to launch training on a GPU, but it runs out of memory: it instantly fills almost all 15 GB. I tried smaller batch sizes (6 and 8), but the problem persists.
I also replaced the model used in the VQGAN training with the same one used for inference with the transformer (vq_f16).
Additionally, if @dome272 could upload pretrained weights for both models I would be grateful (I need them for my exam project at uni ahaha).
Many Thanks
Hi. I am facing the same problem. Have you solved it yet? I am going to try the pre-trained models provided at https://github.com/CompVis/taming-transformers/tree/master.
Hi, unfortunately not. One last thing I would like to try is gradient accumulation, but I don't think it will solve the problem.
By the way, I was planning to do the same as you. If I succeed I'll post my solution here.
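For reference, the idea behind gradient accumulation is to split a large batch into micro-batches and combine their gradients before a single optimizer step, so peak memory scales with the micro-batch size rather than the effective batch size (it won't help if even one micro-batch's activations don't fit). A minimal sketch of why this works, using a toy least-squares model in plain Python rather than the actual PyTorch training loop (all names here are illustrative):

```python
# Toy illustration of gradient accumulation: for a mean-squared-error loss,
# the full-batch gradient equals the size-weighted average of micro-batch
# gradients, so accumulating emulates a large batch at lower peak memory.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to the scalar weight w."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, xs, ys, micro_batch):
    """Accumulate micro-batch gradients, as a training loop would before step()."""
    total, count = 0.0, 0
    for i in range(0, len(xs), micro_batch):
        xb, yb = xs[i:i + micro_batch], ys[i:i + micro_batch]
        total += grad_mse(w, xb, yb) * len(xb)  # weight by micro-batch size
        count += len(xb)
    return total / count

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
w = 0.5
full = grad_mse(w, xs, ys)                              # one big batch
accum = accumulated_grad(w, xs, ys, micro_batch=2)      # two micro-batches
print(abs(full - accum) < 1e-12)  # the two gradients match
```

In a PyTorch loop this corresponds to calling `loss.backward()` on each micro-batch (gradients accumulate in `.grad`) and calling `optimizer.step()` / `optimizer.zero_grad()` only every N micro-batches, with the loss divided by N to keep the average.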
You can also find a pretrained VQGAN and a MaskGIT here: https://huggingface.co/llvictorll/Maskgit-pytorch/tree/main
Their GitHub: https://github.com/valeoai/Maskgit-pytorch.
They modified some parameters and a few other details of the transformer, so I suggest using only their VQGAN if you want to follow the original MaskGIT implementation.