dome272 / MaskGIT-pytorch

Pytorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)

Training on Colab - CUDA out of memory

Mancio98 opened this issue · comments

Hi, I would like to ask if anyone has tried to train the model on Colab. Yesterday I tried to launch training on the GPU, but it ran out of memory: it instantly fills almost all 15 GB. I tried smaller batch sizes (6 and 8), but the problem persists.
I also replaced the model inside the VQGAN training with the same one used for inference with the transformer (vq_f16).

Additionally, if @dome272 could upload pretrained weights for both models I would be grateful (I need them for my university exam project, haha).

Many Thanks

Hi, I am also facing the same problem. Have you solved it yet? I am going to try the pre-trained models provided at https://github.com/CompVis/taming-transformers/tree/master.

Hi, unfortunately not. The last thing I would like to try is gradient accumulation, but I don't think it will solve the problem.
By the way, I was planning to do the same as you. If I succeed, I will post my solution here.
You can also find a pretrained VQGAN and a pretrained MaskGIT here: https://huggingface.co/llvictorll/Maskgit-pytorch/tree/main
Their GitHub: https://github.com/valeoai/Maskgit-pytorch.
They modified some transformer parameters and a few other things, so I suggest using only their VQGAN if you want to follow the original MaskGIT implementation.
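For anyone landing on this issue: gradient accumulation mentioned above won't shrink the model itself, but it does let you cut the per-step batch size (and hence activation memory) while keeping the effective batch size unchanged. Here is a framework-free sketch of the idea (the variable names and the toy loss are mine, not from this repo); it shows that summing micro-batch gradients, each scaled by its share of the full batch, reproduces the full-batch gradient exactly. In PyTorch the same pattern is `(loss / k).backward()` on each of `k` micro-batches, then one `optimizer.step()` and `optimizer.zero_grad()`.

```python
# Gradient accumulation sketch: instead of one backward pass over a batch
# of size B (which may not fit in GPU memory), accumulate gradients over
# micro-batches and apply a single optimizer step. Toy 1-D model for
# illustration: loss = 0.5 * (w*x - y)^2 per sample.

def grad(w, x, y):
    # d/dw of 0.5*(w*x - y)^2
    return (w * x - y) * x

def full_batch_grad(w, xs, ys):
    # mean gradient over the whole batch (what a big batch would compute)
    return sum(grad(w, x, y) for x, y in zip(xs, ys)) / len(xs)

def accumulated_grad(w, xs, ys, micro_batch):
    # same mean gradient, computed micro-batch by micro-batch:
    # each micro-batch gradient is scaled by its share of the full batch
    acc, n = 0.0, len(xs)
    for i in range(0, n, micro_batch):
        mb = zip(xs[i:i + micro_batch], ys[i:i + micro_batch])
        acc += sum(grad(w, x, y) for x, y in mb) / n
    return acc

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.0]
w = 0.5

g_full = full_batch_grad(w, xs, ys)
g_acc = accumulated_grad(w, xs, ys, micro_batch=2)
print(abs(g_full - g_acc) < 1e-12)  # the two gradients match
```

Note that this only trades memory for wall-clock time: batch-norm statistics (inside the VQGAN discriminator, for example) are still computed per micro-batch, so results are equivalent only for batch-size-independent layers.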