DmitryUlyanov / deep-image-prior

Image restoration with neural networks but without learning.

Home Page: https://dmitryulyanov.github.io/deep_image_prior

CUDA out of memory

Tetsuo7945 opened this issue · comments

I successfully tested your inpainting algorithm on my own image using the kate.png and peppers.png setup (I changed only this line):
elif ('kate.png' in img_path) or ('peppers.png' in img_path) or ('normal.png' in img_path):

Unfortunately on trying again with another image, I'm getting this error on the main loop:

Starting optimization with ADAM

---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-13-a43cee3b5493> in <module>()
     31 
     32 p = get_params(OPT_OVER, net, net_input)
---> 33 optimize(OPTIMIZER, p, closure, LR, num_iter)

10 frames

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   2014     return torch.batch_norm(
   2015         input, weight, bias, running_mean, running_var,
-> 2016         training, momentum, eps, torch.backends.cudnn.enabled
   2017     )
   2018 

RuntimeError: CUDA out of memory. Tried to allocate 684.00 MiB (GPU 0; 15.75 GiB total capacity; 13.88 GiB already allocated; 124.88 MiB free; 14.47 GiB reserved in total by PyTorch)

I have tried restarting the runtime twice and running torch.cuda.empty_cache(), but apparently the memory is still allocated. Would you mind telling a newbie what's going on and how to resolve this?

First of all, thanks to the authors of the paper.

Regarding the CUDA memory issue: the error means the tensors the network has to keep on the GPU don't fit in its memory.
The code could probably be changed to allocate less GPU memory, though at the cost of slower runs.

Your realistic options right now, IMHO:

  • Find a GPU with more memory.
  • Reduce the parameters of the algorithm. I actually got pretty fair results on the kate.png example with
    the following settings for the skip model:
                 num_channels_down = [16] * 5,
                 num_channels_up   = [16] * 5,
                 num_channels_skip = [16] * 5,
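To see why shrinking the channel counts helps, here is a back-of-envelope sketch of per-layer activation memory. The 128-channel baseline and the 1024x1024 spatial size are my assumptions for illustration, not values taken from this thread; the point is just that activation memory scales linearly with channel count, so 128 → 16 channels cuts it roughly 8x per feature map:

```python
# Rough activation-memory estimate for a single conv feature map,
# assuming float32 elements (4 bytes each). Sizes are hypothetical.
def feature_map_mib(channels, height, width, bytes_per_elem=4):
    """Return the memory footprint of one feature map in MiB."""
    return channels * height * width * bytes_per_elem / 2**20

# Assumed default-sized layer vs. the reduced layer from the workaround.
big = feature_map_mib(128, 1024, 1024)
small = feature_map_mib(16, 1024, 1024)

print(f"{big:.0f} MiB vs {small:.0f} MiB per feature map")
# → 512 MiB vs 64 MiB per feature map
```

Downscaling the input image has the same kind of effect, since memory also scales with height × width.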

@Tetsuo7945 if this fixed your problem, can you close it?

Pretty sure I was testing this on Colab. I'm no longer using Colab, nor have I been attempting to use the software. I'm happy to close the issue.

@KirmTwinty thank you for your input 🙂