DmitryUlyanov / deep-image-prior

Image restoration with neural networks but without learning.

Home Page: https://dmitryulyanov.github.io/deep_image_prior

CUDA out of memory

Tetsuo7945 opened this issue · comments

I successfully tested your inpainting algorithm on my own image using the kate.png and peppers.png setup (I changed only this line):
elif ('kate.png' in img_path) or ('peppers.png' in img_path) or ('normal.png' in img_path):

Unfortunately on trying again with another image, I'm getting this error on the main loop:

Starting optimization with ADAM

---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-13-a43cee3b5493> in <module>()
     31 
     32 p = get_params(OPT_OVER, net, net_input)
---> 33 optimize(OPTIMIZER, p, closure, LR, num_iter)

10 frames

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   2014     return torch.batch_norm(
   2015         input, weight, bias, running_mean, running_var,
-> 2016         training, momentum, eps, torch.backends.cudnn.enabled
   2017     )
   2018 

RuntimeError: CUDA out of memory. Tried to allocate 684.00 MiB (GPU 0; 15.75 GiB total capacity; 13.88 GiB already allocated; 124.88 MiB free; 14.47 GiB reserved in total by PyTorch)

I have tried restarting the runtime twice and running torch.cuda.empty_cache(), but apparently the memory is still allocated. Would you mind telling a newbie what's going on and how to resolve this?

First of all, thanks to the authors of the paper.

Regarding the CUDA memory issue: the error means the tensors the network has to keep on the GPU don't fit in its memory.
The code could probably be changed to allocate less GPU memory, though at the cost of slower runs.

Your realistic options right now, IMHO:

  • Find a GPU with more memory.
  • Reduce the parameters of the algorithm. I actually got pretty fair results on the kate.png example with
    the following settings for the skip model:
                 num_channels_down = [16] * 5,
                 num_channels_up   = [16] * 5,
                 num_channels_skip = [16] * 5,
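To see why shrinking the channel counts helps, here is a back-of-envelope sketch of per-layer activation memory. The 128-channel baseline and the 1024x1024 spatial size are my assumptions for illustration, not values taken from this thread; the point is just that activation memory scales linearly with channel count, so 128 → 16 channels cuts it roughly 8x per feature map:

```python
# Rough activation-memory estimate for a single conv feature map,
# assuming float32 elements (4 bytes each). Sizes are hypothetical.
def feature_map_mib(channels, height, width, bytes_per_elem=4):
    """Return the memory footprint of one feature map in MiB."""
    return channels * height * width * bytes_per_elem / 2**20

# Assumed default-sized layer vs. the reduced layer from the workaround.
big = feature_map_mib(128, 1024, 1024)
small = feature_map_mib(16, 1024, 1024)

print(f"{big:.0f} MiB vs {small:.0f} MiB per feature map")
# → 512 MiB vs 64 MiB per feature map
```

Downscaling the input image has the same kind of effect, since memory also scales with height × width.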

@Tetsuo7945 if this fixed your problem, can you close it?

Pretty sure I was testing this on Colab. I'm no longer using Colab, nor have I been attempting to use the software. I'm happy to close the issue.

@KirmTwinty thank you for your input 🙂