Yuxiang1995 / ICDAR2021_MFD

1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection (champion solution for formula detection)

RuntimeError: CUDA OOM / Bad Training Performance after reducing crop_size

AaaJrwp4 opened this issue

Hello @Yuxiang1995 ,

I am attempting to train your network on a single GPU,
but I get this error:

RuntimeError: CUDA out of memory. Tried to allocate 1.86 GiB
(GPU 0; 4.00 GiB total capacity; 1.48 GiB already allocated; 319.91 MiB free;
1.88 GiB reserved in total by PyTorch)

I also tried a GPU with 10 GiB of memory, but I got the same error.
I am able to train when I reduce the crop_size in configs/_base_/datasets/formula_detection.py,
but the model does not seem to learn anything, since the loss does not decrease.
I also saw in your presentation that the large crop_size is a feature of your model.

Can you give me a hint on how to successfully train the model on a single GPU, i.e. get rid of the CUDA OOM error?
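
As a rough way to reason about the numbers in the error message above, the sketch below estimates how far the crop would have to shrink for the failed allocation alone to fit in the free memory. It assumes, as a simplification, that the size of that allocation grows linearly with the number of input pixels; the helper name `side_scale_to_fit` is hypothetical and not part of this repository.

```python
import math

MIB_PER_GIB = 1024


def side_scale_to_fit(requested_mib: float, free_mib: float) -> float:
    """Return the factor by which each crop side would need to shrink.

    Assumption: the allocation grows linearly with pixel count
    (height * width), so scaling both sides by s scales it by s ** 2.
    """
    if requested_mib <= free_mib:
        return 1.0  # already fits, no shrinking needed
    return math.sqrt(free_mib / requested_mib)


requested = 1.86 * MIB_PER_GIB  # "Tried to allocate 1.86 GiB"
free = 319.91                   # "319.91 MiB free"
scale = side_scale_to_fit(requested, free)
print(f"shrink each crop side to about {scale:.2f} of its current length")
```

This is only an order-of-magnitude estimate: it ignores memory already held by weights and optimizer state, and it says nothing about whether the model still trains well at the smaller crop.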

Hi, I tried reducing the input image size, which makes training run, but now the loss is not converging (as others have reported in several issues).
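
One thing that may be worth checking when the loss does not converge on a single GPU is the learning rate. Detection configs are often tuned for a multi-GPU batch, and the linear scaling rule suggests changing the learning rate in proportion to the total batch size. Whether this repository's config follows that convention is an assumption, and the reference values below (8 GPUs, 2 images per GPU, learning rate 0.02) are illustrative, not taken from this thread. The helper name `scale_lr` is hypothetical.

```python
def scale_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Linear scaling rule: learning rate proportional to total batch size."""
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch / base_batch


# Illustrative reference setup (assumed, not from this repository).
base_lr = 0.02
base_batch = 8 * 2   # 8 GPUs x 2 images per GPU
new_batch = 1 * 2    # 1 GPU x 2 images per GPU

new_lr = scale_lr(base_lr, base_batch, new_batch)
print(f"suggested learning rate for batch size {new_batch}: {new_lr}")
```

The scaled value is a starting point for experimentation, not a guaranteed fix; a smaller crop also changes what the model sees, which the learning rate alone cannot compensate for.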