WXinlong / SOLO

SOLO and SOLOv2 for instance segmentation, ECCV 2020 & NeurIPS 2020.

RuntimeError: CUDA out of memory occurred when testing

zhuaiyi opened this issue

Command

python tools/test_ins.py configs/solov2/solov2_light_448_r34_fpn_8gpu_3x.py work_dirs/solov2_light_release_r34_fpn_8gpu_3x/epoch_36.pth --show --out results_solo.pkl --eval segm

Bug
[>>>>>>>>>>>>> ] 20/76, 0.3 task/s, elapsed: 59s, ETA: 165s
Traceback (most recent call last):
...
RuntimeError: CUDA out of memory. Tried to allocate 3.30 GiB (GPU 0; 8.00 GiB total capacity; 973.14 MiB already allocated; 2.13 GiB free; 3.74 GiB reserved in total by PyTorch)
Then I shrank my test set to 14 images; the same error occurred at [>> ] 2/14.

Environment
python 3.7
CUDA 11.1
PyTorch 1.7.0+cu110

Supplement
The epoch_36.pth file was generated by training on my own dataset. It performs pretty well when tested on single images with inference_demo.py, but fails with this batch-test command.

@zhuaiyi You can reduce the number of objects kept in post-processing, e.g., set a smaller MODEL.SOLOV2.NMS_PRE. Or move the sort-and-select step ahead, right after the model prediction: for example, move lines 440-448 of solov2.py up to line 410, with minimal modifications; make sure you adjust the variable names and don't miss any variables.
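
For anyone hitting the same OOM, here is a minimal sketch of that second suggestion: keep only the top nms_pre candidates as soon as category scores are available, before the mask branch materialises full-resolution masks. The names cate_scores, cate_labels, kernel_preds and the default of 500 are assumptions for illustration, not the repo's exact code.

```python
# Minimal sketch: cap the candidate set to nms_pre before mask decoding.
# Variable names and the default value are assumptions, not the repo's code.
import torch

def keep_top_candidates(cate_scores: torch.Tensor,
                        cate_labels: torch.Tensor,
                        kernel_preds: torch.Tensor,
                        nms_pre: int = 500):
    """Keep only the nms_pre highest-scoring candidates.

    Doing this before the dynamic-convolution / mask step bounds how many
    full-resolution masks are ever materialised, which is where the
    test-time OOM comes from.
    """
    sort_inds = torch.argsort(cate_scores, descending=True)
    if sort_inds.numel() > nms_pre:
        sort_inds = sort_inds[:nms_pre]
    return (cate_scores[sort_inds],
            cate_labels[sort_inds],
            kernel_preds[sort_inds])
```

Called right after the scores are thresholded, this keeps peak memory roughly proportional to nms_pre instead of the raw number of grid candidates.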

Thanks very much! I'll get to work on it.

Hello, may I ask if you've solved this yet?

I went back and checked: the fix the author suggested is based on the AdelaiDet framework, while I'm using mmdet. I eventually got the test to run by reducing img_scale under test_pipeline in the config file.
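
For reference, a minimal sketch of that kind of change in an mmdet-style config (e.g. configs/solov2/solov2_light_448_r34_fpn_8gpu_3x.py). The pipeline layout mirrors the standard mmdet test_pipeline; the img_scale value below is only an illustration, so pick whatever fits your GPU memory.

```python
# Sketch of a reduced test-time image scale in an mmdet-style config.
# The exact img_scale below is illustrative, not the repo's default.
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(448, 448),  # smaller than the original test scale
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
```

In the full config, data['test']['pipeline'] should point at this test_pipeline. A smaller test scale shrinks the feature maps and predicted masks, so peak memory drops roughly quadratically with the scale.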

Hello, I'd like to ask: when I use inference_demo.py to batch-infer images, the GPU memory usage is very high. Do you have any workaround?