ufoym/deepo:pytorch-py36-cu101 cuda.max_memory_allocated crash

Question

qrsforever opened this issue 4 years ago

>>> import torch
>>> torch.__version__
'1.6.0.dev20200607+cu101'
>>> from torch.cuda import max_memory_allocated
>>> max_memory_allocated(0)
Segmentation fault (core dumped)

大地小神 · Answer 1 · Wed Jun 10 2020 22:46:52 GMT+0800 (China Standard Time)

My mistake: I ran docker without "--runtime nvidia".
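For anyone checking the same thing, here is a minimal sketch that confirms the container can see a GPU before it queries memory statistics. The helper name `max_memory_allocated_or_none` is hypothetical. It assumes that `torch.cuda.is_available()` returns False when the container is started without the NVIDIA runtime. Whether this guard avoids the segfault reported above has not been verified.

```python
def max_memory_allocated_or_none(device=0):
    """Return torch.cuda.max_memory_allocated(device), or None if no usable GPU.

    Hypothetical helper for illustration; not part of torch or deepo.
    """
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment.
        return None
    if not torch.cuda.is_available():
        # Typical when the container was started without GPU access.
        return None
    if device >= torch.cuda.device_count():
        # The requested device index does not exist.
        return None
    return torch.cuda.max_memory_allocated(device)


if __name__ == "__main__":
    print(max_memory_allocated_or_none(0))
```

A result of None here points to the container setup (for example, a missing "--runtime nvidia") rather than to torch itself.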

大地小神 · Answer 2 · Thu Jun 11 2020 10:24:50 GMT+0800 (China Standard Time)

Reopening: the crash is not caused by the missing "--runtime nvidia" flag, but by torch version 1.6.

torch 1.5:

>>> import torch
>>> torch.__version__
'1.5.0.dev20200319'
>>> torch.version.cuda
'10.1'
>>> torch.cuda.max_memory_reserved(0)
0
>>>

torch 1.6:

>>> import torch
>>> torch.__version__
'1.6.0.dev20200609+cu101'
>>> torch.version.cuda
'10.1'
>>> torch.cuda.max_memory_reserved(0)
Segmentation fault (core dumped)
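To compare several images or torch builds without losing the interpreter on each crash, one option is to run the probe in a child process and inspect its exit status. This is only a sketch: `run_isolated` is a hypothetical helper, and it relies on the POSIX behavior where `subprocess` reports a negative return code when the child is killed by a signal (SIGSEGV is usually 11).

```python
import subprocess
import sys


def run_isolated(snippet, timeout=120):
    """Run a Python snippet in a child interpreter and summarize the outcome.

    Hypothetical helper for illustration. A hard crash in the child
    (for example a segmentation fault) cannot take down the caller.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.6 as well
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        return "killed by signal {}".format(-proc.returncode)
    if proc.returncode != 0:
        lines = proc.stderr.strip().splitlines()
        detail = lines[-1] if lines else "exit code {}".format(proc.returncode)
        return "failed: " + detail
    return "ok: " + proc.stdout.strip()


if __name__ == "__main__":
    # The same call that crashes in the transcript above.
    print(run_isolated("import torch; print(torch.cuda.max_memory_reserved(0))"))
```

Running this script inside each image gives a one-line result per torch version, which makes it easier to narrow down which build introduced the crash.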

Ming · Answer 3 · Mon Dec 27 2021 21:51:03 GMT+0800 (China Standard Time)

This should be fixed in the latest deepo images:

>>> import torch
>>> torch.__version__
'1.11.0.dev20211224+cu111'
>>> torch.version.cuda
'11.1'
>>> torch.cuda.max_memory_reserved(0)
0