dataloader memory leak issue

Question

wuqianliang opened this issue 2 years ago · comments

Hi, thank you for your excellent work.
When I run your code, a data loader worker always seems to get killed after training has been running for a while. Could there be a memory leak?
I have set the number of data loader workers to 4 and the batch size to 2, on a machine with a TITAN card and 72 GB of memory.

Yang Li · Answer 1 · Mon Apr 25 2022 22:24:06 GMT+0800 (China Standard Time)

I have never seen this on my machine.
It may be caused by running out of memory.
You could try a smaller num_worker, say 1, 2, or even 0.
Also see pytorch/pytorch#8976 (comment).
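
For anyone trying the suggestion above, here is a minimal sketch of lowering the worker count on a PyTorch DataLoader. The toy dataset and the `make_loader` helper are made up for illustration and are not this repository's code; the repository may expose the worker count through its own config option instead. With `num_workers=0`, batches are loaded in the main process, so no worker process exists to be killed.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(dataset, batch_size=2, num_workers=0):
    """Hypothetical helper: build a DataLoader with a small worker count.

    num_workers=0 loads every batch in the main process.
    """
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
    )


# Toy stand-in dataset (an assumption): 8 samples with 3 features each.
toy_dataset = TensorDataset(torch.randn(8, 3), torch.arange(8))

loader = make_loader(toy_dataset, batch_size=2, num_workers=0)
num_batches = sum(1 for _ in loader)  # iterate once to count the batches
```

If the crash disappears at 0 workers and comes back at 4, that points at per-worker memory use rather than at the training loop itself.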

chalth · Answer 2 · Mon Apr 24 2023 11:05:03 GMT+0800 (China Standard Time)

Hi, I have run into the same problem. Did you solve it?

ACuOoOoO · Answer 3 · Fri Jul 14 2023 14:56:03 GMT+0800 (China Standard Time)

I found that the problem is caused by AverageMeter.update() being called in each training step. I solved it by detaching the input tensor before it is accumulated in AverageMeter.update().
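
A rough sketch of what that fix could look like, assuming the usual AverageMeter pattern with `val`, `sum`, `count`, and `avg` fields. This is not the repository's actual class, and the field names are assumptions. The idea is that if `update()` receives a loss tensor that is still attached to the autograd graph, adding it to a running sum keeps every iteration's graph reachable, so memory use grows with each step. Detaching the value and converting it to a plain Python float avoids that.

```python
class AverageMeter:
    """Tracks the running average of a scalar metric such as the loss.

    Hypothetical reconstruction of the common pattern, not the repo's code.
    """

    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0.0
        self.sum = 0.0
        self.count = 0
        self.avg = 0.0

    def update(self, val, n=1):
        # Tensor-like input: drop the autograd graph, then take the scalar.
        # Plain Python numbers skip this branch.
        if hasattr(val, "detach"):
            val = val.detach().item()
        val = float(val)
        self.val = val
        self.sum += val * n  # accumulates a plain float, never a tensor
        self.count += n
        self.avg = self.sum / self.count
```

In a training loop this would be called as something like `meter.update(loss, n=batch_size)`. Passing `loss.item()` at the call site has the same effect without changing the class.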