More RAM is consumed as training epoch increases
hongvin opened this issue · comments
Koay Hong Vin commented
I have repeated the experiment many times, and the process is eventually killed at the 14th epoch, mainly because my CPU RAM is exhausted. My current configuration is an 8-core CPU, 30GB of CPU RAM, and a P5000 16GB GPU.
I have tried reducing the memory bank size, but the problem persists.
SeonWoo-Lee commented
Reduce the num_workers count and the batch size.
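Assuming the training script uses a PyTorch `DataLoader` (not confirmed from the thread), the suggestion above maps to settings like these; `train_dataset` is a placeholder for the actual dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset standing in for the real training data.
train_dataset = TensorDataset(torch.randn(64, 3), torch.randint(0, 2, (64,)))

loader = DataLoader(
    train_dataset,
    batch_size=8,       # smaller batches lower peak RAM per step
    num_workers=0,      # 0 avoids spawning worker processes, each of which
                        # holds its own copy of the dataset in CPU RAM
    pin_memory=False,   # pinned host buffers also consume CPU RAM
)

for x, y in loader:
    pass  # training step would go here
```

If RAM still grows across epochs with `num_workers=0`, the leak is more likely in the training loop itself (e.g. accumulating loss tensors without `.item()` or growing the memory bank) than in data loading.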
CreatedTRYNA commented
I have the same problem when I try to train this model. Is there anything I can do?