How much memory is needed for training?
hwd8868 opened this issue
I trained on the dataset using the CPU build of TensorFlow on a machine with 48 GB of memory, but I get a memory error after training for several hours:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
I train on a GPU with 6 GB of memory.
What is your batch size for training?
Batch size is 5, the default value.
I have now decreased the batch size to 2 and still get the same error.
I just ran it as is, without modifying your code, and I don't know how to resolve this issue.
Please refer to other pages about this.
This issue might also help you:
tensorflow/tensorflow#9487
OK, thank you. I just found that the 48 GB of memory was fully used.
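For anyone who wants to check memory use on their own machine, here is a minimal sketch for logging the process's peak memory during training. It assumes Linux or macOS and uses only the Python standard library. `peak_rss_mb` and `log_memory` are hypothetical helper names, not part of this repository, and the loop at the bottom is only a stand-in for a real training loop.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(step, every=100):
    """Print peak memory every `every` steps; return the value logged, else None."""
    if step % every != 0:
        return None
    mb = peak_rss_mb()
    print(f"step {step}: peak RSS {mb:.1f} MB")
    return mb


if __name__ == "__main__":
    # Hypothetical stand-in for a training loop that keeps allocating memory.
    held = []
    for step in range(50):
        held.append(b"\x01" * (1024 * 1024))  # keep about 1 MB alive per step
        log_memory(step, every=10)
```

Calling `log_memory(step)` inside the real training loop would show whether the peak keeps climbing from step to step. A steady climb would suggest that memory is accumulating over time rather than being set by the batch size.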
Hi hwd8868!
I ran into the same error as you did. I was using a CPU with around 20 GB of memory, but my usage peaked at only about 45%. How did you solve your issue, and do you have any suggestions?
Thank you!