GPU memory issue
KL4805 opened this issue · comments
Dear author,
Thanks for open-sourcing your code!
I am currently trying to run
python step/run.py --cfg='step/STEP_METR-LA.py' --gpus='0,1'
on two V100 GPUs (32 GB of memory each). However, the program fails with an out-of-memory error, which I don't think is expected. Could you please tell me what the normal GPU memory usage is? Thanks.
That's strange. I am using the code cloned from GitHub without any modification, so batch_size is set to 32.
I will first try to figure out the problem myself. Thanks for the reference GPU usage you gave.
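For anyone debugging a similar problem, here is a minimal sketch of how per-GPU memory could be checked from inside the training process. It is not part of the STEP repository: the helper names `format_gib` and `report_gpu_memory` are made up for illustration, and the only PyTorch calls used are `torch.cuda.memory_allocated`, `torch.cuda.memory_reserved` and `torch.cuda.max_memory_allocated`.

```python
def format_gib(num_bytes):
    """Convert a byte count to a human-readable GiB string."""
    return f"{num_bytes / 1024**3:.2f} GiB"


def report_gpu_memory():
    """Return one summary line per visible CUDA device.

    Hypothetical helper for illustration; returns an empty list when
    PyTorch is not installed or no CUDA device is available.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []

    lines = []
    for idx in range(torch.cuda.device_count()):
        # Memory currently occupied by tensors on this device.
        allocated = torch.cuda.memory_allocated(idx)
        # Memory held by PyTorch's caching allocator (>= allocated).
        reserved = torch.cuda.memory_reserved(idx)
        # Peak tensor memory since the start of the process.
        peak = torch.cuda.max_memory_allocated(idx)
        lines.append(
            f"cuda:{idx} allocated={format_gib(allocated)} "
            f"reserved={format_gib(reserved)} peak={format_gib(peak)}"
        )
    return lines


if __name__ == "__main__":
    for line in report_gpu_memory():
        print(line)
```

Calling `report_gpu_memory()` right after the model and pretrained weights are loaded, and again after a few training iterations, should show which stage is using the memory. These counters only cover the current process's PyTorch allocator, so `nvidia-smi` may report a somewhat higher number.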
You're welcome. In addition, my PyTorch version is 1.10.0 with CUDA 11.1.
It turns out that I had misconfigured something when loading the pretrained transformer. After I fixed it, the memory usage became normal.
Thank you for your help.
Could you tell me what you did to bring the memory usage back to normal? Which settings did you change? Thank you very much.