CUDA error during training step.

Question

AndresFRJ98 opened this issue 5 years ago · comments

Hello, I was running the training step as described: python main.py --mode train --para_limit 2250 --batch_size 24 --init_lr 0.1 --keep_prob 1.0 --sp_lambda 1.0

I encountered this error. I am running this on a GPU. Is it possible that this is a memory allocation issue? Any idea how much memory is required to perform the training step?

Any pointers towards resolving this would be much appreciated.

AndresFRJ98 · Answer 1 · Wed Jan 29 2020 20:04:07 GMT+0800 (China Standard Time)

Updated my CUDA version and it worked out :)

njucsxh · Answer 2 · Thu Apr 22 2021 23:06:29 GMT+0800 (China Standard Time)

How did you do it? My environment is:
torch 0.3
cuda 9.0
However, the CUDA version shown by "nvidia-smi" is 10.2, and cudatoolkit is 10.1.
I can't find a build of torch 0.3 for a higher CUDA version, and the README says they use torch=0.3.

Thanks!
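
A minimal sketch for comparing the versions discussed above, assuming Python 3.5+ and that PyTorch may or may not be installed. The function name cuda_environment_report is hypothetical and not part of the repository. It reports the CUDA version the installed PyTorch build was compiled against (torch.version.cuda) alongside the raw nvidia-smi output. On recent drivers the nvidia-smi header shows the highest CUDA version the driver supports, which can legitimately differ from the toolkit version PyTorch was built with.

```python
import shutil
import subprocess


def cuda_environment_report():
    """Collect the CUDA-related versions reported by PyTorch and the driver."""
    report = {
        "torch": None,           # installed PyTorch version, if any
        "torch_cuda": None,      # CUDA version PyTorch was built against
        "cuda_available": None,  # whether PyTorch can actually use a GPU
        "nvidia_smi": None,      # raw nvidia-smi output, if the tool exists
    }

    try:
        import torch
        report["torch"] = torch.__version__
        # None here usually means a CPU-only build of PyTorch.
        report["torch_cuda"] = getattr(torch.version, "cuda", None)
        report["cuda_available"] = torch.cuda.is_available()
    except (ImportError, OSError):
        # PyTorch is missing or its native libraries failed to load.
        pass

    if shutil.which("nvidia-smi"):
        try:
            result = subprocess.run(
                ["nvidia-smi"],
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                timeout=30,
            )
            report["nvidia_smi"] = result.stdout
        except (OSError, subprocess.SubprocessError):
            # The driver tool exists but could not be run.
            pass

    return report


if __name__ == "__main__":
    for key, value in cuda_environment_report().items():
        print("{}: {}".format(key, value))
```

If torch_cuda is None or cuda_available is False while nvidia-smi lists a GPU, the PyTorch build and the installed driver are likely mismatched, which is one possible cause of CUDA errors at training time.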