hotpotqa / hotpot

Cuda error during training step.

AndresFRJ98 opened this issue

[image]

Hello, I was running the training step as mentioned:

python main.py --mode train --para_limit 2250 --batch_size 24 --init_lr 0.1 --keep_prob 1.0 --sp_lambda 1.0

I encountered this error. I am running this on a GPU. Is it possible that this is a memory allocation issue? Any idea how much memory is required to perform the training step?

Any pointers towards resolving this would be much appreciated.

I updated my CUDA version and it worked out :)

How did you do it? My environment is:
torch 0.3
cuda 9.0
The CUDA version shown by "nvidia-smi" is 10.2, and my cudatoolkit is 10.1.
I can't find a build of torch 0.3 for a higher CUDA version, and the README says they use torch 0.3.

Thanks!
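For anyone comparing versions as in the comment above, here is a minimal sketch that prints the CUDA version the installed torch build was compiled against next to the installed driver version. It is not from the hotpot repo. The helper name `cuda_report` is made up for illustration, and it assumes `torch.version.cuda` is present in your torch release (it falls back to `None` if not). Note that the CUDA version "nvidia-smi" displays is the highest version the driver supports, so it can be newer than the version torch was built with.

```python
import shutil
import subprocess


def cuda_report():
    """Collect CUDA-related versions visible from this Python environment.

    Hypothetical helper for illustration; not part of the hotpot code.
    """
    report = {"torch": None, "torch_cuda": None,
              "cuda_available": False, "driver": None}

    # torch may be missing or fail to load its CUDA libraries.
    try:
        import torch
        report["torch"] = torch.__version__
        # CUDA version the torch binary was built with (None for CPU builds).
        report["torch_cuda"] = getattr(torch.version, "cuda", None)
        report["cuda_available"] = bool(torch.cuda.is_available())
    except (ImportError, OSError):
        pass

    # Ask the driver for its version, if nvidia-smi is on the PATH.
    if shutil.which("nvidia-smi"):
        try:
            proc = subprocess.run(
                ["nvidia-smi", "--query-gpu=driver_version",
                 "--format=csv,noheader"],
                stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
                universal_newlines=True)
            lines = proc.stdout.strip().splitlines()
            if proc.returncode == 0 and lines:
                report["driver"] = lines[0].strip()
        except OSError:
            pass

    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(key, "=", value)
```

If `cuda_available` is False while a driver version is reported, the torch build and the installed CUDA libraries are a likely place to look first.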