RuntimeError: CUDA error: an illegal memory access was encountered
gpravi opened this issue
falcontune generate --interactive --model falcon-40b-instruct-4bit \
  --weights gptq_model-4bit--1g.safetensors --max_new_tokens=50 \
  --use_cache --do_sample \
  --instruction "Who was the first person on the moon?"
...
RuntimeError: CUDA error: an illegal memory access was encountered
I encountered the above error while trying to generate on a multi-GPU machine.
So for now I am just using one GPU, with the following: export CUDA_VISIBLE_DEVICES=1
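If you would rather not export the variable for the whole shell session, the same single-GPU workaround can be scoped to one command. This is only a rough sketch, assuming a POSIX shell with the standard `env` utility; `run_on_single_gpu` is a hypothetical helper name and not part of falcontune. It does not fix the underlying multi-GPU crash; it only hides the other GPUs from the process.

```shell
# Hypothetical helper (not part of falcontune): run one command with only
# a single GPU visible to CUDA, leaving the rest of the session untouched.
# Usage: run_on_single_gpu <gpu_index> <command> [args...]
run_on_single_gpu() {
    gpu_index="$1"
    shift
    # env sets CUDA_VISIBLE_DEVICES for this one child process only.
    env CUDA_VISIBLE_DEVICES="$gpu_index" "$@"
}

# Example invocation, reusing the flags from the report above:
# run_on_single_gpu 1 falcontune generate --interactive \
#   --model falcon-40b-instruct-4bit \
#   --weights gptq_model-4bit--1g.safetensors --max_new_tokens=50 \
#   --use_cache --do_sample \
#   --instruction "Who was the first person on the moon?"
```

Inside the launched process the selected card is renumbered as device 0, since CUDA only enumerates the devices listed in CUDA_VISIBLE_DEVICES.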
Same here: with two 3090s, falcontune crashes with the same CUDA illegal memory access error.