Question about gen_model_answer.py with gpt2-large
Qiyao Wei commented
When I run
python gen_model_answer.py --model-path openai-community/gpt2-large --model-id gpt2-large
from the FastChat/fastchat/llm_judge directory, I get this error:
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [86,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [87,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [88,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [89,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [90,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [91,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [92,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [93,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [94,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [95,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
0%| | 0/80 [00:10<?, ?it/s]
Traceback (most recent call last):
  File "/FastChat/fastchat/llm_judge/gen_model_answer.py", line 294, in <module>
    run_eval(
  File "/FastChat/fastchat/llm_judge/gen_model_answer.py", line 55, in run_eval
    get_answers_func(
  File "/anaconda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/anaconda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 800, in to
    self.data = {k: v.to(device=device) for k, v in self.data.items()}
  File "/anaconda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 800, in <dictcomp>
    self.data = {k: v.to(device=device) for k, v in self.data.items()}
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Is this error expected?
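
For anyone trying to reproduce this, here is a minimal sketch of a check that might help narrow it down. It assumes the assertion comes from an embedding lookup that received an index at or beyond the table size, which is the condition `srcIndex < srcSelectDimSize` guards in the index-select kernel. The helper names `out_of_range` and `diagnose` are hypothetical and not part of FastChat or transformers. The two constants are GPT-2's published vocabulary size and context length, and should be confirmed against the loaded model's config. The check works on plain Python lists, so it can run on CPU before anything reaches the GPU.

```python
from typing import Dict, List, Tuple

# Assumed values for GPT-2 checkpoints; confirm against the loaded model's config.
GPT2_VOCAB_SIZE = 50257
GPT2_MAX_POSITIONS = 1024


def out_of_range(indices: List[int], table_size: int) -> List[Tuple[int, int]]:
    """Return (offset, value) pairs for indices that a table with
    `table_size` rows cannot serve (negative, or >= table_size)."""
    return [(i, v) for i, v in enumerate(indices) if v < 0 or v >= table_size]


def diagnose(
    input_ids: List[int],
    vocab_size: int = GPT2_VOCAB_SIZE,
    max_positions: int = GPT2_MAX_POSITIONS,
) -> Dict[str, List[Tuple[int, int]]]:
    """Check one tokenized prompt against both embedding tables.

    Token ids index the token embedding table; positions 0..len-1 index
    the position embedding table.
    """
    position_ids = list(range(len(input_ids)))
    return {
        "bad_token_ids": out_of_range(input_ids, vocab_size),
        "bad_position_ids": out_of_range(position_ids, max_positions),
    }


if __name__ == "__main__":
    # A token id equal to the vocabulary size, as could happen if a special
    # token were added to the tokenizer without resizing the embeddings.
    print(diagnose([15496, 995, 50257]))

    # A prompt longer than the assumed context length.
    print(diagnose([0] * 1030)["bad_position_ids"][:2])
```

If either list comes back non-empty for a real prompt, that would point at the token ids or at the prompt length rather than at the tokenizer `.to(device)` call shown in the traceback. Rerunning with CUDA_LAUNCH_BLOCKING=1, as the message suggests, should make the traceback point at the call that actually failed.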