lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Question about gen_model_answer.py with gpt2-large

QiyaoWei opened this issue

If I run the following command from the directory FastChat/fastchat/llm_judge:

python gen_model_answer.py --model-path openai-community/gpt2-large --model-id gpt2-large

I get this error:

/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [86,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [87,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [88,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [89,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [90,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [91,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [92,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [93,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [94,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1711403380909/work/aten/src/ATen/native/cuda/Indexing.cu:1237: indexSelectSmallIndex: block: [8,0,0], thread: [95,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
  0%|          | 0/80 [00:10<?, ?it/s]
Traceback (most recent call last):
  File "/FastChat/fastchat/llm_judge/gen_model_answer.py", line 294, in <module>
    run_eval(
  File "/FastChat/fastchat/llm_judge/gen_model_answer.py", line 55, in run_eval
    get_answers_func(
  File "/anaconda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/anaconda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 800, in to
    self.data = {k: v.to(device=device) for k, v in self.data.items()}
  File "/anaconda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 800, in <dictcomp>
    self.data = {k: v.to(device=device) for k, v in self.data.items()}
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Is this error expected?
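
For context on what the log is reporting: the assertion `srcIndex < srcSelectDimSize` is a bounds check in PyTorch's index-select CUDA kernel, and it usually fires when a lookup table (for example an embedding) receives an index at or beyond its size. The sketch below mimics that check in plain Python so the failure mode is easier to see. It is only an illustration, not a diagnosis of this run: the helper names are hypothetical, and the two sizes are assumed values for GPT-2 that should be checked against the model's own config.

```python
# Plain-Python illustration of the bounds check behind
# `srcIndex < srcSelectDimSize`. No GPU or PyTorch required.
# Helper names are hypothetical; they are not part of FastChat or PyTorch.


def out_of_range_positions(indices, table_size):
    """Return (position, index) pairs that a table with `table_size` rows cannot serve."""
    return [
        (pos, idx)
        for pos, idx in enumerate(indices)
        if not 0 <= idx < table_size
    ]


def check_lookup(name, indices, table_size):
    """Raise IndexError for the first out-of-range index, naming the table."""
    bad = out_of_range_positions(indices, table_size)
    if bad:
        pos, idx = bad[0]
        raise IndexError(
            f"{name}: index {idx} at position {pos} is outside [0, {table_size})"
        )


# Assumed sizes for GPT-2; verify against the model config before relying on them.
VOCAB_SIZE = 50257     # rows in the token embedding table
MAX_POSITIONS = 1024   # rows in the position embedding table

# Case 1: a token id equal to the vocabulary size, e.g. a special token
# added to the tokenizer without resizing the embedding table.
token_ids = [15496, 995, VOCAB_SIZE]
print(out_of_range_positions(token_ids, VOCAB_SIZE))

# Case 2: prompt plus generated tokens running past the context window.
position_ids = list(range(MAX_POSITIONS + 6))
print(len(out_of_range_positions(position_ids, MAX_POSITIONS)))

try:
    check_lookup("position embedding", position_ids, MAX_POSITIONS)
except IndexError as err:
    print(err)
```

Either case would trip the same assertion on the GPU. Rerunning with CUDA_LAUNCH_BLOCKING=1, as the error message suggests, or running the same inputs on CPU should point at the lookup that actually failed.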