pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Too-long input texts cause "device-side assert triggered"

li-aolong opened this issue · comments

```
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [58,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [59,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [60,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [61,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [62,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [63,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
...
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```

When I use llama-2-7b-hf and feed it ten samples, each around 2048 tokens after tokenization, I still hit the error above, even though I have set the block size to 4096.

If I shorten the inputs, it works.
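For what it's worth, the assert reads like an out-of-bounds write into a KV cache that was allocated for 1504 positions. Below is a minimal standalone sketch of that failure mode; the 1504 bound is taken from the log above, while the head count, head dim, and write position are made-up illustrative values:

```python
import torch

# Hypothetical repro of the failure mode: a KV cache allocated for
# max_seq_length positions is written at a position >= max_seq_length.
# On CUDA this trips an asynchronous device-side assert, matching the
# "index out of bounds: 0 <= tmp68 < 1504" lines above.
max_seq_length = 1504  # cache capacity, taken from the assert bound
k_cache = torch.zeros(1, 32, max_seq_length, 128, device="cuda")

input_pos = torch.tensor([2048], device="cuda")  # position past the cache end
k_val = torch.randn(1, 32, 1, 128, device="cuda")

k_cache[:, :, input_pos] = k_val  # out-of-bounds scatter -> device-side assert
torch.cuda.synchronize()          # the assert only surfaces at the next sync
```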

Any idea how to use this for long-context examples? It seems that max_seq_len > 2048 triggers the error above.
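For reference, the cache sizing in gpt-fast's generate path looks roughly like the sketch below. The helper `plan_cache` is hypothetical and only mirrors the `min(prompt_len + max_new_tokens, block_size)` logic from generate.py; as far as I can tell, `block_size` defaults to 2048 in model.py's `ModelArgs` and the Llama-2-7B entry does not override it, so anything past 2048 gets silently clipped while decoding positions keep counting upward:

```python
def plan_cache(prompt_len: int, max_new_tokens: int, block_size: int = 2048) -> int:
    """Simplified mirror of the KV-cache sizing in gpt-fast's generate.py."""
    return min(prompt_len + max_new_tokens, block_size)

# With a ~2048-token prompt the cache is capped at block_size, so positions
# written during decoding run past the end of the allocated cache:
print(plan_cache(prompt_len=2048, max_new_tokens=200))  # -> 2048, not 2248
```

If that reading is right, `block_size` has to be raised in the model config itself (not just at the call site) before prompts past 2048 can fit.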