pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Device-side assertion error during speculative decoding with prompts of different lengths.

ZipECHO opened this issue

I am running speculative sampling with the generate.py script in compile mode. The original speculative decoding setup in gpt-fast decodes the same prompt several times, but I want to decode several different prompts. When the prompts have different lengths, I hit an "enable device-side assertions" error. The error messages are:

unknown:0: unknown: block: [33,0,0], thread: [0,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
unknown:0: unknown: block: [33,0,0], thread: [1,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
unknown:0: unknown: block: [33,0,0], thread: [2,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
unknown:0: unknown: block: [33,0,0], thread: [3,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
...
unknown:0: unknown: block: [59,0,0], thread: [47,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
unknown:0: unknown: block: [59,0,0], thread: [48,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
unknown:0: unknown: block: [59,0,0], thread: [49,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
unknown:0: unknown: block: [59,0,0], thread: [50,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
unknown:0: unknown: block: [59,0,0], thread: [51,0,0] Assertion `index out of bounds: 0 <= tmp62 < 216` failed.
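
Each line of the log reports the same failed check: some index value (`tmp62` in the generated kernel) fell outside the range `0 <= index < 216`. As a rough debugging aid, the same check can be reproduced on the host to see which positions would be rejected. This is only a sketch: `find_out_of_bounds` is a hypothetical helper, not part of gpt-fast, and the idea that 216 is a buffer length fixed for the first prompt is an assumption, not something the log confirms.

```python
from typing import List, Sequence, Tuple


def find_out_of_bounds(indices: Sequence[int], size: int) -> List[Tuple[int, int]]:
    """Return (offset, value) pairs for every index violating 0 <= value < size.

    This mirrors the condition in the device-side assertion, evaluated on the host.
    """
    return [(i, v) for i, v in enumerate(indices) if not 0 <= v < size]


# Hypothetical positions written while decoding a longer prompt.
# 216 is the bound taken from the assertion message above.
positions = list(range(210, 220))
bad = find_out_of_bounds(positions, 216)
print(bad)  # [(6, 216), (7, 217), (8, 218), (9, 219)]
```

If a check like this flags positions for the longer prompts only, that would point to a size chosen for an earlier, shorter prompt being reused. Running with the environment variable `CUDA_LAUNCH_BLOCKING=1` may also help tie the assertion to the call that triggers it.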