SafeAILab / EAGLE

Official Implementation of EAGLE

Home Page: https://arxiv.org/abs/2401.15077

Is the rejection and adjusting probability implementation different from normal speculative sampling?

AlvL1225 opened this issue

image

However, in other implementations, such as GPT-fast or lucidrains' implementation, the difference (GTP - Q) is subtracted element-wise over the whole distribution, not only at the rejected element. Shouldn't that be the case here as well?
image
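
For readers without access to the screenshots, the element-wise rule asked about above can be sketched as follows. This is a minimal illustration of standard speculative sampling, not code taken from EAGLE, GPT-fast, or lucidrains' repository; the function names `residual_distribution` and `speculative_step` are hypothetical, `p` stands for the target distribution (GTP in the question), and `q` for the draft distribution.

```python
import random


def residual_distribution(p, q):
    """Return norm(max(0, p - q)), computed over every token in the vocabulary."""
    diff = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(diff)
    if total == 0.0:
        # p and q coincide, so a rejection cannot happen; fall back to p.
        return list(p)
    return [d / total for d in diff]


def speculative_step(p, q, draft_token, rng=random):
    """Verify one draft token; returns (token, accepted).

    Assumes draft_token was sampled from q, so q[draft_token] > 0.
    """
    accept_prob = min(1.0, p[draft_token] / q[draft_token])
    if rng.random() < accept_prob:
        return draft_token, True
    # On rejection, resample from the residual built from the whole
    # distribution, not from p with only the rejected entry altered.
    residual = residual_distribution(p, q)
    token = rng.choices(range(len(p)), weights=residual, k=1)[0]
    return token, False


p = [0.5, 0.3, 0.2]  # target distribution (illustrative values)
q = [0.2, 0.6, 0.2]  # draft distribution (illustrative values)
print(residual_distribution(p, q))
```

With these illustrative values, all of the residual mass lands on token 0, the only token where the target assigns more probability than the draft.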

Thanks! The correct approach is to subtract the two distributions rather than adjust only the value of the rejected element. We have already updated the non-greedy code accordingly. Since sampling without replacement is performed here, a mask is used to adjust the draft distribution after each rejection. The rest is consistent with the code in the screenshot you provided.
image
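
One way to read the reply above is sketched below: after each rejection the target is replaced by the normalized element-wise residual, and the rejected token is masked out of the draft distribution before the next sibling candidate is tested. This is an assumption about the general technique (rejection sampling over candidates drawn without replacement), not a copy of the updated EAGLE code; `verify_candidates` and `normalize` are hypothetical names.

```python
import random


def normalize(v):
    """Rescale a non-negative vector so that it sums to 1."""
    total = sum(v)
    return [x / total for x in v]


def verify_candidates(p, q, candidates, rng=random):
    """Test sibling candidates drawn from q without replacement.

    p: target distribution, q: draft distribution (lists of floats).
    Returns (token, accepted).
    """
    p, q = list(p), list(q)
    for x in candidates:
        if q[x] <= 0.0:
            continue  # already masked or impossible under the draft
        if rng.random() < min(1.0, p[x] / q[x]):
            return x, True
        # Rejected: subtract the two distributions element-wise.
        p = normalize([max(0.0, pi - qi) for pi, qi in zip(p, q)])
        # Mask the rejected token, since later candidates were drawn
        # without replacement, then renormalize the draft.
        q[x] = 0.0
        if sum(q) == 0.0:
            break
        q = normalize(q)
    # Every candidate was rejected: sample from the final residual.
    return rng.choices(range(len(p)), weights=p, k=1)[0], False
```

Masking only the draft side keeps the acceptance ratio p[x] / q[x] consistent with how the remaining candidates were actually drawn.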

All the experimental results we provided were under the greedy decoding setting and are not affected.