Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**