dilab-zju / self-speculative-decoding

Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**

