Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Home Page: https://sites.google.com/view/medusa-llm