Code for QuaRot, an end-to-end 4-bit inference scheme for large language models.
Home Page: https://arxiv.org/abs/2404.00456