intel / auto-round

State-of-the-art weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
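
To make the idea named in the description concrete, here is a minimal pure-Python sketch of learning a per-weight rounding offset with signed gradient descent. It is an illustration under stated assumptions, not the repository's code: the function names (`quantize`, `output_error`, `signed_gd_rounding`), the symmetric per-tensor scale, the straight-through treatment of rounding, the offset range of [-0.5, 0.5], and the step size of `1 / steps` are all choices made here for the example.

```python
import random


def quantize(w, scale, v, qmin, qmax):
    """Quantize then dequantize each weight, adding a rounding offset v[j]."""
    return [scale * min(max(round(wi / scale + vi), qmin), qmax)
            for wi, vi in zip(w, v)]


def output_error(w, wq, xs):
    """Sum of squared differences between full-precision and quantized outputs."""
    return sum(sum((a - b) * xj for a, b, xj in zip(w, wq, x)) ** 2 for x in xs)


def sign(g):
    """Return -1, 0, or 1 according to the sign of g."""
    return (g > 0) - (g < 0)


def signed_gd_rounding(w, xs, bits=4, steps=200):
    """Hypothetical helper: tune rounding offsets for one weight vector.

    Returns (best offsets, round-to-nearest error, best error found).
    """
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    scale = max(abs(wi) for wi in w) / qmax  # assumes w is not all zeros
    lr = 1.0 / steps                         # assumed step size
    v = [0.0] * len(w)                       # v = 0 is plain round-to-nearest
    rtn_loss = output_error(w, quantize(w, scale, v, qmin, qmax), xs)
    best_v, best_loss = list(v), rtn_loss
    for _ in range(steps):
        wq = quantize(w, scale, v, qmin, qmax)
        # Per-sample output error (w - wq) . x
        errs = [sum((a - b) * xj for a, b, xj in zip(w, wq, x)) for x in xs]
        # Straight-through estimator: treat d(wq[j]) / d(v[j]) as `scale`.
        grads = [-2.0 * scale * sum(e * x[j] for e, x in zip(errs, xs))
                 for j in range(len(w))]
        # Signed gradient step: only the sign of each gradient is used.
        v = [min(max(vj - lr * sign(g), -0.5), 0.5) for vj, g in zip(v, grads)]
        loss = output_error(w, quantize(w, scale, v, qmin, qmax), xs)
        if loss < best_loss:  # keep the best offsets seen so far
            best_v, best_loss = list(v), loss
    return best_v, rtn_loss, best_loss


random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(8)]
inputs = [[random.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(16)]
offsets, rtn_error, tuned_error = signed_gd_rounding(weights, inputs)
print("round-to-nearest error:", rtn_error)
print("tuned error:", tuned_error)
```

Because the best offsets are tracked starting from the all-zero offset, the tuned error in this sketch can never be worse than plain round-to-nearest on the sample inputs; whether it is strictly better depends on the data.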

Home Page: https://arxiv.org/abs/2309.05516
