[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"
Home Page: https://arxiv.org/abs/2310.08041