spcl / QuaRot

Code for QuaRot, an end-to-end 4-bit inference scheme for large language models.

Home Page: https://arxiv.org/abs/2404.00456

Questions about the rotation

Gloria2tt opened this issue

Thanks for the wonderful work; however, I have a problem with the code.

I've encountered an issue with the implementation that seems to contradict a claim in the Introduction of your paper: the rotation process, as stated, should not alter the model's outputs. However, when I ran the provided code, I obtained results that contradict this claim.
After removing the quantization part and retaining only the rotation function, I evaluated the model on the WikiText dataset and observed significantly degraded perplexity. I also inspected the model's generations, as others have done in related issues:

[screenshot: model output after rotation]

The model began to generate nonsensical words. This behavior has also been reported by others in the issues section of your repository. Could you please explain this discrepancy?
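For context, the property in question can be checked in isolation. Below is a minimal, self-contained sketch (not QuaRot's actual code) of the computational-invariance idea: if an orthogonal rotation Q is absorbed into the weight that writes into the hidden representation and Qᵀ into the weight that reads from it, the end-to-end output is unchanged.

```python
import torch

torch.manual_seed(0)
d = 16

# Two adjacent linear layers (no bias for simplicity): W_in writes into the
# hidden representation, W_out reads from it.
W_in = torch.randn(d, d, dtype=torch.float64)
W_out = torch.randn(d, d, dtype=torch.float64)
x = torch.randn(d, dtype=torch.float64)

# Any orthogonal Q works for this check; QuaRot uses (randomized) Hadamard matrices.
Q, _ = torch.linalg.qr(torch.randn(d, d, dtype=torch.float64))

# Original computation.
y_ref = W_out @ (W_in @ x)

# Rotated weights: the hidden activation becomes Q^T (W_in x), and W_out absorbs Q,
# so Q^T and Q cancel end to end.
W_in_rot = Q.T @ W_in
W_out_rot = W_out @ Q
y_rot = W_out_rot @ (W_in_rot @ x)

print(torch.allclose(y_ref, y_rot))  # True: rotation alone leaves the output unchanged
```

Note that in QuaRot this equivalence only holds once the RMSNorm scales have been fused into the adjacent weights before rotating; if rotation without quantization degrades perplexity, one possible cause is that the rotation was applied without that fusion step (or to only one weight of a pair), which breaks the Q·Qᵀ cancellation shown above.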

Thanks @Gloria2tt for your issue.

I'm not sure I fully understand the problem. Could you please share your code/config so I can reproduce the issue?

Thanks