mit-han-lab / smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Home Page: https://arxiv.org/abs/2211.10438


How to reproduce the result with lm-evaluation-harness

Ther-nullptr opened this issue · comments

commented

I noticed that in the newest commit you mention that all results are now obtained with lm-evaluation-harness. Could you show how to evaluate the model with that framework?
I tried to evaluate the SmoothQuant model directly with lm-evaluation-harness, like this:

python main.py --model hf-causal --model_args pretrained=mit-han-lab/opt-1.3b-smoothquant --tasks lambada_openai --device cuda:0

but only got this error:

RuntimeError: Error(s) in loading state_dict for OPTForCausalLM:
        size mismatch for model.decoder.layers.0.self_attn.k_proj.bias: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([2048]).
        size mismatch for model.decoder.layers.0.self_attn.v_proj.bias: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([2048]).
        size mismatch for model.decoder.layers.0.self_attn.q_proj.bias: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([2048]).
        size mismatch for model.decoder.layers.0.self_attn.out_proj.bias: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([2048]).
...
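For context, the shape mismatch appears to come from loading the INT8 checkpoint with the stock transformers OPTForCausalLM: the mit-han-lab/*-smoothquant checkpoints seem to store parameters for this repo's custom INT8 modules (e.g. biases of shape [1, out_features]), so lm-evaluation-harness's hf-causal loader cannot map them onto the FP16 model. Below is a minimal sketch of loading the checkpoint with the repo's Int8OPTForCausalLM instead, based on what I understand from the real-INT8 demo notebook (the class name, from_pretrained arguments, and the requirement of a CUDA GPU with torch-int installed are assumptions on my part, not a confirmed recipe):

```python
# Minimal sketch, assuming the smoothquant package from this repo and torch-int
# are installed and a CUDA GPU is available. Not the authors' evaluation script.
import torch
from transformers import AutoTokenizer
from smoothquant.opt import Int8OPTForCausalLM  # repo's custom INT8 OPT implementation

# Tokenizer comes from the original FP16 model; the quantized checkpoint only
# holds the INT8 weights.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = Int8OPTForCausalLM.from_pretrained(
    "mit-han-lab/opt-1.3b-smoothquant",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Quick sanity check that the INT8 checkpoint loads and runs a forward pass.
inputs = tokenizer("SmoothQuant is", return_tensors="pt").to("cuda")
with torch.no_grad():
    logits = model(input_ids=inputs["input_ids"]).logits
print(logits.shape)
```

If that is the intended loading path, I assume evaluating with lm-evaluation-harness would then require wrapping this model in a custom model class rather than using `--model hf-causal`, but it would be great if you could confirm the correct procedure.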