mit-han-lab / smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Home Page: https://arxiv.org/abs/2211.10438


How to reproduce the result with lm-evaluation-harness

Ther-nullptr opened this issue · comments

commented

I noticed that in the newest commit you mention that all results are now obtained with lm-evaluation-harness. Could you show how to evaluate the model with that framework?
I tried to evaluate the SmoothQuant model directly with lm-evaluation-harness, like this:

python main.py --model hf-causal --model_args pretrained=mit-han-lab/opt-1.3b-smoothquant --tasks lambada_openai --device cuda:0

but only got this error:

RuntimeError: Error(s) in loading state_dict for OPTForCausalLM:
        size mismatch for model.decoder.layers.0.self_attn.k_proj.bias: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([2048]).
        size mismatch for model.decoder.layers.0.self_attn.v_proj.bias: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([2048]).
        size mismatch for model.decoder.layers.0.self_attn.q_proj.bias: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([2048]).
        size mismatch for model.decoder.layers.0.self_attn.out_proj.bias: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([2048]).
...
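For context, the shape mismatch appears to come from loading the INT8 checkpoint with the stock transformers OPTForCausalLM: the mit-han-lab/*-smoothquant checkpoints seem to store parameters for this repo's custom INT8 modules (e.g. biases of shape [1, out_features]), so lm-evaluation-harness's hf-causal loader cannot map them onto the FP16 model. Below is a minimal sketch of loading the checkpoint with the repo's Int8OPTForCausalLM instead, based on what I understand from the real-INT8 demo notebook (the class name, from_pretrained arguments, and the requirement of a CUDA GPU with torch-int installed are assumptions on my part, not a confirmed recipe):

```python
# Minimal sketch, assuming the smoothquant package from this repo and torch-int
# are installed and a CUDA GPU is available. Not the authors' evaluation script.
import torch
from transformers import AutoTokenizer
from smoothquant.opt import Int8OPTForCausalLM  # repo's custom INT8 OPT implementation

# Tokenizer comes from the original FP16 model; the quantized checkpoint only
# holds the INT8 weights.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = Int8OPTForCausalLM.from_pretrained(
    "mit-han-lab/opt-1.3b-smoothquant",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Quick sanity check that the INT8 checkpoint loads and runs a forward pass.
inputs = tokenizer("SmoothQuant is", return_tensors="pt").to("cuda")
with torch.no_grad():
    logits = model(input_ids=inputs["input_ids"]).logits
print(logits.shape)
```

If that is the intended loading path, I assume evaluating with lm-evaluation-harness would then require wrapping this model in a custom model class rather than using `--model hf-causal`, but it would be great if you could confirm the correct procedure.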