Error(s) in loading state_dict for Transformer
Nikita-Sherstnev opened this issue · comments
Nikita commented
I am running the preparation script for CodeLlama: ./scripts/prepare.sh codellama/CodeLlama-13b-Instruct-hf
and I got the following error:
RuntimeError: Error(s) in loading state_dict for Transformer:
size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
size mismatch for output.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
Horace He commented
I think model.py just doesn't have a config for CodeLlama-13B. You probably just need to add a config here: https://github.com/pytorch-labs/gpt-fast/blob/main/model.py#L53
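For reference, a config entry along these lines might work (a sketch only: the field names mirror the `transformer_configs` dict in gpt-fast's model.py, and the CodeLlama-13B hyperparameters below are assumptions that should be checked against the checkpoint's config.json):

```python
# Sketch of a config entry for CodeLlama-13b-Instruct-hf, in the style of
# the transformer_configs dict in gpt-fast's model.py. The hyperparameter
# values are assumptions; verify them against the HF checkpoint's config.json.
transformer_configs = {
    "CodeLlama-13b-Instruct-hf": dict(
        vocab_size=32016,   # matches the checkpoint shape in the error above
        n_layer=40,
        n_head=40,
        dim=5120,           # matches the second dimension in the error above
        rope_base=1000000,  # CodeLlama uses a larger RoPE base than Llama 2
    ),
}

# The size mismatch came from the embedding rows: checkpoint 32016 vs default 32000.
cfg = transformer_configs["CodeLlama-13b-Instruct-hf"]
print(cfg["vocab_size"], cfg["dim"])  # 32016 5120
```

The key point is that gpt-fast selects a config by matching the checkpoint name against the keys of this dict, so the new key must appear in the model path being prepared.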
Nikita commented
Thank you, this helped.