pytorch-labs / gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Error(s) in loading state_dict for Transformer

Nikita-Sherstnev opened this issue · comments

I am running the preparation script for CodeLlama: ./scripts/prepare.sh codellama/CodeLlama-13b-Instruct-hf
and I got the following error:

RuntimeError: Error(s) in loading state_dict for Transformer:
	size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
	size mismatch for output.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).

I think model.py just doesn't have a config for CodeLlama-13B; you probably just need to add one here: https://github.com/pytorch-labs/gpt-fast/blob/main/model.py#L53
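As a sketch of what such an entry might look like: gpt-fast selects model hyperparameters from a dict of named configs in model.py, so a fix would be adding a `CodeLlama-13b-Instruct-hf` entry there. The values below are assumptions based on the Llama-2-13B architecture and on the shapes in the error message (embedding dim 5120, vocab 32016); double-check them against the checkpoint's config.json before relying on them.

```python
# Hypothetical config entry mirroring the style of the transformer_configs
# dict in gpt-fast's model.py. Values are assumptions, not verified:
# - dim=5120, n_layer=40, n_head=40 follow the standard Llama 13B shape
# - vocab_size=32016 matches the checkpoint shape [32016, 5120] from the
#   error above (CodeLlama adds extra special tokens over Llama's 32000)
transformer_configs = {
    "CodeLlama-13b-Instruct-hf": dict(
        block_size=16384,   # CodeLlama's long context window (assumed)
        vocab_size=32016,   # must match the checkpoint's embedding rows
        n_layer=40,
        n_head=40,
        dim=5120,
        rope_base=1000000,  # CodeLlama's rotary base (assumed)
    ),
}

# The reported size mismatch comes from vocab_size: the default Llama
# config uses 32000, so tok_embeddings.weight and output.weight end up
# [32000, 5120] while the checkpoint stores [32016, 5120].
cfg = transformer_configs["CodeLlama-13b-Instruct-hf"]
assert cfg["vocab_size"] == 32016 and cfg["dim"] == 5120
```

The config name should match the checkpoint directory name, since gpt-fast picks the config by substring-matching the model path against the dict keys.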

Thank you, this helped.