pytorch-labs / gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Error(s) in loading state_dict for Transformer

Nikita-Sherstnev opened this issue · comments

I am running the preparation script for CodeLlama: ./scripts/prepare.sh codellama/CodeLlama-13b-Instruct-hf
and I got the following error:

RuntimeError: Error(s) in loading state_dict for Transformer:
	size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
	size mismatch for output.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).

I think model.py just doesn't have a config for CodeLlama-13B; you probably just need to add one here: https://github.com/pytorch-labs/gpt-fast/blob/main/model.py#L53
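As a sketch of what such an entry might look like: gpt-fast selects model hyperparameters from a dict of named configs in model.py, so a fix would be adding a `CodeLlama-13b-Instruct-hf` entry there. The values below are assumptions based on the Llama-2-13B architecture and on the shapes in the error message (embedding dim 5120, vocab 32016); double-check them against the checkpoint's config.json before relying on them.

```python
# Hypothetical config entry mirroring the style of the transformer_configs
# dict in gpt-fast's model.py. Values are assumptions, not verified:
# - dim=5120, n_layer=40, n_head=40 follow the standard Llama 13B shape
# - vocab_size=32016 matches the checkpoint shape [32016, 5120] from the
#   error above (CodeLlama adds extra special tokens over Llama's 32000)
transformer_configs = {
    "CodeLlama-13b-Instruct-hf": dict(
        block_size=16384,   # CodeLlama's long context window (assumed)
        vocab_size=32016,   # must match the checkpoint's embedding rows
        n_layer=40,
        n_head=40,
        dim=5120,
        rope_base=1000000,  # CodeLlama's rotary base (assumed)
    ),
}

# The reported size mismatch comes from vocab_size: the default Llama
# config uses 32000, so tok_embeddings.weight and output.weight end up
# [32000, 5120] while the checkpoint stores [32016, 5120].
cfg = transformer_configs["CodeLlama-13b-Instruct-hf"]
assert cfg["vocab_size"] == 32016 and cfg["dim"] == 5120
```

The config name should match the checkpoint directory name, since gpt-fast picks the config by substring-matching the model path against the dict keys.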

Thank you, this helped.