epfLLM / meditron

Meditron is a suite of open-source medical Large Language Models (LLMs).

Home Page: https://huggingface.co/epfl-llm


load model size mismatch error

aaronlyt opened this issue · comments

Operations

I downloaded the model files from https://huggingface.co/epfl-llm/meditron-7b/tree/main
and then loaded the model with:
model = transformers.AutoModelForCausalLM.from_pretrained('./meditron-7b/', trust_remote_code=True, use_cache=True)

I get the error:

size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([32000, 4096]) from checkpoint, the shape in current model is torch.Size([32017, 4096]).
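An error like this means the tensors stored in the checkpoint disagree with the shapes the model class builds from `config.json`. A minimal, framework-free sketch of that comparison (the shapes are taken from the error message above; the helper name is hypothetical):

```python
# Hypothetical helper: report parameters whose checkpoint shape differs
# from the shape the freshly constructed model expects.
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {param_name: (checkpoint_shape, model_shape)} for mismatches."""
    return {
        name: (ckpt, model_shapes[name])
        for name, ckpt in checkpoint_shapes.items()
        if name in model_shapes and model_shapes[name] != ckpt
    }

# Shapes from the error message above (illustrative only):
checkpoint = {"model.embed_tokens.weight": (32000, 4096)}
model = {"model.embed_tokens.weight": (32017, 4096)}
print(find_shape_mismatches(checkpoint, model))
# → {'model.embed_tokens.weight': ((32000, 4096), (32017, 4096))}
```

Here the checkpoint has a 32000-row embedding while the model expects 32017 rows, which points at a stale or mismatched set of weight files rather than a bug in the loading code.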

Package

The transformers version is 4.25.2.

Hi there! Thank you for sharing your error and info with us.
I tried to replicate your error, but the shape looks consistent on my end:

model.get_input_embeddings().weight.shape: torch.Size([32017, 4096])
model.get_input_embeddings().embedding_dim: 4096
model.get_input_embeddings().num_embeddings: 32017

Using the following code:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "epfl-llm/meditron-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_cache=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    use_cache=True,
    trust_remote_code=True,
    device_map="auto",
)

Did you try deleting the HF cache and re-downloading the model weights from HF?
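For reference, a sketch of where that cache usually lives (this assumes the default Hugging Face cache layout, `~/.cache/huggingface/hub` with directories named `models--<org>--<name>`; `HF_HOME` overrides the base path):

```shell
# Build the cache directory path for this model under the default HF layout.
repo="epfl-llm/meditron-7b"
cache_dir="${HF_HOME:-$HOME/.cache/huggingface}/hub/models--${repo//\//--}"
echo "Would delete: $cache_dir"
# rm -rf "$cache_dir"   # uncomment to actually clear this model's cached files
```

Removing only this model's directory avoids re-downloading every other cached model.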

Thanks for your reply. I re-downloaded the weights using AutoModelForCausalLM.from_pretrained, and it works now.
Supplementary explanation: my previous approach was to download the pytorch bin weights directly from HF and then load them, which did not work.
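That is consistent with a truncated or partial manual download. One way to rule that out is to compare each downloaded file's SHA-256 against the checksum shown on the model's HF "Files" page (a sketch; `sha256_of` is a hypothetical helper):

```python
# Hypothetical helper: hash a large weight file in chunks so the whole
# file never has to fit in memory, then compare against the published hash.
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

If a manually downloaded `pytorch_model-*.bin` shard's digest does not match the one listed on the HF page, re-downloading that shard (or letting `from_pretrained` manage the download, as above) fixes the load error.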