Mismatch in vocab_size between .bin files and .safetensors files
noahboegli opened this issue · comments
Hey!
I'm sorry if this isn't a real issue and it's just me misunderstanding something; I'm a novice rather than an expert in this field.
I'm trying to deploy the project according to your deployment guide.
However, since I don't have access to enough memory for the -70B version of the model, I want to use the `--load-8bit` parameter to enable model compression. (I should mention that I run the model on the CPU, with the `--device cpu` flag.)
When I use this, I get the following error:
```
ValueError: Trying to set a tensor of shape torch.Size([32000, 8192]) in "weight" (which has shape torch.Size([32017, 8192])), this look incorrect
```
If I look at the HF upload log, I see that there were two main uploads of the model:
- The first one with the .bin files, with the `vocab_size` value set to 32000
- The second one with the .safetensors files, with the `vocab_size` value set to 32017
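The mismatch the loader complains about can be sketched like this (the helper below is hypothetical, not part of the loading code; the `vocab_size` values and the 32000-row embedding are taken from the error above):

```python
import json

def check_vocab_size(config_path, embedding_rows):
    """Compare vocab_size in config.json against the row count of the
    checkpoint's embedding matrix, raising like the loader does on mismatch."""
    with open(config_path) as f:
        config = json.load(f)
    vocab_size = config["vocab_size"]
    if vocab_size != embedding_rows:
        # Mirrors the situation in the traceback: config says 32017 rows,
        # but the .bin checkpoint's embedding tensor only has 32000.
        raise ValueError(
            f"config vocab_size={vocab_size} but checkpoint embedding "
            f"has {embedding_rows} rows"
        )
    return vocab_size
```

With a config.json carrying `"vocab_size": 32017` and a .bin embedding of 32000 rows, this raises, matching the error above.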
My understanding is that to enable model compression, the .bin files are needed, and these no longer match the model configuration.
This is supported by the fact that manually editing the config.json file to set `vocab_size` back to 32000 allows the model to load properly with `--load-8bit`.
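For reference, the manual edit amounts to something like the following (a sketch; the path and the target value of 32000 come from the description above, and the helper name is my own):

```python
import json

def pin_vocab_size(config_path, new_size=32000):
    """Rewrite vocab_size in config.json so it matches the .bin checkpoint."""
    with open(config_path) as f:
        config = json.load(f)
    old_size = config["vocab_size"]
    config["vocab_size"] = new_size  # e.g. 32017 -> 32000
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return old_size, new_size
```

After running this against the downloaded model directory's config.json, loading with `--load-8bit` succeeded for me.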