EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.

Home Page: https://www.eleuther.ai


Merged models giving error during evaluation

monk1337 opened this issue

I have merged the models, and while running lm-evaluation-harness I am getting the following error:

```
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/miniconda3/envs/py3.10/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
  File "/workspace/axolotl/llm_exp/lm-evaluation-harness/lm_eval/__main__.py", line 341, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/workspace/axolotl/llm_exp/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/workspace/axolotl/llm_exp/lm-evaluation-harness/lm_eval/evaluator.py", line 180, in simple_evaluate
    lm = lm_eval.api.registry.get_model(model).create_from_arg_string(
  File "/workspace/axolotl/llm_exp/lm-evaluation-harness/lm_eval/api/model.py", line 134, in create_from_arg_string
    return cls(**args, **args2)
  File "/workspace/axolotl/llm_exp/lm-evaluation-harness/lm_eval/models/huggingface.py", line 202, in __init__
    self._create_model(
  File "/workspace/axolotl/llm_exp/lm-evaluation-harness/lm_eval/models/huggingface.py", line 540, in _create_model
    self._model = self.AUTO_MODEL_CLASS.from_pretrained(
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3562, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3989, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/transformers/modeling_utils.py", line 813, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 345, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([32000, 4096]) in "weight" (which has shape torch.Size([32002, 4096])), this look incorrect.
```
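For context, the failure happens while the harness loads the model, e.g. with an invocation along these lines (model path and task are hypothetical, not the exact command used here):

```shell
lm_eval --model hf \
    --model_args pretrained=path/to/merged-model \
    --tasks hellaswag
```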

Hi!

Might need some more information to help here, but it looks like there are 2 added tokens in your model's vocabulary, and either your saved model hasn't retained these 2 extra tokens, or its config file isn't set up to expect them.
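A minimal way to check this, assuming a hypothetical local path for the merged checkpoint, is to compare the vocabulary size the config declares against what the tokenizer actually provides:

```python
# Sketch: compare the vocab size the config expects with the tokenizer's size.
from transformers import AutoConfig, AutoTokenizer

model_path = "path/to/merged-model"  # hypothetical path

config = AutoConfig.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# A mismatch here (e.g. 32002 vs 32000) corresponds to the shape error above.
print("config.vocab_size:", config.vocab_size)
print("len(tokenizer):   ", len(tokenizer))
```

If the extra tokens are meant to be kept, the usual fix is to resize the embeddings with `model.resize_token_embeddings(len(tokenizer))` before saving the merged model.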

Hi, the error was due to a padding token added during the merging of the models. I removed those tokens and set the vocab size back to 32000; it's working fine now.
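For anyone hitting the same thing, here is a minimal sketch of that fix, assuming the saved weights already have 32000 rows and only the config still declares the larger vocabulary (the model path is hypothetical):

```python
# Sketch: align the config's vocab_size with the embedding matrix actually
# stored in the checkpoint.
from transformers import AutoConfig

model_path = "path/to/merged-model"  # hypothetical path

config = AutoConfig.from_pretrained(model_path)
config.vocab_size = 32000  # match the [32000, 4096] embedding in the checkpoint
config.save_pretrained(model_path)
```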