THUDM / GLM

GLM (General Language Model)


Running the seq2seq finetune script with GLM-10B-Chinese fails: word_embeddings.weight dimension mismatch

webYFDT opened this issue · comments

commented

Using GLM-10B-Chinese as the warm-start model, the Seq2Seq finetune demo script from the README fails with an error.

Model: (screenshot in original issue)

Launch script: (screenshot in original issue)

bash scripts/ds_finetune_seq2seq.sh \
   config_tasks/model_blocklm_10B.sh \
   config_tasks/seq_cnndm_org.sh

Error message:
Traceback (most recent call last):
  File "finetune_glm.py", line 470, in <module>
    main(args)
  File "/root/workspace/env_run/GLM_code_model/GLM-main/tasks/seq2seq/finetune.py", line 147, in main
    finetune(args, train_valid_datasets_provider, {}, end_of_epoch_callback_provider=metrics_func_provider,
  File "/root/workspace/env_run/GLM_code_model/GLM-main/finetune_glm.py", line 379, in finetune
    load_pretrained(model, args.load_pretrained, args, task_tokens=task_tokens)
  File "/root/workspace/env_run/GLM_code_model/GLM-main/train_utils.py", line 55, in load_pretrained
    missing_keys, unexpected_keys = model.load_state_dict(sd['module'], strict=False)
  File "/root/workspace/env_run/GLM_code_model/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for GLMModel:
	size mismatch for word_embeddings.weight: copying a param with shape torch.Size([50048, 4096]) from checkpoint, the shape in current model is torch.Size([50304, 4096]).
This error seems to mean that the embedding layer of the GLM-10B-Chinese checkpoint does not match the dimensions of the constructed model, so the checkpoint cannot be loaded. How is the [50304, 4096] shape in the model computed? A global search of the codebase does not turn up the value 50304 anywhere; is it related to the vocab size? Is something misconfigured, and how can this error be fixed?
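One likely explanation for where 50304 comes from (an assumption based on common Megatron-style training conventions, not confirmed anywhere in this thread): such codebases pad the tokenizer's vocab size up to a multiple of a `--make-vocab-size-divisible-by` value (often 128) times the model-parallel size, so the padded embedding rows split evenly across GPUs. A GPT-2-style English vocab of 50257 pads to 50304, while a Chinese vocab of ~50000 pads to 50048, which matches both shapes in the error. The helper name `pad_vocab_size` below is hypothetical, written only to illustrate the arithmetic:

```python
import math

def pad_vocab_size(vocab_size, divisible_by=128, model_parallel_size=1):
    # Hypothetical illustration of Megatron-style vocab padding:
    # round vocab_size up to the next multiple of
    # divisible_by * model_parallel_size.
    multiple = divisible_by * model_parallel_size
    return int(math.ceil(vocab_size / multiple) * multiple)

# GPT-2-style English tokenizer (50257 tokens) -> the model's 50304 rows
print(pad_vocab_size(50257))  # 50304
# A ~50000-token Chinese tokenizer -> the checkpoint's 50048 rows
print(pad_vocab_size(50000))  # 50048
```

If this reading is right, the mismatch suggests the launch script is building the model with the English tokenizer config rather than the Chinese one, so checking which tokenizer/model config the chosen `model_blocklm_*.sh` file selects would be the first thing to verify.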

How was this problem eventually resolved?