Unexpected key(s) in state_dict: "cond_stage_model.transformer.text_model.embeddings.position_ids".
sooxp opened this issue · comments
Hi,
Thanks for making the project code open source!
When I ran scripts/inference_any_image_pose.sh, I got the following error:
Loaded model config from [model_lib/ControlNet/models/cldm_v15_reference_only_pose.yaml]
Total base parameters 2288.11M
find model state dict from pretrained_weights/model_state-110000.th ...
Loading model state dict from pretrained_weights/model_state-110000.th ...
Traceback (most recent call last):
File "/data/workstation/MagicDance/test_any_image_pose.py", line 577, in <module>
main(args)
File "/data/workstation/MagicDance/test_any_image_pose.py", line 371, in main
load_state_dict(model, args.image_pretrain_dir,strict=True)
File "/data/workstation/MagicDance/test_any_image_pose.py", line 126, in load_state_dict
model.load_state_dict(state_dict, strict=strict)
File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ControlLDMReferenceOnlyPose:
Unexpected key(s) in state_dict: "cond_stage_model.transformer.text_model.embeddings.position_ids".
[2024-02-18 05:16:42,066] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 34957) of binary: /home/user01/miniconda3/envs/dpe/bin/python3
Traceback (most recent call last):
File "/home/user01/miniconda3/envs/dpe/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
run(args)
File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/user01/miniconda3/envs/dpe/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
test_any_image_pose.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2024-02-18_05:16:42
host : user01-wt
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 34957)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
What's wrong?
Thx.
Hi,
Can you double-check whether the versions of your diffusers and transformers packages match those in environment.yml?
If they do, you can also set strict to False. It shouldn't affect the result, since our pipeline never passes any text to the "text_model".
Let me know if you have any further questions.
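For readers who prefer to keep strict=True, one alternative is to drop the unexpected keys from the checkpoint before loading it. The sketch below assumes the only mismatch is a key such as "cond_stage_model.transformer.text_model.embeddings.position_ids" that the installed model no longer expects. drop_unexpected_keys is a hypothetical helper, not part of the MagicDance code, and it works on any mapping.

```python
from collections.abc import Iterable, Mapping
from typing import Any


def drop_unexpected_keys(
    state_dict: Mapping[str, Any], expected_keys: Iterable[str]
) -> tuple[dict[str, Any], list[str]]:
    """Return (filtered_state_dict, dropped_keys).

    Keeps only the entries whose key the model expects, and reports
    the keys that were removed so they can be inspected.
    """
    expected = set(expected_keys)
    filtered = {k: v for k, v in state_dict.items() if k in expected}
    dropped = [k for k in state_dict if k not in expected]
    return filtered, dropped


# Hypothetical usage with a PyTorch model (assumes `model` and `state_dict`
# are already defined, as in test_any_image_pose.py):
#   filtered, dropped = drop_unexpected_keys(state_dict, model.state_dict().keys())
#   print("ignored keys:", dropped)
#   model.load_state_dict(filtered, strict=True)
```

Printing the dropped keys before loading makes it easy to confirm that nothing other than position_ids is being discarded. Missing keys are still reported by strict=True.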
Solved. The problem was the version of transformers.
Thx
I have checked the versions of diffusers and transformers, and I also tried setting strict to False, but the error above still appears.
Same question. My versions of transformers, diffusers, and PyTorch are as follows.
(magicpose) 10-76-1-24% pip show transformers
Name: transformers
Version: 4.22.1
(magicpose) 10-76-1-24% pip show diffusers
Name: diffusers
Version: 0.11.1
(magicpose) 10-76-1-24% pip show torch
Name: torch
Version: 1.13.1
After I deleted everything and downloaded it again from scratch, it worked. Thank you.
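To compare an environment against environment.yml without reading pip show output by hand, a small check like the following may help. It is only a sketch: installed_versions and find_mismatches are hypothetical helpers, and the expected versions must be copied from the repository's environment.yml (they are deliberately not filled in here).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every package that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


# Hypothetical usage; replace the placeholders with the pins from environment.yml:
#   expected = {"transformers": "<pinned version>", "diffusers": "<pinned version>"}
#   print(find_mismatches(expected, installed_versions(expected)))
```

An empty result means every listed package matches its pin. Any other result shows which package to reinstall.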
Mentioning environment.yml in the README could be helpful.