RuntimeError: Error(s) in loading state_dict for ft_net_swin

Question

saminheydarian97 opened this issue 10 months ago

Drowsiness_team_EH commented 10 months ago

Hi.
I used ft_net_swin to load the model. The code runs without problems on my own machine, but on Kaggle I get this error:

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:2041, in Module.load_state_dict(self, state_dict, strict)
   2036         error_msgs.insert(
   2037             0, 'Missing key(s) in state_dict: {}. '.format(
   2038                 ', '.join('"{}"'.format(k) for k in missing_keys)))
   2040 if len(error_msgs) > 0:
-> 2041     raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2042                        self.__class__.__name__, "\n\t".join(error_msgs)))
   2043 return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for ft_net_swin:
	Missing key(s) in state_dict: "model.layers.3.downsample.norm.weight", "model.layers.3.downsample.norm.bias", "model.layers.3.downsample.reduction.weight". 
	Unexpected key(s) in state_dict: "model.layers.0.downsample.norm.weight", "model.layers.0.downsample.norm.bias", "model.layers.0.downsample.reduction.weight". 
	size mismatch for model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
	size mismatch for model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
	size mismatch for model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
	size mismatch for model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
	size mismatch for model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
	size mismatch for model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).

Zhedong Zheng · Answer 1 · Tue Sep 05 2023 19:27:41 GMT+0800 (China Standard Time)

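The checkpoint and the current model have different structures in the downsample layers: the checkpoint stores them under layers.0 to layers.2, while the current model expects them under layers.1 to layers.3.
The likely cause is a different version of the timm package in the two environments, as in #334; newer timm versions changed the Swin model. You can install the latest timm with: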

pip install git+https://github.com/rwightman/pytorch-image-models.git
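
To confirm that the two environments really disagree on the layer layout, it can help to compare the parameter names and shapes directly. The following is only a minimal sketch: `diff_state_dicts` is a hypothetical helper, not part of timm or this repository, and the toy dictionaries use only names and shapes quoted in the error above (shapes the error does not report are left as `None`). With PyTorch, such a mapping could be built from a real state dict as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def diff_state_dicts(ckpt_shapes, model_shapes):
    """Compare two {parameter name: shape} mappings.

    Hypothetical helper: returns the same three groups that
    load_state_dict reports (missing, unexpected, size mismatch).
    """
    ckpt_keys, model_keys = set(ckpt_shapes), set(model_shapes)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing": sorted(model_keys - ckpt_keys),
        # Present in the checkpoint but unknown to the model.
        "unexpected": sorted(ckpt_keys - model_keys),
        # Present in both, but with different shapes.
        "size_mismatch": sorted(
            k for k in ckpt_keys & model_keys
            if ckpt_shapes[k] != model_shapes[k]
        ),
    }


# Toy data taken from the error message; None marks shapes it does not report.
ckpt = {
    "model.layers.0.downsample.norm.weight": None,
    "model.layers.1.downsample.norm.weight": (1024,),
}
model = {
    "model.layers.1.downsample.norm.weight": (512,),
    "model.layers.3.downsample.norm.weight": None,
}

report = diff_state_dicts(ckpt, model)
for group, names in report.items():
    print(group, names)
```

If the report shows downsample keys shifted by one stage, as here, the checkpoint and the model were most likely built with different timm versions.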

I suggest re-training your model in the same environment that you use to load it.
That usually works.
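
If re-training is not practical, one possible workaround, which is not confirmed in this thread, is to rename the checkpoint keys so that each `downsample` entry moves one stage later, matching the shift visible in the error message. The sketch below assumes that this shift is the only difference between the two layouts. `shift_downsample_keys` is a hypothetical helper, and other keys may also have changed between timm versions, so loading may still fail.

```python
import re

# Matches the stage index in names such as "model.layers.0.downsample.norm.weight".
_DOWNSAMPLE = re.compile(r"layers\.(\d+)\.downsample")


def shift_downsample_keys(state_dict):
    """Return a copy with every 'layers.N.downsample' key renamed to 'layers.N+1.downsample'.

    Hypothetical helper; assumes the checkpoint uses the older layout seen in
    the error message (downsample stored under layers.0 to layers.2).
    """
    return {
        _DOWNSAMPLE.sub(
            lambda m: f"layers.{int(m.group(1)) + 1}.downsample", name
        ): value
        for name, value in state_dict.items()
    }


# Toy example: the values stand in for tensors.
old = {
    "model.layers.0.downsample.norm.weight": "w0",
    "model.layers.0.blocks.0.norm1.weight": "b0",
}
new = shift_downsample_keys(old)
print(new)
```

The remapped dictionary would then be passed to `load_state_dict` in place of the original one. Keys that do not contain `downsample` are left unchanged.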