LTH14 / mage

A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

error

luodi789567890 opened this issue

I trained a model on ImageNet-100, but when I try to load the checkpoint, I get a dimension mismatch error. Can I add you on QQ to ask for your help? My QQ number is 2715128882.

Unfortunately, I don't have QQ. You can post your problem here and I'll do my best to help you.

My WeChat: 18885257528

The error:

(mage) root@autodl-container-58da118cfa-5e89f4b9:~/autodl-tmp/mage-main# python gen_img_uncond.py --temp 6.0 --num_iter 20 --ckpt /root/autodl-tmp/mage-main/output_dir/checkpoint-80.pth --batch_size 32 --num_images 500 --model mage_vit_base_patch16 --output_dir /root/autodl-tmp/mage-main/output_dir/fid/gen/mage-vitb
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
Strict load
Restored from vqgan_jax_strongaug.ckpt
Traceback (most recent call last):
File "gen_img_uncond.py", line 123, in
model.load_state_dict(checkpoint['model'],strict=False)
File "/root/miniconda3/envs/mage/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for MaskedGenerativeEncoderViT:
size mismatch for cls_token: copying a param with shape torch.Size([1, 1, 1024]) from checkpoint, the shape in current model is torch.Size([1, 1, 768]).
size mismatch for pos_embed: copying a param with shape torch.Size([1, 257, 1024]) from checkpoint, the shape in current model is torch.Size([1, 257, 768]).
size mismatch for mask_token: copying a param with shape torch.Size([1, 1, 1024]) from checkpoint, the shape in current model is torch.Size([1, 1, 768]).
size mismatch for decoder_pos_embed: copying a param with shape torch.Size([1, 257, 1024]) from checkpoint, the shape in current model is torch.Size([1, 257, 768]).
size mismatch for decoder_pos_embed_learned: copying a param with shape torch.Size([1, 257, 1024]) from checkpoint, the shape in current model is torch.Size([1, 257, 768]).
size mismatch for token_emb.word_embeddings.weight: copying a param with shape torch.Size([2025, 1024]) from checkpoint, the shape in current model is torch.Size([2025, 768]).
size mismatch for token_emb.position_embeddings.weight: copying a param with shape torch.Size([257, 1024]) from checkpoint, the shape in current model is torch.Size([257, 768]).
size mismatch for token_emb.LayerNorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for token_emb.LayerNorm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for patch_embed.proj.weight: copying a param with shape torch.Size([1024, 3, 16, 16]) from checkpoint, the shape in current model is torch.Size([768, 3, 16, 16]).
size mismatch for patch_embed.proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for blocks.0.norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for blocks.0.norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for blocks.0.attn.qkv.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([2304, 768]).
size mismatch for blocks.0.attn.qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([2304]).
size mismatch for blocks.0.attn.proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
size mismatch for blocks.0.attn.proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for blocks.0.norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for blocks.0.norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for blocks.0.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).

I think you are loading a ViT-Large checkpoint into a ViT-Base model: every mismatched parameter has width 1024 in the checkpoint but 768 in the current model. Try setting --model mage_vit_large_patch16 in your gen_img_uncond.py arguments.
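If you're ever unsure which architecture a checkpoint was trained with, you can inspect a parameter's width before choosing --model. This is just a quick sketch, not code from the repo; the 'model' and 'cls_token' keys are taken from the error log above:

```python
import torch

# Load the checkpoint on CPU and read the encoder width from cls_token.
# The path is the one from the command above.
ckpt = torch.load("/root/autodl-tmp/mage-main/output_dir/checkpoint-80.pth",
                  map_location="cpu")
width = ckpt["model"]["cls_token"].shape[-1]
# 768 -> mage_vit_base_patch16, 1024 -> mage_vit_large_patch16
print(f"encoder width: {width}")
```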

Why is there no contrastive loss or total loss in your code?
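For reference, the MAGE paper describes a contrastive variant (MAGE-C) that adds a SimCLR-style InfoNCE loss on pooled encoder features to the reconstruction loss. Below is a minimal sketch of that idea, assuming two augmented views per image; all names are illustrative, not taken from this repo:

```python
import torch
import torch.nn.functional as F

# Sketch of a SimCLR-style InfoNCE loss; matched rows of z1/z2 are
# positive pairs (two augmented views of the same image).
def info_nce(z1, z2, temperature=0.2):
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature         # (N, N) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)    # positives on the diagonal

# Hypothetical total loss combining the two objectives:
# total_loss = reconstruction_loss + contrastive_weight * info_nce(z1, z2)
```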