Weights not matched

gshuangchun opened this issue 9 months ago

The low-resolution PyTorch weights linked for CholecT45 (cross-validation fold k1) do not seem to match the model (https://s3.unistra.fr/camma_public/github/rendezvous/rendezvous_l8_cholect45_crossval_k1_layernorm_lowres.pth). Loading them fails with the following errors:

size mismatch for decoder.mhma.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.0.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.1.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.1.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.2.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.2.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.3.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.3.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.4.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.4.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.5.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.5.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.6.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.6.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.7.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.mhma.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.0.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.1.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.1.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.2.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.2.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.3.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.3.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.4.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.4.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.5.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.5.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.6.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.6.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.7.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).

Esther-qian commented 7 months ago

Same question here.

sabinakaminska95 · Answer 1 · Mon Jun 10 2024 22:12:52 GMT+0800 (China Standard Time)

I have the same question.

sabinakaminska95 · Answer 2 · Tue Jun 11 2024 16:06:13 GMT+0800 (China Standard Time)

I solved it: add the --use_ln flag, because these weights use layer normalization (note "layernorm" in the filename).
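
For anyone hitting the same error, one quick check is to confirm that every mismatched key is a normalization parameter. The sketch below is a hypothetical helper (`summarize_mismatches` is not part of the repository) that parses the `size mismatch` lines from the error message with a regular expression and groups them by parameter suffix and shape. If only `ln.weight` and `ln.bias` show up, with a 3-D shape in the checkpoint and a 1-D shape in the model, the checkpoint was most likely saved with layer normalization enabled while the model was built without it.

```python
import re
from collections import Counter

# Matches one line of the size-mismatch report raised when loading a state dict.
PATTERN = re.compile(
    r"size mismatch for (?P<key>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]*)\]\)"
)


def _shape(text):
    """Turn '100, 8, 14' into (100, 8, 14)."""
    return tuple(int(d) for d in text.split(",") if d.strip())


def summarize_mismatches(log):
    """Count mismatches by (parameter suffix, checkpoint shape, model shape)."""
    summary = Counter()
    for m in PATTERN.finditer(log):
        suffix = ".".join(m["key"].split(".")[-2:])  # e.g. "ln.weight"
        summary[(suffix, _shape(m["ckpt"]), _shape(m["model"]))] += 1
    return summary


# Two lines copied from the error message in this thread.
log = """
size mismatch for decoder.mhma.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for decoder.ffnet.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
"""

for (suffix, ckpt, model), count in summarize_mismatches(log).items():
    print(f"{suffix}: checkpoint {ckpt} vs model {model} ({count}x)")
```

Pasting the full error text into `log` gives one summary row per distinct combination, which makes it easy to see that nothing outside the normalization layers is affected.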

Nwoye Chinedu · Answer 3 · Tue Jun 11 2024 21:32:38 GMT+0800 (China Standard Time)

Dear user,

Information on matching each set of weights to the right model configuration is provided in the README.md file.
Each weight filename describes the configuration it was trained with.
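
As an illustration of how a filename can be read, the sketch below splits a checkpoint name into underscore-separated tokens and looks for `layernorm`. The function `flags_from_filename` and the token-to-flag table are assumptions made for this example, based only on the filename and the `--use_ln` flag mentioned in this thread; the README remains the authoritative reference for the other tokens.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping from filename tokens to command-line flags.
# Only "layernorm" -> "--use_ln" is taken from this thread.
TOKEN_TO_FLAG = {"layernorm": "--use_ln"}


def flags_from_filename(url_or_path):
    """Return the flags implied by tokens in a checkpoint filename."""
    # Works for both full URLs and bare filenames.
    name = PurePosixPath(urlparse(url_or_path).path).stem
    tokens = name.split("_")
    return [TOKEN_TO_FLAG[t] for t in tokens if t in TOKEN_TO_FLAG]


url = (
    "https://s3.unistra.fr/camma_public/github/rendezvous/"
    "rendezvous_l8_cholect45_crossval_k1_layernorm_lowres.pth"
)
print(flags_from_filename(url))
```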

Thanks