SsisyphusTao / Object-Detection-Knowledge-Distillation

An object-detection knowledge-distillation framework powered by PyTorch, currently supporting SSD and YOLOv5.


The provided teacher checkpoint (VGG) doesn't match the current version of the VGG model in the repo.

TheLostIn opened this issue

    size mismatch for loc.0.weight: copying a param with shape torch.Size([16, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 1024, 3, 3]).
    size mismatch for loc.1.weight: copying a param with shape torch.Size([24, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([24, 512, 3, 3]).
    size mismatch for loc.2.weight: copying a param with shape torch.Size([24, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([24, 256, 3, 3]).
    size mismatch for loc.4.weight: copying a param with shape torch.Size([16, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([24, 256, 3, 3]).
    size mismatch for loc.4.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for loc.5.weight: copying a param with shape torch.Size([16, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([24, 256, 3, 3]).
    size mismatch for loc.5.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for conf.0.weight: copying a param with shape torch.Size([84, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([84, 1024, 3, 3]).
    size mismatch for conf.1.weight: copying a param with shape torch.Size([126, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([126, 512, 3, 3]).
    size mismatch for conf.2.weight: copying a param with shape torch.Size([126, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([126, 256, 3, 3]).
    size mismatch for conf.4.weight: copying a param with shape torch.Size([84, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([126, 256, 3, 3]).
    size mismatch for conf.4.bias: copying a param with shape torch.Size([84]) from checkpoint, the shape in current model is torch.Size([126]).
    size mismatch for conf.5.weight: copying a param with shape torch.Size([84, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([126, 256, 3, 3]).
    size mismatch for conf.5.bias: copying a param with shape torch.Size([84]) from checkpoint, the shape in current model is torch.Size([126]).
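
For context (not from this thread): PyTorch emits exactly this kind of message whenever load_state_dict is called in its default strict mode and a tensor in the checkpoint has a different shape than the corresponding parameter in the freshly built model. A self-contained toy reproduction of the first loc.0.weight mismatch, using plain Conv2d layers rather than the repo's model classes:

    import torch
    import torch.nn as nn

    # Toy stand-in for the released checkpoint: a head conv that expects 512 input channels.
    saved_layer = nn.Conv2d(512, 16, kernel_size=3, padding=1)
    checkpoint = saved_layer.state_dict()

    # Toy stand-in for the current model: the matching layer now expects 1024 input channels.
    current_layer = nn.Conv2d(1024, 16, kernel_size=3, padding=1)

    # Strict loading (the default) requires identical shapes, so this raises:
    # RuntimeError: ... size mismatch for weight: copying a param with shape
    # torch.Size([16, 512, 3, 3]) from checkpoint, the shape in current model is
    # torch.Size([16, 1024, 3, 3]).
    current_layer.load_state_dict(checkpoint)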

Yep, the provided model is the original SSD model, which has 8732 anchors. You need to fine-tune it first to get a VGG model with 3000 anchors, and then start the distillation.
See my blog post https://zhuanlan.zhihu.com/p/260370225 for more details.
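
One common way to start that fine-tuning run (a sketch under my own assumptions, not the repo's script) is to copy over only the checkpoint tensors whose names and shapes still match the 3000-anchor model, essentially the VGG backbone, and let the re-sized loc/conf heads train from their random initialization:

    import torch

    def load_compatible_weights(model, checkpoint_path):
        """Copy into `model` only those checkpoint tensors whose names and shapes
        match; mismatched parameters (the loc/conf heads here) keep their init."""
        checkpoint = torch.load(checkpoint_path, map_location="cpu")
        model_state = model.state_dict()
        compatible = {
            k: v for k, v in checkpoint.items()
            if k in model_state and v.shape == model_state[k].shape
        }
        model_state.update(compatible)
        model.load_state_dict(model_state)  # strict load succeeds now
        return model

    # Usage idea (model builder and paths are placeholders, not names from this repo):
    # partially load the released 8732-anchor checkpoint into the 3000-anchor VGG model,
    # fine-tune on your dataset until the detection heads converge, then use that
    # checkpoint as the teacher for distillation.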

Please, for how many epochs should we fine-tune the VGG model before running the distillation?

Thank you