roytseng-tw / Detectron.pytorch

A PyTorch implementation of Detectron. Both training from scratch and running inference directly from pretrained Detectron weights are available.

Finetune with VOC2007

Asif6511 opened this issue · comments

Hi!

I trained a network completely from scratch on the COCO2017 dataset with:

python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml --use_tfboard --bs=2

I now want to use the checkpoint (.pth file) produced by that from-scratch training to finetune on the VOC2007 dataset.

As expected, I ran into trouble because the numbers of classes in VOC (21) and COCO (81) differ. I understand it's possible to finetune, since there are steps given for finetuning on a custom dataset with a different number of classes. How do I do this?
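(For context: in Detectron-style configs the head sizes are driven by a NUM_CLASSES setting, counted as foreground classes plus one background class, so 81 for COCO and 21 for VOC. A hedged fragment, assuming this repo follows Detectron's config schema:)

```yaml
MODEL:
  # 20 VOC foreground classes + 1 background; the COCO baselines use 81.
  NUM_CLASSES: 21
```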

The command I used:
python tools/train_net_step.py --dataset voc2007 --cfg configs/baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml --load_ckpt=/home/deep/data/asif/Detectron/Detectron.pytorch/Outputs/e2e_mask_rcnn_R-50-FPN_1x/Mar14-14-59-32_deeppc_step/ckpt/model_step719999.pth --use_tfboard --bs=2

The error I got:
Traceback (most recent call last):
  File "tools/train_net_step.py", line 471, in <module>
    main()
  File "tools/train_net_step.py", line 331, in main
    net_utils.load_ckpt(maskRCNN, checkpoint['model'])
  File "/home/deep/data/asif/Detectron/Detectron.pytorch/lib/utils/net.py", line 163, in load_ckpt
    model.load_state_dict(state_dict, strict=False)
  File "/home/deep/anaconda3/envs/detectron/lib/python3.7/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Generalized_RCNN:
	size mismatch for Box_Outs.cls_score.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([21, 1024]).
	size mismatch for Box_Outs.cls_score.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([21]).
	size mismatch for Box_Outs.bbox_pred.weight: copying a param with shape torch.Size([324, 1024]) from checkpoint, the shape in current model is torch.Size([84, 1024]).
	size mismatch for Box_Outs.bbox_pred.bias: copying a param with shape torch.Size([324]) from checkpoint, the shape in current model is torch.Size([84]).
	size mismatch for Mask_Outs.classify.weight: copying a param with shape torch.Size([81, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([21, 256, 1, 1]).
	size mismatch for Mask_Outs.classify.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([21]).
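The mismatched keys are exactly the class-dependent output heads: cls_score is (num_classes, 1024), bbox_pred is (4 * num_classes, 1024) (hence 324 = 4 × 81 vs. 84 = 4 × 21), and the mask classifier is (num_classes, 256, 1, 1). One generic workaround is to drop every checkpoint tensor whose shape disagrees with the freshly built model before calling `load_state_dict(..., strict=False)`. A sketch of that idea (not code from this repo; `filter_mismatched` and `FakeTensor` are names I made up — with real PyTorch you would pass the two `state_dict()` mappings directly, since the comparison only touches `.shape`):

```python
from collections import namedtuple

# Stand-in for a real tensor; only .shape matters to the filtering logic.
FakeTensor = namedtuple("FakeTensor", "shape")

def filter_mismatched(ckpt_state, model_state):
    """Keep checkpoint entries whose shape matches the current model; drop the rest."""
    kept, dropped = {}, []
    for name, tensor in ckpt_state.items():
        if name in model_state and model_state[name].shape == tensor.shape:
            kept[name] = tensor
        else:
            dropped.append(name)
    return kept, dropped

# COCO head (81 classes) vs. VOC head (21 classes): the backbone weight
# matches, the classifier head does not, so only the backbone survives.
ckpt = {"backbone.w": FakeTensor((256, 256)),
        "Box_Outs.cls_score.weight": FakeTensor((81, 1024))}
model = {"backbone.w": FakeTensor((256, 256)),
         "Box_Outs.cls_score.weight": FakeTensor((21, 1024))}
kept, dropped = filter_mismatched(ckpt, model)
print(sorted(kept))   # ['backbone.w']
print(dropped)        # ['Box_Outs.cls_score.weight']
```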

Update:
1. Changed the number of classes in the config file to match the number of classes in the dataset.
2. Removed the output layers from the pretrained model, using this script: https://gist.github.com/wangg12/aea194aa6ab6a4de088f14ee193fd968

Works fine now!
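For anyone reading along, the linked gist's trick amounts to deleting the class-dependent output layers from the checkpoint by key name, so the remaining weights load cleanly and the new heads keep their random initialization. A minimal sketch of that idea (my approximation, not the gist's exact code; the key prefixes are taken from the size-mismatch error above):

```python
# Sketch: strip the class-dependent output heads from a Detectron.pytorch
# checkpoint so the rest can be loaded into a model built for a different
# number of classes. Prefixes come from the size-mismatch error above.
OUTPUT_HEAD_PREFIXES = (
    "Box_Outs.cls_score.",   # classification head: (num_classes, 1024)
    "Box_Outs.bbox_pred.",   # box regression head: (4 * num_classes, 1024)
    "Mask_Outs.classify.",   # mask classifier: (num_classes, 256, 1, 1)
)

def strip_output_heads(state_dict):
    """Return a copy of state_dict without the class-dependent heads."""
    return {k: v for k, v in state_dict.items()
            if not k.startswith(OUTPUT_HEAD_PREFIXES)}

# With torch you would do something like (untested sketch):
#   ckpt = torch.load("model_step719999.pth")
#   ckpt["model"] = strip_output_heads(ckpt["model"])
#   torch.save(ckpt, "model_step719999_noheads.pth")

sd = {"Conv_Body.res1.weight": 0,
      "Box_Outs.cls_score.weight": 1,
      "Box_Outs.bbox_pred.bias": 2}
print(sorted(strip_output_heads(sd)))  # ['Conv_Body.res1.weight']
```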

Can you provide this script?