Input size cannot be dynamic?
lucasjinreal opened this issue
I tried something like this:

```
python demo.py --resume weights/yolos_s_dWr.pth --data_file ../yolov7/images/COCO_val2014_000000001856.jpg --mid_pe_size 800 864 --init_pe_size 800 864
```
```
Not using distributed mode
Namespace(backbone_name='small_dWr', batch_size=2, bbox_loss_coef=5, clip_max_norm=0.1, coco_panoptic_path=None, coco_path=None, data_file='../yolo/images/COCO_val2014_000000001856.jpg', dataset_file='coco', decay_rate=0.1, det_token_num=100, device='cuda', dice_loss_coef=1, dist_url='env://', distributed=False, eos_coef=0.1, epochs=150, eval=False, eval_size=800, giou_loss_coef=2, init_pe_size=[800, 864], lr=0.0001, lr_backbone=1e-05, lr_drop=100, mid_pe_size=[800, 864], min_lr=1e-07, num_workers=2, output_dir='', pre_trained='', remove_difficult=False, resume='weights/yolos_s_dWr.pth', sched='warmupcos', seed=42, set_cost_bbox=5, set_cost_class=1, set_cost_giou=2, start_epoch=0, use_checkpoint=False, warmup_epochs=0, warmup_lr=1e-06, weight_decay=0.0001, world_size=1)
```
Got:
```
torch1.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Detector:
	size mismatch for backbone.pos_embed: copying a param with shape torch.Size([1, 1829, 330]) from checkpoint, the shape in current model is torch.Size([1, 2801, 330]).
	size mismatch for backbone.mid_pos_embed: copying a param with shape torch.Size([13, 1, 1829, 330]) from checkpoint, the shape in current model is torch.Size([13, 1, 2801, 330]).
```
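For what it's worth, the mismatched shapes seem to come straight from the positional-embedding arithmetic. A quick sanity check, assuming 16x16 patches, 100 [DET] tokens (matching `det_token_num=100` above), and one [CLS] token — treat these constants as my assumptions, and the helper as hypothetical:

```python
# Expected pos_embed token count for a given PE size.
# ASSUMPTIONS: 16x16 patches, 100 [DET] tokens, 1 [CLS] token.
def pos_embed_tokens(h, w, patch=16, det_tokens=100, cls_tokens=1):
    return (h // patch) * (w // patch) + det_tokens + cls_tokens

print(pos_embed_tokens(512, 864))  # 1829 -> the checkpoint's shape
print(pos_embed_tokens(800, 864))  # 2801 -> the shape my args produced
```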
The input sizes can be dynamic; we didn't test demo.py. Please directly try to train or run inference with YOLOS using the provided scripts.
@Yuxin-CV how? I got the above errors by specifying a different input size. demo.py is just copied from your coco_visualizexx.py, with the same args.
You should set `--init_pe_size 512 864` and `--mid_pe_size 512 864`. Note that these two params are not related to the input size; they are tied to the pre-trained YOLOS weights.
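That is, for the demo command above, something like:

```
python demo.py --resume weights/yolos_s_dWr.pth --data_file ../yolov7/images/COCO_val2014_000000001856.jpg --init_pe_size 512 864 --mid_pe_size 512 864
```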
@Yuxin-CV if I set the input to 800x800, what will happen? Does that mean it's misaligned with training?
Of course it will be misaligned with training, but the model can still process it, with degraded accuracy.
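To illustrate why off-size inputs still work: ViT-style detectors typically resample the patch positional embeddings to the new patch grid at run time. Below is a minimal sketch of that idea, assuming a `(1, extra + H*W, dim)` layout with the non-spatial [CLS]/[DET] embeddings stored first — the function name and layout are illustrative, not YOLOS's exact code:

```python
import torch
import torch.nn.functional as F

def resize_pos_embed(pos_embed, old_hw, new_hw, num_extra=101):
    """Bicubically resample patch position embeddings to a new patch grid.

    pos_embed: (1, num_extra + old_h*old_w, dim); the first `num_extra`
    entries are non-spatial token embeddings (here: 1 [CLS] + 100 [DET],
    an assumed layout for illustration).
    """
    extra, patch = pos_embed[:, :num_extra], pos_embed[:, num_extra:]
    dim = pos_embed.shape[-1]
    # (1, N, dim) -> (1, dim, old_h, old_w) so the grid can be treated as an image
    patch = patch.reshape(1, old_hw[0], old_hw[1], dim).permute(0, 3, 1, 2)
    patch = F.interpolate(patch, size=new_hw, mode="bicubic", align_corners=False)
    patch = patch.permute(0, 2, 3, 1).reshape(1, new_hw[0] * new_hw[1], dim)
    return torch.cat([extra, patch], dim=1)

# e.g. from the 512x864 training grid (32x54 patches) to an 800x800 input (50x50)
pe = torch.randn(1, 101 + 32 * 54, 330)  # matches the checkpoint: (1, 1829, 330)
print(resize_pos_embed(pe, (32, 54), (50, 50)).shape)  # torch.Size([1, 2601, 330])
```

The accuracy drop comes from the resampled embeddings only approximating positions the model never saw during training.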