[URGENT] Eval results are much lower than what's reported
encounter1997 opened this issue
Hi, thanks for the excellent work!
I followed the instructions in the README to evaluate the models provided in your repo. However, the APs I got for yolos_ti.pth, yolos_s_200_pre.pth, yolos_s_300_pre.pth, yolos_s_dWr.pth, and yolos_base.pth are 28.7, 12.5, 12.7, 13.2, and 13.8, respectively. While yolos_ti.pth matches the performance in your paper and log, the other four models are significantly lower than expected.
Any idea why this would happen? Thanks in advance!
For example, when evaluating the base model, I ran
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path ../data/coco --batch_size 2 --backbone_name base --eval --eval_size 800 --init_pe_size 800 1344 --mid_pe_size 800 1344 --resume ../trained_weights/yolos/yolos_base.pth
and expected to obtain 42.0 AP, as shown in your paper and log. However, the result is only 13.8 AP.
The complete evaluation output is shown below.
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
| distributed init (rank 0): env://
| distributed init (rank 2): env://
| distributed init (rank 3): env://
| distributed init (rank 1): env://
| distributed init (rank 6): env://
| distributed init (rank 5): env://
| distributed init (rank 7): env://
| distributed init (rank 4): env://
Namespace(backbone_name='base', batch_size=2, bbox_loss_coef=5, clip_max_norm=0.1, coco_panoptic_path=None, coco_path='../data/coco', dataset_file='coco', decay_rate=0.1, det_token_num=100, device='cuda', dice_loss_coef=1, dist_backend='nccl', dist_url='env://', distributed=True, eos_coef=0.1, epochs=150, eval=True, eval_size=800, giou_loss_coef=2, gpu=0, init_pe_size=[800, 1344], lr=0.0001, lr_backbone=1e-05, lr_drop=100, mid_pe_size=[800, 1344], min_lr=1e-07, num_workers=2, output_dir='', pre_trained='', rank=0, remove_difficult=False, resume='../trained_weights/yolos/yolos_base.pth', sched='warmupcos', seed=42, set_cost_bbox=5, set_cost_class=1, set_cost_giou=2, start_epoch=0, use_checkpoint=False, warmup_epochs=0, warmup_lr=1e-06, weight_decay=0.0001, world_size=8)
Has mid pe
number of params: 127798368
loading annotations into memory...
Done (t=23.52s)
creating index...
index created!
800
loading annotations into memory...
Done (t=3.00s)
creating index...
index created!
Test: [ 0/313] eta: 0:39:39 class_error: 29.21 loss: 2.1542 (2.1542) loss_bbox: 0.4245 (0.4245) loss_ce: 0.7761 (0.7761) loss_giou: 0.9535 (0.9535) cardinality_error_unscaled: 5.3750 (5.3750) class_error_unscaled: 29.2100 (29.2100) loss_bbox_unscaled: 0.0849 (0.0849) loss_ce_unscaled: 0.7761 (0.7761) loss_giou_unscaled: 0.4768 (0.4768) time: 7.6030 data: 0.5298 max mem: 3963
Test: [256/313] eta: 0:00:26 class_error: 17.22 loss: 2.5668 (2.6435) loss_bbox: 0.5639 (0.5792) loss_ce: 0.8598 (0.8386) loss_giou: 1.1904 (1.2257) cardinality_error_unscaled: 3.8750 (4.2398) class_error_unscaled: 28.7817 (28.6160) loss_bbox_unscaled: 0.1128 (0.1158) loss_ce_unscaled: 0.8598 (0.8386) loss_giou_unscaled: 0.5952 (0.6129) time: 0.4406 data: 0.0137 max mem: 10417
Test: [312/313] eta: 0:00:00 class_error: 16.29 loss: 2.8745 (2.6626) loss_bbox: 0.5974 (0.5833) loss_ce: 0.8791 (0.8461) loss_giou: 1.3012 (1.2332) cardinality_error_unscaled: 3.8750 (4.2370) class_error_unscaled: 26.2946 (28.7748) loss_bbox_unscaled: 0.1195 (0.1167) loss_ce_unscaled: 0.8791 (0.8461) loss_giou_unscaled: 0.6506 (0.6166) time: 0.4251 data: 0.0134 max mem: 10417
Test: Total time: 0:02:25 (0.4663 s / it)
Averaged stats: class_error: 16.29 loss: 2.8745 (2.6626) loss_bbox: 0.5974 (0.5833) loss_ce: 0.8791 (0.8461) loss_giou: 1.3012 (1.2332) cardinality_error_unscaled: 3.8750 (4.2370) class_error_unscaled: 26.2946 (28.7748) loss_bbox_unscaled: 0.1195 (0.1167) loss_ce_unscaled: 0.8791 (0.8461) loss_giou_unscaled: 0.6506 (0.6166)
Accumulating evaluation results...
DONE (t=15.78s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.13810
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.26766
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.11832
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.05146
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.13066
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.23324
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.18115
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.29001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.31740
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.12520
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.31154
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.49446
Hi~@encounter1997, thanks for your interest in YOLOS and thanks for pointing out this issue :)
The codebase of YOLOS is built upon DETR's codebase, so there is a "bug" inherited from DETR: you need to set num_GPU and batchsize_per_GPU during evaluation to the same values used during training. E.g., num_GPU = 8 & batchsize_per_GPU = 1 for YOLOS-Small & YOLOS-Base.
It seems that you set batchsize_per_GPU = 2 during evaluation, which results in the AP degradation.
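For context on why the batch size changes the numbers: DETR-style codebases pad every image in a batch up to the largest height and width in that batch, so with batchsize_per_GPU = 2 the smaller image of each pair is evaluated at a different padded resolution than it would be at batch size 1 (and, presumably, a different resolution than the positional embeddings were tuned for). Below is a minimal sketch of that padding behavior; pad_batch is a hypothetical helper that roughly mirrors what DETR's nested_tensor_from_tensor_list does, not code from this repo.

import torch

def pad_batch(images):
    # Pad a list of CHW tensors to the per-batch max H/W, DETR-style.
    # Returns the padded batch and a boolean mask marking padded pixels.
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    batch = torch.zeros(len(images), images[0].shape[0], max_h, max_w)
    mask = torch.ones(len(images), max_h, max_w, dtype=torch.bool)
    for i, img in enumerate(images):
        c, h, w = img.shape
        batch[i, :c, :h, :w] = img
        mask[i, :h, :w] = False  # False marks real (non-padded) pixels
    return batch, mask

# With batch_size=1 each image keeps its own shape; with batch_size=2
# the smaller image is padded to the larger one's size, so the model
# sees different inputs than it did when evaluated at batch size 1.
a = torch.rand(3, 600, 800)
b = torch.rand(3, 800, 1344)
print(pad_batch([a])[0].shape)     # torch.Size([1, 3, 600, 800])
print(pad_batch([a, b])[0].shape)  # torch.Size([2, 3, 800, 1344])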
Try
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path /path/to/coco --batch_size 1 --backbone_name small --eval --eval_size 800 --init_pe_size 512 864 --mid_pe_size 512 864 --resume /path/to/YOLOS-Small
to reproduce the YOLOS-Small AP, which should be 36.1.
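The same fix should apply to your original YOLOS-Base run, i.e., the command you posted with --batch_size 1 instead of 2:
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path ../data/coco --batch_size 1 --backbone_name base --eval --eval_size 800 --init_pe_size 800 1344 --mid_pe_size 800 1344 --resume ../trained_weights/yolos/yolos_base.pth
which should recover the 42.0 AP.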
Thanks for your timely reply! I followed your advice and the problem was solved~