megvii-research / Iter-E2EDET

Official implementation of the paper "Progressive End-to-End Object Detection in Crowded Scenes"


Question about the number of detected boxes

NingYuanLin opened this issue

Hi, author. Thank you for your work!
I trained a model and validated it on my own dataset, which has around 30 objects per image on average, and got a very good mAP (~0.90). However, the validation output file (projects/crowd-e2e-sparse-rcnn/output/inference/coco_instances_results.json) shows that the model outputs about 500 bounding boxes per image. I don't understand how the mAP can be this high when the number of predicted boxes is so far from the number of ground-truth boxes.
Thanks~

commented

Please refer to DETR and Anchor DETR for the analysis of queries. The detector uses 500 queries per image, and each query yields a prediction, so an image with only a few objects still gets about 500 boxes. Most of those queries land on the background and are predicted with a low confidence score, so they can be filtered out easily with a confidence score threshold.
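As a rough illustration of the point above, here is a minimal sketch (not code from this repository) that filters detections by score and shows why low-confidence background boxes need not hurt AP. It assumes the results use the standard COCO results layout, a list of dicts with `image_id`, `category_id`, `bbox` and `score`. The helper names, the toy data and the 0.5 threshold are all hypothetical, and the AP here is a simplified non-interpolated version, not the exact COCO metric.

```python
from collections import Counter


def filter_by_score(results, score_thresh=0.5):
    """Keep detections whose confidence is at least score_thresh (0.5 is an assumed value)."""
    return [det for det in results if det["score"] >= score_thresh]


def boxes_per_image(results):
    """Count how many detections each image has."""
    return Counter(det["image_id"] for det in results)


def average_precision(scored_hits, num_gt):
    """Simplified, non-interpolated AP.

    scored_hits is a list of (score, is_true_positive) pairs for one class.
    """
    ranked = sorted(scored_hits, key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this recall step
    return precision_sum / num_gt if num_gt else 0.0


# Toy data: two confident boxes on real objects, three background boxes.
results = [
    {"image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40], "score": 0.95},
    {"image_id": 1, "category_id": 1, "bbox": [60, 20, 30, 40], "score": 0.90},
    {"image_id": 1, "category_id": 1, "bbox": [0, 0, 5, 5], "score": 0.05},
    {"image_id": 1, "category_id": 1, "bbox": [90, 90, 5, 5], "score": 0.03},
    {"image_id": 1, "category_id": 1, "bbox": [50, 50, 5, 5], "score": 0.01},
]

kept = filter_by_score(results, score_thresh=0.5)
print(boxes_per_image(results)[1], "boxes before,", boxes_per_image(kept)[1], "after")

# False positives ranked below every true positive leave AP unchanged.
confident = [(0.95, True), (0.90, True)]
background = [(0.05, False), (0.03, False), (0.01, False)]
ap_without = average_precision(confident, num_gt=2)
ap_with = average_precision(confident + background, num_gt=2)
print(ap_without, ap_with)
```

To try this on the file mentioned in the question, the list could be loaded with `json.load` from `coco_instances_results.json` and passed to `filter_by_score`. A suitable threshold depends on the dataset and would need to be chosen on validation data.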

Thanks for your detailed explanation.
I will check out these two articles.
Thanks.