class-specific object detection

Question

fushh opened this issue 6 months ago

In this paper, the authors offer the insight that the high-level information provided by language descriptions helps the model learn fairly generalizable properties of universal object categories. However, the training dataset is class-specific, so I'm curious whether MViT can perform class-specific object detection. Following OV-DETR, and similar to GLIP and to the way object-centric-ovd extracts high-quality class-specific proposals using image-level labels, we can use prompts like 'every {category}' and run a separate forward pass for each prompt to get the top-scoring predictions for each class. In this way, MViT should be able to perform open-vocabulary object detection. However, my experiments show extremely poor results: 5.4 AP50 on novel classes and 3.8 AP50 on base classes. I'm confused by these results. Could you give me some advice?
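
For reference, the per-class prompting procedure described above can be sketched roughly as follows. This is only a minimal sketch and not the MViT API: `detect_fn` is a hypothetical stand-in for whatever function runs the detector on an image with a text prompt and returns scored boxes, and `top_k` is an assumed cutoff.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def class_specific_detections(
    image: object,
    categories: Sequence[str],
    detect_fn: Callable[[object, str], List[Detection]],  # hypothetical detector wrapper
    prompt_template: str = "every {category}",
    top_k: int = 100,  # assumed cutoff, not taken from any paper
) -> Dict[str, List[Detection]]:
    """Query the detector once per category and keep the top-scoring boxes."""
    results: Dict[str, List[Detection]] = {}
    for category in categories:
        # Build the class-specific text query, e.g. "every dog".
        prompt = prompt_template.format(category=category)
        # One forward pass per category prompt.
        detections = detect_fn(image, prompt)
        # Sort by confidence (highest first) and keep the top_k boxes.
        detections = sorted(detections, key=lambda d: d[1], reverse=True)
        results[category] = detections[:top_k]
    return results
```

Each returned box is labeled with the category of the prompt that produced it, so the per-class lists can be concatenated and passed to a standard AP50 evaluation.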

Hanoona Rasheed · Answer 1 · Fri Jan 19 2024 21:26:46 GMT+0800 (China Standard Time)

Hi @fushh,

Please refer to this. Thank you.