microsoft / RegionCLIP

[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"


Do you also maintain the base / novel splits during pretraining?

mlzxy opened this issue · comments

Just out of curiosity. Thanks!

During pre-training, there are no so-called "base" or "novel" categories. Our model is pre-trained with a diverse set of object concepts that are parsed from image captions.
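To make the "concepts parsed from image captions" step concrete, here is a minimal, hypothetical sketch of pulling candidate object concepts out of caption text. It uses spaCy noun chunks purely for illustration; the parser actually used by RegionCLIP is not reproduced here, and the function name `extract_concepts` is invented for this example.

```python
# Hypothetical sketch: extract candidate object concepts (noun chunks) from
# image captions. Not RegionCLIP's actual parser; spaCy is used for illustration.
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def extract_concepts(captions):
    """Return a frequency count of noun-phrase concepts found in the captions."""
    concepts = Counter()
    for doc in nlp.pipe(captions):
        for chunk in doc.noun_chunks:
            # Keep the head-noun lemma as the concept name, e.g. "a brown dog" -> "dog"
            concepts[chunk.root.lemma_.lower()] += 1
    return concepts

if __name__ == "__main__":
    captions = [
        "A brown dog catching a frisbee in the park.",
        "Two people riding bicycles near the beach.",
    ]
    print(extract_concepts(captions).most_common(5))
```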

Thanks for the clarification. What about the localizer used in pre-training? Is it an RPN trained on base images, or a sliding-window-based proposal generator?

By default, we use an RPN trained with the boxes in the LVIS dataset (see the implementation details in the paper). Note that, for the RPN, there are no base/novel images, since we didn't use the categorical labels of these boxes. Also, random boxes perform comparably to the RPN (see the ablation study in the paper).
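For reference, a minimal sketch of what a "random boxes" baseline (the alternative to the RPN mentioned above) could look like. The box format and the function `random_proposals` are assumptions made for this illustration, not code from the repository.

```python
# Hypothetical sketch: sample random box proposals as a drop-in substitute for
# an RPN, mirroring the ablation mentioned above. Boxes are (x1, y1, x2, y2)
# in pixel coordinates; all names here are illustrative.
import torch

def random_proposals(image_h, image_w, num_boxes=300, min_size=32, generator=None):
    """Sample axis-aligned boxes with random centers and sizes inside the image."""
    cx = torch.rand(num_boxes, generator=generator) * image_w
    cy = torch.rand(num_boxes, generator=generator) * image_h
    w = min_size + torch.rand(num_boxes, generator=generator) * (image_w - min_size)
    h = min_size + torch.rand(num_boxes, generator=generator) * (image_h - min_size)
    boxes = torch.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], dim=1)
    # Clip to the image bounds so every proposal is a valid region.
    boxes[:, 0::2] = boxes[:, 0::2].clamp(0, image_w)
    boxes[:, 1::2] = boxes[:, 1::2].clamp(0, image_h)
    return boxes

if __name__ == "__main__":
    print(random_proposals(480, 640, num_boxes=5))
```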