Setting of VideoCutLER's baseline
so45jj45 opened this issue
so45jj45 commented
Thank you for your excellent research.
In the paper for VideoCutLER, the description for the baseline is as follows:
‡: "We train a CutLER [35] model with Mask2Former as a detector on ImageNet-1K, following CutLER’s official training recipe, and use it as a strong baseline."
Could you please clarify whether the "strong baseline" mentioned here involves training Mask2Former at the image level only once, or whether it involves multiple rounds of self-training? Also, could you specify whether DropLoss was used?
Thanks.