Setting of VideoCutLER's baseline
so45jj45 opened this issue
so45jj45 commented
Thank you for your excellent research.
In the paper for VideoCutLER, the description for the baseline is as follows:
‡: "We train a CutLER [35] model with Mask2Former as a detector on ImageNet-1K, following CutLER’s official training recipe, and use it as a strong baseline."
Could you please clarify whether the "strong baseline" mentioned here involves training Mask2Former at the image level only once, or whether it involves multiple rounds of self-training? Also, could you specify whether DropLoss was used?
Thanks.