Very Nice Paper
lxtGH opened this issue · comments
Hi, dear authors,
After reading this paper, I am very excited and convinced by your approach.
The insights of your paper are very similar to those of our work, Video K-Net:
https://github.com/lxtGH/Video-K-Net
The difference is that you directly use the queries (called kernels in our paper) for temporal association, while we learn the association embeddings with a sparse triplet loss.
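To make the difference concrete, here is a minimal sketch in plain Python. It is only an illustration, not code from either paper: the function names (`match_queries`, `triplet_loss`), the cosine similarity, the greedy matching, and the margin value are all assumptions chosen for brevity. The first function associates instances across two frames by comparing query vectors directly; the second shows the general form of a triplet loss that could be used to train a separate association embedding.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def match_queries(prev, curr):
    """Greedy one-to-one matching of previous-frame queries to current-frame queries.

    Assumption: queries are compared directly by cosine similarity, and the
    most similar pairs are assigned first. A real system might use Hungarian
    matching instead of this greedy pass.
    """
    scored = [
        (cosine(p, c), i, j)
        for i, p in enumerate(prev)
        for j, c in enumerate(curr)
    ]
    scored.sort(reverse=True)  # highest similarity first

    assignment = {}
    used = set()
    for _, i, j in scored:
        if i not in assignment and j not in used:
            assignment[i] = j
            used.add(j)
    return assignment


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance (margin is a hypothetical value)."""
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D "queries" for two instances in two consecutive frames.
prev_queries = [[1.0, 0.0], [0.0, 1.0]]
curr_queries = [[0.1, 0.9], [0.9, 0.1]]  # instance order is swapped

assignment = match_queries(prev_queries, curr_queries)
```

In this toy case the matching recovers the swapped order, and the triplet loss is zero when the positive is already much closer to the anchor than the negative.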
I wonder whether you would consider citing our work. Thanks a lot!
I also have a few questions.
1. Would the conclusion still hold if you used a weaker instance segmentation model (DETR, as in VISTR)?
I ask because I applied K-Net for online learning, but its performance on YT-VIS-2019 was not good.
2. I do not understand why the improvement on OVIS is so much larger than on YT-VIS.
Thanks again!
Best regards!