NVlabs / MinVIS

Very Nice Paper

lxtGH opened this issue

Hi, dear authors,

After reading this paper, I am very excited and convinced by your approach.

The insights of your paper are very similar to those of our work, Video K-Net:
https://github.com/lxtGH/Video-K-Net

The difference is that you directly use the queries (kernels in our paper) for temporal association, while we learn the association embeddings with a sparse triplet loss.
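
To make the "direct query association" idea concrete, here is a minimal sketch, not the code of either repository. It assumes each frame yields one embedding vector per query and links queries across two frames by maximizing total cosine similarity. The names `cosine` and `match_queries` are hypothetical, and the brute-force search over permutations stands in for a proper bipartite matcher such as the Hungarian algorithm, so it is only practical for a handful of queries.

```python
import math
from itertools import permutations


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_queries(prev_queries, curr_queries):
    """Hypothetical helper: one-to-one matching of queries between two frames.

    Returns a list `m` where m[i] is the index in curr_queries assigned to
    prev_queries[i]. Brute force over all permutations (an assumption made
    for brevity; a real system would use a bipartite matching solver).
    """
    n = len(prev_queries)
    sim = [[cosine(p, c) for c in curr_queries] for p in prev_queries]
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(sim[i][perm[i]] for i in range(n)),
    )
    return list(best)


# Toy embeddings: the queries of the current frame are a shuffled,
# slightly perturbed copy of the previous frame's queries.
prev = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
curr = [[0.1, 0.9], [0.9, 1.1], [1.0, 0.1]]
assignment = match_queries(prev, curr)
print(assignment)  # [2, 0, 1]
```

No embedding is trained for tracking here; the matching relies only on the similarity of the query vectors themselves, which is the contrast with a separately learned triplet-loss embedding.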

I wonder whether you would consider citing our work. Thanks a lot!

Moreover, I would like to ask several questions.

1. Would the conclusion still hold if you used a weaker instance segmentation model (e.g., DETR, as in VisTR)?
I ask because I applied K-Net for online learning, but its performance on YT-VIS-2019 was not good.

2. I do not understand why the improvement on OVIS is so much larger than on YT-VIS.

Thanks again!

Best Regards!