google-research / scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

What is the best way to do one-shot image-conditioned detection in OWLv2?

dienhoa opened this issue

In OWL-ViT v1, the embedding vector that best represents the query image is selected as follows, as stated in the paper:

We search for the most dissimilar class embedding within the group of class embeddings whose corresponding box has IoU > 0.65 with Q

This method aims to identify the foreground object.
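For reference, here is a minimal NumPy sketch of how I read that heuristic. It is not the Scenic implementation. The function names, the (x0, y0, x1, y1) box format, and especially the choice to measure "dissimilar" as the lowest dot product with the mean class embedding of the image are my assumptions; the quoted sentence does not say what the embedding is compared against.

```python
import numpy as np


def box_iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x0, y0, x1, y1)."""
    x0 = np.maximum(box[0], boxes[:, 0])
    y0 = np.maximum(box[1], boxes[:, 1])
    x1 = np.minimum(box[2], boxes[:, 2])
    y1 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x1 - x0, 0, None) * np.clip(y1 - y0, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def select_query_embedding_v1(query_box, pred_boxes, class_embeds,
                              iou_threshold=0.65):
    """Hypothetical helper: returns the index of the chosen prediction, or None."""
    # Keep only predictions whose box overlaps the query box Q enough.
    ious = box_iou(np.asarray(query_box, dtype=float),
                   np.asarray(pred_boxes, dtype=float))
    candidates = np.flatnonzero(ious > iou_threshold)
    if candidates.size == 0:
        return None
    # Assumption: "most dissimilar" means least similar to the mean embedding
    # over all predictions, which is likely dominated by background.
    mean_embed = class_embeds.mean(axis=0)
    sims = class_embeds[candidates] @ mean_embed
    return int(candidates[np.argmin(sims)])
```

The returned index would then be used to pick the row of `class_embeds` that serves as the query embedding.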

However, in OWL-ViT v2 we now have the objectness score for this purpose. Would it be better to use the objectness score, instead of the class-embedding heuristic, to identify the foreground object?

Yes, in OWLv2 I'd use the objectness score. This is how the example in the colab does it.
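A rough sketch of what objectness-based selection could look like, assuming the model returns one objectness logit and one class embedding per predicted box. This is not copied from the colab. The function name and the optional `candidate_mask` argument (for example, to restrict the choice to boxes that overlap the query region) are hypothetical.

```python
import numpy as np


def select_query_embedding_v2(objectness_logits, class_embeds,
                              candidate_mask=None):
    """Hypothetical helper: pick the embedding of the most object-like box."""
    logits = np.asarray(objectness_logits, dtype=np.float64)
    # Sigmoid turns logits into scores in (0, 1); the ranking is unchanged.
    scores = 1.0 / (1.0 + np.exp(-logits))
    if candidate_mask is not None:
        # Exclude boxes outside the allowed set by giving them the lowest score.
        scores = np.where(np.asarray(candidate_mask, dtype=bool), scores, -np.inf)
    index = int(np.argmax(scores))
    return index, class_embeds[index]
```

Usage would be along the lines of `index, query_embed = select_query_embedding_v2(logits, embeds)`, with `query_embed` then used in place of a text embedding when classifying boxes in the target image. If the source image contains several objects, it is probably worth looking at the top few boxes by objectness and checking that the selected one is the intended object.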