mehrdad-dev / SBO

Search between the objects of an image using YOLO and CLIP (Web App)

Home Page: https://share.streamlit.io/mehrdad-dev/sbo/main


Search Between the Objects - SBO

Search between the objects in an image, and crop the region of each detected object.

About this project

The CLIP model was proposed by OpenAI to understand the semantic similarity between images and texts. It is used to perform zero-shot learning tasks, such as finding objects in an image based on an input text query.

CLIP pre-trains an image encoder and a text encoder to predict which images were paired with which texts in its training dataset. This behavior turns CLIP into a zero-shot classifier: all of a dataset's classes are converted into captions such as "a photo of a dog", and CLIP predicts the caption it estimates best pairs with a given image.
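As a rough sketch of that zero-shot recipe (a simplified illustration, not this repository's actual code): `embed_text` below is a hypothetical stand-in for CLIP's text encoder (`model.encode_text` in the `clip` package), and embeddings are plain Python lists so the logic stays visible.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_embedding, class_names, embed_text):
    """CLIP's zero-shot recipe: turn each class name into a caption,
    embed the captions, and pick the caption whose embedding best
    matches the image embedding."""
    captions = [f"a photo of a {name}" for name in class_names]
    scores = [cosine_similarity(image_embedding, embed_text(c)) for c in captions]
    best = max(range(len(captions)), key=scores.__getitem__)
    return captions[best], scores[best]
```

In the real app the embeddings would come from CLIP's image and text encoders; the ranking logic itself stays the same.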

Also, YOLOv5 is used in the first step of the method, to detect the locations of objects in the image.
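Putting the two steps together, the overall method can be sketched like this (again a simplified illustration, not the repository's actual code): `embed_image` and `embed_text` are hypothetical stand-ins for CLIP's encoders, and the crops are whatever regions YOLOv5's bounding boxes cut out of the image.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rank_detections(crops, query, embed_image, embed_text):
    """Rank YOLO-detected crops by CLIP-style similarity to the text
    query, highest similarity first."""
    query_embedding = embed_text(query)
    scores = [cosine_similarity(embed_image(crop), query_embedding) for crop in crops]
    order = sorted(range(len(crops)), key=scores.__getitem__, reverse=True)
    return [(crops[i], scores[i]) for i in order]
```

The top-ranked crop is the region returned for the query; returning the whole sorted list is what allows the app to show several candidates ordered by similarity.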

Demo

Demo is ready!

Streamlit App

(The Streamlit app may occasionally crash, because the models are heavy for the hosted environment.)

Notebook

Run this notebook on Google Colab and test it on your own images! (It works on both CPU and GPU.)

Open in nbviewer

Limitations

Note that the object detector can only find object classes it learned from the COCO dataset. So if the results are not related to your query, the object you are looking for is probably not among the COCO classes.

Example

Results are sorted from left to right by similarity.

Query: wine glass


Query: woman with blue pants


License

MIT license

Based on

OpenAI CLIP and Ultralytics YOLOv5.
