Doubly Right Object Recognition: A Why Prompt for Visual Rationales (CVPR 2023)

Chengzhi Mao · Revant Teotia · Amrutha Sundar · Sachit Menon · Junfeng Yang · Xin Wang · Carl Vondrick

arxiv

Doubly Right Object Recognition Benchmark

Defence Method	Submitted By	DR Accuracy (CIFAR10)	DR Accuracy (CIFAR100)	DR Accuracy (Food101)	DR Accuracy (Caltech101)	DR Accuracy (SUN)	DR Accuracy (ImageNet)	Submission Date
Why Prompt	(initial entry)	70.82	22.27	25.27	23.64	6.70	3.63	Mar 1, 2023
CLIP	(initial entry)	42.57	6.43	5.73	5.99	0.94	0.68	Mar 1, 2023

We welcome people to submit new results to this leaderboard.
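For anyone preparing a submission, the DR Accuracy columns can be approximated as below. This is a minimal sketch, assuming a prediction counts as "doubly right" only when both the predicted category and the predicted rationale are correct; the function name and input layout are hypothetical and are not taken from this repository's evaluation code.

```python
from typing import Sequence, Set


def doubly_right_accuracy(
    pred_labels: Sequence[str],
    pred_rationales: Sequence[str],
    true_labels: Sequence[str],
    true_rationales: Sequence[Set[str]],
) -> float:
    """Return the percentage of samples with a correct label AND a correct rationale.

    Assumption: a rationale is correct when it appears in the set of accepted
    rationales for that sample. The real benchmark may score this differently.
    """
    n = len(true_labels)
    if n == 0:
        raise ValueError("at least one sample is required")
    if not (len(pred_labels) == len(pred_rationales) == len(true_rationales) == n):
        raise ValueError("all inputs must have the same length")

    hits = sum(
        1
        for pl, pr, tl, tr in zip(pred_labels, pred_rationales, true_labels, true_rationales)
        if pl == tl and pr in tr
    )
    return 100.0 * hits / n
```

A right label with a wrong rationale, or a wrong label with a plausible rationale, both count as misses under this definition.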

Data Collection

We provide the data collected through this procedure. For convenience, we also provide a shortcut to the downloaded images here. However, we do not own any of the images, and individual images will be deleted upon request.

You can also rerun the pipeline and recollect your own version of the dataset.

Language Rationales

Run GPT3/retrieve_language_rationale.py. This requires the OpenAI API, which is a paid service.

We also include the retrieved rationales in the GPT3 folder.
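The rationale-retrieval step can be pictured as building one question per category and splitting the model's answer into individual rationales. The sketch below covers only that string handling and makes no API call. The prompt wording, function names, and bullet-list answer format are assumptions for illustration; the actual script may differ.

```python
from typing import List

# Hypothetical prompt wording; not taken from the repository.
PROMPT_TEMPLATE = "What are useful visual features for distinguishing a {category} in a photo?"


def build_rationale_prompt(category: str) -> str:
    """Turn a class name such as 'hot_dog' into a natural-language question."""
    return PROMPT_TEMPLATE.format(category=category.replace("_", " "))


def parse_rationales(completion: str) -> List[str]:
    """Split a bullet-list completion into unique rationales, preserving order."""
    seen = set()
    rationales = []
    for line in completion.splitlines():
        # Drop leading bullet characters and surrounding whitespace.
        item = line.strip().lstrip("-*• ").strip()
        if item and item.lower() not in seen:
            seen.add(item.lower())
            rationales.append(item)
    return rationales
```

The completion text itself would come from whichever OpenAI endpoint the script calls, so that part is left out here.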

Language to Visual Rationales

Run google_image_serach_url.py, which asks Google what each language description looks like visually and provides the downloaded images as well as their URLs.

This requires the Google Image Search API, which is a paid service.
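As a rough illustration of the query step, the sketch below builds a request URL for Google's Custom Search JSON API with image search enabled. It assumes that this is the service meant above; the repository's script may use a different endpoint. The query phrasing and the function name are hypothetical, and the URL is only constructed here, not fetched.

```python
from urllib.parse import urlencode

ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_image_search_url(
    category: str,
    rationale: str,
    api_key: str,
    engine_id: str,
    num: int = 10,
) -> str:
    """Build an image-search URL for one (category, rationale) pair."""
    # The Custom Search JSON API returns at most 10 results per request.
    if not 1 <= num <= 10:
        raise ValueError("num must be between 1 and 10")

    # Hypothetical query phrasing that combines category and rationale.
    query = f"{category} which has {rationale}"
    params = {
        "key": api_key,         # API key from the Google Cloud console
        "cx": engine_id,        # Programmable Search Engine ID
        "q": query,
        "searchType": "image",  # restrict results to images
        "num": num,
    }
    return f"{ENDPOINT}?{urlencode(params)}"
```

Each item in the JSON response carries a link to the image, which can then be downloaded and stored alongside its URL.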

Train Why Prompt

See README.md in whyprompt.
