Doubly Right Object Recognition: A Why Prompt for Visual Rationales (CVPR 2023)
Chengzhi Mao · Revant Teotia · Amrutha Sundar · Sachit Menon · Junfeng Yang · Xin Wang · Carl Vondrick
Doubly Right Object Recognition Benchmark
Defence Method | Submitted By | DR Accuracy (CIFAR10) | DR Accuracy (CIFAR100) | DR Accuracy (Food101) | DR Accuracy (Caltech101) | DR Accuracy (SUN) | DR Accuracy (ImageNet) | Submission Date |
---|---|---|---|---|---|---|---|---|
Why Prompt | (initial entry) | 70.82 | 22.27 | 25.27 | 23.64 | 6.70 | 3.63 | Mar 1, 2023 |
CLIP | (initial entry) | 42.57 | 6.43 | 5.73 | 5.99 | 0.94 | 0.68 | Mar 1, 2023 |
We welcome people to submit new results to this leaderboard.
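
For anyone preparing a submission, the following is a minimal sketch of how a doubly right (DR) accuracy could be computed. It assumes a sample counts as correct only when both the predicted category and the predicted rationale match the ground truth, and that results are reported as percentages. The function name `doubly_right_accuracy` and the `(category, rationale)` pair format are illustrative assumptions, not the repository's evaluation code.

```python
from typing import Sequence, Tuple

# Hypothetical representation: each sample is a (category, rationale) pair.
Pair = Tuple[str, str]


def doubly_right_accuracy(predictions: Sequence[Pair], targets: Sequence[Pair]) -> float:
    """Return the percentage of samples whose category AND rationale are both right."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        1
        for (pred_cat, pred_rat), (true_cat, true_rat) in zip(predictions, targets)
        if pred_cat == true_cat and pred_rat == true_rat
    )
    return 100.0 * hits / len(targets)


if __name__ == "__main__":
    targets = [("cat", "whiskers"), ("dog", "floppy ears")]
    predictions = [("cat", "whiskers"), ("dog", "whiskers")]  # second rationale is wrong
    print(doubly_right_accuracy(predictions, targets))
```

A prediction with the right category but the wrong rationale earns no credit under this definition, which is why DR accuracy can be far lower than plain classification accuracy.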
Data Collection
We provide the data collected through this procedure. For convenience, we also provide a shortcut to the downloaded images here. However, we do not own any of the images, and individual images will be removed upon request.
You can also rerun the pipeline to collect your own version of the dataset.
- Language Rationales: Run GPT3/retrieve_language_rationale.py. This requires the OpenAI API, which is a paid service. The retrieved rationales are also included in the GPT3 folder.
- Language to Visual Rationales: Run google_image_serach_url.py, which asks Google what each language description looks like visually and saves both the downloaded images and their URLs. This requires the Google Image Search API, which is a paid service.
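
The two steps above can be sketched roughly as follows. This is an illustration only, not the code in the repository's scripts: it makes no API calls, and the helper names (`parse_rationales`, `build_image_query`) and the query template are hypothetical. It assumes the language model returns its rationales as a bulleted list, one per line.

```python
from typing import List, Tuple


def parse_rationales(response_text: str) -> List[str]:
    """Turn a bulleted language-model response into a clean list of rationales."""
    rationales = []
    for line in response_text.splitlines():
        # Drop surrounding whitespace and any leading bullet characters.
        cleaned = line.strip().lstrip("-*•").strip()
        if cleaned:
            rationales.append(cleaned)
    return rationales


def build_image_query(category: str, rationale: str) -> str:
    """Combine a category and one rationale into a search string (assumed template)."""
    return f"{category} with {rationale}"


def build_queries(category: str, response_text: str) -> List[Tuple[str, str]]:
    """Pair each parsed rationale with the image-search query that would be issued for it."""
    return [(r, build_image_query(category, r)) for r in parse_rationales(response_text)]


if __name__ == "__main__":
    # Stand-in for text returned by the language model (step 1).
    fake_response = "- whiskers\n- pointed ears\n\n* a long tail"
    for rationale, query in build_queries("cat", fake_response):
        # In step 2, each query would be sent to an image search API.
        print(rationale, "->", query)
```

Keeping the rationale next to its query makes it straightforward to label every downloaded image with the (category, rationale) pair that produced it.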
Train Why Prompt
See README.md in the whyprompt folder.