Exploring robustness and consistency of multimodal VQA models

Minxuan Qin, Yijie Tong, Han Xi, Ziqing Chang

Course project for CSNLP at ETH Zurich, FS 2023

We have two Hugging Face demos, available at this link and this link.

Setup

To run on the ETH Euler cluster, first run env_setup.sh to install the required packages. If needed, hf_env_script.sh contains definitions of environment variables for Hugging Face cache management.
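
For runs started from Python or a notebook rather than a shell, the same kind of cache redirection can be sketched as below. This is an illustration only: hf_env_script.sh is not reproduced here, so the variables it actually defines are unknown. HF_HOME and HF_DATASETS_CACHE are standard Hugging Face environment variables, while the helper name configure_hf_cache and the cache path are hypothetical.

```python
import os
from pathlib import Path


def configure_hf_cache(cache_root):
    """Point Hugging Face caches at cache_root (hypothetical helper).

    Call this before importing transformers or datasets, since those
    libraries read the environment variables at import time.
    """
    root = Path(cache_root).expanduser()
    root.mkdir(parents=True, exist_ok=True)
    # HF_HOME is the base directory for Hugging Face caches and tokens.
    os.environ["HF_HOME"] = str(root)
    # HF_DATASETS_CACHE overrides the location used by the datasets library.
    os.environ["HF_DATASETS_CACHE"] = str(root / "datasets")
    return root


# Example call (the path is an assumption, adjust to your scratch space):
# configure_hf_cache("~/scratch/hf_cache")
```

On a cluster, pointing the cache at scratch storage instead of the home directory is the usual reason for this kind of setup, since model checkpoints can quickly exceed home quotas.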

To set up CARETS, first read README.md under the CARETS directory. That directory also contains an example script, run_eval.sh, for running on the ETH Euler cluster.

Baseline evaluation

See the Python scripts under the baseline directory.

Dataset exploration and visualization

The JSON file under the stats directory contains all questions from the CARETS dataset. We also provide the code for plotting model performance on CARETS under visualization.
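
As a starting point for exploring the question file, the following minimal sketch loads a JSON file and counts the first word of each question as a rough proxy for question type. The file's schema is not documented here, so the code assumes the top level is either a list of records or a dict of records, and that each record stores its text under a "question" key. Both function names and that key are hypothetical.

```python
import json
from collections import Counter


def load_questions(path):
    """Load a JSON file and return its entries as a list."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: top level is a list of records or a dict of id -> record.
    return list(data.values()) if isinstance(data, dict) else list(data)


def first_word_counts(records, key="question"):
    """Count the lowercased first word of each question string."""
    counts = Counter()
    for record in records:
        # Assumption: dict records hold the question text under `key`;
        # anything else is treated as the question text itself.
        text = (record.get(key) or "") if isinstance(record, dict) else record
        words = str(text).split()
        if words:
            counts[words[0].lower()] += 1
    return counts
```

Calling first_word_counts(load_questions(path)).most_common(10) on the actual file would then list the ten most frequent question openers, provided the schema assumptions hold. Records without a question string are skipped.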

Citation

Kudos to the authors for their amazing results:

@inproceedings{jimenez2022carets,
   title={CARETS: A Consistency And Robustness Evaluative Test Suite for VQA},
   author={Carlos E. Jimenez and Olga Russakovsky and Karthik Narasimhan},
   booktitle={60th Annual Meeting of the Association for Computational Linguistics (ACL)},
   year={2022}
}

About

Course project of the course "Computational Semantics for Natural Language Processing" at ETH, FS2023

MIT License
