This repository holds the code for the paper:
ComPhy: Compositional Physical Reasoning of Objects and Events from Videos, Zhenfang Chen, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan (under review)
git clone https://github.com/zfchenUnique/executor_comphy.git
pip install -r requirements
- Download the videos, video annotations, and questions from the official website.
- Download the region proposals with attribute and physical property predictions from Google Drive.
- Download the dynamics predictions from Google Drive.
- Run the executor for factual questions.
sh scripts/test_oe_release.sh
- Run the executor for multiple-choice questions.
sh scripts/test_mc_release.sh
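To run both evaluations back to back and keep their output, a small wrapper along these lines may help. This is only a sketch: the `run_executor` helper and the `logs/` paths are hypothetical names introduced here, not part of the repository, and the wrapper assumes the release scripts need no extra arguments.

```shell
# Hypothetical helper (not part of this repo): run one release script
# and capture its stdout and stderr in a log file.
run_executor() {
    script="$1"   # path to the script to run
    log="$2"      # where to write the combined output

    # Fail early with a clear message if the script is not there.
    if [ ! -f "$script" ]; then
        echo "missing script: $script" >&2
        return 1
    fi

    # Create the log directory if needed, then run the script.
    mkdir -p "$(dirname "$log")"
    sh "$script" > "$log" 2>&1
}
```

From the repository root, it could be used as `run_executor scripts/test_oe_release.sh logs/oe.log && run_executor scripts/test_mc_release.sh logs/mc.log`, so the multiple-choice run starts only if the factual run succeeded.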
Please refer to this repo for property learning and dynamics prediction.
This module uses NS-VQA's perception module for object detection and visual attribute extraction.
This module uses NS-VQA's program parser module to transform language into executable programs.