- Follow the installation requirements of the [BLIP repository](https://github.com/salesforce/BLIP).
- Download the checkpoints and put them into the `ckpts` path (a download sketch follows this list).
- Copy the code in this repository into the cloned BLIP path (a quick load check is sketched below).
- Follow the [ScanQA repository](https://github.com/ATR-DBI/ScanQA) to download and preprocess the data.
- Replace the ScanQA data path in the code with yours (see the data-loading sketch below).
- Replace the ScanNet data path in `render_scenes.py` with yours (see the path sketch below).
- Run `render_scenes.py`.
- Run `eval_scene_best_views.py` to zero-shot evaluate BLIP on ScanQA.
- A result JSON will be generated, indicating the matched views w.r.t. the questions (an inspection sketch follows this list).
- Run `train_scene_view_vqa.py`.
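
A minimal sketch for fetching a checkpoint into `ckpts/`, assuming the public BLIP base VQA checkpoint from the BLIP model zoo is the one needed; swap in whichever checkpoint this project actually expects:

```python
import os
import urllib.request

os.makedirs('ckpts', exist_ok=True)

# Public BLIP base VQA checkpoint; replace with the checkpoint you need.
url = ('https://storage.googleapis.com/sfr-vision-language-research/'
       'BLIP/models/model_base_vqa_capfilt_large.pth')
urllib.request.urlretrieve(url, 'ckpts/model_base_vqa_capfilt_large.pth')
```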
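
To confirm the BLIP installation and checkpoint placement, a quick sanity check with BLIP's own `blip_vqa` loader can help; the arguments follow the BLIP demo defaults, and it must run from inside the cloned BLIP directory so the `models` package resolves:

```python
import torch
from models.blip_vqa import blip_vqa  # resolvable once this sits in the cloned BLIP repo

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# image_size=480 and vit='base' mirror BLIP's own VQA demo.
model = blip_vqa(pretrained='ckpts/model_base_vqa_capfilt_large.pth',
                 image_size=480, vit='base')
model = model.eval().to(device)
print('BLIP VQA checkpoint loaded')
```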
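
For the ScanQA path replacement, the question files are plain JSON; below is a hedged sketch of loading one, assuming the file layout and field names of the released ScanQA format (`SCANQA_ROOT` is an illustrative name, not an identifier from the code):

```python
import json

SCANQA_ROOT = '/path/to/ScanQA/data'  # replace with your ScanQA data path

# ScanQA releases its question files as plain JSON under qa/.
with open(f'{SCANQA_ROOT}/qa/ScanQA_v1.0_val.json') as f:
    questions = json.load(f)

# Each entry ties a question to the ScanNet scene it asks about.
sample = questions[0]
print(sample['scene_id'], '|', sample['question'])
```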
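
The ScanNet path in `render_scenes.py` is edited the same way; the constant below is illustrative, not the script's actual variable name:

```python
import os

# Illustrative name only; locate the corresponding constant in render_scenes.py.
SCANNET_ROOT = '/path/to/scannet/scans'  # directory holding the scene*/ folders

assert os.path.isdir(SCANNET_ROOT), 'point SCANNET_ROOT at your ScanNet scans'
```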
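
Once `eval_scene_best_views.py` has produced the result JSON, it can be inspected as below; the file name and the question-to-view schema are assumptions here, so check the script for the actual output path and structure:

```python
import json

# File name and schema are assumptions; check eval_scene_best_views.py
# for the actual output path and structure.
with open('best_views_result.json') as f:
    best_views = json.load(f)

# Assumed mapping: question id -> best-matched rendered view.
for qid, view in list(best_views.items())[:5]:
    print(qid, '->', view)
```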