(CVPR 2023) PLA: Language-Driven Open-Vocabulary 3D Scene Understanding

PLA: Language-Driven Open-Vocabulary 3D Scene Understanding

¹The University of Hong Kong  ²ByteDance
*Equal contribution  ⁺Corresponding author

CVPR 2023

TL;DR: PLA leverages powerful VL foundation models to construct hierarchical 3D-text pairs for 3D open-world learning.
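To make the TL;DR concrete, here is a minimal sketch of the hierarchical pairing idea: captions produced by a vision-language model are associated with point subsets at different granularities (e.g., the whole scene and individual camera views, with points assigned to a view by projection). All function names, the pinhole-projection setup, and the data layout below are illustrative assumptions, not the repo's actual API.

```python
# Illustrative sketch (NOT the repo's real code): pair point subsets with
# captions at scene level and view level via pinhole projection.
import numpy as np

def view_point_mask(points, cam_pose, intrinsics, image_hw):
    """Boolean mask of 3D points that project inside a camera's image."""
    # Homogenize and transform world points into the camera frame
    # (assuming cam_pose maps world -> camera coordinates).
    pts_h = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    cam_pts = (cam_pose @ pts_h.T).T[:, :3]
    in_front = cam_pts[:, 2] > 0
    # Pinhole projection onto the image plane.
    uv = (intrinsics @ cam_pts.T).T
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)
    h, w = image_hw
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return in_front & inside

def build_hierarchical_pairs(points, views, scene_caption):
    """Associate captions with point sets at scene and view granularity."""
    # Scene level: every point is paired with the scene-wide caption.
    pairs = [(np.ones(len(points), dtype=bool), scene_caption)]
    # View level: only points visible in a view get that view's caption.
    for cam_pose, intrinsics, image_hw, caption in views:
        mask = view_point_mask(points, cam_pose, intrinsics, image_hw)
        if mask.any():
            pairs.append((mask, caption))
    return pairs
```

The paper's full pipeline additionally derives entity-level pairs, but the same project-then-associate pattern applies at each level of the hierarchy.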

Demo: open-vocabulary queries such as "working space", "piano", and "vending machine".

TODO

  • Release caption processing code

Getting Started

Installation

Please refer to INSTALL.md for the installation.

Dataset Preparation

Please refer to DATASET.md for dataset preparation.

Training & Inference

Please refer to MODEL.md for training and inference scripts and pretrained models.

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{ding2022language,
    title={PLA: Language-Driven Open-Vocabulary 3D Scene Understanding},
    author={Ding, Runyu and Yang, Jihan and Xue, Chuhui and Zhang, Wenqing and Bai, Song and Qi, Xiaojuan},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2023}
}

Acknowledgement

Code is partly borrowed from OpenPCDet, PointGroup and SoftGroup.

License

This project is released under the Apache License 2.0.
