Commonsense Prototype for Outdoor Unsupervised 3D Object Detection (CVPR 2024)

This is the codebase of our CVPR 2024 paper.

Overview

Abstract
Environment
Prepare Dataset
Training
Evaluation
Model Zoo
Citation

Abstract

CPD (Commonsense Prototype-based Detector) is a high-performance unsupervised 3D object detection framework. Based on commonsense intuition, CPD first constructs Commonsense Prototypes (CProto), which are characterized by high-quality bounding boxes and dense points. CPD then refines low-quality pseudo-labels by leveraging the size prior from CProto. Furthermore, CPD improves the detection accuracy of sparsely scanned objects by using the geometric knowledge from CProto. CPD outperforms state-of-the-art unsupervised 3D detectors on the Waymo Open Dataset (WOD) and KITTI datasets by a large margin.

Environment

conda create -n spconv2 python=3.9
conda activate spconv2
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 waymo-open-dataset-tf-2-5-0 nuscenes-devkit==1.0.5 spconv-cu111 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython prefetch-generator

Tested environment:

Ubuntu 18.04
Python 3.9.13
PyTorch 1.8.1
Numba 0.53.1
Spconv 2.1.22 # pip install spconv-cu111
NVIDIA CUDA 11.1
4x 3090 GPUs
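To compare a local installation against the versions listed above, a small check like the following may help. It is only a sketch: it assumes the packages were installed with pip under the distribution names torch, numba, and spconv-cu111, and the helper name check_environment is hypothetical (it is not part of this codebase).

```python
from importlib import metadata

# Versions taken from the tested environment listed above.
# The distribution names are assumptions based on the pip commands above.
TESTED = {"torch": "1.8.1", "numba": "0.53.1", "spconv-cu111": "2.1.22"}


def check_environment(tested=TESTED):
    """Return {package: (installed_version_or_None, tested_version, matches)}."""
    report = {}
    for name, wanted in tested.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Wheels such as torch carry a local suffix like "+cu111";
        # compare only the part before the "+".
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, matches)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in check_environment().items():
        status = "ok" if ok else "differs"
        print(f"{name}: installed={installed} tested={wanted} [{status}]")
```

A differing version does not necessarily mean the code will fail; the list above only records what was tested.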

Prepare Dataset

Waymo Dataset

Please download the official Waymo Open Dataset, including the training data training_0000.tar~training_0031.tar and the validation data validation_0000.tar~validation_0007.tar.
Extract all of the above xxxx.tar files into the data/waymo/raw_data directory as follows (you should get 798 training and 202 validation tfrecord files):

CPD
├── data
│   ├── waymo
│   │   │── ImageSets
│   │   │── raw_data
│   │   │   │── segment-xxxxxxxx.tfrecord
│   │   │   │── ...
│   │   │── waymo_processed_data_train_val_test
│   │   │   │── segment-xxxxxxxx/
│   │   │   │── ...
│   │   │── pcdet_waymo_track_dbinfos_train_cp.pkl
│   │   │── waymo_infos_test.pkl
│   │   │── waymo_infos_train.pkl
│   │   │── waymo_infos_val.pkl
├── pcdet
├── tools
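Before generating the dataset information, it can be useful to confirm that extraction produced the expected number of files. The following is a minimal sketch, assuming it is run from the repository root and that all training and validation tfrecord files sit directly in data/waymo/raw_data; the function name count_tfrecords is hypothetical.

```python
from pathlib import Path


def count_tfrecords(raw_dir):
    """Count files named segment-*.tfrecord directly inside raw_dir."""
    return sum(1 for p in Path(raw_dir).glob("segment-*.tfrecord") if p.is_file())


if __name__ == "__main__":
    raw_dir = Path("data/waymo/raw_data")  # path from the directory tree above
    expected = 798 + 202  # training + validation counts stated above
    found = count_tfrecords(raw_dir) if raw_dir.is_dir() else 0
    print(f"found {found} tfrecord files, expected {expected}")
```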

Then, generate dataset information:

python3 -m pcdet.datasets.waymo_unsupervised.waymo_unsupervised_dataset --cfg_file tools/cfgs/dataset_configs/waymo_unsupervised/waymo_unsupervised_cproto.yaml

KITTI Dataset

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows (the road planes can be downloaded from [road plane]; they are optional and are used for data augmentation during training):

CPD
├── data
│   ├── kitti
│   │   │── ImageSets
│   │   │── training
│   │   │   ├── calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │── testing
│   │   │   ├── calib & velodyne & image_2
├── pcdet
├── tools
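The layout above can be checked with a short script before creating the dataset infos. This is a sketch under the assumption that the KITTI data lives in data/kitti relative to the repository root; the helper missing_kitti_dirs is hypothetical and only tests that the listed folders exist (the optional planes folder is not checked).

```python
from pathlib import Path

# Required sub-folders per split, copied from the directory tree above.
REQUIRED = {
    "training": ["calib", "velodyne", "label_2", "image_2"],
    "testing": ["calib", "velodyne", "image_2"],
}


def missing_kitti_dirs(kitti_root):
    """Return the expected folders (relative paths) that are absent under kitti_root."""
    root = Path(kitti_root)
    missing = []
    if not (root / "ImageSets").is_dir():
        missing.append("ImageSets")
    for split, subdirs in REQUIRED.items():
        for sub in subdirs:
            if not (root / split / sub).is_dir():
                missing.append(f"{split}/{sub}")
    return missing


if __name__ == "__main__":
    missing = missing_kitti_dirs("data/kitti")
    print("layout looks complete" if not missing else f"missing: {missing}")
```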

Run the following command to create the dataset infos:

python3 -m pcdet.datasets.kitti.kitti2waymo_dataset create_kitti_infos tools/cfgs/dataset_configs/waymo_unsupervised/kitti2waymo_dataset.yaml

Training

Train using scripts

cd tools
sh dist_train.sh {cfg_file}

The training log is saved to log.txt. You can run cat log.txt to follow the training progress.

Or run the training script directly:

cd tools
python train.py

Evaluation

cd tools
sh dist_test.sh {cfg_file}

The log is saved to log-test.txt. You can run cat log-test.txt to view the test results.
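Because the log can grow long, reading only its last lines is often enough to find the final results. A minimal sketch, assuming log-test.txt is in the current directory (tools); the helper name tail is hypothetical and not part of this codebase.

```python
from collections import deque
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file without loading it all into memory."""
    with open(path, "r", errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


if __name__ == "__main__":
    log_path = Path("log-test.txt")  # file name stated above
    if log_path.is_file():
        print("\n".join(tail(log_path, n=30)))
    else:
        print(f"{log_path} not found; run the evaluation first")
```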

Model Zoo

Model	Vehicle 3D AP (L1)	Vehicle 3D AP (L2)	Pedestrian 3D AP (L1)	Pedestrian 3D AP (L2)	Cyclist 3D AP (L1)	Cyclist 3D AP (L2)	Download
DBSCAN-single-train	2.65	2.29	0	0	0.25	0.20	---
OYSTER-single-train	7.91	6.78	0.03	0.02	4.65	4.05	oyster_pretrained
CPD	38.74	33.37	16.53	13.72	4.28	4.13	cpd_pretrained

The IoU thresholds used to evaluate the Vehicle, Pedestrian, and Cyclist categories are 0.7, 0.5, and 0.5, respectively.
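To give a feel for what these thresholds mean, the sketch below computes the 3D IoU of two boxes and compares it against each category's threshold. It is a deliberate simplification: the boxes are axis-aligned, whereas 3D detection benchmarks evaluate rotated boxes, so this is not the evaluation code used for the table. The function name, the box format, and the example box sizes are illustrative assumptions.

```python
# IoU thresholds per category, as stated above.
IOU_THRESHOLDS = {"Vehicle": 0.7, "Pedestrian": 0.5, "Cyclist": 0.5}


def axis_aligned_iou_3d(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, z_min, x_max, y_max, z_max)."""
    intersection = 1.0
    for i in range(3):
        low = max(box_a[i], box_b[i])
        high = min(box_a[i + 3], box_b[i + 3])
        if high <= low:
            return 0.0  # no overlap along this axis
        intersection *= high - low

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    union = volume(box_a) + volume(box_b) - intersection
    return intersection / union


if __name__ == "__main__":
    # Hypothetical 4 m x 2 m x 1.5 m ground-truth box and a prediction shifted 1 m in x.
    ground_truth = (0.0, 0.0, 0.0, 4.0, 2.0, 1.5)
    prediction = (1.0, 0.0, 0.0, 5.0, 2.0, 1.5)
    iou = axis_aligned_iou_3d(ground_truth, prediction)
    for category, threshold in IOU_THRESHOLDS.items():
        print(f"{category}: IoU={iou:.2f} counts as a match: {iou >= threshold}")
```

In this example the IoU is 0.6, so the same 1 m offset would count as a match under a 0.5 threshold but not under 0.7, which illustrates why the Vehicle criterion is the stricter one.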

Citation

@inproceedings{CPD,
    title={Commonsense Prototype for Outdoor Unsupervised 3D Object Detection},
    author={Wu, Hai and Zhao, Shijia and Huang, Xun and Wen, Chenglu and Li, Xin and Wang, Cheng},
    booktitle={CVPR},
    year={2024}
}
