hailanyi / CPD

Commonsense Prototype for Outdoor Unsupervised 3D Object Detection (CVPR 2024)

This is the codebase of our CVPR 2024 paper.

Overview

Abstract

CPD (Commonsense Prototype-based Detector) is a high-performance unsupervised 3D object detection framework. CPD first constructs Commonsense Prototypes (CProto), characterized by high-quality bounding boxes and dense points, based on commonsense intuition. It then refines low-quality pseudo-labels using the size prior from CProto. In addition, CPD improves the detection accuracy of sparsely scanned objects using the geometric knowledge from CProto. CPD outperforms state-of-the-art unsupervised 3D detectors on the Waymo Open Dataset (WOD) and the KITTI dataset by a large margin.

Environment

conda create -n spconv2 python=3.9
conda activate spconv2
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 waymo-open-dataset-tf-2-5-0 nuscenes-devkit==1.0.5 spconv-cu111 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython prefetch-generator

Environment we tested:

Ubuntu 18.04
Python 3.9.13
PyTorch 1.8.1
Numba 0.53.1
Spconv 2.1.22 # pip install spconv-cu111
NVIDIA CUDA 11.1
4x 3090 GPUs
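
The installed packages can be compared against the tested environment with a small script. This is a minimal sketch, not part of the codebase: the helper name check_versions is hypothetical, and it assumes the pip distribution names torch, numba, and spconv-cu111 used in the install commands above. It only reads package metadata, so it does not need a GPU.

```python
from importlib import metadata

# Versions listed in the tested environment above; the keys are the pip
# distribution names used in the install commands.
EXPECTED = {
    "torch": "1.8.1",
    "numba": "0.53.1",
    "spconv-cu111": "2.1.22",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # torch wheels carry a local suffix such as "+cu111", so only the
        # part before "+" is compared.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:15s} expected {EXPECTED[name]:8s} found {installed} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only means the setup differs from the one that was tested.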

Prepare Dataset

Waymo Dataset

  • Please download the official Waymo Open Dataset, including the training data training_0000.tar~training_0031.tar and the validation data validation_0000.tar~validation_0007.tar.
  • Extract all of the above xxxx.tar files into the data/waymo/raw_data directory as follows (you should get 798 training tfrecord files and 202 validation tfrecord files):
CPD
├── data
│   ├── waymo
│   │   │── ImageSets
│   │   │── raw_data
│   │   │   │── segment-xxxxxxxx.tfrecord
│   │   │   │── ...
│   │   │── waymo_processed_data_train_val_test
│   │   │   │── segment-xxxxxxxx/
│   │   │   │── ...
│   │   │── pcdet_waymo_track_dbinfos_train_cp.pkl
│   │   │── waymo_infos_test.pkl
│   │   │── waymo_infos_train.pkl
│   │   │── waymo_infos_val.pkl
├── pcdet
├── tools
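
Before generating the dataset information, it can help to confirm that the extraction produced the expected number of files. The sketch below is an illustration only: the function names are hypothetical, and it assumes that training and validation tfrecord files sit together directly inside data/waymo/raw_data, so only the total (798 + 202) can be checked.

```python
from pathlib import Path

# Counts quoted in the extraction step above.
EXPECTED_TRAIN = 798
EXPECTED_VAL = 202


def count_tfrecords(raw_dir):
    """Count *.tfrecord files located directly inside raw_dir."""
    return sum(1 for p in Path(raw_dir).glob("*.tfrecord") if p.is_file())


def check_raw_data(raw_dir="data/waymo/raw_data"):
    """Return (found, expected_total); train and val share one folder."""
    found = count_tfrecords(raw_dir)
    return found, EXPECTED_TRAIN + EXPECTED_VAL


if __name__ == "__main__":
    found, expected = check_raw_data()
    print(f"found {found} tfrecord files, expected {expected}")
```

Run it from the CPD root directory so that the relative path resolves; a lower count usually points to an archive that was not downloaded or extracted.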

Then, generate dataset information:

python3 -m pcdet.datasets.waymo_unsupervised.waymo_unsupervised_dataset --cfg_file tools/cfgs/dataset_configs/waymo_unsupervised/waymo_unsupervised_cproto.yaml

KITTI Dataset

  • Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows (the road planes can be downloaded from [road plane]; they are optional and only used for data augmentation during training):
CPD
├── data
│   ├── kitti
│   │   │── ImageSets
│   │   │── training
│   │   │   ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │── testing
│   │   │   ├──calib & velodyne & image_2
├── pcdet
├── tools
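
The KITTI layout above can be checked the same way before creating the dataset infos. This is a hedged sketch with a hypothetical helper name, missing_kitti_dirs; it only tests that the folders from the tree exist, skips the optional planes folder, and does not inspect file contents.

```python
from pathlib import Path

# Sub-folders taken from the layout above; "planes" is optional and skipped.
REQUIRED = {
    "training": ["calib", "velodyne", "label_2", "image_2"],
    "testing": ["calib", "velodyne", "image_2"],
}


def missing_kitti_dirs(kitti_root="data/kitti"):
    """Return the required folders that are absent under kitti_root."""
    root = Path(kitti_root)
    missing = []
    if not (root / "ImageSets").is_dir():
        missing.append("ImageSets")
    for split, subdirs in REQUIRED.items():
        for sub in subdirs:
            if not (root / split / sub).is_dir():
                missing.append(f"{split}/{sub}")
    return missing


if __name__ == "__main__":
    missing = missing_kitti_dirs()
    print("layout looks complete" if not missing else f"missing: {missing}")
```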

Run the following command to create the dataset infos:

python3 -m pcdet.datasets.kitti.kitti2waymo_dataset create_kitti_infos tools/cfgs/dataset_configs/waymo_unsupervised/kitti2waymo_dataset.yaml

Training

Train using scripts

cd tools
sh dist_train.sh {cfg_file}

The log is saved to log.txt. You can run cat log.txt to follow the training progress.

Or run directly:

cd tools
python train.py 

Evaluation

cd tools
sh dist_test.sh {cfg_file}

The log is saved to log-test.txt. You can run cat log-test.txt to view the test results.

Model Zoo

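| Model | Vehicle 3D AP (L1) | Vehicle 3D AP (L2) | Pedestrian 3D AP (L1) | Pedestrian 3D AP (L2) | Cyclist 3D AP (L1) | Cyclist 3D AP (L2) | Download |
|---|---|---|---|---|---|---|---|
| DBSCAN-single-train | 2.65 | 2.29 | 0 | 0 | 0.25 | 0.20 | --- |
| OYSTER-single-train | 7.91 | 6.78 | 0.03 | 0.02 | 4.65 | 4.05 | oyster_pretrained |
| CPD | 38.74 | 33.37 | 16.53 | 13.72 | 4.28 | 4.13 | cpd_pretrained |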

The IoU thresholds used to evaluate the three categories (Vehicle, Pedestrian, Cyclist) are 0.7, 0.5, and 0.5, respectively.
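
To give a feel for what these thresholds mean, the sketch below computes the IoU of two boxes and compares it with the per-category threshold. It is a deliberate simplification and not the evaluation code: it assumes axis-aligned boxes given as (cx, cy, cz, dx, dy, dz), whereas the actual evaluation uses rotated boxes, and the function names are hypothetical.

```python
# Per-category IoU thresholds stated above.
IOU_THRESHOLDS = {"Vehicle": 0.7, "Pedestrian": 0.5, "Cyclist": 0.5}


def axis_aligned_iou_3d(box_a, box_b):
    """IoU of two axis-aligned boxes given as (cx, cy, cz, dx, dy, dz)."""
    inter = 1.0
    for axis in range(3):
        # Overlap of the two intervals along this axis (0 if disjoint).
        lo = max(box_a[axis] - box_a[axis + 3] / 2, box_b[axis] - box_b[axis + 3] / 2)
        hi = min(box_a[axis] + box_a[axis + 3] / 2, box_b[axis] + box_b[axis + 3] / 2)
        inter *= max(0.0, hi - lo)
    vol_a = box_a[3] * box_a[4] * box_a[5]
    vol_b = box_b[3] * box_b[4] * box_b[5]
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, category):
    """True if the prediction overlaps the ground truth enough to count."""
    return axis_aligned_iou_3d(pred, gt) >= IOU_THRESHOLDS[category]


if __name__ == "__main__":
    gt = (0.0, 0.0, 0.0, 4.0, 2.0, 1.5)    # a car-sized box
    pred = (1.0, 0.0, 0.0, 4.0, 2.0, 1.5)  # same size, shifted 1 m along x
    print(axis_aligned_iou_3d(pred, gt))   # intersection 9, union 15
    print(is_match(pred, gt, "Vehicle"))
```

In this example a 1 m shift of a 4 m long box yields an IoU of 9/15 = 0.6, which falls below the 0.7 Vehicle threshold even though it would pass a 0.5 threshold. The stricter Vehicle threshold therefore demands noticeably tighter localization.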

Citation

@inproceedings{CPD,
    title={Commonsense Prototype for Outdoor Unsupervised 3D Object Detection},
    author={Wu, Hai and Zhao, Shijia and Huang, Xun and Wen, Chenglu and Li, Xin and Wang, Cheng},
    booktitle={CVPR},
    year={2024}
}
