bargul / AutoShape

ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection



Auto-labeled Car Shape for KITTI

We release our auto-labeled car shape data for KITTI in COCO format. Each car instance is assigned a 3D model. The training and validation sets, annotated with 3D models of 3000 vertices each, can be downloaded from Google Drive.

Data Format

# 2D/3D keypoints are added to each KITTI car instance annotation
annotations: [
    {
        '2dkeypoints': list,  # (3000 + 9) * 3 (u, v, visibility)
        '3dkeypoints': list   # (3000 + 9) * 3 (x, y, z in the model's local coordinates)
    }, ...
]
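As a rough illustration of how these fields might be read, the sketch below groups each keypoint list into triples and checks the count. It assumes the lists are stored flat, as COCO keypoints usually are (nested lists are also accepted), and that the file has a top-level `annotations` key. The helper names `to_triples` and `load_keypoints` are hypothetical and are not part of the AutoShape code.

```python
import json

NUM_SHAPE_POINTS = 3000  # vertices of the assigned 3D model (from the README)
NUM_EXTRA_POINTS = 9     # the "+ 9" keypoints listed in the data format
NUM_KEYPOINTS = NUM_SHAPE_POINTS + NUM_EXTRA_POINTS


def to_triples(values, expected=NUM_KEYPOINTS):
    """Group a flat list [a0, b0, c0, a1, ...] into (a, b, c) tuples.

    Assumption: keypoints are stored flat; a list that is already nested
    as [[a, b, c], ...] is converted to tuples unchanged.
    """
    if values and isinstance(values[0], (list, tuple)):
        triples = [tuple(v) for v in values]
    else:
        if len(values) % 3 != 0:
            raise ValueError("keypoint list length is not a multiple of 3")
        triples = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    if len(triples) != expected:
        raise ValueError(f"expected {expected} keypoints, got {len(triples)}")
    return triples


def load_keypoints(json_path, expected=NUM_KEYPOINTS):
    """Yield (2D triples, 3D triples) for every annotation in the file."""
    with open(json_path) as f:
        data = json.load(f)
    for ann in data["annotations"]:
        yield (to_triples(ann["2dkeypoints"], expected),
               to_triples(ann["3dkeypoints"], expected))
```

The `expected` argument is there so that files with a different number of keypoints per instance can be checked with the same helper.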

Paddle Implementation (incomplete)


  • Ubuntu 18.04
  • Python 3.7
  • PaddlePaddle 2.1.0
  • CUDA 10.2

PaddlePaddle installation

conda create -n paddle_latest python=3.7

conda activate paddle_latest

pip install paddlepaddle -i

pip install -r requirement.txt

PyTorch Implementation


  • Ubuntu 18.04
  • Python 3.6


  1. Install PyTorch 1.0.0:
    conda install pytorch=1.0.0 torchvision==0.2.1 cuda100 -c pytorch
  2. Install the requirements
    pip install -r requirements.txt
  3. Compile the deformable convolution operators (from DCNv2).
    cd $AutoShape_ROOT/pytorch/src/lib/models/networks/ 
    cd DCNv2
  4. Compile iou3d (from PointRCNN).
    cd $AutoShape_ROOT/pytorch/src/lib/utiles/iou3d
    python install

Dataset preparation

Please download the official KITTI 3D object detection dataset and the AutoShape keypoint annotations, then organize the downloaded files as follows:

├── kitti_format
│   ├── data
│   │   ├── kitti
│   │   │   ├── annotations_48 / kitti_train.json .....
│   │   │   ├── annotations_16 / kitti_train.json .....
│   │   │   ├── calib /000000.txt .....
│   │   │   ├── image (left [0-7480], right [7481-14961] for data augmentation)
│   │   │   ├── label /000000.txt .....
│   │   │   ├── train.txt val.txt trainval.txt
├── src
├── requirements.txt
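Before training, it can help to confirm that the files are where the layout above expects them. The following is a minimal sketch, not part of the repository: `missing_entries` and `right_image_index` are hypothetical helpers, the directory name `image` is read off the tree above, and the left-to-right index offset of 7481 is an assumption inferred from the index ranges shown there.

```python
from pathlib import Path

NUM_LEFT_IMAGES = 7481  # left images are indexed 0..7480 in the layout above

# Entries relative to <root>/data/kitti, taken from the tree above.
# Only the 48-point annotations used by the training command are checked.
EXPECTED = [
    "annotations_48/kitti_train.json",
    "calib",
    "image",
    "label",
    "train.txt",
    "val.txt",
    "trainval.txt",
]


def missing_entries(root):
    """Return the expected files/directories that do not exist under root."""
    base = Path(root) / "data" / "kitti"
    return [rel for rel in EXPECTED if not (base / rel).exists()]


def right_image_index(left_index):
    """Map a left image index to its right image index.

    Assumption: right images follow the left ones in the same order, so
    the right view of image i is stored at index i + 7481.
    """
    if not 0 <= left_index < NUM_LEFT_IMAGES:
        raise ValueError(f"left index out of range: {left_index}")
    return left_index + NUM_LEFT_IMAGES
```

Calling `missing_entries("./kitti_format")` from the `pytorch` directory should return an empty list when the layout is complete.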


Run the following command to train the model with a DLA-34 backbone and 57 (48 + 9) keypoints on 2 GPUs.

cd pytorch
python ./src/ --data_dir ./kitti_format --exp_id AutoShape_dla34_trainval_rightaug --arch dla_34 --num_joints 57 --sample_pc 48 --batch_size 16 --master_batch_size 8 --lr 1.5e-4 --gpus 0,1 --num_epochs 200 --stereo_aug
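The `--batch_size 16 --master_batch_size 8 --gpus 0,1` combination can be read as a per-GPU split of each batch. The sketch below assumes AutoShape keeps the convention of the CenterNet code it borrows from, where the first GPU receives `master_batch_size` samples and the remainder is divided among the other GPUs. The function `chunk_sizes` is a hypothetical re-statement of that idea, not the repository's own code, and its single-GPU branch is a simplification.

```python
def chunk_sizes(batch_size, master_batch_size, num_gpus):
    """Split a batch across GPUs, giving the first GPU a fixed share.

    Assumed convention (borrowed from CenterNet): GPU 0 takes
    master_batch_size samples; the rest is spread as evenly as possible
    over the remaining GPUs, earlier GPUs receiving any leftover samples.
    """
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    if not 0 < master_batch_size <= batch_size:
        raise ValueError("master_batch_size must be in (0, batch_size]")
    if num_gpus == 1:
        # Simplification: a single GPU processes the whole batch.
        return [batch_size]

    rest = batch_size - master_batch_size
    others = num_gpus - 1
    sizes = [master_batch_size]
    for i in range(others):
        size = rest // others
        if i < rest % others:
            size += 1  # hand out leftover samples one at a time
        sizes.append(size)
    return sizes


# The training command above: 16 samples per batch, 8 on the first of 2 GPUs.
print(chunk_sizes(16, 8, 2))  # [8, 8]
```

A smaller `master_batch_size` is a common way to leave memory headroom on the first GPU, which typically also holds gathered outputs. With two GPUs and the values above, the split is even.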


python ./src/ --demo  test_image_dir_path --calib_dir calib_dir_path --load_model trained_model_path --gpus 0 --arch dla_34 --num_joints 57 --sample_pc 48

KITTI Test Server Evaluation Model

  • Training on KITTI trainval split and evaluation on test server.
    • Backbone: DLA-34
    • Num Keypoints: 48 + 9
    • Model: (Google Drive)

    Class    Easy     Moderate    Hard
    Car      22.47    14.17       11.36



AutoShape is released under the MIT License (refer to the LICENSE file for details). Some of the code is borrowed from RTM3D, CenterNet, dla (DLA network), DCNv2 (deformable convolutions), iou3d and kitti_eval (KITTI dataset evaluation). Please refer to the original licenses of these projects.


If you find this project useful for your research, please use the following BibTeX entry.

  title={AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection},
  author={Liu, Zongdai and Zhou, Dingfu and Lu, Feixiang and Fang, Jin and Zhang, Liangjun},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},




