SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation


This repository is the official implementation of our paper SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation. For more details, please see our paper.

Introduction

SMOKE is a real-time monocular 3D object detector for autonomous driving. It runs in about 30 ms on a single NVIDIA TITAN Xp GPU. Part of the code comes from CenterNet, maskrcnn-benchmark, and Detectron2.

The performance on KITTI 3D detection (3D/BEV) is as follows:

Class	Easy	Moderate	Hard
Car	14.17 / 21.08	9.88 / 15.13	8.63 / 12.91
Pedestrian	5.16 / 6.22	3.24 / 4.05	2.53 / 3.38
Cyclist	1.11 / 1.62	0.60 / 0.98	0.47 / 0.74

The pretrained weights can be downloaded here.

Requirements

All code is tested under the following environment:

Ubuntu 16.04
Python 3.7
PyTorch 1.3.1
CUDA 10.0
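
To compare a local setup against the versions listed above, a small script along these lines may help. It is only a sketch: the helper name report_environment is hypothetical and not part of this repository, and it reports None for PyTorch and CUDA when PyTorch is not installed.

```python
import sys
from importlib import import_module


def report_environment():
    """Collect the local versions that correspond to the list above."""
    info = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        torch = import_module("torch")
    except ImportError:
        # PyTorch is not installed in this interpreter.
        info["torch"] = None
        info["cuda"] = None
    else:
        info["torch"] = torch.__version__
        # CUDA version PyTorch was built with; None for CPU-only builds.
        info["cuda"] = torch.version.cuda
    return info


if __name__ == "__main__":
    for name, version in report_environment().items():
        print(f"{name}: {version}")
```

Other versions may also work; the list above only states what the code was tested with.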

Dataset

We train and test our model on the official KITTI 3D Object Detection dataset. Please download the dataset first and organize it in the following structure:

kitti
├──training
│    ├──calib
│    ├──label_2
│    ├──image_2
│    └──ImageSets
└──testing
     ├──calib
     ├──image_2
     └──ImageSets
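
Before training, it can be useful to confirm that the directory tree matches the structure above. The following is a minimal sketch, not part of the repository: the function name missing_kitti_dirs and the constant EXPECTED_LAYOUT are made up for illustration, and the check only looks for the directories themselves, not for the files inside them.

```python
from pathlib import Path

# Sub-directories copied from the layout shown above.
EXPECTED_LAYOUT = {
    "training": ["calib", "label_2", "image_2", "ImageSets"],
    "testing": ["calib", "image_2", "ImageSets"],
}


def missing_kitti_dirs(root):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    missing = []
    for split, subdirs in EXPECTED_LAYOUT.items():
        for name in subdirs:
            if not (root / split / name).is_dir():
                missing.append(f"{split}/{name}")
    return missing


if __name__ == "__main__":
    # "datasets/kitti" is the link location used in the Setup section.
    problems = missing_kitti_dirs("datasets/kitti")
    print("layout OK" if not problems else f"missing: {problems}")
```

An empty list means every directory from the tree was found.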

Setup

We use conda to manage the environment:

conda create -n SMOKE python=3.7

Clone this repo:

git clone https://github.com/lzccccc/SMOKE

Build the code:

python setup.py build develop

Link to dataset directory:

mkdir datasets
ln -s /path_to_kitti_dataset datasets/kitti
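
The two shell commands above fail if they are run a second time, because datasets and the link already exist. A repeatable variant can be sketched in Python; link_dataset is a hypothetical helper, not something the repository provides, and it assumes the platform allows creating symbolic links.

```python
from pathlib import Path


def link_dataset(kitti_root, repo_root="."):
    """Create <repo_root>/datasets/kitti as a symlink to `kitti_root`."""
    datasets_dir = Path(repo_root) / "datasets"
    datasets_dir.mkdir(exist_ok=True)  # like `mkdir datasets`, but repeatable
    link = datasets_dir / "kitti"
    # Leave an existing entry (directory or symlink) untouched.
    if not link.exists() and not link.is_symlink():
        link.symlink_to(Path(kitti_root).resolve(), target_is_directory=True)
    return link
```

Calling link_dataset("/path_to_kitti_dataset") from the repository root has the same effect as the commands above on the first run and does nothing on later runs.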

Getting started

First check the config file under configs/.

We train the model on 4 GPUs with a total batch size of 32:

python tools/plain_train_net.py --num-gpus 4 --config-file "configs/smoke_gn_vector.yaml"
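
As a quick sanity check on these numbers: assuming the batch size of 32 is the global batch and is split evenly across GPUs (the README does not say so explicitly, but this is the usual convention in Detectron2-style training scripts), each GPU processes 8 images per step. The helper per_gpu_batch_size below is purely illustrative and is not part of the training code.

```python
def per_gpu_batch_size(total_batch_size, num_gpus):
    """Split a global batch evenly across GPUs, rejecting uneven splits."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by num_gpus")
    return total_batch_size // num_gpus


print(per_gpu_batch_size(32, 4))  # 8 images per GPU
```

Under the same assumption, a run with fewer GPUs and an unchanged config would place the whole batch on fewer devices, so GPU memory may become the limiting factor.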

For single GPU training, simply run:

python tools/plain_train_net.py --config-file "configs/smoke_gn_vector.yaml"

We currently only support single GPU testing:

python tools/plain_train_net.py --eval-only --config-file "configs/smoke_gn_vector.yaml"

Acknowledgement

CenterNet

maskrcnn-benchmark

Detectron2

Citations

Please cite our paper if you find SMOKE helpful for your research.

@article{liu2020SMOKE,
  title={{SMOKE}: Single-Stage Monocular 3D Object Detection via Keypoint Estimation},
  author={Zechen Liu and Zizhang Wu and Roland T{\'o}th},
  journal={arXiv preprint arXiv:2002.10111},
  year={2020}
}

MIT License