THU-DA-6D-Pose-Group / GDR-Net

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. (CVPR 2021)

Home Page:https://github.com/THU-DA-6D-Pose-Group/GDR-Net

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

GDR-Net

This repo provides the PyTorch implementation of the work:

Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji. GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. In CVPR 2021. [Paper][ArXiv][Video][bibtex]

News

  • [2023/10] An enhanced version of this work, GPose2023 by Zhang et al., won BOP Challenge @ ICCV 2023. Congratulations!
  • [2022/10] An enhanced version of this work, GDRNPP by Liu et al., won most of the awards on BOP Challenge @ ECCV 2022. Congratulations!
  • [2021/08] An extension of this work, SO-Pose by Di et al. (ICCV 2021), has been released (SO-Pose code, mirror).

Overview

Requirements

  • Ubuntu 16.04/18.04, CUDA 10.1/10.2, python >= 3.6, PyTorch >= 1.6, torchvision
  • Install detectron2 from source
  • sh scripts/install_deps.sh
  • Compile the cpp extension for farthest points sampling (fps):
    sh core/csrc/compile.sh
    

Datasets

Download the 6D pose datasets (LM, LM-O, YCB-V) from the BOP website and VOC 2012 for background images. Please also download the image_sets and test_bboxes from here (BaiduNetDisk, password: qjfk | Cloud.THU, password: fMNOASFHW0E8R72357T6mn9).

The structure of datasets folder should look like below:

# recommend using soft links (ln -sf)
datasets/
├── BOP_DATASETS
    ├──lm
    ├──lmo
    ├──ycbv
├── lm_imgn  # the OpenGL rendered images for LM, 1k/obj
├── lm_renders_blender  # the Blender rendered images for LM, 10k/obj (pvnet-rendering)
├── VOCdevkit
  • lm_imgn comes from DeepIM, which can be downloaded here (BaiduNetDisk, password: vr0i | Cloud.THU, password: A097fN8a07ufn0u70).

  • lm_renders_blender comes from pvnet-rendering, note that we do not need the fused data.

Training GDR-Net

./core/gdrn_modeling/train_gdrn.sh <config_path> <gpu_ids> (other args)

Example:

./core/gdrn_modeling/train_gdrn.sh configs/gdrn/lm/a6_cPnP_lm13.py 0  # multiple gpus: 0,1,2,3
# add --resume if you want to resume from an interrupted experiment.

Our trained GDR-Net models can be found here (BaiduNetDisk, password: kedv | Cloud.THU, password: 0M,oadfu9uIU).
(Note that the models for BOP setup in the supplement were trained using a refactored version of this repo (not compatible), they are slightly better than the models provided here.)

Evaluation

./core/gdrn_modeling/test_gdrn.sh <config_path> <gpu_ids> <ckpt_path> (other args)

Example:

./core/gdrn_modeling/test_gdrn.sh configs/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e.py 0 output/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e/gdrn_lmo_real_pbr.pth

Citation

If you find this useful in your research, please consider citing:

@InProceedings{Wang_2021_GDRN,
    title     = {{GDR-Net}: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation},
    author    = {Wang, Gu and Manhardt, Fabian and Tombari, Federico and Ji, Xiangyang},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {16611-16621}
}

About

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. (CVPR 2021)

https://github.com/THU-DA-6D-Pose-Group/GDR-Net

License:Apache License 2.0


Languages

Language:Python 97.3%Language:C 1.7%Language:GLSL 0.5%Language:C++ 0.4%Language:Shell 0.2%