venkatramnank/tr3d-3dof

TR3D-3dof: Towards Real-Time Indoor 3D Object Detection for 3-dof rotation

This repository contains an implementation of TR3D, a 3D object detection method introduced in the paper:

TR3D: Towards Real-Time Indoor 3D Object Detection
Danila Rukhovich, Anna Vorontsova, Anton Konushin
Samsung Research
https://arxiv.org/abs/2302.02858

The following implementation of TR3D accounts for all three rotation in all 3 dimensions (axes), i.e. yaw, pitch and roll.

Installation

You can install all required packages manually. This implementation is based on mmdetection3d framework. Please refer to the original installation guide getting_started.md, including MinkowskiEngine installation, replacing open-mmlab/mmdetection3d with samsunglabs/tr3d.

Most of the TR3D-related code locates in the following files: detectors/mink_single_stage.py, detectors/tr3d_ff.py, dense_heads/tr3d_head.py, necks/tr3d_neck.py.

Click to expand installation trials

Software	Version	Status
CUDA	11.3	Cannot Install : Due to the driver version mismatch
CUDA	11.7	Cannot Install : Due to the driver version mismatch
CUDA	12.3	Working
Pytorch	1.12.1 + cu 11.3	Working
cudatoolkit	11.7	Working
cudatoolkit	11.3	Cannot Install X : Due to unmet dependencies
Minkowski Engine	0.5.3	Working
gcc and g++ (important)	9.5.0	Working

Steps to install TR3D package and Minkowski engine

Create a conda env with python 3.8

conda create -n tr3d python=3.8

Use the above version of packages and install them. You may refer to https://github.com/SamsungLabs/tr3d/blob/main/docker/Dockerfile for the versions of all the packages.

Set the gcc and g++ to be of version 9.5.0 (or upto 10)

Make sure nvcc is installed

nvcc --version

Clone the Minkowski Engine repository and make sure to follow the below instructions

export CUDA_HOME=/usr/local/cuda-11.x
pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps --install-option="--blas_include_dirs=${CONDA_PREFIX}/include" --install-option="--blas=openblas"
# Or if you want local MinkowskiEngine
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas

The code that worked for my system (Ubuntu 20.04 with CUDA 12.1)

pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps

Getting Started

Please see getting_started.md for basic usage examples. We follow the mmdetection3d data preparation protocol described in scannet, sunrgbd, and s3dis.

For physion data preperation refer to physion.

Setting up config files

One can refer to configs/tr3d/tr3d_physion_fly_config.py as a template.

Training

To start training, run train with TR3D configs:

python tools/train.py configs/tr3d/tr3d_physion_fly_config.py

Note: In order to control the interval of evaluation, change interval at this line.

To train without validation:

python tools/train.py configs/tr3d/tr3d_physion_fly_config.py --no-validate

Testing

Test pre-trained model using test with TR3D configs:

python tools/test.py configs/tr3d/tr3d_physion_fly_config.py \
    work_dirs/<location of trained model>.pth --eval mAP

Visualization

Visualizations can be created with test script. For better visualizations, you may set score_thr in configs to 0.3:

python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
    work_dirs/tr3d_scannet-3d-18class/latest.pth --show \
    --show-dir work_dirs/<location to store output visualizations>

The above command stores the output in the location provided.

To view the output visualization:

python physion/output_visualizer.py pilot_towers_nb2_fr015_SJ010_mono0_dis0_occ0_tdwroom_unstable_0014<(location of file)>

Models

The metrics are obtained in 5 training runs which utilizes the support type of videos. The runs are on a single Nvidia RTX 3080Ti (12GB) GPU. The following table is the result of validation runs. The mAP and mAR calcuations depend on the exact 3D IOU that has been slightly modified to handle certain edge cases

Note : The access to the models will be shortly updated. Results for dominoes will also be updated.

TR3D 3D Detection

Dataset	Loss	Class	AP@0.25	AR@0.25	AP@0.50	AR@0.50	Download
Physion Support	Corners + Huber loss	object	0.1608	0.5322	0.0803	0.3606	TBA
Physion Support	Corners + Chamfer loss	object	0.7278	0.8733	0.4813	0.7212	TBA
Physion Support	12 direct + Huber loss	object	0.6609	0.9669	0.2871	0.6472	TBA

corners regards to usage of loss wrt the 8 corners obtained from the 12d values learnt. 12d direct regards to directly using thos values with the loss.

venkatramnank / tr3d-3dof

TR3D-3dof: Towards Real-Time Indoor 3D Object Detection for 3-dof rotation

Installation

Steps to install TR3D package and Minkowski engine

Getting Started

Models

Example Detections

About

Languages