VFDepth

Self-supervised surround-view depth estimation with volumetric feature fusion

Jung-Hee Kim*, Junhwa Hur*, Tien Nguyen, and Seong-Gyun Jeong - NeurIPS 2022
Link to the paper: Link

@inproceedings{kimself,
  title={Self-supervised surround-view depth estimation with volumetric feature fusion},
  author={Kim, Jung Hee and Hur, Junhwa and Nguyen, Tien Phuoc and Jeong, Seong-Gyun},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2022},
}

Introduction

We introduce a volumetric feature representation for self-supervised surround-view depth estimation, which not only outputs metric-scale depth and canonical camera motion, but also synthesizes a depth map at a novel viewpoint.
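As a rough illustration of the core idea (a minimal sketch only, not the authors' implementation; the function name, tensor shapes, and camera conventions are assumptions), per-camera image features can be back-projected onto a shared voxel grid and averaged across the cameras that see each voxel:

import torch
import torch.nn.functional as F

def fuse_volumetric_features(feats, intrinsics, extrinsics, voxel_centers):
    # feats:         (N, C, H, W) per-camera feature maps
    # intrinsics:    (N, 3, 3)    camera intrinsic matrices
    # extrinsics:    (N, 4, 4)    world-to-camera transforms
    # voxel_centers: (V, 3)       voxel centers in the canonical (world) frame
    # returns:       (V, C)       fused per-voxel features
    N, C, H, W = feats.shape
    V = voxel_centers.shape[0]
    homo = torch.cat([voxel_centers, torch.ones(V, 1)], dim=1)      # (V, 4) homogeneous coords
    fused = torch.zeros(V, C)
    weight = torch.zeros(V, 1)
    for i in range(N):
        cam = (extrinsics[i] @ homo.T).T[:, :3]                     # voxel centers in camera frame
        in_front = cam[:, 2] > 1e-3                                 # keep voxels in front of the camera
        pix = (intrinsics[i] @ cam.T).T
        pix = pix[:, :2] / pix[:, 2:3].clamp(min=1e-3)              # perspective divide -> pixel coords
        gx = 2.0 * pix[:, 0] / (W - 1) - 1.0                        # normalize to [-1, 1] for grid_sample
        gy = 2.0 * pix[:, 1] / (H - 1) - 1.0
        grid = torch.stack([gx, gy], dim=-1).view(1, V, 1, 2)
        sampled = F.grid_sample(feats[i:i + 1], grid, align_corners=True)
        sampled = sampled.view(C, V).T                              # (V, C) features sampled for each voxel
        visible = (in_front & (gx.abs() <= 1) & (gy.abs() <= 1)).float().unsqueeze(1)
        fused += sampled * visible
        weight += visible
    return fused / weight.clamp(min=1.0)                            # average over cameras that see the voxel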

Installation

  • Install the required libraries using the requirements.txt file.
    (Note that we leverage packnet-sfm and dgp as submodules and therefore also need to install the libraries required by these submodules.)
  • To install both the required libraries and the submodules, follow the instructions below:
git submodule init
git submodule update
pip install -r requirements.txt

Datasets

DDAD dataset

  • The DDAD dataset can be downloaded by running:
curl -s https://tri-ml-public.s3.amazonaws.com/github/DDAD/datasets/DDAD.tar | tar -xv
  • The dataset path needs to be specified in configs/<config-name>.
  • The default dataset path is data/<dataset-name>.
  • We manually created mask images for each scene of the DDAD dataset; they are provided in dataset/ddad_mask (a small sanity-check sketch follows this list).
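The sketch below is a hypothetical helper, not part of the repository; the paths assume a data/DDAD root and the provided dataset/ddad_mask directory.

import os

def check_ddad_layout(data_root="data/DDAD", mask_root="dataset/ddad_mask"):
    # Verify that the DDAD data and the provided mask images are where the
    # configs expect them before starting training.
    for path in (data_root, mask_root):
        if not os.path.isdir(path):
            raise FileNotFoundError(f"Expected directory not found: {path}")
    print("DDAD data and mask directories found.")

if __name__ == "__main__":
    check_ddad_layout()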

Main Results

Model     Scale    Abs.Rel.  Sq.Rel.  RMSE     RMSE log  δ<1.25  δ<1.25²  δ<1.25³
VFDepth   Metric   0.221     4.001    13.406   0.340     0.688   0.868    0.932
VFDepth   Median   0.221     3.884    13.225   0.328     0.692   0.877    0.939
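For reference, the table uses the standard self-supervised depth evaluation metrics. A generic sketch of how they are computed (not the repository's evaluation code; gt and pred are assumed to be arrays of valid depth values in meters, with pred median-scaled beforehand for the "Median" row):

import numpy as np

def depth_metrics(gt, pred):
    # Threshold accuracies δ < 1.25, 1.25², 1.25³
    thresh = np.maximum(gt / pred, pred / gt)
    d1 = (thresh < 1.25).mean()
    d2 = (thresh < 1.25 ** 2).mean()
    d3 = (thresh < 1.25 ** 3).mean()
    # Error metrics
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, d1, d2, d3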

Get Started

Training

The surround-view fusion depth estimation model can be trained from scratch.

  • By default, results are saved under results/<config-name>, including the trained model and TensorBoard logs for both training and validation.

Single-GPU
To train the model using a single GPU:
(Note that, due to the packnet-sfm submodule, UserWarnings occur repeatedly and are therefore suppressed during training.)

python -W ignore train.py --config_file='./configs/surround_fusion.yaml'
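The -W ignore flag silences all Python warnings for the run; if you prefer to suppress only the repetitive UserWarnings from inside a script, a generic equivalent (not repository code) is:

import warnings

# Suppress the repetitive UserWarnings raised by the packnet-sfm submodule.
warnings.filterwarnings("ignore", category=UserWarning)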

Multi-GPU
To train the model using multiple GPUs:

  • Enable distributed data parallel (DDP) by setting ddp:ddp_enable to True in the config file.
    • The GPUs and the world size (number of GPUs) must be specified (e.g., gpus = [0, 1, 2, 3], worldsize = 4).
  • The DDP address and port can be configured in ddp.py (a generic initialization sketch follows the command below).
python -W ignore train.py --config_file='./configs/ddp/surround_fusion_ddp.yaml' 
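For reference, the usual PyTorch initialization that a ddp.py-style setup performs looks roughly like the sketch below (generic PyTorch, not the repository's exact code; the address, port, and world size correspond to the settings mentioned above):

import os
import torch
import torch.distributed as dist

def setup_ddp(rank, world_size, addr="localhost", port="12355"):
    # Each spawned process joins the process group via a shared address and port.
    os.environ["MASTER_ADDR"] = addr
    os.environ["MASTER_PORT"] = port
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

def cleanup_ddp():
    # Tear down the process group after training finishes.
    dist.destroy_process_group()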

Evaluation

To evaluate a model trained from scratch, run:

python -W ignore eval.py --config_file='./configs/<config-name>'
  • The model weights need to be specified under load: weights in the config file.

Evaluation results using the pretrained model can be obtained with the following command:

python -W ignore eval.py --config_file='./configs/ddp/surround_fusion.yaml' \
                         --weight_path='<pretrained-weight-path>'

Depth Synthesis

To obtain synthesized depth results, train the model from scratch by running:

python -W ignore train.py --config_file='./configs/surround_fusion_augdepth.yaml'

Then evaluate the model by running:

python -W ignore eval.py --config_file='./configs/surround_fusion_augdepth.yaml'
  • The synthesized results are stored in results/<config-name>/syn_results.

License

This repository is released under the Apache 2.0 license.
