feature_metric_depth

This is the official code for the method described in

Feature-metric Loss for Self-supervised Learning of Depth and Egomotion

ECCV 2020

If you find our work useful in your research please consider citing our paper:

@inproceedings{shu2020featdepth,
  title={Feature-metric Loss for Self-supervised Learning of Depth and Egomotion},
  author={Shu, Chang and Yu, Kun and Duan, Zhixiang and Yang, Kuiyuan},
  booktitle={ECCV},
  year={2020}
}

Setup

Our code is based on mmcv for distributed learning. To make training and testing convenient, we provide our Anaconda environment: download it, extract it into your Anaconda environments folder, and use the Python interpreter inside it to run our code. A machine with CUDA 10 installed is also required.

KITTI training data

Our training data is the same as that used by other self-supervised monocular depth estimation methods. Please refer to monodepth2 to prepare the training data.

Pretrained weights

We provide weights for the autoencoder, our model trained on KITTI raw data, our model refined by online refinement on the test split of KITTI raw data, our model trained on KITTI odometry, our model trained on EuRoC, and our model trained on NYU.

API

We provide an API for predicting depth and pose from an image sequence and visualizing some results. The scripts are stored in the 'scripts' folder.

draw_odometry.py is used to plot several analytical curves and obtain standard KITTI odometry evaluation results.

eval_pose.py is used to obtain KITTI odometry evaluation results.

eval_depth.py is used to obtain KITTI depth evaluation results.

infer.py is used to generate depth maps from given models.

Training

You can use the following command to launch distributed training of our model:

/path/to/python -m torch.distributed.launch --master_port=9900 --nproc_per_node=1 train.py --config /path/to/cfg_kitti_fm.py --work_dir /dir/for/saving/weights/and/logs

Here, nproc_per_node is the number of GPUs you want to use.
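
If you prefer to launch training from a script, the command above can be assembled programmatically. The following is only a minimal sketch: the helper name build_launch_command is hypothetical and not part of this repository, and it uses only the flags shown in the command above. It builds the argument list and does not start training itself.

```python
import sys


def build_launch_command(config_path, work_dir, num_gpus=1,
                         master_port=9900, python=sys.executable):
    """Return the training command as an argument list.

    Hypothetical helper: it mirrors the flags of the command shown above
    and nothing else. num_gpus is passed as nproc_per_node.
    """
    return [
        python, "-m", "torch.distributed.launch",
        f"--master_port={master_port}",
        f"--nproc_per_node={num_gpus}",
        "train.py",
        "--config", config_path,
        "--work_dir", work_dir,
    ]


if __name__ == "__main__":
    # Placeholder paths, as in the command above.
    cmd = build_launch_command("/path/to/cfg_kitti_fm.py",
                               "/dir/for/saving/weights/and/logs",
                               num_gpus=2)
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run from the repository root if you want to start training from Python.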

Configurations

We provide a variety of config files for training on different datasets. They are stored in the config folder.

For example, 'cfg_kitti_fm.py' trains our model on the KITTI dataset with the autoencoder weights loaded from the pretrained weights we provide and kept fixed during training. This mode is preferred when your GPU memory is lower than 16 GB. 'cfg_kitti_fm_joint.py' trains our model on the KITTI dataset with the autoencoder trained jointly with depthnet and posenet. In this config we rescale the input resolution so that training fits in 12 GB of GPU memory, which slightly reduces performance. You can modify the input resolution according to your computational resources.

For modifying config files, please refer to cfg_kitti_fm.py.

Online refinement

We provide a config file for online refinement. You can use cfg_kitti_fm_refine.py to refine a model trained on KITTI raw data by continuing training on the test data. For the online refinement settings, please refer to cfg_kitti_fm_refine.py in the config folder.

Finetuning

If you want to finetune from existing weights, change the 'resume_from' term in the config file from 'None' to the path of an existing pretrained weight file.
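
As an illustration, the edited term in a config file might look like the excerpt below. The path is a placeholder and not a file shipped with this repository; all other terms in the config file stay unchanged.

```python
# Excerpt of a config file such as cfg_kitti_fm.py; only this term changes.
# The path is a placeholder: point it at your own pretrained weight file.
resume_from = '/path/to/pretrained/weights.pth'  # previously: resume_from = None
```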

Notes

Our model predicts inverse depth. To get real depth when training a stereo model, convert the inverse depth to depth and then multiply it by 36.
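
The conversion described in this note can be sketched with NumPy as follows. The function name and the eps clamp (which avoids division by zero) are assumptions added for illustration and are not part of this repository; the factor 36 is the one stated above.

```python
import numpy as np

STEREO_SCALE = 36.0  # scale factor stated in the note above


def inverse_depth_to_depth(inv_depth, scale=STEREO_SCALE, eps=1e-6):
    """Convert predicted inverse depth to depth, then apply the scale.

    eps is an assumed lower bound on the inverse depth so that a zero
    prediction does not cause a division by zero.
    """
    inv_depth = np.asarray(inv_depth, dtype=np.float64)
    depth = 1.0 / np.maximum(inv_depth, eps)  # inverse depth -> depth
    return depth * scale                      # multiply by 36


if __name__ == "__main__":
    # Toy 2x2 inverse depth map standing in for a network output.
    toy_inv_depth = np.array([[0.5, 1.0], [2.0, 4.0]])
    print(inverse_depth_to_depth(toy_inv_depth))
```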
