By Yikai Wang, Fuchun Sun, Ming Lu, Anbang Yao.
For the semantic segmentation task on NYUDv2 (the official dataset), we provide a link to download the dataset here. The dataset was originally preprocessed in this repository, and we add depth data to it. Please modify the data paths in the code at the locations marked with the comment 'Modify data path'.
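The lines to edit look roughly like the sketch below; all variable names here are illustrative, not the repository's actual ones, so search the code for the 'Modify data path' comments to find the real locations:

import os

# Modify data path: point this at your local copy of the dataset.
# (All names below are illustrative placeholders.)
data_root = '/path/to/nyudv2'
rgb_dir = os.path.join(data_root, 'images')    # RGB frames
depth_dir = os.path.join(data_root, 'depth')   # aligned depth maps
label_dir = os.path.join(data_root, 'labels')  # segmentation annotations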
python==3.6.2
pytorch==1.0.0
torchvision==0.2.2
imageio==2.4.1
numpy==1.16.2
scikit-learn==0.20.2
scipy==1.1.0
opencv-python==4.0.0
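One way to reproduce this environment with pip (note that PyTorch is published on PyPI as torch, and the exact patch versions available for some packages may differ slightly from the pins above):

pip install torch==1.0.0 torchvision==0.2.2 imageio==2.4.1 numpy==1.16.2 scikit-learn==0.20.2 scipy==1.1.0 opencv-python==4.0.0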
First, enter the segmentation directory:
cd semantic_segmentation
To train the segmentation model with RGB and depth inputs (the default setting uses RefineNet with a ResNet-101 backbone):
python main.py --gpu 0 -c exp_name # or --gpu 0 1 2
To evaluate a trained model:
python main.py --gpu 0 --resume path_to_pth --evaluate # optionally use --save-img to visualize results
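For context, RGB-D segmentation on NYUDv2 is typically reported as mean IoU over the 40 semantic classes. A minimal, self-contained sketch of that metric (illustrative only, not the repository's actual evaluation code):

import numpy as np

def mean_iou(pred, label, num_classes=40, ignore_index=255):
    """Mean intersection-over-union over classes; illustrative only."""
    mask = label != ignore_index          # skip unlabeled pixels
    pred, label = pred[mask], label[mask]
    # Confusion matrix via a single bincount over (label, pred) pairs.
    cm = np.bincount(num_classes * label + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    valid = union > 0                     # ignore classes absent from both
    return (inter[valid] / union[valid]).mean()

# Toy usage with random predictions and labels:
pred = np.random.randint(0, 40, size=(480, 640))
label = np.random.randint(0, 40, size=(480, 640))
print('mIoU: %.4f' % mean_iou(pred, label))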
AsymFusion is released under the MIT License.
If you find our work useful for your research, please consider citing the following paper:
@inproceedings{wang2020asymfusion,
  title={Learning Deep Multimodal Feature Representation with Asymmetric Multi-layer Fusion},
  author={Wang, Yikai and Sun, Fuchun and Lu, Ming and Yao, Anbang},
  booktitle={ACM International Conference on Multimedia (ACM MM)},
  year={2020}
}