Directional-Deep-Embedding-and-Appearance-Learning-for-Fast-Video-Object-Segmentation

## Paper:

Please refer to the paper:

Yingjie Yin, De Xu, Xingang Wang, and Lei Zhang, Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation, http://arxiv.org/abs/2002.06736.

Dependencies:

python (>= 3.5 or 3.6)
numpy
pytorch (>= 0.5 probably)
torchvision
pillow
tqdm

Datasets utilized:

DAVIS

YouTubeVOS

How to set up:

Install dependencies
Clone this repo:

git clone https://github.com/YingjieYin/Directional-Deep-Embedding-and-Appearance-Learning-for-Fast-Video-Object-Segmentation.git

Download datasets
Set up local_config.py to point to appropriate directories for saving and reading data
Move the ytvos_trainval_split/ImageSets directory into your YouTubeVOS data directory. The directory structure should look like this:

/...some_path.../youtube_vos
-- train
---- Annotations
---- JPEGImages
-- valid
---- Annotations
---- JPEGImages
-- ImageSets
---- train.txt
---- train_joakim.txt
---- val_joakim.txt
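
The layout above can be sanity-checked before training. The following is a minimal sketch and is not part of this repository: the helper name `missing_ytvos_entries` and the example path are assumptions made for illustration, and the entry list is copied from the directory listing above.

```python
import os

# Relative paths taken from the directory listing in this README.
EXPECTED_ENTRIES = [
    os.path.join("train", "Annotations"),
    os.path.join("train", "JPEGImages"),
    os.path.join("valid", "Annotations"),
    os.path.join("valid", "JPEGImages"),
    os.path.join("ImageSets", "train.txt"),
    os.path.join("ImageSets", "train_joakim.txt"),
    os.path.join("ImageSets", "val_joakim.txt"),
]


def missing_ytvos_entries(root):
    """Return the expected entries that do not exist under `root`.

    Hypothetical helper, not provided by the repository.
    """
    return [
        entry
        for entry in EXPECTED_ENTRIES
        if not os.path.exists(os.path.join(root, entry))
    ]
```

Calling `missing_ytvos_entries("/...some_path.../youtube_vos")` returns an empty list when every directory and split file is in place, and otherwise lists what still needs to be moved or downloaded.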

How to run method on DAVIS and YouTubeVOS with pre-trained weights:

Download weights from Link: https://pan.baidu.com/s/1fcsHWNmE1e5xL5k9AJjmnw Password: y5xx (This single unified model can achieve the results reported in our paper on all test sets of DAVIS2016, DAVIS2017, and YouTube-VOS.)
Put the weights at the path specified by config['workspace_path'] in local_config.py.
Run

python3 -u runfiles/main_runfile.py --test
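
For orientation, a local_config.py might look roughly like the sketch below. Only the `config['workspace_path']` key is named in this README; the other keys, all placeholder paths, and the `missing_dirs` helper are hypothetical, so check the repository's own local_config.py for the names it actually expects.

```python
# local_config.py (illustrative sketch, not the repository's actual file)
import os

config = {
    # Named in this README: the pre-trained weights are placed here.
    "workspace_path": "/path/to/workspace",
    # Hypothetical keys for the dataset locations; the real names may differ.
    "davis_path": "/path/to/DAVIS",
    "ytvos_path": "/path/to/youtube_vos",
}


def missing_dirs(cfg):
    """Return the keys in `cfg` whose paths are not existing directories.

    Hypothetical helper for catching path typos before a long run.
    """
    return [key for key, path in cfg.items() if not os.path.isdir(path)]
```

Running `missing_dirs(config)` before launching the runfile gives a quick list of entries that still point at non-existent directories.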

How to train (and test) a new model:

Run

python3 -u runfiles/main_runfile.py --train --test

Most settings used for training and evaluation are set in your runfiles. Each runfile should correspond to a single experiment. I supplied an example runfile.

Experimental results on DAVIS2016, DAVIS2017, and YouTube-VOS

DAVIS16_val.rar Link：https://pan.baidu.com/s/1JetbLKoZSmT0IzHrqVPNjA Password：ca63

DAVIS17_val.rar Link：https://pan.baidu.com/s/1G7zIwzOF3-Z25R6w4riWfA Password：y7d4

YTVOS_val.rar Link：https://pan.baidu.com/s/10tHuZxnis5R7mZmhOIZH0w Password：tdsl

Comparison results:

Demo1_DAVIS2016.avi Link：https://pan.baidu.com/s/19cMdbxU2ggOyGzZl0MwMWA Password：jdxq

Demo2_DAVIS2017.avi Link：https://pan.baidu.com/s/1raT-G2Jc-hJljyubPodpXA Password：0oqv

Demo3_YouTube-VOS.avi Link：https://pan.baidu.com/s/1ymDMrblZ8P_d0byOwnJW8Q Password：qtkc

About

We propose a directional deep embedding and appearance learning (DDEAL) method for fast video object segmentation (VOS) that requires no online fine-tuning. DDEAL achieves a J & F mean score of 74.8% on the DAVIS 2017 dataset and an overall score G of 71.3% on the large-scale YouTube-VOS dataset, while running at 25 fps on a single NVIDIA TITAN Xp GPU. Furthermore, our faster version runs at 31 fps with only a small loss in accuracy.
