This repository contains the source code for the paper: Video salient object detection via self-attention-guided multilayer cross-stack fusion
- torch == 1.8+
- scipy == 1.2.2
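A quick way to confirm the environment matches these requirements is a small version check. This is only a minimal sketch; the two version constraints above are the only ones taken from this README:

```python
# check_env.py -- sanity-check the package versions listed above
import torch
import scipy

print("torch:", torch.__version__)   # expected 1.8 or newer
print("scipy:", scipy.__version__)   # expected 1.2.2

torch_major_minor = tuple(int(x) for x in torch.__version__.split("+")[0].split(".")[:2])
assert torch_major_minor >= (1, 8), "torch >= 1.8 is required"
assert scipy.__version__.startswith("1.2"), "scipy 1.2.x is expected"
```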
- To train the spatial feature branch, download TrainSet_RGB (Google Drive Link) and put it in ./data. Run train.py --train_type=pretrain_rgb to start the first stage of training. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF_rgb directory for the third stage of training.
python train.py --train_type=pretrain_rgb
- To train the motion feature branch, download TrainSet_Video (Google Drive Link) and put it in ./data. Run train.py --train_type=pretrain_flow to start the second stage of training. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF_flow directory for the third stage of training.
python train.py --train_type=pretrain_flow
- To train the whole model, download TrainSet_Video (the same dataset as in the second stage) and put it in ./data. Run train.py --train_type=finetune to start the third stage of training. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF directory as the final trained model.
python train.py --train_type=finetune
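The three training stages above can also be chained in one script. The sketch below is a hypothetical wrapper: it only invokes train.py with the flags listed in this README and checks that each stage left its SMCF-19epoch.pth checkpoint in the expected snapshot directory. It assumes you run it from the repository root and does not reflect any additional arguments train.py may accept.

```python
# run_all_stages.py -- hypothetical wrapper around the three training commands above
import os
import subprocess
import sys

STAGES = [
    ("pretrain_rgb",  "./snapshot/SMCF_rgb/SMCF-19epoch.pth"),   # stage 1: spatial branch
    ("pretrain_flow", "./snapshot/SMCF_flow/SMCF-19epoch.pth"),  # stage 2: motion branch
    ("finetune",      "./snapshot/SMCF/SMCF-19epoch.pth"),       # stage 3: whole model
]

for train_type, checkpoint in STAGES:
    print(f"=== running stage: {train_type} ===")
    subprocess.run([sys.executable, "train.py", f"--train_type={train_type}"], check=True)
    if not os.path.isfile(checkpoint):
        sys.exit(f"expected checkpoint {checkpoint} was not produced")
    print(f"stage {train_type} finished, checkpoint at {checkpoint}")
```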
- The test datasets can be downloaded from:

| Datasets | Links |
| --- | --- |
| DAVIS | Google Drive Link |
| DAVSOD | Google Drive Link |
| DAVSOD-Normal | Google Drive Link |
| FBMS | Google Drive Link |
| MCL | Google Drive Link |
- One can download our trained model SMCF-19epoch.pth (Google Drive Link), place it in ./snapshot/SMCF, and run test.py to start the testing.
python test.py
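Before running test.py, it can help to confirm that the downloaded checkpoint actually loads. The snippet below is only a sanity check: it inspects the file with torch.load at the path given above and does not depend on the model definition; whether the file holds a plain state dict or a wrapper dict is an assumption handled generically.

```python
# verify_checkpoint.py -- inspect the downloaded weights before testing
import torch

ckpt_path = "./snapshot/SMCF/SMCF-19epoch.pth"
state = torch.load(ckpt_path, map_location="cpu")

# the file may hold a plain state dict or a wrapper dict, depending on how it was saved
state_dict = state.get("state_dict", state) if isinstance(state, dict) else state
print(f"loaded {ckpt_path} with {len(state_dict)} entries")
for name in list(state_dict)[:5]:
    print(" ", name)
```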
- The predictions of our model can be downloaded from:

| Datasets | Links |
| --- | --- |
| DAVIS | Google Drive Link |
| DAVSOD | Google Drive Link |
| DAVSOD-Normal | Google Drive Link |
| FBMS | Google Drive Link |
| MCL | Google Drive Link |
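If you want to score the downloaded prediction maps yourself, mean absolute error (MAE) against the ground-truth masks is the simplest of the usual video SOD metrics. The sketch below is only an illustration under assumptions not stated in this README: predictions and ground-truth masks are same-sized grayscale PNGs with matching filenames, the folder paths are placeholders, and PIL/numpy are extra dependencies.

```python
# eval_mae.py -- hypothetical MAE evaluation of saliency maps against ground truth
import os
import numpy as np
from PIL import Image

def mae(pred_dir, gt_dir):
    """Average per-image mean absolute error over all masks in gt_dir."""
    errors = []
    for name in sorted(os.listdir(gt_dir)):
        gt = np.asarray(Image.open(os.path.join(gt_dir, name)).convert("L"), dtype=np.float64) / 255.0
        pred = np.asarray(Image.open(os.path.join(pred_dir, name)).convert("L"), dtype=np.float64) / 255.0
        errors.append(np.abs(pred - gt).mean())
    return float(np.mean(errors))

# example: score the DAVIS predictions (paths are placeholders)
print("MAE:", mae("./predictions/DAVIS", "./data/DAVIS/GT"))
```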
If you find our code useful for your research, please cite our paper:
H. Yang, N. Mu, J. Guo, Y. Hu, and R. Wang, "Video salient object detection via self-attention-guided multilayer cross-stack fusion", Multimedia Tools and Applications, 2023.
For any questions, please contact the corresponding author N. Mu at nanmu@sicnu.edu.cn.