This repository contains the source code for the paper: Video salient object detection via self-attention-guided multilayer cross-stack fusion
- torch == 1.8+
- scipy == 1.2.2
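A quick way to confirm the environment matches these requirements is a small version check. This is only a minimal sketch; the two version constraints above are the only ones taken from this README:

```python
# check_env.py -- sanity-check the package versions listed above
import torch
import scipy

print("torch:", torch.__version__)   # expected 1.8 or newer
print("scipy:", scipy.__version__)   # expected 1.2.2

torch_major_minor = tuple(int(x) for x in torch.__version__.split("+")[0].split(".")[:2])
assert torch_major_minor >= (1, 8), "torch >= 1.8 is required"
assert scipy.__version__.startswith("1.2"), "scipy 1.2.x is expected"
```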
- To train the spatial feature branch, download TrainSet_RGB (Google Drive Link) and put it in ./data. Run train.py --train_type=pretrain_rgb to start the first stage of training. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF_rgb directory for the third stage of training.
python train.py --train_type=pretrain_rgb
- To train the motion feature branch, download TrainSet_Video (Google Drive Link) and put it in ./data. Run train.py --train_type=pretrain_flow to start the second stage of training. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF_flow directory for the third stage of training.
python train.py --train_type=pretrain_flow
- To train the whole model, download TrainSet_Video (the same dataset as in the second stage) and put it in ./data. Run train.py --train_type=finetune to start the third stage of training. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF directory as the final trained model.
python train.py --train_type=finetune
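The three training stages above can also be chained in one script. The sketch below is a hypothetical wrapper: it only invokes train.py with the flags listed in this README and checks that each stage left its SMCF-19epoch.pth checkpoint in the expected snapshot directory. It assumes you run it from the repository root and does not reflect any additional arguments train.py may accept.

```python
# run_all_stages.py -- hypothetical wrapper around the three training commands above
import os
import subprocess
import sys

STAGES = [
    ("pretrain_rgb",  "./snapshot/SMCF_rgb/SMCF-19epoch.pth"),   # stage 1: spatial branch
    ("pretrain_flow", "./snapshot/SMCF_flow/SMCF-19epoch.pth"),  # stage 2: motion branch
    ("finetune",      "./snapshot/SMCF/SMCF-19epoch.pth"),       # stage 3: whole model
]

for train_type, checkpoint in STAGES:
    print(f"=== running stage: {train_type} ===")
    subprocess.run([sys.executable, "train.py", f"--train_type={train_type}"], check=True)
    if not os.path.isfile(checkpoint):
        sys.exit(f"expected checkpoint {checkpoint} was not produced")
    print(f"stage {train_type} finished, checkpoint at {checkpoint}")
```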
- The test datasets can be downloaded from:

| Datasets | Links |
| --- | --- |
| DAVIS | Google Drive Link |
| DAVSOD | Google Drive Link |
| DAVSOD-Normal | Google Drive Link |
| FBMS | Google Drive Link |
| MCL | Google Drive Link |
- One can download our trained model SMCF-19epoch.pth (Google Drive Link), place it in ./snapshot/SMCF, and run test.py to start the testing.
python test.py
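Before running test.py, it can help to confirm that the downloaded checkpoint actually loads. The snippet below is only a sanity check: it inspects the file with torch.load at the path given above and does not depend on the model definition; whether the file holds a plain state dict or a wrapper dict is an assumption handled generically.

```python
# verify_checkpoint.py -- inspect the downloaded weights before testing
import torch

ckpt_path = "./snapshot/SMCF/SMCF-19epoch.pth"
state = torch.load(ckpt_path, map_location="cpu")

# the file may hold a plain state dict or a wrapper dict, depending on how it was saved
state_dict = state.get("state_dict", state) if isinstance(state, dict) else state
print(f"loaded {ckpt_path} with {len(state_dict)} entries")
for name in list(state_dict)[:5]:
    print(" ", name)
```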
- The predictions of our model can be downloaded from:

| Datasets | Links |
| --- | --- |
| DAVIS | Google Drive Link |
| DAVSOD | Google Drive Link |
| DAVSOD-Normal | Google Drive Link |
| FBMS | Google Drive Link |
| MCL | Google Drive Link |
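If you want to score the downloaded prediction maps yourself, mean absolute error (MAE) against the ground-truth masks is the simplest of the usual video SOD metrics. The sketch below is only an illustration under assumptions not stated in this README: predictions and ground-truth masks are same-sized grayscale PNGs with matching filenames, the folder paths are placeholders, and PIL/numpy are extra dependencies.

```python
# eval_mae.py -- hypothetical MAE evaluation of saliency maps against ground truth
import os
import numpy as np
from PIL import Image

def mae(pred_dir, gt_dir):
    """Average per-image mean absolute error over all masks in gt_dir."""
    errors = []
    for name in sorted(os.listdir(gt_dir)):
        gt = np.asarray(Image.open(os.path.join(gt_dir, name)).convert("L"), dtype=np.float64) / 255.0
        pred = np.asarray(Image.open(os.path.join(pred_dir, name)).convert("L"), dtype=np.float64) / 255.0
        errors.append(np.abs(pred - gt).mean())
    return float(np.mean(errors))

# example: score the DAVIS predictions (paths are placeholders)
print("MAE:", mae("./predictions/DAVIS", "./data/DAVIS/GT"))
```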
If you find our code useful for your research, please cite our paper:
H. Yang, N. Mu, J. Guo, Y. Hu, and R. Wang, "Video salient object detection via self-attention-guided multilayer cross-stack fusion", Multimedia Tools and Applications, 2023.
For any questions, please contact the corresponding author N. Mu at nanmu@sicnu.edu.cn.