NanMu-SICNU / SMCF


Video salient object detection via self-attention-guided multilayer cross-stack fusion

This repository contains the source code for the paper: Video salient object detection via self-attention-guided multilayer cross-stack fusion

Usage

Dependencies

  • torch >= 1.8
  • scipy == 1.2.2
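Before training, you can verify that the required versions are installed. The helper below is not part of this repository; it is a minimal sketch using the standard-library `importlib.metadata`:

```python
from importlib import metadata

def get_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Check the dependencies listed above.
for name, required in [("torch", ">= 1.8"), ("scipy", "== 1.2.2")]:
    found = get_version(name)
    print(f"{name}: required {required}, found {found or 'NOT INSTALLED'}")
```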

Training

  • To train the spatial feature branch, download TrainSet_RGB Google Drive Link and put it in ./data. Run train.py --train_type=pretrain_rgb to start the first training stage. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF_rgb directory and used in the third training stage.
python train.py --train_type=pretrain_rgb
  • To train the motion feature branch, download TrainSet_Video Google Drive Link and put it in ./data. Run train.py --train_type=pretrain_flow to start the second training stage. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF_flow directory and used in the third training stage.
python train.py --train_type=pretrain_flow
  • To train the whole model, download TrainSet_Video (the same dataset as in the second stage) and put it in ./data. Run train.py --train_type=finetune to start the third training stage. The generated SMCF-19epoch.pth file will be stored in the ./snapshot/SMCF directory as the final trained model.
python train.py --train_type=finetune
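The three stages above can be chained in a small driver script. This is a sketch of our own, not part of the repository; only the --train_type flags are taken from the commands above:

```python
import subprocess
import sys

# Stage order matters: each stage reads the checkpoint written
# by the previous one from ./snapshot.
STAGES = ["pretrain_rgb", "pretrain_flow", "finetune"]

def build_train_cmd(train_type):
    """Build the train.py invocation for one training stage."""
    if train_type not in STAGES:
        raise ValueError(f"unknown stage: {train_type}")
    return [sys.executable, "train.py", f"--train_type={train_type}"]

def run_all_stages():
    """Run the three training stages in order, stopping on failure."""
    for stage in STAGES:
        subprocess.run(build_train_cmd(stage), check=True)
```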

Testing

  • The test datasets can be downloaded from:
Datasets Links
DAVIS Google Drive Link
DAVSOD Google Drive Link
DAVSOD-Normal Google Drive Link
FBMS Google Drive Link
MCL Google Drive Link
  • Download our trained model SMCF-19epoch.pth Google Drive Link and place it in ./snapshot/SMCF. Run test.py to start testing.
python test.py
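A quick pre-flight check before running test.py can save a failed run. The helper below is ours; only the checkpoint path and dataset root are taken from this README:

```python
import os

# Paths described in this README (assumed layout).
CHECKPOINT = os.path.join("snapshot", "SMCF", "SMCF-19epoch.pth")
DATA_ROOT = "data"

def ready_to_test(checkpoint=CHECKPOINT, data_root=DATA_ROOT):
    """Return a list of problems; an empty list means test.py can run."""
    problems = []
    if not os.path.isfile(checkpoint):
        problems.append(f"missing checkpoint: {checkpoint}")
    if not os.path.isdir(data_root):
        problems.append(f"missing dataset directory: {data_root}")
    return problems
```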

Results

  • The predictions of our model can be downloaded from:
Datasets Links
DAVIS Google Drive Link
DAVSOD Google Drive Link
DAVSOD-Normal Google Drive Link
FBMS Google Drive Link
MCL Google Drive Link
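To compare these prediction maps against ground-truth masks, the mean absolute error (MAE), a metric commonly reported for salient object detection, can be computed as sketched below. This is a minimal NumPy version of our own, not the paper's full evaluation protocol:

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth.

    Maps stored as 8-bit images (values in [0, 255]) are normalized
    to [0, 1] before comparison.
    """
    pred = pred.astype(np.float64)
    gt = gt.astype(np.float64)
    if pred.max() > 1.0:
        pred /= 255.0
    if gt.max() > 1.0:
        gt /= 255.0
    return float(np.mean(np.abs(pred - gt)))
```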

Method Details

Cite

If you find our code useful for your research, please cite our paper:

H. Yang, N. Mu, J. Guo, Y. Hu, and R. Wang, "Video salient object detection via self-attention-guided multilayer cross-stack fusion", Multimedia Tools and Applications, 2023.

For any questions, please contact the corresponding author N. Mu at nanmu@sicnu.edu.cn.
