bo-miao/MAMP

Self-Supervised Video Object Segmentation by Motion-Aware Mask Propagation (MAMP)

This repository contains the source code (PyTorch) for our paper:

Self-Supervised Video Object Segmentation by Motion-Aware Mask Propagation

Requirements

Training and testing were done with PyTorch 1.9 (1.9.0a0+gitc91c4a0), Python 3.9, and CUDA 11.2.

Other dependencies can be installed by running:

pip install -r requirements.txt

Required Data

To evaluate/train MAMP, you will need to download the required datasets.

You can create symbolic links in the datasets folder that point to wherever the datasets were downloaded. The expected layout is:

├── datasets
    ├── DEMO
        ├── valid_demo
            ├── Annotations
            ├── JPEGImages       
    ├── DAVIS
        ├── JPEGImages
        ├── Annotations
        ├── ImageSets
    ├── YOUTUBE
        ├── train
        ├── valid
        ├── all (the data is from train_all_frames)
            ├── videos
                ├── consecutive frames
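
As a minimal sketch of the linking step, the snippet below creates one symbolic link per dataset and then reports which of the sub-folders from the tree above are still missing. The helper names `link_dataset` and `missing_entries` and the download paths in the usage comment are illustrative assumptions and are not part of this repository.

```python
from pathlib import Path

# Sub-folders copied from the directory tree above.
EXPECTED = {
    "DEMO": ["valid_demo/Annotations", "valid_demo/JPEGImages"],
    "DAVIS": ["JPEGImages", "Annotations", "ImageSets"],
    "YOUTUBE": ["train", "valid", "all"],
}


def link_dataset(source, name, root="datasets"):
    """Create root/name as a symbolic link to an existing dataset directory."""
    source = Path(source).expanduser().resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"dataset directory not found: {source}")
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    link = root / name
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier run
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symbolic link")
    link.symlink_to(source, target_is_directory=True)
    return link


def missing_entries(root="datasets"):
    """Return the expected sub-folders that are not present under root."""
    root = Path(root)
    return [
        f"{name}/{sub}"
        for name, subs in EXPECTED.items()
        for sub in subs
        if not (root / name / sub).is_dir()
    ]


# Example usage (hypothetical download locations):
#   link_dataset("~/downloads/DAVIS", "DAVIS")
#   link_dataset("~/downloads/YouTube-VOS", "YOUTUBE")
#   print(missing_entries())
```

An empty list from `missing_entries()` means every folder named in the tree was found; the check does not look inside the folders.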

Demo

Use the following command in the scripts folder to run a basic demo that visualizes the segmentation results of MAMP.
```
sh test_demo.sh
```

Train

Use the following command in the scripts folder to train.
```
sh train.sh
```

Test and evaluation

The pre-trained model can be downloaded from Google Drive.
Use the following commands in the scripts folder to evaluate on DAVIS and YouTube-VOS, respectively. An approximate performance on DAVIS can be read directly from the output logs, or you can evaluate MAMP on DAVIS with the official evaluation code. Performance on YouTube-VOS needs to be evaluated on the official server.
```
sh test_davis.sh
```
```
sh test_ytb.sh
```

Citation

If you find the paper, code, or pre-trained models useful, please cite our paper:

@InProceedings{Miao2022mamp,
  author        = {Bo Miao and Mohammed Bennamoun and Yongsheng Gao and Ajmal Mian},
  title         = {Self-Supervised Video Object Segmentation by Motion-Aware Mask Propagation},
  booktitle     = {IEEE International Conference on Multimedia and Expo (ICME)},
  year          = {2022},
  organization  = {IEEE}
}

(Optional)

You can optionally pass --is_amp to enable Automatic Mixed Precision when evaluating on DAVIS and YouTube-VOS.
Calling torch.cuda.empty_cache() can help reduce GPU memory fragmentation during evaluation.
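
For readers unfamiliar with these two options, here is a minimal sketch of what mixed-precision inference with cache clearing generally looks like in PyTorch. It is not taken from this repository: the function `run_inference` and its `use_amp` argument are hypothetical, and how --is_amp is actually wired into the test scripts is an assumption.

```python
import torch


def run_inference(model, frames, use_amp=False):
    """Run one forward pass without gradients, optionally under autocast."""
    model.eval()
    with torch.no_grad():
        # autocast runs eligible CUDA ops in half precision;
        # with enabled=False the context manager does nothing.
        with torch.cuda.amp.autocast(enabled=use_amp):
            output = model(frames)
    # Release cached but unoccupied GPU memory held by the allocator.
    # This is a no-op when CUDA has not been initialized.
    torch.cuda.empty_cache()
    return output
```

Calling `torch.cuda.empty_cache()` after every forward pass has a cost, so in practice it is usually called less often, for example once per video.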

Results

Comparison with other methods on DAVIS-2017

Results on DAVIS-2017	Results on YouTube-VOS

Licenses

This repo contains third-party code. It is your responsibility to ensure that you comply with the license here and the conditions of any dependent licenses.
