The official implementation of Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization

MMSD

Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization
Linjiang Huang (CUHK), Liang Wang (CASIA), Hongsheng Li (CUHK)

Paper: TIP

Overview

We propose a pseudo-label-based method that takes full advantage of multiple modalities, i.e., RGB and optical flow sequences, to generate high-quality pseudo labels. The experimental results on THUMOS14 are shown below.

| Method \ mAP (%) | @0.1 | @0.2 | @0.3 | @0.4 | @0.5 | @0.6 | @0.7 | AVG |
|---|---|---|---|---|---|---|---|---|
| UntrimmedNet | 44.4 | 37.7 | 28.2 | 21.1 | 13.7 | - | - | - |
| STPN | 52.0 | 44.7 | 35.5 | 25.8 | 16.9 | 9.9 | 4.3 | 27.0 |
| W-TALC | 55.2 | 49.6 | 40.1 | 31.1 | 22.8 | - | 7.6 | - |
| AutoLoc | - | - | 35.8 | 29.0 | 21.2 | 13.4 | 5.8 | - |
| CleanNet | - | - | 37.0 | 30.9 | 23.9 | 13.9 | 7.1 | - |
| MAAN | 59.8 | 50.8 | 41.1 | 30.6 | 20.3 | 12.0 | 6.9 | 31.6 |
| CMCS | 57.4 | 50.8 | 41.2 | 32.1 | 23.1 | 15.0 | 7.0 | 32.4 |
| BM | 60.4 | 56.0 | 46.6 | 37.5 | 26.8 | 17.6 | 9.0 | 36.3 |
| RPN | 62.3 | 57.0 | 48.2 | 37.2 | 27.9 | 16.7 | 8.1 | 36.8 |
| DGAM | 60.0 | 54.2 | 46.8 | 38.2 | 28.8 | 19.8 | 11.4 | 37.0 |
| TSCN | 63.4 | 57.6 | 47.8 | 37.7 | 28.7 | 19.4 | 10.2 | 37.8 |
| EM-MIL | 59.1 | 52.7 | 45.5 | 36.8 | 30.5 | 22.7 | 16.4 | 37.7 |
| BaS-Net | 58.2 | 52.3 | 44.6 | 36.0 | 27.0 | 18.6 | 10.4 | 35.3 |
| A2CL-PT | 61.2 | 56.1 | 48.1 | 39.0 | 30.1 | 19.2 | 10.6 | 37.8 |
| ACM-BANet | 64.6 | 57.7 | 48.9 | 40.9 | 32.3 | 21.9 | 13.5 | 39.9 |
| HAM-Net | 65.4 | 59.0 | 50.3 | 41.1 | 31.0 | 20.7 | 11.1 | 39.8 |
| ACSNet | - | - | 51.4 | 42.7 | 32.4 | 22.0 | 11.7 | - |
| WUM | 67.5 | 61.2 | 52.3 | 43.4 | 33.7 | 22.9 | 12.1 | 41.9 |
| AUMN | 66.2 | 61.9 | 54.9 | 44.4 | 33.3 | 20.5 | 9.0 | 41.5 |
| CoLA | 66.2 | 59.5 | 51.5 | 41.9 | 32.2 | 22.0 | 13.1 | 40.9 |
| ASL | 67.0 | - | 51.8 | - | 31.1 | - | 11.4 | - |
| MMSD (Ours) | 69.7 | 64.3 | 54.6 | 45.0 | 36.4 | 23.0 | 12.3 | 43.6 |
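
For intuition about the idea above, the sketch below shows one plausible way to fuse class activation sequences (CAS) from the RGB and flow streams into pseudo labels and distill them back into each stream. This is a minimal conceptual sketch, assuming a simple averaging-and-thresholding fusion rule; all function names are illustrative and none of this is the repository's actual code.

```python
# Conceptual sketch of multi-modality pseudo-label self-distillation.
# The averaging/thresholding fusion rule and all names are illustrative
# assumptions, not the repository's implementation.
import torch
import torch.nn.functional as F

def fuse_pseudo_labels(cas_rgb, cas_flow, threshold=0.5):
    """Fuse per-snippet class activation sequences (T x C) from the
    RGB and flow streams into hard binary pseudo labels."""
    fused = (torch.sigmoid(cas_rgb) + torch.sigmoid(cas_flow)) / 2
    return (fused > threshold).float()  # T x C pseudo labels

def distillation_loss(cas, pseudo_labels):
    """Supervise one stream's CAS with the fused pseudo labels."""
    return F.binary_cross_entropy_with_logits(cas, pseudo_labels)

# Example: T = 750 snippets, C = 20 THUMOS'14 action classes.
cas_rgb, cas_flow = torch.randn(750, 20), torch.randn(750, 20)
pseudo = fuse_pseudo_labels(cas_rgb, cas_flow)
loss = distillation_loss(cas_rgb, pseudo) + distillation_loss(cas_flow, pseudo)
```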

Prerequisites

Recommended Environment

  • Python 3.6
  • PyTorch 1.2
  • Tensorboard Logger
  • CUDA 10.0
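
One possible way to set up this environment (the exact install commands are assumptions; adjust them to your CUDA 10.0 installation):

$ conda create -n mmsd python=3.6
$ conda activate mmsd
$ pip install torch==1.2.0 tensorboard_logger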

Data Preparation

  1. Prepare the THUMOS'14 dataset.

    • We recommend using the features and annotations provided by this repo.
  2. Place the features and annotations inside a dataset/Thumos14reduced/ folder (a quick sanity-check sketch follows).
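
A minimal sanity check for the folder layout, assuming the features are stored as .npy files; the actual file names under dataset/Thumos14reduced/ depend on the feature provider:

```python
# Minimal sanity check for the expected dataset layout.
# The .npy assumption is illustrative; actual file names under
# dataset/Thumos14reduced/ depend on the feature provider.
import os
import numpy as np

root = "dataset/Thumos14reduced"
assert os.path.isdir(root), f"expected features and annotations under {root}"

for name in sorted(os.listdir(root)):
    if name.endswith(".npy"):
        arr = np.load(os.path.join(root, name), allow_pickle=True)
        print(name, getattr(arr, "shape", type(arr)))
```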

Usage

Training

You can easily train the model by running the provided script.

  • Refer to train_options.py and set the dataset-root argument to the path of your dataset folder.

  • Run the command below.

$ python train_main.py --run-type 0 --model-id 1

Models are saved in ./ckpt/dataset_name/model_id/.
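
For orientation, here is a sketch of how the documented options could be declared. The real definitions live in train_options.py, so treat the defaults and help strings below as assumptions:

```python
# Hypothetical sketch of the documented command-line options.
# See train_options.py for the real definitions; defaults and help
# strings here are assumptions for illustration only.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset-root", type=str, default="dataset/Thumos14reduced",
                    help="path to the dataset folder")
parser.add_argument("--run-type", type=int, default=0,
                    help="0 for training, 1 for evaluation")
parser.add_argument("--model-id", type=int, default=1,
                    help="id of the checkpoint folder under ./ckpt/")
parser.add_argument("--pretrained", action="store_true",
                    help="load a previously saved checkpoint")
parser.add_argument("--load-epoch", type=int, default=240,
                    help="epoch of the checkpoint to load")
args = parser.parse_args()
```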

Evaluation

The trained model can be found here. Please put it into ./ckpt/dataset_name/model_id/.

  • Run the command below.

$ python train_main.py --pretrained --run-type 1 --model-id 1 --load-epoch 240

load-epoch refers to the epoch of the best model. The best model does not always occur at epoch 240; refer to the log in the same folder as the saved models to determine which epoch to load. Make sure you set the model-id that corresponds to the model-id used during training.
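
If the log is inconclusive, one simple option is to evaluate several saved epochs and compare; the epoch values below are placeholders, so use the epochs at which your run actually saved checkpoints:

$ for epoch in 200 220 240; do python train_main.py --pretrained --run-type 1 --model-id 1 --load-epoch $epoch; done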

References

We referenced the repos below for the code.

Contact

If you have any questions or comments, please contact the first author of the paper, Linjiang Huang (ljhuang524@gmail.com).


License

MIT License

