The official implementation of Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization

MMSD

Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization
Linjiang Huang (CUHK), Liang Wang (CASIA), Hongsheng Li (CUHK)

Paper: TIP

Overview

We propose a pseudo-label-based method that takes full advantage of multiple modalities, i.e., RGB and optical flow sequences, to generate high-quality pseudo labels. The experimental results on THUMOS14 are shown below.

| Method \ mAP (%) | @0.1 | @0.2 | @0.3 | @0.4 | @0.5 | @0.6 | @0.7 | AVG |
|---|---|---|---|---|---|---|---|---|
| UntrimmedNet | 44.4 | 37.7 | 28.2 | 21.1 | 13.7 | - | - | - |
| STPN | 52.0 | 44.7 | 35.5 | 25.8 | 16.9 | 9.9 | 4.3 | 27.0 |
| W-TALC | 55.2 | 49.6 | 40.1 | 31.1 | 22.8 | - | 7.6 | - |
| AutoLoc | - | - | 35.8 | 29.0 | 21.2 | 13.4 | 5.8 | - |
| CleanNet | - | - | 37.0 | 30.9 | 23.9 | 13.9 | 7.1 | - |
| MAAN | 59.8 | 50.8 | 41.1 | 30.6 | 20.3 | 12.0 | 6.9 | 31.6 |
| CMCS | 57.4 | 50.8 | 41.2 | 32.1 | 23.1 | 15.0 | 7.0 | 32.4 |
| BM | 60.4 | 56.0 | 46.6 | 37.5 | 26.8 | 17.6 | 9.0 | 36.3 |
| RPN | 62.3 | 57.0 | 48.2 | 37.2 | 27.9 | 16.7 | 8.1 | 36.8 |
| DGAM | 60.0 | 54.2 | 46.8 | 38.2 | 28.8 | 19.8 | 11.4 | 37.0 |
| TSCN | 63.4 | 57.6 | 47.8 | 37.7 | 28.7 | 19.4 | 10.2 | 37.8 |
| EM-MIL | 59.1 | 52.7 | 45.5 | 36.8 | 30.5 | 22.7 | 16.4 | 37.7 |
| BaS-Net | 58.2 | 52.3 | 44.6 | 36.0 | 27.0 | 18.6 | 10.4 | 35.3 |
| A2CL-PT | 61.2 | 56.1 | 48.1 | 39.0 | 30.1 | 19.2 | 10.6 | 37.8 |
| ACM-BANet | 64.6 | 57.7 | 48.9 | 40.9 | 32.3 | 21.9 | 13.5 | 39.9 |
| HAM-Net | 65.4 | 59.0 | 50.3 | 41.1 | 31.0 | 20.7 | 11.1 | 39.8 |
| ACSNet | - | - | 51.4 | 42.7 | 32.4 | 22.0 | 11.7 | - |
| WUM | 67.5 | 61.2 | 52.3 | 43.4 | 33.7 | 22.9 | 12.1 | 41.9 |
| AUMN | 66.2 | 61.9 | 54.9 | 44.4 | 33.3 | 20.5 | 9.0 | 41.5 |
| CoLA | 66.2 | 59.5 | 51.5 | 41.9 | 32.2 | 22.0 | 13.1 | 40.9 |
| ASL | 67.0 | - | 51.8 | - | 31.1 | - | 11.4 | - |
| MMSD (Ours) | 69.7 | 64.3 | 54.6 | 45.0 | 36.4 | 23.0 | 12.3 | 43.6 |
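
For intuition about the idea above, the sketch below shows one plausible way to fuse class activation sequences (CAS) from the RGB and flow streams into pseudo labels and distill them back into each stream. This is a minimal conceptual sketch, assuming a simple averaging-and-thresholding fusion rule; all function names are illustrative and none of this is the repository's actual code.

```python
# Conceptual sketch of multi-modality pseudo-label self-distillation.
# The averaging/thresholding fusion rule and all names are illustrative
# assumptions, not the repository's implementation.
import torch
import torch.nn.functional as F

def fuse_pseudo_labels(cas_rgb, cas_flow, threshold=0.5):
    """Fuse per-snippet class activation sequences (T x C) from the
    RGB and flow streams into hard binary pseudo labels."""
    fused = (torch.sigmoid(cas_rgb) + torch.sigmoid(cas_flow)) / 2
    return (fused > threshold).float()  # T x C pseudo labels

def distillation_loss(cas, pseudo_labels):
    """Supervise one stream's CAS with the fused pseudo labels."""
    return F.binary_cross_entropy_with_logits(cas, pseudo_labels)

# Example: T = 750 snippets, C = 20 THUMOS'14 action classes.
cas_rgb, cas_flow = torch.randn(750, 20), torch.randn(750, 20)
pseudo = fuse_pseudo_labels(cas_rgb, cas_flow)
loss = distillation_loss(cas_rgb, pseudo) + distillation_loss(cas_flow, pseudo)
```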

Prerequisites

Recommended Environment

  • Python 3.6
  • PyTorch 1.2
  • Tensorboard Logger
  • CUDA 10.0
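
One possible way to set up this environment (the exact install commands are assumptions; adjust them to your CUDA 10.0 installation):

$ conda create -n mmsd python=3.6
$ conda activate mmsd
$ pip install torch==1.2.0 tensorboard_logger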

Data Preparation

  1. Prepare the THUMOS'14 dataset.

    • We recommend using the features and annotations provided by this repo.
  2. Place the features and annotations inside a dataset/Thumos14reduced/ folder (a quick sanity-check sketch follows).
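
A minimal sanity check for the folder layout, assuming the features are stored as .npy files; the actual file names under dataset/Thumos14reduced/ depend on the feature provider:

```python
# Minimal sanity check for the expected dataset layout.
# The .npy assumption is illustrative; actual file names under
# dataset/Thumos14reduced/ depend on the feature provider.
import os
import numpy as np

root = "dataset/Thumos14reduced"
assert os.path.isdir(root), f"expected features and annotations under {root}"

for name in sorted(os.listdir(root)):
    if name.endswith(".npy"):
        arr = np.load(os.path.join(root, name), allow_pickle=True)
        print(name, getattr(arr, "shape", type(arr)))
```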

Usage

Training

You can easily train the model by running the provided script.

  • Refer to train_options.py and set the dataset-root argument to the path of your dataset folder.

  • Run the command below.

$ python train_main.py --run-type 0 --model-id 1

Models are saved in ./ckpt/dataset_name/model_id/.
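
For orientation, here is a sketch of how the documented options could be declared. The real definitions live in train_options.py, so treat the defaults and help strings below as assumptions:

```python
# Hypothetical sketch of the documented command-line options.
# See train_options.py for the real definitions; defaults and help
# strings here are assumptions for illustration only.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset-root", type=str, default="dataset/Thumos14reduced",
                    help="path to the dataset folder")
parser.add_argument("--run-type", type=int, default=0,
                    help="0 for training, 1 for evaluation")
parser.add_argument("--model-id", type=int, default=1,
                    help="id of the checkpoint folder under ./ckpt/")
parser.add_argument("--pretrained", action="store_true",
                    help="load a previously saved checkpoint")
parser.add_argument("--load-epoch", type=int, default=240,
                    help="epoch of the checkpoint to load")
args = parser.parse_args()
```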

Evaluation

The trained model can be found here. Please put it into ./ckpt/dataset_name/model_id/.

  • Run the command below.

$ python train_main.py --pretrained --run-type 1 --model-id 1 --load-epoch 240

load-epoch refers to the epoch of the best model. The best model does not always occur at epoch 240; refer to the log in the same folder as the saved models to determine which epoch to load. Make sure you set the model-id that corresponds to the model-id used during training.
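
If the log is inconclusive, one simple option is to evaluate several saved epochs and compare; the epoch values below are placeholders, so use the epochs at which your run actually saved checkpoints:

$ for epoch in 200 220 240; do python train_main.py --pretrained --run-type 1 --model-id 1 --load-epoch $epoch; done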

References

We referenced the repos below for the code.

Contact

If you have any questions or comments, please contact the first author of the paper, Linjiang Huang (ljhuang524@gmail.com).


License

MIT License

