Learning and Distillating the Internal Relationship of Motion Features in Action Recognition

By Lu Lu, Siyuan Li, Niannian Chen, Lin Gao, Yong Fan, Yong Jiang and Ling Wu

This work proposes a distillation learning strategy, the Dual-action Stream Network, that learns and mimics the representation of the motion streams. We also propose a lightweight attention-based fusion module that jointly exploits appearance and motion information.

For more details, please refer to our ICONIP 2020 paper and our website.

We release the testing code along with the trained models.

Requirements

Python 3
PyTorch 1.0

conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

ffmpeg version 3.2.4
OpenCV with GPU support (we do not provide support for compiling it)

Directory tree

   dataset/
       HMDB51/ 
           ../(dirs of class names)
               ../(dirs of video names)
       HMDB51_labels/
   results/
       test.txt
   trained_models/
       HMDB51/
           ../(.pth files)
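
Before running the scripts, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the released code: the names `EXPECTED_DIRS` and `missing_paths` are hypothetical, and only the directories shown in the tree are checked.

```python
from pathlib import Path

# Directories taken from the tree above (HMDB51 example).
EXPECTED_DIRS = (
    "dataset/HMDB51",
    "dataset/HMDB51_labels",
    "results",
    "trained_models/HMDB51",
)


def missing_paths(root="."):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for path in missing_paths():
        print("missing:", path)
```

Run it from the repository root; an empty output means every listed directory was found.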

Datasets

The datasets and splits can be downloaded from:

Kinetics400

UCF101

HMDB51

SomethingSomethingv1

To extract only frames from videos

python utils1/extract_frames.py path_to_video_files path_to_extracted_frames start_class end_class

To extract optical flows + frames from videos

export OPENCV=path_where_opencv_is_installed

g++ -std=c++11 tvl1_videoframes.cpp -o tvl1_videoframes -I${OPENCV}/include/opencv4/ -L${OPENCV}/lib64 -lopencv_objdetect -lopencv_features2d -lopencv_imgproc -lopencv_highgui -lopencv_core -lopencv_imgcodecs -lopencv_cudaoptflow -lopencv_cudaarithm

python utils1/extract_frames_flows.py path_to_video_files path_to_extracted_flows_frames start_class end_class gpu_id

Models

Testing script

For RGB stream:

python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB  \
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

python3 test_single_stream.py \
--batch_size 1 --n_classes 101 --model resnext --model_depth 101 --log 0 --dataset UCF101 \
--modality RGB --sample_duration 16 --split 1 --only_RGB \
--resume_path1 "trained_models/UCF101/RGB_UCF101_16f.pth" --frame_dir "/home/lulu/Dataset/videos/ucf_frames/" \
--result_path "test_results/" --annotation_path "/data/ucf101_splits"
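
The test commands in this README repeat a long list of flags, so a small wrapper can keep them in one place. The sketch below is not part of the released code: `build_command` and `HMDB51_RGB` are hypothetical names, and the flag values are copied from the HMDB51 RGB command above.

```python
import shlex


def build_command(script, options):
    """Turn a dict of options into an argv list for one of the test scripts.

    A value of True emits a bare flag (e.g. --only_RGB); False or None skips it.
    """
    argv = ["python", script]
    for name, value in options.items():
        if value is True:
            argv.append(f"--{name}")
        elif value is not False and value is not None:
            argv.extend([f"--{name}", str(value)])
    return argv


# Values copied from the HMDB51 RGB test command in this README.
HMDB51_RGB = {
    "batch_size": 1, "n_classes": 51, "model": "resnext", "model_depth": 101,
    "log": 0, "dataset": "HMDB51", "modality": "RGB", "sample_duration": 16,
    "split": 1, "only_RGB": True,
    "resume_path1": "trained_models/HMDB51/RGB_HMDB51_16f.pth",
    "frame_dir": "dataset/HMDB51",
    "annotation_path": "dataset/HMDB51_labels",
    "result_path": "results/",
}

if __name__ == "__main__":
    # Print a shell-ready command line; pass the list to subprocess.run to execute it.
    print(shlex.join(build_command("test_single_stream.py", HMDB51_RGB)))
```

Switching to another dataset or checkpoint then only requires changing the dictionary entries.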

For Flow stream:

python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality Flow --sample_duration 16 --split 1  \
--resume_path1 "trained_models/HMDB51/Flow_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

For single stream DS:

python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB  \
--resume_path1 "trained_models/HMDB51/MARS_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

python3 test_single_stream.py --batch_size 1 --n_classes 101 --model resnext --model_depth 101 --log 0 \
--dataset UCF101 --modality RGB --sample_duration 16 --split 1 --only_RGB \
--resume_path1 "trained_models/UCF101/UCF101_16f.pth" --frame_dir "Dataset/videos/ucf_frames" \
--annotation_path "data/ucf101_splits" --result_path "test_results/"

python3 test_single_stream.py --batch_size 1 --n_classes 101 --model resnext --model_depth 101 --log 0 \
--dataset UCF101 --modality RGB --sample_duration 16 --split 1 --only_RGB \
--resume_path1 "results/1e-5/MARS_UCF101_1_train_batch16_sample112_clip16_lr0.001_nesterovFalse_manualseed1_modelresnext101_ftbeginidx4_layerdict_alpha50.0_67.pth" \
--frame_dir "Dataset/videos/ucf_frames" --annotation_path "data/ucf101_splits" \
--result_path "test_results/"

For two streams RGB+MARS:

python test_two_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB  \
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth" \
--resume_path2 "trained_models/HMDB51/MARS_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

python test_two_stream.py --batch_size 1 --n_classes 101 --model resnext --model_depth 101 --log 0 --modality RGB \
--sample_duration 16 --split 1 --only_RGB --dataset UCF101 \
--resume_path1 "trained_models/UCF101/RGB_UCF101_16f.pth" \
--resume_path2 "results/1e-5/UCF101/MARS_UCF101_0.9516256938937351_67.pth" \
--frame_dir "Dataset/videos/ucf_frames/" --annotation_path "data/ucf101_splits" \
--result_path "results/RGB_MAR"

For two streams RGB+Flow:

python test_two_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB_Flow --sample_duration 16 --split 1 \
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth" \
--resume_path2 "trained_models/HMDB51/Flow_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51/HMDB51_frames/" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

For two streams MARS+Flow:

python test_two_stream.py --batch_size 1 --n_classes 101 --model resnext --model_depth 101 \
--log 0 --dataset UCF101 --modality RGB_Flow --sample_duration 16 --split 1 \
--resume_path1 "results/1e-5/UCF101/MARS_UCF101_0.9516256938937351_67.pth" \
--resume_path2 "trained_models/UCF101/Flow_UCF101_16f.pth" \
--frame_dir "Dataset/videos/tv1_flows" \
--annotation_path "dataset/ucf101_splits" \
--result_path "results/Flow_MAR"
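
A two-stream test combines the predictions of the two checkpoints given by `--resume_path1` and `--resume_path2`. How `test_two_stream.py` combines them is not described in this README; the sketch below assumes the common late-fusion scheme of averaging per-class softmax probabilities. The function names are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(scores_a, scores_b):
    """Average the class probabilities of two streams (assumed fusion rule)."""
    probs_a, probs_b = softmax(scores_a), softmax(scores_b)
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]


def predict(scores_a, scores_b):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_scores(scores_a, scores_b)
    return max(range(len(fused)), key=fused.__getitem__)
```

Averaging probabilities rather than raw scores keeps one stream with large-magnitude outputs from dominating the other.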

Fusion Module Test:

python test_fusion_modules.py \
--batch_size 1 --n_classes 101 --model resnext --model_depth 101 \
--log 0 --dataset UCF101 --modality RGB_Flow --sample_duration 16 --split 1 \
--resume_path1 "results/1e-5/UCF101/MARS_UCF101_0.9516256938937351_67.pth" \
--resume_path2 "trained_models/UCF101/Flow_UCF101_16f.pth" \
--resume_path3 "results/fusion/UCF101/Fusion_UCF101_1_train_batch16_sample112_clip16_lr0.1_nesterovFalse_manualseed1_modelresnext101_ftbeginidx4_alpha50.0_15.pth" \
--frame_dir "Dataset/videos/tv1_flows" \
--annotation_path "dataset/ucf101_splits" \
--result_path "results/Flow_MAR"
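
The fusion test loads a third checkpoint (`--resume_path3`) for the fusion module, whose architecture is not given in this README. As a rough illustration of what attention-based fusion of two streams can mean, the sketch below scores each stream's feature vector, turns the two scores into softmax weights, and returns the weighted sum. Everything here is an assumption: `attention_fuse`, `w_rgb`, and `w_flow` are hypothetical, and the released module may differ substantially.

```python
import math


def attention_fuse(rgb_feat, flow_feat, w_rgb, w_flow):
    """Fuse two equally sized feature vectors with attention over the streams.

    w_rgb and w_flow stand in for learned scoring vectors (hypothetical).
    Returns the fused vector and the (rgb, flow) attention weights.
    """
    score_rgb = sum(w * x for w, x in zip(w_rgb, rgb_feat))
    score_flow = sum(w * x for w, x in zip(w_flow, flow_feat))
    # Softmax over the two stream scores, shifted by the max for stability.
    peak = max(score_rgb, score_flow)
    e_rgb, e_flow = math.exp(score_rgb - peak), math.exp(score_flow - peak)
    a_rgb, a_flow = e_rgb / (e_rgb + e_flow), e_flow / (e_rgb + e_flow)
    fused = [a_rgb * r + a_flow * f for r, f in zip(rgb_feat, flow_feat)]
    return fused, (a_rgb, a_flow)
```

With equal scores the weights are 0.5 each and the result reduces to plain averaging of the two feature vectors.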

Training script

For MARS:

From scratch:

python MARS_train.py --dataset Kinetics --modality RGB_Flow \
--n_classes 400 \
--batch_size 16 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 \
--output_layers 'avgpool' --MARS_alpha 50 \
--frame_dir "dataset/Kinetics" \
--annotation_path "dataset/Kinetics_labels" \
--resume_path1 "trained_models/Kinetics/Flow_Kinetics_16f.pth" \
--result_path "results/" --checkpoint 1
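
The `--MARS_alpha 50` and `--output_layers 'avgpool'` options in the command above suggest that training adds a weighted feature-matching term between the RGB network and the Flow network loaded through `--resume_path1`. That reading is an assumption; the exact objective is defined in the paper and in `MARS_train.py`. Under that assumption, the simplest form of such a loss (cross-entropy plus alpha times a mean squared error on features) can be sketched in plain Python as follows. The function names are hypothetical, and any relational term from the paper is not reproduced here.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample, computed from raw class scores."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]


def mse(student_feat, teacher_feat):
    """Mean squared error between two equally sized feature vectors."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    squared = ((s - t) ** 2 for s, t in zip(student_feat, teacher_feat))
    return sum(squared) / len(student_feat)


def distillation_loss(logits, target, student_feat, teacher_feat, alpha=50.0):
    """Classification loss plus alpha-weighted feature mimicking (assumed form)."""
    return cross_entropy(logits, target) + alpha * mse(student_feat, teacher_feat)
```

A large alpha such as 50 makes the feature-matching term dominate whenever the student and teacher features are far apart.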

From pretrained Kinetics400:

python MARS_train.py --dataset HMDB51 --modality RGB_Flow --split 1  \
--n_classes 400 --n_finetune_classes 51 \
--batch_size 16 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 --ft_begin_index 4 \
--output_layers 'avgpool' --MARS_alpha 50 \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--pretrain_path "trained_models/Kinetics/MARS_Kinetics_16f.pth" \
--resume_path1 "trained_models/HMDB51/Flow_HMDB51_16f.pth" \
--result_path "results/" --checkpoint 1

python3 MARS_train.py --dataset UCF101 --modality RGB_Flow --split 1 --n_classes 400 \
--n_finetune_classes 101 --batch_size 16 --log 1 --sample_duration 16 --model resnext \
--model_depth 101 --ft_begin_index 4 --output_layers 'dict' --MARS_alpha 50 \
--frame_dir "Dataset/videos/tv1_flows" \
--annotation_path "dataset/ucf101_splits" \
--pretrain_path "trained_models/Kinetics/MARS_Kinetics_16f.pth" \
--resume_path1 "trained_models/UCF101/Flow_UCF101_16f.pth" \
--result_path "results/1e-5/" --checkpoint 1

From checkpoint:

python MARS_train.py --dataset HMDB51 --modality RGB_Flow --split 1  \
--n_classes 400 --n_finetune_classes 51 \
--batch_size 16 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 --ft_begin_index 4 \
--output_layers 'avgpool' --MARS_alpha 50 \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--pretrain_path "trained_models/Kinetics/MARS_Kinetics_16f.pth" \
--resume_path1 "trained_models/HMDB51/Flow_HMDB51_16f.pth" \
--MARS_resume_path "results/HMDB51/MARS_HMDB51_1_train_batch16_sample112_clip16_lr0.001_nesterovFalse_manualseed1_modelresnext101_ftbeginidx4_layeravgpool_alpha50.0_1.pth" \
--result_path "results/" --checkpoint 1
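
Checkpoint names such as the one passed to `--MARS_resume_path` above encode the training settings. The sketch below recovers a few of them from the file name. It is not part of the released code: `parse_checkpoint_name` is a hypothetical helper, the field list covers only prefixes visible in this README, and the meaning of the trailing number (presumably an epoch index) is an assumption.

```python
import re
from pathlib import Path

# Prefixes seen in the checkpoint names in this README, mapped to a value type.
FIELDS = {"batch": int, "sample": int, "clip": int, "lr": float, "alpha": float}


def parse_checkpoint_name(path):
    """Extract settings from a name like MARS_HMDB51_1_train_batch16_..._1.pth."""
    tokens = Path(path).stem.split("_")
    info = {
        "prefix": tokens[0],
        "dataset": tokens[1],
        # Trailing number, kept only if the last token is purely numeric.
        "last_index": int(tokens[-1]) if tokens[-1].isdigit() else None,
    }
    for token in tokens[2:-1]:
        match = re.fullmatch(r"([a-z]+)(\d+(?:\.\d+)?)", token)
        if match and match.group(1) in FIELDS:
            name, value = match.groups()
            info[name] = FIELDS[name](value)
    return info
```

Tokens that do not match a known prefix, such as `nesterovFalse` or `layeravgpool`, are ignored rather than guessed at.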

For RGB stream:

From scratch:

 python train.py --dataset Kinetics --modality RGB --only_RGB \
--n_classes 400 \
--batch_size 32 --log 1 --sample_duration 16 \
--model resnext --model_depth 101  \
--frame_dir "dataset/Kinetics" \
--annotation_path "dataset/Kinetics_labels" \
--result_path "results/"

From pretrained Kinetics400:

 python train.py --dataset HMDB51 --modality RGB --split 1 --only_RGB \
--n_classes 400 --n_finetune_classes 51 \
--batch_size 32 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 --ft_begin_index 4 \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--pretrain_path "trained_models/Kinetics/RGB_Kinetics_16f.pth" \
--result_path "results/"

From checkpoint:

 python train.py --dataset HMDB51 --modality RGB --split 1 --only_RGB \
--n_classes 400 --n_finetune_classes 51 \
--batch_size 32 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 --ft_begin_index 4 \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--pretrain_path "trained_models/Kinetics/RGB_Kinetics_16f.pth" \
--resume_path1 "results/HMDB51/PreKin_HMDB51_1_RGB_train_batch32_sample112_clip16_nestFalse_damp0.9_weight_decay1e-05_manualseed1_modelresnext101_ftbeginidx4_varLR2.pth" \
--result_path "results/"

For Flow stream:

From scratch:

 python train.py --dataset Kinetics --modality Flow \
--n_classes 400 \
--batch_size 32 --log 1 --sample_duration 16 \
--model resnext --model_depth 101  \
--frame_dir "dataset/Kinetics" \
--annotation_path "dataset/Kinetics_labels" \
--result_path "results/"

From pretrained Kinetics400:

 python train.py --dataset HMDB51 --modality Flow --split 1 \
--n_classes 400 --n_finetune_classes 51 \
--batch_size 32 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 --ft_begin_index 4 \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--pretrain_path "trained_models/Kinetics/Flow_Kinetics_16f.pth" \
--result_path "results/"

From checkpoint:

 python train.py --dataset HMDB51 --modality Flow --split 1 \
--n_classes 400 --n_finetune_classes 51 \
--batch_size 32 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 --ft_begin_index 4 \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--pretrain_path "trained_models/Kinetics/Flow_Kinetics_16f.pth" \
--resume_path1 "results/HMDB51/PreKin_HMDB51_1_Flow_train_batch32_sample112_clip16_nestFalse_damp0.9_weight_decay1e-05_manualseed1_modelresnext101_ftbeginidx4_varLR2.pth" \
--result_path "results/"

For MERS:

From scratch:

python MERS_train.py --dataset Kinetics --modality RGB_Flow \
--n_classes 400 \
--batch_size 16 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 \
--output_layers 'avgpool' --MARS_alpha 50 \
--frame_dir "dataset/Kinetics" \
--annotation_path "dataset/Kinetics_labels" \
--resume_path1 "trained_models/Kinetics/Flow_Kinetics_16f.pth" \
--result_path "results/" --checkpoint 1

From pretrained Kinetics400:

python MERS_train.py --dataset HMDB51 --modality RGB_Flow --split 1  \
--n_classes 400 --n_finetune_classes 51 \
--batch_size 16 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 --ft_begin_index 4 \
--output_layers 'avgpool' --MARS_alpha 50 \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--pretrain_path "trained_models/Kinetics/MERS_Kinetics_16f.pth" \
--resume_path1 "trained_models/HMDB51/Flow_HMDB51_16f.pth" \
--result_path "results/" --checkpoint 1

From checkpoint:

python MERS_train.py --dataset HMDB51 --modality RGB_Flow --split 1  \
--n_classes 400 --n_finetune_classes 51 \
--batch_size 16 --log 1 --sample_duration 16 \
--model resnext --model_depth 101 --ft_begin_index 4 \
--output_layers 'avgpool' --MARS_alpha 50 \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--pretrain_path "trained_models/Kinetics/MARS_Kinetics_16f.pth" \
--resume_path1 "trained_models/HMDB51/Flow_HMDB51_16f.pth" \
--MARS_resume_path "results/HMDB51/MERS_HMDB51_1_train_batch16_sample112_clip16_lr0.001_nesterovFalse_manualseed1_modelresnext101_ftbeginidx4_layeravgpool_alpha50.0_1.pth" \
--result_path "results/" --checkpoint 1
