Learning Motion Representation for Real-Time Spatio-Temporal Action Localization

An Pytorch implementation of our work.

We built our work based on Pytorch implementation of Online Real-time Multiple Spatiotemporal Action Localisation and Prediction.

Environment

Ubuntu 16.04
Python 3.6
CUDA 8.0
CuDNN 7.1
Pytorch 0.4.0
Opencv 3.4
Matlab 2016b (if you need to compute the video-frame level)

Training

We use the official Pytorch implementation of PWC-Net as our flow subnet. (Notes: The PWC-Net repo is developed using Python 2.7 & Pytorch 0.2.0 & CUDA 8.0. We test several configurations to use the Pytorch implementation. Current environment can run this code correctly). You can use train-*.py scripts to train the whole network (we recommend to use train-ucf24-apex.py which is much faster but a little accuracy drop).

We use 4 GTX 1080ti graphics cards to train the network with 32 batch-sizes.

Testing

Frame-level

You can use val-ucf24.py to evaluate the frame-level mAP

Video-level

The video-level evaluation coda is in ./matlab-online-display. You can run myI01onlineTubes.m to produce the video-level results.

References

[1] Wei Liu, et al. SSD: Single Shot MultiBox Detector. ECCV2016.
[2] S. Saha, G. Singh, M. Sapienza, P. H. S. Torr, and F. Cuzzolin, Deep learning for detecting multiple space-time action tubes in videos. BMVC 2016
[3] G. Singh, S Saha, M. Sapienza, P. H. S. Torr and F Cuzzolin. Online Real time Multiple Spatiotemporal Action Localisation and Prediction. ICCV, 2017.
[4] Deqing Sun and Xiaodong Yang and Ming-Yu Liu and Jan Kautz. PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume. CVPR, 2018.
[5] Liu, Songtao and Huang, Di and Wang, and Yunhong. Receptive Field Block Net for Accurate and Fast Object Detection. The European Conference on Computer Vision (ECCV).
Original SSD Implementation (CAFFE)
A huge thanks to Max deGroot, Ellis Brown for Pytorch implementation of SSD.
A huge thanks to Gurkirt Singh for Online Real-time Multiple Spatiotemporal Action Localisation and Prediction Pytorch implementation ROAD.

Riiick2011 / RT-ST-Action-Localization