Riiick2011 / RT-ST-Action-Localization

Learning Motion Representation for Real-Time Spatio-Temporal Action Localization

A PyTorch implementation of our work.

Our code builds on the PyTorch implementation of Online Real-time Multiple Spatiotemporal Action Localisation and Prediction.

Environment

  • Ubuntu 16.04
  • Python 3.6
  • CUDA 8.0
  • cuDNN 7.1
  • PyTorch 0.4.0
  • OpenCV 3.4
  • MATLAB 2016b (only needed for the video-level evaluation)
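
A quick way to compare a local setup against the list above is a small version check. The sketch below is not part of this repository; the helper names `version_matches` and `check_environment` are made up for illustration, and it only covers Python, PyTorch, and OpenCV (CUDA, cuDNN, and MATLAB still have to be checked by hand).

```python
import sys

# Versions copied from the Environment list above.
REQUIRED = {
    "python": "3.6",
    "torch": "0.4.0",
    "cv2": "3.4",
}


def version_matches(installed, required):
    """Return True if the leading dotted components of `installed` equal `required`."""
    # Drop local build suffixes such as "+cu80" before comparing.
    installed_parts = installed.split("+")[0].split(".")
    required_parts = required.split(".")
    return installed_parts[:len(required_parts)] == required_parts


def check_environment():
    """Collect (name, installed, required, ok) rows for Python, PyTorch, and OpenCV."""
    py = "{}.{}".format(*sys.version_info[:2])
    rows = [("python", py, REQUIRED["python"],
             version_matches(py, REQUIRED["python"]))]
    for module_name in ("torch", "cv2"):
        try:
            module = __import__(module_name)
            installed = str(module.__version__)
        except ImportError:
            installed = "not installed"
        rows.append((module_name, installed, REQUIRED[module_name],
                     version_matches(installed, REQUIRED[module_name])))
    return rows


if __name__ == "__main__":
    for name, installed, required, ok in check_environment():
        print("{:8s} installed={:15s} required={:8s} {}".format(
            name, installed, required, "OK" if ok else "MISMATCH"))
```

A MISMATCH does not necessarily mean the code will fail; it only means the setup differs from the one the code was tested with.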

Training

We use the official PyTorch implementation of PWC-Net as our flow subnet. Note: the PWC-Net repository was developed with Python 2.7, PyTorch 0.2.0, and CUDA 8.0. We tested several configurations of that implementation, and it runs correctly in the environment listed above. Use the train-*.py scripts to train the whole network. We recommend train-ucf24-apex.py, which is much faster at the cost of a small drop in accuracy.

We train the network on 4 GTX 1080 Ti GPUs with a batch size of 32.
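
If you train on a different number of GPUs, it helps to know how many samples each card receives. The sketch below assumes that 32 is the total batch size and that it is scattered across devices the way `torch.chunk` splits a tensor (equal chunks of size ceil(N / num_gpus), with a smaller final chunk if N does not divide evenly). Neither assumption is confirmed by the training scripts, and `per_gpu_batch_sizes` is a hypothetical helper, not a function from this repository.

```python
import math


def per_gpu_batch_sizes(total_batch_size, num_gpus):
    """Chunk sizes for a batch split in torch.chunk style across `num_gpus` devices."""
    chunk = math.ceil(total_batch_size / num_gpus)
    sizes = []
    remaining = total_batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


print(per_gpu_batch_sizes(32, 4))  # [8, 8, 8, 8]
print(per_gpu_batch_sizes(32, 3))  # [11, 11, 10]
```

Under these assumptions each 1080 Ti processes 8 samples per step, which is a reasonable starting point when scaling the batch size to fewer or smaller cards.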

Testing

Frame-level

Use val-ucf24.py to evaluate the frame-level mAP.
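
For readers unfamiliar with the metric, the following is a minimal sketch of how a frame-level mAP can be computed. It is not the code in val-ucf24.py. It assumes a PASCAL VOC-style protocol (detections ranked by score, greedy matching to ground-truth boxes, IoU threshold 0.5, area under the precision envelope), and all function names and data layouts are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one action class.

    detections:   list of (frame_id, score, box)
    ground_truth: dict mapping frame_id -> list of boxes
    """
    num_gt = sum(len(boxes) for boxes in ground_truth.values())
    if num_gt == 0:
        return 0.0
    matched = {f: [False] * len(b) for f, b in ground_truth.items()}
    true_pos, false_pos = 0, 0
    precisions, recalls = [], []
    # Visit detections from the most to the least confident.
    for frame, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt_box in enumerate(ground_truth.get(frame, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # A ground-truth box can be claimed by only one detection.
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[frame][best_idx]:
            matched[frame][best_idx] = True
            true_pos += 1
        else:
            false_pos += 1
        precisions.append(true_pos / (true_pos + false_pos))
        recalls.append(true_pos / num_gt)
    # Integrate the precision envelope over recall.
    ap, prev_recall = 0.0, 0.0
    for i, recall in enumerate(recalls):
        ap += (recall - prev_recall) * max(precisions[i:])
        prev_recall = recall
    return ap


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, ground_truth)."""
    aps = [average_precision(d, g) for d, g in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

For example, with one ground-truth box in each of two frames, a correct detection at score 0.9, a false positive at 0.8, and a second correct detection at 0.7, `average_precision` returns 5/6.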

Video-level

The video-level evaluation code is in ./matlab-online-display. Run myI01onlineTubes.m to produce the video-level results.

References

  • [1] W. Liu, et al. SSD: Single Shot MultiBox Detector. ECCV, 2016.
  • [2] S. Saha, G. Singh, M. Sapienza, P. H. S. Torr, and F. Cuzzolin. Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos. BMVC, 2016.
  • [3] G. Singh, S. Saha, M. Sapienza, P. H. S. Torr, and F. Cuzzolin. Online Real-time Multiple Spatiotemporal Action Localisation and Prediction. ICCV, 2017.
  • [4] D. Sun, X. Yang, M.-Y. Liu, and J. Kautz. PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume. CVPR, 2018.
  • [5] S. Liu, D. Huang, and Y. Wang. Receptive Field Block Net for Accurate and Fast Object Detection. ECCV, 2018.
  • Original SSD implementation (Caffe)
  • A huge thanks to Max deGroot and Ellis Brown for the PyTorch implementation of SSD.
  • A huge thanks to Gurkirt Singh for ROAD, the PyTorch implementation of Online Real-time Multiple Spatiotemporal Action Localisation and Prediction.
