arnoldfychen / Activity-Recognition-TensorRT

3D ResNet Video Classification accelerated by TensorRT

Activity Recognition TensorRT

Perform video classification using 3D ResNets trained on the Kinetics-700 and Moments in Time datasets, accelerated with TensorRT 8.0.


P.S. Click on the GIF to watch the full-length video!


TensorRT 8 Installation

Assuming you have CUDA already installed, go ahead and download TensorRT 8 from here.

Follow the instructions for installing the system binaries and the Python package for TensorRT here.
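Once the Python package is installed, a quick import check can confirm that it is visible to the interpreter. The following is a minimal sketch, not part of the repository; the helper name `tensorrt_version` is hypothetical.

```python
import importlib


def tensorrt_version():
    """Return the installed TensorRT version string, or None if the package is missing."""
    try:
        trt = importlib.import_module("tensorrt")
    except ImportError:
        return None
    # The tensorrt Python package exposes its version as __version__.
    return getattr(trt, "__version__", None)


if __name__ == "__main__":
    version = tensorrt_version()
    if version is None:
        print("tensorrt is not importable; check the Python package installation")
    else:
        print(f"TensorRT {version} found")
```

If the check reports that the package is missing, the system binaries may still be installed correctly; only the Python bindings are tested here.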

Python dependencies

Install the necessary Python dependencies by running the following command:

pip3 install -r requirements.txt

Clone the repository

This is a straightforward step; however, if you are new to git, I recommend glancing through the steps.

First, install git

sudo apt install git

Next, clone the repository

# Using HTTPS
# Using SSH

Download Pretrained Models

Download the models from Google Drive and place them in the current directory.
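The models are ONNX files, so they need to be converted into a TensorRT engine before inference. The sketch below shows one way this could be done with the TensorRT 8 Python API, with optional FP16 precision. It is an illustration under assumptions, not the repository's actual code: the function name `build_engine` and the 1 GiB workspace limit are hypothetical choices.

```python
def build_engine(onnx_path, fp16=False):
    """Parse an ONNX file and return a serialized TensorRT engine."""
    import tensorrt as trt  # imported lazily so this module loads without TensorRT

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # ONNX models require an explicit-batch network definition in TensorRT 8.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    config = builder.create_builder_config()
    workspace_bytes = 1 << 30  # assumed limit (1 GiB); tune for your GPU
    if hasattr(config, "set_memory_pool_limit"):
        # Newer TensorRT releases (8.4 and later)
        config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_bytes)
    else:
        # TensorRT 8.0 to 8.3
        config.max_workspace_size = workspace_bytes
    if fp16:
        config.set_flag(trt.BuilderFlag.FP16)

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("TensorRT engine build failed")
    return serialized
```

Building an engine can take several minutes, so caching the serialized result on disk and reloading it on later runs is a common design choice.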

Running the code

The code supports a number of command-line arguments. Use --help to see all supported arguments.

➜ python3 --help
usage: [-h] [--stream STREAM] [--model MODEL] [--fp16] [--frameskip FRAMESKIP] [--save_output SAVE_OUTPUT]

Action Recognition using TensorRT 8

optional arguments:
  -h, --help            show this help message and exit
  --stream STREAM       Path to use video stream
  --model MODEL         Path to model to use
  --fp16                To enable fp16 precision
  --frameskip FRAMESKIP
                        Number of frames to skip
  --save_output SAVE_OUTPUT
                        Save output as video
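For readers who want to see how such an interface is typically declared, the help text above can be reproduced with `argparse` roughly as follows. This is a sketch reconstructed from the help output only; the argument types (for example, `int` for `--frameskip`) and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed in the help output."""
    parser = argparse.ArgumentParser(description="Action Recognition using TensorRT 8")
    parser.add_argument("--stream", type=str, help="Path to use video stream")
    parser.add_argument("--model", type=str, help="Path to model to use")
    # store_true makes --fp16 a switch that takes no value
    parser.add_argument("--fp16", action="store_true", help="To enable fp16 precision")
    # int is an assumed type; the help output only shows the metavar
    parser.add_argument("--frameskip", type=int, help="Number of frames to skip")
    parser.add_argument("--save_output", type=str, help="Save output as video")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Passing an explicit list to `parse_args`, such as `["--stream", "webcam", "--fp16"]`, is a convenient way to exercise the parser without touching the real command line.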

Run the script this way:

# Video
python3 --stream /path/to/video --model resnet-18-kinetics-moments.onnx --fp16 --frameskip 2

# Webcam
python3 --stream webcam --model resnet-18-kinetics-moments.onnx --fp16 --frameskip 2
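The --frameskip option trades temporal resolution for speed. One plausible interpretation, sketched below, is that one frame is kept and the next N frames are dropped before the next one is kept. The repository may implement skipping differently; the generator name `skip_frames` is hypothetical.

```python
def skip_frames(frames, frameskip):
    """Yield one frame, then drop the next `frameskip` frames, and repeat."""
    if frameskip < 0:
        raise ValueError("frameskip must be non-negative")
    for index, frame in enumerate(frames):
        # Keep every (frameskip + 1)-th frame, starting with the first.
        if index % (frameskip + 1) == 0:
            yield frame


if __name__ == "__main__":
    # Integers stand in for decoded video frames.
    print(list(skip_frames(range(10), 2)))  # [0, 3, 6, 9]
```

Because the function accepts any iterable, the same logic would work on frames read lazily from a video file or a webcam.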


Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh. "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?" arXiv preprint.


License: GNU General Public License v3.0
