TensorRT-Alpha

English | 简体中文

Visualization

Introduce

This repository provides accelerated deployment cases of deep learning CV popular models, and cuda c supports dynamic-batch image process, infer, decode, NMS. Most of the model transformation process is torch->onnx->tensorrt.
There are two ways to obtain onnx files:

According to the network disk provided by TensorRT-Alpha, download ONNX directly. weiyun or google driver
Follow the instructions provided by TensorRT-Alpha to manually export ONNX from the relevant python source code framework.

Update

2023.01.01 🔥 update yolov3, yolov4, yolov5, yolov6
2023.01.04 🍅 update yolov7, yolox, yolor
2023.01.05 🎉 update u2net, libfacedetection
2023.01.08 🚀 The whole network is the first to support yolov8
2023.01.20 update efficientdet, pphunmanseg

Installation

The following environments have been tested：

Ubuntu18.04

cuda11.3
cudnn8.2.0
gcc7.5.0
tensorrt8.4.2.4
opencv3.x or 4.x
cmake3.10.2

Windows10

cuda11.3
cudnn8.2.0
visual studio 2017 or 2019 or 2022
tensorrt8.4.2.4
opencv3.x or 4.x

Python environment(Optional）

# install miniconda first
conda create -n tensorrt-alpha python==3.8 -y
conda activate tensorrt-alpha
git clone https://github.com/FeiYull/tensorrt-alpha
cd tensorrt-alpha
pip install -r requirements.txt

Installation Tutorial：

Quick Start

Ubuntu18.04

set your TensorRT_ROOT path:

git clone https://github.com/FeiYull/tensorrt-alpha
cd tensorrt-alpha/cmake
vim common.cmake
# set var TensorRT_ROOT to your path in line 20, eg:
# set(TensorRT_ROOT /root/TensorRT-8.4.2.4)

start to build project: For example:yolov8

Windows10

waiting for update

Onnx

At present, more than 30 models have been implemented, and some onnx files of them are organized as follows:

model	weiyun	google driver
yolov3	weiyun	google driver
yolov4	weiyun	google driver
yolov5	weiyun	google driver
yolov6	weiyun	google driver
yolov7	weiyun	google driver
yolov8	weiyun	google driver
yolox	weiyun	google driver
yolor	weiyun	google driver
u2net	weiyun	google driver
libfacedetection	weiyun	google driver
facemesh	weiyun	google driver
pphumanseg	weiyun	google driver
efficientdet	weiyun	google driver
more...(🚀: I will be back soon!)

🍉We will test the time of all models on tesla v100 and A100! Now let's preview the performance of yolov8n on RTX2070m(8G)：

model	video resolution	model input size	GPU Memory-Usage	GPU-Util
yolov8n	1920x1080	8x3x640x640	1093MiB/7982MiB	14%

cost time per frame

Some Precision Alignment Renderings Comparison

yolov8n : Offical( left ) vs Ours( right )

yolov7-tiny : Offical( left ) vs Ours( right )

yolov6s : Offical( left ) vs Ours( right )

yolov5s : Offical( left ) vs Ours( right )

libfacedetection : Offical( left ) vs Ours( right topK:2000)

Reference

[0].https://github.com/NVIDIA/TensorRT
[1].https://github.com/onnx/onnx-tensorrt
[2].https://github.com/NVIDIA-AI-IOT/torch2trt
[3].https://github.com/shouxieai/tensorRT_Pro
[4].https://github.com/opencv/opencv_zoo

About

🔥《TensorRT-Alpha》🔥supports YOLOv8, YOLOv7, YOLOv6, YOLOv5, YOLOv4, YOLOv3, YOLOX, YOLOR and so on. It implements 🚀 CUDA C++🚀 accelerated deployment models.🍎CUDA IS ALL YOU NEED🍎.Best Wish!

Languages

Language:C++ 56.0%Language:Cuda 26.8%Language:CMake 10.0%Language:Python 7.0%Language:C 0.2%