YOLOv8-TensorRT
YOLOv8 accelerated with TensorRT!
Prepare the environment
- Install CUDA, following the CUDA official website.
  🚀 RECOMMENDED: CUDA >= 11.4
- Install TensorRT, following the TensorRT official website.
  🚀 RECOMMENDED: TensorRT >= 8.4
- Install python requirements.
  pip install -r requirements.txt
- Install the ultralytics package for ONNX export or TensorRT API building.
  pip install ultralytics
- Prepare your own PyTorch weight such as yolov8s.pt or yolov8s-seg.pt.
NOTICE:
Please use the latest CUDA and TensorRT so that you can achieve the fastest speed!
If you have to use a lower version of CUDA and TensorRT, please read the relevant issues carefully!
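To check an installation against the recommended versions, a small script can compare version strings with the minimums. This is a minimal sketch: `parse_version`, `meets_minimum` and `installed_tensorrt_version` are hypothetical helper names, and the only library call assumed is the `tensorrt.__version__` attribute of the TensorRT Python package.

```python
import re

# Minimum versions recommended in this README.
MIN_CUDA = (11, 4)
MIN_TENSORRT = (8, 4)


def parse_version(text):
    """Turn a version string such as '8.6.1.6' or '11.8' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def meets_minimum(version_text, minimum):
    """Return True when version_text is at least the given minimum tuple."""
    return parse_version(version_text)[: len(minimum)] >= minimum


def installed_tensorrt_version():
    """Return the TensorRT version string, or None if the package is missing."""
    try:
        import tensorrt  # optional; only importable after TensorRT is installed
    except ImportError:
        return None
    return tensorrt.__version__


if __name__ == "__main__":
    trt_version = installed_tensorrt_version()
    if trt_version is None:
        print("TensorRT Python package not found")
    elif meets_minimum(trt_version, MIN_TENSORRT):
        print("TensorRT", trt_version, "meets the recommendation")
    else:
        print("TensorRT", trt_version, "is older than recommended")
```

The CUDA version can be checked the same way by passing the version string reported by your CUDA installation to `meets_minimum(..., MIN_CUDA)`.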
Normal Usage
If you get the ONNX model from the original ultralytics repo, you have to build the engine yourself.
In that case you can only use the C++ inference code to deserialize the engine and run inference; the other scripts won't work.
You can get more information in Normal.md!
Export End2End ONNX with NMS
You can export your ONNX model with the ultralytics API and add postprocessing, such as the bbox decoder and NMS, into the ONNX model at the same time.
python3 export-det.py \
--weights yolov8s.pt \
--iou-thres 0.65 \
--conf-thres 0.25 \
--topk 100 \
--opset 11 \
--sim \
--input-shape 1 3 640 640 \
--device cuda:0
Description of all arguments
- --weights: The PyTorch model you trained.
- --iou-thres: IOU threshold for NMS plugin.
- --conf-thres: Confidence threshold for NMS plugin.
- --topk: Max number of detection bboxes.
- --opset: ONNX opset version, default is 11.
- --sim: Whether to simplify your ONNX model.
- --input-shape: Input shape for your model, should be 4 dimensions.
- --device: The CUDA device used for the export.
You will get an ONNX model whose prefix is the same as the input weights.
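To make the roles of `--iou-thres`, `--conf-thres` and `--topk` concrete, here is a pure-Python sketch of a greedy, class-agnostic NMS. It only illustrates what the three thresholds control; it is not the NMS plugin's implementation, and the function names and box format `(x1, y1, x2, y2)` are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thres=0.65, conf_thres=0.25, topk=100):
    """Return indices of kept boxes, highest score first."""
    # conf_thres: drop low-confidence candidates before suppression.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thres),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # topk: cap the number of detections returned.
        if len(keep) >= topk:
            break
        # iou_thres: suppress a box that overlaps an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7, 0.1]
    print(nms(boxes, scores))  # [0, 2]
```

In the example, box 1 is suppressed because its IoU with box 0 is 81/119 (about 0.68), which exceeds 0.65, and box 3 is dropped because its score is below 0.25.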
Just Taste First
If you just want to try it out first, you can download the ONNX models that were exported by the YOLOv8 package and modified by me.
Build End2End Engine from ONNX
1. Build Engine by TensorRT ONNX Python API
You can export a TensorRT engine from ONNX with build.py.
Usage:
python3 build.py \
--weights yolov8s.onnx \
--iou-thres 0.65 \
--conf-thres 0.25 \
--topk 100 \
--fp16 \
--device cuda:0
Description of all arguments
- --weights: The ONNX model you downloaded.
- --iou-thres: IOU threshold for NMS plugin.
- --conf-thres: Confidence threshold for NMS plugin.
- --topk: Max number of detection bboxes.
- --fp16: Whether to export half-precision engine.
- --device: The CUDA device used to export the engine.
You can modify iou-thres, conf-thres and topk yourself.
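If you want to build several engines with different NMS settings, the command line can be assembled programmatically. The sketch below only uses the flags listed above; `build_command` is a hypothetical helper, and its default values are copied from the usage example rather than from build.py itself.

```python
import shlex


def build_command(weights, iou_thres=0.65, conf_thres=0.25, topk=100,
                  fp16=True, device="cuda:0"):
    """Assemble the argument list for build.py from the options listed above."""
    cmd = [
        "python3", "build.py",
        "--weights", weights,
        "--iou-thres", str(iou_thres),
        "--conf-thres", str(conf_thres),
        "--topk", str(topk),
    ]
    if fp16:
        cmd.append("--fp16")  # flag without a value: present means half precision
    cmd += ["--device", device]
    return cmd


if __name__ == "__main__":
    # Print one command line per IoU threshold (illustrative values only).
    for iou_thres in (0.5, 0.65):
        print(shlex.join(build_command("yolov8s.onnx", iou_thres=iou_thres)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to actually run the build.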
2. Export Engine by Trtexec Tools
You can export a TensorRT engine with the trtexec tool.
Usage:
/usr/src/tensorrt/bin/trtexec \
--onnx=yolov8s.onnx \
--saveEngine=yolov8s.engine \
--fp16
If you installed TensorRT from a debian package, trtexec is installed at /usr/src/tensorrt/bin/trtexec.
If you installed TensorRT from a tar package, trtexec is under the bin folder in the path where you decompressed it.
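The two install locations above can be probed from a script. This is a sketch under stated assumptions: `find_trtexec` and its `tar_root` argument (the directory a tar package was decompressed into) are hypothetical, and the fallback to `PATH` is an addition of this example.

```python
import os
import shutil


def find_trtexec(tar_root=None):
    """Return a path to trtexec, or None when it cannot be found.

    tar_root is a hypothetical argument: the directory a TensorRT tar
    package was decompressed into.
    """
    candidates = []
    if tar_root is not None:
        candidates.append(os.path.join(tar_root, "bin", "trtexec"))
    candidates.append("/usr/src/tensorrt/bin/trtexec")  # debian package location
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return shutil.which("trtexec")  # fall back to whatever is on PATH


if __name__ == "__main__":
    print(find_trtexec() or "trtexec not found")
```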
Build TensorRT Engine by TensorRT API
Please see more information in API-Build.md
Notice !!! We don't support YOLOv8-seg model now !!!
Inference
1. Infer with python script
You can run inference on images with the engine using infer-det.py.
Usage:
python3 infer-det.py \
--engine yolov8s.engine \
--imgs data \
--show \
--out-dir outputs \
--device cuda:0
Description of all arguments
- --engine: The engine you exported.
- --imgs: The path of the images you want to detect.
- --show: Whether to show detection results.
- --out-dir: Where to save the detection result images. It has no effect when the --show flag is used.
- --device: The CUDA device you use.
- --profile: Profile the TensorRT engine.
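Since `--imgs` is a path and `--out-dir` is where results are written, the mapping from inputs to outputs can be sketched as follows. This is not taken from infer-det.py: the helper names and the set of accepted image suffixes are assumptions of this example.

```python
from pathlib import Path

# Assumed set of image suffixes; the real script may accept a different set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_images(imgs):
    """Return a sorted list of image paths for a file or directory argument."""
    path = Path(imgs)
    if path.is_dir():
        return sorted(p for p in path.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
    if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
        return [path]
    return []


def output_path(image, out_dir):
    """Map an input image to its destination inside out_dir."""
    return Path(out_dir) / Path(image).name


if __name__ == "__main__":
    for image in collect_images("data"):
        print(image, "->", output_path(image, "outputs"))
```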
2. Infer with C++
You can infer with C++ in csrc/detect/end2end.
Build:
Please set your own libraries in CMakeLists.txt and modify CLASS_NAMES and COLORS in main.cpp.
export root=${PWD}
cd csrc/detect/end2end
mkdir -p build && cd build
cmake ..
make
mv yolov8 ${root}
cd ${root}
Usage:
# infer image
./yolov8 yolov8s.engine data/bus.jpg
# infer images
./yolov8 yolov8s.engine data
# infer video
./yolov8 yolov8s.engine data/test.mp4 # the video path
TensorRT Segment Deploy
Please see more information in Segment.md
TensorRT Pose Deploy
Please see more information in Pose.md
DeepStream Detection Deploy
See more in README.md
Jetson Deploy
Only tested on Jetson-NX 4GB.
See more in Jetson.md
Profile your engine
If you want to profile the TensorRT engine:
Usage:
python3 trt-profile.py --engine yolov8s.engine --device cuda:0
Refuse To Use PyTorch for Model Inference !!!
If you need to break away from PyTorch and run inference with TensorRT only, you can get more information in infer-det-without-torch.py.
The usage is the same as the PyTorch version, but its performance is much worse.
You can use cuda-python or pycuda for inference.
Please install one of them with the following command:
pip install cuda-python
# or
pip install pycuda
Usage:
python3 infer-det-without-torch.py \
--engine yolov8s.engine \
--imgs data \
--show \
--out-dir outputs \
--method cudart
Description of all arguments
- --engine: The engine you exported.
- --imgs: The path of the images you want to detect.
- --show: Whether to show detection results.
- --out-dir: Where to save the detection result images. It has no effect when the --show flag is used.
- --method: Choose cudart or pycuda, default is cudart.
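Before choosing a value for `--method`, you can check which of the two packages is importable. The sketch below assumes that `pip install cuda-python` provides a top-level `cuda` package and `pip install pycuda` provides `pycuda`; the helper names and the fallback to the first available backend are choices of this example, not behaviour of infer-det-without-torch.py.

```python
import importlib.util


def available_methods(has_module=None):
    """List the --method values whose Python package can be imported."""
    if has_module is None:
        # find_spec returns None for a missing top-level package.
        has_module = lambda name: importlib.util.find_spec(name) is not None
    methods = []
    if has_module("cuda"):    # provided by `pip install cuda-python`
        methods.append("cudart")
    if has_module("pycuda"):  # provided by `pip install pycuda`
        methods.append("pycuda")
    return methods


def pick_method(requested="cudart", has_module=None):
    """Return the requested method if usable, else the first available one."""
    methods = available_methods(has_module)
    if requested in methods:
        return requested
    if methods:
        return methods[0]
    raise RuntimeError("install cuda-python or pycuda first")


if __name__ == "__main__":
    print("usable --method values:", available_methods())
```

The optional `has_module` argument exists only so the selection logic can be exercised without either package installed.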