sqhao's repositories
onnx-tensorrt
ONNX-TensorRT: TensorRT backend for ONNX
openvino
OpenVINO™ Toolkit repository
openvino2tensorflow
Converts OpenVINO IR models to TensorFlow saved_model, TFLite, h5, TFJS, TF-TRT (TensorRT), CoreML, EdgeTPU, ONNX, and pb formats. Typical pipeline: PyTorch (NCHW) -> ONNX (NCHW) -> OpenVINO (NCHW) -> openvino2tensorflow -> TensorFlow/Keras (NHWC) -> TFLite (NHWC). Also supports conversion between .pb, saved_model, .tflite, and ONNX. Includes Docker build environments with direct access to the host PC's GUI and camera for verifying operation, plus NVIDIA GPU (dGPU) and Intel iHD GPU (iGPU) support.
3d-photo-inpainting
[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting
backend
Common source, scripts and utilities for creating Triton backends.
cnpy
library to read/write .npy and .npz files in C/C++
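The .npy/.npz files cnpy reads and writes are the same ones NumPy produces, so a quick Python round-trip (a sketch for context, not part of cnpy itself) shows the data layout a C++ consumer would see:

```python
import numpy as np

# Save a small float32 array in .npy format (shape and dtype are
# stored in the file header, followed by the raw buffer).
data = np.arange(6, dtype=np.float32).reshape(2, 3)
np.save("example.npy", data)

# Load it back; cnpy's npy_load exposes the same shape/word_size
# metadata plus a pointer to the raw data on the C++ side.
loaded = np.load("example.npy")
assert loaded.shape == (2, 3)
assert np.array_equal(loaded, data)
```

On the C++ side, cnpy returns the equivalent shape and word-size metadata along with the raw buffer, which makes it convenient for feeding test tensors into inference code.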
flashinfer
FlashInfer: Kernel Library for LLM Serving
gemma.cpp
lightweight, standalone C++ inference engine for Google's Gemma models.
highway
Performance-portable, length-agnostic SIMD with runtime dispatch
onnx
Open standard for machine learning interoperability
ONNX-Python-Examples
ONNX Python Examples
onnx-simplifier
Simplify your onnx model
optimizer
Actively maintained ONNX Optimizer
perf-ninja
This is an online course where you can learn and master the skill of low-level performance analysis and tuning.
pillow-resize
Port of Pillow's resize method to C++ and OpenCV.
prajna
A programming language for AI infrastructure.
pybind11
Seamless operability between C++11 and Python
rocketmq-client-cpp
Apache RocketMQ C++ client
stable-diffusion.cpp
Stable Diffusion in pure C/C++
stable-fast
An inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
tensorRT_Pro
C++ library for TensorRT integration.
treelite
model compiler for decision tree ensembles
trt-samples-for-hackathon-cn
Simple samples for TensorRT programming
TRTorch
PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
whisper
Robust Speech Recognition via Large-Scale Weak Supervision