AI Inference Deployment Acceleration (nndeploy)

AI Inference Deployment Acceleration's repositories

nndeploy

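nndeploy is an end-to-end model deployment framework. Built on multi-platform inference and directed acyclic graph (DAG) based model deployment, it aims to give users a cross-platform, easy-to-use, high-performance model deployment experience.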

Language: C++ | License: Apache-2.0 | Stargazers: 503 | Issues: 19 | Issues: 11

Awesome-LLM-Inference

💻 A small collection of Awesome LLM Inference resources [Papers|Blogs|Docs] with code, covering TensorRT-LLM, streaming-llm, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

License: GPL-3.0 | Stargazers: 2 | Issues: 0 | Issues: 0

onnx-simplifier

Simplify your ONNX model

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0