There are 31 repositories under the inference topic.
Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Port of OpenAI's Whisper model in C/C++
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.
💎 A 1MB lightweight face-detection model
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
The Triton Inference Server provides an optimized cloud and edge inference solution.
OpenVINO™ Toolkit repository
TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and draws on the extensibility and high performance of existing open-source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome; collaborate with us to make TNN a better framework.
An easy-to-use PyTorch to TensorRT converter
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
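ART packages evasion attacks (among poisoning, extraction, and inference attacks) behind ready-made attack classes. A minimal stdlib sketch of the idea behind one classic evasion attack, FGSM, on a toy logistic model with a hand-written gradient (hypothetical weights and inputs, not ART's API):

```python
import math

# FGSM-style evasion sketch: nudge each input feature by eps in the
# direction that increases the model's loss. Toy logistic model with
# made-up weights; real attacks use ART's attack classes on trained models.

def loss(w, x, y):
    """Logistic loss for label y in {-1, +1}."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return math.log(1 + math.exp(-y * z))

def fgsm(w, x, y, eps):
    """Perturb x by eps * sign(dL/dx).
    For logistic loss, dL/dx_i has the sign of -y * w_i."""
    sign = lambda v: (v > 0) - (v < 0)
    return [xi + eps * sign(-y * wi) for xi, wi in zip(x, w)]

w = [0.8, -0.5, 0.3]   # model weights (toy values)
x = [1.0, 2.0, -1.0]   # clean input
y = 1                   # true label

x_adv = fgsm(w, x, y, eps=0.25)
print(loss(w, x, y), "<", loss(w, x_adv, y))  # adversarial loss is higher
```

The same sign-of-gradient step, applied to a deep network's input gradient, is the fast gradient sign method that many evasion attacks build on.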
Pre-trained Deep Learning models and demos (high quality and extremely fast)
Faster Whisper transcription with CTranslate2
A curated list of research in machine learning systems (MLSys). Paper notes are also provided.
TensorFlow template application for deep learning
Large Language Model Text Generation Inference
Acceleration package for neural networks on multi-core CPUs
Inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
Cloud-native, one-stop machine learning platform: multi-tenancy, data assets, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving, multi-cluster scheduling, multi-project resource groups, edge computing, real-time large-model training, and an AI app store
📚 Jupyter notebook tutorials for OpenVINO™
Challenge projects for running machine learning model inference on iOS