There are 34 repositories under the inference topic.
Cross-platform, customizable ML solutions for live and streaming media.
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
💎 1MB lightweight face detection model
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Colossal-AI: A Unified Deep Learning System for Big Model Era
TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Building on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, while drawing on the extensibility and high performance of existing open-source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome; work in collaboration with us to make TNN a better framework.
An easy-to-use PyTorch to TensorRT converter
OpenVINO™ Toolkit repository
TypeDB: a strongly-typed database
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
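The evasion attacks that toolkits like ART implement (e.g., the fast gradient sign method) perturb an input in the direction that most increases the model's loss. A minimal sketch of that idea on a toy linear classifier, using only the standard library; the weights, inputs, and function names here are illustrative, not ART's API:

```python
# Toy sketch of an FGSM-style evasion attack on a linear classifier.
# All values and names are illustrative (not the ART library's API).
def sign(v):
    return (v > 0) - (v < 0)

def score(w, x):
    # Positive score -> class 1, negative -> class 0.
    return sum(wi * xi for wi, xi in zip(w, x))

def fgsm(w, x, eps):
    # For a linear model the gradient of the score w.r.t. x is just w;
    # stepping eps * sign(w) against the current class flips the
    # decision with the smallest max-norm perturbation.
    return [xi - eps * sign(wi) for wi, xi in zip(w, x)]

w = [1.0, -2.0, 0.5]
x = [0.2, -0.1, 0.3]          # score(w, x) = 0.55 -> class 1
x_adv = fgsm(w, x, eps=0.3)   # small per-feature perturbation
# score(w, x_adv) is now negative: the prediction flipped.
```

The same principle scales to deep networks, where the gradient is obtained by backpropagation instead of being the weight vector itself.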
🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.
LightSeq: A High Performance Library for Sequence Processing and Generation
TensorFlow template application for deep learning
A curated list of research in machine learning systems (MLSys). Paper notes are also provided.
Acceleration package for neural networks on multi-core CPUs
DELTA is a deep learning based natural language and speech processing platform.
Deploy a ML inference service on a budget in less than 10 lines of code.
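To make "an inference service in under 10 lines" concrete, here is a generic sketch using only the Python standard library; the fixed-weight `predict` function stands in for a real model, and all names are assumptions of this sketch rather than any particular framework's API:

```python
# Minimal JSON-over-HTTP inference service sketch (stdlib only).
# The "model" is a hypothetical fixed linear scorer, not a real model.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    weights = [0.5, -0.25, 1.0]  # placeholder weights (assumption)
    return sum(w * x for w, x in zip(weights, features))

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        score = predict(json.loads(body)["features"])
        payload = json.dumps({"score": score}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To serve: HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()
```

Real serving frameworks add batching, model loading, and health checks on top of essentially this request/response loop.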
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
Easy-to-use library to boost AI inference leveraging state-of-the-art optimization techniques.
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Challenge projects for inferencing machine learning models on iOS
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
A uniform interface to run deep learning models from multiple frameworks
Multi Model Server is a tool for serving neural net models for inference
Package for causal inference in graphs and in the pairwise settings. Tools for graph structure recovery and dependencies are included.
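One primitive behind graph structure recovery is a conditional-independence test: if X and Y are dependent only through a common cause Z, their partial correlation given Z should vanish. A self-contained sketch of that test on synthetic data, with all variable names illustrative:

```python
# Sketch of partial correlation as a conditional-independence check,
# a building block of constraint-based graph structure recovery.
import math
import random

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = math.sqrt(sum((x - ma) ** 2 for x in a))
    vb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb)

def partial_corr(x, y, z):
    # r(X,Y | Z) from the three pairwise correlations.
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))

# Synthetic "common cause" structure: Z drives both X and Y.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(5000)]
x = [v + rng.gauss(0, 0.5) for v in z]
y = [v + rng.gauss(0, 0.5) for v in z]
# X and Y look strongly correlated, but conditioning on Z removes it.
```

Constraint-based recovery algorithms run many such tests over candidate conditioning sets to decide which edges to keep in the graph.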
Bolt is a deep learning library with high performance and heterogeneous flexibility.
TensorFlow models accelerated with NVIDIA TensorRT
📚 A collection of Jupyter notebooks for learning and experimenting with OpenVINO 👓
Neural network inference engine that delivers GPU-class performance for sparsified models on CPUs
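Why sparsity helps CPU inference can be seen in miniature: a weight matrix with most entries pruned to zero can be stored in compressed sparse row (CSR) form and multiplied in time proportional to the number of nonzeros rather than the full dimensions. A didactic stdlib-only sketch, not the engine's actual kernel:

```python
# CSR storage and sparse matrix-vector multiply, in miniature.
# Work scales with nonzeros, which is the basic win from pruning.
def to_csr(dense):
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    out = []
    for r in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            s += values[k] * x[col_idx[k]]
        out.append(s)
    return out

dense = [
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 3.0],
]
v, c, p = to_csr(dense)
# csr_matvec(v, c, p, [1.0, 1.0, 1.0]) == [2.0, 0.0, 4.0]
```

Production engines add cache-friendly blocking and vectorized kernels on top of this layout, but the asymptotic saving comes from skipping the zeros.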
The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.
Embedded and mobile deep learning research resources