There are 10 repositories under the tvm topic.
Bringing Stable Diffusion models to web browsers. Everything runs inside the browser with no server support.
AutoKernel is a simple, easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.
yolort is a runtime stack for YOLOv5 on specialized accelerators such as TensorRT, LibTorch, ONNX Runtime, TVM, and NCNN.
🗣️ Chat with LLMs like Vicuna entirely in your browser with WebGPU, safely, privately, and with no server. Powered by WebLLM.
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
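As an illustration of low-precision quantization in PyTorch, the following minimal sketch uses the stock torch.ao.quantization FX graph-mode API rather than the listed library's own interface (which is not shown here); the model and calibration data are hypothetical placeholders.

```python
# Minimal sketch: post-training int8 quantization with stock PyTorch
# (torch.ao.quantization, FX graph mode). Not the listed library's API;
# model and calibration tensors are hypothetical placeholders.
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
).eval()

example = torch.randn(1, 3, 32, 32)
qconfig_mapping = get_default_qconfig_mapping("fbgemm")  # int8 config for x86

prepared = prepare_fx(model, qconfig_mapping, example)   # insert observers
with torch.no_grad():                                    # calibration pass
    for _ in range(8):
        prepared(torch.randn(1, 3, 32, 32))

int8_model = convert_fx(prepared)                        # materialize int8 ops
print(int8_model(example).shape)
```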
TON Foundation invites talent to imagine and realize projects that have the potential to integrate with the daily lives of users.
🚀🚀🚀 This repository lists some awesome public CUDA, cuda-python, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR, PTX and High Performance Computing (HPC) projects.
Optimizing Mobile Deep Learning on ARM GPU with TVM
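For context, compiling a model for a Mali-class ARM GPU with TVM's Relay pipeline typically looks like the minimal sketch below; the ONNX model path and the host target triple are placeholders, not taken from the repository above.

```python
# Minimal sketch: compiling an ONNX model for an ARM Mali GPU with TVM Relay.
# "model.onnx" is a hypothetical model file; the host triple assumes an
# AArch64 Linux board driving the Mali device.
import onnx
import tvm
from tvm import relay

onnx_model = onnx.load("model.onnx")
mod, params = relay.frontend.from_onnx(onnx_model)

# OpenCL target for the Mali GPU, with an ARM CPU host for the rest of the graph.
target = tvm.target.Target(
    "opencl -device=mali",
    host="llvm -mtriple=aarch64-linux-gnu",
)

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

# Export a deployable shared library (cross-compilation toolchain setup omitted).
lib.export_library("model_mali.so")
```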
Solidity compiler for TVM
TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together
A hands-on tutorial for learning the core principles of TVM.
A curated list of awesome inference deployment framework of artificial intelligence (AI) models. OpenVINO, TensorRT, MediaPipe, TensorFlow Lite, TensorFlow Serving, ONNX Runtime, LibTorch, NCNN, TNN, MNN, TVM, MACE, Paddle Lite, MegEngine Lite, OpenPPL, Bolt, ExecuTorch.
Real-time face detector for large input sizes, written in C++. It also supports face verification using MobileFaceNet+ArcFace with real-time inference. Over 30 FPS at 480P on CPU.
Machine Learning Compiler Road Map
⛏ Boilerplate for mining your very first NFT and becoming a TVM Developer.
Streamline Ethereum, Solana, Aptos, Sui and Tron operations. Effortlessly create transactions, interact with smart contracts, sign, and send transactions for a seamless blockchain experience.
This project contains a code generator that produces static C NN inference deployment code targeting tiny microcontrollers (TinyML), as a replacement for other µTVM runtimes. The tool generates a runtime that statically executes the compiled model, reducing code size and execution-time overhead compared to a dynamic on-device runtime.
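For comparison, stock TVM can also emit static C sources through its ahead-of-time executor and minimal C runtime; the sketch below shows that standard flow (not this project's generator), with a hypothetical ONNX model path.

```python
# Minimal sketch of stock TVM's AOT flow emitting static C sources with the
# minimal C runtime (CRT). "model.onnx" is a hypothetical placeholder; this is
# not the code generator provided by the project above.
import onnx
import tvm
from tvm import relay, micro
from tvm.relay.backend import Executor, Runtime

mod, params = relay.frontend.from_onnx(onnx.load("model.onnx"))

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(
        mod,
        target="c",                 # emit plain C sources instead of machine code
        executor=Executor("aot"),   # ahead-of-time executor, no graph interpreter
        runtime=Runtime("crt"),     # minimal C runtime suitable for bare metal
        params=params,
    )

# Package the generated C sources plus metadata as a Model Library Format archive.
micro.export_model_library_format(lib, "model_mlf.tar")
```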