samsara's starred repositories
pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
mlir-python-bindings
MLIR Python Bindings
mistral-inference
Official inference library for Mistral models
llm-action
This project aims to share the technical principles behind large models, along with hands-on experience.
ControlNet_TensorRT
Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Contest: third-place solution in the preliminary round
ControlNet
Let us control diffusion models!
ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
buddy-mlir
An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).
Compass_Optimizer
Compass Optimizer (OPT for short) is part of the Zhouyi Compass Neural Network Compiler. OPT converts the float Intermediate Representation (IR) generated by the Compass Unified Parser into an optimized quantized or mixed IR suited to Zhouyi NPU hardware platforms.
flash-attention
Fast and memory-efficient exact attention
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
annotated-transformer
An annotated implementation of the Transformer paper.