henryhmko

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ · License: Apache-2.0 · 8622 93 1948

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python · License: Apache-2.0 · 5992 57 629

Liger-Kernel

Efficient Triton Kernels for LLM Training

Language: Python · License: BSD-2-Clause · 3407 39 98

equinox

Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/

Language: Python · License: Apache-2.0 · 2105 24 454

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: Python · License: Apache-2.0 · 1958 35 349

ThunderKittens

Tile primitives for speedy kernels

Language: Cuda · License: MIT · 1645 29 27

ao

PyTorch native quantization and sparsity for training and inference

Language: Python · License: BSD-3-Clause · 1557 40 291

awesome-jax

JAX - A curated list of resources https://github.com/google/jax

License: CC0-1.0 · 1549 49 8

ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

Language: C++ · License: NOASSERTION · 1507 26 846

esm

Language: Python · License: NOASSERTION · 1255 20 89

lightning-thunder

Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.

Language: Python · License: Apache-2.0 · 1193 34 543

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook · License: Apache-2.0 · 1111 10 13

awesome-mixture-of-experts

A collection of AWESOME things about mixture-of-experts

967 25 2

melange-nvim

🗡️ Warm color scheme for Neovim and beyond

Language: Lua · License: MIT · 723 3 38

MS-AMP

Microsoft Automatic Mixed Precision Library

Language: Python · License: MIT · 523 11 65

snowflake-arctic

Language: Python · License: Apache-2.0 · 517 6 12

Awesome-GPU

Awesome resources for GPUs

License: BSD-3-Clause · 490 240

MS-SNSD

The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired.

Language: HTML · License: MIT · 484 20 15