SeungHui Youn's starred repositories
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
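A minimal sketch of what the library looks like in use, assuming the high-level pipeline API; the checkpoint name is only an illustration and any compatible model would work:

```python
# Minimal sketch of the transformers pipeline API.
from transformers import pipeline

# Downloads the model on first use and runs on CPU by default.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("Flash attention makes long contexts affordable."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```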
onnxruntime
ONNX Runtime: cross-platform, high-performance ML inference and training accelerator
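A minimal sketch of inference with ONNX Runtime's Python API; the model path, input name, and shape are placeholders standing in for a real exported model:

```python
# Minimal sketch of running an exported model with ONNX Runtime.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Feed a dummy batch; shapes must match what the model was exported with.
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```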
flash-attention
Fast and memory-efficient exact attention
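A minimal sketch of calling the packaged kernel from PyTorch, assuming a CUDA GPU and fp16 tensors in the (batch, seqlen, nheads, headdim) layout:

```python
# Minimal sketch of invoking the flash-attn kernel from PyTorch.
import torch
from flash_attn import flash_attn_func

q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")

# Exact attention, computed without materializing the full seqlen x seqlen score matrix.
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # (2, 1024, 8, 64)
```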
latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
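The attention-sink idea, paraphrased as a toy KV-cache eviction rule rather than the repo's actual API (all names below are made up): keep the first few "sink" positions plus a sliding window of recent positions.

```python
import torch

def evict_kv(keys, values, n_sink=4, window=1020):
    """Illustrative attention-sink eviction: keep the first n_sink positions
    plus the most recent `window` positions of a (batch, seqlen, heads, dim) cache."""
    seqlen = keys.shape[1]
    if seqlen <= n_sink + window:
        return keys, values
    keep = torch.cat([
        torch.arange(n_sink, device=keys.device),                 # attention-sink tokens
        torch.arange(seqlen - window, seqlen, device=keys.device) # recent tokens
    ])
    return keys[:, keep], values[:, keep]
```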
victor-mono
A free programming font with cursive italics and ligatures. Donations welcome ❤️
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including 8-bit floating point (FP8) precision on Hopper and Ada GPUs, delivering better performance with lower memory utilization in both training and inference.
executorch
On-device AI for PyTorch across mobile, embedded, and edge devices
flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
intel-npu-acceleration-library
Intel® NPU Acceleration Library
tt-budabackend
Buda Compiler Backend for Tenstorrent devices
paged-attention-triton
PagedAttention in Triton
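A rough sketch of the PagedAttention layout in plain PyTorch rather than Triton: the KV cache lives in fixed-size physical blocks, and a per-sequence block table maps logical blocks onto them. Names and shapes below are illustrative only, not this repo's API.

```python
import torch

def gather_kv(kv_blocks, block_table, seq_len, block_size=16):
    """kv_blocks: (num_blocks, block_size, heads, dim) physical KV pool.
    block_table: indices of the physical blocks owned by one sequence.
    Returns the contiguous (seq_len, heads, dim) view that an ordinary
    attention kernel would consume."""
    n_blocks = (seq_len + block_size - 1) // block_size
    gathered = kv_blocks[block_table[:n_blocks]]  # (n_blocks, block_size, heads, dim)
    return gathered.reshape(-1, *kv_blocks.shape[2:])[:seq_len]
```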