SeungHui Youn's starred repositories
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
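A minimal sketch of what the library looks like in use, assuming the high-level pipeline API; the checkpoint name is only an illustration and any compatible model would work:

```python
# Minimal sketch of the transformers pipeline API.
from transformers import pipeline

# Downloads the model on first use and runs on CPU by default.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("Flash attention makes long contexts affordable."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```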
onnxruntime
ONNX Runtime: cross-platform, high-performance ML inference and training accelerator
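A minimal sketch of inference with ONNX Runtime's Python API; the model path, input name, and shape are placeholders standing in for a real exported model:

```python
# Minimal sketch of running an exported model with ONNX Runtime.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Feed a dummy batch; shapes must match what the model was exported with.
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```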
flash-attention
Fast and memory-efficient exact attention
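A minimal sketch of calling the packaged kernel from PyTorch, assuming a CUDA GPU and fp16 tensors in the (batch, seqlen, nheads, headdim) layout:

```python
# Minimal sketch of invoking the flash-attn kernel from PyTorch.
import torch
from flash_attn import flash_attn_func

q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")

# Exact attention, computed without materializing the full seqlen x seqlen score matrix.
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # (2, 1024, 8, 64)
```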
latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
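The attention-sink idea, paraphrased as a toy KV-cache eviction rule rather than the repo's actual API (all names below are made up): keep the first few "sink" positions plus a sliding window of recent positions.

```python
import torch

def evict_kv(keys, values, n_sink=4, window=1020):
    """Illustrative attention-sink eviction: keep the first n_sink positions
    plus the most recent `window` positions of a (batch, seqlen, heads, dim) cache."""
    seqlen = keys.shape[1]
    if seqlen <= n_sink + window:
        return keys, values
    keep = torch.cat([
        torch.arange(n_sink, device=keys.device),                 # attention-sink tokens
        torch.arange(seqlen - window, seqlen, device=keys.device) # recent tokens
    ])
    return keys[:, keep], values[:, keep]
```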
victor-mono
A free programming font with cursive italics and ligatures. Donations welcome ❤️
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including 8-bit floating point (FP8) precision on Hopper and Ada GPUs, delivering better performance with lower memory utilization in both training and inference.
executorch
On-device AI for PyTorch across mobile, embedded, and edge devices
flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
intel-npu-acceleration-library
Intel® NPU Acceleration Library
tt-budabackend
Buda Compiler Backend for Tenstorrent devices
paged-attention-triton
PagedAttention in Triton
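A rough sketch of the PagedAttention layout in plain PyTorch rather than Triton: the KV cache lives in fixed-size physical blocks, and a per-sequence block table maps logical blocks onto them. Names and shapes below are illustrative only, not this repo's API.

```python
import torch

def gather_kv(kv_blocks, block_table, seq_len, block_size=16):
    """kv_blocks: (num_blocks, block_size, heads, dim) physical KV pool.
    block_table: indices of the physical blocks owned by one sequence.
    Returns the contiguous (seq_len, heads, dim) view that an ordinary
    attention kernel would consume."""
    n_blocks = (seq_len + block_size - 1) // block_size
    gathered = kv_blocks[block_table[:n_blocks]]  # (n_blocks, block_size, heads, dim)
    return gathered.reshape(-1, *kv_blocks.shape[2:])[:seq_len]
```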