ChaosCodes's starred repositories
attention-gym
Helpful tools and examples for working with flex-attention
torchtitan
A native PyTorch library for large model training
Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
ThunderKittens
Tile primitives for speedy kernels
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
EasyContext
Memory-optimization and training recipes for extrapolating language models' context length to 1 million tokens, with minimal hardware.
x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
scattermoe
Triton-based implementation of Sparse Mixture of Experts.
LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
sailor-llm
⚓️ Sailor: Open Language Models for South-East Asia
DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
text-dedup
All-in-one text de-duplication
tangent_task_arithmetic
Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".