xrdaukar's repositories
ALCF_Hands_on_HPC_Workshop
The ALCF hosts a regular simulation, data, and learning workshop to help users scale their applications. This repository contains the examples used in the workshop.
Megatron-DeepSpeed
Ongoing research on training transformer language models at scale, including BERT & GPT-2
mlx-examples
Examples in the MLX framework
ring-flash-attention
Ring attention implementation with flash attention
RustOrBust
My Rust deep dive down the rabbit hole!
tree_attention
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
aminrezae-memory-efficient-attention
Memory Efficient Attention (O(sqrt(n)) memory) for JAX and PyTorch
attention-gym
Helpful tools and examples for working with flex-attention
cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
context-parallelism
Context Parallelism, supporting Blockwise Attention, Ring Attention, and Tree Attention.
EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
LightSeq
Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
oyEasyContext
Modified version of the original EasyContext repo
LLaVA-MORE
LLaVA-MORE: Enhancing Visual Instruction Tuning with LLaMA 3.1
pytorch-memory-efficient-attention-pytorch
Implementation of memory-efficient multi-head attention as proposed in the paper "Self-attention Does Not Need O(n²) Memory" (a brief sketch of the idea appears after this list)
ringattention
Transformers with Arbitrarily Large Context
smol-vision
Recipes for shrinking, optimizing, and customizing cutting-edge vision models. 💜
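
For reference, here is a minimal, illustrative sketch of the chunked-attention idea behind the two memory-efficient attention repositories listed above, following the paper "Self-attention Does Not Need O(n²) Memory". It is not code from either repository: the function name chunked_attention and the key_chunk parameter are placeholders chosen here. The point it demonstrates is that keys and values can be streamed in blocks with a running log-sum-exp, so the full n-by-n score matrix is never materialized.

# Assumed, simplified sketch (PyTorch); not taken from any repository above.
import math
import torch

def chunked_attention(q, k, v, key_chunk=1024):
    """q, k, v: (batch, seq, dim). Computes softmax(q k^T / sqrt(dim)) v
    while materializing only (seq, key_chunk) score blocks at a time."""
    b, n, d = q.shape
    scale = 1.0 / math.sqrt(d)
    # Running statistics for a numerically stable streaming softmax.
    acc = torch.zeros_like(q)                        # running weighted sum of values
    row_max = q.new_full((b, n, 1), -float("inf"))   # running max of scores per query
    row_sum = torch.zeros(b, n, 1, dtype=q.dtype, device=q.device)
    for start in range(0, n, key_chunk):
        k_blk = k[:, start:start + key_chunk]
        v_blk = v[:, start:start + key_chunk]
        scores = torch.einsum("bqd,bkd->bqk", q, k_blk) * scale
        blk_max = scores.amax(dim=-1, keepdim=True)
        new_max = torch.maximum(row_max, blk_max)
        # Rescale previously accumulated results to the new running max.
        correction = torch.exp(row_max - new_max)
        p = torch.exp(scores - new_max)
        acc = acc * correction + torch.einsum("bqk,bkd->bqd", p, v_blk)
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        row_max = new_max
    return acc / row_sum

A quick sanity check is to compare its output on small random tensors against plain attention, torch.softmax(q @ k.transpose(-1, -2) / math.sqrt(d), dim=-1) @ v; the two should agree to floating-point tolerance.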