Kazuki Fujii's repositories
Megatron-LM
Ongoing research training transformer models at scale
NeMo-Aligner
Scalable toolkit for efficient model alignment
llm-recipes
Ongoing research project for continual pre-training of LLMs (dense models)
moe-recipes
Ongoing research training Mixture of Experts models.
nanotron
Minimalistic large language model 3D-parallelism training
torchtitan
A native PyTorch Library for large model training
llama3v
A SOTA vision model built on top of llama3 8B.
grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
llama-recipes
Examples and recipes for Llama 2 models
deploymentmanager-samples
Deployment Manager samples and templates.
levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
ml-engineering
Machine Learning Engineering Open Book
multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Megatron-LM-ABCI
NVIDIA Megatron-LM fork
NeMo
NeMo: a toolkit for conversational AI
NeMo-Megatron-Launcher
NeMo Megatron launcher and tools
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.