There are 8 repositories under the megatron-lm topic.
Best practices for training LLaMA models in Megatron-LM
Annotations of interesting ML papers I've read
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Odysseus: Playground of LLM Sequence Parallelism
A LLaMA1/LLaMA2 Megatron implementation.
Training an NVIDIA NeMo Megatron large language model (LLM) using the NeMo Framework on Google Kubernetes Engine
A Megatron-LM/GPT-NeoX-compatible text encoder with the 🤗 Transformers AutoTokenizer.
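
The entry above relies on the generic 🤗 Transformers loading path. As a minimal sketch, a GPT-NeoX-style tokenizer loads through `AutoTokenizer` like this; the checkpoint name below is a public GPT-NeoX example, not this repository's own model:

```python
from transformers import AutoTokenizer

# EleutherAI/gpt-neox-20b is a public GPT-NeoX checkpoint whose tokenizer
# loads through the generic AutoTokenizer entry point.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# Encode a sentence into input IDs as a PyTorch tensor.
ids = tokenizer("Megatron-LM compatible text encoding", return_tensors="pt")
print(ids["input_ids"].shape)
```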
Minimal yet high-performance code for pretraining LLMs; implements some SOTA features and supports training through DeepSpeed, Megatron-LM, and FSDP (see the sketch below). Work in progress.
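
Of the three backends the last entry names, FSDP is the easiest to illustrate. Below is a minimal sketch of wrapping a model with PyTorch's `FullyShardedDataParallel`; the toy model, dummy loss, and launch assumptions (a `torchrun` multi-GPU launch with NCCL) are illustrative and not taken from the repository:

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Assumes launch via torchrun, which sets rank/world-size env vars.
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Toy model standing in for a real transformer stack.
model = torch.nn.TransformerEncoderLayer(d_model=1024, nhead=16).cuda()
model = FSDP(model)  # parameters are sharded across ranks

# Construct the optimizer after wrapping, so it sees the sharded params.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 128, 1024, device="cuda")
loss = model(x).pow(2).mean()  # dummy loss for illustration
loss.backward()
optimizer.step()

dist.destroy_process_group()
```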