dice-group / LOLA-Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

dice-group/LOLA-Megatron-DeepSpeed Issues

No issues in this repository yet.