PyTorch Lightning Fused optimizer step
jahatef opened this issue
Add the PyTorch Lightning memory optimizations, in particular the fused optimizer step, described in this tutorial: https://lightning.ai/pages/community/tutorial/faster-pytorch-training-by-reducing-peak-memory/
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries