Add support for DeepSpeed
alaeddine-13 opened this issue
AlaeddineAbdessalem commented
Johannes Messner commented
DeepSpeed does not run on our GPU machine because the fused_adam
op cannot be compiled, either in JIT mode or in pre-compiled mode.
I tried several versions of DeepSpeed and several versions of PyTorch, with the same result. The only remaining variable I can think of at this point is the CUDA/NVVM version installed on our machine.
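To check the CUDA hypothesis, one option is to compare the CUDA release reported by the system nvcc with the CUDA version PyTorch was built against. A mismatch between the two is a common reason why CUDA extensions fail to compile, though nothing in this thread confirms it is the cause here. The following is a minimal sketch under these assumptions: nvcc is on PATH, and the helper names (parse_nvcc_release, same_major, system_cuda_version, torch_cuda_version) are hypothetical, not part of DeepSpeed or PyTorch.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the CUDA release (e.g. "11.3") found in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def same_major(version_a, version_b):
    """True only if both version strings are present and share a major version."""
    if not version_a or not version_b:
        return False
    return version_a.split(".")[0] == version_b.split(".")[0]


def system_cuda_version():
    """Ask the nvcc found on PATH for its release; None if nvcc is missing or fails."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    try:
        result = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return None
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """CUDA version PyTorch was built with; None if torch is absent or CPU-only."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    system = system_cuda_version()
    built = torch_cuda_version()
    print(f"system nvcc CUDA release: {system}")
    print(f"PyTorch built with CUDA:  {built}")
    if same_major(system, built):
        print("major versions match")
    else:
        print("major versions differ, or one of them could not be determined")
```

If the two major versions differ, that would support the CUDA explanation; if they match, the cause is likely elsewhere. DeepSpeed also ships a ds_report command that prints its own view of the environment and of which ops it considers compatible, which may be worth attaching to this issue.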
Since we can currently train on an A100 GPU without DeepSpeed, we are putting this issue on hold.