Add support for DeepSpeed


alaeddine-13 opened this issue a year ago

AlaeddineAbdessalem commented a year ago

Johannes Messner · Answer 1 · Wed Aug 02 2023 16:34:23 GMT+0800 (China Standard Time)

DeepSpeed fails to run on our GPU machine because the fused_adam op cannot be compiled, either in JIT mode or in pre-compiled mode.
I tried various versions of DeepSpeed and of PyTorch without success. The only remaining variable I can think of at this point is the CUDA/NVVM version installed on our machine.

Since we can currently train on an A100 GPU without needing DeepSpeed, we are putting this issue on hold.
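
For anyone picking this up later, one way to test the CUDA hypothesis above might be to compare the CUDA version PyTorch was built against with the toolkit that `nvcc` reports, since a mismatch between the two is a common reason for extension builds to fail. The script below is only a minimal sketch: the helper names (`package_version`, `torch_cuda_version`, `nvcc_version`, `collect_env_report`) are made up for illustration, and it assumes `nvcc` is on `PATH` if a CUDA toolkit is installed. It has not been run on the machine discussed here.

```python
import shutil
import subprocess
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def torch_cuda_version():
    """Return the CUDA version PyTorch was built against, or None.

    None is returned both when PyTorch is missing and when it is a CPU-only build.
    """
    try:
        import torch
    except Exception:
        return None
    return torch.version.cuda


def nvcc_version():
    """Return the raw output of `nvcc --version`, or None if nvcc is unavailable."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    try:
        result = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


def collect_env_report():
    """Gather the versions that matter when compiling DeepSpeed ops."""
    return {
        "torch": package_version("torch"),
        "deepspeed": package_version("deepspeed"),
        "torch_cuda": torch_cuda_version(),
        "nvcc": nvcc_version(),
    }


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

If `torch_cuda` and the release reported by `nvcc` disagree, that would support the suspicion about the installed toolkit. DeepSpeed also ships a `ds_report` command that prints op compatibility, which may be worth running alongside this.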