Scalable PaLM implementation of PyTorch
cainiaogoroad opened this issue a year ago · comments
Above is the program's run log. It reports torch.distributed.elastic.multiprocessing.errors.ChildFailedError. Does anybody know why this happens? Thanks!