errors with DecentralizedAlgorithm in shift_one mode
ProHuper opened this issue
ProHuper commented
Rui Wang commented
`nranks` means the number of ranks in the NCCL communicator.
The decentralized algorithm enables hierarchical reduce by default, which means that only the inter-node communication is decentralized: an intra-node allreduce runs before it and an intra-node broadcast runs after it. To try it out on 8 GPUs, set `hierarchical=False`. See the API for details.
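To make the effect of the flag concrete, here is a minimal sketch that counts how many ranks take part in the decentralized exchange. It is an illustrative model only, not Bagua's implementation: the helper `decentralized_nranks` is hypothetical, and it assumes that in hierarchical mode exactly one participant per node joins the inter-node step.

```python
def decentralized_nranks(num_nodes: int, gpus_per_node: int, hierarchical: bool) -> int:
    """Return how many ranks join the decentralized exchange (illustrative model).

    Hypothetical helper, not part of Bagua. Assumes one participant per node
    in hierarchical mode and one participant per GPU otherwise.
    """
    if num_nodes < 1 or gpus_per_node < 1:
        raise ValueError("num_nodes and gpus_per_node must be positive")
    if hierarchical:
        # Intra-node allreduce first, then only one rank per node exchanges
        # with other nodes, then an intra-node broadcast.
        return num_nodes
    # Every GPU exchanges with a peer directly.
    return num_nodes * gpus_per_node


if __name__ == "__main__":
    # Single machine with 8 GPUs, as in the question above.
    for hierarchical in (True, False):
        n = decentralized_nranks(1, 8, hierarchical)
        print(f"hierarchical={hierarchical}: {n} rank(s) in the decentralized step")
```

Under this model, a single 8-GPU machine has only one rank in the decentralized step when hierarchical mode is on, so there is no peer to exchange with; with `hierarchical=False` all 8 ranks participate. In Bagua the flag would presumably be passed when constructing the algorithm, e.g. `DecentralizedAlgorithm(hierarchical=False, peer_selection_mode="shift_one")`; please check the exact argument names against the API documentation.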
ProHuper commented
got it!