Device-Level Balance Loss and Communication Balance Loss
hsm1997 opened this issue
What is the main difference between the two losses?
As I read your paper, P_i' == P_i'', and f_i' = some_coeff * f_i''.
Maybe f_i'' should be:
... (Token t is sent to Device i from Device j where j != i)
Or maybe the authors already meant this by using the word "sent"...
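To make the question concrete, here is a minimal sketch of three per-device counts that the f' and f'' terms could be built from. It is not taken from the paper or its code. It assumes that f' is based on (token, expert) selections landing on a device, that f'' is based on tokens sent to a device (each token counted once per device), and that the proposed variant additionally skips tokens that already live on that device. The function name `per_device_counts`, the `token_home` list, and the toy routing table are all hypothetical, and the normalizing coefficients are left out on purpose.

```python
def per_device_counts(token_experts, token_home, expert_device, num_devices):
    """Count, per device, three quantities for one batch of routed tokens.

    token_experts: list of expert-id lists, one per token (hypothetical routing result)
    token_home:    device id each token resides on before dispatch (assumption)
    expert_device: expert id -> device id hosting that expert
    """
    selections = [0] * num_devices   # (token, expert) picks landing on the device
    sent = [0] * num_devices         # tokens with at least one pick on the device
    sent_remote = [0] * num_devices  # same as `sent`, but only when j != i

    for experts, home in zip(token_experts, token_home):
        for e in experts:
            selections[expert_device[e]] += 1
        # A token is "sent" to a device once, however many experts it picks there.
        for d in {expert_device[e] for e in experts}:
            sent[d] += 1
            if d != home:
                sent_remote[d] += 1
    return selections, sent, sent_remote


# Toy setup: 4 experts on 2 devices, 3 tokens each selecting 2 experts.
expert_device = [0, 0, 1, 1]
token_experts = [[0, 1], [0, 2], [1, 3]]
token_home = [0, 1, 1]

selections, sent, sent_remote = per_device_counts(
    token_experts, token_home, expert_device, num_devices=2
)
print(selections, sent, sent_remote)  # [4, 2] [3, 2] [2, 0]
```

In this toy input the selection counts (4 vs 2) and the sent counts (3 vs 2) are not proportional, because a token that picks two experts on the same device is counted twice in the first and once in the second. Under these assumptions f' and f'' would differ by more than a constant coefficient even without the j != i condition; whether that matches the authors' intent is exactly what I am asking.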