TODO: Fix pipeline parallelism bugs
hyunwoongko opened this issue
Kevin Ko commented
Describe a TODO feature
- Currently, when pipeline parallelism is run on a large model, the computed gradient values come out different. This needs to be fixed.
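
For anyone reproducing this kind of mismatch, a generic way to check it is to compare gradients from a plain full-batch run against gradients accumulated over micro-batches, which is how a pipeline schedule feeds the stages. The sketch below is only an illustration in pure Python on a toy two-stage scalar model. The issue does not say which framework code or which root cause was involved, so every name here (`grads`, `pipelined_grads`, `grads_match`, the weights and the data) is hypothetical.

```python
import math

def grads(xs, ts, w1, w2, denom):
    """Gradients of sum((w2 * w1 * x - t)^2) / denom with respect to (w1, w2)."""
    g1 = g2 = 0.0
    for x, t in zip(xs, ts):
        h = w1 * x                   # stage 1 forward
        y = w2 * h                   # stage 2 forward
        dy = 2.0 * (y - t) / denom   # stage 2 backward: d(loss)/dy
        g2 += dy * h                 # gradient for the stage 2 weight
        g1 += dy * w2 * x            # stage 1 backward: gradient for the stage 1 weight
    return g1, g2

def pipelined_grads(xs, ts, w1, w2, n_micro):
    """Accumulate gradients over micro-batches, normalizing by the full batch size."""
    size = math.ceil(len(xs) / n_micro)
    g1 = g2 = 0.0
    for i in range(0, len(xs), size):
        a, b = grads(xs[i:i + size], ts[i:i + size], w1, w2, denom=len(xs))
        g1 += a
        g2 += b
    return g1, g2

def grads_match(ref, test, rel_tol=1e-9, abs_tol=1e-12):
    """True if every gradient agrees within floating-point tolerance."""
    return all(math.isclose(r, t, rel_tol=rel_tol, abs_tol=abs_tol)
               for r, t in zip(ref, test))

# Toy data and weights, chosen arbitrarily for the illustration.
xs = [0.5, -1.0, 2.0, 3.0, -0.25, 1.5]
ts = [1.0, 0.0, -2.0, 4.0, 0.5, 1.0]
ref = grads(xs, ts, 0.7, -1.3, denom=len(xs))
pipe = pipelined_grads(xs, ts, 0.7, -1.3, n_micro=3)
print(grads_match(ref, pipe))
```

The comparison uses a tolerance rather than exact equality because accumulating over micro-batches changes the order of floating-point additions. A real check would run the same comparison per parameter tensor between the non-parallel and the pipeline-parallel model.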
Kevin Ko commented
Fixed