Unsupported data type for HCCL process group
iiacobac opened this issue · comments
Please apply the same fix as was done for NCCL for the kBool limitation.
Follow this update:
pytorch/pytorch@366c014#diff-43cb0f438d3eb35dec0a1680ddc2d01c3ae9277d91aca4c2119d0b9ea80adeb6
Hello, we do not currently support this; it may be added to the requirements plan later.
Hello, this is Ignacio.
The fix has been applied since my comment.
You can compare https://github.com/Ascend/pytorch/blob/v2.0.3/pytorch1.8.1/src/torch/lib/c10d/ProcessGroupHCCL.cpp,
where only these data types are supported:
{at::kChar, HCCL_DATA_TYPE_INT8},
{at::kFloat, HCCL_DATA_TYPE_FP32},
{at::kInt, HCCL_DATA_TYPE_INT32},
{at::kHalf, HCCL_DATA_TYPE_FP16},
{at::kShort, HCCL_DATA_TYPE_INT16},
{at::kLong, HCCL_DATA_TYPE_INT64},
with https://github.com/Ascend/pytorch/blob/master/torch_npu/csrc/distributed/ProcessGroupHCCL.cpp,
where kBool, among others, is included:
{at::kByte, HCCL_DATA_TYPE_UINT8},
{at::kChar, HCCL_DATA_TYPE_INT8},
{at::kShort, HCCL_DATA_TYPE_INT16},
{at::kInt, HCCL_DATA_TYPE_INT32},
{at::kLong, HCCL_DATA_TYPE_INT64},
{at::kHalf, HCCL_DATA_TYPE_FP16},
{at::kFloat, HCCL_DATA_TYPE_FP32},
{at::kDouble, HCCL_DATA_TYPE_FP64},
{at::kBool, HCCL_DATA_TYPE_UINT8},
{at::kBFloat16, HCCL_DATA_TYPE_BFP16},
BF16 is not supported in the 1.8.1 version, but it is supported on the current master branch.