GPU direct
tks2004 opened this issue
If we need to enable GPU direct, is there an FI_* environment variable that must be set to use that feature?
Applications request GPU direct capability from Libfabric by setting the FI_HMEM capability bit in the hints passed to fi_getinfo, as the plugin does here:
aws-ofi-nccl/src/nccl_ofi_net.c
Line 374 in e704fd9
Before Libfabric 1.18, the Libfabric EFA provider also required an environment variable, FI_EFA_USE_DEVICE_RDMA=1, to enable GPU direct. For Libfabric 1.18+ and aws-ofi-nccl 1.7.0+, this is no longer required. See also https://github.com/aws/aws-ofi-nccl/blob/master/doc/efa-env-var.md, which is mostly relevant to the EFA provider.
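For a launcher that has to support both older and newer stacks, the version rule above could be sketched roughly as follows. This is only an illustration: `version_older_than`, `needs_device_rdma_env`, and `configure_gpu_direct_env` are hypothetical helper names, not part of Libfabric or aws-ofi-nccl, and the caller is assumed to already know the installed version numbers. Treating "Libfabric 1.18+ with a plugin older than 1.7.0" as still needing the variable is a conservative assumption, not something stated above.

```c
#define _POSIX_C_SOURCE 200112L /* for setenv() */
#include <stdlib.h>

/* Returns 1 if version (major, minor) is older than (ref_major, ref_minor). */
int version_older_than(int major, int minor, int ref_major, int ref_minor)
{
    if (major != ref_major)
        return major < ref_major;
    return minor < ref_minor;
}

/*
 * Hypothetical helper: returns 1 if FI_EFA_USE_DEVICE_RDMA=1 should be set.
 * Assumption: the variable is only unnecessary when BOTH Libfabric >= 1.18
 * and aws-ofi-nccl >= 1.7 are in use.
 */
int needs_device_rdma_env(int fab_major, int fab_minor,
                          int plugin_major, int plugin_minor)
{
    return version_older_than(fab_major, fab_minor, 1, 18) ||
           version_older_than(plugin_major, plugin_minor, 1, 7);
}

/*
 * Sets FI_EFA_USE_DEVICE_RDMA=1 when the versions call for it.
 * Returns 0 on success (including "nothing to do"), -1 if setenv() fails.
 */
int configure_gpu_direct_env(int fab_major, int fab_minor,
                             int plugin_major, int plugin_minor)
{
    if (!needs_device_rdma_env(fab_major, fab_minor,
                               plugin_major, plugin_minor))
        return 0;

    /* overwrite = 0 keeps any value the user has already exported. */
    return setenv("FI_EFA_USE_DEVICE_RDMA", "1", 0);
}
```

A call such as `configure_gpu_direct_env(1, 17, 1, 6)` would presumably need to run before anything in the process initializes Libfabric, since environment variables are typically read at provider initialization.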