ofiwg / libfabric

Open Fabric Interfaces

Home Page: http://libfabric.org/

EFA fabric is aligned with NIC...should align with domain name.

wckzhang opened this issue

In Open MPI, a check, 0 == strcmp(provider_info->fabric_attr->name, provider->fabric_attr->name), was added to the multi-NIC selection code to identify info objects for the same provider/fabric. This interferes with our (EFA) naming: we tied fabric_attr->name to our individual NICs, so we fail this check. I think we (EFA) are doing the wrong thing here by tying fabrics to devices, but @shefty, can you help clarify what provider behavior is expected here?

The fabric name should be related to the network itself, not the local NIC. The local NIC is better aligned with the domain name, assuming a 1:1 mapping between the domain and a NIC. In other providers, the fabric name maps to the IP network address (e.g. "192.168.0.0/16") or subnet (e.g. network GID prefix).
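To make the difference concrete, here is a minimal sketch of how the comparison quoted above behaves under the two naming schemes. It does not use the real libfabric headers: the sketch_* structs are hypothetical stand-ins for the fabric_attr->name and domain_attr->name fields, and every name string ("nic0", "efa", "nic0-rdm", and so on) is made up for illustration rather than taken from the EFA provider.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the libfabric fields discussed in this
 * thread; these are NOT the real libfabric definitions. */
struct sketch_fabric_attr { const char *name; };
struct sketch_domain_attr { const char *name; };
struct sketch_info {
    const struct sketch_fabric_attr *fabric_attr;
    const struct sketch_domain_attr *domain_attr;
};

/* Same comparison as the check quoted above: two info objects are
 * treated as the same fabric when their fabric names are equal. */
static bool same_fabric(const struct sketch_info *a,
                        const struct sketch_info *b)
{
    assert(a && b && a->fabric_attr && b->fabric_attr);
    return 0 == strcmp(a->fabric_attr->name, b->fabric_attr->name);
}

/* Scheme 1 (the problem): fabric name tied to each NIC.
 * All name strings here are illustrative assumptions. */
static const struct sketch_fabric_attr fab_nic0 = { "nic0" };
static const struct sketch_fabric_attr fab_nic1 = { "nic1" };
static const struct sketch_domain_attr dom_nic0 = { "nic0-rdm" };
static const struct sketch_domain_attr dom_nic1 = { "nic1-rdm" };
static const struct sketch_info per_nic_a = { &fab_nic0, &dom_nic0 };
static const struct sketch_info per_nic_b = { &fab_nic1, &dom_nic1 };

/* Scheme 2 (the suggestion): one fabric name for the network,
 * with the NIC identified by the domain name instead. */
static const struct sketch_fabric_attr fab_net = { "efa" };
static const struct sketch_info shared_a = { &fab_net, &dom_nic0 };
static const struct sketch_info shared_b = { &fab_net, &dom_nic1 };
```

With per-NIC fabric names, same_fabric(&per_nic_a, &per_nic_b) is false even though both NICs sit on the same network. With a shared fabric name it is true, and the two NICs remain distinguishable through their domain names.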

#7673 aims to address the issue. I'll create backports for 1.14.x and 1.15.x as well.