ofiwg / libfabric

Open Fabric Interfaces

Home Page: http://libfabric.org/

FI_MR_UNSPEC and fi_getinfo usage

hppritcha opened this issue

Several questions have come up in our software (Open MPI) about how to use the mr_mode bits in the fi_domain_attr struct supplied through the hints argument to fi_getinfo.

Currently we are setting the mr_mode field to FI_MR_ALLOCATED | FI_MR_PROV_KEY | FI_MR_VIRT_ADDR, but this doesn't work for HPE SS11 since it wants to see FI_MR_ENDPOINT set. There's been pushback on setting this bit because it might end up disqualifying providers that don't support it. I'm beginning to think that is not a correct assumption.

I then decided to set the mr_mode (we are using the fi_domain_attr field for other properties so I can't just set that to null in our hints argument) to FI_MR_UNSPEC based on this wording in the fi_domain man page:

      Buffers used in data transfer operations may require notifying the provider of their use before a data transfer can occur. The mr_mode field
      indicates the type of memory registration that is required, and when registration is necessary. Applications that require the use of a specific
      registration mode should set the domain attribute mr_mode to the necessary value when calling fi_getinfo. The value FI_MR_UNSPEC may be
      used to indicate support for any registration mode.

thinking that would just get me back mr_mode bits from the provider, but instead in the info arg returned by the call to fi_getinfo, all the mr_mode bits are cleared.

So the question is: should I just supply all the MR bits our software can handle, assuming that the provider will clear the bits that it doesn't use, or should I try to fix the behavior of the util provider (that's where this bit setting/clearing is handled, at least for the CXI provider) so that it comports with what the fi_domain man page appears to state?

Note the man page seems to contradict itself - at least as best I can tell - because just above the section cited above there's another statement:

    FI_MR_UNSPEC
              Defined for compatibility - library versions 1.4 and earlier. Setting mr_mode to 0 indicates that FI_MR_BASIC or FI_MR_SCALABLE are
              requested and supported.

Adding FI_MR_ENDPOINT to the mr_mode should not disqualify providers that don't need it. Such providers would clear the bit on return.
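The rule stated above can be modelled in a few lines of C. This is only a sketch of the semantics as described in this thread, not libfabric's actual implementation: the helper `ex_negotiate_mr_mode` and the `EX_MR_*` constants are hypothetical stand-ins, and real code would use the `FI_MR_*` constants from `<rdma/fi_domain.h>` and let `fi_getinfo` do the matching.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values for illustration only. Real code would include
 * <rdma/fi_domain.h> and use the FI_MR_* constants defined there. */
enum {
    EX_MR_VIRT_ADDR = 1u << 0,
    EX_MR_ALLOCATED = 1u << 1,
    EX_MR_PROV_KEY  = 1u << 2,
    EX_MR_ENDPOINT  = 1u << 3,
};

/* Hypothetical model of mode-bit matching:
 *   app_supported - bits the application says it can handle (the hints)
 *   prov_required - bits a given provider actually needs
 * Returns false when the provider needs a bit the application did not
 * offer, so the provider is filtered out. Otherwise stores in *result the
 * bits the application must honor: the offered bits with every bit the
 * provider does not need cleared. */
bool ex_negotiate_mr_mode(uint32_t app_supported, uint32_t prov_required,
                          uint32_t *result)
{
    if (prov_required & ~app_supported)
        return false;
    *result = app_supported & prov_required;
    return true;
}
```

Under this model, offering an extra bit such as the stand-in for FI_MR_ENDPOINT never disqualifies a provider that does not need it; that provider simply reports the bit as cleared. Leaving the bit out, on the other hand, filters out any provider that requires it.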

If I set this bit in mr_mode, might it cause a provider to go through a less efficient path than if I had not OR'd in FI_MR_ENDPOINT?

I'm the one causing Howard trouble here, so I'll jump in with some questions. Is the intended usage that the upper layer sets the MR bits for all the behaviors it supports, initializes the provider, and the provider returns only the bits that are absolutely required? Or is it like many of the capability bits, where the provider may fall into a slow path because the user requested a behavior?

Modes are requirements from the provider, not the application. An application setting a mode bit means the application can handle that requirement; it is not asking the provider to impose the requirement. The provider should clear the bits that it doesn't need. The application should always be prepared to handle the case where any of the mode bits is cleared by the provider. One example I want to point out here (and one that is often neglected) is FI_MR_VIRT_ADDR: the application should always check whether this bit is cleared and, if so, use an offset instead of a virtual address as the remote address in RMA operations. By the way, this happens to apply to CXI.
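The FI_MR_VIRT_ADDR check described above might look like the following sketch. The helper `ex_rma_remote_addr` and the constant `EX_VIRT_ADDR_BIT` are hypothetical names introduced for illustration; in a real application the mask would be FI_MR_VIRT_ADDR from `<rdma/fi_domain.h>`, tested against the mr_mode returned by fi_getinfo, and the result would be passed as the remote `addr` argument of an RMA call such as fi_write.

```c
#include <stdint.h>

/* Stand-in for FI_MR_VIRT_ADDR; the value is illustrative only. */
enum { EX_VIRT_ADDR_BIT = 1u << 0 };

/* Hypothetical helper: choose the remote address for an RMA operation that
 * targets byte `offset` inside a registered region whose base virtual
 * address on the target is `base`.
 *   bit set     -> the provider addresses remote memory by virtual address
 *   bit cleared -> the provider expects an offset from the region start */
uint64_t ex_rma_remote_addr(uint32_t returned_mr_mode, uint64_t base,
                            uint64_t offset)
{
    if (returned_mr_mode & EX_VIRT_ADDR_BIT)
        return base + offset;
    return offset;
}
```

Keeping this decision in one place means the rest of the RMA path does not need to know which addressing style the selected provider uses.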

ok, thanks. As long as we expect providers not to try to support FI_MR_ENDPOINT just because the app said it could support it (at the cost of perf), that makes sense to me.

Howard, I'm pretty sure we have a VIRT_ADDR problem with the BTL as well.

FI_MR_UNSPEC means that the app is coded to only what was defined for libfabric version 1.4 or earlier. So more complex mr_mode bits, like FI_MR_ENDPOINT, are not permitted by libfabric or the app.

The mr_mode bits behave similarly to the fi_info::mode bits, NOT the caps bits.

Thanks for the clarifications.