ofiwg / libfabric

Open Fabric Interfaces

Home Page: http://libfabric.org/

Libfabric 1.20 ABI compatibility issue

shijin-aws opened this issue

Libfabric 1.20.0 includes a PR that bumps the ABI version to 1.7 (#9440).

I find that if I compile Open MPI against Libfabric 1.20 (--with-libfabric=...) but run it with Libfabric 1.19 loaded at runtime via LD_LIBRARY_PATH, I get this error:

/fsx/PortaFiducia/build/workloads/imb/openmpi-v5.0.1/source/mpi-benchmarks-IMB-v2021.7/IMB-RMA: /fsx/PortaFiducia/build/libraries/libfabric/main/install/libfabric/lib/libfabric.so.1: version `FABRIC_1.7' not found (required by /opt/amazon/openmpi5/lib64/libmpi.so.40)

Is this expected? I thought Libfabric 1.x releases should always be ABI compatible.
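
When triaging a report like this, it can help to pull the library path, the missing version node, and the requiring object out of the loader message. The following is a minimal sketch, assuming the message has the same shape as the one quoted above; the function name and the shortened paths in the sample are hypothetical and chosen only for illustration.

```python
import re

# Shape of the loader message quoted above:
#   <program>: <library>: version `<node>' not found (required by <object>)
_PATTERN = re.compile(
    r"^(?P<program>.+?): (?P<library>.+?): "
    r"version `(?P<version>[^']+)' not found "
    r"\(required by (?P<required_by>.+)\)$"
)


def parse_version_error(message):
    """Return the fields of a 'version not found' error as a dict, or None."""
    match = _PATTERN.match(message.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical, shortened paths; only the message shape follows the report.
    sample = (
        "./IMB-RMA: /opt/libfabric/lib/libfabric.so.1: "
        "version `FABRIC_1.7' not found "
        "(required by /opt/openmpi/lib64/libmpi.so.40)"
    )
    for field, value in parse_version_error(sample).items():
        print(f"{field}: {value}")
```

The `library` field shows which libfabric.so.1 was actually loaded, and `required_by` shows which object was linked against the newer ABI (here libmpi.so.40, not the benchmark itself).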

That's expected. ABI compatibility only ensures that binaries built with an older library still work with a newer library, not the other way around.

Some technical details: the global symbols (e.g. fi_getinfo) are versioned. A symbol's version changes whenever there is an ABI change to that function. A few symbols in libfabric 1.20 were bumped to ABI 1.7, and fi_getinfo() is one of them. The actual symbol name of the function carries a version suffix (e.g. @FABRIC_1.7). An application linked with libfabric 1.20 will have fi_getinfo@FABRIC_1.7 in its symbol table, and this symbol cannot be resolved by an older libfabric library. On the other hand, an application linked with libfabric 1.19 refers to fi_getinfo@FABRIC_1.3, which can still be found in libfabric 1.20.
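
The resolution rule described above can be modeled in a few lines. This is a simplified sketch, not how the dynamic loader is implemented: the export tables list only the two fi_getinfo entries named in the explanation, and the table and function names are hypothetical.

```python
# Simplified export tables. Only the fi_getinfo entries named in the
# explanation are listed; the real libraries export many more symbols.
EXPORTS = {
    "1.19": {("fi_getinfo", "FABRIC_1.3")},
    "1.20": {("fi_getinfo", "FABRIC_1.3"), ("fi_getinfo", "FABRIC_1.7")},
}

# The versioned reference recorded in an application at link time,
# keyed by the libfabric version it was linked with.
LINKED_REFERENCE = {
    "1.19": ("fi_getinfo", "FABRIC_1.3"),
    "1.20": ("fi_getinfo", "FABRIC_1.7"),
}


def resolves(build_version, runtime_version):
    """True if the reference recorded at build time exists in the runtime library."""
    return LINKED_REFERENCE[build_version] in EXPORTS[runtime_version]


if __name__ == "__main__":
    for build in ("1.19", "1.20"):
        for runtime in ("1.19", "1.20"):
            name, node = LINKED_REFERENCE[build]
            status = "resolved" if resolves(build, runtime) else "NOT FOUND"
            print(f"built with {build}, run with {runtime}: {name}@{node} {status}")
```

Only the combination "built with 1.20, run with 1.19" fails in this model, which matches the error in the report.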

ABI versions 1.4 through 1.6 didn't affect the commonly used symbols (fi_getinfo, fi_dupinfo, fi_freeinfo), which may have given the false impression that compatibility applies both ways.

@j-xiong I see, thanks for the explanation. So to stay ABI compatible, is it better practice to compile the application (OMPI) against a pinned older libfabric version and only update the runtime library?
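
Following the rule stated earlier in the thread (binaries built with an older library keep working with a newer one), the pinning idea can be sketched as "build against the oldest version you need to run with". This is a conservative sketch that assumes all versions are within the same 1.x series; the helper names are hypothetical, and the check is sufficient rather than necessary, since a newer build may still load on an older runtime if none of the symbols it uses were bumped.

```python
def parse_version(text):
    """Turn '1.19.1' into (1, 19, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def runtime_ok(build_version, runtime_version):
    """Conservative check: the runtime library is at least as new as the build one."""
    return parse_version(runtime_version) >= parse_version(build_version)


def pick_build_version(runtime_versions):
    """Pick the oldest version among the runtimes the binary must support."""
    return min(runtime_versions, key=parse_version)


if __name__ == "__main__":
    targets = ["1.20", "1.19"]  # runtimes mentioned in this thread
    pinned = pick_build_version(targets)
    print("build against:", pinned)
    for runtime in targets:
        print(f"  run with {runtime}: ok={runtime_ok(pinned, runtime)}")
```

Comparing parsed tuples rather than raw strings matters here, because a plain string comparison would order "1.9" after "1.20".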

Closing as expected behavior.