E3SM-Project / E3SM

Energy Exascale Earth System Model source code. NOTE: use "maint" branches for your work. Head of master is not validated.

Home Page:https://docs.e3sm.org/E3SM

Current impl of FindNETCDF.cmake picks up the system installation before the one pointed to by the netcdf cmake vars

bartgol opened this issue

The issue is that in cmake's find_XYZ routines, the entries passed via PATHS are scanned after system paths. There are two solutions.

  1. Use HINTS instead of PATHS: these entries are scanned before the system ones, so it would work. The drawback is that the CMake documentation says of HINTS: "These should be paths computed by system introspection, such as a hint provided by the location of another item already found. Hard-coded guesses should be specified with the PATHS option".
  2. Use the NO_CMAKE_SYSTEM_PATH option (possibly with some other NO_XYZ option, such as NO_SYSTEM_ENVIRONMENT_PATH): this categorically excludes general system paths (like /usr/) from the search. The drawback is that we can no longer rely on a system installation, even if a machine has a system version of a tpl that works for us. That said, I don't think this is likely to be the case on production machines.
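The two options can be sketched as follows. This is a minimal illustration and not the actual contents of FindNETCDF.cmake: the variable `NetCDF_C_PATH`, its default value, and the result variable names are assumptions made for the example. Option 2 is written with the documented `NO_CMAKE_SYSTEM_PATH` and `NO_SYSTEM_ENVIRONMENT_PATH` keywords of `find_library`.

```cmake
# Hypothetical variable holding the user's preferred install prefix;
# the real FindNETCDF.cmake may use different names.
set(NetCDF_C_PATH "/opt/netcdf-c" CACHE PATH "Preferred NetCDF-C install prefix")

# Option 1: HINTS entries are searched before the system locations,
# and /usr is still used as a fallback if nothing is found there.
find_library(NETCDF_C_LIB_VIA_HINTS
  NAMES netcdf
  HINTS ${NetCDF_C_PATH}
  PATH_SUFFIXES lib lib64)

# Option 2: keep PATHS, but skip the CMake system paths and the
# system environment paths, so /usr is never considered.
find_library(NETCDF_C_LIB_NO_SYSTEM
  NAMES netcdf
  PATHS ${NetCDF_C_PATH}
  PATH_SUFFIXES lib lib64
  NO_CMAKE_SYSTEM_PATH
  NO_SYSTEM_ENVIRONMENT_PATH)
```

With option 1, a missing or empty `NetCDF_C_PATH` simply falls through to the default search, whereas option 2 would report the library as not found.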

Imho, option 1 is better. Although we would not be following CMake's advice of using PATHS for hard-coded guesses, we achieve the desired result and don't rule out system libraries. I also think the fact that PATHS is scanned after system paths is a huge flaw of CMake: if someone takes the effort of giving CMake a guess, that guess should have priority.

I am just trying to learn: are you referring to the *Net*_PATH variables that are set in config_machines? I believe you are also assuming default cmake behavior for now, as in the system's cmake/FindNetCDF.cmake scripts (which in general depend on the cmake version?). But some e3sm components (using "find") also use their own FindNetCDF scripts; do those obey the same order of preference? Overall I am confused why the system install is a problem. By that I mean we usually load io modules, and if someone does not want them "seen", they don't have to load them. Are there really machines with io libraries in /usr/lib etc?

Overall I am confused why the system install is a problem. By that I mean we usually load io modules, and if someone does not want them "seen", they don't have to load them.

By "system" paths, I don't mean modules. Modules are dynamically loaded in the environment, but are not part of the "standard" system paths. In particular, CMake doesn't know about their existence (which is why we need to pass hints to cmake about their location), while it definitely knows about /usr/include and /usr/lib, and it always scans those folders. As it happens, it scans them before any path provided via PATHS.
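As a rough sketch of what "passing hints to cmake" about a module can look like: many modules export an environment variable with their install prefix, and that value can be forwarded to a find call. The variable name `NETCDF_DIR` is an assumption here (it differs by site and module), and this is not taken from the E3SM build system.

```cmake
# Assumption: the loaded module exports NETCDF_DIR pointing at its install prefix.
if(DEFINED ENV{NETCDF_DIR})
  # Forward the module location as a hint so it is tried before /usr/include.
  find_path(NETCDF_INCLUDE_DIR
    NAMES netcdf.h
    HINTS $ENV{NETCDF_DIR}
    PATH_SUFFIXES include)
else()
  # Without the module loaded, only the default search locations are scanned.
  find_path(NETCDF_INCLUDE_DIR NAMES netcdf.h)
endif()

message(STATUS "netcdf.h found in: ${NETCDF_INCLUDE_DIR}")
```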

are there really machines with io libraries in /usr/lib etc?

On most production machines, there is no netcdf installed in the system paths. However, on workstations (like mappy), it doesn't take much to end up with one. E.g., installing something like octave, which may seem innocuous and may be needed by some scientist, will automatically pull a netcdf installation into /usr. Other packages (which someone may install at some point) may bring in a blas installation, or a yaml-cpp installation. Any of these system libs may be built with a compiler that is not compatible with the one used to build E3SM (e.g., there may be a GLIBC incompatibility). For instance, I run e3sm testing on my workstation, using user-defined config_machines entries that I put in ~/.cime. For fast development, this is quite important to me: I don't need to go to an e3sm-supported machine to see if the code compiles correctly in a CIME case, or if the correct options end up being used. My workstation is the perfect solution (alas, it's ~8y old and starting to have some sciatica, so it may have to retire soon).

I believe that you also assume default cmake behavior for now as in system's cmake/FindNetCDF.cmake scripts (which in general depend on cmake versions?). But also some of e3sm components (using "find") use their own FindNetCDF scripts, do those obey the same order of preference?

I am actually talking about the FindNETCDF.cmake file in components/cmake/modules, which we ship ourselves. And yes, all find_XYZ commands behave the same way, meaning that the order in which paths are scanned to find XYZ is the same for all those cmake utilities.
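One way to check the scan order on a given machine is CMake's `CMAKE_FIND_DEBUG_MODE` variable (available in CMake 3.17 and later), which makes find calls print the locations they consider. The fragment below is a sketch meant to be dropped into a CMakeLists.txt after `project()`; the path `/opt/netcdf-c/lib` and the result variable name are placeholders, not E3SM settings.

```cmake
# Requires CMake >= 3.17. Print the directories considered by the find call below.
set(CMAKE_FIND_DEBUG_MODE TRUE)

# Placeholder guess passed via PATHS, to see where it lands in the search order.
find_library(NETCDF_C_LIB_CHECK
  NAMES netcdf
  PATHS "/opt/netcdf-c/lib")

# Turn the extra output off again for the rest of the configure step.
set(CMAKE_FIND_DEBUG_MODE FALSE)

message(STATUS "netcdf resolved to: ${NETCDF_C_LIB_CHECK}")
```

Because the result is cached, remove `NETCDF_C_LIB_CHECK` from the cache (or use a fresh build directory) before rerunning, otherwise the search is skipped.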

In short, I think it will pay off to ensure that our paths are scanned before system paths, even if on most production machines this is a non-issue. It's cleaner, safer, and avoids headaches if/when someone builds e3sm on a workstation or another shared machine with some scientific libraries in the standard system paths.

@vijaysm I wonder if this explains the build issues on Perlmutter.

This is E3SM-specific. What I encounter on Perlmutter is that simply building TempestRemap with autotools against the pre-built HDF5/NetCDF modules leads to runtime symbol resolution errors.