2decomp-fft / 2decomp-fft

The 2DECOMP&FFT library provides access to slab and pencil decompositions as well as FFTs.

Home Page: https://2decomp-fft.github.io

2DECOMP&FFT

This README contains basic instructions for building and installing the 2DECOMP&FFT library. More detailed instructions on installation and on linking to the library from an external project can be found in the install section. Please see HOWTO.md and the examples for how to use the library with your application.

Building

The build system is driven by CMake. It is good practice to point directly to the MPI Fortran wrapper that you would like to use, to guarantee consistency between the Fortran compiler and MPI. This can be done by setting the FC environment variable, which CMake reads to select the default Fortran compiler:

$ export FC=my_mpif90

The build system can then be generated by running

$ cmake -S $path_to_sources -B $path_to_build_directory -DOPTION1 -DOPTION2 ...

For many users, a configuration line

$ cmake -S . -B build

run from the 2DECOMP&FFT root directory will be sufficient. If the build directory does not exist it will be created, and it will contain the configuration files. By default a RELEASE build will be built for CPU using MPI and the generic FFT backend included with 2DECOMP&FFT. Please see INSTALL.md for instructions on changing the build, including debugging builds, building for GPUs and selecting external FFT libraries.

Once the build system has been configured, you can build 2DECOMP&FFT by running

$ cmake --build $path_to_build_directory -j <nproc>

Appending -v will display additional information about the build, such as compiler flags.

After building, the library can be tested. Please see the section Testing and examples.

Finally, the built library can be installed by running

$ cmake --install $path_to_build_directory

The default location for libdecomp2d.a is $path_to_build_directory/opt/lib or $path_to_build_directory/opt/lib64 unless the variable CMAKE_INSTALL_PREFIX is modified. The module files generated by the build process will similarly be installed to $path_to_build_directory/opt/include. Users of the library should add this directory to the include paths for their program.
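Because the library directory may be either lib or lib64 depending on the system, a linking script may need to check both. The following is a minimal sketch of such a check; the function name find_decomp2d_lib is hypothetical and not part of 2DECOMP&FFT, and build/opt assumes the default build directory name used earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of 2DECOMP&FFT): print the path of
# libdecomp2d.a under an install prefix, checking lib and then lib64.
find_decomp2d_lib() {
    prefix=$1
    for dir in "$prefix/lib" "$prefix/lib64"; do
        if [ -f "$dir/libdecomp2d.a" ]; then
            echo "$dir/libdecomp2d.a"
            return 0
        fi
    done
    return 1
}

# Assumed default install prefix for a build directory named "build"
find_decomp2d_lib build/opt || echo "libdecomp2d.a not found under build/opt" >&2
```

If CMAKE_INSTALL_PREFIX was changed at configure time, pass that prefix to the function instead.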

Occasionally a clean build is required. This can be performed by running

$ cmake --build $path_to_build_directory --target clean

GPU compilation

The library can perform multi-GPU offloading using the NVHPC compiler suite for NVIDIA hardware. The implementation is based on CUDA-aware MPI and the NVIDIA Collective Communication Library (NCCL). The FFT is based on cuFFT.

For details of how to configure 2DECOMP&FFT for GPU offload, see the GPU compilation section in INSTALL.md.

Testing and examples

By default, building of the tests is deactivated. To activate testing, add the option -DBUILD_TESTING=ON at the configure stage or, alternatively, activate the option in the ccmake interface. After building, the library can be tested by running

$ ctest --test-dir $path_to_build_directory

which uses the ctest utility. By default tests are performed in serial, but more than 1 rank can be used by setting MPIEXEC_MAX_NUMPROCS in the ccmake utility. It is also possible to specify the decomposition by setting the PROW and PCOL parameters at the configure stage or using ccmake. During the configure stage users should ensure that the number of MPI tasks MPIEXEC_MAX_NUMPROCS is equal to the product of PROW and PCOL. The mesh resolution can also be imposed using the parameters NX, NY and NZ.
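The constraint on MPIEXEC_MAX_NUMPROCS, PROW and PCOL can be checked before configuring. The sketch below assumes that these parameters are passed to CMake as -D options in the same way as BUILD_TESTING; the function name check_decomposition is hypothetical. It prints the configure command rather than running it, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper (not part of 2DECOMP&FFT): verify that
# PROW * PCOL equals the number of MPI ranks, then print a configure line.
check_decomposition() {
    nproc=$1
    prow=$2
    pcol=$3
    if [ "$((prow * pcol))" -ne "$nproc" ]; then
        echo "error: PROW*PCOL = $((prow * pcol)), expected $nproc" >&2
        return 1
    fi
    # Assumption: the parameters are accepted as -D options at configure time
    echo "cmake -S . -B build -DBUILD_TESTING=ON -DMPIEXEC_MAX_NUMPROCS=$nproc -DPROW=$prow -DPCOL=$pcol"
}

# Example: 4 MPI ranks arranged as a 2 x 2 decomposition
check_decomposition 4 2 2
```

A mismatched request such as check_decomposition 4 3 2 returns a non-zero status and prints an error instead of a command.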

For the GPU implementation please be aware that it is based on a single MPI rank per GPU. Therefore, to test multiple GPUs, use the maximum number of available GPUs on the system/node and not the maximum number of MPI tasks.

Profiling

The 2DECOMP&FFT library has integrated profiling support via external libraries. See the Profiling section of INSTALL.md for instructions on configuring a profiling build. Currently, support for profiling is provided by the Caliper library.

When profiling is active, it can be tuned before calling decomp_2d_init using the subroutine decomp_profiler_prep. The input argument for this subroutine is a logical array of size 4. Each element activates or deactivates profiling as follows:

  1. Profile transpose operations (default : true)
  2. Profile IO operations (default : true)
  3. Profile FFT operations (default : true)
  4. Profile decomp_2d init / fin subroutines (default : true)

FFT backends

The library provides a built-in FFT engine and supports various FFT backends: FFTW, Intel oneMKL and Nvidia cuFFT. The FFT engine selected during compilation is available through the variable D2D_FFT_BACKEND defined in the module decomp_2d_fft. The expected value is defined by the integer constants

integer, parameter, public :: D2D_FFT_BACKEND_GENERIC = 0   ! Built-in engine
integer, parameter, public :: D2D_FFT_BACKEND_FFTW3 = 1     ! FFTW
integer, parameter, public :: D2D_FFT_BACKEND_FFTW3_F03 = 2 ! FFTW (Fortran 2003)
integer, parameter, public :: D2D_FFT_BACKEND_MKL = 3       ! Intel oneMKL
integer, parameter, public :: D2D_FFT_BACKEND_CUFFT = 4     ! Nvidia cuFFT

exported by the module decomp_2d_constants. The external code can use these named constants to check the FFT backend used in a given build.

OVERWRITE flag

  • The generic backend supports the OVERWRITE flag but cannot perform in-place transforms
  • The FFTW3 and FFTW3_F03 backends support the OVERWRITE flag and can perform in-place complex 1D FFTs
  • The oneMKL backend supports the OVERWRITE flag and can perform in-place complex 1D FFTs
  • The cuFFT backend supports the OVERWRITE flag and can perform in-place complex 1D FFTs

Miscellaneous

Print the log to a file or to stdout

Before calling decomp_2d_init, the external code can modify the variable decomp_log to change the output for the log. The expected value is defined by the integer constants

integer, parameter, public :: D2D_LOG_QUIET = 0       ! No logging output
integer, parameter, public :: D2D_LOG_STDOUT = 1      ! Root rank logs output to stdout
integer, parameter, public :: D2D_LOG_TOFILE = 2      ! Root rank logs output to the file "decomp_2d_setup.log"
integer, parameter, public :: D2D_LOG_TOFILE_FULL = 3 ! All ranks log output to a dedicated file

exported by the decomp_2d_constants module. Although their values are shown here, users should not rely on them and should use the named constants D2D_LOG_QUIET, etc. instead. The default value is D2D_LOG_TOFILE for the default build and D2D_LOG_TOFILE_FULL for a debug build.

Change the debug level for debug builds

Before calling decomp_2d_init, the external code can modify the variable decomp_debug to change the debug level. The user can also modify this variable using the environment variable DECOMP_2D_DEBUG. Please note that the environment variable is processed only for debug builds. The expected value for the variable decomp_debug is an integer between 0 and 6 inclusive.
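When the level is set through the environment, it can be convenient to validate it in the launch script. The following is a minimal sketch; the function name set_decomp_debug is hypothetical, and only the variable name DECOMP_2D_DEBUG and the range 0 to 6 come from the description above.

```shell
#!/bin/sh
# Hypothetical helper (not part of 2DECOMP&FFT): export DECOMP_2D_DEBUG
# only if the requested level is a single digit from 0 to 6.
set_decomp_debug() {
    case "$1" in
        [0-6])
            export DECOMP_2D_DEBUG="$1"
            ;;
        *)
            echo "error: debug level must be an integer from 0 to 6" >&2
            return 1
            ;;
    esac
}

# Example: request debug level 3 before launching a debug build
set_decomp_debug 3
echo "DECOMP_2D_DEBUG=$DECOMP_2D_DEBUG"
```

The exported variable has no effect on a default (non-debug) build, since the library only reads it in debug builds.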

Code formatting

The code is formatted using the fprettify program (available via pip). To ensure consistent use, the script scripts/format.sh runs fprettify across the 2DECOMP&FFT source; you can also use the format build target to run the script. Please format the code before making a pull request.

Versioning

The development of 2DECOMP&FFT occurs on GitHub, with release versions on the main branch. New features will be implemented on the dev branch and merged into main once a new release is ready. For example, starting from v2.0.0 the main branch will only be updated to receive fixes, giving v2.0.1, etc., until the next release (either v2.1.0 or v3.0.0, depending on the magnitude of the change) is ready.

Contributing

If you would like to contribute to the development of the 2DECOMP&FFT library or report a bug, please refer to the Contributing section.

License:BSD 3-Clause "New" or "Revised" License
