spectralDNS / shenfun

High performance computational platform in Python for the spectral Galerkin method

Home Page: http://shenfun.readthedocs.org

An error related to the MPI communicators

maniset opened this issue · comments

Hello,

Thanks for this great library.

I have a question. I have created a solver with shenfun inside a function that is called from another code in an iterative process. It works for a limited number of iterations, but when the number of iterations grows, shenfun raises the following error.

mpi4py/MPI/Comm.pyx in mpi4py.MPI.Cartcomm.Sub()

Exception: Other MPI error, error stack:
PMPI_Cart_sub(213)..................: MPI_Cart_sub(comm=0x84000000, remain_dims=0x7f1aedab1320, comm_new=0x7f1aed8c7da0) failed
PMPI_Cart_sub(152)..................: 
MPIR_Comm_split_impl(253)...........: 
MPIR_Get_contextid_sparse_group(602): Too many communicators (0/2048 free on this process; ignore_id=0)
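
Roughly, my code has the following structure (a simplified sketch with illustrative names and sizes, not the actual solver):

```python
from mpi4py import MPI
from shenfun import FunctionSpace, TensorProductSpace

def run_solver(N=32):
    # A new TensorProductSpace is built on every call; internally this
    # creates Cartesian sub-communicators that are never freed here.
    K0 = FunctionSpace(N, family='Fourier', dtype='D')
    K1 = FunctionSpace(N, family='Fourier', dtype='d')
    T = TensorProductSpace(MPI.COMM_WORLD, (K0, K1))
    # ... assemble and solve the actual problem on T ...

for step in range(10000):   # eventually exhausts the available communicators
    run_solver()
```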

I think I need to somehow release MPI resources or free the MPI communicators after each iteration (each time the solver is used). Is there a way to solve this problem?

Sincerely

Hi,

You are right. If you create a TensorProductSpace in an iterative process, then you need to be careful with garbage collection. This is probably not well documented, but the class has a destroy method that you can call at the end of each iteration, and that should take care of cleaning up. The method is part of mpi4py-fft, see here. It is used, for example, in the tests here.
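
For illustration, the sketch from the question could be adapted roughly like this; the names and sizes are still illustrative, and the destroy call is the essential part:

```python
from mpi4py import MPI
from shenfun import FunctionSpace, TensorProductSpace

def run_solver(N=32):
    K0 = FunctionSpace(N, family='Fourier', dtype='D')
    K1 = FunctionSpace(N, family='Fourier', dtype='d')
    T = TensorProductSpace(MPI.COMM_WORLD, (K0, K1))
    try:
        pass  # ... assemble and solve the actual problem on T ...
    finally:
        T.destroy()  # free the sub-communicators created for T

for step in range(10000):   # the communicator count now stays bounded
    run_solver()
```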

Thanks a lot. That works.