mpiwg-coll / coll-issues

Repository for internal Collectives Working Group issues and discussions

Idea: RECV Functions for Persistent Collectives

trey-ornl opened this issue · comments

I propose the following idea for a future MPI standard: RECV functions for persistent collectives. The goal is to allow persistent collectives more of the asynchrony/overlap that is currently available to persistent and non-blocking point-to-point communication.

Here is a straw-man API. It adds the following functions alongside the existing MPI_START and MPI_WAIT:

MPI_START_RECV(request)
This function makes the receive buffer associated with the persistent collective request available to MPI. A matching MPI_START(request) must be called later in the application to make the send buffer available and to allow the collective to progress.

MPI_WAIT_RECV(request, status)
This function waits for MPI to finish filling the receive buffer associated with the persistent collective request. A matching MPI_WAIT(request, status) must be called later in the application to wait for the operation to release the send buffer and to clean up the request.

You can imagine related new functions MPI_STARTALL_RECV, MPI_WAITALL_RECV, MPI_TEST_RECV, etc.
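To make the intended call order concrete, here is a minimal C sketch that models one request as a small state machine. It performs no communication and is not a proposed implementation. The names `pc_request`, `pc_phase`, and the `pc_*` functions are hypothetical, and the rule that MPI_WAIT_RECV may only follow MPI_START is an assumption read from the descriptions above. The existing path that uses only MPI_START and MPI_WAIT is deliberately left out of the model.

```c
#include <stdbool.h>

/* Hypothetical phases of one persistent collective request under the
 * straw-man API. None of these names exist in MPI. */
typedef enum {
    PC_INACTIVE,     /* initialized, or previous round fully completed */
    PC_RECV_STARTED, /* after MPI_START_RECV: receive buffer given to MPI */
    PC_STARTED,      /* after MPI_START: send buffer given to MPI */
    PC_RECV_DONE     /* after MPI_WAIT_RECV: receive buffer is filled */
} pc_phase;

typedef struct {
    pc_phase phase;
} pc_request;

/* Move from one phase to the next; reject calls made out of order. */
static bool pc_advance(pc_request *r, pc_phase from, pc_phase to)
{
    if (r->phase != from) {
        return false;
    }
    r->phase = to;
    return true;
}

/* Models MPI_START_RECV(request). */
bool pc_start_recv(pc_request *r)
{
    return pc_advance(r, PC_INACTIVE, PC_RECV_STARTED);
}

/* Models the matching MPI_START(request). */
bool pc_start(pc_request *r)
{
    return pc_advance(r, PC_RECV_STARTED, PC_STARTED);
}

/* Models MPI_WAIT_RECV(request, status). Assumption: only valid after
 * MPI_START, since the collective cannot progress before that. */
bool pc_wait_recv(pc_request *r)
{
    return pc_advance(r, PC_STARTED, PC_RECV_DONE);
}

/* Models the matching MPI_WAIT(request, status): the send buffer is
 * released and the request becomes inactive, ready for another round. */
bool pc_wait(pc_request *r)
{
    return pc_advance(r, PC_RECV_DONE, PC_INACTIVE);
}
```

In this model each function returns `false` when called out of order, so a full round is `pc_start_recv`, `pc_start`, `pc_wait_recv`, `pc_wait`, after which the request can be started again.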

Consider the following software pattern that uses non-blocking point-to-point communication.

  • Make MPI_Irecv calls, AKA "prepost receives".
  • Pack send buffers.
  • Make MPI_Isend calls.
  • If possible, perform independent work.
  • MPI_Waitall on receive requests.
  • Unpack or otherwise use receive buffers.
  • MPI_Waitall on send requests.

Existing persistent collectives do not support the same opportunities for asynchrony.

  • Missing asynchrony: no way to make the receive buffer available to MPI before the send buffer is packed.
  • Pack send buffers.
  • MPI_Start.
  • If possible, perform independent work.
  • MPI_Wait.
  • Unpack or otherwise use receive buffers.
  • Missing asynchrony: MPI_Wait above also waits for the send buffer, so the receive buffer cannot be used any earlier.

The proposed new functions fill in the gaps for persistent collectives.

  • MPI_Start_recv.
  • Pack send buffers.
  • MPI_Start.
  • If possible, perform independent work.
  • MPI_Wait_recv.
  • Unpack or otherwise use receive buffers.
  • MPI_Wait.

Please forgive me if the MPI Forum has already investigated similar ideas.

This idea is separate from but complementary to #4.