pmem / rpma

Remote Persistent Memory Access Library

Shared Completion Channel

yangx-jy opened this issue

@grom72 @ldorau

I saw the implementation of the shared completion channel in #1637.
I would like to discuss the shared completion channel APIs here. What do you think of this rough design?

struct rpma_conn_cfg {
    ......
    bool shared_comp_channel; 
};

/* set shared_comp_channel */
int rpma_conn_cfg_set_comp_channel(struct rpma_conn_cfg *cfg, bool shared_comp_channel);
/* get shared_comp_channel */
int rpma_conn_cfg_get_comp_channel(const struct rpma_conn_cfg *cfg, bool *shared_comp_channel);

/*
 * Extend rpma_cq_new() to accept a completion channel argument
 */
int rpma_cq_new(struct ibv_context *ibv_ctx, int cqe, struct ibv_comp_channel *channel, struct rpma_cq **cq_ptr);

/* change the logic in rpma_conn_req_from_id() */
static int rpma_conn_req_from_id(struct rpma_peer *peer, struct rdma_cm_id *id, const struct rpma_conn_cfg *cfg, struct rpma_conn_req **req_ptr) {
    ......
    bool shared_comp_channel;
    rpma_conn_cfg_get_comp_channel(cfg, &shared_comp_channel);

    /* the main CQ always gets a completion channel of its own */
    struct ibv_comp_channel *channel = ibv_create_comp_channel(id->verbs);
    struct rpma_cq *cq = NULL;
    rpma_cq_new(id->verbs, cqe, channel, &cq);

    /* create the receive CQ only if it was requested (rcqe > 0) */
    struct rpma_cq *rcq = NULL;
    if (rcqe) {
        if (shared_comp_channel) {
            /* the RCQ reuses the channel of the main CQ */
            rpma_cq_new(id->verbs, rcqe, channel, &rcq);
        } else {
            struct ibv_comp_channel *separate_channel = ibv_create_comp_channel(id->verbs);
            rpma_cq_new(id->verbs, rcqe, separate_channel, &rcq);
        }
    }
    ......
}

/* I think we don't need to add new rpma_comp_channel_new(), rpma_comp_channel_destroy(), rpma_conn_get_compl_fd() and rpma_conn_wait() */

Hi @yangx-jy,
regarding rpma_conn_get_compl_fd() and rpma_conn_wait():
We have to extend the existing API rpma_cq_wait(struct rpma_cq *cq) so that it reports which CQ caused the event. Otherwise rpma_cq_get_wc() may be called for the wrong CQ. This information can be attached via cq_context when ibv_create_cq() is called.

If we do not have a separate API for collecting the new completion events, it will be difficult to describe how to use it.
E.g. you call rpma_cq_wait() on the RCQ, but the function returns because the CQ generated an event.

The semantics of the new public API are clear: "wait for any completion related to a particular connection context".
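
To illustrate the point about reporting which CQ fired, here is a minimal, self-contained model of a channel shared by two CQs. It does not use libibverbs or librpma: `mock_cq`, `mock_channel`, `mock_channel_attach()` and `mock_conn_wait()` are hypothetical names invented for this sketch. In real code the equivalent information comes from ibv_get_cq_event(), which returns the CQ and the cq_context registered in ibv_create_cq().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the librpma or libibverbs types. */
struct mock_cq {
	const char *name; /* "cq" or "rcq", for illustration only */
	int pending;      /* number of undelivered completion events */
};

#define MOCK_CHANNEL_MAX_CQS 2

struct mock_channel {
	struct mock_cq *cqs[MOCK_CHANNEL_MAX_CQS]; /* CQs attached to the channel */
	size_t ncqs;
};

/* Attach a CQ to the channel; returns 0 on success, -1 if the channel is full. */
static int mock_channel_attach(struct mock_channel *ch, struct mock_cq *cq)
{
	assert(ch != NULL && cq != NULL);
	if (ch->ncqs == MOCK_CHANNEL_MAX_CQS)
		return -1;
	ch->cqs[ch->ncqs++] = cq;
	return 0;
}

/*
 * Connection-level wait: hand back the CQ that produced the event via
 * cq_ptr so the caller polls the right queue. Returns 0 on success or
 * -1 when no event is pending (a real implementation would block instead).
 */
static int mock_conn_wait(struct mock_channel *ch, struct mock_cq **cq_ptr)
{
	assert(ch != NULL && cq_ptr != NULL);
	for (size_t i = 0; i < ch->ncqs; i++) {
		if (ch->cqs[i]->pending > 0) {
			ch->cqs[i]->pending--; /* consume one event */
			*cq_ptr = ch->cqs[i];
			return 0;
		}
	}
	*cq_ptr = NULL;
	return -1;
}
```

With both queues attached and an event pending only on the RCQ, `mock_conn_wait()` returns the RCQ, so the caller knows which queue to read completions from instead of guessing.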

Hi @grom72

Is it necessary to keep both separate completion channels and a shared completion channel?

If it is necessary, is it possible to provide a single API for waiting for completion events?

A separate completion event channel works in the context of one completion queue only.
Options to be supported:
a) only CQ (no RCQ) -> completion event channel per rpma_cq
b) CQ + RCQ

  1. completion event channel per rpma_cq - do we see any use case for that?
  2. completion event channel common for CQ and RCQ on the connection level -> that could also work for the "only CQ" case (if there is no shared event channel or no RCQ, use the CQ event channel)

If we drop case b.1), we can replace rpma_cq_wait() with rpma_conn_wait() and mark rpma_cq_wait() as deprecated (and remove it later), or we can keep it but not recommend it.

To support the transition we should add an API call: int rpma_cq_get_conn(struct rpma_cq *cq, struct rpma_conn **conn_ptr)
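
The fallback rule from option b.2) (use the shared channel only when an RCQ exists and sharing is enabled, otherwise the CQ's own channel) could be sketched as below. This is only an illustration of the rule as stated in this comment. `wait_channel` and `select_wait_channel()` are hypothetical names and not part of librpma.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not part of librpma. */
enum wait_channel {
	WAIT_CHANNEL_CQ,    /* the event channel owned by the main CQ */
	WAIT_CHANNEL_SHARED /* one channel serving both CQ and RCQ */
};

/*
 * Pick the channel a connection-level wait would block on:
 * the shared channel needs both an RCQ and the shared option,
 * every other combination falls back to the CQ event channel.
 */
static enum wait_channel
select_wait_channel(bool has_rcq, bool shared_comp_channel)
{
	if (has_rcq && shared_comp_channel)
		return WAIT_CHANNEL_SHARED;
	return WAIT_CHANNEL_CQ;
}
```

Under this rule a connection-level wait also covers case a), because a connection without an RCQ simply waits on the CQ event channel.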

@ldorau what is your opinion? Is it OK to replace rpma_cq_wait() with rpma_conn_wait()?

@ldorau

If we think separate completion channels hurt performance, we can replace them with a shared completion channel.
I think providing many different APIs for the two options makes them complex for users.

Hi @yangx-jy,
together with @ldorau, we are very close to the conclusion that the completion event channel will be supported only on the connection level (rpma_conn_wait()).
Before deciding to deprecate rpma_cq_wait() we have to:

  • measure whether GPSPM works better with a shared completion channel, or at least as well as the existing software solution (your help is appreciated)
  • confirm with existing and potential librpma users that they do not need a separate completion channel for RCQ

Hi @yangx-jy, yes, you are right - two very similar APIs may be confusing for users. We have just added a new API and when it proves to have better performance than the old one, we will deprecate and remove the old one.

Great. ^_^

Both API calls rpma_conn_wait() and rpma_cq_wait() will stay.