Shared Completion Channel
yangx-jy opened this issue · comments
I saw the implementation of shared completion channel #1637.
I hope we can discuss the APIs for the shared completion channel here. How about this rough design?
struct rpma_conn_cfg {
......
bool shared_comp_channel;
};
/* set shared_comp_channel */
int rpma_conn_cfg_set_comp_channel(struct rpma_conn_cfg *cfg, bool shared_comp_channel);
/* get shared_comp_channel */
int rpma_conn_cfg_get_comp_channel(const struct rpma_conn_cfg *cfg, bool *shared_comp_channel);
/*
* Extend rpma_cq_new() to accept a completion channel argument
*/
int rpma_cq_new(struct ibv_context *ibv_ctx, int cqe, struct ibv_comp_channel *channel, struct rpma_cq **cq_ptr);
/* change the logic in rpma_conn_req_from_id() */
static int rpma_conn_req_from_id(struct rpma_peer *peer, struct rdma_cm_id *id, const struct rpma_conn_cfg *cfg, struct rpma_conn_req **req_ptr) {
	......
	bool shared_comp_channel;
	rpma_conn_cfg_get_comp_channel(cfg, &shared_comp_channel);
	/* the main CQ always gets a completion channel */
	struct ibv_comp_channel *channel = ibv_create_comp_channel(id->verbs);
	struct rpma_cq *cq = NULL;
	rpma_cq_new(id->verbs, cqe, channel, &cq);
	struct rpma_cq *rcq = NULL;
	if (rcqe) { /* a separate receive CQ was requested */
		if (shared_comp_channel) {
			/* the RCQ reuses the channel of the main CQ */
			rpma_cq_new(id->verbs, rcqe, channel, &rcq);
		} else {
			struct ibv_comp_channel *separate_channel = ibv_create_comp_channel(id->verbs);
			rpma_cq_new(id->verbs, rcqe, separate_channel, &rcq);
		}
	}
	......
}
/* I think we don't need to add new rpma_comp_channel_new(), rpma_comp_channel_destroy(), rpma_conn_get_compl_fd() and rpma_conn_wait() */
Hi @yangx-jy ,
Regarding rpma_conn_get_compl_fd() and rpma_conn_wait():
We would have to extend the existing API rpma_cq_wait(struct rpma_cq *cq) to report which CQ caused the event, so that rpma_cq_get_wc() is not called for the wrong CQ. This information can be attached via cq_context when ibv_create_cq() is called.
If we do not have a separate API for collecting the new completion events, it will be difficult to describe how to use it.
E.g. you call rpma_cq_wait() on rcq, but the function returns because cq generated an event.
The semantics of the new public API are clear: "wait for any completion that is related to a particular connection context".
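To illustrate why the caller needs to learn which CQ fired, here is a minimal, self-contained model of a shared channel. It is only a sketch: every `mock_*` name is hypothetical and none of this is librpma or libibverbs code. In real libibverbs, ibv_get_cq_event() hands back both the CQ and the cq_context given to ibv_create_cq(); the model below mimics that by returning the CQ pointer from the wait call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT librpma or libibverbs types. */
enum { MOCK_MAX_EVENTS = 8 };

struct mock_cq {
	const char *name; /* plays the role of cq_context */
};

struct mock_channel {
	struct mock_cq *pending[MOCK_MAX_EVENTS]; /* FIFO of CQs that raised an event */
	size_t head;
	size_t count;
};

/* Record that 'cq' raised a completion event on the shared channel. */
static int mock_channel_notify(struct mock_channel *ch, struct mock_cq *cq)
{
	assert(ch != NULL && cq != NULL);
	if (ch->count == MOCK_MAX_EVENTS)
		return -1; /* event queue is full */
	ch->pending[(ch->head + ch->count) % MOCK_MAX_EVENTS] = cq;
	ch->count++;
	return 0;
}

/*
 * Model of a connection-level wait: it reports WHICH CQ fired, so the
 * caller polls the right one. Returns -1 when nothing is pending
 * (a real implementation would block instead).
 */
static int mock_conn_wait(struct mock_channel *ch, struct mock_cq **cq_ptr)
{
	assert(ch != NULL && cq_ptr != NULL);
	if (ch->count == 0)
		return -1;
	*cq_ptr = ch->pending[ch->head];
	ch->head = (ch->head + 1) % MOCK_MAX_EVENTS;
	ch->count--;
	return 0;
}
```

With both a CQ and an RCQ registered on one channel, the caller waits once and then collects completions only from the CQ that the wait call returned.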
Hi @grom72
Is it necessary to keep both separate completion channels and a shared completion channel?
If it is, would it be possible to unify them under the same API for waiting for a completion event?
A separate completion events channel works in the context of one completion queue only.
Options to be supported:
a) only CQ (no RCQ) -> completion event channel per rpma_cq
b) CQ + RCQ
  b.1) completion event channel per rpma_cq - do we see any use case for that?
  b.2) completion event channel common for CQ and RCQ on the connection level -> that could also work for the "only CQ" case (if there is no shared event channel or no RCQ, use the CQ event channel)
If we drop case b.1), we can replace rpma_cq_wait() with rpma_conn_wait() and mark rpma_cq_wait() as deprecated (and remove it later), or we can keep it but not recommend it.
To support the transition we should add an API call: int rpma_cq_get_conn(struct rpma_cq *cq, struct rpma_conn **conn_ptr)
@ldorau what is your opinion? Is it OK to replace cq_wait with conn_wait?
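The options a), b.1) and b.2) listed above can be sketched as a small decision function. This is only an illustration of the cases under discussion: `channel_layout`, `choose_layout()` and `channels_needed()` are hypothetical names, not part of librpma.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the cases discussed above. */
enum channel_layout {
	LAYOUT_CQ_ONLY, /* a)   only CQ, one channel for that CQ */
	LAYOUT_PER_CQ,  /* b.1) CQ + RCQ, one channel per rpma_cq */
	LAYOUT_SHARED   /* b.2) CQ + RCQ, one channel for the connection */
};

/* Pick the layout from "is there an RCQ?" and the shared-channel flag. */
static enum channel_layout choose_layout(bool has_rcq, bool shared_comp_channel)
{
	if (!has_rcq)
		return LAYOUT_CQ_ONLY; /* the flag makes no difference without an RCQ */
	return shared_comp_channel ? LAYOUT_SHARED : LAYOUT_PER_CQ;
}

/* Number of completion event channels each layout needs. */
static int channels_needed(enum channel_layout layout)
{
	switch (layout) {
	case LAYOUT_PER_CQ:
		return 2;
	case LAYOUT_CQ_ONLY:
	case LAYOUT_SHARED:
		return 1;
	}
	assert(0 && "unknown layout");
	return -1;
}
```

Only b.1) needs two channels, which is why dropping it would allow a single connection-level wait call to cover every remaining case.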
If we think separate completion channels hurt performance, we can replace them with a shared completion channel.
I think providing several different APIs for these two choices makes things complex for users.
Hi @yangx-jy,
together with @ldorau, we are very close to the conclusion that the completion event channel will only be supported at the connection level (rpma_conn_wait()).
To decide whether to deprecate rpma_cq_wait(), we have to:
- measure that GPSPM works better with a shared completion channel, or at least as well as the existing software solution (your help is appreciated)
- confirm with existing and potential librpma users that they do not need a separate completion channel for RCQ
Hi @yangx-jy, yes, you are right - two very similar APIs may be confusing for users. We have just added a new API and when it proves to have better performance than the old one, we will deprecate and remove the old one.
Great. ^_^
Both API calls, rpma_conn_wait() and rpma_cq_wait(), will stay.