prov/rxm: segfault in fi_rdm_stress server at prov/rxm/src/rxm_msg.c:314
ldorau opened this issue
Describe the bug
The server of fi_rdm_stress segfaults at prov/rxm/src/rxm_msg.c:314:
https://github.com/ofiwg/libfabric/blob/main/prov/rxm/src/rxm_msg.c#L314
rxm_mr_msg_mr[i] = ((struct rxm_mr *) desc[i])->msg_mr;
for i == 0, because desc[i] == 0x0.
To Reproduce
Steps to reproduce the behavior:
- Start the server:
$ ./fi_rdm_stress -p verbs -s 192.168.1.4
- Start the client:
$ ./fi_rdm_stress -p verbs -u ../test_configs/rdm_stress/stress.json 192.168.1.4
Expected behavior
The server of fi_rdm_stress does not segfault, but runs correctly.
Output
$ cgdb --args ./fi_rdm_stress -p verbs -s 192.168.1.4
[...]
Thread 3 "fi_rdm_stress" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeae96700 (LWP 145200)]
0x00007ffff7adc13f in rxm_alloc_rndv_buf (rxm_ep=0x6c09a0, rxm_conn=0x7fff5c0045b8, context=0x7ffff7ef0010, count=1 '\001', iov=0x7fffeae95dc0, desc=0x7fffeae95d80, data_len=1000032, data=0,
flags=16777216, tag=0, op=0 '\000', iface=FI_HMEM_SYSTEM, device=0, rndv_buf=0x7fffeae95d10) at prov/rxm/src/rxm_msg.c:314
(gdb) p i
$1 = 0
(gdb) p desc[i]
$2 = (void *) 0x0
Environment:
provider: verbs
MR local: MSG - 1, RxM - 1
Completions per progress: MSG - 1
Buffered min: 88
Min multi recv size: 16384
inject size: 256
Protocol limits: Eager: 16384, SAR: 131072
Debugging information
1) mr_mode contains FI_MR_LOCAL:
https://github.com/pmem/libfabric/blob/main/fabtests/functional/rdm_stress.c#L1253
opts.mr_mode = ... | FI_MR_LOCAL | ... ;
2) so rxm_ep->rdm_mr_local is true:
https://github.com/pmem/libfabric/blob/main/prov/rxm/src/rxm_ep.c#L1235
rxm_ep->rdm_mr_local = ofi_mr_local(rxm_ep->rxm_info);
3) but desc[0] == NULL, because fi_send() is called with desc == NULL in handle_hello():
https://github.com/ofiwg/libfabric/blob/main/fabtests/functional/rdm_stress.c#L1006
ret = fi_send(ep, &resp->hdr, sizeof(resp->hdr), NULL, addr, resp);
This causes a segfault (NULL pointer dereference) at https://github.com/pmem/libfabric/blob/main/prov/rxm/src/rxm_msg.c#L305-L314
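For reference, when FI_MR_LOCAL is set the application owns local registration: fi_send() must be given a descriptor obtained from a registered MR rather than NULL. A minimal sketch of what handle_hello() would have to do (the names domain, ep, addr and resp are taken from the surrounding test context and are assumptions here, not a tested patch):

```c
/* Sketch only: with FI_MR_LOCAL in mr_mode, the sender must register
 * its buffer and pass the resulting descriptor to fi_send().
 * 'domain', 'ep', 'addr' and 'resp' are assumed to exist in the caller. */
struct fid_mr *mr;
void *desc;
int ret;

ret = fi_mr_reg(domain, &resp->hdr, sizeof(resp->hdr), FI_SEND,
		0, 0, 0, &mr, NULL);
if (ret)
	return ret;

desc = fi_mr_desc(mr);
ret = fi_send(ep, &resp->hdr, sizeof(resp->hdr), desc, addr, resp);

/* ... the MR must stay valid until the send completes, then: */
fi_close(&mr->fid);
```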
NOTICE
Removing FI_MR_LOCAL from mr_mode at:
https://github.com/pmem/libfabric/blob/main/fabtests/functional/rdm_stress.c#L1253
makes this bug disappear; only an assertion failure occurs in the client:
fi_rdm_stress: prov/util/src/util_mem_monitor.c:160: ofi_monitor_cleanup: Assertion `dlist_empty(&monitor->list)' failed.
Additional information
libfabric:145723:1663043419:ofi_rxm:verbs:core:ofi_check_ep_type():667<info> unsupported endpoint type
libfabric:145723:1663043419:ofi_rxm:verbs:core:ofi_check_ep_type():668<info> Supported: FI_EP_DGRAM
libfabric:145723:1663043419:ofi_rxm:verbs:core:ofi_check_ep_type():668<info> Requested: FI_EP_MSG
libfabric:145723:1663043419:ofi_rxm:core:core:ofi_layering_ok():1027<info> Provider ofi_rxm is excluded
libfabric:145723:1663043419:ofi_rxm:core:core:ofi_layering_ok():1038<info> Need core provider, skipping ofi_rxd
libfabric:145723:1663043419:ofi_rxm:core:core:ofi_layering_ok():1038<info> Need core provider, skipping ofi_mrail
libfabric:145723:1663043419::ofi_rxm:core:fi_param_get_():279<info> variable sar_limit=<not set>
libfabric:145723:1663043419::ofi_rxm:core:rxm_ep_settings_init():1270<info> Settings:
MR local: MSG - 1, RxM - 1
Completions per progress: MSG - 1
Buffered min: 88
Min multi recv size: 16384
inject size: 256
Protocol limits: Eager: 16384, SAR: 131072
libfabric:145723:1663043419::verbs:ep_ctrl:vrb_pep_listen():525<info> listening on: fi_sockaddr_in://192.168.1.4:9228
Hi @shefty, could you give a hint how this should be fixed? For example, is FI_MR_LOCAL required in mr_mode at:
https://github.com/pmem/libfabric/blob/main/fabtests/functional/rdm_stress.c#L1253
AFAIK, msg_mr can be created only by the .regv == vrb_mr_regv or .regattr == vrb_mr_regattr hooks, but neither of them is called in the rdm_stress test, so the usage of FI_MR_LOCAL seems suspicious to me (or this test just lacks one of these calls).
I missed that I was tagged on this way back when.
The rdm_stress test is not coded to handle FI_MR_LOCAL correctly. At least one missing piece is in start_rpc(). After the resp buffer is allocated, the resp data needs to be registered if FI_MR_LOCAL is specified. The struct rpc_resp already has an mr field for this purpose, which is closed in complete_rpc().
I'd consider a set of changes along these lines:
static uint64_t rpc_resp_reg_flags[cmd_last] = {
	0,
	0,
	FI_SEND,
	0,
	FI_SEND,
	FI_READ,
	FI_WRITE,
};

static void start_rpc(...)
{
	...
	resp = calloc(...);
	/* register resp only when the provider requires local MRs
	 * and the command actually transfers resp data */
	if (need FI_MR_LOCAL && rpc_resp_reg_flags[req->cmd])
		fi_mr_reg(...);
	...
}
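A slightly more concrete version of that sketch could look as follows. This is a hypothetical helper, not a tested patch; the names fi, domain, resp and req are assumed to match those used in fabtests/functional/rdm_stress.c, and the access-flag lookup reuses the rpc_resp_reg_flags table above:

```c
/* Hypothetical helper for start_rpc(): register the resp buffer when
 * the provider demands local MRs. 'fi' (fabric info) and 'domain' are
 * assumed globals of the test; 'resp->mr' is the existing field that
 * complete_rpc() already closes. */
static int reg_rpc_resp(struct rpc_resp *resp, size_t len, uint64_t access)
{
	/* nothing to do without FI_MR_LOCAL, or for commands that
	 * do not transfer resp data */
	if (!(fi->domain_attr->mr_mode & FI_MR_LOCAL) || !access)
		return 0;

	return fi_mr_reg(domain, resp, len, access, 0, 0, 0,
			 &resp->mr, NULL);
}
```

start_rpc() would then call reg_rpc_resp(resp, resp_len, rpc_resp_reg_flags[req->cmd]) right after the calloc, and pass fi_mr_desc(resp->mr) as the desc argument of the subsequent data transfer call.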