ofiwg / libfabric

Open Fabric Interfaces

Home Page: http://libfabric.org/

prov/rxm: fi_addr returned from fi_av_insert is always incrementing

shefty opened this issue

The following output is from the server side of fi_rdm_stress. There were 2 client runs.

Starting rpc stress server
ofi_ibuf_alloc 0x7fcafc000c58 0
(0) start rpc hello
ofi_ibuf_alloc 0x7fcaf8000c80 0
(0) complete rpc hello (Success)
(0) start rpc write
(0) complete rpc write (Resource temporarily unavailable)
(0) unreachable, removing
ofi_ibuf_free 0x7fcafc000c58 0
ofi_ibuf_free (new free head) 0x7fcafc000c58 0
ofi_ibuf_alloc 0x7fcafc000c58 0
(0) start rpc hello
ofi_ibuf_alloc 0x7fcaf8000d10 1
(1) complete rpc hello (Success)
(1) start rpc write
(1) complete rpc write (Resource temporarily unavailable)
(1) unreachable, removing
ofi_ibuf_free 0x7fcafc000c58 0
ofi_ibuf_free (new free head) 0x7fcafc000c58 0

The number in parentheses, (0) or (1), is the address that was assigned to the client. The first client was assigned address 0, as expected. After that client exited, the server removed it as a result of a failed send. The second client that connected then received address 1, when 0 was expected.

Prints were added to the bufpool. There were 2 allocations, but only 1 entry was freed. I don't know the scope of this problem. It looks like a possible memory leak, but I am still investigating why the entry wasn't freed and whether accessing it could cause other issues.
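
To illustrate why a missed free would show up as an incrementing address, here is a minimal sketch of an indexed pool with a free list. This is a toy model, not libfabric's bufpool code: `toy_pool`, `toy_pool_alloc`, `toy_pool_free`, and `second_client_addr` are hypothetical names, and the assumption is only that a freed index goes back to the head of the free list and is handed out again on the next allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define POOL_SIZE 4
#define NO_INDEX ((size_t)-1)

/* Hypothetical toy index pool; NOT libfabric's actual bufpool. */
struct toy_pool {
    size_t next_free[POOL_SIZE]; /* free-list link for each index */
    size_t free_head;            /* first free index, or NO_INDEX */
};

static void toy_pool_init(struct toy_pool *p)
{
    /* Chain all indices together: 0 -> 1 -> 2 -> ... -> NO_INDEX */
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->next_free[i] = (i + 1 < POOL_SIZE) ? i + 1 : NO_INDEX;
    p->free_head = 0;
}

/* Pop the head of the free list; returns NO_INDEX if the pool is empty. */
static size_t toy_pool_alloc(struct toy_pool *p)
{
    size_t idx = p->free_head;
    if (idx != NO_INDEX)
        p->free_head = p->next_free[idx];
    return idx;
}

/* Push an index back so it becomes the new free head. */
static void toy_pool_free(struct toy_pool *p, size_t idx)
{
    assert(idx < POOL_SIZE);
    p->next_free[idx] = p->free_head;
    p->free_head = idx;
}

/*
 * Simulate two clients connecting one after the other.
 * If free_first is nonzero, the first client's entry is released
 * before the second client connects; otherwise it is leaked.
 * Returns the index handed to the second client.
 */
static size_t second_client_addr(int free_first)
{
    struct toy_pool pool;
    toy_pool_init(&pool);

    size_t first = toy_pool_alloc(&pool);
    printf("(%zu) first client\n", first);

    if (free_first)
        toy_pool_free(&pool, first);

    size_t second = toy_pool_alloc(&pool);
    printf("(%zu) second client\n", second);
    return second;
}
```

In this model, `second_client_addr(1)` hands index 0 to the second client, while `second_client_addr(0)` hands it index 1, which matches the pattern in the log above where one allocation was never released.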

Added fix to PR #7752