pmem / rpma

Remote Persistent Memory Access Library

examples: RPMA file size limitation

DanielLee343 opened this issue

Hi, I'm new to RPMA, and this problem has been troubling me for days.

In examples/03-read-to-persistent (this also applies to the other examples), when I start the client and server on the same node, the first test runs fine, and the log prints the file's content from the PM file system. However, when the client and server run on different nodes, the log always shows "New Value" followed by "Hello world!" or "¡Hola Mundo!". Also, is it expected that the file on the server side reads data from the client side and appends it to its own? The files on both sides were modified with the hello world string at the very beginning. Am I getting the example code wrong? I hope to hear from you. Thanks!

Hi @DanielLee343 ,
Can you attach the logs?
That will make it easier to follow what your example is doing.

Hi @grom72,
Running example 03 with a file of up to 64K works fine, but there seems to be a maximum limit on the file size when calling ibv_reg_mr():

[lyuze@master 03-read-to-persistent]$ ./server 192.168.0.13 1234 /mnt/pmemfs0/file_larger_than_64k
memory size is: 65537
is pmem? 1
Jun 05 18:44:52.741986 [748654] *NOTE*  ep.c: 101: rpma_ep_listen: Waiting for incoming connection on 192.168.0.13:1234
Jun 05 18:44:52.742071 [748654] *ERROR* peer.c: 182: rpma_peer_mr_reg: ibv_reg_mr() failed: Cannot allocate memory

Here 65537 is 1 byte larger than 64K (65536).
Is there a proper solution if I want to perform RPMA reads/writes when the file size is larger than 64K? Thanks!

Also, Device DAX doesn't work either. I get the same error:

[lyuze@master 03-read-to-persistent]$ ./server 192.168.0.13 1234 /dev/dax1.0 
memory size is: 199229440
is pmem? 1
Jun 06 21:30:29.497260 [3243689] *NOTE*  ep.c: 101: rpma_ep_listen: Waiting for incoming connection on 192.168.0.13:1234
Jun 06 21:30:29.497342 [3243689] *ERROR* peer.c: 182: rpma_peer_mr_reg: ibv_reg_mr() failed: Cannot allocate memory
[lyuze@master 03-read-to-persistent]$ ndctl list -N
[
  {
    "dev":"namespace1.0",
    "mode":"devdax",
    "map":"dev",
    "size":799063146496,
    "uuid":"aa3d1d27-8fde-4258-b345-9cc97d3b4f0c",
    "chardev":"dax1.0",
    "align":2097152
  },
  {
    "dev":"namespace0.0",
    "mode":"fsdax",
    "map":"dev",
    "size":799063146496,
    "uuid":"3cb9cda2-bf7e-4012-a0a2-1e5fa2c13f79",
    "sector_size":512,
    "align":2097152,
    "blockdev":"pmem0"
  }
]

I guess there's some issue with my own configuration?

@DanielLee343

Could you list all entries in /dev/infiniband/? Like this:

# ll /dev/infiniband/
total 0
crw-rw-rw-. 1 root root  10, 122 Jun 10 14:01 rdma_cm
crw-rw-rw-. 1 root root 231, 192 Jun 10 14:01 uverbs0

@yangx-jy
I got this:

[lyuze@master yuze]$ ll /dev/infiniband/
total 0
crw-rw-rw- 1 root root  10,  60 Jun  1 11:30 rdma_cm
crw-rw-rw- 1 root root 231, 192 Jun  1 10:23 uverbs0
crw-rw-rw- 1 root root 231, 193 Jun  1 10:23 uverbs1

Got it: we have to increase the locked-memory (memlock) limit. See item 17 in https://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages-user
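
For anyone hitting the same error, here is a minimal sketch of how to check the limit before starting the server. It assumes the failure comes from the per-process locked-memory limit (RLIMIT_MEMLOCK), which applies to ibv_reg_mr() for unprivileged users and defaults to 64 KiB on many distributions; that would match the 64K threshold seen above. The helper name memlock_fits is made up for illustration and is not part of RPMA.

```shell
#!/usr/bin/env bash
# Sketch: check whether a buffer of a given size fits under the current
# locked-memory limit (RLIMIT_MEMLOCK), as reported by `ulimit -l`.

# memlock_fits LIMIT_KIB SIZE_BYTES   (hypothetical helper, not an RPMA tool)
# Succeeds (exit status 0) if SIZE_BYTES fits within LIMIT_KIB, where
# LIMIT_KIB is the output of `ulimit -l`: a number of KiB or "unlimited".
memlock_fits() {
    local limit_kib=$1 size_bytes=$2
    if [ "$limit_kib" = "unlimited" ]; then
        return 0
    fi
    [ "$size_bytes" -le $(( limit_kib * 1024 )) ]
}

# Size in bytes of the file or device to register; defaults to the
# 65537-byte case from the log above.
size=${1:-65537}
limit=$(ulimit -l)

if memlock_fits "$limit" "$size"; then
    echo "OK: $size bytes fit within the memlock limit ($limit KiB)"
else
    echo "Too large: $size bytes exceed the memlock limit ($limit KiB)"
fi
```

If the check fails, the limit has to be raised for the user running the server. `ulimit -l unlimited` works in the current shell only if the hard limit allows it. Otherwise the usual approach, as far as I understand the FAQ entry above, is to add `* soft memlock unlimited` and `* hard memlock unlimited` to /etc/security/limits.conf and log in again.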