Mellanox / k8s-rdma-shared-dev-plugin


rdma-shared-device-plugin device isolation

cairong-ai opened this issue

I have 4 InfiniBand (IB) network cards on my server, and I have shared one of them with the following configuration:

rdmaSharedDevicePlugin:
  deploy: true
  resources:
    - name: rdma_shared_device_a
      ifNames: [ibs1]

However, when I check inside the Pod that has requested the rdma_shared_device_a resource, I can still see all 4 IB network cards. It seems like device isolation is not being achieved. What should I do?
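For reference, the resource was requested with a Pod spec along these lines (a minimal sketch; the Pod and image names are illustrative, and the rdma/ prefix matches the plugin's default resource prefix seen in the logs below):

apiVersion: v1
kind: Pod
metadata:
  name: rdma-test-pod              # illustrative name
spec:
  containers:
    - name: app
      image: ubuntu:22.04          # any image with RDMA userspace tools
      command: ["sleep", "infinity"]
      resources:
        limits:
          rdma/rdma_shared_device_a: 1   # one unit of the shared RDMA resource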

Hi,

I can still see all 4 IB network cards
What do you mean? Can you add an example?

Generally, the rdma shared device plugin will mount the RDMA ULP devices into the container (under /dev/infiniband) for the specified device only.

the "mlx5_0" rdma devices under /sys/class/infiniband are visible within container but only the relevant device should have its upper-layer-protocol(ULP) char devices exposed to container.

$ kubectl logs -f rdma-shared-dp-ds-hp2z8 -n network-operator
Defaulted container "rdma-shared-dp" out of: rdma-shared-dp, ofed-driver-validation (init)
2024/01/07 06:01:53 Starting K8s RDMA Shared Device Plugin version= master
2024/01/07 06:01:53 resource manager reading configs
2024/01/07 06:01:53 Reading /k8s-rdma-shared-dev-plugin/config.json
Using Kubelet Plugin Registry Mode
2024/01/07 06:01:53 loaded config: [{ResourceName:rdma_shared_device_a ResourcePrefix: RdmaHcaMax:10 Devices:[] Selectors:{Vendors:[] DeviceIDs:[] Drivers:[] IfNames:[ibs1] LinkTypes:[]}} {ResourceName:rdma_shared_device_b ResourcePrefix: RdmaHcaMax:10 Devices:[] Selectors:{Vendors:[] DeviceIDs:[] Drivers:[] IfNames:[ibs8] LinkTypes:[]}}]
2024/01/07 06:01:53 no periodic update interval is set, use default interval 60 seconds
2024/01/07 06:01:53 Discovering host devices
2024/01/07 06:01:53 discovering host network devices
2024/01/07 06:01:53 DiscoverHostDevices(): device found: 0000:14:00.0	02          	Mellanox Technolo...	MT28908 Family [ConnectX-6]
2024/01/07 06:01:53 DiscoverHostDevices(): device found: 0000:30:00.0	02          	Mellanox Technolo...	MT28908 Family [ConnectX-6]
2024/01/07 06:01:53 DiscoverHostDevices(): device found: 0000:68:00.0	02          	Intel Corporation   	Ethernet Controller X710 for 10GbE SFP+
2024/01/07 06:01:53 DiscoverHostDevices(): device found: 0000:68:00.1	02          	Intel Corporation   	Ethernet Controller X710 for 10GbE SFP+
2024/01/07 06:01:53 DiscoverHostDevices(): device found: 0000:b9:00.0	02          	Mellanox Technolo...	MT28908 Family [ConnectX-6]
2024/01/07 06:01:53 DiscoverHostDevices(): device found: 0000:d2:00.0	02          	Mellanox Technolo...	MT28908 Family [ConnectX-6]
2024/01/07 06:01:53 Initializing resource servers
2024/01/07 06:01:53 Resource: &{ResourceName:rdma_shared_device_a ResourcePrefix:rdma RdmaHcaMax:10 Devices:[] Selectors:{Vendors:[] DeviceIDs:[] Drivers:[] IfNames:[ibs1] LinkTypes:[]}}
2024/01/07 06:01:53 error creating new device: "missing RDMA device spec for device 0000:68:00.0, RDMA device \"issm\" not found"
2024/01/07 06:01:53 error creating new device: "missing RDMA device spec for device 0000:68:00.1, RDMA device \"issm\" not found"
2024/01/07 06:01:54 Resource: &{ResourceName:rdma_shared_device_b ResourcePrefix:rdma RdmaHcaMax:10 Devices:[] Selectors:{Vendors:[] DeviceIDs:[] Drivers:[] IfNames:[ibs8] LinkTypes:[]}}
2024/01/07 06:01:54 error creating new device: "missing RDMA device spec for device 0000:68:00.0, RDMA device \"issm\" not found"
2024/01/07 06:01:54 error creating new device: "missing RDMA device spec for device 0000:68:00.1, RDMA device \"issm\" not found"
2024/01/07 06:01:54 Starting all servers...
2024/01/07 06:01:54 starting rdma/rdma_shared_device_a device plugin endpoint at: rdma_shared_device_a.sock
2024/01/07 06:01:54 rdma/rdma_shared_device_a device plugin endpoint started serving
2024/01/07 06:01:54 starting rdma/rdma_shared_device_b device plugin endpoint at: rdma_shared_device_b.sock
2024/01/07 06:01:54 rdma/rdma_shared_device_b device plugin endpoint started serving
2024/01/07 06:01:54 All servers started.
2024/01/07 06:01:54 Listening for term signals

The above is the log of the rdma-shared-dev-plugin Pod. rdma_shared_device_a specifies only one IB card, but in the Pod that was allocated the rdma_shared_device_a resource, all four IB cards can be seen under /sys/class/infiniband:

# ll /sys/class/infiniband
total 0
drwxr-xr-x  2 root root 0 Jan  7 07:36 ./
drwxr-xr-x 83 root root 0 Jan  7 07:36 ../
lrwxrwxrwx  1 root root 0 Jan  7 07:36 mlx5_0 -> ../../devices/pci0000:09/0000:09:02.0/0000:0a:00.0/0000:0b:08.0/0000:12:00.0/0000:13:00.0/0000:14:00.0/infiniband/mlx5_0/
lrwxrwxrwx  1 root root 0 Jan  7 07:36 mlx5_1 -> ../../devices/pci0000:1a/0000:1a:02.0/0000:1b:00.0/0000:1c:08.0/0000:2e:00.0/0000:2f:00.0/0000:30:00.0/infiniband/mlx5_1/
lrwxrwxrwx  1 root root 0 Jan  7 07:36 mlx5_2 -> ../../devices/pci0000:b0/0000:b0:02.0/0000:b1:00.0/0000:b2:04.0/0000:b7:00.0/0000:b8:10.0/0000:b9:00.0/infiniband/mlx5_2/
lrwxrwxrwx  1 root root 0 Jan  7 07:36 mlx5_3 -> ../../devices/pci0000:c9/0000:c9:02.0/0000:ca:00.0/0000:cb:04.0/0000:d0:00.0/0000:d1:10.0/0000:d2:00.0/infiniband/mlx5_3/

In the Pod, what do you see under /dev/infiniband/?

# ll /dev/infiniband/
total 0
drwxr-xr-x 2 root root      120 Jan  7 07:35 ./
drwxr-xr-x 6 root root      480 Jan  7 07:35 ../
crw------- 1 root root 231,  65 Jan  7 07:35 issm1
crw-rw-rw- 1 root root  10,  58 Jan  7 07:35 rdma_cm
crw------- 1 root root 231,   1 Jan  7 07:35 umad1
crw-rw-rw- 1 root root 231, 193 Jan  7 07:35 uverbs1

The content of /dev/infiniband in the Pod is shown above.

According to your feedback, the rdma shared device plugin behaves as expected.

The reason you see all mlx5_* devices under /sys/class/infiniband is that the kernel does not namespace them.
However, only one device is actually accessible from the container, since only the char devices belonging to the ibs1 device are mounted under /dev/infiniband.
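One way to confirm this from inside the Pod (a sketch; the uverbs1 node and the mlx5_* names are taken from the listings above, the selected device is assumed to be mlx5_1, and ibv_devinfo comes from rdma-core):

$ cat /sys/class/infiniband_verbs/uverbs1/ibdev    # maps the mounted uverbs node to its owning HCA, e.g. mlx5_1
$ ibv_devinfo -d mlx5_1                            # opening the selected device should succeed
$ ibv_devinfo -d mlx5_0                            # should fail to open: its uverbs node is not mounted in the Pod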

Thanks. Is there any way to make sure that a Pod can only see the allowed IB cards?
@adrianchiris