Permissions for the /dev/{kfd,dri/renderXXXX} devices in containers
elukey opened this issue · comments
Hi folks!
I am trying the AMD device plugin on my system, deployed as a systemd unit on Debian 11 (so not as a DaemonSet, but directly on the K8s node). Everything works fine and I am able to see two devices in my test container:
- /dev/kfd
- /dev/dri/renderD128
I am trying to run the container as an unprivileged user, like nobody, but I am struggling to assign the proper permissions to the above devices. In the container I see something like the following (tested via nsenter):
root@alexnet-tf-gpu-pod:/# ls -l /dev/kfd
crw-rw---- 1 root 106 242, 0 Apr 18 15:58 /dev/kfd
root@alexnet-tf-gpu-pod:/# ls -l /dev/dri/renderD128
crw-rw---- 1 root 106 226, 128 Apr 18 15:58 /dev/dri/renderD128
The gid 106 is the render group on the underlying "bare metal" K8s worker OS, which gets mapped into the test container, but this way I don't have a clear way to add nobody to render or similar (in the Docker image). Is there a best practice that you can suggest?
Thanks in advance!
In the securityContext for the pod, you can set supplementalGroups to add extra group IDs to the container's processes (here the render gid, 106). I found that this enabled me to use the hardware.
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.29/#podsecuritycontext-v1-core
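For reference, a minimal sketch of what that could look like for the pod in the question. The gid 106 is taken from the ls output above and will differ between nodes. The uid/gid 65534 for nobody, the container name, and the image are assumptions for illustration, and amd.com/gpu is assumed to be the resource name the device plugin registers on your cluster.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: alexnet-tf-gpu-pod
spec:
  securityContext:
    # Assumption: "nobody" is uid/gid 65534 in the image; check /etc/passwd.
    runAsUser: 65534
    runAsGroup: 65534
    # gid that owns /dev/kfd and /dev/dri/renderD128 on the node
    # (the host's render group, 106 in the ls output above).
    supplementalGroups:
      - 106
  containers:
    - name: alexnet-tf-gpu                        # hypothetical name
      image: example.org/alexnet-tf-gpu:latest    # hypothetical image
      resources:
        limits:
          amd.com/gpu: 1   # assumed resource name exposed by the device plugin
```

Since supplementalGroups takes numeric gids, the group should not need to exist in the image's /etc/group. Running id inside the container should then list 106 among the groups of the process.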