ROCm / ROCm

AMD ROCm™ Software - GitHub Home

Home Page: https://rocm.docs.amd.com

[Feature]: CDI generator for a universal container support

Scapal opened this issue

Suggestion Description

To make it easier to expose AMD GPUs in containers, ROCm should embrace the Container Device Interface (CDI).

It should be quite straightforward to implement, as it mostly amounts to listing the device nodes in a YAML file and, where needed, adding hooks.

cdiVersion: 0.5.0
kind: amd.com/gpu
devices:
- name: "0"
  containerEdits:
    deviceNodes:
    - path: /dev/kfd
    - path: /dev/dri/renderD128
- name: "1"
  containerEdits:
    deviceNodes:
    - path: /dev/kfd
    - path: /dev/dri/renderD130
- name: "all"
  containerEdits:
    deviceNodes:
    - path: /dev/kfd
    - path: /dev/dri/renderD128
    - path: /dev/dri/renderD130
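
A generator for a spec like the one above could be sketched roughly as follows. This is only a minimal illustration, not an existing ROCm tool: the function names `discover_render_nodes` and `build_cdi_spec` are hypothetical, and it assumes that every `/dev/dri/renderD*` node belongs to an AMD GPU and that sorting by minor number gives a sensible device index. A real generator would need to check the driver behind each node. It emits JSON rather than YAML, which CDI also accepts, to stay within the Python standard library.

```python
import glob
import json
import re


def discover_render_nodes(dri_dir="/dev/dri"):
    """Return render node paths sorted by minor number (renderD128 before renderD130).

    Assumption: every render node found here is an AMD GPU.
    """
    def minor(path):
        match = re.search(r"renderD(\d+)$", path)
        return int(match.group(1)) if match else -1

    nodes = glob.glob(f"{dri_dir}/renderD*")
    return sorted((n for n in nodes if minor(n) >= 0), key=minor)


def build_cdi_spec(render_nodes, kfd_path="/dev/kfd"):
    """Build a CDI spec dict with one device per render node plus an "all" device."""
    def edits(paths):
        # Every device needs /dev/kfd in addition to its own render node(s).
        return {"deviceNodes": [{"path": kfd_path}] + [{"path": p} for p in paths]}

    devices = [
        {"name": str(index), "containerEdits": edits([node])}
        for index, node in enumerate(render_nodes)
    ]
    devices.append({"name": "all", "containerEdits": edits(list(render_nodes))})
    return {"cdiVersion": "0.5.0", "kind": "amd.com/gpu", "devices": devices}


if __name__ == "__main__":
    # Print the spec as JSON for whatever render nodes exist on this host.
    print(json.dumps(build_cdi_spec(discover_render_nodes()), indent=2))
```

The output could then be saved under one of the directories that CDI-aware runtimes scan by default, such as /etc/cdi (the file name, for example amd.json, is an arbitrary choice here).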

CDI is now supported by Docker (v25), containerd, CRI-O and Podman.

For example, with Docker or Podman you can then specify --device amd.com/gpu=1 instead of --device /dev/kfd --device /dev/dri/renderD130.
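
To make the equivalence concrete, here is a small sketch of how a fully qualified CDI device name (of the form vendor/class=name, e.g. amd.com/gpu=1) maps back to plain device flags. It is purely illustrative: the container runtime performs this resolution internally, and the helper names `parse_cdi_name` and `equivalent_device_flags` as well as the inline `SPEC` are hypothetical.

```python
# A trimmed copy of the example spec from this issue, as a Python dict.
SPEC = {
    "cdiVersion": "0.5.0",
    "kind": "amd.com/gpu",
    "devices": [
        {"name": "0", "containerEdits": {"deviceNodes": [
            {"path": "/dev/kfd"}, {"path": "/dev/dri/renderD128"}]}},
        {"name": "1", "containerEdits": {"deviceNodes": [
            {"path": "/dev/kfd"}, {"path": "/dev/dri/renderD130"}]}},
    ],
}


def parse_cdi_name(qualified):
    """Split 'vendor/class=name' into (kind, name), e.g. ('amd.com/gpu', '1')."""
    kind, sep, name = qualified.partition("=")
    if not sep or "/" not in kind or not name:
        raise ValueError(f"not a fully qualified CDI device name: {qualified!r}")
    return kind, name


def equivalent_device_flags(spec, qualified):
    """Return the plain --device flags that the CDI device name stands for."""
    kind, name = parse_cdi_name(qualified)
    if kind != spec["kind"]:
        raise LookupError(f"spec does not provide kind {kind!r}")
    for device in spec["devices"]:
        if device["name"] == name:
            flags = []
            for node in device["containerEdits"]["deviceNodes"]:
                flags += ["--device", node["path"]]
            return flags
    raise LookupError(f"no device named {name!r} in spec")


if __name__ == "__main__":
    print(" ".join(equivalent_device_flags(SPEC, "amd.com/gpu=1")))
```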

Operating System

Linux

GPU

No response

ROCm Component

No response