utkuozdemir / nvidia_gpu_exporter

Nvidia GPU exporter for prometheus using nvidia-smi binary

Working with multinode servers?

johnnynunez opened this issue · comments

Hello, I have 4 servers, each with 8 GPUs.
Can I connect with multiwindow?

Hi, I don't know what you mean by multiwindow, but the exporter should work fine with multiple GPUs, as it detects each GPU with its UUID and exports its metrics labeled with that UUID. The Grafana dashboard is also aware of these UUIDs - on the dropdown at the top of the dashboard, you can choose from the list of UUIDs to see the metrics of that specific GPU.

But I should install the NVIDIA exporter on each node, right?

Yes, exactly.

Perfect. So after installing it on each node, can I have everything in the same dashboard, or do I have to select different IPs to show it in Grafana? For example:
Node 0: 128 -> 8 GPUs
Node 1: 129 -> 8 GPUs

The same dashboard would work if you get the metrics from all nodes into the same Prometheus instance. As the metrics are labeled with the GPU UUID, they do not conflict with each other, and the dashboard is able to display multiple GPUs.
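
One way to get all nodes into a single Prometheus is file-based service discovery. The following is only a rough sketch in Python that writes out such a target list. The host names are hypothetical placeholders, the port 9835 is assumed to be the exporter's listen port on each node, and the `role` label is purely illustrative; adjust all of them to your setup.

```python
import json

# Assumption: the exporter listens on port 9835 on every node.
EXPORTER_PORT = 9835


def build_file_sd_targets(hosts, port=EXPORTER_PORT):
    """Return one Prometheus file_sd target group covering every GPU node."""
    return [
        {
            "targets": [f"{host}:{port}" for host in hosts],
            # Illustrative extra label attached to every scraped series.
            "labels": {"role": "gpu"},
        }
    ]


# Hypothetical host names; replace with the addresses of your four servers.
hosts = ["gpu-node-0", "gpu-node-1", "gpu-node-2", "gpu-node-3"]
target_groups = build_file_sd_targets(hosts)

# Save this JSON to a file referenced by a file_sd_configs entry
# in the scrape job of your Prometheus configuration.
print(json.dumps(target_groups, indent=2))
```

Prometheus adds an `instance` label per target, so together with the UUID label each of the 32 GPUs stays distinguishable on the one dashboard.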