prometheus / node_exporter

Exporter for machine metrics

Home Page:https://prometheus.io/

Make node_exporter aware of k8s cpu limits and use its quota wisely

raptorsun opened this issue · comments

Currently, the default GOMAXPROCS value for node_exporter is 1, which means it uses a single CPU core.
Users can either set the --runtime.gomaxprocs argument to the desired number of processes, or set it to a value less than 1 to use all CPU cores on the node.

However, in a containerized environment where CPU quotas are set, choosing the optimal value of --runtime.gomaxprocs becomes complicated. Imagine a Kubernetes cluster made of nodes with different CPU quotas. A user who deploys node_exporter across the cluster with a single DaemonSet can only set --runtime.gomaxprocs to a fixed value or let it use all CPU cores. As a result, some node_exporter pods may take a long time to scrape metrics on big nodes, while others on small nodes get throttled because they exceed their pod's CPU quota.

I suggest that node_exporter gain a feature that automatically calculates its optimal GOMAXPROCS value from both the available CPU cores and the CPU quota.
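
To make the idea concrete, here is a minimal sketch of how such a calculation could work. It is not the code from the PR and not the go.uber.org/automaxprocs implementation; the helper name procsFromQuota and the rounding policy (round down, never below 1, never above the core count) are assumptions made for illustration. The quota and period are assumed to be in microseconds, as the Linux CFS bandwidth controller reports them.

```go
package main

import "fmt"

// procsFromQuota is a hypothetical helper, not part of node_exporter or
// automaxprocs. It derives a GOMAXPROCS value from a CFS-style CPU quota.
// A quota or period <= 0 is treated as "no quota configured".
func procsFromQuota(quotaMicros, periodMicros int64, numCPU int) int {
	if numCPU < 1 {
		numCPU = 1
	}
	// Without a quota, fall back to the number of cores on the node.
	if quotaMicros <= 0 || periodMicros <= 0 {
		return numCPU
	}
	// Assumed policy: round the quota down to whole CPUs.
	procs := int(quotaMicros / periodMicros)
	// Always keep at least one thread running Go code.
	if procs < 1 {
		procs = 1
	}
	// Never exceed the cores that actually exist.
	if procs > numCPU {
		procs = numCPU
	}
	return procs
}

func main() {
	// 250m limit (25ms per 100ms period) on a 64-core node.
	fmt.Println(procsFromQuota(25000, 100000, 64)) // 1
	// 4-CPU limit on a 64-core node.
	fmt.Println(procsFromQuota(400000, 100000, 64)) // 4
	// No quota on an 8-core node.
	fmt.Println(procsFromQuota(-1, 100000, 8)) // 8
	// 4-CPU limit on a 2-core node.
	fmt.Println(procsFromQuota(400000, 100000, 2)) // 2
}
```

The result would then be passed to runtime.GOMAXPROCS at startup, so the same DaemonSet manifest yields a different value on each node depending on its quota.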

I have opened a PR that implements this feature using go.uber.org/automaxprocs:
#2831

Should node_exporter have such a feature?

There should almost never be a need to run with more than GOMAXPROCS=1. The default was intentionally limited to one CPU to avoid race conditions and load spikes when reading files from the kernel.

Basically, node_exporter should never need more than 100m CPU, even on very large nodes.