prometheus / node_exporter

Exporter for machine metrics

Home Page: https://prometheus.io/

Collect Linux BPF stats

tomershafir opened this issue

Since Linux 5.1, the kernel can collect some BPF program stats: https://github.com/torvalds/linux/blob/master/tools/bpf/bpftool/Documentation/bpftool-prog.rst?plain=1#L80
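For context, collection of these stats is off by default and is toggled by the `kernel.bpf_stats_enabled` sysctl. A minimal Go sketch of checking that switch follows; the helper name `bpfStatsEnabled` is made up for this example and is not part of node_exporter or procfs.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// bpfStatsEnabled interprets the contents of the kernel.bpf_stats_enabled
// sysctl file: "1" means run-time stats are collected, "0" means they are not.
// Hypothetical helper, written only for this sketch.
func bpfStatsEnabled(raw string) (bool, error) {
	switch strings.TrimSpace(raw) {
	case "0":
		return false, nil
	case "1":
		return true, nil
	default:
		return false, fmt.Errorf("unexpected bpf_stats_enabled value %q", raw)
	}
}

func main() {
	// The file is only present on kernels that support BPF run-time stats.
	data, err := os.ReadFile("/proc/sys/kernel/bpf_stats_enabled")
	if err != nil {
		fmt.Println("BPF stats sysctl not available:", err)
		return
	}
	enabled, err := bpfStatsEnabled(string(data))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("BPF run-time stats enabled:", enabled)
}
```

Reading the sysctl does not need extra privileges as far as I know; changing it does.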

It seems possible to get the stats from anon_inode, or indirectly via bpftool; I haven't tested this yet. However, I'm not sure how stable this is, or what permissions are needed.
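A rough sketch of what the parsing side could look like, assuming the stats are read from the `fdinfo` text of a BPF program file descriptor and that it contains `prog_id`, `run_time_ns` and `run_cnt` lines in `key:<tab>value` form. The type and function names are hypothetical, the sample values are invented for illustration, and how to obtain such a file descriptor (and with which permissions) is left open, as above.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// bpfProgStats holds the counters of interest. Hypothetical type for this sketch.
type bpfProgStats struct {
	ProgID    uint64
	RunTimeNs uint64
	RunCnt    uint64
}

// parseBPFProgFdinfo extracts prog_id, run_time_ns and run_cnt from fdinfo-style
// text. Unknown keys are ignored; a malformed number is reported as an error.
func parseBPFProgFdinfo(text string) (bpfProgStats, error) {
	var s bpfProgStats
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		var dst *uint64
		switch strings.TrimSpace(key) {
		case "prog_id":
			dst = &s.ProgID
		case "run_time_ns":
			dst = &s.RunTimeNs
		case "run_cnt":
			dst = &s.RunCnt
		default:
			continue
		}
		n, err := strconv.ParseUint(strings.TrimSpace(val), 10, 64)
		if err != nil {
			return s, fmt.Errorf("parsing %s: %w", key, err)
		}
		*dst = n
	}
	return s, sc.Err()
}

func main() {
	// Invented sample input, not captured from a real system.
	sample := "prog_type:\t5\nprog_jited:\t1\nmemlock:\t4096\n" +
		"prog_id:\t42\nrun_time_ns:\t81632\nrun_cnt:\t10\n"
	st, err := parseBPFProgFdinfo(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("prog %d: %d runs, %d ns total\n", st.ProgID, st.RunCnt, st.RunTimeNs)
}
```

Both counters are cumulative, so they would map naturally onto Prometheus counters, with the average run time computed at query time.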

I would like to know what you think about monitoring system-wide BPF with node_exporter. If you think it fits, the open questions above still need to be checked.

Makes sense, but as usual, this needs to go into https://github.com/prometheus/procfs first.

@discordianfish wdyt?

Unfortunately, we don't allow collectors that require CAP_SYS_ADMIN. We have a policy against requiring users to have root access.

@SuperQ Some reasonable use cases require CAP_SYS_ADMIN; maybe we should support this somehow?
Maybe some 'admin mode' where node_exporter:

  • is supposed to run with CAP_SYS_ADMIN and errors out if it does not have it
  • only provides metrics from collectors that require CAP_SYS_ADMIN

Then you could run two node_exporters, one unprivileged and one with CAP_SYS_ADMIN. Dunno, but writing a textfile script for each of these seems meh.
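To make the "errors out if not" part concrete, here is a minimal sketch of a startup check, assuming the process inspects the `CapEff` bitmask in `/proc/self/status` (CAP_SYS_ADMIN is capability number 21). The function name `hasCapSysAdmin` is hypothetical and nothing like this exists in node_exporter today.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capSysAdmin is the bit index of CAP_SYS_ADMIN in the capability bitmask.
const capSysAdmin = 21

// hasCapSysAdmin reports whether the CapEff line of a /proc/<pid>/status
// dump has the CAP_SYS_ADMIN bit set. Hypothetical helper for this sketch.
func hasCapSysAdmin(status string) (bool, error) {
	for _, line := range strings.Split(status, "\n") {
		if !strings.HasPrefix(line, "CapEff:") {
			continue
		}
		hex := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
		mask, err := strconv.ParseUint(hex, 16, 64)
		if err != nil {
			return false, fmt.Errorf("parsing CapEff %q: %w", hex, err)
		}
		return mask&(1<<capSysAdmin) != 0, nil
	}
	return false, errors.New("CapEff line not found")
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read process status:", err)
		os.Exit(1)
	}
	ok, err := hasCapSysAdmin(string(data))
	if err != nil || !ok {
		// In the proposed admin mode this would be a fatal startup error.
		fmt.Fprintln(os.Stderr, "admin mode requires CAP_SYS_ADMIN")
		os.Exit(1)
	}
	fmt.Println("CAP_SYS_ADMIN present, admin collectors could be enabled")
}
```

The unprivileged instance would simply never register the admin-only collectors, so the two daemons would expose disjoint sets of metrics.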

A client can't avoid CAP_SYS_ADMIN, and node_exporter can help. I would prefer not to be forced to run two daemons instead of one; I'd rather have some admin mode as suggested, with telemetry indicating it.

  • https://github.com/cloudflare/ebpf_exporter does export the metrics (it can do much more, though I think the extended capabilities fit less well with the eBPF model of a userspace controller).
  • I don't know why their docs don't require CAP_SYS_ADMIN for those metrics.
  • I would like to run only node_exporter.
  • I think it's actually tricky for node_exporter: it could treat BPF programs like processes and leave them out of scope, with any aggregation taking place at query time. However, they are part of the loaded kernel, so it could monitor them.

The Prometheus project does not have a "single node agent" data model. Having everything in the node_exporter is not something we ever plan to support. There are different exporters for different things.

So, again, this is not a feature we can support at this time, due to the privileges necessary to implement it and the fact that Go does not have any kind of privilege-dropping support.