Provide ability to configure the location of the /proc and /sys directories
jamtur01 opened this issue · comments
A lot of folks collect host metrics using monitoring tools running inside Docker or Kubernetes. Inside these containers, to access the host system underneath, we need to bind-mount `/proc` and `/sys` into the container so the monitoring tool finds the right filesystems. The monitoring tool is then configured to read from the bind-mounts instead of the container's own filesystem.
A good example is the Prometheus node_exporter; you can see this setup in its Helm chart for deploying in k8s: https://github.com/prometheus-community/helm-charts/blob/main/charts/prometheus-node-exporter/templates/daemonset.yaml. The node_exporter then has configuration options to specify the locations of `/proc` and `/sys`:
The volumes:

```yaml
volumes:
  - name: proc
    hostPath:
      path: /proc
  - name: sys
    hostPath:
      path: /sys
```
The bind-mounts:

```yaml
volumeMounts:
  - name: proc
    mountPath: /host/proc
    readOnly: true
  - name: sys
    mountPath: /host/sys
    readOnly: true
```
The node_exporter configuration:

```yaml
containers:
  - name: node-exporter
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
    imagePullPolicy: {{ .Values.image.pullPolicy }}
    args:
      - --path.procfs=/host/proc
      - --path.sysfs=/host/sys
```
It'd be great if `heim` had an option to configure the location of the relevant filesystems to support this use case.
Hi, @jamtur01! Yeah, I have had this idea in my head for a long time, but it is great to finally have it captured in written form.
It would be great to use the `heim::virt::detect()` function to do that automatically, but I'm not sure yet how it would work as a whole.
We need this at Vector for vectordotdev/vector#4163; let's try to figure something out?
Currently, it looks like `heim` doesn't support any kind of configuration: every call is ad hoc and doesn't rely on any kind of state (besides the hard-coded one).
In essence, I think `heim` should have API infrastructure for global configuration, for things like this and other similar situations.
If the idea of global configuration is not appealing, we could hack in some kind of global state to pass the platform-specific configuration and preserve the API as-is, implicitly referring to that global state. I don't like this approach and am mentioning it only for completeness. However, it might serve as a temporary solution for us (in a fork) if progress on a proper implementation stalls. Of course, we would rather help improve `heim` properly than resort to custom hacks!
Automatically detecting the paths based on `heim::virt::detect()` is not possible without hard-coding the paths for the in-container case. This is something we'd like to avoid: some people use non-standard mount points (in some cases dictated by enterprise policies that are hard to work around), and it would be great to allow customization for such cases.
I suggest providing a way to simply pass the paths to the `procfs` and `sysfs` mount points on Linux.
We could design the API like this:
- in all the relevant Linux system-level implementations, we pass a reference to a struct containing two params: `procfs_mountpoint: PathBuf` and `sysfs_mountpoint: PathBuf`;
- user-facing calls can be transformed from something like `heim::memory::memory()` into `heim::memory::memory(config)`, or into something like `let configured = heim::preconfigured(config); configured.memory()`;
This adds some boilerplate to the usage on all of the platforms, even the ones not requiring the configuration. However, in exchange, we get a common (and, imo, not too bad) way to pass configuration to heim.
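The `heim::preconfigured(config)` variant above could look roughly like the following sketch. All names here (`Config`, `Preconfigured`, `meminfo_path`) are hypothetical illustrations, not heim's actual API:

```rust
use std::path::PathBuf;

// Hypothetical configuration struct holding the two proposed params.
pub struct Config {
    pub procfs_mountpoint: PathBuf,
    pub sysfs_mountpoint: PathBuf,
}

// Hypothetical handle returned by `preconfigured(config)`; the actual
// metric calls would hang off it instead of being free functions.
pub struct Preconfigured {
    config: Config,
}

pub fn preconfigured(config: Config) -> Preconfigured {
    Preconfigured { config }
}

impl Preconfigured {
    // Stand-in for a metric call: instead of opening the hard-coded
    // /proc/meminfo, derive the path from the configured mount point.
    pub fn meminfo_path(&self) -> PathBuf {
        self.config.procfs_mountpoint.join("meminfo")
    }
}

fn main() {
    let configured = preconfigured(Config {
        procfs_mountpoint: PathBuf::from("/host/proc"),
        sysfs_mountpoint: PathBuf::from("/host/sys"),
    });
    // Prints "/host/proc/meminfo".
    println!("{}", configured.meminfo_path().display());
}
```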
With this layout, the question remains what the `config` will look like. It can be either a cross-platform `struct` or `enum`, or a platform-specific type that differs depending on the compilation environment/settings. The decision on how it should look is actually quite an easy one: if the API is designed to be cross-platform, then the config type should be common rather than platform-specific, even though on some platforms it will contain data not relevant to the platform it executes on. This is the easiest way for crate users to drive cross-platform configuration. It can be an `enum` that only allows a single platform config at a time (linux/windows/macos/unix/etc.) to avoid paying at runtime for what we don't use.
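Such an enum might be sketched like this. The variant and field names are illustrative assumptions, not heim's actual types:

```rust
use std::path::PathBuf;

// Hypothetical cross-platform config enum: one platform variant at a
// time, so non-Linux users don't carry Linux-only data at runtime.
#[derive(Debug, Clone, PartialEq)]
pub enum Config {
    Linux {
        procfs_mountpoint: PathBuf,
        sysfs_mountpoint: PathBuf,
    },
    // Other platforms currently have nothing to configure.
    Windows,
    MacOs,
}

impl Config {
    // Resolve the procfs mount point, falling back to the standard
    // default for variants where it is irrelevant.
    pub fn procfs_mountpoint(&self) -> PathBuf {
        match self {
            Config::Linux { procfs_mountpoint, .. } => procfs_mountpoint.clone(),
            _ => PathBuf::from("/proc"),
        }
    }
}

fn main() {
    let config = Config::Linux {
        procfs_mountpoint: PathBuf::from("/host/proc"),
        sysfs_mountpoint: PathBuf::from("/host/sys"),
    };
    // Prints "/host/proc".
    println!("{}", config.procfs_mountpoint().display());
}
```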
Hi, @MOZGIII!
> Currently, it looks like heim doesn't support any kind of configuration - every call is ad-hoc, and doesn't rely on any kind of state (besides the hard-coded one).
Yes, you are correct: at the moment `heim` does not allow configuring either the `/proc` or the `/sys` directory.
> Automatically detecting the paths based on heim::virt::detect() is not possible without hard-coding the paths to something for an in-container case. This is something we'd like to avoid, as some people use non-standard mount points - in some cases dictated by enterprise policies that are hard to work around - and it would be great to allow customization for such cases.
Thanks for pointing that out; I had not explored this feature that deeply.
What I would ideally want to see is the following flow:
1. If nothing is done explicitly by the `heim` user, the `/proc` and `/sys` directories are used by default.
2. The user can manually re-define the paths to these directories somehow.
3. Optionally, they can enable some sort of `linux-procfs-auto-detect` Cargo feature, which will utilize the `heim::virt::detect` function to find these directories for us.
The main point here is that the default configuration should target most of the potential cases (which means the usual location of the `/proc` filesystem). And if some customization is needed, it should be doable in an easy way that does not affect other users and does not break the user experience on Windows and macOS.
Item 3 here is desirable behavior which we can't get right now, as the `heim::virt::detect` function is unreliable and untested; that leaves us with the first two cases.
I'm also against your idea of explicitly passing some sort of configuration into all public functions: it badly affects the user experience for everyone who is not running `heim` inside containers (some of the Linux users and all of the Windows and macOS users). I also can't think of any other use case where we might need such a configuration, so that settles it.
Considering all that, I think we should step back and implement a naive global-state approach; while it might not be fancy or particularly well designed, it will cover your (and possibly other) cases, and if it turns out to be not enough, we can rework it later.
What I would want to see is the following public API:

```rust
/// Returns the path to the `procfs` directory from which `heim` loads data.
#[cfg(target_os = "linux")]
pub fn procfs_path() -> &'static Path { unimplemented!() }

/// Sets the path to the `procfs` mount for `heim` to load data from.
#[cfg(target_os = "linux")]
pub fn set_procfs_path<T: AsRef<Path>>(procfs: T) { unimplemented!() }
```

plus the same API for `sysfs`, where these global state values are stored inside the `heim-runtime` crate (say, in `src/linux.rs`).
`#[cfg(target_os = "linux")]` would be a required attribute for this API because, well, it is available only on Linux, and creating some sort of stub functions for other OSes would be an incorrect approach.
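One way such global state could be backed is a `std::sync::RwLock` in a static. This is only a sketch of the idea, not heim's implementation, and it deviates from the signatures above by returning an owned `PathBuf` (a plain `&Path` borrow could not outlive the lock guard):

```rust
use std::path::{Path, PathBuf};
use std::sync::RwLock;

// Global override for the procfs mount point; `None` means "use the
// default /proc". `RwLock::new` is const, so no lazy-init crate is needed.
static PROCFS_PATH: RwLock<Option<PathBuf>> = RwLock::new(None);

/// Returns the path to the `procfs` directory from which data is loaded.
pub fn procfs_path() -> PathBuf {
    PROCFS_PATH
        .read()
        .expect("lock poisoned")
        .clone()
        .unwrap_or_else(|| PathBuf::from("/proc"))
}

/// Sets the path to the `procfs` mount to load data from.
pub fn set_procfs_path<T: AsRef<Path>>(procfs: T) {
    *PROCFS_PATH.write().expect("lock poisoned") = Some(procfs.as_ref().to_path_buf());
}

fn main() {
    // Default before any configuration.
    println!("{}", procfs_path().display()); // prints "/proc"
    set_procfs_path("/host/proc");
    println!("{}", procfs_path().display()); // prints "/host/proc"
}
```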
Let me know what you think about it!
I was thinking about it, and, in our case, users might want to collect metrics from both the host and the container at the same time, which would be a problem with a single global setting; this is why I was gravitating toward the API change.
That said, in the immediate future this will help, and we can work around it by configuring those settings via env vars (to kind of hide them from the user-visible configs, which are a per-topology unit, of which we can have many `host_metrics` at the same time).
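The env-var workaround mentioned above could look roughly like the sketch below; the `PROCFS_ROOT` variable name and the helper functions are purely illustrative assumptions, not anything Vector or heim actually ships:

```rust
use std::env;
use std::path::PathBuf;

// Pure resolution logic, split out so it is easy to test: an explicit
// override wins, otherwise fall back to the standard mount point.
fn resolve_procfs_root(override_value: Option<&str>) -> PathBuf {
    override_value
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("/proc"))
}

// Read the (hypothetical) PROCFS_ROOT environment variable.
fn procfs_root() -> PathBuf {
    resolve_procfs_root(env::var("PROCFS_ROOT").ok().as_deref())
}

fn main() {
    println!("procfs root: {}", procfs_root().display());
}
```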
So, overall this sounds good! I still prefer the API-change approach, though.
@svartalf do you have any plans to work on this? If not, I can take a stab at implementing this next week.
@MOZGIII yeah, I understand where you are going with your idea; it's just that at this point there are more disadvantages for other use cases. That, of course, might change later if new thoughts on the matter appear.
> do you have any plans to work on this? If not, I can take a stab at implementing this next week.
I'm not sure if I'll be able to work on it this week, so maybe let's sync on Monday or so to see if anything has changed?