thanos-community / ebpf-instrumentation

Demo for "Auto-instrumentation of Prometheus For RED Monitoring With eBPF" talk performed in Q4 2021

Demo of ebpf-instrumentation

This repository holds the demonstration materials for the talk "Auto-instrumentation of Prometheus For RED Monitoring With eBPF" by @bwplotka and @Harshitha1234.

Talk

We hope it will get you started with eBPF more quickly!

Demo Architecture

The demo is scripted using our Go e2e framework inside the standard Go test framework. You can see the implementation here.

(Diagram: demo architecture)

In principle, we want to show IF and HOW HTTP requests can be monitored using the RED method without any service mesh or application (open box) instrumentation. We deploy everything in docker containers to show that this can work in complex cloud native environments (e.g. Kubernetes).

Even without eBPF, Prometheus is nice enough to expose metrics about its own HTTP traffic (how many query and other API requests were made and handled). The metrics we are interested in are exposed under the prometheus_http_requests_total metric name. Obtaining them is as easy as having Prometheus scrape its own metrics, after which we can query them.
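To make the native side of the comparison concrete, here is a minimal sketch of reading prometheus_http_requests_total from text in the Prometheus exposition format and summing it per status code. The helper names (sumRequestsByCode, labelValue) and the sample values are ours and only illustrative; the label parsing is deliberately naive and does not handle escaped quotes.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sumRequestsByCode totals prometheus_http_requests_total samples per HTTP
// status code from text in the Prometheus exposition format.
func sumRequestsByCode(exposition string) map[string]float64 {
	totals := map[string]float64{}
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip comments (# HELP, # TYPE) and unrelated metrics.
		if !strings.HasPrefix(line, "prometheus_http_requests_total{") {
			continue
		}
		// A sample line looks like: name{labels} value
		end := strings.LastIndex(line, "}")
		if end < 0 {
			continue
		}
		fields := strings.Fields(line[end+1:])
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			continue
		}
		totals[labelValue(line[:end], "code")] += v
	}
	return totals
}

// labelValue extracts a label value with a plain string search.
// Assumption: label values contain no escaped quotes.
func labelValue(s, name string) string {
	key := name + `="`
	i := strings.Index(s, key)
	if i < 0 {
		return ""
	}
	rest := s[i+len(key):]
	j := strings.Index(rest, `"`)
	if j < 0 {
		return ""
	}
	return rest[:j]
}

func main() {
	// Illustrative sample, not real output from the demo.
	sample := `# TYPE prometheus_http_requests_total counter
prometheus_http_requests_total{code="200",handler="/api/v1/query"} 12
prometheus_http_requests_total{code="200",handler="/metrics"} 30
prometheus_http_requests_total{code="400",handler="/api/v1/query"} 2
`
	fmt.Println(sumRequestsByCode(sample))
}
```

In the demo itself you would not parse this by hand: Prometheus scrapes the endpoint and you query the metric in its UI. The sketch only shows what data the native instrumentation gives us to compare against.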

Unfortunately, this is not always the case. Other processes might not have such metrics, or might implement them differently, which is very painful for anyone operating this software. A service mesh helps, but it adds complexity, cost and maintenance burden, and we still might not have all the visibility.

In this demo we try to produce the same behaviour metrics using eBPF and compare them with Prometheus' native instrumentation.

During this demo we:

  1. Deploy a Prometheus docker container that scrapes itself (!) and the ebpf_exporter.
  2. Get the Prometheus process ID (PID) from the host's perspective.
  3. Deploy the ebpf_exporter docker container with special privileges (volumes, privileged mode and capabilities) and our configuration.
  • In this configuration we specify the metrics we want to expose and which eBPF map they should attach to.
  • We specify how we want to hook into the kernel with our eBPF program. In this example we use syscall tracepoints.
  • We specify the program itself, but first we inject the Prometheus PID into it, since we want to filter HTTP traffic only from our Prometheus server.
  4. Open the Prometheus UI to explore metrics, and make some calls to mimic simple HTTP traffic.
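The PID injection from step 3 can be sketched as a plain text substitution performed before the program is handed to ebpf_exporter. This is a rough sketch under our own assumptions: the `{{PID}}` placeholder, the injectPID helper and the C fragment are hypothetical and are not taken from the demo's actual program.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pidPlaceholder is a hypothetical marker in the eBPF program source;
// the real demo may use a different mechanism.
const pidPlaceholder = "{{PID}}"

// injectPID replaces the placeholder with the host-side PID so that the
// eBPF program can ignore syscalls made by every other process.
func injectPID(programTemplate string, pid int) (string, error) {
	if pid <= 0 {
		return "", fmt.Errorf("invalid PID %d", pid)
	}
	if !strings.Contains(programTemplate, pidPlaceholder) {
		return "", fmt.Errorf("placeholder %q not found in program", pidPlaceholder)
	}
	return strings.ReplaceAll(programTemplate, pidPlaceholder, strconv.Itoa(pid)), nil
}

func main() {
	// Illustrative fragment only. bpf_get_current_pid_tgid() is a real BPF
	// helper whose upper 32 bits hold the process (thread group) ID.
	tmpl := `if ((bpf_get_current_pid_tgid() >> 32) != {{PID}}) { return 0; }`

	program, err := injectPID(tmpl, 4242) // 4242 is a made-up PID
	if err != nil {
		panic(err)
	}
	fmt.Println(program)
}
```

The PID has to be the one seen from the host, not from inside the Prometheus container, because the eBPF program runs in the kernel and sees host-level process IDs.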

Limitations

  • This demo works on my machine ™️ (trust me): Linux pop-os 5.11.0-7620-generic #21~1626191760~20.10~55de9c3-Ubuntu SMP Wed Jul 21 20:34:49 UTC x86_64 x86_64 x86_64 GNU/Linux. Because eBPF requires kernel headers and function signatures from exactly the kernel version it will run on, this demo is unlikely to be portable. You can try to build your own ebpf_exporter docker image with your own kernel headers, based on my Dockerfile, and then run make docker from this repo. This is fine if you control the kernel version in your cluster.
  • The method we chose (tracepoints on the write, read and close syscalls) works with plain HTTP, but not with TLS. For TLS we would need to switch to uprobes on OpenSSL functions or to uprobes hooked into Go specifically. Doable.
  • We did not implement the "duration" element of the RED method. This would need to be added to our eBPF program. Doable too (:

How To Use it?

  • Build the ebpf_exporter docker container image: make docker
  • Run the example: make run-example

Useful links that helped us create this talk

There is a lot of prior work and amazing tutorials that helped us make this demo:

License: Apache License 2.0

