podman_container_mem_usage_bytes metrics disappeared
felixkrohn opened this issue
Describe the bug
I used to query podman_container_mem_usage_bytes in Prometheus, but noticed it's no longer exposed. A manual check of the /metrics endpoint confirms this.
To Reproduce
Unfortunately I can't really say what has changed. At first I assumed a permission problem on the podman socket, so I created a separate socket used only by the prometheus-podman-exporter container:
/ $ id
uid=65534(nobody) gid=65534(nobody)
/ $ ls -la /run/podman/podman.sock
srw------- 1 nobody nobody 0 Dec 5 17:35 /run/podman/podman.sock
/ $
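To rule out a permission problem more systematically, here is a small Python sketch that checks whether the current user (the exporter runs as `nobody` here) can actually open the socket. The helper name is mine; the path is the one from this thread:

```python
import os
import stat

def socket_status(path):
    """Report whether `path` exists, is a unix-domain socket, and is
    read/write accessible to the current user."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return {"exists": False, "is_socket": False, "rw": False}
    return {
        "exists": True,
        "is_socket": stat.S_ISSOCK(st.st_mode),
        "rw": os.access(path, os.R_OK | os.W_OK),
    }

print(socket_status("/run/podman/podman.sock"))
```

If `rw` comes back False while the exporter user owns the socket, the problem is likely the mount or user mapping of the container rather than the socket mode bits.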
Logs (the last 7 lines repeat):
ts=2022-12-05T17:35:47.710Z caller=exporter.go:63 level=info msg="Starting podman-prometheus-exporter" version="(version=1.3.0, branch=main, revision=dev.1)"
ts=2022-12-05T17:35:47.711Z caller=handler.go:93 level=info msg="enabled collectors"
ts=2022-12-05T17:35:47.711Z caller=handler.go:104 level=info collector=container
ts=2022-12-05T17:35:47.711Z caller=handler.go:104 level=info collector=image
ts=2022-12-05T17:35:47.711Z caller=handler.go:104 level=info collector=network
ts=2022-12-05T17:35:47.711Z caller=handler.go:104 level=info collector=pod
ts=2022-12-05T17:35:47.711Z caller=handler.go:104 level=info collector=system
ts=2022-12-05T17:35:47.711Z caller=handler.go:104 level=info collector=volume
ts=2022-12-05T17:35:47.711Z caller=exporter.go:74 level=info msg="Listening on" address=127.0.0.1:9882
ts=2022-12-05T17:35:47.712Z caller=tls_config.go:232 level=info msg="Listening on" address=127.0.0.1:9882
ts=2022-12-05T17:35:47.712Z caller=tls_config.go:235 level=info msg="TLS is disabled." http2=false address=127.0.0.1:9882
ts=2022-12-05T17:35:58.307Z caller=handler.go:34 level=debug msg="collect query:" filters="unsupported value type"
ts=2022-12-05T17:35:58.312Z caller=collector.go:135 level=debug msg="collector succeeded" name=network duration_seconds=0.002719641
ts=2022-12-05T17:35:58.323Z caller=collector.go:135 level=debug msg="collector succeeded" name=pod duration_seconds=0.013724029
ts=2022-12-05T17:35:58.349Z caller=collector.go:135 level=debug msg="collector succeeded" name=volume duration_seconds=0.040115157
ts=2022-12-05T17:35:58.363Z caller=collector.go:135 level=debug msg="collector succeeded" name=container duration_seconds=0.053379088
ts=2022-12-05T17:35:58.511Z caller=collector.go:135 level=debug msg="collector succeeded" name=system duration_seconds=0.202251355
ts=2022-12-05T17:35:58.558Z caller=collector.go:135 level=debug msg="collector succeeded" name=image duration_seconds=0.24884938
Testing the socket itself seems OK:
$ echo -e "GET /containers/json HTTP/1.0\r\n" | podman unshare nc -U ${SOCKET}
HTTP/1.0 200 OK
Api-Version: 1.41
Content-Type: application/json
Libpod-Api-Version: 4.3.1
Server: Libpod/4.3.1 (linux)
X-Reference-Id: 0xc0003c8000
Date: Mon, 05 Dec 2022 18:13:37 GMT
[{"Id":"1ef8fdf2102ba787e29125f3def643e6a4c5da4b22266daccffd2b24423b2549","Names" [...]
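Since the memory-usage metric is presumably derived from container stats rather than from /containers/json, it may also be worth querying the libpod stats endpoint directly over the same socket. The sketch below uses the real `/libpod/containers/stats` route from the podman REST API (with `stream=false` for a single sample); the `UnixHTTPConnection` helper and function names are mine, and the `/v4.3.1` version prefix matches the Libpod-Api-Version seen above:

```python
import http.client
import json
import socket

class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client.HTTPConnection speaking over a unix-domain socket
    (helper for this sketch, not part of podman)."""

    def __init__(self, sock_path):
        super().__init__("localhost")
        self.sock_path = sock_path

    def connect(self):
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        s.connect(self.sock_path)
        self.sock = s

def stats_path(stream=False):
    # libpod container stats; stream=false returns one sample and closes
    return "/v4.3.1/libpod/containers/stats?stream=%s" % str(stream).lower()

def fetch_stats(sock_path="/run/podman/podman.sock"):
    conn = UnixHTTPConnection(sock_path)
    conn.request("GET", stats_path())
    resp = conn.getresponse()
    return json.loads(resp.read())
```

If `fetch_stats()` fails or returns empty stats while /containers/json works, that would narrow the problem down to the stats path specifically.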
Expected behavior
podman_container_mem_usage_bytes should be available in the metrics exposed by prometheus-podman-exporter.
Environment
- CentOS 9 Stream
- Podman version 4.3.1
- exporter runs in a podman container:
/bin/podman_exporter --collector.enable-all --collector.store_labels --debug --web.listen-address 127.0.0.1:9882
Additional context
Any help in debugging this is welcome.
Hi @felixkrohn
I have tried on FC37 and cannot reproduce the issue (built from the main branch).
Will try on CentOS 9 stream and let you know.
[navid@devnode prometheus-podman-exporter]$ ./bin/prometheus-podman-exporter --version
prometheus-podman-exporter (version=1.3.0, branch=main, revision=dev.1)
Example output:
# HELP podman_container_mem_usage_bytes Container memory usage.
# TYPE podman_container_mem_usage_bytes gauge
podman_container_mem_usage_bytes{id="1cddd7e911ef"} 3.01056e+06
podman_container_mem_usage_bytes{id="42907dc71261"} 49152
podman_container_mem_usage_bytes{id="5241480811bd"} 0
podman_container_mem_usage_bytes{id="bd7627e4d928"} 45056
podman_container_mem_usage_bytes{id="c81bdeea85df"} 2.740224e+06
Can you please also attach the output of:
curl http://localhost:9882/metrics | grep containers
In fact, I seem to get none of the resource-usage metrics under podman_container_*:
$ podman exec -ti prometheus-podman-exporter sh
/ $ ps a
PID USER TIME COMMAND
1 nobody 2:11 /bin/podman_exporter --collector.enable-all --collector.store_labels --web.telemetry-path /podman/metrics --debug --web.listen-address 127.0.0.1:9882
18 nobody 0:00 sh
87 nobody 0:00 ps a
/ $ /bin/podman_exporter --version
prometheus-podman-exporter (version=1.3.0, branch=main, revision=dev.1)
/ $ wget -q http://127.0.0.1:9882/podman/metrics -O- | grep -v "^#" |grep "^podman" | awk -F'{' '{print $1}' | sort | uniq
podman_container_created_seconds
podman_container_exit_code
podman_container_exited_seconds
podman_container_info
podman_container_started_seconds
podman_container_state
podman_image_created_seconds
podman_image_info
podman_image_size
podman_network_info
podman_scrape_collector_duration_seconds
podman_scrape_collector_success
podman_system_api_version
podman_system_buildah_version
podman_system_conmon_version
podman_system_runtime_version
podman_volume_created_seconds
podman_volume_info
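The comparison above can be done mechanically. Here is a Python sketch that replicates the `wget | grep | awk | sort | uniq` pipeline and diffs the exposed metric families against an expected set; `podman_container_mem_usage_bytes` is the family from this issue, and the function name and sample text are mine:

```python
def metric_families(metrics_text):
    """Extract unique metric family names from Prometheus exposition text,
    equivalent to the shell pipeline above."""
    names = set()
    for line in metrics_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return sorted(names)

# Families expected but missing; mem_usage_bytes is the one from this issue.
EXPECTED = {"podman_container_mem_usage_bytes"}

sample = """# HELP podman_container_state Container state
podman_container_state{id="1cddd7e911ef"} 1
podman_container_info{id="1cddd7e911ef"} 1
"""
missing = EXPECTED - set(metric_families(sample))
print(missing)  # → {'podman_container_mem_usage_bytes'}
```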
A friendly reminder that this issue has had no activity for 30 days.