Podman stats showing wrong memory usage
abdelaziz-ouhammou opened this issue
Issue Description
$ podman version
Client: Podman Engine
Version: 4.4.1
API Version: 4.4.1
Go Version: go1.19.6
Built: Thu Jun 15 17:39:56 2023
OS/Arch: linux/amd64
$ rpm -q podman
podman-4.4.1-14.module+el8.8.0+19108+ffbdcd02.x86_64
Steps to reproduce the issue
- Run any file-creation command, such as tar or dd, to create a file inside the container.
Example:
$ sudo podman container exec -it api sh
/app # dd if=/dev/zero of=output.dat bs=24M count=100
100+0 records in
100+0 records out
/app # sync
/app #
Describe the results you received
When running podman stats, it shows high memory utilization for the container:
$ sudo podman stats
ID NAME CPU % MEM USAGE / LIMIT MEM % NET IO BLOCK IO PIDS CPU TIME AVG CPU %
ed71asdasdaas api 4.91% 2.58GB / 8.052GB 32.04% 842.7MB / 1.188GB 55.28MB / 29.39MB 7 15m35.164593502s 4.91%
The process normally uses around 48 MB.
The only way to fix this is to manually run the following command on the host:
$ sudo sh -c 'echo 1 > /proc/sys/vm/drop_caches'
$ sudo podman stats
ed71asdasda api 0.14% 40.98MB / 8.052GB 0.51% 852.7MB / 1.201GB 55.28MB / 29.93MB 7 15m45.419706366s 4.91%
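For reference, the same workaround as a minimal Go sketch, assuming it runs as root on the host (equivalent to the sync plus drop_caches commands above):

package main

import (
    "log"
    "os"
    "syscall"
)

func main() {
    // Flush dirty pages first so they become clean and droppable (same as `sync`).
    syscall.Sync()

    // Writing "1" asks the kernel to drop the page cache system-wide,
    // equivalent to `echo 1 > /proc/sys/vm/drop_caches` (requires root).
    if err := os.WriteFile("/proc/sys/vm/drop_caches", []byte("1"), 0o644); err != nil {
        log.Fatalf("drop_caches: %v", err)
    }
}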
Describe the results you expected
I expect podman stats to be accurate for monitoring the memory usage of containers, but running a backup or any other I/O-heavy operation inside the container skews the output.
podman info output
host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.6-1.module+el8.8.0+18098+9b44df5f.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.6, commit: 8c4ab5a095127ecc96ef8a9c885e0e1b14aeb11b'
  cpuUtilization:
    idlePercent: 81.92
    systemPercent: 4.36
    userPercent: 13.72
  cpus: 2
  distribution:
    distribution: '"rhel"'
    version: "8.8"
  eventLogger: file
  hostname: [hostname redacted]
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
  kernel: 4.18.0-372.19.1.el8_6.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 4544892928
  memTotal: 8052305920
  networkBackend: cni
  ociRuntime:
    name: runc
    package: runc-1.1.4-1.module+el8.8.0+18060+3f21f2cc.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.1.4
      spec: 1.0.2-dev
      go: go1.19.4
      libseccomp: 2.5.2
  os: linux
  remoteSocket:
    path: /run/user/1002/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_SYS_CHROOT,CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-2.module+el8.8.0+18060+3f21f2cc.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 8332025856
  swapTotal: 8589930496
  uptime: 1592h 37m 31.00s (Approximately 66.33 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: ~/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: ~/.local/share/containers/storage
  graphRootAllocated: 10726932480
  graphRootUsed: 1256620032
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1002/containers
  transientStore: false
  volumePath: ~/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1686839996
  BuiltTime: Thu Jun 15 17:39:56 2023
  GitCommit: ""
  GoVersion: go1.19.6
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1
Podman in a container
No
Privileged Or Rootless
Privileged
Upstream Latest Release
Yes
Additional environment details
No response
Additional information
No response
Can you reproduce it with any image?
How have you created the container?
@giuseppe Yes, so far all images are affected (postgres image, prometheus image, ...).
The containers have been created using native Podman commands:
sudo podman container run --restart=always -d --name api -p 4444:4444 api
That is expected output: the file you've just created is in the page cache, and the kernel accounts for that in the memory.usage_in_bytes file.
I've tried the same command on cgroup v1, and the cgroup reports the following usage:
# podman stats --no-stream --no-reset
ID NAME CPU % MEM USAGE / LIMIT MEM % NET IO BLOCK IO PIDS CPU TIME AVG CPU %
5db629b04d96 unruffled_darwin 2.21% 2.87GB / 3.843GB 74.68% 190.2MB / 409.4kB 27.49MB / 5.935MB 1 42.101330797s 2.21%
# cat /sys/fs/cgroup/memory/machine.slice/libpod-5db629b04d960b7b4641928480075c245dc0503dc309fe033eb76096e6adee62.scope/memory.usage_in_bytes
2869575680
This memory is reclaimed if the container needs more; in fact, you can see it is almost entirely cache:
# cat /sys/fs/cgroup/memory/machine.slice/libpod-5db629b04d960b7b4641928480075c245dc0503dc309fe033eb76096e6adee62.scope/memory.stat
cache 2842775552
rss 675840
rss_huge 0
shmem 0
mapped_file 0
dirty 0
writeback 0
swap 0
pgpgin 2239392
pgpgout 1545190
pgfault 279477
pgmajfault 6
inactive_anon 655360
active_anon 20480
inactive_file 2653802496
active_file 188973056
unevictable 0
hierarchical_memory_limit 9223372036854771712
hierarchical_memsw_limit 9223372036854771712
total_cache 2842775552
total_rss 675840
total_rss_huge 0
total_shmem 0
total_mapped_file 0
total_dirty 0
total_writeback 0
total_swap 0
total_pgpgin 2239392
total_pgpgout 1545190
total_pgfault 279477
total_pgmajfault 6
total_inactive_anon 655360
total_active_anon 20480
total_inactive_file 2653802496
total_active_file 188973056
total_unevictable 0
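To make the accounting concrete, here is a minimal Go sketch (not Podman's code) that reads the same cgroup v1 files and separates the reclaimable file cache from the rest; the scope path is an illustrative placeholder:

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strconv"
    "strings"
)

// readUint parses a cgroup file that holds a single integer.
func readUint(path string) uint64 {
    data, err := os.ReadFile(path)
    if err != nil {
        log.Fatal(err)
    }
    v, err := strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
    if err != nil {
        log.Fatal(err)
    }
    return v
}

// statValue returns the named counter from a memory.stat file.
func statValue(path, key string) uint64 {
    f, err := os.Open(path)
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    s := bufio.NewScanner(f)
    for s.Scan() {
        fields := strings.Fields(s.Text())
        if len(fields) == 2 && fields[0] == key {
            v, _ := strconv.ParseUint(fields[1], 10, 64)
            return v
        }
    }
    return 0
}

func main() {
    // Hypothetical scope directory; substitute your container's real cgroup path.
    dir := "/sys/fs/cgroup/memory/machine.slice/libpod-<container-id>.scope"

    usage := readUint(dir + "/memory.usage_in_bytes")
    inactiveFile := statValue(dir+"/memory.stat", "total_inactive_file")

    // total_inactive_file counts file-backed pages the kernel reclaims first
    // under pressure; subtracting it approximates the non-cache usage.
    nonCache := usage
    if inactiveFile < usage {
        nonCache = usage - inactiveFile
    }
    fmt.Printf("usage_in_bytes:      %d\n", usage)
    fmt.Printf("total_inactive_file: %d\n", inactiveFile)
    fmt.Printf("usage minus cache:   %d\n", nonCache)
}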
You can give the kernel a hint to release a file from the cache with posix_fadvise; e.g., I've tried the following C program:
#include <fcntl.h>

/* Advise the kernel to drop the cached pages of the file on stdin
   (fd 0), which the shell redirection below supplies. */
int main() {
    return posix_fadvise(0, 0, 0, POSIX_FADV_DONTNEED) ? 1 : 0;
}
and from the container, passing the file on stdin:
# ./try-release-file-from-cache < output.dat
and after a while:
# podman stats --no-stream --no-reset
ID NAME CPU % MEM USAGE / LIMIT MEM % NET IO BLOCK IO PIDS CPU TIME AVG CPU %
5db629b04d96 unruffled_darwin 1.99% 347.2MB / 3.843GB 9.04% 190.2MB / 409.4kB 27.49MB / 5.935MB 1 42.293771618s 1.99%
I am closing the issue, since Podman is just reporting the information it gets from the kernel, but feel free to comment further.
@giuseppe Thank you very much for your help. I wrongly assumed that Podman has the same behavior as Docker. This is an excerpt from the documentation for docker stats:
On Linux, the Docker CLI reports memory usage by subtracting cache usage from the total memory usage.
So, in your opinion @giuseppe, what would be the best way to monitor the actual memory usage?
thanks for the additional info, I'll take another look and compare with Docker
@giuseppe I just want to add that I checked the Docker source code, and they have the following function:
// calculateMemUsageUnixNoCache calculate memory usage of the container.
// Cache is intentionally excluded to avoid misinterpretation of the output.
//
// On cgroup v1 host, the result is `mem.Usage - mem.Stats["total_inactive_file"]` .
// On cgroup v2 host, the result is `mem.Usage - mem.Stats["inactive_file"] `.
//
// This definition is consistent with cadvisor and containerd/CRI.
// * https://github.com/google/cadvisor/commit/307d1b1cb320fef66fab02db749f07a459245451
// * https://github.com/containerd/cri/commit/6b8846cdf8b8c98c1d965313d66bc8489166059a
//
// On Docker 19.03 and older, the result was `mem.Usage - mem.Stats["cache"]`.
// See https://github.com/moby/moby/issues/40727 for the background.
func calculateMemUsageUnixNoCache(mem types.MemoryStats) float64 {
    // cgroup v1
    if v, isCgroup1 := mem.Stats["total_inactive_file"]; isCgroup1 && v < mem.Usage {
        return float64(mem.Usage - v)
    }
    // cgroup v2
    if v := mem.Stats["inactive_file"]; v < mem.Usage {
        return float64(mem.Usage - v)
    }
    return float64(mem.Usage)
}
This is the link to the actual file: https://github.com/docker/cli/blob/master/cli/command/container/stats_helpers.go
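To see what that formula yields with the numbers from this thread, here is a small, self-contained Go sketch; the MemoryStats struct below only mirrors the fields of Docker's types.MemoryStats that the function needs:

package main

import "fmt"

// MemoryStats mirrors just the fields of Docker's types.MemoryStats
// that calculateMemUsageUnixNoCache needs.
type MemoryStats struct {
    Usage uint64
    Stats map[string]uint64
}

func calculateMemUsageUnixNoCache(mem MemoryStats) float64 {
    // cgroup v1
    if v, isCgroup1 := mem.Stats["total_inactive_file"]; isCgroup1 && v < mem.Usage {
        return float64(mem.Usage - v)
    }
    // cgroup v2
    if v := mem.Stats["inactive_file"]; v < mem.Usage {
        return float64(mem.Usage - v)
    }
    return float64(mem.Usage)
}

func main() {
    // Numbers shaped like the cgroup v1 output earlier in this thread:
    // 2869575680 total usage, 2653802496 of it inactive file cache.
    v1 := MemoryStats{
        Usage: 2869575680,
        Stats: map[string]uint64{"total_inactive_file": 2653802496},
    }
    fmt.Printf("usage without cache: %.0f bytes\n", calculateMemUsageUnixNoCache(v1))
}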
opened a PR: #1643
We should probably match Docker's behaviour. Thanks @abdelaziz-ouhammou for diagnosing this.
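For completeness, the same subtraction on a cgroup v2 host reads memory.current and the inactive_file counter from memory.stat; a minimal sketch with an illustrative scope path, not the code from the PR above:

package main

import (
    "fmt"
    "os"
    "strconv"
    "strings"
)

func main() {
    // Hypothetical cgroup v2 scope; substitute the real container scope path.
    dir := "/sys/fs/cgroup/machine.slice/libpod-<container-id>.scope"

    raw, err := os.ReadFile(dir + "/memory.current")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    usage, _ := strconv.ParseUint(strings.TrimSpace(string(raw)), 10, 64)

    // On cgroup v2 the key is "inactive_file" (no "total_" prefix).
    stat, err := os.ReadFile(dir + "/memory.stat")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    var inactiveFile uint64
    for _, line := range strings.Split(string(stat), "\n") {
        fields := strings.Fields(line)
        if len(fields) == 2 && fields[0] == "inactive_file" {
            inactiveFile, _ = strconv.ParseUint(fields[1], 10, 64)
        }
    }

    // Docker-style reporting: usage minus the reclaimable file cache.
    if inactiveFile < usage {
        usage -= inactiveFile
    }
    fmt.Printf("mem usage (no cache): %d bytes\n", usage)
}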