Podman stats showing wrong memory usage

abdelaziz-ouhammou opened this issue

Issue Description

$ podman version
Client:       Podman Engine
Version:      4.4.1
API Version:  4.4.1
Go Version:   go1.19.6
Built:        Thu Jun 15 17:39:56 2023
OS/Arch:      linux/amd64
$ rpm -q podman

Steps to reproduce the issue

  1. Run any file-creation command, such as tar or dd, to create a file inside the container.
    Example:
$ sudo podman container exec -it api sh
/app # dd if=/dev/zero of=output.dat  bs=24M  count=100
100+0 records in
100+0 records out
/app # sync
/app #

Describe the results you received

When running podman stats, it shows high memory utilisation for the container:

$ sudo podman stats
ID             NAME        CPU %       MEM USAGE / LIMIT  MEM %       NET IO             BLOCK IO           PIDS        CPU TIME           AVG CPU %
ed71asdasdaas  api         4.91%       2.58GB / 8.052GB   32.04%      842.7MB / 1.188GB  55.28MB / 29.39MB  7           15m35.164593502s   4.91%

The process normally uses 48 MB.

The only way to fix this is to manually run the following command on the host:

$ sudo sh -c 'echo 1 > /proc/sys/vm/drop_caches'
$ sudo podman stats
ed71asdasda   api         0.14%       40.98MB / 8.052GB  0.51%       852.7MB / 1.201GB  55.28MB / 29.93MB  7           15m45.419706366s   4.91%

Describe the results you expected

I expect podman stats to be accurate for monitoring the memory usage of containers, but running a backup or any other I/O operation inside the container distorts the output.

podman info output

  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
    package: conmon-2.1.6-1.module+el8.8.0+18098+9b44df5f.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.6, commit: 8c4ab5a095127ecc96ef8a9c885e0e1b14aeb11b'
    idlePercent: 81.92
    systemPercent: 4.36
    userPercent: 13.72
  cpus: 2
    distribution: '"rhel"'
    version: "8.8"
  eventLogger: file
  hostname: [hostname redacted]
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
  kernel: 4.18.0-372.19.1.el8_6.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 4544892928
  memTotal: 8052305920
  networkBackend: cni
    name: runc
    package: runc-1.1.4-1.module+el8.8.0+18060+3f21f2cc.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.1.4
      spec: 1.0.2-dev
      go: go1.19.4
      libseccomp: 2.5.2
  os: linux
    path: /run/user/1002/podman/podman.sock
    apparmorEnabled: false
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-2.module+el8.8.0+18060+3f21f2cc.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.4.0
      libseccomp: 2.5.2
  swapFree: 8332025856
  swapTotal: 8589930496
  uptime: 1592h 37m 31.00s (Approximately 66.33 days)
  authorization: null
  - k8s-file
  - none
  - passthrough
  - journald
  - bridge
  - macvlan
  - ipvlan
  - local
  configFile: ~/.config/containers/storage.conf
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: ~/.local/share/containers/storage
  graphRootAllocated: 10726932480
  graphRootUsed: 1256620032
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
    number: 0
  runRoot: /run/user/1002/containers
  transientStore: false
  volumePath: ~/.local/share/containers/storage/volumes
  APIVersion: 4.4.1
  Built: 1686839996
  BuiltTime: Thu Jun 15 17:39:56 2023
  GitCommit: ""
  GoVersion: go1.19.6
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1

Upstream Latest Release


Can you reproduce it with any image?

How did you create the container?

@giuseppe Yes, so far all images are affected (the postgres image, the prometheus image, ...).

The containers were created using native podman commands:

sudo podman container run --restart=always -d --name api -p 4444:4444 api

That is the expected output: the file you've just created is in the page cache, and the kernel accounts for it in the memory.usage_in_bytes file.

I've tried the same command on cgroup v1, and the cgroup reports the following usage:

# podman stats --no-stream --no-reset
ID            NAME              CPU %       MEM USAGE / LIMIT  MEM %       NET IO             BLOCK IO           PIDS        CPU TIME       AVG CPU %
5db629b04d96  unruffled_darwin  2.21%       2.87GB / 3.843GB   74.68%      190.2MB / 409.4kB  27.49MB / 5.935MB  1           42.101330797s  2.21%
# cat /sys/fs/cgroup/memory/machine.slice/libpod-5db629b04d960b7b4641928480075c245dc0503dc309fe033eb76096e6adee62.scope/memory.usage_in_bytes 

This memory is reclaimed if the container needs more. In fact, you can see that it is almost entirely cache:

# cat /sys/fs/cgroup/memory/machine.slice/libpod-5db629b04d960b7b4641928480075c245dc0503dc309fe033eb76096e6adee62.scope/memory.stat
cache 2842775552
rss 675840
rss_huge 0
shmem 0
mapped_file 0
dirty 0
writeback 0
swap 0
pgpgin 2239392
pgpgout 1545190
pgfault 279477
pgmajfault 6
inactive_anon 655360
active_anon 20480
inactive_file 2653802496
active_file 188973056
unevictable 0
hierarchical_memory_limit 9223372036854771712
hierarchical_memsw_limit 9223372036854771712
total_cache 2842775552
total_rss 675840
total_rss_huge 0
total_shmem 0
total_mapped_file 0
total_dirty 0
total_writeback 0
total_swap 0
total_pgpgin 2239392
total_pgpgout 1545190
total_pgfault 279477
total_pgmajfault 6
total_inactive_anon 655360
total_active_anon 20480
total_inactive_file 2653802496
total_active_file 188973056
total_unevictable 0
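To make the split explicit, here is a minimal Go sketch that parses memory.stat text and compares the cache and rss counters. The helper name parseMemoryStat is hypothetical (it is not Podman code), and the sample is trimmed to four lines copied from the output above.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sample holds a few lines copied from the memory.stat output above.
const sample = `cache 2842775552
rss 675840
inactive_file 2653802496
active_file 188973056
`

// parseMemoryStat is a hypothetical helper (not Podman code): it turns
// "key value" lines from a cgroup v1 memory.stat file into a map.
// Lines without exactly two fields, or with a non-numeric value, are skipped.
func parseMemoryStat(text string) map[string]uint64 {
	stats := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue
		}
		stats[fields[0]] = v
	}
	return stats
}

func main() {
	stats := parseMemoryStat(sample)
	total := stats["cache"] + stats["rss"]
	// Report how much of cache+rss is page cache rather than process memory.
	fmt.Printf("cache: %d bytes (%.2f%% of cache+rss)\n",
		stats["cache"], 100*float64(stats["cache"])/float64(total))
	fmt.Printf("rss:   %d bytes\n", stats["rss"])
}
```

In this sample, inactive_file plus active_file adds up exactly to cache, which is why nearly all of the reported usage can be reclaimed.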

You can hint to the kernel that a file's cached pages may be released with posix_fadvise. For example, I've tried the following C program:

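#include <fcntl.h>
#include <unistd.h>

int main() {
    /* Advise on stdin, which the shell redirects from the file; length 0 means "to end of file". */
    return posix_fadvise(STDIN_FILENO, 0, 0, POSIX_FADV_DONTNEED) ? 1 : 0;
}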

and from the container:

# ./try-release-file-from-cache < output.dat

and after a while:

# podman stats --no-stream --no-reset
ID            NAME              CPU %       MEM USAGE / LIMIT  MEM %       NET IO             BLOCK IO           PIDS        CPU TIME       AVG CPU %
5db629b04d96  unruffled_darwin  1.99%       347.2MB / 3.843GB  9.04%       190.2MB / 409.4kB  27.49MB / 5.935MB  1           42.293771618s  1.99%

I am closing the issue since Podman is just reporting the information it gets from the kernel, but feel free to comment further.

@giuseppe Thank you very much for your help. I wrongly assumed that podman behaves the same way as docker. This is an excerpt from the documentation for docker stats:
On Linux, the Docker CLI reports memory usage by subtracting cache usage from the total memory usage.

So in your opinion, @giuseppe, what would be the best way to monitor actual memory usage?

Thanks for the additional info. I'll take another look and compare with Docker.

@giuseppe I just want to add that I checked the Docker source code, and it has the following function:

// calculateMemUsageUnixNoCache calculate memory usage of the container.
// Cache is intentionally excluded to avoid misinterpretation of the output.
// On cgroup v1 host, the result is `mem.Usage - mem.Stats["total_inactive_file"]` .
// On cgroup v2 host, the result is `mem.Usage - mem.Stats["inactive_file"] `.
// This definition is consistent with cadvisor and containerd/CRI.
// *
// *
// On Docker 19.03 and older, the result was `mem.Usage - mem.Stats["cache"]`.
// See for the background.
func calculateMemUsageUnixNoCache(mem types.MemoryStats) float64 {
	// cgroup v1
	if v, isCgroup1 := mem.Stats["total_inactive_file"]; isCgroup1 && v < mem.Usage {
		return float64(mem.Usage - v)
	}
	// cgroup v2
	if v := mem.Stats["inactive_file"]; v < mem.Usage {
		return float64(mem.Usage - v)
	}
	return float64(mem.Usage)
}

this is the actual link for the file
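As a rough illustration of what that subtraction does, the sketch below applies the same rule to the memory.stat values posted earlier in this thread. The memoryStats type and the noCacheUsage name are stand-ins, not Docker's or Podman's actual types. The memory.usage_in_bytes value was not shown, so usage is assumed here to be total_cache + total_rss.

```go
package main

import "fmt"

// memoryStats is a hypothetical stand-in for the stats type Docker uses.
type memoryStats struct {
	Usage uint64
	Stats map[string]uint64
}

// noCacheUsage mirrors the rule quoted above: subtract total_inactive_file
// (cgroup v1) or inactive_file (cgroup v2) from usage when it is smaller.
func noCacheUsage(mem memoryStats) uint64 {
	// cgroup v1
	if v, ok := mem.Stats["total_inactive_file"]; ok && v < mem.Usage {
		return mem.Usage - v
	}
	// cgroup v2
	if v := mem.Stats["inactive_file"]; v < mem.Usage {
		return mem.Usage - v
	}
	return mem.Usage
}

func main() {
	mem := memoryStats{
		// Assumption: usage approximated as total_cache + total_rss,
		// because the real memory.usage_in_bytes value was not posted.
		Usage: 2842775552 + 675840,
		Stats: map[string]uint64{"total_inactive_file": 2653802496},
	}
	fmt.Println(noCacheUsage(mem)) // prints 189648896
}
```

Under that assumption the reported figure drops from about 2.84 GB to about 190 MB, which is active_file plus rss. Only inactive file pages are excluded, so recently used cache still counts.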

opened a PR: #1643

We should probably match Docker's behaviour. Thanks @abdelaziz-ouhammou for diagnosing this.