containers / conmon

An OCI container runtime monitor.

Conmon hang after wake from sleep

jurf opened this issue

On my two systems, one running Silverblue 35 and one Silverblue 36, I sporadically get a conmon hang that fully uses one core, right after waking from sleep. It usually happens in the morning.

$ conmon --version
conmon version 2.1.0
commit: 

Unsure what info to provide; if debugging is needed I do not mind trying to get a stack trace with debug symbols.

I am hitting the same problem pretty reliably when coming out of sleep too.

Hey, sorry for the late reply. What is the podman command used to spawn the conmon process?

This is the command line from a currently well-behaved instance; I’ll update it once I can reproduce the bug again:

/usr/bin/conmon --api-version 1 -c 2c611e53f692f811772b306d81e30ce7faa535e902d91d96aa19a9e1bbe0cc8e -u 2c611e53f692f811772b306d81e30ce7faa535e902d91d96aa19a9e1bbe0cc8e -r /usr/bin/crun -b /var/home/jurf/.local/share/containers/storage/overlay-containers/2c611e53f692f811772b306d81e30ce7faa535e902d91d96aa19a9e1bbe0cc8e/userdata -p /run/user/1000/containers/overlay-containers/2c611e53f692f811772b306d81e30ce7faa535e902d91d96aa19a9e1bbe0cc8e/userdata/pidfile -n fedora-toolbox-35 --exit-dir /run/user/1000/libpod/tmp/exits --full-attach -s -l journald --log-level error --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/user/1000/containers/overlay-containers/2c611e53f692f811772b306d81e30ce7faa535e902d91d96aa19a9e1bbe0cc8e/userdata/oci-log --conmon-pidfile /run/user/1000/containers/overlay-containers/2c611e53f692f811772b306d81e30ce7faa535e902d91d96aa19a9e1bbe0cc8e/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/home/jurf/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000/containers --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg cni --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 2c611e53f692f811772b306d81e30ce7faa535e902d91d96aa19a9e1bbe0cc8e

I just hit the problem; here is the related command:

/usr/bin/conmon --api-version 1 -c 3be367bd82add7d05abe9be8094a545ed515cc9a756ebaf1b17c34c8f9251cbc -u 02d60c0da319206c19c7eadab42cc82a9754ee120cacdb6c50ab4fe0f518ac93 -r /usr/bin/crun -b /var/home/mathieuv/.local/share/containers/storage/overlay-containers/3be367bd82add7d05abe9be8094a545ed515cc9a756ebaf1b17c34c8f9251cbc/userdata/02d60c0da319206c19c7eadab42cc82a9754ee120cacdb6c50ab4fe0f518ac93 -p /var/home/mathieuv/.local/share/containers/storage/overlay-containers/3be367bd82add7d05abe9be8094a545ed515cc9a756ebaf1b17c34c8f9251cbc/userdata/02d60c0da319206c19c7eadab42cc82a9754ee120cacdb6c50ab4fe0f518ac93/exec_pid -n fedora-toolbox-35 --exit-dir /var/home/mathieuv/.local/share/containers/storage/overlay-containers/3be367bd82add7d05abe9be8094a545ed515cc9a756ebaf1b17c34c8f9251cbc/userdata/02d60c0da319206c19c7eadab42cc82a9754ee120cacdb6c50ab4fe0f518ac93/exit --full-attach -s -l none --log-level error --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/var/home/mathieuv/.local/share/containers/storage/overlay-containers/3be367bd82add7d05abe9be8094a545ed515cc9a756ebaf1b17c34c8f9251cbc/userdata/02d60c0da319206c19c7eadab42cc82a9754ee120cacdb6c50ab4fe0f518ac93/oci-log -t -i -e --exec-attach --exec-process-spec /var/home/mathieuv/.local/share/containers/storage/overlay-containers/3be367bd82add7d05abe9be8094a545ed515cc9a756ebaf1b17c34c8f9251cbc/userdata/02d60c0da319206c19c7eadab42cc82a9754ee120cacdb6c50ab4fe0f518ac93/exec-process-244645825 --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/home/mathieuv/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1002/containers --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1002/libpod/tmp --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --exec --exit-command-arg 02d60c0da319206c19c7eadab42cc82a9754ee120cacdb6c50ab4fe0f518ac93 --exit-command-arg 3be367bd82add7d05abe9be8094a545ed515cc9a756ebaf1b17c34c8f9251cbc

I'm also having this issue on two separate computers running Silverblue 35.

/usr/bin/conmon --api-version 1 -c 201ecffb89ac75e3d53df3ed427f7dc6a2da8a06c4e51ed9f87affd14cdf3c3f -u 7fdaa6a1819a701c2e01a4a83797749b6c0d06d0cdb882b5e14a57b9a9fe56bb -r /usr/bin/crun -b /var/home/user/.local/share/containers/storage/overlay-containers/201ecffb89ac75e3d53df3ed427f7dc6a2da8a06c4e51ed9f87affd14cdf3c3f/userdata/7fdaa6a1819a701c2e01a4a83797749b6c0d06d0cdb882b5e14a57b9a9fe56bb -p /var/home/user/.local/share/containers/storage/overlay-containers/201ecffb89ac75e3d53df3ed427f7dc6a2da8a06c4e51ed9f87affd14cdf3c3f/userdata/7fdaa6a1819a701c2e01a4a83797749b6c0d06d0cdb882b5e14a57b9a9fe56bb/exec_pid -n fedora-toolbox-35 --exit-dir /var/home/user/.local/share/containers/storage/overlay-containers/201ecffb89ac75e3d53df3ed427f7dc6a2da8a06c4e51ed9f87affd14cdf3c3f/userdata/7fdaa6a1819a701c2e01a4a83797749b6c0d06d0cdb882b5e14a57b9a9fe56bb/exit --full-attach -s -l none --log-level error --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/var/home/user/.local/share/containers/storage/overlay-containers/201ecffb89ac75e3d53df3ed427f7dc6a2da8a06c4e51ed9f87affd14cdf3c3f/userdata/7fdaa6a1819a701c2e01a4a83797749b6c0d06d0cdb882b5e14a57b9a9fe56bb/oci-log -t -i -e --exec-attach --exec-process-spec /var/home/user/.local/share/containers/storage/overlay-containers/201ecffb89ac75e3d53df3ed427f7dc6a2da8a06c4e51ed9f87affd14cdf3c3f/userdata/7fdaa6a1819a701c2e01a4a83797749b6c0d06d0cdb882b5e14a57b9a9fe56bb/exec-process-123356233 --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/home/user/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000/containers --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --exec --exit-command-arg 7fdaa6a1819a701c2e01a4a83797749b6c0d06d0cdb882b5e14a57b9a9fe56bb --exit-command-arg 201ecffb89ac75e3d53df3ed427f7dc6a2da8a06c4e51ed9f87affd14cdf3c3f

And just double-checking: is everyone using v2.1.0, or is anyone hitting this on an earlier version?

My two machines where I have the issue are both on v2.1.0. One Silverblue 35, one Silverblue 36 beta.

Just started hitting this yesterday on my notebook running Silverblue 35. I initially noticed something was off as my notebook fans seemed to be running non-stop.

$ grep ^VERSION= /etc/os-release
VERSION="35.20220418.0 (Silverblue)"

$ conmon --version
conmon version 2.1.0
commit: 

$ rpm -qa conmon
conmon-2.1.0-2.fc35.x86_64

$ podman --version
podman version 3.4.4

[Screenshot from 2022-04-20 08-55-54]

I hit the same issue on Fedora Silverblue 36.
The misbehaving conmon belongs to a container that toolbox generated from the Fedora 35 image.

$ grep ^VERSION= /etc/os-release
VERSION="36.20220415.n.0 (Silverblue Prerelease)"

$ conmon --version
conmon version 2.1.0
commit:

$ rpm -q conmon podman toolbox
conmon-2.1.0-2.fc36.x86_64
podman-4.0.2-1.fc36.x86_64
toolbox-0.0.99.3-4.fc36.x86_64

I am also a Fedora Silverblue 36 user and the same happened to me while using https://github.com/palazzem/archlinux-toolbox

Same issue here. One CPU core is 100% busy with conmon.

I managed to get a strace and a gdb backtrace of a 100% CPU pinned conmon. The issue is pretty straightforward.

strace:

[pid 14804] read(21, 0x7fc32ea5ec90, 16) = -1 EBADF (Bad file descriptor)
[pid 14804] poll([{fd=21, events=POLLIN}], 1, -1) = 1 ([{fd=21, revents=POLLNVAL}])
[pid 14804] read(21, 0x7fc32ea5ec90, 16) = -1 EBADF (Bad file descriptor)
[pid 14804] poll([{fd=21, events=POLLIN}], 1, -1) = 1 ([{fd=21, revents=POLLNVAL}])
[pid 14804] read(21, 0x7fc32ea5ec90, 16) = -1 EBADF (Bad file descriptor)
[pid 14804] poll([{fd=21, events=POLLIN}], 1, -1) = 1 ([{fd=21, revents=POLLNVAL}])
[pid 14804] read(21, 0x7fc32ea5ec90, 16) = -1 EBADF (Bad file descriptor)
[pid 14804] poll([{fd=21, events=POLLIN}], 1, -1) = 1 ([{fd=21, revents=POLLNVAL}])
[pid 14804] read(21, 0x7fc32ea5ec90, 16) = -1 EBADF (Bad file descriptor)
[pid 14804] poll([{fd=21, events=POLLIN}], 1, -1) = 1 ([{fd=21, revents=POLLNVAL}])
[pid 14804] read(21, 0x7fc32ea5ec90, 16) = -1 EBADF (Bad file descriptor)
[pid 14804] poll([{fd=21, events=POLLIN}], 1, -1) = 1 ([{fd=21, revents=POLLNVAL}])
...etc...

gdb thread apply all bt (t a a bt):

Thread 2 (Thread 0x7fc32ea5f640 (LWP 14804) "gmain"):
#0  0x00007fc32ee55baf in __GI___poll (fds=0x55ae87c3eed0, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fc32f0f823d in g_main_context_poll (priority=<optimized out>, n_fds=1, fds=0x55ae87c3eed0, timeout=<optimized out>, context=0x55ae87c3b7b0) at ../glib/gmain.c:4516
#2  g_main_context_iterate.constprop.0 (context=context@entry=0x55ae87c3b7b0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4206
#3  0x00007fc32f0a0940 in g_main_context_iteration (context=0x55ae87c3b7b0, may_block=may_block@entry=1) at ../glib/gmain.c:4276
#4  0x00007fc32f0a0991 in glib_worker_main (data=<optimized out>) at ../glib/gmain.c:6178
#5  0x00007fc32f0cd302 in g_thread_proxy (data=0x55ae87c339e0) at ../glib/gthread.c:827
#6  0x00007fc32eddce1d in start_thread (arg=<optimized out>) at pthread_create.c:442
#7  0x00007fc32ee625e0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 1 (Thread 0x7fc32ea607c0 (LWP 14802) "conmon"):
#0  0x00007fc32ee2ceff in __GI___wait4 (pid=pid@entry=-1, stat_loc=stat_loc@entry=0x7ffcdff603b4, options=options@entry=0, usage=usage@entry=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
#1  0x00007fc32ee2ce7b in __GI___waitpid (pid=pid@entry=-1, stat_loc=stat_loc@entry=0x7ffcdff603b4, options=options@entry=0) at waitpid.c:38
#2  0x000055ae8757de57 in do_exit_command () at src/ctr_exit.c:174
#3  0x00007fc32ed91085 in __run_exit_handlers (status=status@entry=1, listp=0x7fc32ef47838 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:113
#4  0x00007fc32ed91200 in __GI_exit (status=status@entry=1) at exit.c:143
#5  0x000055ae8757e28b in write_sync_fd (fd=4, res=0, message=<optimized out>) at src/parent_pipe_fd.c:54
#6  0x000055ae875790ce in main (argc=<optimized out>, argv=<optimized out>) at src/conmon.c:518

So it's the GLib worker thread that's spinning. Not many fds get registered in that main context, but every main context has an eventfd used for cross-thread wakeups, and someone closed that.

This looks like it was introduced in e2215a1, which is recent enough to be consistent with the bug only being discovered in the last few months.

I took a very brief look at the (small) subset of GLib's API that conmon actually uses, and if I had to make an educated guess, I'd say it's the signal-handler setup that causes the worker thread to be spun up. That's somewhat beside the point, though: the core issue here is that you can't just blindly close all fds like that.
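
To make that concrete, here is a minimal sketch of the kind of "close every fd above stderr" loop being discussed (my own illustration, not the code from e2215a1): a loop like this also closes descriptors GLib opened for itself, including the wakeup eventfd of the worker thread's main context.

#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/* Illustrative only: walk /proc/self/fd and close everything above stderr. */
static void close_all_fds_above_stderr(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return;

    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        int fd = atoi(ent->d_name);      /* "." and ".." parse to 0 and are skipped */
        if (fd > 2 && fd != dirfd(dir))
            close(fd);                   /* this also hits GLib's internal wakeup eventfd */
    }
    closedir(dir);
}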

Thank you for looking into it, @allisonkarlitskaya! @giuseppe, can you PTAL?

Is there any recipe to reproduce the issue? I've tried pausing and resuming a virtual machine, but I've never managed to get conmon stuck in this state.

I've only seen it happen sporadically here, but the bug is definitely triggered every single time do_exit_command() runs. As soon as the "close all fds" code runs, the bug has occurred. Normally you don't notice it, though, because exiting goes quickly, so there may be a 100% CPU spike, but it's very brief (edit: see the detailed explanation below).

The key to seeing this issue is that the wait() on the child process needs to block (see the backtrace of the main thread, above). When I've observed this happening, it's just because the child process simply wasn't done yet. So basically, you need to get conmon to want to exit before its child process has. I don't really understand the lifecycle of conmon (or even what it does, to be honest), but I guess getting into this state involves a second unrelated bug.

Okay. I've done a bit more research.

First: it seems that conmon will attempt to exit when its primary child process has exited, but the waitpid() loop in the atexit() handler (here https://github.com/containers/conmon/blob/main/src/ctr_exit.c#L174) will block until all child processes are done. I consider this to be the "second unrelated bug" mentioned above: conmon shouldn't spend so much time in an atexit() handler. That blocking should be done in the main body of the program, really.
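
For illustration, a rough sketch of the kind of reap loop described above (not the exact code from ctr_exit.c): waitpid(-1, ..., 0) blocks until every child has exited, which is why the atexit() handler can sit there indefinitely while a stray subprocess is still running.

#include <sys/types.h>
#include <sys/wait.h>

/* Rough sketch only: reap children until there are none left.
 * With any child still alive, the waitpid() call blocks, so an
 * atexit() handler built around this loop never returns. */
static void reap_all_children(void)
{
    int status;

    while (waitpid(-1, &status, 0) > 0)
        ;   /* returns -1 with errno == ECHILD once no children remain */
}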

Second: per POSIX, calling close() on a file descriptor that another thread is poll()ing is an undefined operation. On Linux it seems that poll() simply keeps hanging in this case. Since there's only one fd being polled, we'd hang in this undefined state forever, but there's at least one event that causes the kernel to notice the situation: suspend. In that case, poll() returns (POLLNVAL in the strace above, with the subsequent read() failing with EBADF), kicking off the busy loop.
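
A tiny standalone demo of why the loop then spins (an assumed example, not conmon code): once the fd has been closed, poll() no longer blocks at all but reports POLLNVAL immediately, and read() fails with EBADF, exactly the pattern in the strace above.

#include <poll.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

int main(void)
{
    /* Stand-in for the main context's wakeup eventfd. */
    int efd = eventfd(0, 0);
    close(efd);                       /* the "close all fds" step */

    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    for (int i = 0; i < 5; i++) {     /* GLib's worker would loop forever here */
        int n = poll(&pfd, 1, -1);    /* returns immediately with POLLNVAL */
        char buf[16];
        ssize_t r = read(efd, buf, sizeof buf);   /* fails with EBADF */
        printf("poll() = %d (POLLNVAL=%d), read() = %zd\n",
               n, !!(pfd.revents & POLLNVAL), r);
    }
    return 0;
}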

So putting these two ideas together:

  1. Run something under conmon where the primary process exits, but other subprocesses are still running:
     podman exec f36 sh -c 'sleep 1000&'
  2. Suspend your laptop.
  3. Wake your laptop to find conmon eating CPU.

This works 100% of the time for me.

The assumption was that g_main_loop_quit() would terminate the worker thread, but that doesn't seem to be the case. Do you know if there is a way to terminate the GLib worker thread?

g_main_loop_quit() only signals the specified GMainLoop instance to exit. There may even be several of these in a program (as was done for gtk_dialog_run() and such, back in the day).

Once the worker context is started, it will never exit.

If you'd really like to avoid removing the "close all fds" logic, then the other route to go is to avoid g_unix_signal_add().

It's possible to get the same behaviour with fewer hacks using signalfd() and g_unix_fd_add(). This would also let you avoid the jump through SIGUSR1.
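
A rough sketch of that alternative (my own example; the setup and names are illustrative, not conmon's): block the signals, open a signalfd(), and watch it with g_unix_fd_add(), so signal handling stays on the default main context and no GLib worker thread gets created.

#include <glib-unix.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

static gboolean on_signal_fd(gint fd, GIOCondition condition, gpointer user_data)
{
    struct signalfd_siginfo si;

    if (read(fd, &si, sizeof si) == sizeof si)
        g_message("got signal %u from pid %u", si.ssi_signo, si.ssi_pid);
    return G_SOURCE_CONTINUE;
}

int main(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGTERM);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, NULL);               /* required before signalfd() */

    int sfd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);
    g_unix_fd_add(sfd, G_IO_IN, on_signal_fd, NULL);   /* watched on the default context */

    GMainLoop *loop = g_main_loop_new(NULL, FALSE);
    g_main_loop_run(loop);
    return 0;
}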

Out of interest, since this is still causing CPU/fan overheating (especially during this hot summer): is that fix going to land in Fedora soon, or has it not actually been fixed yet? I still have to pkill -9 conmon every day.

New versions of Podman, conmon, and friends should land in Fedora in the next week or two.