`youki` fails to run in `docker-in-docker` with `cgroups v1`
jprendes opened this issue
Reproduction:
cargo install cross --git https://github.com/cross-rs/cross
git clone --branch dind git@github.com:jprendes/youki.git
cd youki
cross build --features systemd,v1,v2 --bin youki
./dind.sh
Removing `--runtime=youki` from `dind.sh`, the example runs fine, but with youki it fails.
Old error message
The error I receive is:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/278419ddb31fde6701841c5b302a632751de2a5a6aa051a62530b12ee8ab4763/log.json: no such file or directory): /youki did not terminate successfully: exit status 1: unknown.
ERRO[0003] error waiting for container:
This originates from trying to run runwasi in docker-in-docker. In that case it works fine on cgroups v2 but fails on cgroups v1, and the error occurs while mounting cgroups.
New error message
After working around the issue with journald, I get an error when using youki with dind + cgroups v1.
libcontainer::rootfs::mount: failed to canonicalize "/sys/fs/cgroup/systemd/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6": No such file or directory (os error 2)
See #2528 (comment) for log context.
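The canonicalize failure above is ordinary `std::fs::canonicalize` behavior when the source path recorded in the host's cgroup hierarchy is not visible inside the inner container's mount namespace. A minimal, standalone sketch (the path below is a hypothetical stand-in, not the actual container ID):

```rust
use std::fs;
use std::io::ErrorKind;

// Probe whether a path can be canonicalized; on failure, return the
// io::ErrorKind so callers can distinguish "missing" from other errors.
fn canonicalize_kind(path: &str) -> Result<(), ErrorKind> {
    fs::canonicalize(path).map(|_| ()).map_err(|e| e.kind())
}

fn main() {
    // Hypothetical stand-in for the host-side cgroup directory that the
    // inner (dind) container cannot see in its own mount namespace.
    let missing = "/sys/fs/cgroup/systemd/docker/does-not-exist";
    // Mirrors "No such file or directory (os error 2)" from the youki log.
    assert_eq!(canonicalize_kind(missing), Err(ErrorKind::NotFound));
    println!("canonicalize failed with NotFound, as in the youki log");
}
```

This is why the mount step fails before any `mount(2)` call is even attempted: the bind source is resolved first, and resolution itself returns `NotFound`.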
From the dockerd log files (cleanup warnings):
time="2023-11-14T09:19:48Z" level=warning msg="failed to remove runc container" error="/youki did not terminate successfully: exit status 1: failed to initialize observability: No such file or directory (os error 2)\nError: No such file or directory (os error 2)\n" runtime=io.containerd.runc.v2
time="2023-11-14T09:19:48Z" level=warning msg="failed to read init pid file" error="open /run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/0ad434dd1d96105d31e416174c5c4d3b6a721c32586b612218dcad3cb68bb9f6/init.pid: no such file or directory" runtime=io.containerd.runc.v2 namespace=moby
I believe fixing this particular error would be the first step, and not the final solution.
As I mentioned before, runwasi fails on cgroups mounting, so eventually we should hit the same issue.
> I believe fixing this particular error would be the first step, and not the final solution.
I think this is a side effect of the failure to create the container, just showing up as a different error. The `failed to read init pid file` is mostly because the mounting failed, hence the init process creation didn't happen, hence there is no pid file created. I think this would get fixed automatically by addressing the original issue; it is not a separate problem.
The above error is due to journald logging. Working around that, I get to the cgroups error.
2023-11-14T10:35:57.050971Z INFO libcgroups::common: cgroup manager V1 will be used
2023-11-14T10:35:57.051044Z DEBUG libcgroups::v1::manager: Get path for subsystem: cpu
2023-11-14T10:35:57.051816Z DEBUG libcgroups::v1::manager: Get path for subsystem: cpuacct
2023-11-14T10:35:57.052422Z DEBUG libcgroups::v1::manager: Get path for subsystem: cpuset
2023-11-14T10:35:57.053005Z DEBUG libcgroups::v1::manager: Get path for subsystem: devices
2023-11-14T10:35:57.053567Z DEBUG libcgroups::v1::manager: Get path for subsystem: hugetlb
2023-11-14T10:35:57.054149Z DEBUG libcgroups::v1::manager: Get path for subsystem: memory
2023-11-14T10:35:57.054748Z DEBUG libcgroups::v1::manager: Get path for subsystem: pids
2023-11-14T10:35:57.055309Z DEBUG libcgroups::v1::manager: Get path for subsystem: perf_event
2023-11-14T10:35:57.055930Z DEBUG libcgroups::v1::manager: Get path for subsystem: blkio
2023-11-14T10:35:57.056538Z DEBUG libcgroups::v1::manager: Get path for subsystem: net_prio
2023-11-14T10:35:57.057108Z DEBUG libcgroups::v1::manager: Get path for subsystem: net_cls
2023-11-14T10:35:57.057789Z DEBUG libcgroups::v1::manager: Get path for subsystem: freezer
2023-11-14T10:35:57.081832Z DEBUG libcgroups::v1::blkio: Apply blkio cgroup config
2023-11-14T10:35:57.081903Z DEBUG libcgroups::v1::devices: Apply Devices cgroup config
2023-11-14T10:35:57.082138Z DEBUG libcgroups::v1::memory: Apply Memory cgroup config
2023-11-14T10:35:57.082191Z DEBUG libcontainer::namespaces: unshare or setns: LinuxNamespace { typ: Pid, path: None }
2023-11-14T10:35:57.082381Z DEBUG libcontainer::process::channel: sending init pid (Pid(167))
2023-11-14T10:35:57.082664Z DEBUG libcontainer::namespaces: unshare or setns: LinuxNamespace { typ: Uts, path: None }
2023-11-14T10:35:57.082753Z DEBUG libcontainer::namespaces: unshare or setns: LinuxNamespace { typ: Ipc, path: None }
2023-11-14T10:35:57.082809Z DEBUG libcontainer::namespaces: unshare or setns: LinuxNamespace { typ: Network, path: None }
2023-11-14T10:35:57.083165Z DEBUG libcontainer::namespaces: unshare or setns: LinuxNamespace { typ: Mount, path: None }
2023-11-14T10:35:57.083306Z DEBUG libcontainer::rootfs::rootfs: prepare rootfs rootfs="/var/lib/docker/rootfs/overlayfs/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0"
2023-11-14T10:35:57.084111Z DEBUG libcontainer::rootfs::rootfs: mount root fs "/var/lib/docker/rootfs/overlayfs/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0"
2023-11-14T10:35:57.084162Z DEBUG libcontainer::rootfs::mount: mounting Mount { destination: "/proc", typ: Some("proc"), source: Some("proc"), options: Some(["nosuid", "noexec", "nodev"]) }
2023-11-14T10:35:57.084312Z DEBUG libcontainer::rootfs::mount: mounting Mount { destination: "/dev", typ: Some("tmpfs"), source: Some("tmpfs"), options: Some(["nosuid", "strictatime", "mode=755", "size=65536k"]) }
2023-11-14T10:35:57.084439Z DEBUG libcontainer::rootfs::mount: mounting Mount { destination: "/dev/pts", typ: Some("devpts"), source: Some("devpts"), options: Some(["nosuid", "noexec", "newinstance", "ptmxmode=0666", "mode=0620", "gid=5"]) }
2023-11-14T10:35:57.084526Z DEBUG libcontainer::rootfs::mount: mounting Mount { destination: "/sys", typ: Some("sysfs"), source: Some("sysfs"), options: Some(["nosuid", "noexec", "nodev", "ro"]) }
2023-11-14T10:35:57.084616Z DEBUG libcontainer::rootfs::mount: mounting Mount { destination: "/sys/fs/cgroup", typ: Some("cgroup"), source: Some("cgroup"), options: Some(["ro", "nosuid", "noexec", "nodev"]) }
2023-11-14T10:35:57.084675Z DEBUG libcontainer::rootfs::mount: mounting cgroup v1 filesystem
2023-11-14T10:35:57.084711Z DEBUG libcontainer::rootfs::mount: mounting Mount { destination: "/sys/fs/cgroup", typ: Some("tmpfs"), source: Some("tmpfs"), options: Some(["noexec", "nosuid", "nodev", "mode=755"]) }
2023-11-14T10:35:57.085640Z DEBUG libcontainer::rootfs::mount: cgroup mounts: ["/sys/fs/cgroup/systemd", "/sys/fs/cgroup/cpu,cpuacct", "/sys/fs/cgroup/devices", "/sys/fs/cgroup/rdma", "/sys/fs/cgroup/hugetlb", "/sys/fs/cgroup/pids", "/sys/fs/cgroup/net_cls,net_prio", "/sys/fs/cgroup/cpuset", "/sys/fs/cgroup/misc", "/sys/fs/cgroup/blkio", "/sys/fs/cgroup/freezer", "/sys/fs/cgroup/memory", "/sys/fs/cgroup/perf_event"]
2023-11-14T10:35:57.085798Z DEBUG libcontainer::rootfs::mount: Process cgroups: {"memory": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "misc": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6", "": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6", "perf_event": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "freezer": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "cpuset": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "net_cls,net_prio": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "hugetlb": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "devices": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "rdma": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6", "name=systemd": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6", "pids": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "blkio": "/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0", "cpu,cpuacct": 
"/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6/docker/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0"}
2023-11-14T10:35:57.085938Z DEBUG libcontainer::rootfs::mount: cgroup root: "/var/lib/docker/rootfs/overlayfs/c08259f3ad0c2c0c99c8016da601c212fd304c8b4aa8391d2b70486a2b73d7e0/sys/fs/cgroup"
2023-11-14T10:35:57.085964Z DEBUG libcontainer::rootfs::mount: Mounting (emulated) "systemd" cgroup subsystem
2023-11-14T10:35:57.085994Z DEBUG libcontainer::rootfs::mount: Mounting emulated cgroup subsystem: Mount { destination: "/sys/fs/cgroup/systemd", typ: Some("bind"), source: Some("/sys/fs/cgroup/systemd/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6"), options: Some(["rw", "rbind"]) }
2023-11-14T10:35:57.086030Z DEBUG libcontainer::rootfs::mount: mounting Mount { destination: "/sys/fs/cgroup/systemd", typ: Some("bind"), source: Some("/sys/fs/cgroup/systemd/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6"), options: Some(["rw", "rbind"]) }
2023-11-14T10:35:57.086112Z ERROR libcontainer::rootfs::mount: failed to canonicalize "/sys/fs/cgroup/systemd/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6": No such file or directory (os error 2)
2023-11-14T10:35:57.086150Z ERROR libcontainer::rootfs::mount: failed to mount Mount { destination: "/sys/fs/cgroup/systemd", typ: Some("bind"), source: Some("/sys/fs/cgroup/systemd/docker/af0c557cb9806654e1a9eac1de4a12afda57bf9f60952e4c3db016668c554ea6"), options: Some(["rw", "rbind"]) }: io error
2023-11-14T10:35:57.086189Z ERROR libcontainer::rootfs::mount: failed to mount systemd cgroup hierarchy: io error
2023-11-14T10:35:57.086227Z ERROR libcontainer::rootfs::mount: failed to mount cgroup v2: io error
2023-11-14T10:35:57.086253Z ERROR libcontainer::process::container_init_process: failed to prepare rootfs err=Mount(Io(Os { code: 2, kind: NotFound, message: "No such file or directory" }))
2023-11-14T10:35:57.086291Z ERROR libcontainer::process::container_intermediate_process: failed to initialize container process: failed to prepare rootfs
2023-11-14T10:35:57.086653Z ERROR libcontainer::process::container_main_process: failed to wait for init ready: failed to receive. "waiting for init ready". BrokenChannel
2023-11-14T10:35:57.086701Z ERROR libcontainer::container::builder_impl: failed to run container process err=Channel(ReceiveError { msg: "waiting for init ready", source: BrokenChannel })
> I think this is a side effect of failure to create container, just showing up as different error. The `failed to read init pid file` is mostly because the mounting failed, hence the init process creation didn't happen, hence there is no pid file created. I think this would get automatically fixed by addressing the original issue, not a separate problem.
I think the interesting part of that error message was the `failed to initialize observability` part.
Working around that issue, I get to the same issue as in runwasi.
What happens if we use the cgroup v2 driver instead of the systemd driver?
I might be misunderstanding you, but the host has cgroups v1; using a v2 driver would not be correct, right?
It looks like when I run it, it's taking the path of `setup_emulated_subsystem` instead of `setup_namespaced_subsystem`. That branch is controlled by `cgroups_ns`.
If I ignore `cgroups_ns` and unconditionally use `setup_namespaced_subsystem`, everything works fine.
I'm not familiar with the need for emulation here. Why do we need it, and how does it relate to `cgroups_ns`?
IIUC, runc doesn't have an "emulated" mode:
https://github.com/opencontainers/runc/blob/main/libcontainer/rootfs_linux.go#L253
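A rough sketch of the branch being discussed. The names mirror the `setup_emulated_subsystem` / `setup_namespaced_subsystem` functions and the `cgroups_ns` flag mentioned above, but the logic here is illustrative only, not youki's actual code:

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
enum Strategy {
    // Bind-mount the process's own cgroup path from the host hierarchy,
    // e.g. /sys/fs/cgroup/systemd/docker/<id> onto /sys/fs/cgroup/systemd.
    Emulated,
    // Mount a fresh cgroup filesystem; inside a cgroup namespace the
    // kernel scopes the view to the container's subtree automatically.
    Namespaced,
}

// Hypothetical decision function: without a new cgroup namespace,
// fall back to "emulating" the hierarchy via bind mounts.
fn pick(in_cgroup_ns: bool) -> Strategy {
    if in_cgroup_ns {
        Strategy::Namespaced
    } else {
        Strategy::Emulated
    }
}

// The emulated path only works if the bind source (derived from
// /proc/self/cgroup) is actually visible in the current mount
// namespace, which is not the case inside the inner dind container.
fn bind_source_visible(src: &str) -> bool {
    Path::new(src).is_dir()
}

fn main() {
    assert_eq!(pick(false), Strategy::Emulated);
    assert_eq!(pick(true), Strategy::Namespaced);
    // Hypothetical host-side path, invisible from the inner container:
    assert!(!bind_source_visible("/sys/fs/cgroup/systemd/docker/no-such-id"));
    println!("emulated bind source missing -> canonicalize/mount fails");
}
```

Under this reading, forcing the namespaced strategy sidesteps the missing bind source entirely, which would explain why ignoring `cgroups_ns` makes things work.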
I think I have a fix for it: main...jprendes:youki:fix-dind
It still needs some kind of test. Unfortunately the docker image is built on alpine, so it needs a musl build of youki.
> I think I have a fix for it: main...jprendes:youki:fix-dind
> It still needs some kind of test. Unfortunately the docker image is built on alpine, so it needs a musl build of youki.
Thanks for this fix! I also experience this issue when developing youki in a dev container.
I'll create a PR after adding some tests.
> I might be misunderstanding you, but the host has cgroups v1; using a v2 driver would not be correct, right?

Yes, you can see this information using `youki info`.
> What happens if we use the cgroup v2 driver instead of the systemd driver?

How would I do that?
`youki info` output:
$ sudo ./target/x86_64-unknown-linux-musl/debug/youki info
DEBUG youki: started by user 0 with ArgsOs { inner: ["./target/x86_64-unknown-linux-musl/debug/youki", "info"] }
Version 0.3.0
Commit c7567ab4
Kernel-Release 5.15.0-88-generic
Kernel-Version #98~20.04.1-Ubuntu SMP Mon Oct 9 16:43:45 UTC 2023
Architecture x86_64
Operating System Ubuntu 20.04.6 LTS
Cores 8
Total Memory 7936
Cgroup setup hybrid
Cgroup mounts
blkio /sys/fs/cgroup/blkio
cpu /sys/fs/cgroup/cpu,cpuacct
cpuacct /sys/fs/cgroup/cpu,cpuacct
cpuset /sys/fs/cgroup/cpuset
devices /sys/fs/cgroup/devices
freezer /sys/fs/cgroup/freezer
hugetlb /sys/fs/cgroup/hugetlb
memory /sys/fs/cgroup/memory
net_cls /sys/fs/cgroup/net_cls,net_prio
net_prio /sys/fs/cgroup/net_cls,net_prio
perf_event /sys/fs/cgroup/perf_event
pids /sys/fs/cgroup/pids
unified /sys/fs/cgroup/unified
CGroup v2 controllers
cpu detached
cpuset detached
hugetlb detached
io detached
memory detached
pids detached
device attached
Namespaces enabled
mount enabled
uts enabled
ipc enabled
user enabled
pid enabled
network enabled
cgroup enabled
Capabilities
CAP_BPF available
CAP_PERFMON available
CAP_CHECKPOINT_RESTORE available
Thanks. It looks like it is using cgroup v2. In that case, if you enable the systemd feature, you use systemd:
youki/crates/libcgroups/src/common.rs, lines 363 to 372 (at c538361)
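For readers without the source open, here is a hedged sketch of what that manager selection presumably looks like. This is illustrative only, not the actual `common.rs` code; the enum and function names are made up for the example:

```rust
// Hypothetical model of the host cgroup layout, as reported by `youki info`.
#[derive(Debug, PartialEq)]
enum CgroupSetup {
    Legacy,  // v1 only
    Hybrid,  // v1 hierarchies plus an unpopulated v2 "unified" mount
    Unified, // v2 only
}

#[derive(Debug, PartialEq)]
enum Manager {
    V1,
    V2,
    Systemd,
}

// Sketch of the selection: the systemd manager is only relevant on a
// unified (v2) host; legacy and hybrid hosts get the plain v1 manager.
fn select(setup: CgroupSetup, systemd_requested: bool) -> Manager {
    match setup {
        CgroupSetup::Legacy | CgroupSetup::Hybrid => Manager::V1,
        CgroupSetup::Unified if systemd_requested => Manager::Systemd,
        CgroupSetup::Unified => Manager::V2,
    }
}

fn main() {
    // A hybrid host (as in the `youki info` output above) selects V1,
    // matching the "cgroup manager V1 will be used" line in the earlier log.
    assert_eq!(select(CgroupSetup::Hybrid, true), Manager::V1);
    assert_eq!(select(CgroupSetup::Unified, true), Manager::Systemd);
    assert_eq!(select(CgroupSetup::Unified, false), Manager::V2);
    println!("manager selection sketch ok");
}
```

If this model is right, building without the `systemd` feature would not change anything on this host, since the hybrid setup already routes to the v1 manager either way.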
> Cgroup setup hybrid

So you mean building without the `systemd` feature, and trying again?
cross build --features v1,v2 --bin youki
That fails in the same way:
2023-11-16T12:07:36.598531Z DEBUG libcontainer::rootfs::mount: 307: Mounting (emulated) "systemd" cgroup subsystem
2023-11-16T12:07:36.598589Z DEBUG libcontainer::rootfs::mount: 347: Mounting emulated cgroup subsystem: Mount { destination: "/sys/fs/cgroup/systemd", typ: Some("bind"), source: Some("/sys/fs/cgroup/systemd/docker/340bef13dfc80b834d44b1fb8c3efe96a9ec59d553f194cc6a63f351411ad692"), options: Some(["rw", "rbind"]) }
2023-11-16T12:07:36.598656Z DEBUG libcontainer::rootfs::mount: 76: mounting Mount { destination: "/sys/fs/cgroup/systemd", typ: Some("bind"), source: Some("/sys/fs/cgroup/systemd/docker/340bef13dfc80b834d44b1fb8c3efe96a9ec59d553f194cc6a63f351411ad692"), options: Some(["rw", "rbind"]) }
2023-11-16T12:07:36.598809Z ERROR libcontainer::rootfs::mount: 506: failed to canonicalize "/sys/fs/cgroup/systemd/docker/340bef13dfc80b834d44b1fb8c3efe96a9ec59d553f194cc6a63f351411ad692": No such file or directory (os error 2)
2023-11-16T12:07:36.598868Z ERROR libcontainer::rootfs::mount: 127: failed to mount Mount { destination: "/sys/fs/cgroup/systemd", typ: Some("bind"), source: Some("/sys/fs/cgroup/systemd/docker/340bef13dfc80b834d44b1fb8c3efe96a9ec59d553f194cc6a63f351411ad692"), options: Some(["rw", "rbind"]) }: io error
2023-11-16T12:07:36.598935Z ERROR libcontainer::rootfs::mount: 350: failed to mount systemd cgroup hierarchy: io error
2023-11-16T12:07:36.599007Z ERROR libcontainer::rootfs::mount: 90: failed to mount cgroup v1: io error
2023-11-16T12:07:36.599051Z ERROR libcontainer::process::container_init_process: 326: failed to prepare rootfs err=Mount(Io(Os { code: 2, kind: NotFound, message: "No such file or directory" }))
2023-11-16T12:07:36.599114Z ERROR libcontainer::process::container_intermediate_process: 151: failed to initialize container process: failed to prepare rootfs
2023-11-16T12:07:36.599625Z ERROR libcontainer::process::container_main_process: 153: failed to wait for init ready: failed to receive. "waiting for init ready". BrokenChannel
2023-11-16T12:07:36.599728Z ERROR libcontainer::container::builder_impl: 156: failed to run container process err=Channel(ReceiveError { msg: "waiting for init ready", source: BrokenChannel })