Unexpected `chdir` invoked on container `init` and `start`
Mossaka opened this issue · comments
While investigating a performance issue, I observed that the working directory `/run/containerd/io.containerd.runtime.v2.task/<namespace>/<containerid>/` becomes inaccessible or gets deleted after executing the `shim::wait()` call in the runwasi shim process. This deletion prevents the shim process from reading the address file to delete the shim socket. (e.g. ref code)
Logs
I ran bpftrace on the `unlink` and `unlinkat` syscalls for those paths and found that the youki inner process `unlink`s the bundle path before containerd calls `delete-shim` (before process 2611761 gets started).
Process started: /usr/local/bin/containerd-shim-wasmtime-v1 PID: 2611672
Process started: /usr/local/bin/containerd-shim-wasmtime-v1 PID: 2611681
PID 2611707 (youki:[2:INIT]): File unlink in target directory: /run/containerd/io.containerd.runtime.v2.task/default/testwasm/..
PID 2611681 (client_handler): File unlinkat in target directory: /run/containerd/wasmtime/default/testwasm
Process started: /usr/local/bin/containerd-shim-wasmtime-v1 PID: 2611761
PID 569984 (containerd): File unlinkat in target directory: /run/containerd/io.containerd.runtime.v2.task/default/testwasm/..
PID 569984 (containerd): File unlinkat in target directory: /run/containerd/io.containerd.runtime.v2.task/default/testwasm/..
PID 569984 (containerd): File unlinkat in target directory: /run/containerd/io.containerd.runtime.v2.task/default/.testwasm..
PID 569984 (containerd): File unlinkat in target directory: /run/containerd/io.containerd.runtime.v2.task/default/.testwasm..
PID 569984 (containerd): File unlinkat in target directory: .testwasm
PID 569984 (containerd): File unlinkat in target directory: .testwasm
PID 569984 (containerd): File unlinkat in target directory: /run/containerd/io.containerd.runtime.v2.task/default/testwasm/..
PID 569984 (containerd): File unlinkat in target directory: /run/containerd/io.containerd.runtime.v2.task/default/testwasm/..
PID 2611658 (ctr): File unlinkat in target directory: testwasm-stderr
PID 2611658 (ctr): File unlinkat in target directory: testwasm-stdout
PID 2611658 (ctr): File unlinkat in target directory: testwasm-stdin
Specifically, this caught my attention: PID 2611707 (youki:[2:INIT]): File unlink in target directory: /run/containerd/io.containerd.runtime.v2.task/default/testwasm/..
Question:
I am raising this issue to try to understand why youki does that. This might be the reason why the shim process is not able to delete the socket address after the ttrpc server shuts down.
I'd like to confirm a few things before investigating:
a. Don't the executor and post hook you passed to libcontainer call `unlink`?
b. Is there a minimal set of steps to reproduce this?
c. Could you share the syscalls before and after, to help us understand?
I will try to reproduce this in youki and get back to you later.
Okay, I spent more time tracing the root cause and found that after handling the `Create` request in runwasi, the shim process's current directory has been set to the container `root_directory` (e.g. `/run/youki/<ns>/<id>`) by youki here. And after the `Delete` request, youki cleans up the container path, and so the shim process doesn't have a current directory anymore.
Question: why does youki `chdir` to the container directory at `init` and `container_start`?
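To make the failure mode concrete, here is a minimal sketch (not runwasi or youki code) of what happens to a Linux process whose current directory is removed underneath it. The directory name and the `address` file are hypothetical stand-ins, and the sketch assumes the shim resolves that file relative to its cwd.

```rust
use std::env;
use std::fs;
use std::io;

/// Enters a scratch directory, deletes it, and reports whether
/// (a) looking up the cwd and (b) reading a relative path then fail.
/// The directory and the `address` file are hypothetical stand-ins.
fn cwd_after_removal() -> io::Result<(bool, bool)> {
    let original = env::current_dir()?;
    let dir = env::temp_dir().join(format!("cwd-demo-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("address"), "unix:///hypothetical.sock")?;

    // Stand-in for the chdir done while handling `Create`.
    env::set_current_dir(&dir)?;
    // Stand-in for the cleanup done while handling `Delete`.
    fs::remove_dir_all(&dir)?;

    // On Linux both of these are expected to fail once the cwd is gone.
    let cwd_failed = env::current_dir().is_err();
    let read_failed = fs::read_to_string("address").is_err();

    // Restore a valid cwd so the rest of the process is unaffected.
    env::set_current_dir(&original)?;
    Ok((cwd_failed, read_failed))
}

fn main() -> io::Result<()> {
    let (cwd_failed, read_failed) = cwd_after_removal()?;
    println!("current_dir failed: {cwd_failed}, relative read failed: {read_failed}");
    Ok(())
}
```

The behaviour of a deleted cwd is platform-specific; the sketch assumes Linux.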
@Furisto Hi, Thomas. I'd like to ask about your comment https://github.com/containers/youki/pull/143/files#r673503679. Does it assume that `console_socket` is a relative path?
Is there a problem here?
youki/crates/libcontainer/src/tty.rs
Lines 94 to 99 in 377a7ca
Sorry, but I've created another PR to fix it.
#2780
Hey @Mossaka, the related PR will be released soon, but I had a question about this issue:
You mentioned that
... youki at here. And after the Delete request, youki cleans up the container path, and so the shim process doesn't have a current directory anymore.
Question: why does youki chdir to the container directory at init and container_start?
I'm not sure why youki setting the cwd in the container init process would cause any issues with the shim. The start, run and delete youki processes (created using `youki create`, `youki start` and `youki delete` resp.) run independently of each other, so probably the only potential cause of the removed dir would be in the delete process, right? What would be the issue with the start and run processes (and by extension init)?
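For context on this question, the working directory is a per-process property shared by all threads, so a `chdir` made inside a library call changes the cwd of the whole calling process. The sketch below illustrates only that general point. It assumes, as the earlier comments suggest, that the shim calls libcontainer in-process rather than spawning separate youki CLI processes. `library_call_that_chdirs` is a hypothetical stand-in, not a libcontainer function.

```rust
use std::env;
use std::io;
use std::path::PathBuf;
use std::thread;

/// Hypothetical stand-in for a library call that changes directory
/// as a side effect of its work.
fn library_call_that_chdirs(target: PathBuf) -> io::Result<()> {
    env::set_current_dir(target)
}

/// Returns the caller's cwd before and after a worker thread runs the call.
fn cwd_before_and_after() -> io::Result<(PathBuf, PathBuf)> {
    let before = env::current_dir()?;
    let target = env::temp_dir();

    // The call runs on another thread, yet the cwd it sets is process-wide.
    thread::spawn(move || library_call_that_chdirs(target))
        .join()
        .expect("worker thread panicked")?;

    let after = env::current_dir()?;
    // Restore the original cwd before returning.
    env::set_current_dir(&before)?;
    Ok((before, after))
}

fn main() -> io::Result<()> {
    let (before, after) = cwd_before_and_after()?;
    println!("before: {}", before.display());
    println!("after:  {}", after.display());
    Ok(())
}
```

By contrast, a `chdir` performed in a separately spawned process would leave the parent's cwd untouched, which matches the independence described for the CLI case.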