google / gvisor

Application Kernel for Containers

Home Page: https://gvisor.dev

--overlay2=none + supervisord - panic: interface conversion: interface {} is nil, not *gofer.lisafsDentry

jcodybaker opened this issue

Description

When running gVisor with the --overlay2=none config, supervisord (with unix_http_server configured) causes gVisor to panic.

Here's the panic from the gVisor debug log. Full log available at runsc-debug.log.

panic: interface conversion: interface {} is nil, not *gofer.lisafsDentry

goroutine 40 gp=0xc00047ddc0 m=14 mp=0xc00040a008 [running]:
panic({0x10e9d40?, 0xc0004f5260?})
        GOROOT/src/runtime/panic.go:779 +0x158 fp=0xc0002973a8 sp=0xc0002972f8 pc=0x43cb78
runtime.panicdottypeE(0x0, 0x1294f60, 0x10af900)
        GOROOT/src/runtime/iface.go:262 +0x65 fp=0xc0002973c8 sp=0xc0002973a8 pc=0x40ee65
gvisor.dev/gvisor/pkg/sentry/fsimpl/gofer.(*dentry).link(0xc0002741e0?, {0x14ed218?, 0xc00047f508?}, 0x0?, {0xc0007d8b05?, 0xe1d20c?})
        pkg/sentry/fsimpl/gofer/dentry_impl.go:359 +0xc9 fp=0xc000297408 sp=0xc0002973c8 pc=0xe04849
gvisor.dev/gvisor/pkg/sentry/fsimpl/gofer.(*filesystem).LinkAt.func1(0xc0001b7b08, {0xc0007d8b05, 0xf}, 0xf?)
        pkg/sentry/fsimpl/gofer/filesystem.go:812 +0xfb fp=0xc000297460 sp=0xc000297408 pc=0xe10d9b
gvisor.dev/gvisor/pkg/sentry/fsimpl/gofer.(*filesystem).doCreateAt(0xc0000e4500, {0x14ed218, 0xc00047f508}, 0xc0002a5208, 0x0, 0xc000297680, 0x0)
        pkg/sentry/fsimpl/gofer/filesystem.go:510 +0x8ea fp=0xc000297640 sp=0xc000297460 pc=0xe0e3ca
gvisor.dev/gvisor/pkg/sentry/fsimpl/gofer.(*filesystem).LinkAt(0xc0002b8c60?, {0x14ed218?, 0xc00047f508?}, 0xc0002a5208?, {0xc0000cd4a0?, 0xc0007d5208?})
        pkg/sentry/fsimpl/gofer/filesystem.go:792 +0x7d fp=0xc0002976c0 sp=0xc000297640 pc=0xe10bfd
gvisor.dev/gvisor/pkg/sentry/vfs.(*VirtualFilesystem).LinkAt(0xc0002b8c60, {0x14ed218, 0xc00047f508}, 0xc0002741e0, 0xc0007d8b01?, 0xc000297820)
        pkg/sentry/vfs/vfs.go:332 +0x187 fp=0xc000297730 sp=0xc0002976c0 pc=0x78c1a7
gvisor.dev/gvisor/pkg/sentry/syscalls/linux.linkat(0xc00047f508, 0xffffff9c, 0x4?, 0xffffff9c, 0x7f48d9a56390, 0x0)
        pkg/sentry/syscalls/linux/sys_file.go:1054 +0x36a fp=0xc000297910 sp=0xc000297730 pc=0xb2268a
gvisor.dev/gvisor/pkg/sentry/syscalls/linux.Link(0x0?, 0xc00047f508?, {{0x7f48da61d190}, {0x7f48d9a56390}, {0x0}, {0xfffffffffffff11a}, {0x0}, {0xffffffff}})
        pkg/sentry/syscalls/linux/sys_file.go:1013 +0x27 fp=0xc000297950 sp=0xc000297910 pc=0xb22287
gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).executeSyscall(0xc00047f508, 0x56, {{0x7f48da61d190}, {0x7f48d9a56390}, {0x0}, {0xfffffffffffff11a}, {0x0}, {0xffffffff}})
        pkg/sentry/kernel/task_syscall.go:143 +0x673 fp=0xc000297c90 sp=0xc000297950 pc=0x9c3cd3
gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).doSyscallInvoke(0xc00047f508, 0x56, {{0x7f48da61d190}, {0x7f48d9a56390}, {0x0}, {0xfffffffffffff11a}, {0x0}, {0xffffffff}})
        pkg/sentry/kernel/task_syscall.go:323 +0x45 fp=0xc000297ce8 sp=0xc000297c90 pc=0x9c4e85
gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).doSyscallEnter(0xc00047f508, 0x56, {{0x7f48da61d190}, {0x7f48d9a56390}, {0x0}, {0xfffffffffffff11a}, {0x0}, {0xffffffff}})
        pkg/sentry/kernel/task_syscall.go:283 +0x65 fp=0xc000297d38 sp=0xc000297ce8 pc=0x9c4b85
gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).doSyscall(0xc00047f508?)
        pkg/sentry/kernel/task_syscall.go:258 +0x2e5 fp=0xc000297e30 sp=0xc000297d38 pc=0x9c4905
gvisor.dev/gvisor/pkg/sentry/kernel.(*runApp).execute(0xc00014b4d0?, 0xc00047f508)
        pkg/sentry/kernel/task_run.go:263 +0xef7 fp=0xc000297f48 sp=0xc000297e30 pc=0x9b8d97
gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).run(0xc00047f508, 0x1)
        pkg/sentry/kernel/task_run.go:98 +0x1e2 fp=0xc000297fc0 sp=0xc000297f48 pc=0x9b7822
gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).Start.gowrap1()
        pkg/sentry/kernel/task_start.go:391 +0x25 fp=0xc000297fe0 sp=0xc000297fc0 pc=0x9c25c5
runtime.goexit({})
        src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000297fe8 sp=0xc000297fe0 pc=0x478c41
created by gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).Start in goroutine 1
        pkg/sentry/kernel/task_start.go:391 +0xe5

Steps to reproduce

runsc must be configured with --overlay2=none:

/usr/local/bin/runsc install -- --debug --debug-log=/tmp/runsc-debug.log --strace --overlay2=none
systemctl restart docker
docker run --rm --runtime=runsc jcodybaker/gvisor-supervisord-crash:latest

Reproducer Image (Docker Hub): https://hub.docker.com/layers/jcodybaker/gvisor-supervisord-crash/latest/images/sha256-e84134b1f2267b7b7f678232c3b6c6394a2fa8056d28de5bcc2ae3818497c176?context=repo

GitHub Dockerfile + config: https://github.com/jcodybaker/apps-8443-repro

runsc version

runsc version release-20240305.0
spec: 1.1.0-rc.1

docker version (if using docker)

Docker version 25.0.4, build 1a576c5

uname

Linux gvisor-dev 6.1.0-9-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.27-1 (2023-05-08) x86_64 GNU/Linux

kubectl (if using Kubernetes)

No response

repo state (if built from source)

No response

runsc debug logs (if available)

[runsc-debug.log](https://github.com/google/gvisor/files/14564720/runsc-debug.log)

This looks to be due to dentry.impl being nil in dentry.link. Per this comment, "If impl is nil, this dentry represents a synthetic file". Yet the comment in dentry.link says "Preconditions: !d.isSynthetic().". It sounds like that isn't actually guaranteed in this code path.

The comment in dentry.impl says that the current implementation of synthetic dentries only supports sockets, pipes, and directories. This workload is trying to create a hard link (not sure to what), so perhaps this is a known deficiency? But gVisor still shouldn't panic.
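The panic message itself is what Go prints when an unchecked type assertion is applied to a nil interface value. The sketch below reproduces that mechanism in isolation. The `dentry` and `lisafsDentry` types here are simplified stand-ins invented for illustration, not gVisor's real definitions, and the helper names are hypothetical.

```go
package main

import "fmt"

// lisafsDentry and dentry are simplified stand-ins for illustration only;
// they are NOT gVisor's actual types.
type lisafsDentry struct{ name string }

type dentry struct {
	impl any // nil models a synthetic dentry
}

// unsafeImplName performs an unchecked type assertion, so it panics
// when impl is nil (or holds any other type).
func unsafeImplName(d *dentry) string {
	return d.impl.(*lisafsDentry).name
}

// safeImplName uses the comma-ok form and reports failure instead of panicking.
func safeImplName(d *dentry) (string, bool) {
	ld, ok := d.impl.(*lisafsDentry)
	if !ok {
		return "", false
	}
	return ld.name, true
}

// panicMessage runs f and returns the recovered panic text, or "" if f did not panic.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	synthetic := &dentry{} // impl is nil
	fmt.Println("unchecked:", panicMessage(func() { unsafeImplName(synthetic) }))
	_, ok := safeImplName(synthetic)
	fmt.Println("checked ok:", ok)
}
```

The comma-ok form is the usual way to let a caller decide what to do with an unexpected dynamic type rather than crashing the process.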

/cc @ayushr2

TL;DR: Pass either --host-uds=create or --host-uds=all.

D0311 21:21:35.843691    8069 client.go:400] send [channel 0xc00014a870] OpenAtReq{FD: 9, Flags: 0}
D0311 21:21:35.843718    8069 client.go:400] recv [channel 0xc00014a870] OpenAtResp{OpenFD: 10}
D0311 21:21:35.843741    8069 client.go:400] send [channel 0xc00014a870] Getdents64Req{DirFD: 10, Count: -65536}
D0311 21:21:35.843819    8069 client.go:400] recv [channel 0xc00014a870] Getdents64Resp{Dirents: [Dirent64{Ino: 262162, DevMinor: 46, DevMajor: 0, Off: 3, Type: 8, Name: example}]}
D0311 21:21:35.843831    8069 client.go:400] send [channel 0xc00014a870] Getdents64Req{DirFD: 10, Count: 65536}
D0311 21:21:35.843845    8069 client.go:400] recv [channel 0xc00014a870] Getdents64Resp{Dirents: []}
I0311 21:21:35.843853    8069 vfs.go:1046] Skipping internal tmpfs mount for "/tmp" because it's not empty

The /tmp directory is usually backed by a sentry-internal tmpfs. But since it is not empty here (the container filesystem contains the file /tmp/example), the gofer filesystem is used to serve /tmp instead.

I0311 21:21:36.298833    8069 strace.go:561] [   1:   1] supervisord E unlink(0x7f48d9a56390 /tmp/supervisor.sock.1)
D0311 21:21:36.298849    8069 client.go:400] send [channel 0xc00014a870] WalkReq{DirFD: 9, Path: [supervisor.sock.1]}
D0311 21:21:36.298894    8069 client.go:400] recv [channel 0xc00014a870] WalkResp{Status: ComponentDoesNotExist, Inodes: []}
I0311 21:21:36.298908    8069 strace.go:599] [   1:   1] supervisord X unlink(0x7f48d9a56390 /tmp/supervisor.sock.1) = 0 (0x0) errno=2 (no such file or directory) (64.715µs)
I0311 21:21:36.298964    8069 strace.go:567] [   1:   1] supervisord E socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0x0)
I0311 21:21:36.298995    8069 strace.go:605] [   1:   1] supervisord X socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0x0) = 4 (0x4) (12.543µs)
I0311 21:21:36.299045    8069 strace.go:567] [   1:   1] supervisord E bind(0x4 socket:[3], 0x7fbc9c081870 {Family: AF_UNIX, error extracting address: address family not supported by protocol}, 0x18)
D0311 21:21:36.299075    8069 client.go:400] send [channel 0xc00014a870] BindAtReq{DirFD: 9, Mode: S_IFSOCK|0o755, UID: 0, GID: 0, SockType: 1, Name: "supervisor.sock.1"}
D0311 21:21:36.299097    8069 client.go:400] recv [channel 0xc00014a870] ErrorResp{errno: 1}
I0311 21:21:36.299113    8069 strace.go:605] [   1:   1] supervisord X bind(0x4 socket:[3], 0x7fbc9c081870 {Family: AF_UNIX, error extracting address: address family not supported by protocol}, 0x18) = 0 (0x0) (57.132µs)
I0311 21:21:36.299149    8069 strace.go:564] [   1:   1] supervisord E chmod(0x7f48da61d190 /tmp/supervisor.sock.1, 0o700)
I0311 21:21:36.299163    8069 strace.go:602] [   1:   1] supervisord X chmod(0x7f48da61d190 /tmp/supervisor.sock.1, 0o700) = 0 (0x0) (4.625µs)
I0311 21:21:36.299193    8069 strace.go:564] [   1:   1] supervisord E link(0x7f48da61d190 /tmp/supervisor.sock.1, 0x7f48d9a56390 /tmp/supervisor.sock)
panic: interface conversion: interface {} is nil, not *gofer.lisafsDentry
...

The application tries to bind a socket at /tmp/supervisor.sock.1. The BindAt RPC fails with EPERM because the --host-uds flag is not set correctly. To get past this issue, pass either --host-uds=create or --host-uds=all. This should unblock you.

Yes, the sentry should not panic. I will look into it. There seems to be a bug, given that chmod(/tmp/supervisor.sock.1) succeeds even though the file should not exist after the failed BindAt RPC.

I think I understand the problem. When BindAt returns EPERM, we fall back to creating a synthetic socket.

Yet the comment in dentry.link says "Preconditions: !d.isSynthetic().". It sounds like that isn't actually guaranteed in this code path.

The precondition is actually met: it concerns d being non-synthetic. The panic occurs because the target dentry is synthetic, which the preconditions do not mention. The bug is that gofer.dentry.link() does not handle a synthetic target. Hard link support in gofer is already broken due to #6739, so hard link support for a synthetic file would be broken too (although no more broken than the non-synthetic case). But maybe we should still add it to avoid the panic.

I believe this is the same issue as #6577. For now, you can:

  • Pass --host-uds=create or --host-uds=all as runsc flags so that the BindAt RPC actually succeeds (and we don't have to fall back to a synthetic socket).
  • Ensure the container's /tmp is empty, so that gVisor's tmpfs is used, which has seamless support for such bind(2) and link(2) scenarios. Is the /tmp/example file used? From the debug logs, it doesn't look like it. Deleting it would be enough. Container images usually have an empty /tmp.

Thanks for the quick turn-around! Will give this a test once the nightly is cut.

Will give this a test once the nightly is cut.

Please note that even with this fix, the application may still fail (because that link(2) syscall will fail with EOPNOTSUPP). But the sentry will not panic.

To get the application to work properly, refer to the two suggestions mentioned in #10143 (comment).