google / gvisor

Application Kernel for Containers

Home Page: https://gvisor.dev

runtime.spanOf: nosplit stack overflow

Skyxim opened this issue

Description

I use gVisor as the network stack for a TUN application. I can debug normally with Go 1.17, but I get the following error after upgrading to 1.18.

With GOARCH=arm64 and -gcflags=all="-N -l", the build fails with:
runtime.spanOf: nosplit stack overflow
792 assumed on entry to gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRef<1> (nosplit)
744 after gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRef<1> (nosplit) uses 48
600 after gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRefWithDestructor<1> (nosplit) uses 144
472 after gvisor.dev/gvisor/pkg/refs.(*weakRefList).Remove<1> (nosplit) uses 128
456 after gvisor.dev/gvisor/pkg/refs.(*weakRefEntry).SetNext<1> (nosplit) uses 16
232 after runtime.gcWriteBarrier<1> (nosplit) uses 224
200 after runtime.wbBufFlush<0> (nosplit) uses 32
168 after runtime.wbBufFlush<1> (nosplit) uses 32
120 after runtime.cgoCheckWriteBarrier<1> (nosplit) uses 48
40 after runtime.cgoIsGoPointer<1> (nosplit) uses 80
8 after runtime.inHeapOrStack<1> (nosplit) uses 32
-40 after runtime.spanOf<1> (nosplit) uses 48

Steps to reproduce

  1. Build with GOARCH=arm64 GOOS=linux go build -gcflags=all="-N -l"
  2. The build fails with:
        792     assumed on entry to gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRef<1> (nosplit)
        744     after gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRef<1> (nosplit) uses 48
        600     after gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRefWithDestructor<1> (nosplit) uses 144
        472     after gvisor.dev/gvisor/pkg/refs.(*weakRefList).Remove<1> (nosplit) uses 128
        456     after gvisor.dev/gvisor/pkg/refs.(*weakRefEntry).SetNext<1> (nosplit) uses 16
        232     after runtime.gcWriteBarrier<1> (nosplit) uses 224
        200     after runtime.wbBufFlush<0> (nosplit) uses 32
        168     after runtime.wbBufFlush<1> (nosplit) uses 32
        120     after runtime.cgoCheckWriteBarrier<1> (nosplit) uses 48
        40      after runtime.cgoIsGoPointer<1> (nosplit) uses 80
        8       after runtime.inHeapOrStack<1> (nosplit) uses 32
        -40     after runtime.spanOf<1> (nosplit) uses 48

If I change to GOARCH=amd64, it compiles normally.
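
For reference, the two invocations side by side (the ./... pattern is an assumption; run whatever builds the package that imports gVisor):

$ GOARCH=arm64 GOOS=linux go build -gcflags=all="-N -l" ./...   # fails with the nosplit overflow above
$ GOARCH=amd64 GOOS=linux go build -gcflags=all="-N -l" ./...   # builds cleanly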

runsc version

No response

docker version (if using docker)

No response

uname

No response

kubectl (if using Kubernetes)

No response

repo state (if built from source)

No response

runsc debug logs (if available)

No response

cc @amscanne

FYI, gcflags -N -l disables most optimizations and inlining, which (unsurprisingly) tends to increase stack usage. Typically this is used with debuggers to provide more precise debugging context (fewer "optimized out" values).
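
To illustrate the mechanism, a toy sketch (my own example, not gVisor code): functions marked //go:nosplit skip the stack-growth check, so the linker verifies at build time that every nosplit call chain fits in a small fixed stack budget. With -N -l, locals stay in the frame and callees are not inlined away, so a chain like the one in this report gets more expensive and can blow that budget.

package main

// Toy nosplit call chain. Each //go:nosplit frame counts against the fixed
// nosplit stack budget that the linker checks; with -N -l the frames are
// larger because nothing is optimized away or inlined.

//go:nosplit
func leaf(p *[64]byte) byte {
        return p[0]
}

//go:nosplit
func caller() byte {
        var buf [64]byte // kept in the frame when optimizations are off
        return leaf(&buf)
}

func main() {
        println(caller())
}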

Yep, but for debugging I have to use these two flags.

I'm unable to use the debugger in GoLand (which uses -N -l) on an M1 Mac once our project imports gVisor.

Cf. golang/go#53942

Also testing on M1. Two ways to deal with this:

  • Quick way: newer versions of gVisor appear not to have this problem. The following commands update the gVisor module to a newer commit and no longer hit the nosplit stack overflow. However, it looks like coder needs some changes to handle the newer version, and you may have to bump the gVisor version in the 'wireguard-go' repo as well (see the sketch after this list):
$ go get gvisor.dev/gvisor@d9d19f613e70
$ go test  -gcflags all="-N -l" github.com/coder/coder/coderd/devtunnel
# golang.zx2c4.com/wireguard/tun/netstack
../../go/pkg/mod/github.com/coder/wireguard-go/tun/netstack@v0.0.0-20220614153727-d82b4ba8619f/tun.go:93:15: l.Front undefined (type stack.PacketBufferList has no field or method Front)
FAIL    github.com/coder/coder/coderd/devtunnel [build failed]
FAIL
  • Longer term: building from gVisor HEAD also avoids the stack splitting issue, but it looks like coder's tailscale dependency breaks. Presumably tailscale will pull in a newer version of the gVisor repo at some point, and this issue will go away.
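
A rough sketch of that wireguard-go bump, in case it helps; the directory paths are placeholders, only the go commands are standard tooling, and the commit is the one mentioned above:

$ cd ../wireguard-go-fork                 # your checkout of the wireguard-go fork
$ go get gvisor.dev/gvisor@d9d19f613e70
$ go mod tidy
$ cd ../your-main-module                  # e.g. the coder/coder checkout
$ go mod edit -replace golang.zx2c4.com/wireguard=../wireguard-go-fork
$ go mod tidy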

Let me know if either of those work for you.

Well, if it is fixed upstream, that's about all I'd reasonably expect from gVisor. Unfortunately, as you've noticed, the dependency is a couple of projects deep, so I can't unilaterally upgrade.
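
For anyone landing here later, standard Go tooling can show where the transitive gVisor requirement comes from (output depends on the repo):

$ go mod why -m gvisor.dev/gvisor
$ go mod graph | grep gvisor.dev/gvisor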

A friendly reminder that this issue had no activity for 120 days.

This issue has been closed due to lack of activity.