google / gvisor

Application Kernel for Containers

Home Page: https://gvisor.dev

runtime.wbBufFlush: nosplit stack overflow

mikewiacek opened this issue · comments

Description

Using go version go1.18 linux/amd64

With gvisor.dev/gvisor v0.0.0-20220319025644-e785bfc153f5

While compiling against a module that imports gvisor.dev/gvisor/pkg/seccomp (to call seccomp.Install()),

I started getting these new errors at linking time:

Use --sandbox_debug to see verbose messages from the sandbox
runtime.wbBufFlush: nosplit stack overflow
	792	assumed on entry to gvisor.dev/gvisor/pkg/abi/linux.(*SigAction).CopyOut<1> (nosplit)
	600	after gvisor.dev/gvisor/pkg/abi/linux.(*SigAction).CopyOut<1> (nosplit) uses 192
	592	on entry to gvisor.dev/gvisor/pkg/abi/linux.(*SigAction).CopyOutN<1> (nosplit)
	144	after gvisor.dev/gvisor/pkg/abi/linux.(*SigAction).CopyOutN<1> (nosplit) uses 448
	136	on entry to runtime.gcWriteBarrierCX<1> (nosplit)
	128	on entry to runtime.gcWriteBarrier<1> (nosplit)
	8	after runtime.gcWriteBarrier<1> (nosplit) uses 120
	0	on entry to runtime.wbBufFlush<0> (nosplit)
	-24	after runtime.wbBufFlush<0> (nosplit) uses 24
runtime.wbBufFlush: nosplit stack overflow
	792	assumed on entry to gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPPacketInfo).CopyOut<1> (nosplit)
	600	after gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPPacketInfo).CopyOut<1> (nosplit) uses 192
	592	on entry to gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPPacketInfo).CopyOutN<1> (nosplit)
	136	after gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPPacketInfo).CopyOutN<1> (nosplit) uses 456
	128	on entry to runtime.gcWriteBarrierCX<1> (nosplit)
	120	on entry to runtime.gcWriteBarrier<1> (nosplit)
	0	after runtime.gcWriteBarrier<1> (nosplit) uses 120
	-8	on entry to runtime.wbBufFlush<0> (nosplit)
runtime.wbBufFlush: nosplit stack overflow
	792	assumed on entry to gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPv6PacketInfo).CopyOut<1> (nosplit)
	600	after gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPv6PacketInfo).CopyOut<1> (nosplit) uses 192
	592	on entry to gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPv6PacketInfo).CopyOutN<1> (nosplit)
	144	after gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPv6PacketInfo).CopyOutN<1> (nosplit) uses 448
	136	on entry to runtime.gcWriteBarrierCX<1> (nosplit)
	128	on entry to runtime.gcWriteBarrier<1> (nosplit)
	8	after runtime.gcWriteBarrier<1> (nosplit) uses 120
	0	on entry to runtime.wbBufFlush<0> (nosplit)
	-24	after runtime.wbBufFlush<0> (nosplit) uses 24
link: error running subcommand external/go_sdk/pkg/tool/linux_amd64/link: exit status 2

Steps to reproduce

No response

runsc version

No response

docker version (if using docker)

No response

uname

No response

kubectl (if using Kubernetes)

No response

repo state (if built from source)

No response

runsc debug logs (if available)

No response

Whoa, why on earth does the compiler want 448 bytes of stack space for SigAction.CopyOutN? That seems ridiculous. I'm also not sure why this is being called from within write barrier functions?

CC @prattmic

Our default bazel build configuration has --@io_bazel_rules_go//go/config:race=true turned on. If I manually turn it off, the build seems to work OK. Is this perhaps expected to be incompatible? (I'm working on some code I inherited, so please forgive me if I'm a little uneducated here.)

I'm also not sure why this is being called from within write barrier functions?

You are reading the trace backwards. linux.(*SigAction).CopyOutN contains calls to write barriers, not the other way around.

@mikewiacek do you have a reproducer you can share? I can't reproduce this with either:

$ git checkout e785bfc153f58fd3e9fcef339d4afe8eee775a5b
$ go build -o /tmp/runsc -race gvisor.dev/gvisor/runsc
$ go version
go version go1.18 linux/amd64

or

$ git checkout dd6bf5ecdda3a3a7dec06dc398414ed5025bce52
# Modify WORKSPACE to use Go 1.18
$ bazel build --@io_bazel_rules_go//go/config:race=true //runsc

@amscanne FWIW, it surprises me that these functions are //go:nosplit. Why is that?

I think we can probably remove the nosplit. All the CopyOut/CopyIn functions are allocation free, and there's a test that verifies this [1]. We can just relax that test to allow splits, so that we ensure they remain free of other allocations.

[1] https://cs.opensource.google/gvisor/gvisor/+/master:tools/go_marshal/test/escape/escape.go;l=54?q=doCopyIn&ss=gvisor

Let me see if I can package something up. We've got a lot of xooglers here and our monorepo will take a little work to un-intertwine. Let me see what I can do.

#7314 will resolve this by dropping the //go:nosplit annotation entirely, so I wouldn't worry about it. I don't know why CopyOutN was using so much stack, but it doesn't really surprise me that race mode uses more stack space.

(Just for curiosity's sake, my build at e785bfc seems to have linux.(*SigAction).CopyOutN using 136 bytes of stack, and no calls to write barriers, so it oddly doesn't seem to match your code at all.)

Thank you! This solved my issue!

I'm getting this error (or something like it), but interestingly it only happens when running my program in debug mode (using GoLand). A simple go build does not reproduce the error. I believe GoLand is running this command to build:

go build -o /private/var/folders/ml/kr1jx1fj6wjgssyg3fkpgw980000gn/T/GoLand/___go_build_manager -gcflags all=-N -l . #gosetup
gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRef: nosplit stack over 792 byte limit
gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRef<1>
    grows 48 bytes, calls gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRefWithDestructor<1>
        grows 144 bytes, calls gvisor.dev/gvisor/pkg/refs.(*weakRefList).Remove<1>
            grows 128 bytes, calls gvisor.dev/gvisor/pkg/refs.(*weakRefEntry).SetNext<1>
                grows 16 bytes, calls runtime.gcWriteBarrier<1>
                    grows 224 bytes, calls runtime.wbBufFlush<0>
                        grows 32 bytes, calls runtime.wbBufFlush<1>
                            grows 32 bytes, calls runtime.cgoCheckWriteBarrier<1>
                                grows 48 bytes, calls runtime.cgoIsGoPointer<1>
                                    grows 80 bytes, calls runtime.inHeapOrStack<1>
                                        grows 32 bytes, calls runtime.spanOf<1>
                                            grows 48 bytes, calls runtime.arenaIndex<1>
                                            40 bytes over limit
                                            grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                40 bytes over limit
                                            grows 48 bytes, calls runtime.arenaIdx.l1<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                40 bytes over limit
                                            grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                40 bytes over limit
                                            grows 48 bytes, calls runtime.panicIndexU<1>
                                                grows 0 bytes, calls runtime.goPanicIndexU<1>
                                                    grows 0 bytes, calls runtime.morestack<0>
                                                    40 bytes over limit
                                            grows 48 bytes, calls runtime.panicIndexU<1>
                                                grows 0 bytes, calls runtime.goPanicIndexU<1>
                                                    grows 0 bytes, calls runtime.morestack<0>
                                                    40 bytes over limit
                                grows 48 bytes, calls runtime.cgoIsGoPointer<1>
                                    grows 80 bytes, calls runtime.inHeapOrStack<1>
                                        grows 32 bytes, calls runtime.spanOf<1>
                                            grows 48 bytes, calls runtime.arenaIndex<1>
                                            40 bytes over limit
                                            grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                40 bytes over limit
                                            grows 48 bytes, calls runtime.arenaIdx.l1<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                40 bytes over limit
                                            grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                40 bytes over limit
                                            grows 48 bytes, calls runtime.panicIndexU<1>
                                                grows 0 bytes, calls runtime.goPanicIndexU<1>
                                                    grows 0 bytes, calls runtime.morestack<0>
                                                    40 bytes over limit
                                            grows 48 bytes, calls runtime.panicIndexU<1>
                                                grows 0 bytes, calls runtime.goPanicIndexU<1>
                                                    grows 0 bytes, calls runtime.morestack<0>
                                                    40 bytes over limit
            grows 128 bytes, calls runtime.gcWriteBarrier<1>
                grows 224 bytes, calls runtime.wbBufFlush<0>
                    grows 32 bytes, calls runtime.wbBufFlush<1>
                        grows 32 bytes, calls runtime.cgoCheckWriteBarrier<1>
                            grows 48 bytes, calls runtime.cgoIsGoPointer<1>
                                grows 80 bytes, calls runtime.inHeapOrStack<1>
                                    grows 32 bytes, calls runtime.spanOf<1>
                                        grows 48 bytes, calls runtime.arenaIndex<1>
                                        24 bytes over limit
                                        grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                            grows 0 bytes, calls runtime.morestack<0>
                                            24 bytes over limit
                                        grows 48 bytes, calls runtime.arenaIdx.l1<1>
                                            grows 0 bytes, calls runtime.morestack<0>
                                            24 bytes over limit
                                        grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                            grows 0 bytes, calls runtime.morestack<0>
                                            24 bytes over limit
                                        grows 48 bytes, calls runtime.panicIndexU<1>
                                            grows 0 bytes, calls runtime.goPanicIndexU<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                24 bytes over limit
                                        grows 48 bytes, calls runtime.panicIndexU<1>
                                            grows 0 bytes, calls runtime.goPanicIndexU<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                24 bytes over limit
                            grows 48 bytes, calls runtime.cgoIsGoPointer<1>
                                grows 80 bytes, calls runtime.inHeapOrStack<1>
                                    grows 32 bytes, calls runtime.spanOf<1>
                                        grows 48 bytes, calls runtime.arenaIndex<1>
                                        24 bytes over limit
                                        grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                            grows 0 bytes, calls runtime.morestack<0>
                                            24 bytes over limit
                                        grows 48 bytes, calls runtime.arenaIdx.l1<1>
                                            grows 0 bytes, calls runtime.morestack<0>
                                            24 bytes over limit
                                        grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                            grows 0 bytes, calls runtime.morestack<0>
                                            24 bytes over limit
                                        grows 48 bytes, calls runtime.panicIndexU<1>
                                            grows 0 bytes, calls runtime.goPanicIndexU<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                24 bytes over limit
                                        grows 48 bytes, calls runtime.panicIndexU<1>
                                            grows 0 bytes, calls runtime.goPanicIndexU<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                24 bytes over limit
            grows 128 bytes, calls gvisor.dev/gvisor/pkg/refs.(*weakRefEntry).SetPrev<1>
                grows 16 bytes, calls runtime.gcWriteBarrier<1>
                    grows 224 bytes, calls runtime.wbBufFlush<0>
                        grows 32 bytes, calls runtime.wbBufFlush<1>
                            grows 32 bytes, calls runtime.cgoCheckWriteBarrier<1>
                                grows 48 bytes, calls runtime.cgoIsGoPointer<1>
                                    grows 80 bytes, calls runtime.inHeapOrStack<1>
                                        grows 32 bytes, calls runtime.spanOf<1>
                                            grows 48 bytes, calls runtime.arenaIndex<1>
                                            40 bytes over limit
                                            grows 48 bytes, calls runtime.arenaIdx.l2<1>
                                                grows 0 bytes, calls runtime.morestack<0>
                                                40 bytes over limit
                                            grows 48 bytes, calls runtime.arenaIdx.l1<1>
                                                grows 0 bytes, calls runtime.morestack<0>

Looks like it might be a bug in Go 1.19: golang/go#54291

I see that this was closed; what was the fix? There is a comment above about removing the nosplit pragmas, and a referenced PR, but that PR was closed without any changes, and the nosplit annotations are still there.

@deitch I'm still seeing this issue when running gvisor in a debugger.

Same here, along with anything that imports it. I have no idea how to get around it. Did you find anything, @clarkmcc?

@deitch Sadly, no. I've just had to stop using the debugger.

Got it, thanks. I will try to reduce dependencies on gvisor except where I really need it, but sometimes I am importing things that depend on it.

Interestingly, I recently worked on a program that imported some of https://github.com/lima-vm/lima, which in turn imports gvisor. I was unable to run dlv on my program, but I was able to do so on lima itself. I assume some default setting differs. It would be great to know how I triggered it.