runtime.wbBufFlush: nosplit stack overflow
mikewiacek opened this issue · comments
Description
Using go version go1.18 linux/amd64
With gvisor.dev/gvisor v0.0.0-20220319025644-e785bfc153f5
While compiling against a module that imports gvisor.dev/gvisor/pkg/seccomp (to call seccomp.Install()),
I started getting these new errors at link time:
Use --sandbox_debug to see verbose messages from the sandbox
runtime.wbBufFlush: nosplit stack overflow
792 assumed on entry to gvisor.dev/gvisor/pkg/abi/linux.(*SigAction).CopyOut<1> (nosplit)
600 after gvisor.dev/gvisor/pkg/abi/linux.(*SigAction).CopyOut<1> (nosplit) uses 192
592 on entry to gvisor.dev/gvisor/pkg/abi/linux.(*SigAction).CopyOutN<1> (nosplit)
144 after gvisor.dev/gvisor/pkg/abi/linux.(*SigAction).CopyOutN<1> (nosplit) uses 448
136 on entry to runtime.gcWriteBarrierCX<1> (nosplit)
128 on entry to runtime.gcWriteBarrier<1> (nosplit)
8 after runtime.gcWriteBarrier<1> (nosplit) uses 120
0 on entry to runtime.wbBufFlush<0> (nosplit)
-24 after runtime.wbBufFlush<0> (nosplit) uses 24
runtime.wbBufFlush: nosplit stack overflow
792 assumed on entry to gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPPacketInfo).CopyOut<1> (nosplit)
600 after gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPPacketInfo).CopyOut<1> (nosplit) uses 192
592 on entry to gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPPacketInfo).CopyOutN<1> (nosplit)
136 after gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPPacketInfo).CopyOutN<1> (nosplit) uses 456
128 on entry to runtime.gcWriteBarrierCX<1> (nosplit)
120 on entry to runtime.gcWriteBarrier<1> (nosplit)
0 after runtime.gcWriteBarrier<1> (nosplit) uses 120
-8 on entry to runtime.wbBufFlush<0> (nosplit)
runtime.wbBufFlush: nosplit stack overflow
792 assumed on entry to gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPv6PacketInfo).CopyOut<1> (nosplit)
600 after gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPv6PacketInfo).CopyOut<1> (nosplit) uses 192
592 on entry to gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPv6PacketInfo).CopyOutN<1> (nosplit)
144 after gvisor.dev/gvisor/pkg/abi/linux.(*ControlMessageIPv6PacketInfo).CopyOutN<1> (nosplit) uses 448
136 on entry to runtime.gcWriteBarrierCX<1> (nosplit)
128 on entry to runtime.gcWriteBarrier<1> (nosplit)
8 after runtime.gcWriteBarrier<1> (nosplit) uses 120
0 on entry to runtime.wbBufFlush<0> (nosplit)
-24 after runtime.wbBufFlush<0> (nosplit) uses 24
link: error running subcommand external/go_sdk/pkg/tool/linux_amd64/link: exit status 2
Whoa, why on earth does the compiler want 448 bytes of stack space for SigAction.CopyOutN? That seems ridiculous. I'm also not sure why this is being called from within write barrier functions?
CC @prattmic
Our default bazel build configuration has --@io_bazel_rules_go//go/config:race=true turned on. If I manually turn it off, the build seems to work OK. Is this perhaps expected to be incompatible? (I'm working on some code I inherited, so please forgive me if I'm a little uneducated here.)
I'm also not sure why this is being called from within write barrier functions?
You are reading the trace backwards. linux.(*SigAction).CopyOutN contains calls to write barriers, not the other way around.
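To make the reading order concrete, here is a minimal sketch in Go that replays the first trace from the top down. The `frame` type, the `remaining` helper, and the `sigActionChain` variable are hypothetical names made up for this illustration; the numbers are copied from the linker output above. The 8-byte step between each "after" line and the next "on entry" line is taken from the trace itself, and on amd64 it presumably corresponds to the pushed return address.

```go
package main

import "fmt"

// frame is one nosplit function in the call chain, with the number of
// stack bytes the linker reports for it ("uses N"). Hypothetical type,
// used only for this illustration.
type frame struct {
	name string
	uses int
}

// callOverhead is the 8-byte drop seen in the trace between an
// "after ..." line and the following "on entry to ..." line.
const callOverhead = 8

// remaining replays the linker's accounting: start from the assumed
// budget, subtract each frame's usage, and subtract the call overhead
// every time one function calls the next.
func remaining(budget int, chain []frame) int {
	for i, f := range chain {
		if i > 0 {
			budget -= callOverhead
		}
		budget -= f.uses
	}
	return budget
}

// sigActionChain lists the frames of the first trace, outermost caller
// first. gcWriteBarrierCX has no "uses" line in the trace, so it is 0.
var sigActionChain = []frame{
	{"linux.(*SigAction).CopyOut", 192},
	{"linux.(*SigAction).CopyOutN", 448},
	{"runtime.gcWriteBarrierCX", 0},
	{"runtime.gcWriteBarrier", 120},
	{"runtime.wbBufFlush", 24},
}

func main() {
	// Prints -24, matching the last line of the first trace.
	fmt.Println(remaining(792, sigActionChain))
}
```

Read this way, CopyOut is the outermost caller and wbBufFlush is the innermost callee; the budget goes negative only at the bottom of the chain.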
@mikewiacek do you have a reproducer you can share? I can't reproduce this with either:
$ git checkout e785bfc153f58fd3e9fcef339d4afe8eee775a5b
$ go build -o /tmp/runsc -race gvisor.dev/gvisor/runsc
$ go version
go version go1.18 linux/amd64
or
$ git checkout dd6bf5ecdda3a3a7dec06dc398414ed5025bce52
# Modify WORKSPACE to use Go 1.18
$ bazel build --@io_bazel_rules_go//go/config:race=true //runsc
@amscanne FWIW, it surprises me that these functions are //go:nosplit. Why is that?
I think we can probably remove the nosplit. All the CopyOut/CopyIn functions are allocation free, and there's a test that verifies this [1]. We can just relax that test to allow splits, so that we ensure they remain free of other allocations.
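As a rough illustration of the idea (not gVisor's actual test, which may work differently), an allocation check can be written with `testing.AllocsPerRun` from the standard library. The `sigAction` type and `marshalInto` method below are hypothetical stand-ins, not the real linux.SigAction or its generated CopyOut code. Note that the function carries no //go:nosplit annotation; the allocation property is verified separately.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

// sigAction is a hypothetical stand-in for a fixed-size ABI struct.
// It is NOT gVisor's linux.SigAction.
type sigAction struct {
	Handler  uint64
	Flags    uint64
	Restorer uint64
	Mask     uint64
}

// marshalInto writes s into dst in little-endian order without
// allocating. dst must be at least 32 bytes long.
func (s *sigAction) marshalInto(dst []byte) {
	binary.LittleEndian.PutUint64(dst[0:], s.Handler)
	binary.LittleEndian.PutUint64(dst[8:], s.Flags)
	binary.LittleEndian.PutUint64(dst[16:], s.Restorer)
	binary.LittleEndian.PutUint64(dst[24:], s.Mask)
}

// allocsPerMarshal reports the average number of heap allocations per
// call of marshalInto. The struct and buffer are created once, outside
// the measured function, so only the marshalling itself is counted.
func allocsPerMarshal() float64 {
	s := &sigAction{Handler: 1, Flags: 2, Restorer: 3, Mask: 4}
	buf := make([]byte, 32)
	return testing.AllocsPerRun(100, func() {
		s.marshalInto(buf)
	})
}

func main() {
	fmt.Println("allocations per call:", allocsPerMarshal())
}
```

In a real test file, the same check would be an assertion that the returned value is zero.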
Let me see if I can package something up. We've got a lot of xooglers here and our monorepo will take a little work to un-intertwine. Let me see what I can do.
#7314 will resolve this by dropping the //go:nosplit annotation entirely, so I wouldn't worry about it. I don't know why CopyOutN was using so much stack, but it doesn't really surprise me that race mode uses more stack space.
(Just for curiosity's sake, my build at e785bfc seems to have linux.(*SigAction).CopyOutN using 136 bytes of stack, and with no calls to write barriers, so it oddly doesn't seem to match your code at all.)
Thank you! This solved my issue!
I'm getting this error (or something like it), but interestingly it only happens when running my program in debug mode (using GoLand). A simple go build does not reproduce the error. I believe GoLand is running this command to build:
go build -o /private/var/folders/ml/kr1jx1fj6wjgssyg3fkpgw980000gn/T/GoLand/___go_build_manager -gcflags all=-N -l . #gosetup
gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRef: nosplit stack over 792 byte limit
gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRef<1>
grows 48 bytes, calls gvisor.dev/gvisor/pkg/refs.(*AtomicRefCount).DecRefWithDestructor<1>
grows 144 bytes, calls gvisor.dev/gvisor/pkg/refs.(*weakRefList).Remove<1>
grows 128 bytes, calls gvisor.dev/gvisor/pkg/refs.(*weakRefEntry).SetNext<1>
grows 16 bytes, calls runtime.gcWriteBarrier<1>
grows 224 bytes, calls runtime.wbBufFlush<0>
grows 32 bytes, calls runtime.wbBufFlush<1>
grows 32 bytes, calls runtime.cgoCheckWriteBarrier<1>
grows 48 bytes, calls runtime.cgoIsGoPointer<1>
grows 80 bytes, calls runtime.inHeapOrStack<1>
grows 32 bytes, calls runtime.spanOf<1>
grows 48 bytes, calls runtime.arenaIndex<1>
40 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l1<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.panicIndexU<1>
grows 0 bytes, calls runtime.goPanicIndexU<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.panicIndexU<1>
grows 0 bytes, calls runtime.goPanicIndexU<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.cgoIsGoPointer<1>
grows 80 bytes, calls runtime.inHeapOrStack<1>
grows 32 bytes, calls runtime.spanOf<1>
grows 48 bytes, calls runtime.arenaIndex<1>
40 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l1<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.panicIndexU<1>
grows 0 bytes, calls runtime.goPanicIndexU<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.panicIndexU<1>
grows 0 bytes, calls runtime.goPanicIndexU<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 128 bytes, calls runtime.gcWriteBarrier<1>
grows 224 bytes, calls runtime.wbBufFlush<0>
grows 32 bytes, calls runtime.wbBufFlush<1>
grows 32 bytes, calls runtime.cgoCheckWriteBarrier<1>
grows 48 bytes, calls runtime.cgoIsGoPointer<1>
grows 80 bytes, calls runtime.inHeapOrStack<1>
grows 32 bytes, calls runtime.spanOf<1>
grows 48 bytes, calls runtime.arenaIndex<1>
24 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l1<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.panicIndexU<1>
grows 0 bytes, calls runtime.goPanicIndexU<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.panicIndexU<1>
grows 0 bytes, calls runtime.goPanicIndexU<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.cgoIsGoPointer<1>
grows 80 bytes, calls runtime.inHeapOrStack<1>
grows 32 bytes, calls runtime.spanOf<1>
grows 48 bytes, calls runtime.arenaIndex<1>
24 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l1<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.panicIndexU<1>
grows 0 bytes, calls runtime.goPanicIndexU<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 48 bytes, calls runtime.panicIndexU<1>
grows 0 bytes, calls runtime.goPanicIndexU<1>
grows 0 bytes, calls runtime.morestack<0>
24 bytes over limit
grows 128 bytes, calls gvisor.dev/gvisor/pkg/refs.(*weakRefEntry).SetPrev<1>
grows 16 bytes, calls runtime.gcWriteBarrier<1>
grows 224 bytes, calls runtime.wbBufFlush<0>
grows 32 bytes, calls runtime.wbBufFlush<1>
grows 32 bytes, calls runtime.cgoCheckWriteBarrier<1>
grows 48 bytes, calls runtime.cgoIsGoPointer<1>
grows 80 bytes, calls runtime.inHeapOrStack<1>
grows 32 bytes, calls runtime.spanOf<1>
grows 48 bytes, calls runtime.arenaIndex<1>
40 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l2<1>
grows 0 bytes, calls runtime.morestack<0>
40 bytes over limit
grows 48 bytes, calls runtime.arenaIdx.l1<1>
grows 0 bytes, calls runtime.morestack<0>
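The "N bytes over limit" figures in this newer trace format can be checked by summing the "grows" values along one call path and comparing against the 792-byte limit from the first line. A small sketch in Go, where `overLimit`, `viaSetNext`, and `viaRemove` are hypothetical names for this illustration and the numbers are copied from the trace above:

```go
package main

import "fmt"

// nosplitLimit is the limit quoted in the linker error message.
const nosplitLimit = 792

// overLimit sums the "grows N bytes" values along one call path and
// returns how far the total exceeds the nosplit limit (a negative
// result would mean the path fits).
func overLimit(grows []int) int {
	total := 0
	for _, g := range grows {
		total += g
	}
	return total - nosplitLimit
}

// viaSetNext is the first path in the trace: DecRef ->
// DecRefWithDestructor -> Remove -> SetNext -> gcWriteBarrier ->
// wbBufFlush<0> -> wbBufFlush<1> -> cgoCheckWriteBarrier ->
// cgoIsGoPointer -> inHeapOrStack -> spanOf -> arenaIndex.
var viaSetNext = []int{48, 144, 128, 16, 224, 32, 32, 48, 80, 32, 48}

// viaRemove is the path where Remove calls gcWriteBarrier directly,
// skipping the 16-byte SetNext frame.
var viaRemove = []int{48, 144, 128, 224, 32, 32, 48, 80, 32, 48}

func main() {
	fmt.Println(overLimit(viaSetNext)) // 40
	fmt.Println(overLimit(viaRemove))  // 24
}
```

The two results line up with the "40 bytes over limit" and "24 bytes over limit" lines in the output, which suggests that each "grows" line belongs to the caller of the function named on that line.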
Looks like it might be a bug in Go 1.19: golang/go#54291
I see that this was closed; what was the fix? There is a comment above about removing the nosplit pragmas, and a referenced PR, but that PR is closed without any changes, and the nosplit pragmas are still there.
@deitch I'm still seeing this issue when running gvisor in a debugger.
Same here, along with anything that imports it. I have no idea how to get around it. Did you find anything, @clarkmcc?
@deitch Sadly, no. I've just had to stop using the debugger.
Got it, thanks. I will try to reduce dependencies on gvisor except where I really need it, but sometimes I am importing things that depend on it.
Interestingly, I recently worked on a program that imported some of https://github.com/lima-vm/lima, which in turn imports gvisor. I was unable to run dlv ... but I was able to do so on lima itself. I assume some default setting or similar is responsible. It would be great to know how I triggered it.