Test TestPerfReaderWakeupEvents gets stuck on some runs

Question

dylandreimerink opened this issue 5 months ago · comments

Describe the bug

We seem to have a flake in the perf reader wakeup events tests / bug in the perf

=== Failed
=== FAIL: perf TestPerfReaderWakeupEvents (unknown)
panic: test timed out after 10m0s
running tests:
	TestPerfReaderWakeupEvents (10m0s)

goroutine 29 [running]:
testing.(*M).startAlarm.func1()
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:2259 +0x3b9
created by time.goFunc
	/opt/hostedtoolcache/go/1.21.8/x64/src/time/sleep.go:176 +0x2d

goroutine 1 [chan receive]:
testing.(*T).Run(0xc0000a2680, {0x71cfe6?, 0x51bbfc?}, 0x72d1a0)
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:1649 +0x3c8
testing.runTests.func1(0x94c4c0?)
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:2054 +0x3e
testing.tRunner(0xc0000a2680, 0xc0000f9bd8)
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:1595 +0xff
testing.runTests(0xc000090320?, {0x945f40, 0xd, 0xd}, {0xc0000f9c90?, 0x4105c9?, 0x94bbe0?})
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:2052 +0x445
testing.(*M).Run(0xc000090320)
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:1925 +0x636
github.com/cilium/ebpf/internal/testutils/fdtrace.TestMain(0xc0000366f0?)
	/home/runner/work/ebpf/ebpf/internal/testutils/fdtrace/fd.go:25 +0x70
github.com/cilium/ebpf/perf.TestMain(...)
	/home/runner/work/ebpf/ebpf/perf/reader_test.go:29
main.main()
	_testmain.go:77 +0x1c7

goroutine 27 [chan receive]:
testing.(*T).Parallel(0xc00026a000)
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:1403 +0x205
github.com/cilium/ebpf/perf.TestPause(0xc00026a000)
	/home/runner/work/ebpf/ebpf/perf/reader_test.go:408 +0x39
testing.tRunner(0xc00026a000, 0x72d168)
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:1595 +0xff
created by testing.(*T).Run in goroutine 1
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:1648 +0x3ad

goroutine 28 [syscall]:
syscall.Syscall6(0x1?, 0x77?, 0x1?, 0x97a120?, 0x77?, 0x0?, 0xc0000fba48?)
	/opt/hostedtoolcache/go/1.21.8/x64/src/syscall/syscall_linux.go:91 +0x30
golang.org/x/sys/unix.EpollWait(0x0?, {0xc0000d2150?, 0x446411?, 0xc00009a4e0?}, 0xc0000fba18?)
	/home/runner/go/pkg/mod/golang.org/x/sys@v0.15.0/unix/zsyscall_linux_amd64.go:55 +0x4f
github.com/cilium/ebpf/internal/unix.EpollWait(...)
	/home/runner/work/ebpf/ebpf/internal/unix/types_linux.go:129
github.com/cilium/ebpf/internal/epoll.(*Poller).Wait(0xc0000cc5a0, {0xc0000d2150?, 0x2, 0x2}, {0xc0000fbab0?, 0x1?, 0x0?})
	/home/runner/work/ebpf/ebpf/internal/epoll/poller.go:145 +0x2a5
github.com/cilium/ebpf/perf.(*Reader).ReadInto(0xc0000d46c0, 0x1?)
	/home/runner/work/ebpf/ebpf/perf/reader.go:362 +0x2c5
github.com/cilium/ebpf/perf.(*Reader).Read(...)
	/home/runner/work/ebpf/ebpf/perf/reader.go:336
github.com/cilium/ebpf/perf.checkRecord({0x785618, 0xc00026a1a0}, 0xf?)
	/home/runner/work/ebpf/ebpf/perf/reader_test.go:167 +0x6e
github.com/cilium/ebpf/perf.TestPerfReaderWakeupEvents(0xc00026a1a0)
	/home/runner/work/ebpf/ebpf/perf/reader_test.go:528 +0x527
testing.tRunner(0xc00026a1a0, 0x72d1a0)
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:1595 +0xff
created by testing.(*T).Run in goroutine 1
	/opt/hostedtoolcache/go/1.21.8/x64/src/testing/testing.go:1648 +0x3ad

How to reproduce

This seems fairly reproducible on my local machine when running

go test -exec sudo -timeout 5s -count 50 -v -run ^TestPerfReaderWakeupEvents$ github.com/cilium/ebpf/perf

Version information

main@9c1d099873a8

Dylan Reimerink · Answer 1 · Fri Apr 05 2024 21:53:42 GMT+0800 (China Standard Time)

I have been playing around with this a bit. The flaky behavior seems to originate in the kernel's WakeupEvents logic. I have not looked into the kernel code yet, but the current test fails intermittently unless I always send WakeupEvents + 1 events, in which case it passes consistently.

	// send followup events
	for i := 1; i < numEvents+1; i++ {
		_, _, err = prog.Test(internal.EmptyBPFContext)
		if err != nil {
			t.Fatal(err)
		}
	}

So perhaps this has to do with memory alignment of the map or something like that. I have tried varying numEvents and sampleSize, but that doesn't seem to make any difference.

Dylan Reimerink · Answer 2 · Fri Apr 05 2024 22:44:51 GMT+0800 (China Standard Time)

I think I found the cause. The WakeupEvents limit applies per ring, and there is one ring per CPU. When we execute BPF_PROG_RUN multiple times, the 2 messages sometimes end up in different rings. If I log the CPU ID of the first and the followup events I see:

=== RUN   TestPerfReaderWakeupEvents
ret 7
ret 7
--- PASS: TestPerfReaderWakeupEvents (0.01s)
=== RUN   TestPerfReaderWakeupEvents
ret 7
ret 7
--- PASS: TestPerfReaderWakeupEvents (0.01s)
=== RUN   TestPerfReaderWakeupEvents
ret 7
ret 7
--- PASS: TestPerfReaderWakeupEvents (0.01s)
=== RUN   TestPerfReaderWakeupEvents
ret 7
ret 7
--- PASS: TestPerfReaderWakeupEvents (0.01s)
=== RUN   TestPerfReaderWakeupEvents
ret 7
ret 0
panic: test timed out after 1s

The numbers change from run to run, and it seems to be pure luck that the +1 I mentioned earlier happens to land on the same CPU as one of the ones before.
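The per-ring behaviour described above can be sketched in plain Go, with no eBPF involved. This is a simplified model, not the kernel's implementation: the `ringSet` type, the 8-CPU count, and the reset-on-wakeup rule are assumptions made for illustration only.

```go
package main

import "fmt"

// ringSet is a hypothetical model of one perf ring per CPU.
// Each ring keeps its own count of samples since its last wakeup.
type ringSet struct {
	wakeupEvents int
	pending      []int // pending[cpu] = samples written since last wakeup
}

func newRingSet(numCPUs, wakeupEvents int) *ringSet {
	return &ringSet{wakeupEvents: wakeupEvents, pending: make([]int, numCPUs)}
}

// write records one sample on the given CPU's ring and reports whether
// that ring alone has reached the wakeup threshold.
func (r *ringSet) write(cpu int) bool {
	r.pending[cpu]++
	if r.pending[cpu] >= r.wakeupEvents {
		r.pending[cpu] = 0
		return true
	}
	return false
}

// wakesUp replays a sequence of CPU IDs and reports whether any single
// write would have woken the reader.
func wakesUp(numCPUs, wakeupEvents int, cpus []int) bool {
	r := newRingSet(numCPUs, wakeupEvents)
	woke := false
	for _, cpu := range cpus {
		if r.write(cpu) {
			woke = true
		}
	}
	return woke
}

func main() {
	// Both samples land on CPU 7: that ring reaches the threshold.
	fmt.Println(wakesUp(8, 2, []int{7, 7}))
	// Samples land on CPU 7 and CPU 0: no ring reaches the threshold,
	// so a blocking read would wait forever.
	fmt.Println(wakesUp(8, 2, []int{7, 0}))
	// An extra sample that happens to land on CPU 7 again.
	fmt.Println(wakesUp(8, 2, []int{7, 0, 7}))
}
```

In this model the `{7, 0}` sequence never wakes the reader, which matches the timed-out run in the log, while a third sample only helps if it lands on a ring that already holds one.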

A potential fix would be to add the following to the start of the test:

import extUnix "golang.org/x/sys/unix"

...

func TestPerfReaderWakeupEvents(t *testing.T) {
	// Lock goroutine to thread
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	// Save CPU affinity
	var set extUnix.CPUSet
	err := extUnix.SchedGetaffinity(0, &set)
	qt.Assert(t, qt.IsNil(err))
	// Schedule test to run on only CPU 0 (CPUSet{1} sets bit 0 of the mask)
	err = extUnix.SchedSetaffinity(0, &extUnix.CPUSet{1})
	qt.Assert(t, qt.IsNil(err))
	// Restore CPU affinity
	defer extUnix.SchedSetaffinity(0, &set)

Perhaps there are other alternatives (this doesn't win any beauty awards).

Bryce Kahle · Answer 3 · Sat Apr 06 2024 01:58:32 GMT+0800 (China Standard Time)

Could we send numCPUs * WakeupEvents events to ensure that at least one CPU gets woken up?

Dylan Reimerink · Answer 4 · Sat Apr 06 2024 02:16:30 GMT+0800 (China Standard Time)

Yeah, that should also work, but I don't know if that defeats the purpose of the test. In my case you would be enqueueing 16 events to test a 2-event limit.
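As a rough check of the arithmetic, here is a small sketch of how many events force at least one ring to reach the limit. It assumes 8 CPUs (so that numCPUs * WakeupEvents gives the 16 mentioned above), that each ring counts independently, and that nothing is consumed in between. The function names are hypothetical and not part of the library.

```go
package main

import "fmt"

// minEventsForWakeup returns the smallest number of events that forces at
// least one of numCPUs rings to hold wakeupEvents samples, however the
// events are spread across CPUs (pigeonhole principle).
func minEventsForWakeup(numCPUs, wakeupEvents int) int {
	return numCPUs*(wakeupEvents-1) + 1
}

// fullestRing returns the sample count of the fullest ring when events are
// spread as evenly as possible, which is the worst case for a wakeup.
func fullestRing(numCPUs, events int) int {
	return (events + numCPUs - 1) / numCPUs
}

func main() {
	numCPUs, wakeupEvents := 8, 2 // assumed values for illustration

	n := minEventsForWakeup(numCPUs, wakeupEvents)
	fmt.Println("numCPUs * WakeupEvents:", numCPUs*wakeupEvents)
	fmt.Println("pigeonhole minimum:", n)
	fmt.Println("fullest ring with minimum:", fullestRing(numCPUs, n))
	fmt.Println("fullest ring with one fewer:", fullestRing(numCPUs, n-1))
}
```

Under these assumptions, numCPUs * (WakeupEvents - 1) + 1 events are already enough to guarantee a wakeup, so numCPUs * WakeupEvents is a safe upper bound rather than the minimum.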

Bryce Kahle · Answer 5 · Sat Apr 06 2024 02:21:01 GMT+0800 (China Standard Time)

The test was more for making sure it didn't wake up after 1 event.

Bryce Kahle · Answer 6 · Sat Apr 06 2024 02:22:37 GMT+0800 (China Standard Time)

I'm not sure we can control the CPU the eBPF program actually runs on by controlling the affinity of the userspace program.

Dylan Reimerink · Answer 7 · Sat Apr 06 2024 02:32:15 GMT+0800 (China Standard Time)

> I'm not sure we can control the CPU the eBPF program actually runs on by controlling the affinity of the userspace program.

I tested the code I showed and it seems to work, at least locally. By default the BPF program executes on the CPU making the syscall, although that isn't official, so it's not guaranteed.

Program.Run also has a parameter to pick a CPU to run on, but looking at the kernel, it only works for raw tracepoint programs. If we can change the program type for our sample prog, that might be an option. (torvalds/linux@1b4d60e)

Bryce Kahle · Answer 8 · Sat Apr 06 2024 02:44:00 GMT+0800 (China Standard Time)

> it only works for raw tracepoint programs

That would constrain what kernel versions we can test on though.

Lorenz Bauer · Answer 9 · Mon Apr 08 2024 17:36:55 GMT+0800 (China Standard Time)

I'd be fine with both solutions. I remember that we have the same problem (samples submitted on the "wrong" CPU) in other places as well. Maybe we could reuse the user space code.

I think it's also fine to constrain this to a smaller number of kernel versions: we're testing that the plumbing we have ~ works. We don't need to / want to assert that the kernel isn't doing dodgy things (as we'd never see the end of it 😆 ).