goroutine leak
miparnisari opened this issue · comments
I wrote a benchmark for pool and, unless my benchmark is wrong or my understanding of how the library works is wrong, there is a goroutine leak:
func BenchmarkPool(b *testing.B) {
	b.Run("without_error", func(b *testing.B) {
		// Goroutine count before any pool is created.
		fmt.Println("before", runtime.NumGoroutine())
		for i := 0; i < b.N; i++ {
			p := pool.New().WithMaxGoroutines(10)
			for j := 0; j < 1000; j++ {
				p.Go(func() {
					r := rand.Intn(10)
					time.Sleep(time.Duration(r) * time.Microsecond)
				})
			}
			p.Wait()
		}
		// Goroutine count right after the last p.Wait() returns.
		fmt.Println("after", runtime.NumGoroutine())
	})
}
If you run this:
go test -v ./pool -run=XXX -bench=BenchmarkPool -count 10000
most of the time I get
before 3
after 3
before 3
after 3
before 3
after 3
but I also saw
before 3
after 3
before 3
after 4 // <------- ?
before 3
after 3
I tested this on a MacBook Pro with an Intel i5; it might be harder to reproduce on faster CPUs.
Thanks for the diligence! I expect this is simply a race between defer wg.Done() and the actual exit of the goroutine. Calling wg.Done() will wake up the waiting goroutine (the goroutine calling p.Wait()) immediately, so if you call runtime.NumGoroutine() after wg.Done() is called but before the goroutine exits, it will appear that there is a goroutine leak. I don't think there is really anything we can do about that: there is no such thing as a goroutine "handle", so a signal right before goroutine exit is the best we can do, and that's fundamentally racy.
I did manage to reproduce once on my machine, but using a modified version of your benchmark that re-measures a few times after a failure. In my reproduction, sleeping for 1us was enough for the stray goroutine to exit.
b.Run("without_error", func(b *testing.B) {
	before := runtime.NumGoroutine()
	for i := 0; i < b.N; i++ {
		p := pool.New().WithMaxGoroutines(10)
		for j := 0; j < 1000; j++ {
			p.Go(func() {
				r := rand.Intn(10)
				time.Sleep(time.Duration(r) * time.Microsecond)
			})
		}
		p.Wait()
	}
	after := runtime.NumGoroutine()
	if after != before {
		// Re-measure after progressively longer sleeps.
		time.Sleep(time.Microsecond)
		after2 := runtime.NumGoroutine()
		time.Sleep(time.Millisecond)
		after3 := runtime.NumGoroutine()
		time.Sleep(time.Second)
		after4 := runtime.NumGoroutine()
		// reproduction printed 3 4 3 3 3
		b.Fatalf("%d %d %d %d %d", before, after, after2, after3, after4)
	}
})