yarpc / yarpc-go

A message passing platform for Go

Data race in gRPC / HTTP 2 stack

kriskowal opened this issue

Encountered this rare data-race in the test suite.

ERROR: 2018/01/16 23:16:56 pickfirstBalancer: failed to NewSubConn: grpc: the client connection is closing
ERROR: 2018/01/16 23:16:56 pickfirstBalancer: failed to NewSubConn: grpc: the client connection is closing
==================
WARNING: DATA RACE
Write at 0x00c423035ff8 by goroutine 84:
  runtime.slicecopy()
      /usr/local/go/src/runtime/slice.go:160 +0x0
  bytes.(*Reader).Read()
      /usr/local/go/src/bytes/reader.go:43 +0x135
  bytes.(*Buffer).ReadFrom()
      /usr/local/go/src/bytes/buffer.go:209 +0x1dd
  go.uber.org/yarpc/transport/grpc.(*Outbound).invoke()
      /go/src/go.uber.org/yarpc/transport/grpc/outbound.go:137 +0x104
  go.uber.org/yarpc/transport/grpc.(*Outbound).Call()
      /go/src/go.uber.org/yarpc/transport/grpc/outbound.go:111 +0x1d4
  go.uber.org/yarpc/encoding/protobuf.(*client).Call()
      /go/src/go.uber.org/yarpc/encoding/protobuf/outbound.go:92 +0x1d7
  go.uber.org/yarpc/internal/examples/protobuf/examplepb.(*_KeyValueYARPCCaller).SetValue()
      /go/src/go.uber.org/yarpc/internal/examples/protobuf/examplepb/example.pb.yarpc.go:113 +0xe7
  go.uber.org/yarpc/transport/grpc.(*testEnv).SetValueYARPC()
      /go/src/go.uber.org/yarpc/transport/grpc/integration_test.go:357 +0x1aa
  go.uber.org/yarpc/transport/grpc.TestYARPCMaxMsgSize.func2()
      /go/src/go.uber.org/yarpc/transport/grpc/integration_test.go:166 +0xc0
  go.uber.org/yarpc/transport/grpc.doWithTestEnv()
      /go/src/go.uber.org/yarpc/transport/grpc/integration_test.go:214 +0x14a
  go.uber.org/yarpc/transport/grpc.TestYARPCMaxMsgSize()
      /go/src/go.uber.org/yarpc/transport/grpc/integration_test.go:160 +0x350
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:746 +0x16c

Previous read at 0x00c423035fff by goroutine 87:
  runtime.slicecopy()
      /usr/local/go/src/runtime/slice.go:160 +0x0
  go.uber.org/yarpc/vendor/golang.org/x/net/http2.(*Framer).WriteDataPadded()
      /go/src/go.uber.org/yarpc/vendor/golang.org/x/net/http2/frame.go:683 +0x35c
  go.uber.org/yarpc/vendor/golang.org/x/net/http2.(*Framer).WriteData()
      /go/src/go.uber.org/yarpc/vendor/golang.org/x/net/http2/frame.go:643 +0x8f
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.(*http2Client).itemHandler()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_client.go:1248 +0x13b3
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.(*http2Client).(go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.itemHandler)-fm()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_client.go:305 +0x55
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.loopyWriter()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/transport.go:742 +0x1dc
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.newHTTP2Client.func3()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_client.go:305 +0xc3

Goroutine 84 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:789 +0x568
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1004 +0xa7
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:746 +0x16c
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1002 +0x521
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:921 +0x206
  main.main()
      go.uber.org/yarpc/transport/grpc/_test/_testmain.go:158 +0x1d3

Goroutine 87 (finished) created at:
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.newHTTP2Client()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_client.go:304 +0x17f5
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.NewClientTransport()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/transport.go:518 +0xd7
  go.uber.org/yarpc/vendor/google.golang.org/grpc.(*addrConn).createTransport()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/clientconn.go:1139 +0x3f9
  go.uber.org/yarpc/vendor/google.golang.org/grpc.(*addrConn).resetTransport()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/clientconn.go:1100 +0x7b5
  go.uber.org/yarpc/vendor/google.golang.org/grpc.(*addrConn).connect.func1()
      /go/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/clientconn.go:829 +0x3c

Hoping this is familiar territory for @peter-edge or @willhug.

There is a real race in the gRPC transport; it was also caught by #1408:

WARNING: DATA RACE
Write at 0x00c427841fe8 by goroutine 187:
  go.uber.org/yarpc/internal/bufferpool.overwriteData()
      /Users/prashant/gocode/src/go.uber.org/yarpc/internal/bufferpool/buffer.go:140 +0x45
  go.uber.org/yarpc/internal/bufferpool.(*Buffer).Release()
      /Users/prashant/gocode/src/go.uber.org/yarpc/internal/bufferpool/buffer.go:131 +0x110
  go.uber.org/yarpc/internal/bufferpool.Put()
      /Users/prashant/gocode/src/go.uber.org/yarpc/internal/bufferpool/bufferpool.go:87 +0x38
  go.uber.org/yarpc/transport/grpc.(*responseWriter).Close()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/response_writer.go:74 +0xb3
  go.uber.org/yarpc/transport/grpc.(*handler).handleUnary()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/handler.go:191 +0x3eb
  go.uber.org/yarpc/transport/grpc.(*handler).handle()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/handler.go:75 +0x612
  go.uber.org/yarpc/transport/grpc.(*handler).(go.uber.org/yarpc/transport/grpc.handle)-fm()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/inbound.go:99 +0x6d
  go.uber.org/yarpc/vendor/google.golang.org/grpc.(*Server).processStreamingRPC()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/server.go:1032 +0x1334
  go.uber.org/yarpc/vendor/google.golang.org/grpc.(*Server).handleStream()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/server.go:1100 +0x1d6f
  go.uber.org/yarpc/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/server.go:617 +0xac

Previous read at 0x00c427841fe8 by goroutine 193:
  runtime.slicecopy()
      /usr/local/Cellar/go/1.9.2/libexec/src/runtime/slice.go:160 +0x0
  go.uber.org/yarpc/vendor/golang.org/x/net/http2.(*Framer).WriteDataPadded()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/golang.org/x/net/http2/frame.go:683 +0x35c
  go.uber.org/yarpc/vendor/golang.org/x/net/http2.(*Framer).WriteData()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/golang.org/x/net/http2/frame.go:643 +0x8f
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.(*http2Server).itemHandler()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_server.go:1012 +0x1955
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.(*http2Server).(go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.itemHandler)-fm()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_server.go:256 +0x55
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.loopyWriter()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/transport.go:733 +0x1dc
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.newHTTP2Server.func1()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_server.go:256 +0xc3

It also happened with a slightly different stack:

==================
WARNING: DATA RACE
Write at 0x00c4229a0000 by goroutine 94:
  go.uber.org/yarpc/internal/bufferpool.overwriteData()
      /Users/prashant/gocode/src/go.uber.org/yarpc/internal/bufferpool/buffer.go:140 +0x45
  go.uber.org/yarpc/internal/bufferpool.(*Buffer).Release()
      /Users/prashant/gocode/src/go.uber.org/yarpc/internal/bufferpool/buffer.go:131 +0x110
  go.uber.org/yarpc/internal/bufferpool.Put()
      /Users/prashant/gocode/src/go.uber.org/yarpc/internal/bufferpool/bufferpool.go:87 +0x38
  go.uber.org/yarpc/transport/grpc.(*Outbound).invoke()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/outbound.go:175 +0x703
  go.uber.org/yarpc/transport/grpc.(*Outbound).Call()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/outbound.go:111 +0x1d4
  go.uber.org/yarpc/encoding/protobuf.(*client).Call()
      /Users/prashant/gocode/src/go.uber.org/yarpc/encoding/protobuf/outbound.go:92 +0x1d7
  go.uber.org/yarpc/internal/examples/protobuf/examplepb.(*_KeyValueYARPCCaller).SetValue()
      /Users/prashant/gocode/src/go.uber.org/yarpc/internal/examples/protobuf/examplepb/example.pb.yarpc.go:113 +0xe7
  go.uber.org/yarpc/transport/grpc.(*testEnv).SetValueYARPC()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/integration_test.go:357 +0x1aa
  go.uber.org/yarpc/transport/grpc.TestYARPCMaxMsgSize.func1()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/integration_test.go:158 +0xb2
  go.uber.org/yarpc/transport/grpc.doWithTestEnv()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/integration_test.go:214 +0x14a
  go.uber.org/yarpc/transport/grpc.TestYARPCMaxMsgSize()
      /Users/prashant/gocode/src/go.uber.org/yarpc/transport/grpc/integration_test.go:157 +0x164
  testing.tRunner()
      /usr/local/Cellar/go/1.9.2/libexec/src/testing/testing.go:746 +0x16c

Previous read at 0x00c4229a0000 by goroutine 159:
  runtime.slicecopy()
      /usr/local/Cellar/go/1.9.2/libexec/src/runtime/slice.go:160 +0x0
  go.uber.org/yarpc/vendor/golang.org/x/net/http2.(*Framer).WriteDataPadded()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/golang.org/x/net/http2/frame.go:683 +0x35c
  go.uber.org/yarpc/vendor/golang.org/x/net/http2.(*Framer).WriteData()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/golang.org/x/net/http2/frame.go:643 +0x8f
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.(*http2Client).itemHandler()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_client.go:1249 +0x124b
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.(*http2Client).(go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.itemHandler)-fm()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_client.go:304 +0x55
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.loopyWriter()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/transport.go:733 +0x1dc
  go.uber.org/yarpc/vendor/google.golang.org/grpc/transport.newHTTP2Client.func3()
      /Users/prashant/gocode/src/go.uber.org/yarpc/vendor/google.golang.org/grpc/transport/http2_client.go:304 +0xc3

I've gone through this. It's a complicated one, because it seems like the buffer is being properly recycled in the same context. Which commit is this on, and how exactly do I reproduce it?
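To make the hazard in the traces above concrete: both show a pooled buffer being overwritten on release (bufferpool.overwriteData) while a separate writer goroutine (loopyWriter) is still copying from the same bytes. The following is a minimal sketch of that pattern, not the yarpc or grpc-go code. The names release, sendAsync and run are hypothetical, and the channel handoff is there only to make the ordering deterministic so the effect is visible without the race detector.

```go
package main

import (
	"bytes"
	"fmt"
)

// release is a hypothetical stand-in for a pool's Put: it scrubs the
// buffer's bytes before the buffer would be handed out again.
func release(buf *bytes.Buffer) {
	b := buf.Bytes()
	for i := range b {
		b[i] = 0
	}
	buf.Reset()
}

// sendAsync is a hypothetical stand-in for a transport that queues the
// slice for a background writer and returns before the writer has read it.
func sendAsync(payload []byte, start <-chan struct{}, out chan<- []byte) {
	go func() {
		<-start // the writer only gets to the payload later
		frame := make([]byte, len(payload))
		copy(frame, payload)
		out <- frame
	}()
}

// run sends "hello", releases the buffer as soon as sendAsync returns, and
// reports what the background writer actually saw.
func run(copyFirst bool) string {
	buf := &bytes.Buffer{}
	buf.WriteString("hello")

	payload := buf.Bytes() // shares memory with buf
	if copyFirst {
		payload = append([]byte(nil), payload...) // private copy
	}

	start := make(chan struct{})
	out := make(chan []byte, 1)
	sendAsync(payload, start, out)

	release(buf) // recycled "in the same context" as the send
	close(start) // only now does the writer read the payload
	return string(<-out)
}

func main() {
	fmt.Printf("shared slice: %q\n", run(false)) // zeroed bytes
	fmt.Printf("copied slice: %q\n", run(true))  // "hello"
}
```

Without the channel that orders the release before the read, the two accesses would be unsynchronized, which is what the race detector reports. Copying is only one way out; holding the buffer until the writer is done with it is another.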

I've reproduced this on #1408 after rebasing on dev and running go test -race ./transport/grpc. Investigating.

More thoughts: this only happens in TestYARPCMaxMsgSize, when the message size is unlimited and I send an 8MB buffer. After some investigation, I also think this is an issue internal to the grpc-go repository, and that it has to do with SendMsg on https://godoc.org/google.golang.org/grpc#Stream.

Will continue looking. There's a warning on SendMsg that "it is not safe to call SendMsg on the same stream in different goroutines."

I've set up a branch, data-race, to test this. After some semi-binary searching, I seem to have found that this happens around the 16384-byte barrier: anything less than 16334 consistently passes, anything greater than 16424 consistently fails, and sizes in between are flaky. This seems to indicate a buffer-growing issue. Will continue investigating.

$ git diff pool_race
diff --git a/transport/grpc/integration_test.go b/transport/grpc/integration_test.go
index a563ec4d..f7b18964 100644
--- a/transport/grpc/integration_test.go
+++ b/transport/grpc/integration_test.go
@@ -170,6 +170,17 @@ func TestYARPCMaxMsgSize(t *testing.T) {
        })
 }

+func TestDataRace(t *testing.T) {
+       size := 16284 // change me
+       value := strings.Repeat("a", size)
+       doWithTestEnv(t, nil, nil, nil, func(t *testing.T, e *testEnv) {
+               assert.NoError(t, e.SetValueYARPC(context.Background(), "foo", value))
+               getValue, err := e.GetValueYARPC(context.Background(), "foo")
+               assert.NoError(t, err)
+               assert.Equal(t, value, getValue)
+       })
+}
+
 func TestApplicationErrorPropagation(t *testing.T) {
        t.Parallel()
        doWithTestEnv(t, nil, nil, nil, func(t *testing.T, e *testEnv) {

$ go test -run DataRace -race ./transport/grpc
ok  	go.uber.org/yarpc/transport/grpc	1.051s
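On the 16384-byte threshold: 16384 (2^14) is also the default HTTP/2 maximum frame size (SETTINGS_MAX_FRAME_SIZE), and every gRPC message carries a 5-byte prefix (a 1-byte compression flag and a 4-byte length). One possibility, which is an assumption and not something established in this thread, is that messages fitting in a single DATA frame behave differently from messages that spill into a second one. The sketch below only does the arithmetic. dataFrames is a hypothetical helper, and it assumes the prefix and the message are packed back to back into frames of at most 16384 bytes.

```go
package main

import "fmt"

const (
	// maxFrameLen is the default HTTP/2 SETTINGS_MAX_FRAME_SIZE (2^14).
	maxFrameLen = 16384
	// grpcPrefixLen is the gRPC message prefix: 1-byte compression flag
	// plus 4-byte big-endian message length.
	grpcPrefixLen = 5
)

// dataFrames returns how many DATA frames a serialized message of msgLen
// bytes would need, assuming the prefix and message are packed back to back
// into frames of at most maxFrameLen bytes (an assumption of this sketch).
func dataFrames(msgLen int) int {
	total := msgLen + grpcPrefixLen
	return (total + maxFrameLen - 1) / maxFrameLen
}

func main() {
	for _, n := range []int{16334, 16379, 16380, 16424} {
		fmt.Printf("message of %d bytes -> %d DATA frame(s)\n", n, dataFrames(n))
	}
}
```

Here msgLen is the serialized message length, not the length of the value string in TestDataRace. The protobuf encoding of the key and value adds a few bytes on top, which would shift where a given string size crosses the boundary.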

I think I figured out the issue: #1416