Data race during manager shutdown
ahmetb opened this issue · comments
I'd like to report a data race caught by go test -race
during my end-to-end test suite (which uses envtest
, but does not use Ginkgo) which ended up failing my test.
The data race has occurred during the shutdown sequence. It seems during this time, one of the reconcilers was doing an API call, which ended up getting cancelled due to ctx
cancellation I do as part of the tear down (to shut down envtest
and the Runnable
s).
I am not at all sure if it is related to controller-runtime
, as the concurrent reader/writer goroutine stacks are pointing to net/http
:
- Read routine https://github.com/golang/go/blob/go1.22.0/src/net/http/transport.go#L683
- Write routine https://github.com/golang/go/blob/go1.22.0/src/net/http/transport.go#L656
But here it goes.
2024-04-02T18:22:08Z INFO Tests finished {"duration": "2.049504249s", "exitCode": 0}
2024-04-02T18:22:08Z INFO Shutting down controllers
2024-04-02T18:22:08Z INFO Shutting down envtest
2024-04-02T18:22:08Z INFO Stopping and waiting for non leader election runnables
2024-04-02T18:22:08Z INFO Stopping and waiting for leader election runnables
2024-04-02T18:22:08Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "kubernetescontrolplane", "controllerGroup": "example.com", "controllerKind": "KubernetesControlPlane"}
2024-04-02T18:22:08Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "etcdcluster", "controllerGroup": "example.com", "controllerKind": "EtcdCluster"}
2024-04-02T18:22:08Z INFO All workers finished {"controller": "kubernetescontrolplane", "controllerGroup": "example.com", "controllerKind": "KubernetesControlPlane"}
==================
WARNING: DATA RACE
Read at 0x00c0001361c0 by goroutine 269:
net/http.rewindBody()
/export/home/tester/.go/1.22.0/src/net/http/transport.go:683 +0xcb
net/http.(*Transport).roundTrip()
/export/home/tester/.go/1.22.0/src/net/http/transport.go:557 +0x6db
net/http.(*Transport).RoundTrip()
/export/home/tester/.go/1.22.0/src/net/http/roundtrip.go:17 +0x33
k8s.io/client-go/transport.(*userAgentRoundTripper).RoundTrip()
/export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/transport/round_trippers.go:168 +0x4f6
net/http.send()
/export/home/tester/.go/1.22.0/src/net/http/client.go:259 +0x9a6
net/http.(*Client).send()
/export/home/tester/.go/1.22.0/src/net/http/client.go:180 +0x14e
net/http.(*Client).do()
/export/home/tester/.go/1.22.0/src/net/http/client.go:724 +0x11c8
net/http.(*Client).Do()
/export/home/tester/.go/1.22.0/src/net/http/client.go:590 +0x6a6
k8s.io/client-go/rest.(*Request).request()
/export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/rest/request.go:1023 +0x3ac
k8s.io/client-go/rest.(*Request).Do()
/export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/rest/request.go:1063 +0x119
sigs.k8s.io/controller-runtime/pkg/client.(*typedClient).PatchSubResource()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/client/typed_client.go:277 +0x924
sigs.k8s.io/controller-runtime/pkg/client.(*subResourceClient).Patch()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/client/client.go:595 +0x484
golnkd.in/k8s-compute-broker-controller/control-plane-manager/internal/controller.(*EtcdClusterReconciler).Reconcile()
/export/content/data/multiproduct-post-commit/i001/workspace/k8s-compute-broker-controller_a60f2169246cc0c4f7d68a8124dca5ad12006477/control-plane-manager/internal/controller/etcdcluster_controller.go:81 +0x78d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:119 +0x1a1
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:316 +0x59a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:266 +0x399
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:227 +0xb2
Previous write at 0x00c0001361c0 by goroutine 304:
net/http.(*readTrackingBody).Read()
/export/home/tester/.go/1.22.0/src/net/http/transport.go:656 +0x44
golang.org/x/net/http2.(*clientStream).writeRequestBody()
/export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1732 +0x5d9
golang.org/x/net/http2.(*clientStream).writeRequest()
/export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1440 +0xf32
golang.org/x/net/http2.(*clientStream).doRequest()
/export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1326 +0x33
golang.org/x/net/http2.(*ClientConn).RoundTrip.gowrap1()
/export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1232 +0x44
Goroutine 269 (running) created at:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:223 +0x80b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:234 +0x2b9
sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/manager/runnable_group.go:223 +0x1c6
sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.gowrap1()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/manager/runnable_group.go:226 +0x41
Goroutine 304 (running) created at:
golang.org/x/net/http2.(*ClientConn).RoundTrip()
/export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1232 +0x7d7
golang.org/x/net/http2.(*Transport).RoundTripOpt()
/export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:564 +0x644
golang.org/x/net/http2.(*Transport).RoundTrip()
/export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:513 +0x37
golang.org/x/net/http2.noDialH2RoundTripper.RoundTrip()
/export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:3136 +0x26
net/http.(*Transport).roundTrip()
/export/home/tester/.go/1.22.0/src/net/http/transport.go:553 +0x641
net/http.(*Transport).RoundTrip()
/export/home/tester/.go/1.22.0/src/net/http/roundtrip.go:17 +0x33
k8s.io/client-go/transport.(*userAgentRoundTripper).RoundTrip()
/export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/transport/round_trippers.go:168 +0x4f6
net/http.send()
/export/home/tester/.go/1.22.0/src/net/http/client.go:259 +0x9a6
net/http.(*Client).send()
/export/home/tester/.go/1.22.0/src/net/http/client.go:180 +0x14e
net/http.(*Client).do()
/export/home/tester/.go/1.22.0/src/net/http/client.go:724 +0x11c8
net/http.(*Client).Do()
/export/home/tester/.go/1.22.0/src/net/http/client.go:590 +0x6a6
k8s.io/client-go/rest.(*Request).request()
/export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/rest/request.go:1023 +0x3ac
k8s.io/client-go/rest.(*Request).Do()
/export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/rest/request.go:1063 +0x119
sigs.k8s.io/controller-runtime/pkg/client.(*typedClient).PatchSubResource()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/client/typed_client.go:277 +0x924
sigs.k8s.io/controller-runtime/pkg/client.(*subResourceClient).Patch()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/client/client.go:595 +0x484
golnkd.in/k8s-compute-broker-controller/control-plane-manager/internal/controller.(*EtcdClusterReconciler).Reconcile()
/export/content/data/multiproduct-post-commit/i001/workspace/k8s-compute-broker-controller_a60f2169246cc0c4f7d68a8124dca5ad12006477/control-plane-manager/internal/controller/etcdcluster_controller.go:81 +0x78d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:119 +0x1a1
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:316 +0x59a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:266 +0x399
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:227 +0xb2
==================
2024-04-02T18:22:08Z DEBUG Status updated {"controller": "etcdcluster", "controllerGroup": "example.com", "controllerKind": "EtcdCluster", "EtcdCluster": {"name":"test"}, "namespace": "", "name": "test", "reconcileID": "c175233f-a6c2-46c4-82af-49e68f48bdbb"}
2024-04-02T18:22:08Z ERROR Reconciler error {"controller": "etcdcluster", "controllerGroup": "example.com", "controllerKind": "EtcdCluster", "EtcdCluster": {"name":"test"}, "namespace": "", "name": "test", "reconcileID": "c175233f-a6c2-46c4-82af-49e68f48bdbb", "error": "failed to patch EtcdCluster status: Patch \"https://127.0.0.1:40187/apis/example.com/v1alpha1/etcdclusters/test/status\": context canceled", "errorCauses": [{"error": "failed to patch EtcdCluster status: Patch \"https://127.0.0.1:40187/apis/example.com/v1alpha1/etcdclusters/test/status\": context canceled"}]}
2024-04-02T18:22:08Z INFO All workers finished {"controller": "etcdcluster", "controllerGroup": "example.com", "controllerKind": "EtcdCluster"}
2024-04-02T18:22:08Z INFO Stopping and waiting for caches
2024-04-02T18:22:08Z INFO Stopping and waiting for webhooks
2024-04-02T18:22:08Z INFO Stopping and waiting for HTTP servers
2024-04-02T18:22:08Z INFO controller-runtime.metrics Shutting down metrics server with timeout of 1 minute
2024-04-02T18:22:08Z INFO Wait completed, proceeding to shutdown the manager
2024-04-02T18:22:08Z INFO Controllers stopped
2024-04-02T18:22:09Z INFO Teardown finished {"duration": "1.107332537s"}
Found 1 data race(s)
Maybe related to #2757
Are you able to update to latest v0.17 commit?