Attach leaks a goroutine
saschagrunert opened this issue
What happened?
We print the active goroutines after each integration test. It looks like util.CopyDetachable
still leaves a goroutine running:
2022-07-20T14:16:05.7893221Z 2 @ 0x44c076 0x45d112 0x4b882e 0x4b92e5 0x654416 0x6dc739 0x47dfc1
2022-07-20T14:16:05.7893648Z # 0x4b882d io.(*pipe).read+0x14d /opt/hostedtoolcache/go/1.18.4/x64/src/io/pipe.go:57
2022-07-20T14:16:05.7894140Z # 0x4b92e4 io.(*PipeReader).Read+0x64 /opt/hostedtoolcache/go/1.18.4/x64/src/io/pipe.go:136
2022-07-20T14:16:05.7895004Z # 0x654415 github.com/containers/common/pkg/util.CopyDetachable+0xb5 /home/runner/go/pkg/mod/github.com/containers/common@v0.48.1-0.20220705175712-dd1c331887b9/pkg/util/copy.go:16
2022-07-20T14:16:05.7895944Z # 0x6dc738 github.com/containers/conmon-rs/pkg/client.(*ConmonClient).setupStdioChannels.func2+0xb8 /home/runner/work/conmon-rs/conmon-rs/pkg/client/attach.go:201
conmon-rs/pkg/client/attach.go
Lines 198 to 204 in 13d4828
We should find a way to clean up the goroutine on server shutdown as well.
What did you expect to happen?
No running goroutines for attach after the integration tests.
How can we reproduce it (as minimally and precisely as possible)?
Running the integration tests reveals the issue by printing the goroutines at the end.
Anything else we need to know?
No
conmon-rs version
$ conmonrs --version
version: 0.1.0-dev
tag: none
commit: 68252d0373b8e262bdf7ff8780fb2b5ab6f66c29
build: 2022-07-20 14:13:13 +00:00
rustc 1.62.1 (e092d0b6b 2022-07-16)
OS version
Not relevant
Additional environment details (AWS, VirtualBox, physical, etc.)
No
Is conn closed?
Looks like it blocks on read() and I would expect read to return once the connection is closed.
On a second look, it is not conn that must be closed but rather cfg.Streams.Stdin,
since we read from there.
@Luap99 yeah we also block on read in the tests, but I see that as not critical:
conmon-rs/pkg/client/suite_test.go
Lines 379 to 396 in cc1ea23
Would you consider working on a fix for that?
I assume cfg.Streams.Stdin is supposed to be closed by the caller, i.e. the test here?
Also, the goroutines in the ginkgo tests need a defer GinkgoRecover() at the top in order to work properly, see https://onsi.github.io/ginkgo/#mental-model-how-ginkgo-handles-failure
I know next to nothing about the conmon-rs internals, but the test is not working.
Change the testAttach function to the following in order to make sure it is actually executed:
func testAttach(stdinWrite io.Writer, stdoutRead, stderrRead io.Reader) {
	wg := &sync.WaitGroup{}
	wg.Add(3)

	stdoutBuf := bufio.NewReader(stdoutRead)
	stderrBuf := bufio.NewReader(stderrRead)

	go func() {
		defer GinkgoRecover()
		defer wg.Done()
		// Stdin
		_, err := fmt.Fprintf(stdinWrite, "/busybox echo Hello world\r")
		Expect(err).To(BeNil())
		_, err = fmt.Fprintf(stdinWrite, "/busybox echo Hello world >&2\r\n")
		Expect(err).To(BeNil())
		fmt.Println("stdin done")
	}()

	go func() {
		defer GinkgoRecover()
		defer wg.Done()
		// Stdout test
		line, err := stdoutBuf.ReadString('\n')
		Expect(err).To(BeNil())
		fmt.Println(line)
		Expect(line).To(ContainSubstring("Hello world"))
		fmt.Println("stdout done")
	}()

	go func() {
		defer GinkgoRecover()
		defer wg.Done()
		// Stderr test
		line, err := stderrBuf.ReadString('\n')
		Expect(err).To(BeNil())
		fmt.Println(line)
		Expect(line).To(ContainSubstring("Hello world"))
		fmt.Println("stderr done")
	}()

	// Block until all three goroutines have finished.
	wg.Wait()
}
There seem to be at least two problems that I can see:
- The terminal test cannot work with both stdout and stderr since a terminal only uses one stream.
It fails with the following error and just hangs:
2022-07-21T14:47:14.725434Z ERROR backend:create_container{container_id="7b5d1daad76b330d663e0cb73524170cff34f98a5a4c417ac4455126f5be6927" uuid="17e3bede-4368-4a3e-a2cf-a63e9d39d137"}:listen:read_loop: conmon::terminal: 210: Stdout read loop failure: write to attach endpoints: write to attach endpoint: Broken pipe (os error 32)
- The test without a terminal reads the input but never sends any output back and therefore also hangs.
@Luap99 thank you for the investigation here. I think we can split up the tests to cover the terminal and non-terminal cases separately.
I think the tests are still broken, even with #579. I have to dig more into that.