Corrupt data when transferring data using io.Copy in a Golang program
gmwiz opened this issue
Description
To use sandboxed servers that only accept connections over a TCP socket, I wrote a tiny Unix-domain-socket-to-TCP bridge program in Go. It listens on a host Unix domain socket and connects to a sandboxed application server that listens on a given TCP port, which is essentially the Go equivalent of socat UNIX-LISTEN:unix.sock TCP-CONNECT:1234. In my tests, I encountered an odd problem that I was later able to reproduce using a smaller program.
My Go program called io.Copy in two goroutines, one running io.Copy(tcpSock, unixSock) and the other running io.Copy(unixSock, tcpSock). After all of the data had been transferred, either the data received inside the sandbox was corrupt or the connection had been abruptly disconnected. To work around this, I modified my program to use the Reader and Writer interfaces directly. I noticed that with io.Copy the Go program calls splice internally to copy buffers between pipes and sockets, whereas with Read and Write this does not happen.
See reproducer below with attached runsc debug log.
Any help or guidance will be highly appreciated.
Steps to reproduce
Statically compile the following Golang program using Go 1.21.6:
cat << EOF > xfer.go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"os"
	"strconv"
	"sync"
)

func main() {
	if len(os.Args) != 3 {
		panic("Usage: <unix_listen_path> <tcp_connect_port>")
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	unixListenPath := os.Args[1]
	tcpConnectPort, err := strconv.Atoi(os.Args[2])
	if err != nil {
		panic(err)
	}
	if err = os.Remove(unixListenPath); err != nil && !os.IsNotExist(err) {
		panic(err)
	}
	unixListener := net.ListenConfig{}
	unixServer, err := unixListener.Listen(ctx, "unix", unixListenPath)
	if err != nil {
		panic(err)
	}
	defer func() { _ = unixServer.Close() }()
	unixClient, err := unixServer.Accept()
	if err != nil {
		panic(err)
	}
	defer func() { _ = unixClient.Close() }()
	fmt.Println("unix client connected")
	tcpDialer := net.Dialer{}
	tcpClient, err := tcpDialer.DialContext(ctx, "tcp", fmt.Sprintf("127.0.0.1:%d", tcpConnectPort))
	if err != nil {
		panic(err)
	}
	defer func() { _ = tcpClient.Close() }()
	fmt.Println("tcp client connected")
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		n, err := io.Copy(unixClient, tcpClient)
		fmt.Printf("tcp->unix: transfered %d bytes (err: %v)\n", n, err)
	}()
	wg.Add(1)
	go func() {
		defer wg.Done()
		n, err := io.Copy(tcpClient, unixClient)
		fmt.Printf("unix->tcp: transfered %d bytes (err: %v)\n", n, err)
	}()
	fmt.Println("transfer started")
	wg.Wait()
	fmt.Println("transfer done")
}
EOF
docker run -it --rm -v $(pwd):/src -w /src -e CGO_ENABLED=0 golang:1.21.6-bookworm go build -ldflags "-w -s" -o xfer xfer.go
Install the gVisor runtime:
sudo runsc install --runtime runsc-unix-debug -- \
--host-uds=all \
--debug \
--debug-log=/tmp/runsc-debug.log \
--strace \
--log-packets
Create a file to transfer (host-blob.bin), run a sandboxed mock server, and try to transfer host-blob.bin into the sandboxed container:
sudo rm -f ./unix.sock ./host-blob.bin /tmp/runsc-debug.log
sudo dd if=/dev/urandom of=host-blob.bin bs=1M count=16
md5sum ./host-blob.bin
docker run --runtime=runsc-unix-debug -v $(pwd):/test --workdir /test --rm -it --entrypoint '' alpine/socat:1.8.0.0 sh -c 'rm -f /tmp/sandbox-blob.bin; socat TCP-LISTEN:1234 - > /tmp/sandbox-blob.bin & ./xfer ./unix.sock 1234 & wait; md5sum ./host-blob.bin /tmp/sandbox-blob.bin; ls -la ./host-blob.bin /tmp/sandbox-blob.bin'
In a different shell, run:
sudo socat UNIX-CONNECT:unix.sock - < host-blob.bin
Example output:
16+0 records in
16+0 records out
16777216 bytes (17 MB, 16 MiB) copied, 0.0342477 s, 490 MB/s
52aebb94eea57f9f99394ce57afa5423 ./host-blob.bin
unix client connected
tcp client connected
transfer started
tcp->unix: transfered 0 bytes (err: <nil>)
unix->tcp: transfered 16777216 bytes (err: <nil>)
transfer done
52aebb94eea57f9f99394ce57afa5423 ./host-blob.bin
4f6ef3405ce77457524886896eee3285 /tmp/sandbox-blob.bin
-rw-r--r-- 1 root root 16777216 Jan 27 20:14 ./host-blob.bin
-rw-r--r-- 1 root root 16777216 Jan 27 20:14 /tmp/sandbox-blob.bin
As can be seen, sandbox-blob.bin is 16 MiB in size, the same as host-blob.bin, but the MD5 hashes of the two files differ.
runsc version
runsc version release-20240122.0
spec: 1.1.0-rc.1
docker version (if using docker)
Client:
 Version: 20.10.25+dfsg1
 API version: 1.41
 Go version: go1.21.5
 Git commit: b82b9f3
 Built: Mon Jan 8 00:09:17 2024
 OS/Arch: linux/amd64
 Context: default
 Experimental: true
Server:
 Engine:
  Version: 20.10.25+dfsg1
  API version: 1.41 (minimum version 1.12)
  Go version: go1.21.5
  Git commit: 5df983c
  Built: Mon Jan 8 00:09:17 2024
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.6.24~ds1
  GitCommit: 1.6.24~ds1-1
 runc:
  Version: 1.1.10+ds1
  GitCommit: 1.1.10+ds1-1
 docker-init:
  Version: 0.19.0
  GitCommit:
uname
Linux 6.6.9-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.6.9-1 (2024-01-01) x86_64 GNU/Linux
kubectl (if using Kubernetes)
No response
repo state (if built from source)
No response