golang / go

The Go programming language

Home Page: https://go.dev


net: use splice for TCPConn.ReadFrom on Linux

philhofer opened this issue · comments


sendfile only allows the source to be an mmap-able file descriptor. Using splice in the implementation of (net.Conn).ReadFrom allows net.Conn to get the same sort of performance benefits as *os.File when io.Copy is used. (In theory, the src can be any fd.)

Pros:

  • Uses the fastest (AFAIK) socket -> socket method available on Linux. (Fundamentally, this is how haproxy works.)
  • Can be non-blocking (in the sense that both sockets are talking to the netpoller.)
  • Transparent perf improvements to existing users of io.Copy.

Cons:

  • Increased implementation complexity: requires two calls to splice, and a call to pipe, along with associated pd.WaitRead()/pd.WaitWrite() business (see the sketch after this list).
  • Architecture-specific.
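
To make the mechanism concrete, here is a minimal user-space sketch of the pipe-plus-two-splice dance. It assumes blocking fds and elides the netpoller integration an in-tree version would need; spliceCopy and its arguments are illustrative names, not proposed API:

package splicedemo

import "syscall"

// spliceCopy moves up to max bytes from srcFD to dstFD without copying
// them through user space. splice(2) requires one side of each call to
// be a pipe, so the bytes are shuttled src -> pipe -> dst.
// This sketch assumes blocking fds; the real net-package version would
// have to park on the netpoller instead of blocking in the kernel.
func spliceCopy(dstFD, srcFD, max int) (int64, error) {
    var p [2]int
    if err := syscall.Pipe(p[:]); err != nil {
        return 0, err
    }
    defer syscall.Close(p[0])
    defer syscall.Close(p[1])

    var total int64
    for total < int64(max) {
        // First splice: source socket -> write end of the pipe.
        n, err := syscall.Splice(srcFD, nil, p[1], nil, max-int(total), 0)
        if n == 0 || err != nil {
            return total, err // EOF or error
        }
        // Second splice: read end of the pipe -> destination socket.
        // A robust version must loop here until all n bytes drain.
        w, err := syscall.Splice(p[0], nil, dstFD, nil, int(n), 0)
        total += w
        if err != nil {
            return total, err
        }
    }
    return total, nil
}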

For people writing proxies in Go (e.g. https://github.com/mailgun/vulcand), this kind of optimization could be huge.

If this sounds agreeable, I can send in a patch next week.

Can you provide some benchmark numbers to validate this? In the past, the
grossness of layering violations has scuttled similar proposals.


My impression is that the splice system call won't do the right thing with non-blocking network connections. It will simply return EAGAIN. I'd be happy to hear otherwise.

If we can get better performance by transparently using splice on GNU/Linux, then we should try to do it for 1.6.
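
To make that concern concrete: in user space, a non-blocking splice attempt just surfaces EAGAIN, as in this sketch. srcFD and pipeW are illustrative fds, and the busy retry stands in for the pd.WaitRead() parking an in-tree version would do:

package splicedemo

import "syscall"

// spliceOnce makes a single non-blocking splice attempt from srcFD
// into the write end of a pipe. On EAGAIN a user-space caller can only
// poll(2) or spin; inside the net package this is exactly where the
// goroutine would park on the netpoller until srcFD is readable.
func spliceOnce(srcFD, pipeW int) (int64, error) {
    for {
        n, err := syscall.Splice(srcFD, nil, pipeW, nil, 64<<10, syscall.SPLICE_F_NONBLOCK)
        if err == syscall.EAGAIN {
            continue // data not ready (or pipe full); a real version waits here
        }
        return n, err
    }
}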


@ianlancetaylor I have a working prototype here that doesn't seem to have that issue.

@minux Ok.

@davecheney Yeah, it's going to take a little while to write an appropriate benchmark. Also, some of the performance is destined to be NIC-dependent.

As an addendum, there is a BSD equivalent (sosplice/somove), although I don't know how it would interact with non-blocking sockets.


re: benchmarks, here's a simple one:

On a DigitalOcean single-core VM running Ubuntu 14.04 LTS, I tested splicing two TCP connections. In short:

  • go io.Copy(dst, src)
  • go io.Copy(ioutil.Discard, dst)
  • write into src in a loop

benchmark                     old ns/op     new ns/op     delta
BenchmarkSplice1KBchunk       2146          2095          -2.38%
BenchmarkSplice4KBchunk       5603          3753          -33.02%
BenchmarkSplice512KBchunk     579770        362927        -37.40%

benchmark                     old MB/s     new MB/s     speedup
BenchmarkSplice1KBchunk       476.95       488.56       1.02x
BenchmarkSplice4KBchunk       731.01       1091.37      1.49x
BenchmarkSplice512KBchunk     904.30       1444.61      1.60x

Since this is loopback, I'm not sure it's even hitting the NIC, but it's nice to see that the performance is there even without the hardware acceleration.
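
For readers who want to reproduce numbers of this shape, here is a hedged sketch of one sensible reading of the topology above: write into one connection pair, splice-copy into the second, and drain the far end. It belongs in a _test.go file, and all names and the chunk size are illustrative:

package splicedemo

import (
    "io"
    "io/ioutil"
    "net"
    "testing"
)

// tcpPair returns both ends of a loopback TCP connection.
func tcpPair(b *testing.B) (client, server net.Conn) {
    ln, err := net.Listen("tcp", "127.0.0.1:0")
    if err != nil {
        b.Fatal(err)
    }
    defer ln.Close()
    ch := make(chan net.Conn, 1)
    go func() {
        c, err := ln.Accept()
        if err != nil {
            panic(err)
        }
        ch <- c
    }()
    client, err = net.Dial("tcp", ln.Addr().String())
    if err != nil {
        b.Fatal(err)
    }
    return client, <-ch
}

func benchmarkSplice(b *testing.B, chunk int) {
    src, srcPeer := tcpPair(b)
    dst, dstPeer := tcpPair(b)
    go io.Copy(dst, srcPeer)            // the spliced path once ReadFrom uses splice
    go io.Copy(ioutil.Discard, dstPeer) // drain the far end
    buf := make([]byte, chunk)
    b.SetBytes(int64(chunk))
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        if _, err := src.Write(buf); err != nil {
            b.Fatal(err)
        }
    }
    src.Close()
}

func BenchmarkSplice4KBchunk(b *testing.B) { benchmarkSplice(b, 4096) }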

I'll have access to a 16-core Linux machine with a phat NIC next week, so I'll get back to you with numbers more representative of "enterprise-grade" hardware.

We could also do better on file-to-file copies. Relevant for the Linux part: http://yarchive.net/comp/linux/splice.html


@nightlyone The difference between ordinary splicing and socket splicing in Go is that sockets are non-blocking. The socket->socket splice can be implemented optimally only if it lives inside net. A file->file splice can be implemented outside of the standard library without any detriment. (Also a consideration: some filesystems (e.g. FUSE) don't support splice.)

@philhofer you are correct, sorry for the noise!

Just adding a +1 from us here at Comcast for the go1.6 timeframe. Transfers in the 10+ MB range are our bread and butter, and in our testing this patch gives roughly a 2x improvement at those sizes.

FWIW, we rebuilt one of our proxy servers with @philhofer's patch and deployed it in production a few days ago. CPU and memory usage are down, bandwidth is up, and no issues have been encountered so far.

Applying patches to the standard library is fairly easy for us, since we do static builds, but it would still be nice to see this feature make it into the official tree.

@philhofer care to bring this up on the ML as @minux suggested?

@philhofer Can you add some example source code? I'm having trouble understanding this. Is all that's needed a call to io.Copy(dst, src)? Does it wrap the splice system call only for socket-to-socket copy operations?
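
For context, a minimal sketch of the call site: io.Copy first checks whether src implements io.WriterTo and then whether dst implements io.ReaderFrom, and *net.TCPConn implements the latter, so a splice fast path would be picked up with no new API. The pump helper is illustrative:

package splicedemo

import (
    "io"
    "net"
)

// pump copies src into dst. Because *net.TCPConn implements
// io.ReaderFrom, io.Copy dispatches to dst.ReadFrom(src), which is
// exactly where a splice-based fast path would take over for
// socket-to-socket copies.
func pump(dst, src *net.TCPConn) error {
    _, err := io.Copy(dst, src)
    return err
}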

Hi @philhofer.

Thank you for your patch.

I made a simple TCP/IP proxy; it transfers small messages between client and backend via io.Copy().

After I applied your patch, the proxy always had ~200ms of latency.

Eventually I found that the fSpliceMore flag in writeTo() makes the socket behave as if TCP_CORK were enabled.

When I removed the flag, the latency was gone and the performance was better than the old io.Copy().

This is the simple proxy:

package main

import (
    "io"
    "net"
)

type Proxy struct{}

func newProxy() Proxy {
    return Proxy{}
}

// Transfer copies bytes in both directions until either side fails;
// closing both connections unblocks the other io.Copy.
func (p Proxy) Transfer(conn1, conn2 net.Conn) error {
    errChan := make(chan error, 1)
    go func() {
        _, err := io.Copy(conn2, conn1)
        conn1.Close()
        conn2.Close()
        errChan <- err
    }()

    _, err1 := io.Copy(conn1, conn2)
    conn1.Close()
    conn2.Close()
    err2 := <-errChan

    if err1 != nil {
        return err1
    }
    return err2
}

The test case:

package main

import (
    "bytes"
    "io"
    "math/rand"
    "net"
    "testing"
)

func Test_Proxy(t *testing.T) {
    // Setup an echo server
    backendLsn, err := net.Listen("tcp", "0.0.0.0:0")
    if err != nil {
        t.Fatal(err)
    }
    go func() {
        conn, err := backendLsn.Accept()
        if err != nil {
            panic(err)
        }
        io.Copy(conn, conn)
    }()

    // Setup a proxy server
    proxyLsn, err := net.Listen("tcp", "0.0.0.0:0")
    if err != nil {
        t.Fatal(err)
    }
    go func() {
        conn, err := proxyLsn.Accept()
        if err != nil {
            panic(err)
        }

        // Dial to backend echo server
        agent, err := net.Dial("tcp", backendLsn.Addr().String())
        if err != nil {
            panic(err)
        }

        // Begin transfer
        proxy := newProxy()
        proxy.Transfer(conn, agent)
    }()

    conn, err := net.Dial("tcp", proxyLsn.Addr().String())
    if err != nil {
        t.Fatal(err)
    }

    for i := 0; i < 100000; i++ {
        b1 := RandBytes(256)
        b2 := make([]byte, len(b1))

        _, err := conn.Write(b1)
        if err != nil {
            t.Fatal(err)
        }

        _, err = io.ReadFull(conn, b2)
        if err != nil {
            t.Fatal(err)
        }

        if !bytes.Equal(b1, b2) {
            t.Fatal()
        }
    }
}

// RandBytes returns between 1 and n random bytes.
func RandBytes(n int) []byte {
    n = rand.Intn(n) + 1 // random length in [1, n]
    b := make([]byte, n)
    for i := 0; i < n; i++ {
        b[i] = byte(rand.Intn(256)) // 256, not 255, so 0xff can occur
    }
    return b
}

Can you tell me how you would use syscall.Splice in proxy code to forward requests directly to the server, doing the copy in the kernel with splice instead of io.Copy?

The Splice function looks like this:
func Splice(rfd int, roff *int64, wfd int, woff *int64, len int, flags int) (n int64, err error)

From a net.Conn or an http.Request, how do we get the values of the read fd and the write fd?
Can you please help me solve this problem?

@SubrataPucsd you can see how go does sendfile opportunistically here: https://golang.org/src/net/sendfile_linux.go

I imagine it would be similar here (attempt a type assertion to the required underlying type, and bail out if it fails).
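
A hedged sketch of the fd-extraction half, assuming the SyscallConn method that *net.TCPConn gained in Go 1.9; rawFD is an illustrative helper, and operating on the fd directly bypasses the runtime's netpoller:

package splicedemo

import "net"

// rawFD digs the file descriptor out of a *net.TCPConn. The fd is only
// guaranteed valid inside the Control callback; copying it out, as
// done here, is for illustration only.
func rawFD(c *net.TCPConn) (int, error) {
    sc, err := c.SyscallConn()
    if err != nil {
        return -1, err
    }
    fd := -1
    cerr := sc.Control(func(f uintptr) { fd = int(f) })
    return fd, cerr
}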

Any progress on this? The benefits seem clear, and at the very least this would reduce memory allocations.

If you know how to make splice get used automatically, like we do for sendfile already, without introducing any new API, great. I don't see how immediately but feel free to send a CL.
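
The dispatch described above would presumably mirror the existing sendfile hook. In this sketch of in-package code, splice, genericReadFrom, and the surrounding details are hypothetical stand-ins, not real exported API:

// Inside the net package (sketch only): try the fast path when the
// reader really is a *TCPConn, and fall back to the ordinary buffered
// copy otherwise, so callers of io.Copy see no behavioral change.
func (c *TCPConn) ReadFrom(r io.Reader) (int64, error) {
    if src, ok := r.(*TCPConn); ok {
        if n, handled, err := splice(c, src); handled {
            return n, err
        }
    }
    return genericReadFrom(c, r) // hypothetical generic fallback
}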

Is zero copy now part of Go 1.9?

@flavioaiello We don't use the issue tracker for general questions. Please ask on a forum; see https://golang.org/wiki/Questions.

If you are asking specifically whether we use the splice system call for TCPConn.ReadFrom, the answer is no.

Change https://golang.org/cl/107715 mentions this issue: net: add support for splice(2) in (*TCPConn).ReadFrom on Linux