google / gvisor

Application Kernel for Containers

Home Page:https://gvisor.dev

gVisor failed to use host network silently

iyenli opened this issue · comments

Description

Hi, I'm trying to use gVisor to run redis and use --network=host to improve network performance. Here is my daemon.json:

{
    "runtimes": {
        "runsc-kvm-host": {
            "path": "/usr/local/bin/runsc",
            "runtimeArgs": [
                "--platform=kvm",
                "--network=host"
            ]
        }
    }
}

Then I tried to boot a gVisor container with Docker:

[Host: Raspi 3b+, debian]
$ sudo docker run --runtime=runsc-kvm-host -v /home/test_suite:/home --rm -it ubuntu sh
[gVisor]
$ apt update && apt install netcat-traditional
$ netcat -v -l -p 9090
listening on [any] 9090 ...
[Host]
$ telnet localhost 9090
Trying ::1...
Connection failed: Connection refused

gVisor failed to use host network silently.

I've checked #issue97, but it does not seem to work for me.

Thanks in advance for the community's kind help.

runsc.log.20231108-161455.351768.boot.txt
runsc.log.20231108-161455.291708.create.txt

Steps to reproduce

See my description.

runsc version

runsc version release-20231030.0
spec: 1.1.0-rc.1

docker version (if using docker)

Client: Docker Engine - Community
 Version:           24.0.7
 API version:       1.43
 Go version:        go1.20.10
 Git commit:        afdd53b
 Built:             Thu Oct 26 09:08:15 2023
 OS/Arch:           linux/arm64
 Context:           default

uname

Linux raspberrypi 5.15.92-v8+ #337 SMP PREEMPT Wed Nov 8 10:09:33 CST 2023 aarch64 GNU/Linux

kubectl (if using Kubernetes)

No response

repo state (if built from source)

No response

runsc debug logs (if available)

No response

commented

Your docker command doesn't expose port 9090 of the container.

instead,

docker run -p 9090:9090 --runtime=runsc-kvm-host -v /home/test_suite:/home --rm -it ubuntu sh 

should work for you

At the host, you will be able to see

$ telnet localhost 9090
Trying ::1...
Connected to localhost.
Escape character is '^]'.

From the container, I assume you will also see something like

# netcat -v -l -p 9090
listening on [any] 9090 ...
getsockopt failed : Protocol not available
IP options: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  : Protocol not available
172.17.0.1: inverse host lookup failed: Unknown host
connect to [172.17.0.2] from (UNKNOWN) [172.17.0.1] 45068

But I don't think this has anything to do with the network here; these errors come from socket options that gVisor has not implemented for the getsockopt syscall.

If you are interested, please feel free to enable strace and implement the protocol.

Let me know how it works for you

@milantracy

Thanks for your kind help. I'm sorry for the beginner's mistake I made due to my unfamiliarity with Docker. What led me to suspect that the host network was not being used was that Redis performed worse with the host network than without it.

Specifically, when using runsc as the runtime, the Redis Ping performance decreased from 3101 ops/sec to 2504 ops/sec when using the Host network. In contrast, when using runc as the runtime, the Redis Ping performance increased from 6218 ops/sec to 10369 ops/sec when using the Host network.

I'm trying to identify the cause and suspect that it might be related to whether the host network is actually being used. Could you provide some guidance on this issue?

Did you by any chance flip runc and runsc in the above comment?

@hbhasker No, with runsc as the runtime, Redis performance is even worse when using the host network. Could you reproduce it?

I expect that with runsc, especially for ping-pong traffic. The reason is that host network support is implemented by proxying system calls straight to the host kernel, which involves a VMEXIT in runsc, making it very expensive. Host mode is useful for high throughput but not for low-latency ping-pong traffic.

Also your comment has runsc flipped with runc.

"when using runsc as the runtime, the Redis Ping performance decreased from 3101 ops/sec to 2504 ops/sec when using the Host network. In contrast, when using runc as the runtime, the Redis Ping performance increased from 6218 ops/sec to 10369 ops/sec when using the Host network."

I believe some of the confusion in this thread is because Docker also has a flag --network=host. It's important to note that runsc --network=host and docker run --network=host mean different things.

  • docker run --network=host removes the use of network namespaces from Docker containers. This means a Docker container can bind to the host network interface directly. When using this flag, there is no need to use the -p flag, because the container has direct access to the host network interface and can bind on it directly. Therefore, Docker doesn't need to set up a port forwarding rule to the network namespace of the container (that's what -p does).
  • runsc --network=host changes the technique that gVisor uses to implement the network stack visible to the sandboxed application. The default value (runsc --network=sandbox) causes gVisor to use its own network stack, implemented in Go. This is the most secure option (which is why it's the default), but it is also the slowest, because gVisor's network stack isn't as optimized as Linux's network stack. By contrast, runsc --network=host causes gVisor to pass through network-related system calls to the host kernel. This means that when a sandboxed application issues a network-related system call, gVisor still intercepts it, and then passes it on to the host kernel (Linux). This is a lower degree of protection, because Linux's network stack has historically had more vulnerabilities, and is written in a memory-unsafe language. However, it is still a strictly better degree of protection than using runc, because all system calls still get intercepted by gVisor and still need to go through gVisor's seccomp filters, which limit the total set of possible system calls that make it through. (At the same time, these reasons are also why this is still slower than running unsandboxed with runc.)

When you use runsc --network=host with docker run (without passing a --network option to docker run), the container still runs in a network namespace on the host. This means that you still need to add -p to forward the port, and this means Linux still adds overhead because of the forwarding rule it adds to cross over the network namespace boundary.

This means that if you want to compare unsandboxed Docker performance with --network=host set on the unsandboxed container, the most fair comparison for gVisor is to set both runsc --network=host and docker run --network=host. These flags are independent.
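As a rough sketch of that like-for-like setup, the script below only prints (it does not run) the two docker commands being compared. It assumes the runsc-kvm-host runtime from the daemon.json earlier in this issue, whose runtimeArgs already carry runsc's own --network=host; the build_cmd helper and the redis image name are illustrative assumptions, not something from this thread.

```shell
#!/bin/sh
# Dry-run sketch: print the docker commands for a like-for-like comparison.
# Nothing is executed; copy the printed lines to actually start containers.

# Assumption: the official "redis" image is used as the workload.
IMAGE="redis"

build_cmd() {
    # $1 = Docker runtime name: "runc" (unsandboxed) or a runsc runtime
    #      defined in daemon.json, e.g. "runsc-kvm-host".
    # docker's --network=host removes the container's network namespace,
    # so no -p port forwarding is needed in either case.
    echo "docker run --rm --runtime=$1 --network=host $IMAGE"
}

for runtime in runc runsc-kvm-host; do
    build_cmd "$runtime"
done
```

With the runsc-kvm-host runtime, both flags are in effect at once: Docker's --network=host on the command line, and runsc's --network=host through runtimeArgs.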


Now, one more thought about Redis benchmarking. When you run a Redis benchmark, it matters a lot how you run it. If you run both client and server on the same machine, then the comparison will not be fair when you use unsandboxed containers. This is because Linux will get to skip much of the network-related overhead it has to do, because packets get relayed entirely in the local network stack. By contrast, when running with runsc, Linux cannot do this optimization, and therefore the packet has to actually make its way all the way through the network stack and into gVisor's own network stack. For the comparison to be fair, the client and the server need to live on separate machines, or they have to at least go through separate network devices.
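A minimal sketch of running the client from a second machine might look like the following. It only prints the redis-benchmark command; the bench_cmd helper, the SERVER_HOST variable, and the 192.0.2.10 address (a documentation-only address) are assumptions for illustration, and the loopback check is just a crude guard against accidentally benchmarking over the local stack.

```shell
#!/bin/sh
# Dry-run sketch: build a redis-benchmark command aimed at a remote server.
# Assumption: SERVER_HOST is the address of the machine running Redis;
# 192.0.2.10 is a placeholder and must be replaced with a real address.
SERVER_HOST="${SERVER_HOST:-192.0.2.10}"

bench_cmd() {
    # $1 = address of the Redis server
    case "$1" in
        localhost|127.*|::1)
            # Same-machine traffic stays inside the local network stack,
            # which skews the runc vs. runsc comparison.
            echo "refusing loopback address: $1" >&2
            return 1
            ;;
    esac
    # -h/-p select the server, -t limits the run to the PING tests,
    # -n sets the number of requests. 6379 is the default Redis port.
    echo "redis-benchmark -h $1 -p 6379 -t ping -n 100000"
}

bench_cmd "$SERVER_HOST"
```

Run the printed command on the client machine once against the runc container and once against the runsc container, keeping every other setting the same.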

I will close this issue since there is no gVisor bug here as far as I can tell. If you have more questions around benchmarking or are finding abnormal results, feel free to open a different issue or email the gvisor-users mailing list.