google / gvisor

Application Kernel for Containers

Home Page: https://gvisor.dev

netstack: wrong MSS on accepted connection (solution provided)

amurchick opened this issue

Description

Problem (see solution below)

I am using the go branch of gvisor in my own application, not runsc.

I create a link endpoint with a large MTU (8000).

When I initiate a connection from the gvisor side, both sides (100.122.0.6 is the OS side, 100.122.0.1 is the gvisor side) agree on the MSS, and when data is sent from the gvisor side I see packets with a large payload (7948 bytes). This is OK:

IP 100.122.0.1.62482 > 100.122.0.6.5201: Flags [S], seq 1428769137, win 65528, options [mss 7880,nop,nop,TS val 2078661728 ecr 0,nop,wscale 3], length 0
IP 100.122.0.6.5201 > 100.122.0.1.62482: Flags [S.], seq 2589845980, ack 1428769138, win 63584, options [mss 7960,nop,nop,TS val 2525703279 ecr 2078661728,nop,wscale 7], length 0

...

IP 100.122.0.1.62482 > 100.122.0.6.5201: Flags [.], seq 38:7986, ack 1, win 8192, options [nop,nop,TS val 2078661738 ecr 2525703283], length 7948
IP 100.122.0.6.5201 > 100.122.0.1.62482: Flags [.], ack 7986, win 447, options [nop,nop,TS val 2525703289 ecr 2078661738], length 0
IP 100.122.0.1.62482 > 100.122.0.6.5201: Flags [.], seq 7986:15934, ack 1, win 8192, options [nop,nop,TS val 2078661738 ecr 2525703283], length 7948
IP 100.122.0.6.5201 > 100.122.0.1.62482: Flags [.], ack 15934, win 447, options [nop,nop,TS val 2525703290 ecr 2078661738], length 0
IP 100.122.0.1.62482 > 100.122.0.6.5201: Flags [.], seq 15934:23882, ack 1, win 8192, options [nop,nop,TS val 2078661738 ecr 2525703283], length 7948
IP 100.122.0.6.5201 > 100.122.0.1.62482: Flags [.], ack 23882, win 447, options [nop,nop,TS val 2525703290 ecr 2078661738], length 0
IP 100.122.0.1.62482 > 100.122.0.6.5201: Flags [.], seq 23882:31830, ack 1, win 8192, options [nop,nop,TS val 2078661738 ecr 2525703283], length 7948

When I connect to gvisor and gvisor accepts the connection, both sides again agree on the MSS, but data sent from the accepted connection goes out in packets with a small payload (1448 bytes). This is NOT OK:

IP 100.122.0.6.29166 > 100.122.0.1.5201: Flags [S], seq 3808301897, win 63680, options [mss 7960,sackOK,TS val 2526138051 ecr 0,nop,wscale 7], length 0
IP 100.122.0.1.5201 > 100.122.0.6.29166: Flags [S.], seq 2152680396, ack 3808301898, win 65535, options [mss 7880,nop,nop,TS val 3883359573 ecr 2526138051], length 0

...

IP 100.122.0.1.5201 > 100.122.0.6.29166: Flags [.], seq 1449:2897, ack 38, win 65535, options [nop,nop,TS val 3883359581 ecr 2526138065], length 1448
IP 100.122.0.6.29166 > 100.122.0.1.5201: Flags [.], ack 2897, win 60784, options [nop,nop,TS val 2526138073 ecr 3883359581], length 0
IP 100.122.0.1.5201 > 100.122.0.6.29166: Flags [.], seq 2897:4345, ack 38, win 65535, options [nop,nop,TS val 3883359581 ecr 2526138065], length 1448
IP 100.122.0.6.29166 > 100.122.0.1.5201: Flags [.], ack 4345, win 59336, options [nop,nop,TS val 2526138073 ecr 3883359581], length 0
IP 100.122.0.1.5201 > 100.122.0.6.29166: Flags [.], seq 4345:5793, ack 38, win 65535, options [nop,nop,TS val 3883359581 ecr 2526138065], length 1448
IP 100.122.0.6.29166 > 100.122.0.1.5201: Flags [.], ack 5793, win 57888, options [nop,nop,TS val 2526138073 ecr 3883359581], length 0
IP 100.122.0.1.5201 > 100.122.0.6.29166: Flags [.], seq 5793:7241, ack 38, win 65535, options [nop,nop,TS val 3883359581 ecr 2526138065], length 1448
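The payload sizes in both traces can be related back to an MSS value. Every data segment carries the TCP timestamp option, which takes 12 bytes (10 bytes of option plus the two nop padding bytes shown in the traces), so the payload is the effective MSS minus 12. The following is a minimal sketch of that arithmetic; payloadSize is a hypothetical helper written for illustration, not a gvisor function.

```go
package main

import "fmt"

// timestampOptionLen is the space the TCP timestamp option occupies in each
// segment: 10 bytes of option plus 2 NOP padding bytes ("nop,nop,TS").
const timestampOptionLen = 12

// payloadSize (hypothetical helper) returns the largest data payload that
// fits in one segment for a given MSS, subtracting the timestamp option
// when timestamps were negotiated on the connection.
func payloadSize(mss int, timestamps bool) int {
	if timestamps {
		return mss - timestampOptionLen
	}
	return mss
}

func main() {
	// Outgoing connection: the peer advertised mss 7960.
	fmt.Println(payloadSize(7960, true))
	// Accepted connection: a 1448-byte payload corresponds to an MSS of 1460.
	fmt.Println(payloadSize(1460, true))
}
```

Read this way, the accepted connection behaves as if the MSS were 1460, even though the peer advertised 7960 in its SYN.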

Solution

I analyzed the gvisor code and found that, when a connection is accepted, the MSS is taken from the SYN cookie rather than from the TCP_MAXSEG socket option:

MSS: mssTable[data],
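To illustrate why a table lookup like this caps the MSS, here is a minimal sketch of how a SYN cookie typically carries the MSS: only a small index fits in the cookie, so the peer's MSS is rounded down to the nearest table entry. The table values below are an assumption, modelled on the common four-entry table used for IPv4 SYN cookies; gvisor's actual mssTable may differ. cookieMSSIndex is a hypothetical name for illustration.

```go
package main

import "fmt"

// mssTable holds the only MSS values the cookie can represent.
// ASSUMPTION: these four values are illustrative and may not match gvisor.
var mssTable = []uint16{536, 1300, 1440, 1460}

// cookieMSSIndex (hypothetical) returns the index of the largest table entry
// that does not exceed the MSS advertised by the peer. If the peer's MSS is
// smaller than every entry, it falls back to index 0.
func cookieMSSIndex(peerMSS uint16) int {
	idx := 0
	for i, v := range mssTable {
		if v <= peerMSS {
			idx = i
		}
	}
	return idx
}

func main() {
	// The peer advertised mss 7960 in its SYN.
	idx := cookieMSSIndex(7960)
	// The value recovered from the cookie is capped by the table's largest entry.
	fmt.Println(idx, mssTable[idx])
}
```

Under this assumption, any advertised MSS of 1460 or more decodes to 1460, which matches the 1448-byte payloads (1460 minus 12 bytes of timestamp option) seen on the accepted connection.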

I changed the code so that the MSS is assigned correctly (from the TCP_MAXSEG socket option):

-			MSS: mssTable[data],
+			MSS: e.userMSS,

After this fix everything works: the TCP payloads are large (7868 bytes, not 1448):

IP 100.122.0.6.45308 > 100.122.0.1.5201: Flags [S], seq 3082736529, win 63680, options [mss 7960,sackOK,TS val 2527105177 ecr 0,nop,wscale 7], length 0
IP 100.122.0.1.5201 > 100.122.0.6.45308: Flags [S.], seq 2341504349, ack 3082736530, win 65535, options [mss 7880,nop,nop,TS val 2737625125 ecr 2527105177], length 0

... 

IP 100.122.0.1.5201 > 100.122.0.6.45308: Flags [.], seq 7869:15737, ack 38, win 65535, options [nop,nop,TS val 2737625141 ecr 2527105209], length 7868
IP 100.122.0.1.5201 > 100.122.0.6.45308: Flags [.], seq 15737:23605, ack 38, win 65535, options [nop,nop,TS val 2737625141 ecr 2527105209], length 7868
IP 100.122.0.1.5201 > 100.122.0.6.45308: Flags [.], seq 23605:31473, ack 38, win 65535, options [nop,nop,TS val 2737625141 ecr 2527105209], length 7868

Steps to reproduce

  • Create a link endpoint and set its MTU to 8000.
  • Create a listening endpoint and set a large MSS on it (for example, 7880).
  • Accept a TCP connection on the listening endpoint, send a data stream from the accepted endpoint, and observe that the TCP payload sizes are too small.

runsc version

https://github.com/google/gvisor/commit/e5774de31657e4b4360a6f8aba12eb0d0934d89d

docker version (if using docker)

No response

uname

Linux 6.3.2 (but this issue does not depend on the OS)

kubectl (if using Kubernetes)

No response

repo state (if built from source)

No response

runsc debug logs (if available)

No response

@amurchick thanks for the investigation and the proposed fix. Do you mind sending a PR for this? And if possible could you add a test for this?

Could you share the code you ran? If I had to guess, the listen call was made with a backlog of 1 or 0, which would trigger the syncookie path. I am reasonably sure that it will work as expected if you specify a larger backlog.

I agree that this is likely the syncookie path. Looking at it now, we might not be handling options like MSS correctly.

A friendly reminder that this issue had no activity for 120 days.