Account for endpoints being delivered asynchronously when calculating accept queue size
arthurpi opened this issue
Description
Currently, after the listening endpoint receives the final ACK of the handshake, it moves the new ESTABLISHED connection into the accept queue asynchronously. [1]
This means that if another client sends a SYN before that asynchronous operation completes (but after a previous call to connect(2) has returned), the accept queue does not yet count the first connection, and it looks as if the server accepted both connections.
[1] gvisor/pkg/tcpip/transport/tcp/accept.go, line 771 in 2aeab25
In comparison, Linux does this synchronously and the same behavior can't be reproduced:
https://github.com/torvalds/linux/blob/a9c9a6f741cdaa2fa9ba24a790db8d07295761e3/net/ipv4/tcp_minisocks.c#L570
-> tcp_check_req() (server processes the ACK from the handshake)
https://github.com/torvalds/linux/blob/a9c9a6f741cdaa2fa9ba24a790db8d07295761e3/net/ipv4/tcp_minisocks.c#L786
-> inet_csk_complete_hashdance()
https://github.com/torvalds/linux/blob/a9c9a6f741cdaa2fa9ba24a790db8d07295761e3/net/ipv4/inet_connection_sock.c#L1143
-> inet_csk_reqsk_queue_add()
https://github.com/torvalds/linux/blob/a9c9a6f741cdaa2fa9ba24a790db8d07295761e3/include/net/sock.h#L936
-> sk_acceptq_added() (increment accept queue count)
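To make the difference concrete, here is a minimal C model of the two accounting schemes. It is only a sketch: `listener_model`, `in_flight`, and every function name are hypothetical and taken from neither gVisor nor Linux. It also assumes the queue counts as full once it holds more than `backlog` entries, so that `listen(fd, 0)` still admits one connection.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a listening endpoint's accept-queue accounting. */
struct listener_model {
    int backlog;   /* value passed to listen(2) */
    int queued;    /* connections already placed in the accept queue */
    int in_flight; /* handshakes completed, delivery to the queue still pending */
};

/* The final ACK was processed; delivery to the accept queue happens later. */
static void handshake_completed(struct listener_model *l) {
    l->in_flight++;
}

/* The asynchronous delivery step: the connection reaches the accept queue. */
static void deliver(struct listener_model *l) {
    assert(l->in_flight > 0);
    l->in_flight--;
    l->queued++;
}

/* Check that only sees delivered connections (the behavior described above). */
static bool full_ignoring_in_flight(const struct listener_model *l) {
    return l->queued > l->backlog;
}

/* Check that also counts connections whose delivery is still pending. */
static bool full_counting_in_flight(const struct listener_model *l) {
    return l->queued + l->in_flight > l->backlog;
}

/* Replays the scenario from this issue with backlog 0: client A finishes its
 * handshake, then client B's SYN arrives before A is delivered to the queue.
 * Returns true if B's SYN would be admitted. */
bool second_syn_admitted(bool count_in_flight) {
    struct listener_model l = { .backlog = 0, .queued = 0, .in_flight = 0 };

    handshake_completed(&l); /* A: final ACK processed, delivery pending */

    /* B's SYN is checked against the queue before deliver() has run. */
    bool full = count_in_flight ? full_counting_in_flight(&l)
                                : full_ignoring_in_flight(&l);

    deliver(&l); /* A finally reaches the accept queue */
    return !full;
}
```

In this model, `second_syn_admitted(false)` admits the second SYN, while `second_syn_admitted(true)` rejects it, which is the outcome the synchronous Linux path produces.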
Steps to reproduce
Write a simple test with:
listen(listener, 0);
connect(sock_a, ...); // always succeeds
connect(sock_b, ...); // may succeed (even though server will drop the final ACK)
In rare cases, the second call to connect will succeed, because the SYN from that connect is processed before the server has updated the accept queue.
(note: the server will drop the ACK from the second connect, so the behavior is not all that wrong)
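The steps above could be fleshed out along the following lines. This is a sketch, not the original test: the loopback address, the ephemeral port, the 500 ms timeout, and the helper names `connect_with_timeout` and `run_repro_once` are all assumptions. The second connect is made non-blocking so that a dropped SYN does not stall the run.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Starts a non-blocking connect and waits up to timeout_ms for it to finish.
 * Returns 1 if the connection was established, 0 otherwise. */
static int connect_with_timeout(int fd, const struct sockaddr_in *addr,
                                int timeout_ms) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) return 0;
    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0) return 1;
    if (errno != EINPROGRESS) return 0;

    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    if (poll(&pfd, 1, timeout_ms) <= 0) return 0; /* timed out or poll error */

    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) return 0;
    return err == 0;
}

/* Runs one attempt. Returns 0 if the sockets could be set up, -1 otherwise.
 * *a_ok and *b_ok report whether each connect succeeded. */
int run_repro_once(int *a_ok, int *b_ok) {
    assert(a_ok != NULL && b_ok != NULL);
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int rc = -1;
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int sock_a = socket(AF_INET, SOCK_STREAM, 0);
    int sock_b = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0 || sock_a < 0 || sock_b < 0) goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */
    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    if (listen(listener, 0) < 0) goto out;
    if (getsockname(listener, (struct sockaddr *)&addr, &len) < 0) goto out;

    /* The server never calls accept(), so sock_a stays in the accept queue. */
    *a_ok = connect(sock_a, (struct sockaddr *)&addr, sizeof(addr)) == 0;
    *b_ok = connect_with_timeout(sock_b, &addr, 500);
    rc = 0;
out:
    if (sock_b >= 0) close(sock_b);
    if (sock_a >= 0) close(sock_a);
    if (listener >= 0) close(listener);
    return rc;
}
```

Because the race is rare, `run_repro_once` would need to be called many times in a loop, counting the runs in which `*b_ok` is set.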
runsc version
No response
docker version (if using docker)
No response
uname
No response
kubectl (if using Kubernetes)
No response
repo state (if built from source)
No response
runsc debug logs (if available)
No response