Large socket buffers with fifo queueing cause sockets to stall
stevenengler opened this issue
If a host with a slow network has two sockets with large send buffers, one socket can prevent the other socket from sending data for a long period of time.
When the network interface can send a packet with fifo queueing (the default), it chooses the packet with the lowest priority. The packet priority is set when the application calls `send()`, meaning that data from earlier `send()`s will always be sent before data from later `send()`s. This can be a problem if the socket has a large send buffer which gets filled by the application. Any other sockets on the host must wait for that socket to send all of its packets before they can send their own. In extreme cases, this can cause a socket to not send any packets for over 2 minutes. This is an issue with Shadow's implementation of fifo queueing, and it also applies to UDP sockets.
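The starvation effect can be sketched with a toy model (this is illustrative only, not Shadow's actual code): if every buffered chunk is stamped with a monotonically increasing priority at `send()` time, an interface that always transmits the lowest-priority packet next will drain one socket's entire backlog before sending anything from the other.

```python
import heapq
import itertools

counter = itertools.count()  # global priority, assigned at send() time

def app_send(queue, sock_name, n_chunks):
    # Each chunk gets its priority when the application calls send(),
    # not when the interface is ready to transmit it.
    for _ in range(n_chunks):
        heapq.heappush(queue, (next(counter), sock_name))

interface = []                 # shared fifo queue on the host's interface
app_send(interface, "A", 5)    # socket A fills its large send buffer first
app_send(interface, "B", 2)    # socket B writes afterwards

# The interface always dequeues the lowest priority, so all of A's
# packets go out before any of B's.
order = [heapq.heappop(interface)[1] for _ in range(len(interface))]
print(order)  # all "A" chunks precede all "B" chunks
```

With a 100 MiB backlog and a 500 Kbit link, "all of A before any of B" translates into minutes of stall for socket B.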
For example, here we have two sockets on a slow host (up/down bandwidth of 500 Kbit/s). Each socket is given a 100 MiB send buffer, to which we write a total of 10 MiB. One socket will have all of its data prioritized over the other.
`client.py`:

```python
import socket
import sys
import time

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# set send buffer length of 100 MiB
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 100*(1024**2))

s.connect((sys.argv[1], int(sys.argv[2])))

# send 10 MiB
size = 10*(1024**2)
buf = b'1'*size
while len(buf) != 0:
    num = s.send(buf)
    buf = buf[num:]

# try to work around https://github.com/shadow/shadow/issues/3100
time.sleep(10)

s.shutdown(socket.SHUT_WR)
assert b'' == s.recv(1)
s.close()
```
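As an aside, the buffer size the kernel actually applies can be checked with `getsockopt()`. On Linux (outside Shadow), `setsockopt(SO_SNDBUF)` doubles the requested value to allow for bookkeeping overhead and caps it at `net.core.wmem_max`, so the 100 MiB request may not fully take effect natively:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 100 * (1024 ** 2))
# Linux doubles the requested value and clamps it to net.core.wmem_max,
# so this often prints far less than 200 MiB on a default system.
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
s.close()
```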
`server.py`:

```python
import asyncio
import socket
import sys

async def handle_client(client, addr):
    loop = asyncio.get_event_loop()
    while True:
        buf = await loop.sock_recv(client, 1024)
        if len(buf) == 0:
            break
    print("Connection closed:", addr)
    client.close()

async def run_server(port):
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(('0.0.0.0', port))
    server.listen(100)
    server.setblocking(False)
    loop = asyncio.get_event_loop()
    while True:
        client, addr = await loop.sock_accept(server)
        print("New connection:", addr)
        loop.create_task(handle_client(client, addr))

asyncio.run(run_server(int(sys.argv[1])))
```
```yaml
general:
  stop_time: 360s
experimental:
  strace_logging_mode: standard
host_option_defaults:
  pcap_enabled: true
network:
  graph:
    type: gml
    inline: |
      graph [
        node [
          id 0
          host_bandwidth_down "500 Kbit"
          host_bandwidth_up "500 Kbit"
        ]
        edge [
          source 0
          target 0
          latency "50 ms"
        ]
      ]
hosts:
  server:
    network_node_id: 0
    processes:
    - path: /usr/bin/python3
      args: -u ../../../server.py 8080
      start_time: 0s
      expected_final_state: running
  client:
    network_node_id: 0
    processes:
    - path: /usr/bin/python3
      args: -u ../../../client.py server 8080
      start_time: 1s
    - path: /usr/bin/python3
      args: -u ../../../client.py server 8080
      start_time: 1s
```
If we look at the client's pcap, we can see that one of the sockets had a period of ~173 seconds where it didn't send any packets.
It was only able to send packets after the other socket had finished sending all of its packets.
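That gap is roughly the time needed to drain the other socket's 10 MiB backlog over the 500 Kbit/s link (assuming decimal kilobits, as Shadow's bandwidth strings suggest); a back-of-the-envelope check:

```python
# Time for one socket to push its 10 MiB backlog through a 500 Kbit/s
# link, during which the other socket is starved under fifo queueing.
payload_bits = 10 * 1024 ** 2 * 8  # 10 MiB expressed in bits
link_bps = 500 * 1000              # 500 Kbit/s, assuming decimal kilobits
print(payload_bits / link_bps)     # ~167.8 s, close to the observed ~173 s
```

The remaining few seconds are plausibly header and retransmission overhead, so the stall length matches the serialization delay of the competing socket's buffered data.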