Performance issues when running on Linux
iBelieve opened this issue
I've been using rust-websocket for a while now and have been happy with it. However, I noticed it doesn't seem to perform as well as I would expect. I put together some tests with a simple echo server and a client that sends a message and times how long it takes to get it echoed back. With the server and client both running locally, rust-websocket performs fine on macOS, but on Linux I'm seeing roughly 80ms of extra round-trip delay over a localhost connection. My test programs and test results are here: https://github.com/lelander/websocket-tests
Testing in my production environment using armv7 devices on both ends, rust-websocket seems to have a round-trip time of around 340ms, while a client/server echo test using ws-rs only takes 90ms.
Server | Client | Round-trip time |
---|---|---|
ws | ws | 90ms |
websocket | ws | 211ms |
ws | websocket | 220ms |
websocket | websocket | 341ms |
Do I not have rust-websocket configured correctly, or am I doing something obviously wrong? I'm not sure why I'm seeing such significant performance differences with this library compared with ws-rs and the C++ and Python websocket libraries I've tried.
The main problem is Nagle's algorithm:
$ target/release/server
...
$ target/release/client
Connected to websocket server
Round-trip time: 100446us
$ LD_PRELOAD=/mnt/src/git/libnodelay/libnodelay.so target/release/server
...
$ LD_PRELOAD=/mnt/src/git/libnodelay/libnodelay.so target/release/client
Connected to websocket server
Round-trip time: 151us
A secondary problem is the lack of proper buffering: a single message is sent piece by piece, using multiple syscalls:
sendto(3, "\201", 1, MSG_NOSIGNAL, NULL, 0) = 1
sendto(3, "\215", 1, MSG_NOSIGNAL, NULL, 0) = 1
sendto(3, "(1\263\334", 4, MSG_NOSIGNAL, NULL, 0) = 4
sendto(3, "\31\4\200\344\30\0\202\350\35\6\212\355\35", 13, MSG_NOSIGNAL, NULL, 0) = 13
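One way to avoid the split writes is to assemble the whole frame in memory and hand it to the socket in one call. The following is only a minimal sketch using the standard library, not rust-websocket's actual code: `encode_small_text_frame`, `send_frame` and `CountingWriter` are hypothetical names, and only payloads shorter than 126 bytes are handled. The mask key is the one visible in the trace above, and the payload is what the traced bytes unmask to with that key.

```rust
use std::io::{self, Write};

/// Encode a masked (client-to-server) text frame into one contiguous buffer.
/// Sketch only: handles payloads shorter than 126 bytes.
fn encode_small_text_frame(payload: &[u8], mask: [u8; 4]) -> Vec<u8> {
    assert!(payload.len() < 126, "sketch handles short payloads only");
    let mut frame = Vec::with_capacity(2 + 4 + payload.len());
    frame.push(0x81); // FIN bit + text opcode ("\201" in the trace)
    frame.push(0x80 | payload.len() as u8); // MASK bit + 7-bit length ("\215")
    frame.extend_from_slice(&mask); // 4-byte masking key
    // XOR each payload byte with the masking key, cycling through the key.
    frame.extend(payload.iter().enumerate().map(|(i, b)| b ^ mask[i % 4]));
    frame
}

/// Hypothetical helper: wraps a writer and counts calls to `write`,
/// standing in for counting syscalls on a real socket.
struct CountingWriter<W> {
    inner: W,
    writes: usize,
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        self.inner.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Build the frame first, then pass it to the writer in a single call.
fn send_frame<W: Write>(out: &mut W, payload: &[u8], mask: [u8; 4]) -> io::Result<()> {
    let frame = encode_small_text_frame(payload, mask);
    out.write_all(&frame)
}

fn main() -> io::Result<()> {
    let mut out = CountingWriter { inner: Vec::new(), writes: 0 };
    send_frame(&mut out, b"1538011457915", [0x28, 0x31, 0xb3, 0xdc])?;
    println!("{} bytes in {} write call(s)", out.inner.len(), out.writes);
    Ok(())
}
```

Wrapping the stream in `std::io::BufWriter` and flushing once per message would have a similar effect without changing the encoder.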
Note: using websocat, which is based on rust-websocket, as the server is also fast:
$ websocat -Et ws-l:127.0.0.1:3000 mirror:
...
$ LD_PRELOAD=/mnt/src/git/libnodelay/libnodelay.so target/release/client
Connected to websocket server
Round-trip time: 140us
Maybe add the async version of rust-websocket to the list?
Setting TCP_NODELAY using libnodelay or client.set_nodelay(true) did the trick and brought round-trip times down to around the same numbers that I was seeing from the other libraries I've tested. Thanks for the assistance!
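For anyone who wants to reproduce the effect without any WebSocket library, here is a rough sketch of the same idea on a plain `std::net::TcpStream`, which exposes `set_nodelay` and `nodelay` directly. The helper names (`spawn_echo_server`, `connect_nodelay`, `round_trip`) are made up for this example, and the echo server is a bare TCP echo, not the test programs from the linked repository.

```rust
use std::io::{self, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Start a single-connection TCP echo server on an ephemeral localhost port.
fn spawn_echo_server() -> io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    thread::spawn(move || {
        if let Ok((mut conn, _)) = listener.accept() {
            // Disable Nagle's algorithm on the server side as well.
            let _ = conn.set_nodelay(true);
            let mut buf = [0u8; 1024];
            while let Ok(n) = conn.read(&mut buf) {
                if n == 0 || conn.write_all(&buf[..n]).is_err() {
                    break;
                }
            }
        }
    });
    Ok(addr)
}

/// Connect and set TCP_NODELAY so small writes are sent immediately.
fn connect_nodelay(addr: SocketAddr) -> io::Result<TcpStream> {
    let stream = TcpStream::connect(addr)?;
    stream.set_nodelay(true)?;
    Ok(stream)
}

/// Send `msg`, wait for the same number of bytes back, and time the exchange.
fn round_trip(stream: &mut TcpStream, msg: &[u8]) -> io::Result<(Vec<u8>, Duration)> {
    let mut reply = vec![0u8; msg.len()];
    let start = Instant::now();
    stream.write_all(msg)?;
    stream.read_exact(&mut reply)?;
    Ok((reply, start.elapsed()))
}

fn main() -> io::Result<()> {
    let addr = spawn_echo_server()?;
    let mut stream = connect_nodelay(addr)?;
    let (_reply, rtt) = round_trip(&mut stream, b"ping")?;
    println!("nodelay={} round-trip time: {}us", stream.nodelay()?, rtt.as_micros());
    Ok(())
}
```

With rust-websocket itself, the equivalent step is the `client.set_nodelay(true)` call mentioned above, made once right after connecting.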
Should https://github.com/lelander/websocket-tests/blob/master/README.md be updated?