py4j / py4j

Py4J enables Python programs to dynamically access arbitrary Java objects

Home Page: https://www.py4j.org

Performance Issue due to TCP Nagle's Algorithm in the write, write, read Situation

trohwer opened this issue

I am aware that passing byte arrays via py4j is not (supposed to be) very efficient, although some copy operations along the path could be avoided.
Nevertheless, I was initially surprised by the following performance difference (measured with the py4j-0.10.9.5.jar included in PySpark on Linux):

import time

b=spark.sparkContext._jvm.java.nio.ByteBuffer.allocate(4096)
t0=time.time()
for i in range(0,100):
    u=b.array()
print(time.time()-t0)

0.04267597198486328

b=spark.sparkContext._jvm.java.nio.ByteBuffer.allocate(8192)
t0=time.time()
for i in range(0,100):
    u=b.array()
print(time.time()-t0)

4.404087543487549
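The effect can be reproduced without Spark or py4j at all. The following is a hedged, standalone sketch: an echo server on loopback, and a client that sends a payload in two writes and then blocks on a read, which is the same write, write, read pattern described below. All names and sizes here are illustrative, not py4j internals.

```python
import socket
import threading

def serve(server_sock):
    # Minimal echo-style server: acknowledge each received chunk with b"ok".
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(b"ok")

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
threading.Thread(target=serve, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
# The proposed fix: no Nagle delay between the two writes on loopback.
client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
nodelay = client.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)

client.send(b"x" * 8192)  # first write (one buffer's worth)
client.send(b"y" * 2708)  # second write -- delayed by Nagle if enabled
reply = b""
while len(reply) < 2:     # read the 2-byte reply
    chunk = client.recv(2 - len(reply))
    if not chunk:
        break
    reply += chunk
print(reply)
client.close()
```

Timing this loop with and without the setsockopt call shows the same order-of-magnitude gap as the ByteBuffer measurements above.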

It turns out that the code suffers from the Nagle algorithm here, e.g. in CallCommand:

	writer.write(returnCommand);
	writer.flush();

if returnCommand exceeds the buffer of the BufferedWriter, the data reaches the socket in two separate writes; since the peer then immediately waits to read the reply, this is the classic write, write, read pattern in which Nagle's algorithm delays the second write until the first one is acknowledged.
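The two-writes behavior itself is easy to observe. The sketch below uses Python's io.BufferedWriter as an analogue of Java's BufferedWriter (their overflow handling differs in detail, but the effect is the same once the buffer is partially filled); CountingRaw is a hypothetical stand-in for the socket stream that records the size of each underlying write.

```python
import io

class CountingRaw(io.RawIOBase):
    """Fake socket stream that records the size of each raw write."""
    def __init__(self):
        super().__init__()
        self.writes = []

    def writable(self):
        return True

    def write(self, b):
        self.writes.append(len(b))
        return len(b)

raw = CountingRaw()
writer = io.BufferedWriter(raw, buffer_size=8192)
writer.write(b"a" * 5000)  # fits in the buffer: nothing reaches the "socket" yet
writer.write(b"b" * 5000)  # overflow: triggers a first write to the "socket"
writer.flush()             # the rest follows in a second write
print(raw.writes)          # two raw writes totalling 10000 bytes
```

With Nagle enabled, the second of those writes sits in the kernel until the first segment is acknowledged, even on loopback.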

After disabling the Nagle algorithm for loopback sockets by adding the following to ClientServerConnection.java:

super();
this.socket = socket;

// added
if (socket.getLocalAddress().isLoopbackAddress()) socket.setTcpNoDelay(true);

this.reader = new BufferedReader(new InputStreamReader(socket.getInputStream(), Charset.forName("UTF-8")));

I get the following runtime measurements:

0.047772884368896484
0.07696914672851562

I think that, for loopback sockets, disabling the algorithm has no disadvantages, since buffering already occurs in the BufferedWriter. Possibly one could disable it in general.
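For reference, the same guard can be expressed on the Python side of a connection. This is a hypothetical helper over a plain socket, not py4j API: it disables Nagle only when the peer is a loopback address, mirroring the Java patch.

```python
import ipaddress
import socket

def disable_nagle_on_loopback(sock):
    """Set TCP_NODELAY only if the connected peer is a loopback address."""
    host = sock.getpeername()[0]
    if ipaddress.ip_address(host).is_loopback:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Demo on a throwaway loopback connection:
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(srv.getsockname())
conn, _ = srv.accept()
disable_nagle_on_loopback(cli)
nodelay = cli.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(nodelay)  # nonzero: Nagle is off for this loopback connection
```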

See pull request #517 .