py4j / py4j

Py4J enables Python programs to dynamically access arbitrary Java objects

Home Page: https://www.py4j.org

Performance Issue due to TCP Nagle's Algorithm in the write, write, read Situation

trohwer opened this issue

I am aware that passing byte arrays via py4j is not (supposed to be) very efficient, although some copy operations along the path could be avoided.
Nevertheless, I was initially surprised by the following performance difference (measured with the py4j-0.10.9.5.jar included in PySpark on Linux):

import time

b=spark.sparkContext._jvm.java.nio.ByteBuffer.allocate(4096)
t0=time.time()
for i in range(0,100):
    u=b.array()
print(time.time()-t0)

0.04267597198486328

b=spark.sparkContext._jvm.java.nio.ByteBuffer.allocate(8192)
t0=time.time()
for i in range(0,100):
    u=b.array()
print(time.time()-t0)

4.404087543487549
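The effect can be reproduced without Spark or py4j at all. The following is a hedged, standalone sketch: an echo server on loopback, and a client that sends a payload in two writes and then blocks on a read, which is the same write, write, read pattern described below. All names and sizes here are illustrative, not py4j internals.

```python
import socket
import threading

def serve(server_sock):
    # Minimal echo-style server: acknowledge each received chunk with b"ok".
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(b"ok")

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
threading.Thread(target=serve, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
# The proposed fix: no Nagle delay between the two writes on loopback.
client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
nodelay = client.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)

client.send(b"x" * 8192)  # first write (one buffer's worth)
client.send(b"y" * 2708)  # second write -- delayed by Nagle if enabled
reply = b""
while len(reply) < 2:     # read the 2-byte reply
    chunk = client.recv(2 - len(reply))
    if not chunk:
        break
    reply += chunk
print(reply)
client.close()
```

Timing this loop with and without the setsockopt call shows the same order-of-magnitude gap as the ByteBuffer measurements above.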

It turns out that the code suffers from the Nagle algorithm here, e.g. in CallCommand:

	writer.write(returnCommand);
	writer.flush();

if returnCommand exceeds the buffer of the BufferedWriter, the data reaches the socket in two separate writes; since the peer then immediately waits to read the reply, this is the classic write, write, read pattern in which Nagle's algorithm delays the second write until the first one is acknowledged.
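The two-writes behavior itself is easy to observe. The sketch below uses Python's io.BufferedWriter as an analogue of Java's BufferedWriter (their overflow handling differs in detail, but the effect is the same once the buffer is partially filled); CountingRaw is a hypothetical stand-in for the socket stream that records the size of each underlying write.

```python
import io

class CountingRaw(io.RawIOBase):
    """Fake socket stream that records the size of each raw write."""
    def __init__(self):
        super().__init__()
        self.writes = []

    def writable(self):
        return True

    def write(self, b):
        self.writes.append(len(b))
        return len(b)

raw = CountingRaw()
writer = io.BufferedWriter(raw, buffer_size=8192)
writer.write(b"a" * 5000)  # fits in the buffer: nothing reaches the "socket" yet
writer.write(b"b" * 5000)  # overflow: triggers a first write to the "socket"
writer.flush()             # the rest follows in a second write
print(raw.writes)          # two raw writes totalling 10000 bytes
```

With Nagle enabled, the second of those writes sits in the kernel until the first segment is acknowledged, even on loopback.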

After disabling the Nagle algorithm for loopback sockets by adding the following to ClientServerConnection.java:

super();
this.socket = socket;

// added
if (socket.getLocalAddress().isLoopbackAddress()) socket.setTcpNoDelay(true);

this.reader = new BufferedReader(new InputStreamReader(socket.getInputStream(), Charset.forName("UTF-8")));

I get the following runtime measurements:

0.047772884368896484
0.07696914672851562

I think that, for loopback sockets, disabling the algorithm has no disadvantages, since buffering already occurs in the BufferedWriter. Possibly one could disable it in general.
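For reference, the same guard can be expressed on the Python side of a connection. This is a hypothetical helper over a plain socket, not py4j API: it disables Nagle only when the peer is a loopback address, mirroring the Java patch.

```python
import ipaddress
import socket

def disable_nagle_on_loopback(sock):
    """Set TCP_NODELAY only if the connected peer is a loopback address."""
    host = sock.getpeername()[0]
    if ipaddress.ip_address(host).is_loopback:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Demo on a throwaway loopback connection:
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(srv.getsockname())
conn, _ = srv.accept()
disable_nagle_on_loopback(cli)
nodelay = cli.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(nodelay)  # nonzero: Nagle is off for this loopback connection
```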

See pull request #517 .