wantWrite() race
bturner opened this issue
Consider a situation where NuProcess is managing a process whose input and output are being pumped to and from a remote connection (for example, an HTTP or SSH request) that uses non-blocking I/O.
In such a setup, wantWrite() would be called when the remote client sends some data that should be forwarded to stdin (perhaps triggered by something like Servlet 3.1's ReadListener.onDataAvailable), and then onStdinReady would be called when NuProcess is ready to read and forward those bytes. onStdinReady may get invoked several times if there's a lot of data available, but eventually that data runs out. At that point, onStdinReady will return false (because there's no more data ready for stdin at that time), and the process will wait for another wantWrite() when more data is received.
Unfortunately, the transition between onStdinReady and wantWrite is racy when wantWrite is called from some thread other than the NuProcess pump thread (and it almost has to be, given that the pump thread is blocked in epoll_wait or similar). Specifically, this assignment is problematic:
boolean wantMore = processHandler.onStdinReady(inBuffer);
userWantsWrite.set(wantMore);
If the wantWrite() call occurs "between" onStdinReady returning and the call to userWantsWrite.set(wantMore), wantWrite() sets userWantsWrite to true and then it's immediately set back to false by wantMore. This leads to a hang, because the HTTP/SSH request has signaled that it wants to write and is waiting for data to be moved, but onStdinReady will never be called because the process has accidentally cleared the userWantsWrite flag.
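The lost update can be replayed deterministically on a single thread, which makes the ordering easier to see than a real two-thread race. The sketch below is only an illustration: racyWrite mirrors the two problematic lines above, while clearFirstWrite shows one possible way to close the window (clear the flag before invoking the handler, and afterwards only ever raise it). The class name, both method names, and the window hook are hypothetical, and this is not necessarily how NuProcess itself addresses the problem.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

final class WantWriteRaceDemo {
    // Mirrors the problematic pattern: the handler's result overwrites the flag.
    // 'window' stands in for another thread calling wantWrite() at the worst moment.
    static void racyWrite(AtomicBoolean userWantsWrite,
                          BooleanSupplier onStdinReady,
                          Runnable window) {
        boolean wantMore = onStdinReady.getAsBoolean();
        window.run();                  // concurrent wantWrite() sets the flag to true...
        userWantsWrite.set(wantMore);  // ...and the pump thread overwrites it with false
    }

    // One possible alternative (an assumption, not the library's code):
    // clear first, then only ever set the flag to true after the callback.
    static void clearFirstWrite(AtomicBoolean userWantsWrite,
                                BooleanSupplier onStdinReady,
                                Runnable window) {
        userWantsWrite.set(false);     // consume the pending request up front
        boolean wantMore = onStdinReady.getAsBoolean();
        window.run();                  // concurrent wantWrite() sets the flag to true
        if (wantMore) {
            userWantsWrite.set(true);  // never writes false, so the request survives
        }
    }

    public static void main(String[] args) {
        AtomicBoolean racy = new AtomicBoolean(true);
        racyWrite(racy, () -> false, () -> racy.set(true));

        AtomicBoolean safe = new AtomicBoolean(true);
        clearFirstWrite(safe, () -> false, () -> safe.set(true));

        // prints "racy=false clearFirst=true"
        System.out.println("racy=" + racy.get() + " clearFirst=" + safe.get());
    }
}
```

With the clear-first ordering, a wantWrite() that arrives during or after the callback leaves the flag set, so the worst case is one extra onStdinReady call rather than a hang.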
In Servlet 3.1, ReadListener.onDataAvailable is called exactly once, when the servlet container's internal buffer goes from no data available to some. At that point it won't be called again until ServletInputStream.isReady has been called and returned false, per its contract:
/**
 * When an instance of the <code>ReadListener</code> is registered with a {@link ServletInputStream},
 * this method will be invoked by the container the first time when it is possible
 * to read data. Subsequently the container will invoke this method if and only
 * if {@link javax.servlet.ServletInputStream#isReady()} method
 * has been called and has returned <code>false</code>.
 *
 * @throws IOException if an I/O related error has occurred during processing
 */
public void onDataAvailable() throws IOException;
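To make the "if and only if isReady() has returned false" rule concrete, here is a toy model of that notification contract. It is only a sketch for reasoning about the race: ToyAsyncInput and all of its members are hypothetical names, and it does not reflect how any real servlet container is implemented.

```java
final class ToyAsyncInput {
    private int available;         // bytes currently buffered by the pretend container
    private boolean armed = true;  // true initially, and again once isReady() has returned false
    private int notifications;     // how many times the listener would have been invoked

    // Container side: new bytes arrive from the client.
    synchronized void dataArrived(int bytes) {
        available += bytes;
        if (armed) {               // notify only if the application has re-armed us
            armed = false;
            notifications++;
        }
    }

    // Application side: mirrors the documented isReady() behaviour.
    synchronized boolean isReady() {
        if (available > 0) {
            return true;
        }
        armed = true;              // returning false is what re-arms the notification
        return false;
    }

    // Application side: drain everything that is buffered.
    synchronized void consumeAll() {
        available = 0;
    }

    synchronized int notifications() {
        return notifications;
    }

    public static void main(String[] args) {
        ToyAsyncInput in = new ToyAsyncInput();
        in.dataArrived(4);  // first data: listener notified
        in.dataArrived(4);  // not re-armed: no notification
        in.consumeAll();
        in.dataArrived(2);  // isReady() has not returned false yet: no notification
        in.consumeAll();
        in.isReady();       // buffer is empty, so this returns false and re-arms
        in.dataArrived(1);  // notified again
        System.out.println(in.notifications()); // prints 2
    }
}
```

In this model, a handler that stops reading without ever seeing isReady() return false receives no further notification, which is why the isReady() call has to happen inside onStdinReady.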
The code would look something like the following. In the ReadListener implementation you'd have:
@Override
public void onDataAvailable() {
    process.wantWrite();
}
Then your handler might look like this:
@Override
public boolean onStdinReady(@Nonnull ByteBuffer stdin) {
    try {
        while (stream.isReady() && stdin.hasRemaining()) {
            // read here won't block as long as isReady returns true
            int read = stream.read(buffer, 0, Math.min(buffer.length, stdin.remaining()));
            if (read == -1) {
                // We've hit the end-of-stream
                stdin.flip();
                process.closeStdin(false);
                return false;
            }
            stdin.put(buffer, 0, read);
        }
        stdin.flip();
    } catch (IOException e) {
        log.warn("Failed to copy request input to process", e);
        stdin.flip();
        process.closeStdin(true);
        return false;
    }
    if (stream.isFinished()) {
        // If we've consumed all of the stdin from the request, close stdin after this buffer
        process.closeStdin(false);
        return false; // No more data to read
    }
    // Use isReady to decide whether we want another immediate callback
    return stream.isReady();
}
The isReady call needed to trigger another onDataAvailable() only happens when onStdinReady gets called to actually consume some data. However, after isReady is called in onStdinReady, there's a (very small, certainly, but still present) window of time in which the servlet container could call onDataAvailable and signal wantWrite() before the NuProcess thread has had a chance to assign the false we returned.
This is fixed in NuProcess 2.0.1.