brettwooldridge / NuProcess

Low-overhead, non-blocking I/O, external Process implementation for Java

stdout/err buffer overflow behaviour

huntc opened this issue

I've noticed that on POSIX, if the stdout buffer becomes full, a RuntimeException("stdout buffer has no bytes remaining") is thrown (https://github.com/brettwooldridge/NuProcess/blob/master/src/main/java/com/zaxxer/nuprocess/internal/BasePosixProcess.java#L483). This exception is caught at https://github.com/brettwooldridge/NuProcess/blob/master/src/main/java/com/zaxxer/nuprocess/internal/BaseEventProcessor.java#L83, whereupon the process is declared stopped.

My understanding from the API doc is that it is fine for a stdout handler to return the buffer that it was passed in the event that it is not in a position to consume it and that the stdout handler would then be called at a later time. My expectation then was that the upstream process would block until it could write more to stdout.

I was hoping that you could confirm the actual design. It seems that stdout handlers must always consume what they are given; otherwise the process ends up being reported as stopped.

The same behaviour applies to stderr. Thanks.

Note that #53 is related, although that focuses more on a solution. This issue is about behaviour that is unexpected given the existing API doc.

@huntc Currently, the handler must consume all data. There is no back pressure. There is work to implement back pressure on the streams branch, but AFAIR it is unfinished.

> My understanding from the API doc is that it is fine for a stdout handler to return the buffer that it was passed in the event that it is not in a position to consume it and that the stdout handler would then be called at a later time.

I can see how one might read that into the NuProcessHandler Javadocs, but that's not how it works. You can return from the callback with some data still in the buffer, but you can't return with the buffer full. Consider a split multibyte UTF-8 character, for example. If you get the first 2 bytes of a 4 byte character, you might leave those 2 bytes in the buffer for the next callback. That sort of leftover data leaves plenty of room in the buffer for new data, though.
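To make the leftover-bytes case concrete, here is a minimal standalone sketch. It is not NuProcess code: the class name Utf8LeftoverDemo and the simulate helper are made up for illustration, and the flip/callback/compact loop is only an assumed model of how the buffer is handed to the handler. The callback decodes whatever complete UTF-8 characters are present and leaves the bytes of a split character in the buffer for the next call.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

final class Utf8LeftoverDemo {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    private final StringBuilder text = new StringBuilder();

    // Same shape as an onStdout callback, but standalone (no NuProcess types).
    void onStdout(ByteBuffer buffer, boolean closed) {
        // UTF-8 never yields more chars than bytes, so this is always big enough.
        CharBuffer chars = CharBuffer.allocate(buffer.remaining());
        // Decodes every complete character; the bytes of an incomplete trailing
        // character stay unread in 'buffer' (position < limit afterwards).
        decoder.decode(buffer, chars, closed);
        chars.flip();
        text.append(chars);
    }

    String text() {
        return text.toString();
    }

    // Assumed model of the caller: append new output after any leftover bytes,
    // invoke the callback, then compact so unread bytes move to the front.
    static String simulate(byte[][] chunks) {
        Utf8LeftoverDemo handler = new Utf8LeftoverDemo();
        ByteBuffer buffer = ByteBuffer.allocate(16);
        for (byte[] chunk : chunks) {
            buffer.put(chunk);
            buffer.flip();
            handler.onStdout(buffer, false);
            buffer.compact();
        }
        return handler.text();
    }

    public static void main(String[] args) {
        // 'a', then U+1F600 (F0 9F 98 80) split across two chunks, then 'b'.
        byte[][] chunks = {
            {'a', (byte) 0xF0, (byte) 0x9F},
            {(byte) 0x98, (byte) 0x80, 'b'}
        };
        System.out.println(simulate(chunks));
    }
}
```

After the first callback only two bytes remain in the 16-byte buffer, so there is plenty of room for the next read.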

The problem with "would then be called at a later time" is: When would that "later time" be, and how would NuProcess know when it is? Currently it's listening for available output, and when there is some it calls back. If the handler doesn't consume that output, though, and the buffer fills up, then NuProcess:

  • Can't listen for more output, because it'd end up in a busy loop (we know there's data available already)
  • Can't buffer any more data, because there's nowhere to put it
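The "nowhere to put it" case can be modelled with a plain ByteBuffer. This is a simplified sketch under stated assumptions, not the actual BasePosixProcess logic: FullBufferDemo, its Handler interface and the pump method are hypothetical names, and the exception is an IllegalStateException here rather than the RuntimeException quoted in the issue. A handler that leaves a few bytes behind keeps working indefinitely, while one that consumes nothing fills the buffer within a couple of rounds.

```java
import java.nio.ByteBuffer;

final class FullBufferDemo {
    /** Hypothetical stand-in for a stdout callback. */
    interface Handler {
        void onStdout(ByteBuffer buffer);
    }

    /**
     * Simulates 'rounds' reads of up to 'chunkSize' bytes into a buffer of
     * 'capacity' bytes. Returns the number of unconsumed bytes at the end,
     * or throws once there is no room left for new output.
     */
    static int pump(Handler handler, int capacity, int chunkSize, int rounds) {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        for (int round = 1; round <= rounds; round++) {
            if (!buffer.hasRemaining()) {
                throw new IllegalStateException(
                    "stdout buffer has no bytes remaining (round " + round + ")");
            }
            int n = Math.min(chunkSize, buffer.remaining());
            buffer.put(new byte[n]);   // pretend the child wrote n bytes
            buffer.flip();
            handler.onStdout(buffer);  // the handler may consume all, some or none
            buffer.compact();          // unread bytes move to the front
        }
        return buffer.position();
    }

    public static void main(String[] args) {
        // Leaves at most 2 bytes behind each time: runs for all 100 rounds.
        int leftover = pump(b -> b.position(Math.max(b.limit() - 2, 0)), 8, 4, 100);
        System.out.println("leftover bytes: " + leftover);

        // Consumes nothing: the buffer fills up and the pump gives up.
        try {
            pump(b -> { }, 8, 4, 100);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```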

NuProcess would need to grow something similar to the existing wantWrite, which is used to register (or re-register) an interest in a stdin window: if your onStdout or onStderr returned with the buffer full, NuProcess would essentially conclude you're not interested in that stream right now and wait for you to signal that you are before triggering any more callbacks.

The streams branch starts down one possible approach for this, but it requires a pretty comprehensive rewrite of NuProcess's public API (which would mean a 3.0 major release).

This may be something I, or some colleagues on Bitbucket Server, end up circling back to, though, because eventually we'd very much like to have some backpressure in NuProcess. Currently we implement it "around" NuProcess by using the NuProcessBuilder.run() introduced in 2.0.0 to pump processes directly on a calling thread, rather than on the fixed background pool used by NuProcessBuilder.start(). That allows us to safely block in NuProcessHandler callbacks like onStderr and onStdout, effectively applying backpressure. That's how, for example, we've integrated NuProcess with gRPC, Apache SSHD and Tomcat.
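The blocking-callback idea can be sketched without NuProcess on the classpath. Everything below is illustrative: BlockingHandlerDemo and its run method are hypothetical, the loop in run only stands in for the thread that would pump the process, and the bounded queue stands in for whatever slow downstream (a socket, a gRPC stream) is being fed. The point is that the callback always drains the buffer, then blocks until the downstream has room, so the pumping thread cannot read ahead.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BlockingHandlerDemo {
    private final BlockingQueue<byte[]> queue;

    BlockingHandlerDemo(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Modelled on an onStdout callback: consume everything, then block.
    void onStdout(ByteBuffer buffer, boolean closed) {
        byte[] chunk = new byte[buffer.remaining()];
        buffer.get(chunk);              // the buffer is fully drained
        try {
            queue.put(chunk);           // blocks while the downstream is full
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while applying backpressure", e);
        }
    }

    // Stand-in for the pumping thread plus a deliberately slow consumer.
    static List<Integer> run(int chunks, int capacity) {
        BlockingHandlerDemo handler = new BlockingHandlerDemo(capacity);
        List<Integer> received = new ArrayList<>();
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < chunks; i++) {
                    received.add((int) handler.queue.take()[0]);
                    Thread.sleep(1);    // slower than the producer
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        ByteBuffer buffer = ByteBuffer.allocate(4);
        for (int i = 0; i < chunks; i++) {
            buffer.put((byte) i);       // pretend the child wrote one byte
            buffer.flip();
            handler.onStdout(buffer, false);
            buffer.compact();
        }
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run(20, 2));
    }
}
```

Blocking like this is only safe when the callback runs on a thread dedicated to that one process, which is what the comment above describes with run(); on a shared background pool it would stall every other process served by the same thread.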