Don't inspect stream._readableState.ended

Question

Don't inspect stream._readableState.ended

isaacs opened this issue 7 years ago · comments

The onclose handler is coupled to node core's stream implementation details, which creates some unfortunate problems for userland streams.

var onclose = function() {
  if (readable && !(rs && rs.ended)) return callback.call(stream, new Error('premature close'));
  if (writable && !(ws && ws.ended)) return callback.call(stream, new Error('premature close'));
};

If you have a readable stream that doesn't have a _readableState member (or writable/_writableState), then this always raises a premature close error.

While it was straightforward to work around in my case, it was surprising and unnecessarily tricky to debug.

I suggest either of the following approaches:

If a stream doesn't have a readable/writable state, don't add the "premature close" check at all. Then the line becomes:

if (readable && rs && !rs.ended) return callback.call(stream, new Error('premature close'));
if (writable && ws && !ws.ended) return callback.call(stream, new Error('premature close'));

If a stream doesn't hae a readable/writable state, throw a TypeError at the outset so that people don't attempt to use eos with Streams that don't derive from the core stream classes.

Mathias Buus · Answer 1 · Sun Apr 30 2017 13:15:12 GMT+0800 (China Standard Time)

Custom streams are supported. A premature close event is only raised if end, finish was not emitted before close for streams that does not have a _readableState object. Is that the case for your stream?

isaacs · Answer 2 · Sun Apr 30 2017 16:57:11 GMT+0800 (China Standard Time)

It's not unusual for close events to come before end though. The data might be filling up a buffer, because the user isn't reading from it fast enough.

Eos is detecting different things for custom streams than it is for core class streams.

For core streams, it's detecting if end() has been called on the writable, or an end event queued for a readable.

For custom (streams without the _typeState), it's detecting if the 'end' event or 'finish' event has happened, which is a much more restrictive check.

isaacs · Answer 3 · Wed Aug 16 2017 01:30:53 GMT+0800 (China Standard Time)

Hey, I just had to add yet another one of these hacks to put off emitting close until I can be sure that it comes after end and finish. It requires that my modules have to lie about what's going on, or else mississippi and eos will throw errors, making these events effectively pointless.

end means "you've consumed all the data". finish means "all of the data has been written". close means "an underlying resource has been released". This module should not know or care about close events, ever. Here's why:

It's completely reasonable to read a file into a buffer, but not consume it yet. I can't emit end, because the data hasn't been emitted. But the file descriptor has been closed, so close is the right thing to emit. Because of this module and the popularity of mississippi and eos, I have to defer the close event until after the end event, even if it actually happens "prematurely".

I'm skeptical that "premature close" is even a thing to be worried about. close is application-specific, it's just a way for a stream to say "I've released whatever underlying resource was being held", but the semantics and timing around that are completely arbitrary. Emitting close before end or finish is not universally indicative of a problem, and in fact, is quite common in many custom stream's normal operation.

What makes this so frustrating to me is that eos allows this exact semantic for built-in streams, but not for custom streams that lack a _readableState or _writableState property (which should be private anyway).

Would you be open to removing this check? Or perhaps make it opt-in? I'd be happy to send a PR with tests and documentation.

Mathias Buus · Answer 4 · Thu Aug 17 2017 04:34:19 GMT+0800 (China Standard Time)

There are many cases with node streams where close is the only event emitted in error cases. end-of-stream is a pragmatic module that tries to detect all errors and never not fire the callback.

Consider the following example

var fs = require('fs')
var stream = fs.createReadStream('some-big-file')

stream.on('close', () => console.log('onclose'))
stream.on('end', () => console.log('onend'))

setTimeout(function () {
  stream.destroy() // something destroys the stream
}, 1000)

Here close is the only event fired and the only way to know the stream isn't fully buffered internally is to check the .ended property. It needs to support cases like that.

I'm happy to accept a PR that makes it easier to support non core streams though as long as we don't break things like above :)

Mathias Buus · Answer 5 · Thu Aug 17 2017 04:37:06 GMT+0800 (China Standard Time)

I 100% agree that the close semantics can be confusing to implement (why i always use 3rd party modules that hide all this) but that's how the core streams work, re error handling.

isaacs · Answer 6 · Thu Aug 17 2017 04:48:19 GMT+0800 (China Standard Time)

I think the correct answer, then, would be to ignore "premature close" for any streams that aren't core streams, since there's no reliable way to know if the close is in fact premature. For core streams, with a _readableState and _writableState, you have destroyed and ended fields to know whether the close was premature or not.

Throwing an error in cases where there's nothing wrong is very disruptive. I get that you want to ensure, as much as possible, that the cb will fire in every case, but what's to stop a stream from not emitting close or end? For third-party streams, the cure is worse than the disease, and also not a cure.

Mathias Buus · Answer 7 · Thu Aug 17 2017 04:54:57 GMT+0800 (China Standard Time)

Nothing is stopping a stream from never emitting close or end no but that is the pragmatic part. All core streams emit close when destroyed and it is in the base classes as well.

As long as we don't break support for popular modules like request I'm fine changing the behaviour for non core streams (pr welcome).

I'm curious, how do you indicate a stream was destroyed in your stream impl?

isaacs · Answer 8 · Thu Aug 17 2017 05:07:12 GMT+0800 (China Standard Time)

MiniPass doesn't have a destroy method, and I don't often find occasion to use it. Imo, it was a mistake to put into fs streams in the first place, since they already have end and close methods to do whatever you need to, but 20/20 hindsight and all. In most cases where I'd be forcibly destroying stuff, I probably want to emit an error instead.

Lovell Fuller · Answer 9 · Tue Sep 24 2019 04:18:15 GMT+0800 (China Standard Time)

Am I right in thinking commit isaacs/minipass@40e1d61 implements the proposed removal of ended and therefore now breaks the ability for this module to detect the end of a minipass stream?

isaacs · Answer 10 · Tue Sep 24 2019 15:20:04 GMT+0800 (China Standard Time)

@lovell Yeah, that broke a bunch of stuff that was already using ended to mean something different. Sometimes a semver minor change turns out to be breaking, unfortunately.

Minipass streams have an emittedEnd getter that means the same thing, anyway. Just less pretty than "ended", imo. But also, if you add an end event handler on a Minipass stream, and it's already ended, it'll re-emit immediately, so it'll be like it just ended. Kind of like how a resolved promise automatically calls the cb you pass to .then().

isaacs · Answer 11 · Tue Sep 24 2019 15:20:30 GMT+0800 (China Standard Time)

Also, Minipass streams have destroy now.