stream-utils / raw-body

Get and validate the raw body of a readable stream

Flowing streams

dougwilson opened this issue · comments

So the question arises from #23 of whether or not this module should accept a "flowing stream". Node.js core defines a flowing stream as one that has been piped. Right now this will work as long as there are no errors; otherwise it will explode.

I'm leaning toward saying we should reject flowing streams.

Example of passing a flowing stream:

req.pipe(my_stream) // makes it "flowing"
getRawBody(req, 'utf8', function (err, body) {
})
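
If flowing streams were rejected, the check might look something like the sketch below. It is only an illustration: `isFlowing` and `rejectIfFlowing` are hypothetical helper names, not part of raw-body, and it assumes a Node.js version recent enough to expose the public `readableFlowing` property. Note that this property is also `true` for streams switched to flowing mode by `.resume()` or a `'data'` listener, not only by `.pipe()`.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: true when the stream is already in flowing mode.
// Assumes `readableFlowing` exists (newer Node.js versions only).
function isFlowing(stream) {
  return stream.readableFlowing === true;
}

// Hypothetical guard a body reader could run before attaching listeners.
// Reports the problem through the callback and returns true if it rejected.
function rejectIfFlowing(stream, callback) {
  if (isFlowing(stream)) {
    callback(new Error('stream is already flowing'));
    return true;
  }
  return false;
}

const req = new PassThrough();   // stands in for an incoming request
const before = isFlowing(req);   // checked before any .pipe() call
req.pipe(new PassThrough());     // makes it "flowing"

let rejection = null;
const rejected = rejectIfFlowing(req, (err) => { rejection = err; });
```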

It seems this issue has a lot of gotchas, including the fact that this only occurs on Node.js 0.10+, and only if the stream was not put into old mode before .pipe() was called. If .pipe() is called after the stream is put into old mode, then .pause() is basically ignored. Thanks so much, Node.js.

not much you can do if they pipe afterwards. but i think it should be if stream has pipes then unpipe everything. then pause. not much else we can do

I feel like if we unpipe stuff, it will be a really big surprise to people, but idk. I can only think of a couple of realistic reasons people would pipe: to calculate a hash while it's read and to write the data out to a log. The first case would make sense to unpipe, but not really the second case. The second case seems dubious, though. I think we should just go with unpipe everything and pause.
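
For reference, "unpipe everything and pause" could be sketched roughly as follows on Node.js 0.10-style streams. The `detachAndPause` helper and the two sink streams are hypothetical stand-ins for illustration, not anything raw-body provides; calling `.unpipe()` with no argument detaches every destination.

```javascript
const { PassThrough } = require('stream');

const source = new PassThrough();   // stands in for the request stream
const hashSink = new PassThrough(); // e.g. a hash calculation
const logSink = new PassThrough();  // e.g. a log writer

// Count how many destinations get detached.
let unpipeCount = 0;
hashSink.on('unpipe', () => { unpipeCount += 1; });
logSink.on('unpipe', () => { unpipeCount += 1; });

source.pipe(hashSink);
source.pipe(logSink);

// Hypothetical helper (not raw-body API): detach all destinations, then pause.
function detachAndPause(stream) {
  stream.unpipe(); // no argument: removes every piped destination
  stream.pause();  // make sure the stream is no longer flowing
  return stream;
}

detachAndPause(source);
```

This also shows why it might surprise people: both the hash and the log destination silently stop receiving data.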

Really with request-body lib, this module ends up being a little too weird, pulling from the stream, don't you think? If this lib just returned a writable stream, then it wouldn't have to even muck around with unpiping and pausing, etc. Could be something like

var body = rawBody({length: 43, limit: 6000})
req.pipe(body)
body.on('error', function(err) { ... })
body.on('data', function (data) {
  // entire body!
})

Hmmm maybe no unpiping. I haven't seen a valid use case for using pipes with this lib though.

I had a version with pipes before. It was actually more convoluted, and pipe doesn't handle some cases (forgot which)

multiparty and I believe busboy are both writables and don't seem to have issues, so it would make sense if this was that way as well. When this writable emits an error, the .pipe() impl. will automatically unpipe the source from this stream (and it'll be paused automatically if there are no other destinations). Node.js 0.8 does not auto-pause on error with a pipe, perhaps that's what you're thinking of?
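
A small sketch of that auto-unpipe behaviour, assuming Node.js 0.10-style (or later) streams. The error is emitted by hand here purely so the effect can be observed synchronously; a real writable would normally surface it through its write callback. The variable names are illustrative only.

```javascript
const { PassThrough, Writable } = require('stream');

const source = new PassThrough(); // stands in for the request stream
const dest = new Writable({
  write(chunk, encoding, callback) { callback(); }
});

let unpipedFrom = null;
dest.on('unpipe', (src) => { unpipedFrom = src; });
dest.on('error', () => {}); // handle the error so it is not thrown

source.pipe(dest);
const flowingWhilePiped = source.readableFlowing;

// The 'error' listener installed by .pipe() unpipes the source from dest.
dest.emit('error', new Error('body too large'));
const flowingAfterError = source.readableFlowing;
```

With `dest` as the only destination, the source stops flowing once the error is emitted, which is the behaviour Node.js 0.8 lacks.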

aside from 0.8 support, there were a bunch of issues i ran into

  • you have to handle more events: initialization, pipe, ._write(), and finish vs. data and end
  • ran into some issue about emitting errors on initialization, after pipe, and on write because the errors always have to be emitted on next tick
  • have to handle the case where users do .write() and .end() directly since, after all, it's a stream

all in all, it requires more code to do the same thing.
the main issue for me is that .pipe() just isn't a good API because of error handling. basically, it'll look like this:

function readBody(req, done) {
  req.on('error', done) // or something
  req.pipe(rawBody({
    headers: req.headers
  }, done))
}

way easier just to do something like:

rawBody({headers: req.headers}).then(function (body) {

})

Thanks, @jonathanong, makes sense to me! It sounds like what we could possibly do is keep rawBody(stream, options, cb) and internally pipe the stream into an internal writable stream, which could keep the API the same, but give us the automatic flow stopping on error, etc. This is what multiparty basically does with form.parse(req) and it seems to work well and without lots of gotchas. Thoughts?

i guess internally it wouldn't matter. externally i would say no. you might have to do some crazy shit still for 0.8 support.

and we could abstract the writable stream again. LOL. create some sort of concatenation stream that only accepts buffers, encodes it based on a specified encoding, then returns a single buffer/string.
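
Such a concatenation stream might look roughly like the sketch below. Everything here is an assumption made for illustration: the `ConcatStream` name, the `encoding` and `limit` options, and the `body()` method are hypothetical and not an existing raw-body API.

```javascript
const { Writable } = require('stream');

// Hypothetical class; collects buffers and returns one buffer or string.
class ConcatStream extends Writable {
  constructor(options) {
    super(); // default decodeStrings: written strings arrive as Buffers
    const opts = options || {};
    this.bodyEncoding = opts.encoding || null; // null means "return a Buffer"
    this.bodyLimit = typeof opts.limit === 'number' ? opts.limit : Infinity;
    this.received = 0;
    this.chunks = [];
  }

  _write(chunk, encoding, callback) {
    if (!Buffer.isBuffer(chunk)) {
      return callback(new TypeError('only buffers are accepted'));
    }
    this.received += chunk.length;
    if (this.received > this.bodyLimit) {
      return callback(new Error('body exceeds limit'));
    }
    this.chunks.push(chunk);
    callback();
  }

  // Concatenate first and decode once, so multi-byte characters that were
  // split across chunks are still decoded correctly.
  body() {
    const buffer = Buffer.concat(this.chunks);
    return this.bodyEncoding ? buffer.toString(this.bodyEncoding) : buffer;
  }
}

const concat = new ConcatStream({ encoding: 'utf8', limit: 6000 });
concat.on('finish', () => {
  console.log(concat.body()); // entire body as a single string
});
concat.write(Buffer.from('hello, '));
concat.end(Buffer.from('world'));
```

A source stream could then be piped into it with `req.pipe(concat)`, with error handling on both streams still left to the caller.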

you might have to do some crazy shit still for 0.8 support.

that's what I don't want to do :) See, if we internally pipe to a stream made from readable-stream, it will take care of the streams1/streams2 differences for us. That was why I suggested it :)

but the source stream won't automatically pause (i don't think). you'd still have to pause it.

ah yes, for node.js 0.8 it doesn't (it does for node.js 0.10-style, since it stops the flow on the unpipe if we emit an error on our internal stream).

i need to look through the node 0.10 streams some more here. basically i would like to read in 0.10 style when possible without adding an entire chunk to this lib of if 0.8 then ... else ...

lol so confusing man! probably be easier just to wait until we drop 0.8 and then do the internal stream stuff