Obscure condition checks
binarykitchen opened this issue
There is a bad bug: all my JSON requests return false on req.is('json') within the expressjs v4 code :(
I debugged and think these lines do not make any sense at all:
if (!(parseInt(headers['content-length'], 10)
|| 'transfer-encoding' in headers)) return;
In my requests I have no content-length yet and there is no transfer-encoding.
This here is my header:
{ host: 'localhost:8080',
connection: 'keep-alive',
'cache-control': 'no-cache',
pragma: 'no-cache',
'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36',
'content-type': 'application/json',
accept: '*/*',
dnt: '1',
referer: 'https://localhost:8080/',
'accept-encoding': 'gzip,deflate,sdch',
'accept-language': 'en-GB,en-US;q=0.8,en;q=0.6',
cookie: 'session=81e32809-7efe-447d-bbda-624acbbc2543; session.sig=d2gT3Yssb_ZLqYixqojzEGs53j8; _ga=GA1.1.1416672249.1394592952; language=en; express:sess=eyJwYXNzcG9ydCI6e319; express:sess.sig=bhxojHt1UQWbtKNR_ztU_-X2LLg' }
It must work with these!
what crazy client is making these requests?
The presence of a message-body in a request is signaled by the inclusion of a Content-Length or Transfer-Encoding header field in the request's message-headers.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.3
this sounds like a bug with your client
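To see why the headers posted above fail the check, here is a minimal sketch of the quoted condition as a standalone function. The name hasBody and the sample header objects are made up for illustration and are not the library's API; header names are assumed to be lowercased, as in Node's req.headers.

```javascript
// Simplified sketch of the body-presence check quoted in this thread.
// `headers` is a plain object with lowercased names, like Node's req.headers.
function hasBody(headers) {
  const length = parseInt(headers['content-length'], 10);
  // A missing or zero Content-Length counts as "no body" unless
  // a Transfer-Encoding header is present.
  return Boolean(length) || 'transfer-encoding' in headers;
}

// Hypothetical sample requests
const getWithType = { host: 'localhost:8080', 'content-type': 'application/json' };
const postWithJson = { 'content-type': 'application/json', 'content-length': '17' };
const chunkedUpload = { 'content-type': 'application/json', 'transfer-encoding': 'chunked' };
const emptyPost = { 'content-type': 'application/json', 'content-length': '0' };

console.log(hasBody(getWithType));   // false
console.log(hasBody(postWithJson));  // true
console.log(hasBody(chunkedUpload)); // true
console.log(hasBody(emptyPost));     // false
```

A request that carries only a content-type header, like the one reported here, is treated as having no body at all, so a type check built on top of this returns early.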
This client https://github.com/visionmedia/superagent
Has quite a lot of stars and forks.
And really, content-length is only relevant when it comes to the response header. But in my case I am asking the request if it's a JSON or not. Totally irrelevant what the content length of the response will be!
well it should at least have transfer-encoding. we also do this because jquery sends the content-type header without a body, then people complain when it throws a 400 error.
Does superagent set transfer-encoding?
i'm pretty sure that's something the browser sends automatically. i don't think xhr can set that header.
superagent deletes it:
https://github.com/visionmedia/superagent/blob/master/lib/node/utils.js#L149
{
"host" : "10.0.1.2:3000",
"accept-encoding" : "gzip, deflate",
"accept-language" : "en-us",
"accept" : "*/*",
"origin" : "http://10.0.1.2:3000",
"content-length" : "0",
"connection" : "keep-alive",
"user-agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/537.75.14",
"x-csrf-token" : "LBvWj8zfB0-cjpTd3oJzsxP8bLA4X0K+pA3PAY=",
"dnt" : "1"
}
that's what my request headers look like, and i'm using superagent
that's node though
Alright. But why am I seeing no content-length here?? Grrr ...
And since you said, that's something the browser is sending, how can I check in my Chrome if Chrome is doing it correctly?
just open up the inspector. shows you request/response/body
Yes, I already do that. I only see the content-length in the response, but not in the request.
Yeah I see and believe you. But I do not have any content-length here at all!
Request URL:https://localhost:8080/video?size=preview&from=0&limit=9
Request Method:GET
Status Code:200 OK
Request Headers (view source)
Accept:*/*
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8,de;q=0.6
Connection:keep-alive
Content-Type:application/json
Cookie:session=2f6a66cd-8d04-4d72-9026-5fccef39d649; session.sig=O378FHdqAXw3JEvgnPSEMdFbKaM; connect.sess=s%3Aj%3A%7B%22passport%22%3A%7B%22user%22%3A%2211e3-bf90-86a1bfb0-9574-198ee2fcdadb%22%7D%7D.Qt%2BUAiwanyEVaYlsRKUiW%2BwZ%2BFTxY1uc%2F7LjkSy54yM; express:sess=eyJwYXNzcG9ydCI6e319; express:sess.sig=bhxojHt1UQWbtKNR_ztU_-X2LLg
DNT:1
Host:localhost:8080
Referer:https://localhost:8080/
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36
haha at this point, just ask stackoverflow or something. beyond what i know
I could email you a secret link to a demo site. Would you do this?
Just sent you an email ...
Sorry to bug you but this is really important.
If you look at the older ExpressJS code you see it is very different:
https://github.com/visionmedia/express/blob/3.x/lib/request.js#L322
It always worked like that for me until recently with the change to v4 of ExpressJS, using your type-is module. I fear it is breaking a lot of things ...
Can't you remove
if (!(parseInt(headers['content-length'], 10)
|| 'transfer-encoding' in headers)) return;
for now? IMO robustness is more important than respecting weird W3 standards that do not make much sense.
yeah i guess we could. anyone else have input? maybe we can change this library to only check a type against an array of types
@jonathanong yes, please!
Or introduce a new option called strict which is false by default and, when true, checks the content-length as usual.
It always worked like that for me until recently with the change to v4 of ExpressJS, using your type-is module. I fear it is breaking a lot of things ...
The whole point of the 3.x -> 4.x version number change is that stuff is going to break, because it's not all backwards-compatible.
Please do not remove that logic in a rush. @binarykitchen you posted above a GET request... A GET request would never have a body and req.is only works with bodies. Why would you use that on a GET request? Are you sure you didn't mean to make a POST request with some JSON or send an Accept header and use req.accepts?
Aaahhhh, @dougwilson, you nailed it. I had mixed up req.is and req.accepts - now I have fixed it in my app and everything works fine!
I wish superagent would have shown a warning in my case. I was setting the content type with .type() for a GET request, which was wrong. See http://visionmedia.github.io/superagent/#setting-the%20content-type
All good. Case closed. But I think we should document a little more. Note in capitals that type-is only works with bodies and so on.
@jonathanong Sorry about all that. Hope we all learned something here :)
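The mix-up resolved above comes down to which header each question looks at. The sketch below uses two hypothetical, heavily simplified stand-ins (describesRequestBody and clientWantsResponse are invented names, not the real req.is or req.accepts, which also handle wildcards, q-values and more) to show the difference on plain header objects.

```javascript
// Roughly the question req.is answers: "what type is the body I was SENT?"
// Returns false when the request carries no body, as discussed in this thread.
function describesRequestBody(headers, type) {
  const hasBody =
    Boolean(parseInt(headers['content-length'], 10)) ||
    'transfer-encoding' in headers;
  if (!hasBody) return false;
  const contentType = (headers['content-type'] || '')
    .split(';')[0]
    .trim()
    .toLowerCase();
  return contentType === type;
}

// Roughly the question req.accepts answers: "what type may I SEND back?"
function clientWantsResponse(headers, type) {
  const accept = headers['accept'];
  if (!accept) return true; // no Accept header: any media type is acceptable
  return accept
    .split(',')
    .map((range) => range.split(';')[0].trim().toLowerCase())
    .some((range) => range === '*/*' || range === type);
}

// Hypothetical GET that asks for JSON in the response and sends no body
const getRequest = { accept: 'application/json' };
// Hypothetical POST that sends JSON but wants HTML back
const postRequest = {
  'content-type': 'application/json; charset=utf-8',
  'content-length': '17',
  accept: 'text/html',
};

console.log(describesRequestBody(getRequest, 'application/json'));  // false
console.log(clientWantsResponse(getRequest, 'application/json'));   // true
console.log(describesRequestBody(postRequest, 'application/json')); // true
console.log(clientWantsResponse(postRequest, 'application/json'));  // false
```

For a GET that wants JSON back, the Accept header is the one to set and to check; Content-Type only describes a body the client actually sends.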
that's weird. GET requests could still have bodies (though they shouldn't). but i guess your request really had no body and had the content-type header?
but i guess your request really had no body and had the content-type header?
@binarykitchen was actually manually adding the content-type header, which is how it got there.
GET requests could still have bodies (though they shouldn't).
Yes, a GET request can certainly have a body, but who does that??
i'm going to remove the automatic body checking by default and expose it as a utility. i'd like to use this in non-HTTP settings.
Hey all,
We're also running into the same issue after updating to the latest express version. I checked other implementations of content type checking (e.g. Rack/Rails), and they correctly handle the case of an empty message body, and express/type-is should handle this correctly, too.
I'm happy to submit a PR that fixes that, as the current behaviour makes no sense: There's no way to send an empty content-body while specifying a content-type, and have the application handle that correctly.
setting a content-type without a body doesn't make any sense. either way, you can use typeis.is() directly: https://github.com/expressjs/type-is/blob/master/index.js#L6 it's undocumented on the readme but it's not going to change anytime soon
@jonathanong From: http://www.w3.org/2001/tag/doc/whenToUseGet.html#safe
Note that it is possible to use POST even without supplying data in an HTTP message body. In this case, the resource is URI addressable, but the POST method indicates to clients that the interaction is unsafe or may have side-effects.
we're talking about content types + no body, not just having no message body. the request method is irrelevant. you can always send no body if you'd like.
it's like telling the server, "here's nothing, and i'm going to name it Amanda". it doesn't make sense, and ignoring it makes servers and developers less prone to errors
Also, from http://tools.ietf.org/html/rfc7230
A user agent SHOULD send a Content-Length in a request message when
no Transfer-Encoding is sent and the request method defines a meaning
for an enclosed payload body. For example, a Content-Length header
field is normally sent in a POST request even when the value is 0
(indicating an empty payload body).
that's irrelevant as well. this checks content-type. if the content-length is 0, then there's no content.
No content and empty content are two different things.
technically, yes, but for practical purposes, no. what's a case that you'll actually need "empty content" and why would a content-type for this "empty content" be actually useful?
We're also running into the same issue after updating to the latest express version
I assume you went from 3.x -> 4.x? That is bound to have breaking changes, of course (the reason for a major version number change in semver). The change talked about here was a deliberate change, not accidental.
@arthurschreiber thanks for hijacking this thread. Can you show a use-case where you would need to know the content-type of a zero-length body? Usually when people want to do that, they are using the incoming content-type to determine what they should respond to the client in, which is invalid, because that's what the accept header is for.
@dougwilson Hey there, thanks for chiming in. I hope I did not come off as cocky in my previous posts (and if I did so, I'm terribly sorry).
I understand that moving from 3.x to 4.x involves breaking changes, but I'd hope those would be in the API, and not in basic handling of HTTP requests/responses.
Usually when people want to do that, they are using the incoming content-type to determine what they should respond to the client in, which is invalid, because that's what the accept header is for.
That's not what the spec says:
A request without any Accept header field implies that the user agent
will accept any media type in response.
There is no mention that using the Content-Type to sniff out what response type is expected is prohibited. I understand that this is probably not the right way to do, but my problem is that I can't influence what some clients of our application send (e.g. because they use some third party tools that are batshit crazy and don't follow "the right way").
From looking through the HTTP RFCs, I don't see any mentions that the behaviour of express 3.x was wrong or broken. It definitely was not encouraged, but also not discouraged, and it's what other frameworks like rails do.
I understand if you feel like this discussion stopped being useful, and I'll back off if that's the case. @jonathanong mentioned a workaround I could use in our app, so I'm ok on that front. Still, I wanted to understand where this sudden change was/is coming from, because I can't really find anything in the RFCs that would warrant it. 😄
Hey there, thanks for chiming in. I hope I did not come off as cocky in my previous posts (and if I did so, I'm terribly sorry).
No, it was mostly from the old thread resurrection, rather than creating a new issue (and perhaps linking to an old thread). It just makes looking back at old threads harder to follow.
There is no mention that using the Content-Type to sniff out what response type is expected is prohibited. I understand that this is probably not the right way to do, but my problem is that I can't influence what some clients of our application send (e.g. because they use some third party tools that are batshit crazy and don't follow "the right way").
Correct, if no accept header is provided, you are technically free to sniff whatever header you want. In that case, you may want to add Vary: Content-Type in your response headers for caching proxies to not get confused.
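A rough sketch of that advice: pick the response type from Accept when it is present, otherwise fall back to the request's Content-Type and say so in Vary. The function chooseResponseType and its simplified matching (no q-values, no partial wildcards) are assumptions made for illustration; listing Accept alongside Content-Type in the fallback case is also a choice of this sketch, on the reasoning that the outcome depends on Accept being absent.

```javascript
// Hypothetical helper: choose a response media type from a list of supported
// types and report which request headers the choice depended on.
function chooseResponseType(reqHeaders, supported) {
  const accept = reqHeaders['accept'];
  // Fall back to the request's Content-Type only when Accept is missing.
  const raw = accept || reqHeaders['content-type'] || '';
  const candidates = raw
    .split(',')
    .map((part) => part.split(';')[0].trim().toLowerCase())
    .filter(Boolean);

  let type = supported[0]; // default when nothing matches
  for (const candidate of candidates) {
    if (candidate === '*/*') break; // anything goes: keep the default
    if (supported.includes(candidate)) {
      type = candidate;
      break;
    }
  }

  // Tell caches which request headers influenced the response.
  const vary = accept ? 'Accept' : 'Accept, Content-Type';
  return { type, headers: { 'Content-Type': type, Vary: vary } };
}

const supportedTypes = ['text/html', 'application/json'];
const sniffed = chooseResponseType(
  { 'content-type': 'application/json; charset=utf-8' },
  supportedTypes
);
console.log(sniffed.type);         // application/json
console.log(sniffed.headers.Vary); // Accept, Content-Type
```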
From looking through the HTTP RFCs, I don't see any mentions that the behaviour of express 3.x was wrong or broken.
Correct, though RFC 2616 (which was what existed when the change was made) had a slightly stronger opinion than the newer HTTP/1.1 RFCs do.
Specifically, the change was made so people could do req.is and then just parse the body. Most content-types are invalid when they have zero-length (think images, JSON, pretty much all binary formats, etc.). The express 4.x req.is maps to this library's typeofrequest which determines the type of a request as a whole, rather than just sniffing the content-type. I'm not familiar with Rails, but it seems likely the API you are referring to is like is in this lib, rather than like typeofrequest.
Can you point me to the API in Rack/Rails that is different from this? So far I only found request.content_type, whose equivalent in Node.js is req.headers['content-type'], but I may easily be missing what you are talking about since I'm not familiar with Ruby things :)
a workaround I could use in our app, so I'm ok on that front
Yea, that work-around is 100% valid and we don't intend to change that function (which is to look only at content-type header).
@dougwilson Check this out:
And more specifically, https://github.com/rails/rails/blob/master/actionpack/lib/action_dispatch/http/mime_negotiation.rb#L34-L44
and
https://github.com/rails/rails/blob/master/actionpack/lib/action_dispatch/http/mime_negotiation.rb#L19-L27
Basically, if no Accept header is given (or it is empty), Rails falls back to the Content-Type to determine the acceptable content type for the response.
I think you have the wrong library. I think you want https://github.com/expressjs/accepts
Maybe I'm missing something, but accepts is for parsing the Accept header, and won't help me in determining what content type I should respond with if no Accept header was provided in the first place.
Again, I understand this is a pretty special edge case, that 99.9% of the users won't care about, and I don't want to pester you too much about this. 😄
Our current code tries req.accepts, and then we try to fall back to req.is, as a last resort.
OK, thanks for the links to the Rails code :) So I see where the confusion is. Yes, the request.formats and request.accepts do indeed correspond to the accepts library (req.accepts in express). What's happening is that Rails goes to extra lengths to fall back to content-type sniffing. Basically sniffing the content-type when there is no accepts is actually a "feature" of Rails. In this case, if you wanted it built-in to express, it would actually be a feature request on the accepts library, which would in turn use is from this library :) I don't think that is a good thing to do, though, so I doubt we'll implement that weird fallback, though you are certainly free to do so.
Basically you just want the fallback behavior that Rails does in your express code, I certainly get it. I'd suggest adding this middleware in your express app to simplify your life:
var accepts = require('accepts')
var typeis = require('type-is')

app.use(function(req, res, next){
  // Rails-like accepts with content-type fallback
  req.accepts = railsAccepts
  next()
})

function railsAccepts(){
  // `this` is the request, because this function is called as req.accepts(...)
  var accept = accepts(this)
  var contentType = this.headers['content-type']
  // try the Accept header first, then fall back to the content-type header
  return accept.types.apply(accept, arguments)
    || typeis.is.bind(null, contentType).apply(null, arguments)
}