h2o / h2o

H2O - the optimized HTTP/1, HTTP/2, HTTP/3 server

Home Page: https://h2o.examp1e.net

H2O does not validate `Content-Length` values for HTTP/1.1 requests that contain a `Transfer-Encoding: chunked` header.

kenballus opened this issue · comments

The bug

RFC 9110 uses the following ABNF rule to define the acceptable values of the Content-Length header:

Content-Length = 1*DIGIT
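As a minimal sketch of what matching this rule means in practice, the check below accepts only one or more ASCII digits. The function name is hypothetical and is not part of H2O; it assumes the field value has already had surrounding whitespace stripped by the header parser.

```python
import re

# RFC 9110 Content-Length ABNF: 1*DIGIT, i.e. one or more ASCII digits.
_CONTENT_LENGTH_RE = re.compile(rb"[0-9]+")


def is_valid_content_length(value: bytes) -> bool:
    """Return True only if the whole value matches 1*DIGIT.

    Hypothetical helper for illustration; assumes optional whitespace
    around the field value was already removed.
    """
    return _CONTENT_LENGTH_RE.fullmatch(value) is not None
```

A plain `int(value)` is not a substitute for this check in Python, because `int` also accepts inputs such as `+1`, `1_0`, and values with surrounding whitespace.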

When H2O receives a request with no Transfer-Encoding header, and an invalid Content-Length header, it correctly rejects the request.

When H2O receives a request with a Transfer-Encoding: chunked header, and an invalid Content-Length header, it strips the Content-Length header out of the request, but does not reject the request.

RFC 9110 is clear that this is something a sender MUST NOT do:

Likewise, a sender MUST NOT forward a message with a Content-Length header field value that does not match the ABNF above, with one exception: a recipient of a Content-Length header field value consisting of the same decimal value repeated as a comma-separated list (e.g, "Content-Length: 42, 42") MAY either reject the message as invalid or replace that invalid field value with a single instance of the decimal value, since this likely indicates that a duplicate was generated or combined by an upstream message processor.

Reproducing the bug

To verify this for yourself, try sending the following request to H2O:

POST / HTTP/1.1\r\n
Host: a\r\n
Content-Length: blahblahblah\r\n
Transfer-Encoding: chunked\r\n
\r\n
1\r\n
Z\r\n
0\r\n
\r\n

H2O forwards something like the following to its backend:

POST / HTTP/1.1\r\n
host: a\r\n
connection: keep-alive\r\n
content-length: 1\r\n
x-forwarded-proto: http\r\n
x-forwarded-for: 172.19.0.1\r\n
via: 1.1 a\r\n
\r\n
Z
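If you are not using the HTTP Garden, the request above can be sent over a raw TCP socket. This is only a sketch: the `send_raw` helper is hypothetical, and the address in the trailing comment (127.0.0.1:8080) is an assumption about where your H2O proxy listens.

```python
import socket

# The request from the report: invalid Content-Length plus chunked encoding.
PAYLOAD = (
    b"POST / HTTP/1.1\r\n"
    b"Host: a\r\n"
    b"Content-Length: blahblahblah\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"1\r\n"
    b"Z\r\n"
    b"0\r\n"
    b"\r\n"
)


def send_raw(host: str, port: int, payload: bytes, timeout: float = 2.0) -> bytes:
    """Send payload over TCP and return whatever arrives before close or timeout."""
    received = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # peer closed the connection
                received.append(data)
        except socket.timeout:
            pass  # keep-alive connections stay open; stop reading on timeout
    return b"".join(received)


# Example call (assumes H2O is listening on 127.0.0.1:8080):
#   print(send_raw("127.0.0.1", 8080, PAYLOAD))
```

To observe what is forwarded, point H2O's proxy at a backend that logs the raw bytes it receives.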

You can reproduce my experimental setup exactly using the HTTP Garden. Just build the Garden, and run the following command at the REPL:

transducers h2o_proxy; payload 'POST / HTTP/1.1\r\nHost: a\r\nContent-Length: blahblahblah\r\nTransfer-Encoding: chunked\r\n\r\n1\r\nZ\r\n0\r\n\r\n'; transduce

You should see something like this:

[1]: 'POST / HTTP/1.1\r\nHost: a\r\nContent-Length: blahblahblah\r\nTransfer-Encoding: chunked\r\n\r\n1\r\nZ\r\n0\r\n\r\n'
    ⬇️ h2o_proxy
[2]: 'POST / HTTP/1.1\r\nhost: a\r\nconnection: keep-alive\r\ncontent-length: 1\r\nx-forwarded-proto: http\r\nx-forwarded-for: 172.19.0.1\r\nvia: 1.1 a\r\n\r\nZ'

The fact that the message was forwarded demonstrates the bug.
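The forwarded `content-length: 1` in [2] comes from decoding the chunked body in [1], not from the received `Content-Length` value. The simplified decoder below illustrates that step; it is a sketch that ignores chunk extensions and trailers, and it is not H2O's implementation.

```python
import re

# chunk-size is one or more hexadecimal digits.
_HEX_RE = re.compile(rb"[0-9A-Fa-f]+")


def decode_chunked(body: bytes) -> bytes:
    """Decode a chunked body (no chunk extensions, trailers ignored)."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)  # raises ValueError if CRLF is missing
        size_line = body[pos:eol]
        if not _HEX_RE.fullmatch(size_line):
            raise ValueError("invalid chunk size")
        size = int(size_line, 16)
        pos = eol + 2
        if size == 0:
            return bytes(out)  # last-chunk reached
        out += body[pos:pos + size]
        if body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")
        pos += size + 2
```

For the body in the report, `decode_chunked(b"1\r\nZ\r\n0\r\n\r\n")` yields the single byte `Z`, which matches the forwarded body and its length of 1.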

You can also see how H2O interprets this request when it's acting as an origin server:

servers h2o; payload 'POST / HTTP/1.1\r\nHost: a\r\nContent-Length: blahblahblah\r\nTransfer-Encoding: chunked\r\n\r\n1\r\nZ\r\n0\r\n\r\n'; fanout

You should see something like this:

h2o: [
    HTTPRequest(
        method=b'POST', uri=b'/', version=b'1.1',
        headers=[
            (b'content_length', b'1'),
            (b'host', b'a'),
        ],
        body=b'Z',
    ),
]

Again, the Content-Length header was not validated.

Versions

$ h2o --version
h2o version 2.3.0-DEV@d90da70f3
OpenSSL: OpenSSL 3.0.11 19 Sep 2023
mruby: YES
fusion: YES
ssl-zerocopy: YES
ktls: YES
$ uname -a
Linux 69a420df15e4 6.7.2-arch1-2 #1 SMP PREEMPT_DYNAMIC Wed, 31 Jan 2024 09:22:15 +0000 x86_64 GNU/Linux

Thank you for opening the issue.

RFC 9112 Section 6.3 describes the steps for HTTP/1.1 endpoints to determine the message body length. The section places what to do when both Content-Length and Transfer-Encoding exist (in step 3), before the detection of invalid Content-Length (in step 6).

Considering that, I believe what we are doing is correct?

RFC 9112 Section 6.3 describes the steps for HTTP/1.1 endpoints to determine the message body length. The section places what to do when both Content-Length and Transfer-Encoding exist (in step 3), before the detection of invalid Content-Length (in step 6).

Step 6 addresses messages with invalid Content-Length and no Transfer-Encoding header. The scenario I'm concerned about is when there is an invalid Content-Length and a valid Transfer-Encoding header. This is therefore covered by step 3.

The full text of step 3 is the following: (emphasis mine)

If a message is received with both a Transfer-Encoding and a Content-Length header field, the Transfer-Encoding overrides the Content-Length. Such a message might indicate an attempt to perform request smuggling (Section 11.2) or response splitting (Section 11.1) and ought to be handled as an error. An intermediary that chooses to forward the message MUST first remove the received Content-Length field and process the Transfer-Encoding (as described below) prior to forwarding the message downstream.

Thus, an intermediary is correct in removing the Content-Length header, but only once it has otherwise chosen to forward the message.

In RFC 9110, section 8.6, we find that intermediaries MUST NOT forward messages with invalid Content-Length headers:

Likewise, a sender MUST NOT forward a message with a Content-Length header field value that does not match the ABNF above, with one exception: a recipient of a Content-Length header field value consisting of the same decimal value repeated as a comma-separated list (e.g, "Content-Length: 42, 42") MAY either reject the message as invalid or replace that invalid field value with a single instance of the decimal value, since this likely indicates that a duplicate was generated or combined by an upstream message processor.

Combining these two paragraphs, we have that intermediaries that receive messages containing both Content-Length and Transfer-Encoding headers must validate the Content-Length header, reject the message if it is invalid, and forward it otherwise.
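A rough sketch of this combined reading, for illustration only: the function and type names are hypothetical, and the "42, 42" repeated-value exception from RFC 9110 is deliberately left out to keep the example short.

```python
import re
from typing import List, Optional, Tuple

Header = Tuple[bytes, bytes]
_DIGITS_RE = re.compile(rb"[0-9]+")  # Content-Length = 1*DIGIT


def headers_to_forward(headers: List[Header]) -> Optional[List[Header]]:
    """Return the headers to forward, or None if the message is rejected.

    Hypothetical helper: validates Content-Length first, then applies the
    RFC 9112 Section 6.3 step 3 removal when Transfer-Encoding is present.
    """
    for name, value in headers:
        if name.lower() == b"content-length" and not _DIGITS_RE.fullmatch(value):
            return None  # invalid Content-Length: do not forward
    has_te = any(name.lower() == b"transfer-encoding" for name, _ in headers)
    if has_te:
        # Transfer-Encoding overrides: strip Content-Length before forwarding.
        return [(n, v) for n, v in headers if n.lower() != b"content-length"]
    return list(headers)
```

Under this reading, the request from the report returns `None` (rejected), while the same request with `Content-Length: 1` is forwarded without its Content-Length header.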

For further evidence of this, note that nearly all other HTTP implementations reject messages with invalid Content-Length headers, regardless of the presence of a Transfer-Encoding header. Among these are AIOHTTP, Apache httpd, Bun, CherryPy, Daphne, Deno, FastHTTP, Gunicorn, Hyper, Hypercorn, Jetty, Lighttpd, Mongoose, Nginx, Node.js, LiteSpeed, Passenger, Tomcat, Tornado, OpenWrt uhttpd, Unicorn, Uvicorn, WEBrick, OpenBSD httpd, HAProxy, nghttpx, Pound, Varnish, Akamai, AWS Cloudfront, Cloudflare, Fastly, and OpenBSD relayd.

The full text of step 3 is the following: (emphasis mine)

If a message is received with both a Transfer-Encoding and a Content-Length header field, the Transfer-Encoding overrides the Content-Length. Such a message might indicate an attempt to perform request smuggling (Section 11.2) or response splitting (Section 11.1) and ought to be handled as an error. An intermediary that chooses to forward the message MUST first remove the received Content-Length field and process the Transfer-Encoding (as described below) prior to forwarding the message downstream.

Thus, an intermediary is correct in removing the Content-Length header, but only once it has otherwise chosen to forward the message.

Yeah, step 3 says two things: i) Transfer-Encoding overrides Content-Length, and ii) if the message is forwarded, Content-Length is removed.

If the message is not to be forwarded, we go to step 4, which discusses how the message is to be handled if it has a Transfer-Encoding. A message that has both Content-Length and Transfer-Encoding falls into this handling because, as stated in the first part of step 3, Transfer-Encoding overrides Content-Length.
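The ordering described here can be sketched roughly as follows for requests. This is an illustration of the reading, not H2O's actual code; the function name and return shape are hypothetical, and transfer-coding parameters are not handled.

```python
import re
from typing import List, Optional, Tuple

Header = Tuple[bytes, bytes]


def request_body_framing(headers: List[Header]) -> Tuple[str, Optional[int]]:
    """Return (kind, length); kind is 'chunked', 'length', 'empty', or 'error'."""
    te = [v for n, v in headers if n.lower() == b"transfer-encoding"]
    cl = [v for n, v in headers if n.lower() == b"content-length"]
    if te:
        # Transfer-Encoding overrides Content-Length, so the Content-Length
        # value is never inspected on this path.
        codings = [c.strip().lower() for v in te for c in v.split(b",")]
        if codings[-1] == b"chunked":
            return ("chunked", None)
        return ("error", None)  # request whose final coding is not chunked
    if cl:
        # Only reached without Transfer-Encoding: validate Content-Length.
        if len(set(cl)) == 1 and re.fullmatch(rb"[0-9]+", cl[0]):
            return ("length", int(cl[0]))
        return ("error", None)
    return ("empty", None)  # request with neither header has no body
```

With this ordering, the request from the report is framed as chunked even though its Content-Length is invalid, whereas the same invalid Content-Length without Transfer-Encoding is an error, which matches the behavior described in the report.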

IIRC, the steps described in RFC 9112 come from RFC 7230, and we have not had any issues with the approach. The phrase of RFC 9110 that you point out was added because we split the semantics and wire-encoding.

For further evidence of this,...

Other implementations can certainly do that. I know that at least some of them try to parse the value of the Content-Length header as they split the header section into individual headers. For such a design, it totally makes sense to reject HTTP requests with invalid Content-Length values even if they have Transfer-Encoding.

That fact does not mean that decoding Content-Length is required even when Transfer-Encoding exists.