django / daphne

Django Channels HTTP/WebSocket server

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Daphne allows invalid characters within header names

kenballus opened this issue · comments

The bug

RFC 9110 defines the header names must consist of a single token, which is defined as follows:

  token          = 1*tchar
  tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
                 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
                 / DIGIT / ALPHA
                 ; any VCHAR, except delimiters

When Daphne receives a request with a header name containing any or all of the following characters, it does not reject the message:

\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f "(),/;<=>?@[/]{}\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef

Thus, when you send the following request:

GET / HTTP/1.1\r\n
\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f "(),/;<=>?@[/]{}\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef: whatever\r\n
\r\n

Daphne's interpretation is as follows:

[
    HTTPRequest(
        method=b'GET', uri=b'/', version=b'1.1',
        headers=[
            (b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f "(),/;<=>?@[/]{}\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef', b'whatever'),
        ],
        body=b'',
    ),
]

(i.e. the invalid header name made it into the request unchanged)

Nearly all other HTTP implementations reject the above request, including Uvicorn, Hypercorn, AIOHTTP, Apache httpd, Bun, CherryPy, Deno, FastHTTP, Go net/http, Gunicorn, H2O, Hyper, Jetty, libsoup, Lighttpd, Mongoose, Nginx, Node.js LiteSpeed, Passenger, Puma, Tomcat, OpenWrt uhttpd, Uvicorn, Waitress, WEBrick, and OpenBSD httpd.

Your OS and runtime environment:

$ uname -a
Linux de8a63fbb2f4 6.7.2-arch1-2 #1 SMP PREEMPT_DYNAMIC Wed, 31 Jan 2024 09:22:15 +0000 x86_64 GNU/Linux

pip freeze:

asgiref==3.7.2
attrs==23.2.0
autobahn==23.6.2
Automat==22.10.0
cffi==1.16.0
constantly==23.10.4
cryptography==42.0.2
Cython==0.29.32
daphne @ file:///app/daphne
h2==4.1.0
hpack==4.0.0
hyperframe==6.0.1
hyperlink==21.0.0
idna==3.6
incremental==22.10.0
priority==1.3.0
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycparser==2.21
Pygments==2.14.0
pyOpenSSL==24.0.0
python-afl==0.7.4
PyYAML==6.0
service-identity==24.1.0
six==1.16.0
Twisted==23.10.0
txaio==23.1.1
typing_extensions==4.9.0
zope.interface==6.1

(Using Daphne built from main, with the latest commit being 993efe6 at the time of writing)

How you're running Channels (runserver? daphne/runworker? Nginx/Apache in front?)

Directly invoking Daphne from the command line.

Console logs and full tracebacks of any errors

N/A

Of particular note is that \x00, \t, and are permitted, which have historically been sources of exploitable parsing discrepancies between origin servers and gateway servers.

@kenballus thanks for the report. Are you up for making a PR here?

h11 gives a regex for a header name here:

https://github.com/python-hyper/h11/blob/a2c68948accadc3876dffcf979d98002e4a4ed27/h11/_abnf.py#L10-L21C1

That's then validated when parsing the headers here: https://github.com/python-hyper/h11/blob/a2c68948accadc3876dffcf979d98002e4a4ed27/h11/_headers.py#L163

Gunicorn does similar here https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/gunicorn/http/message.py#L92-L93 (but matching bad headers, with what looks like a more complex regex.)

We'd handle that in the http and ws protocols:

# Sanitize and decode headers, potentially extracting root path
self.clean_headers = []
self.root_path = self.server.root_path
for name, values in self.requestHeaders.getAllRawHeaders():
# Prevent CVE-2015-0219
if b"_" in name:
continue
for value in values:
if name.lower() == b"daphne-root-path":
self.root_path = unquote(value.decode("ascii"))
else:
self.clean_headers.append((name.lower(), value))

# Sanitize and decode headers
self.clean_headers = []
for name, value in request.headers.items():
name = name.encode("ascii")
# Prevent CVE-2015-0219
if b"_" in name:
continue
self.clean_headers.append((name.lower(), value.encode("latin1")))

Need to investigate whether this should be handled by twisted. https://github.com/twisted/twisted/blob/446ee139189440e890b26a29af256e9b9d0e8eba/src/twisted/web/http_headers.py

Should be addressed by #500.