ninenines / cowboy

Small, fast, modern HTTP server for Erlang/OTP.

Home Page: https://ninenines.eu

Is Cowboy affected by the HTTP/2 Rapid Reset attack?

eproxus opened this issue · comments

I believe Cowboy is already protected against this by the following HTTP/2 option, which by default permits resetting 10 streams per 10 seconds:

max_reset_stream_rate ({10, 10000})

Maximum reset stream rate per connection. This can be used to protect against misbehaving or malicious peers that do not follow the protocol, leading to the server resetting streams, by limiting the number of streams that can be reset over a certain time period. The rate is expressed as a tuple {NumResets, TimeMs}. This is similar to a supervisor restart intensity/period.
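
For reference, a minimal sketch of where this option would go when starting a listener. The listener name, port, route and handler module are placeholders, and the value shown is simply the documented default made explicit.

```erlang
%% Sketch only: my_http_listener, port 8080 and my_handler are placeholders.
Dispatch = cowboy_router:compile([
    {'_', [{"/", my_handler, []}]}
]),
{ok, _} = cowboy:start_clear(my_http_listener,
    [{port, 8080}],
    #{
        env => #{dispatch => Dispatch},
        %% {NumResets, TimeMs}: at most 10 server-side resets per 10 seconds.
        max_reset_stream_rate => {10, 10000}
    }).
```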

I have not tested this though.

[Edit] I was wrong. This option only controls resets done by the server. Loïc explains it in this comment: #1615 (comment)

That's actually an old one. Not sure why there's a new CVE. See #1398 and the original CVE https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-9514

The relevant test is https://github.com/ninenines/cowboy/blob/master/test/security_SUITE.erl#L205-L230

Oh, never mind, that's not exactly the same. This one is the client sending a request and then cancelling it, very quickly and many times over. I don't think we have a protection against that one.

I think we basically need a max_cancel_stream_rate that works more or less the same as max_reset_stream_rate. Should be an easy enough contribution if anyone wants to do it.
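
To make the idea concrete, here is a rough sketch of a rate check in the spirit of a supervisor's restart intensity/period. It is not Cowboy's implementation; the module name, function names and state layout are all made up for illustration.

```erlang
-module(cancel_rate_sketch).
-export([new/2, cancel/2]).

%% Hypothetical limiter state: allow at most Max cancels per PeriodMs.
new(Max, PeriodMs) ->
    #{max => Max, period => PeriodMs, times => []}.

%% Record a cancel observed at NowMs (a monotonic timestamp in milliseconds).
%% Returns {ok, State} while under the limit, or rate_exceeded otherwise.
cancel(NowMs, State = #{max := Max, period := PeriodMs, times := Times0}) ->
    %% Keep only the cancels that are still inside the window.
    Times = [T || T <- Times0, NowMs - T < PeriodMs],
    case length(Times) >= Max of
        true -> rate_exceeded;
        false -> {ok, State#{times := [NowMs | Times]}}
    end.
```

With `new(10, 10000)`, the first 10 cancels inside any 10 second window are accepted and the 11th returns `rate_exceeded`, at which point a server would presumably close the connection.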

There's a protocol-level draft at https://martinthomson.github.io/h2-stream-limits/draft-thomson-httpbis-h2-stream-limits.html that would likely prove better in the long term, but in the meantime a max_cancel_stream_rate will remain necessary.

Note that these protections are not enough on their own. When Cowboy closes a connection because a reset/cancel rate limit has been reached, nothing prevents the misbehaving client from connecting again. Cowboy does not provide block lists or other features to fully prevent the DoS; it can only make it a little more expensive for the attacker. The protocol-level draft would work out much better against attack scenarios.

I believe if you have nginx, haproxy or Apache httpd in front of Cowboy you are not affected by this issue.

> I think we basically need a max_cancel_stream_rate that works more or less the same as max_reset_stream_rate. Should be an easy enough contribution if anyone wants to do it.

What's the difference between "cancel" and "reset"? The frame to cancel a stream is called RST_STREAM and this is what the existing option controls. CANCEL is only an error code, which can be used in RST_STREAM or GOAWAY. There is no other mention of "cancel" in RFC 7540. What am I missing?

Does Cowboy parse all stream messages in one packet before acting on them? If so, it could cancel out all request/cancel pairs that are essentially no-ops before handling them.

The option controls the case where the server resets a stream due to an error. From Cowboy's point of view, if a stream has to be stopped due to an error, Cowboy has to reset it; if it's the client resetting the stream, Cowboy has to cancel the request handling.

Right now Cowboy has protections against bad requests that it has to repeatedly reset, but there's nothing for the client doing the resets (cancelling requests).

> Does Cowboy parse all stream messages in one packet before acting on them? If so, it could cancel out all request/cancel pairs that are essentially no-ops before handling them.

No, and that would only work for request -> cancel -> request -> cancel anyway. Clients could do N requests -> N cancels.

Two more relevant points.

Depending on the server, and particularly in proxies, a stream may be reset by the client but a lot of resources may still be in use while the request is being shut down. In vanilla Cowboy there is no such problem because Cowboy will shut down the request process. This process by default does not trap exits and so will terminate immediately.

The other angle of attack is the way streams are counted. HTTP/2 has a configurable stream limit. The streams counted against it are the active ones, meaning those that have not ended and have not been reset. In scenarios where resources stay in use for a while after the reset, this becomes a big problem because those reset streams do not count toward the limit, so the attacker wastes a lot of resources with each canceled request.

So applications that are likely to be affected are those that have request handlers that trap exits. Others will see a waste of resources similar to a simple flood of requests.
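
The difference between the two kinds of handlers can be seen with plain processes, no Cowboy involved. This is only a sketch: the module and function names are invented, and the 100 ms wait is an arbitrary stand-in for a shutdown timeout.

```erlang
-module(trap_exit_sketch).
-export([shutdown_outcome/1]).

%% Spawn a worker, send it a shutdown exit signal, and report whether it
%% stopped within 100 ms. TrapExit sets the worker's trap_exit flag.
shutdown_outcome(TrapExit) ->
    Parent = self(),
    Pid = spawn(fun() ->
        process_flag(trap_exit, TrapExit),
        Parent ! {ready, self()},
        slow_cleanup()
    end),
    %% Wait until the flag is set so the signal cannot arrive too early.
    receive {ready, Pid} -> ok end,
    Ref = monitor(process, Pid),
    exit(Pid, shutdown),
    receive
        {'DOWN', Ref, process, Pid, _Reason} -> stopped
    after 100 ->
        %% Still alive: force it, as a supervisor would after its timeout.
        exit(Pid, kill),
        receive {'DOWN', Ref, process, Pid, _} -> ok end,
        still_running
    end.

%% Stands in for a handler that ignores exit messages while it finishes work.
slow_cleanup() ->
    receive after infinity -> ok end.
```

A worker that does not trap exits dies as soon as the signal arrives. One that traps exits only gets an `{'EXIT', From, shutdown}` message in its mailbox and keeps holding its resources until it chooses to stop or is killed.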

Is it documented anywhere that Cowboy handler processes can get terminated basically at any point?

Not sure exactly but related options like shutdown_timeout are documented.

> Note that these protections are not enough on their own. When Cowboy closes a connection because a reset/cancel rate limit has been reached, nothing prevents the misbehaving client from connecting again. Cowboy does not provide block lists or other features to fully prevent the DoS; it can only make it a little more expensive for the attacker. The protocol-level draft would work out much better against attack scenarios.
>
> I believe if you have nginx, haproxy or Apache httpd in front of Cowboy you are not affected by this issue.

We use end-to-end TLS connections and terminate TLS on the Erlang node, and we use a load balancer (nginx, haproxy, etc.) only on the TCP layer. This is to keep the data encrypted even within a semi-trusted environment. With this setup, the load balancer can handle and block DoS attacks when many connections are created, but it cannot see the rapid reset attack, which is just a single TCP connection.

I have opened a PR.

One existing option that limits the attack seems to be max_received_frame_rate, with its default limit of 10000 frames in 10 seconds (which limits the rapid reset attack to 5000 pairs of new stream and reset stream per 10 seconds).
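
The arithmetic behind that estimate, as a small sketch. It assumes each request/cancel pair costs exactly two frames (one HEADERS frame and one RST_STREAM frame, with no CONTINUATION or DATA frames); the module and function names are made up.

```erlang
-module(frame_budget_sketch).
-export([max_pairs/1, max_pairs_per_second/1]).

%% Number of request/cancel pairs that fit in one rate window, assuming
%% two frames per pair: HEADERS then RST_STREAM.
max_pairs({MaxFrames, _PeriodMs}) ->
    MaxFrames div 2.

%% The same budget expressed per second.
max_pairs_per_second({MaxFrames, PeriodMs}) ->
    (MaxFrames div 2) * 1000 div PeriodMs.
```

For the default `{10000, 10000}`, `max_pairs/1` gives 5000 pairs per window and `max_pairs_per_second/1` gives 500 per second, per connection.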

> We use end-to-end TLS connections and terminate TLS on the Erlang node, and we use a load balancer (nginx, haproxy, etc.) only on the TCP layer. This is to keep the data encrypted even within a semi-trusted environment. With this setup, the load balancer can handle and block DoS attacks when many connections are created, but it cannot see the rapid reset attack, which is just a single TCP connection.

That's fine as long as it detects the many connections we keep closing and acts on that. That's what I meant.

> One existing option that limits the attack seems to be max_received_frame_rate, with its default limit of 10000 frames in 10 seconds (which limits the rapid reset attack to 5000 pairs of new stream and reset stream per 10 seconds).

Sounds good, but that is still fairly expensive for the server.

> I have opened a PR.

I will take a look next week when I get back to work. Thanks!

The PR was merged. This will be in the upcoming Cowboy 2.11. If a guide to tweaking security configuration for Cowboy is needed, please open a separate ticket or better yet a PR. Closing this, thanks!
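
As a rough illustration of how the rate options discussed in this thread might sit together in a listener's protocol options. This assumes the merged option kept the max_cancel_stream_rate name proposed above; the listener name, port and handler are placeholders, and the cancel rate value is illustrative only, so check the cowboy_http2 manual of your release for the actual option and its default.

```erlang
%% Sketch only: my_http_listener, port 8080 and my_handler are placeholders.
Dispatch = cowboy_router:compile([{'_', [{"/", my_handler, []}]}]),
{ok, _} = cowboy:start_clear(my_http_listener, [{port, 8080}], #{
    env => #{dispatch => Dispatch},
    %% Resets done by the server because of peer errors.
    max_reset_stream_rate => {10, 10000},
    %% Assumed name; resets (cancels) done by the client. Illustrative value.
    max_cancel_stream_rate => {500, 10000},
    %% Overall frame budget per connection.
    max_received_frame_rate => {10000, 10000}
}).
```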