quicwg / load-balancers

In-progress version of draft-ietf-quic-load-balancers

Home Page: https://quicwg.org/

Expand to 3 Config ID bits

martinduke opened this issue

Today, the draft has two bits reserved to identify the config (usually just the specific key) used to encrypt the routing information, with one codepoint reserved for 5-tuple routing (i.e. the connection ID was generated without an active config). This leaves 3 codepoints to support key rotation.
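To make the codepoint arithmetic concrete, here is a minimal sketch. It assumes the config ID occupies the most significant bits of the first connection ID octet and that exactly one codepoint is reserved for 5-tuple routing; the function names are hypothetical and are not taken from the draft.

```python
def config_id(first_octet: int, config_bits: int) -> int:
    """Extract the config ID, assumed to sit in the top bits of the first octet."""
    return first_octet >> (8 - config_bits)


def usable_codepoints(config_bits: int) -> int:
    """Codepoints left for key rotation after reserving one for 5-tuple routing."""
    return (1 << config_bits) - 1


# Two bits leave 3 codepoints for rotation; three bits leave 7.
print(usable_codepoints(2), usable_codepoints(3))
```

The same first octet yields a different config ID depending on the width chosen, so the load balancer and every server have to agree on the width before any key is rotated.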

After a long discussion with people running a very large fleet of servers, there is concern that 3 codepoints are not enough, and that there should be seven (3 bits) instead. In a large deployment, some servers being badly out of sync is rare but not unheard of. Worse yet, a badly out-of-sync server likely results in a black hole:

  1. Client first flight has random CID1, which is unroutable, so the load balancer uses some arbitrary f(CID1).
  2. Server generates a CID2 with a badly outdated config, which is used in the client's second flight.
  3. The load balancer uses the correct key for the codepoint, which is not the key the server used. The decoded CID2 is likely unroutable, so the load balancer applies f(CID2) and almost certainly gets a different result.
  4. The new server sees an undecryptable long header packet. At best, it sends a stateless reset and the client has already received the stateless reset token. If the client does not have the token, or the server does not send a stateless reset, the client black-holes: it has received a first flight from the server, and no further packets.
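The sequence above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the one-byte XOR "cipher" stands in for the draft's real encryption, the pool has four servers, and the fallback `f` is a SHA-256 hash modulo the pool size.

```python
import hashlib
from typing import Dict, Optional

N_SERVERS = 4  # hypothetical pool; valid server IDs are 0..3


def make_cid(codepoint: int, server_id: int, key: int) -> bytes:
    # Toy "encryption": XOR with a one-byte key. NOT the draft's cipher.
    return bytes([codepoint, server_id ^ key])


def decode_server_id(cid: bytes, lb_keys: Dict[int, int]) -> Optional[int]:
    """Return the server ID the load balancer decodes, or None if unroutable."""
    key = lb_keys.get(cid[0])
    if key is None:
        return None
    server_id = cid[1] ^ key
    return server_id if server_id < N_SERVERS else None


def fallback(cid: bytes) -> int:
    # The arbitrary f(CID) used when a CID cannot be routed.
    digest = hashlib.sha256(cid).digest()
    return int.from_bytes(digest[:4], "big") % N_SERVERS


def route(cid: bytes, lb_keys: Dict[int, int]) -> int:
    server_id = decode_server_id(cid, lb_keys)
    return fallback(cid) if server_id is None else server_id


lb_keys = {1: 0xC3}                    # load balancer's current key for codepoint 1
in_sync = make_cid(1, 2, key=0xC3)     # server 2 holds the current key
stale = make_cid(1, 2, key=0x5A)       # server 2 reuses codepoint 1 with an old key

print(decode_server_id(in_sync, lb_keys))  # decodes to server 2
print(decode_server_id(stale, lb_keys))    # None: unroutable, so f(CID2) decides
```

With only three rotation codepoints, a badly lagging server is more likely to reuse a codepoint that the load balancer has already reassigned to a newer key, which is the collision this sketch models.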

Going to three bits has a couple of downsides:

  1. Length self-encoding is limited to 32-byte connection IDs instead of 64. This is not an issue with existing QUIC (limited to 20 bytes), but if some future version has larger connection IDs with post-quantum keys, this becomes a problem, albeit a highly speculative one.
  2. Having more codepoints can encourage insecure behavior. In particular, a CDN with multiple customers behind an IP address can assign different codepoints and keys to each customer's servers. This would easily identify the target customer of any packet and largely neuter the benefits of ECH.
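The length trade-off in the first point is simple bit arithmetic. The sketch below assumes the bits of the first octet not used for the config ID carry the self-encoded length; whether the largest encodable length is exactly the number of values or one less depends on the draft's encoding, which is not asserted here.

```python
QUIC_V1_MAX_CID_LEN = 20  # maximum connection ID length in existing QUIC


def length_field_values(config_bits: int) -> int:
    """Number of distinct values the leftover first-octet bits can encode."""
    return 1 << (8 - config_bits)


for bits in (2, 3):
    values = length_field_values(bits)
    print(bits, values, values > QUIC_V1_MAX_CID_LEN)
```

Both widths still cover the 20-byte limit of existing QUIC, so the cost only appears if a future version allows connection IDs longer than the 3-bit layout can describe.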

Regarding these concerns, I do not think they are significant, for the following reasons:

  • I do not expect CDNs to implement capabilities for supporting just 7 customers.
  • ECH is about hiding a tree in the forest; a difference of 1 bit should not cause problems. If it does, that is either a problem with ECH or a sign that the deployment is too small.