Recent kernels enforce system-wide unique L2TPv3 session IDs

Question

RalfJung opened this issue 7 years ago

When running tunneldigger on the current Debian stable kernel (4.9.51), only one client can connect. The second client fails because the l2tp tunnel interface does not appear. After fixing a bug in the netlink interface (#50), one can see that the kernel sends an EEXIST in reply to the session_create.

A lot of digging through the Linux kernel sources uncovered the cause: L2TPv3 session IDs have to be unique system-wide. Tunneldigger hard-codes a session ID of 1 for every connection. That used to work only because of a kernel bug: the kernel failed to actually enforce uniqueness of the session ID. That bug was fixed by https://github.com/linux-stable/linux-stable/commit/dbdbc73b44782e22b3b4b6e8b51e7a3d245f3086, which was backported to a few stable series, in particular to 4.9.36.

Proposed fix

Fixing this in a compatible way will require protocol changes: both ends of the tunnel have to know each other's session ID, so they have to negotiate whether they use 1 or something more unique. I started working on a fix at https://github.com/freifunk-saar/tunneldigger/tree/wlanslovenija. The approach is summarized in the commit message over there, copied here for reference:

This patch adds unique session IDs to tunneldigger in a backwards-compatible
way.  If both ends of the tunnel agree to use a unique session ID, they both
will use the tunnel ID as the session ID.  To manage this mutual agreement, two
messages in the protocol are changed:

CONTROL_TYPE_PREPARE gains a new optional byte at the end that clients use to
indicate to the server whether they want to use a unique session ID.  Old
servers will just ignore this additional byte.  New servers now know they are
talking with a modern client, and use unique session IDs for this connection.
New servers talking with old clients will notice the absence of this request and
use 1 as the session ID.

Furthermore, CONTROL_TYPE_TUNNEL gains a new optional byte at the end that
servers use to tell clients that they acknowledge using unique session IDs.  Old
clients will never see this additional byte, as the server only sends it if
unique session IDs were requested in CONTROL_TYPE_PREPARE.  New clients know,
upon seeing this byte, that they are talking to a new server, and will hence use
unique session IDs.  If a new client talks to an old server, it will receive an
old-style CONTROL_TYPE_TUNNEL and hence know that it has to use session ID 1.

So, both old and new clients can talk with both old and new servers.  However, of
course, if the server has a recent enough kernel, even though it can communicate
with old clients, it still can only support one old client at a time.
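
The negotiation described in the commit message can be sketched as a small simulation. This is only an illustration of the decision logic, not the actual tunneldigger code: the function names and the boolean flags standing in for the optional bytes in CONTROL_TYPE_PREPARE and CONTROL_TYPE_TUNNEL are assumptions made for this example.

```python
LEGACY_SESSION_ID = 1  # the value tunneldigger hard-codes for every connection


def server_decide(tunnel_id, client_requested_unique):
    """New-server logic: return (session_id, send_ack_byte)."""
    if client_requested_unique:
        # Modern client: use the tunnel ID as session ID and acknowledge it.
        return tunnel_id, True
    # Old client: no request byte seen, fall back to session ID 1.
    return LEGACY_SESSION_ID, False


def client_decide(tunnel_id, server_acked_unique):
    """New-client logic: pick the session ID after CONTROL_TYPE_TUNNEL."""
    return tunnel_id if server_acked_unique else LEGACY_SESSION_ID


def negotiate(tunnel_id, client_is_new, server_is_new):
    """Simulate one handshake; return (client_session_id, server_session_id)."""
    # Only new clients append the request byte to CONTROL_TYPE_PREPARE.
    requested = client_is_new
    if server_is_new:
        server_sid, ack = server_decide(tunnel_id, requested)
    else:
        # Old servers ignore the extra byte and never send an ack byte.
        server_sid, ack = LEGACY_SESSION_ID, False
    if client_is_new:
        client_sid = client_decide(tunnel_id, ack)
    else:
        # Old clients always use the hard-coded session ID.
        client_sid = LEGACY_SESSION_ID
    return client_sid, server_sid
```

In all four old/new combinations both ends end up with the same session ID; only the new/new pairing uses the tunnel ID.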

I am running two of our four servers with this fix, so compatibility with old clients is tested already. However, due to #55, I can't say anything about long-time stability yet. I also couldn't yet test new clients as I am still fighting my firmware build system. (The client uses such an ancient version of libnl that I can't build it on the host.)

Open problems

As the last paragraph in the commit description says, there still is a potential problem: Once we upgrade one of our servers to a kernel including the problematic bugfix, only one old client will be able to talk to it at a time. There is nothing we can do about this, but what we want to avoid is a client trying to connect to a new server and failing, while there are old servers (with higher usage) that could still support this client. I first tried to (ab)use CONTROL_TYPE_USAGE to let the client indicate whether it supports unique session IDs, so that the server could report "I am full" to old clients and steer them elsewhere. However, clients actually seem to send some rather arbitrary data alongside that message (UUUUUUUU, to be precise -- wtf?!?), so I am worried that attaching meaningful bytes here will not work very well. We could introduce a CONTROL_TYPE_USAGE2, but I think I have a better idea.

Clients already have a retry loop to connect again if the connection to the broker failed. I think clients should remember which broker failed, and exclude that one in the next round. Only once all brokers have been excluded that way will they be enabled again. This will, I think, improve client behavior in general, not just for this particular issue. It will also solve this issue because, after #50, brokers will send an error to clients when the session ID is already used, making the client try some other server. So, as long as one of the available brokers still has an old kernel, old clients will reliably be able to connect. Furthermore, even if all N servers are on a new kernel, there can still be N old clients connected at the same time, one per server (and hopefully, they will fetch an auto-update and then become new clients).

I started implementing this, but got stuck yesterday due to the aforementioned build system issues.
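
The "exclude failed brokers until all have failed" idea could look roughly like the sketch below. The real client is written in C and picks brokers by usage; this sketch simply takes the first remaining candidate, and `BrokerSelector`, `connect_with_retries` and `try_connect` are hypothetical names used only for illustration.

```python
class BrokerSelector:
    """Tracks which brokers failed in the current round."""

    def __init__(self, brokers):
        self.brokers = list(brokers)
        self.excluded = set()

    def candidates(self):
        """Return the brokers that are eligible for the next attempt."""
        remaining = [b for b in self.brokers if b not in self.excluded]
        if not remaining:
            # Every broker failed once: start a new round with all of them.
            self.excluded.clear()
            remaining = list(self.brokers)
        return remaining

    def mark_failed(self, broker):
        self.excluded.add(broker)


def connect_with_retries(selector, try_connect, max_attempts):
    """Try brokers in turn, skipping those that already failed this round.

    try_connect is a caller-supplied function returning True on success.
    """
    for _ in range(max_attempts):
        broker = selector.candidates()[0]
        if try_connect(broker):
            return broker
        selector.mark_failed(broker)
    return None
```

With brokers a, b, c where only c accepts the client, the third attempt succeeds; if every broker refuses, the selector cycles through all of them again instead of hammering the same one.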

Jernej Kos · Answer 1 · Mon Oct 30 2017 22:28:20 GMT+0800 (China Standard Time)

However, clients actually seem to send some rather arbitrary data alongside that message (UUUUUUUU, to be precise -- wtf?!?), so I am worried that attaching meaningful bytes here will not work very well.

Such padding is used in get cookie (and usage?) protocol messages in order to ensure a specific message length. This is so that the protocol does not enable traffic amplification attacks (e.g. where sending a small (with possibly spoofed source) message would generate a larger message in return).

At least in get cookie messages this was the reasoning. I did not implement the "usage" functionality, so I am not sure how it is used there, but it should be similar.
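
The anti-amplification rule boils down to "never answer with more bytes than the request carried". A minimal sketch of both sides follows, assuming hypothetical helper names and an arbitrary pad byte; it does not reproduce tunneldigger's actual message layout.

```python
def pad_request(payload, expected_response_len, pad_byte=b"\x00"):
    """Client side: pad the request so it is at least as long as the reply.

    pad_byte is arbitrary here; only the resulting length matters.
    """
    missing = expected_response_len - len(payload)
    if missing > 0:
        payload += pad_byte * missing
    return payload


def should_answer(request, response):
    """Server side: refuse to send a reply larger than the request.

    Without this check a small spoofed request could trigger a larger
    response towards the spoofed source address.
    """
    return len(request) >= len(response)
```

The padding only helps if the server actually enforces the length check; padding that the receiver never inspects gives no protection.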

Ralf Jung · Answer 2 · Mon Oct 30 2017 22:43:58 GMT+0800 (China Standard Time)

Such padding is used in get cookie (and usage?) protocol messages in order to ensure a specific message length. This is so that the protocol does not enable traffic amplification attacks (e.g. where sending a small (with possibly spoofed source) message would generate a larger message in return).

I see, thanks. There is also a "minimum packet length" thing going on in context_send_packet, though. That one doesn't increase the length field sent inside the packet, so it does not create any issues with possible protocol extensions.
Also, the server doesn't check the msg length for CONTROL_TYPE_USAGE, so this does not even help. (It does for CONTROL_TYPE_COOKIE though.) That's probably a bug then?

Jernej Kos · Answer 3 · Mon Oct 30 2017 22:47:47 GMT+0800 (China Standard Time)

Also, the server doesn't check the msg length for CONTROL_TYPE_USAGE, so this does not even help. (It does for CONTROL_TYPE_COOKIE though.) That's probably a bug then?

Yes, I am not sure what the reasoning is with CONTROL_TYPE_USAGE having 8 bytes of padding (which is not even checked), as the response is only two additional bytes. For CONTROL_TYPE_COOKIE the padding length is the same as the cookie length. This probably needs to be fixed.

Ralf Jung · Answer 4 · Tue Oct 31 2017 18:17:57 GMT+0800 (China Standard Time)

Actually, while "clients don't retry a broken broker" may still be a good idea, it doesn't help here. The problem is old clients, and obviously patching the client now won't help them.

So, I am still inclined to have a client indicate, in CONTROL_TYPE_USAGE, after the 8 bytes of padding, whether it supports unique session IDs. It's not nice, but it is the only thing I can think of which makes sure that old clients keep working after we have some (but not all) of our servers updated to the new kernel. If we don't do this, it may happen that the machines with new kernels all have lower usage, and then old clients will never even try to connect to the one machine with the old kernel that would still work for them.

Jernej Kos · Answer 5 · Tue Oct 31 2017 19:01:49 GMT+0800 (China Standard Time)

So we should probably add some kind of generic "supported features" field (e.g. a 32-bit bitmap field), which would indicate to the broker what kind of features are supported.

The question is where to put this field. If we put it in CONTROL_TYPE_USAGE messages, then there is no way to signal supported features if you don't use the usage-based broker selection. So in this case we would need to put it into CONTROL_TYPE_COOKIE messages as well (in this case, the broker can just reply with an error if the client doesn't support certain features).
Nevermind, I see that usage requests are always sent.

I don't like how this "usage" part was added to the protocol. It would be much better for "usage" to be indicated as a feature in the CONTROL_TYPE_COOKIE message, and for the broker to then include the usage message. But we can't really change this now without breaking the protocol.

Thinking about it, going forward, we should consider updating the protocol (there is already an 8-bit version field provided for that in all messages) so that the messages are based on a sequence of TLVs instead of fixed fields. This would make extending the protocol much easier in the future.
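
To illustrate what a TLV-based message body could look like, here is a minimal encoder and decoder. The layout (1-byte type, 2-byte big-endian length, then the value) is an assumption chosen for this sketch; the thread does not specify any concrete TLV format.

```python
import struct

TLV_HEADER = "!BH"  # assumed layout: 1-byte type, 2-byte length, network order
TLV_HEADER_LEN = struct.calcsize(TLV_HEADER)  # 3 bytes


def encode_tlvs(items):
    """Encode an iterable of (type, value_bytes) pairs into one byte string."""
    out = bytearray()
    for tlv_type, value in items:
        out += struct.pack(TLV_HEADER, tlv_type, len(value))
        out += value
    return bytes(out)


def decode_tlvs(data):
    """Decode a byte string into a list of (type, value_bytes) pairs."""
    items = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < TLV_HEADER_LEN:
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from(TLV_HEADER, data, offset)
        offset += TLV_HEADER_LEN
        if len(data) - offset < length:
            raise ValueError("truncated TLV value")
        items.append((tlv_type, data[offset:offset + length]))
        offset += length
    return items
```

Because every element carries its own length, a receiver can skip types it does not know, which is what makes this kind of format easier to extend than fixed fields.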

Ralf Jung · Answer 6 · Wed Nov 01 2017 00:45:24 GMT+0800 (China Standard Time)

Nevermind, I see that usage requests are always sent.

Right, ever since 88df1c5.

Thinking about it, going forward, we should consider updating the protocol (there is already an 8-bit version field provided for that in all messages) so that the messages are based on a sequence of TLVs instead of fixed fields. This would make extending the protocol much easier in the future.

I don't disagree. But I wouldn't want to do this as part of fixing this problem.

So we should probably add some kind of generic "supported features" field (e.g. a 32-bit bitmap field), which would indicate to the broker what kind of features are supported.

What about the following: For now, we make it an 8-bit field. It is sent both in the COOKIE (after the 8-byte padding) and the PREPARE message (so the server does not need state). Absence of the field means all-0. The first bit (b & 0x1) is "supports unique session IDs". The other ones are not used. (I'd like to enforce this, but it would not be good if old servers would bail on new clients setting more flags.)

I mean, we could use 4 bytes, but that seems wasteful?
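
The one-byte variant proposed here could be parsed along these lines. The constant and function names, and the offset passed by the caller, are assumptions for illustration; the only parts taken from the proposal are "absence means all-0" and "bit 0x1 means unique session IDs".

```python
FEATURE_UNIQUE_SESSION_ID = 0x01  # first bit: supports unique session IDs


def parse_features(message, offset):
    """Return the feature byte at `offset`, or 0 if the message ends earlier.

    Old clients do not send the field at all, which is treated as all-0.
    """
    if len(message) <= offset:
        return 0
    return message[offset]


def supports_unique_session_id(features):
    """Check the one defined bit; unknown bits are deliberately ignored."""
    return bool(features & FEATURE_UNIQUE_SESSION_ID)
```

Ignoring unknown bits instead of rejecting them keeps a server written today from bailing on future clients that set additional flags.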

Jernej Kos · Answer 7 · Wed Nov 01 2017 01:05:46 GMT+0800 (China Standard Time)

I don't disagree. But I wouldn't want to do this as part of fixing this problem.

Yes I agree that we should fix this first.

What about the following: For now, we make it an 8-bit field. It is sent both in the COOKIE (after the 8-byte padding) and the PREPARE message (so the server does not need state). Absence of the field means all-0. The first bit (b & 0x1) is "supports unique session IDs". The other ones are not used. (I'd like to enforce this, but it would not be good if old servers would bail on new clients setting more flags.)

Ok, sounds good.

Mitar · Answer 8 · Wed Nov 01 2017 02:18:19 GMT+0800 (China Standard Time)

I mean, we could use 4 bytes, but that seems wasteful?

Aren't these bytes sent only in the first message? This is not too much. Maybe making it future proof to have more "supported features" if needed might be a good thing.

Ralf Jung · Answer 9 · Wed Nov 01 2017 02:33:27 GMT+0800 (China Standard Time)

I don't really care, so sure. Seems like I will have some fun with endianness then :/
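
For the four-byte variant, byte order has to be fixed explicitly. A short sketch, assuming network byte order (big-endian), which the thread itself does not settle; the function names are made up for this example.

```python
import struct


def pack_features(features):
    """Serialize a 32-bit feature bitmap in network byte order."""
    return struct.pack("!I", features)


def unpack_features(data):
    """Parse a 32-bit feature bitmap; a missing field counts as all-0."""
    if len(data) < 4:
        return 0
    return struct.unpack("!I", data[:4])[0]
```

Using an explicit format such as "!I" on both ends avoids depending on the host's native endianness.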

Ralf Jung · Answer 10 · Thu Nov 09 2017 06:10:56 GMT+0800 (China Standard Time)

I have a fix for this at https://github.com/freifunk-saar/tunneldigger/tree/wlanslovenija. The server side of this is currently already being tested in our network; for the client side I will hopefully soon have an experimental firmware but there are some other issues I want to look at.

Also, it is based on #61, so I am waiting for that to get merged.

Jorrit Poelen · Answer 11 · Thu Feb 15 2018 05:26:33 GMT+0800 (China Standard Time)

Seems like the link to the GitHub commit with the fix in linux-stable is broken.

Here's another reference to the patch: https://patchwork.kernel.org/patch/9843481/