flatcar / Flatcar

Flatcar project repository for issue tracking, project documentation, etc.

Home Page: https://www.flatcar.org/

Flatcar Beta 3913.1.0 with systemd 255 enables DHCP rapid commit by default

daMupfel opened this issue

Description

The new Flatcar Beta release, 3913.1.0, updates systemd to version 255. This version adds support for DHCP RapidCommit, which seems to be enabled by default:

RapidCommit=

    Takes a boolean. The DHCPv4 client can obtain configuration parameters from a DHCPv4 server through a rapid two-message exchange (discover and ack). When the rapid commit option is set by both the DHCPv4 client and the DHCPv4 server, the two-message exchange is used. Otherwise, the four-message exchange (discover, offer, request, and ack) is used. The two-message exchange provides faster client configuration. See [RFC 4039](https://tools.ietf.org/html/rfc4039) for details. Defaults to true when Anonymize=no and neither AllowList= nor DenyList= is specified, and false otherwise.

    Added in version 255.
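
The rule quoted above can be read as two small decisions: what the client's default is, and which exchange results once both sides have stated their preference. The following is a minimal Python sketch of that reading, not systemd's implementation; the function names and parameters are made up for illustration.

```python
from typing import List, Sequence


def rapid_commit_default(
    anonymize: bool = False,
    allow_list: Sequence[str] = (),
    deny_list: Sequence[str] = (),
) -> bool:
    """Default of RapidCommit= as described in the quoted documentation.

    True only when Anonymize=no and neither AllowList= nor DenyList= is set.
    """
    return (not anonymize) and not allow_list and not deny_list


def dhcpv4_exchange(client_rapid_commit: bool, server_rapid_commit: bool) -> List[str]:
    """Messages exchanged, given what client and server each request."""
    if client_rapid_commit and server_rapid_commit:
        # Both sides set the rapid commit option: two-message exchange.
        return ["discover", "ack"]
    # Otherwise fall back to the classic four-message exchange.
    return ["discover", "offer", "request", "ack"]


if __name__ == "__main__":
    client = rapid_commit_default()  # plain config: no Anonymize, no lists
    print(client, dhcpv4_exchange(client, server_rapid_commit=True))
    print(dhcpv4_exchange(False, server_rapid_commit=True))
```

With an otherwise unmodified network config the client default comes out as true, so whether the short exchange is used depends entirely on the server side.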

Our cloud provider (CloudSigma) seems to have a faulty implementation of DHCPv4 rapid commit, which means that our machines no longer get an IP address.

This can be fixed (for existing servers) by copying the default config to /etc/systemd/network/za-dhcp-no-rapid-commit.network as a custom config and adapting the [DHCPv4] section as follows:

[DHCPv4]
RoutesToDNS=false
RapidCommit=false
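
The copy-and-adapt step can also be sketched as a small text transform in Python. The sample input is a hypothetical stand-in for the default config, whose real contents are not shown in this issue, and `disable_rapid_commit` is an illustrative helper name, not part of Flatcar or systemd.

```python
def disable_rapid_commit(config_text: str) -> str:
    """Return a copy of a .network file with RapidCommit=false in [DHCPv4]."""
    out = []
    in_dhcpv4 = False
    inserted = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Section header: remember whether we are inside [DHCPv4].
            in_dhcpv4 = stripped == "[DHCPv4]"
            out.append(line)
            if in_dhcpv4 and not inserted:
                out.append("RapidCommit=false")
                inserted = True
            continue
        if in_dhcpv4 and stripped.split("=", 1)[0].strip() == "RapidCommit":
            # Drop any existing setting; ours was written after the header.
            continue
        out.append(line)
    if not inserted:
        # No [DHCPv4] section in the input: append one at the end.
        if out and out[-1].strip():
            out.append("")
        out.extend(["[DHCPv4]", "RapidCommit=false"])
    return "\n".join(out) + "\n"


# Hypothetical default config, for illustration only.
SAMPLE = """[Match]
Name=*

[Network]
DHCP=yes

[DHCPv4]
RoutesToDNS=false
"""

if __name__ == "__main__":
    print(disable_rapid_commit(SAMPLE), end="")
```

The result would then be saved under /etc/systemd/network/ with a file name that sorts before the default config, so that systemd-networkd picks it up first.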

Impact

Machines do not get an IP address. Because the CloudInit process for CloudSigma requires an assigned lease, the whole setup no longer works.

Environment and steps to reproduce

  1. Upload the current Flatcar Beta CloudSigma vendor image to CloudSigma
  2. Create a new machine
  3. No public IP is assigned and the CloudInit process never runs

Expected behavior

The server is set up correctly with an IP address and the CloudInit config.

Additional information

We are also in discussions with CloudSigma about fixing their DHCP implementation, but it is unclear when or how that will be resolved.

This is not really a bug on Flatcar's side, but rather a breaking change for us, because the network config behaves differently with the new version.

The question is how this could be fixed (if you are open to doing it on the Flatcar side). I currently see the following options:

  • Update the default network config to disable rapid commit
  • Add a custom network config file to the vendored CloudSigma image

I would like to get some feedback on this and can probably provide a PR if you are fine with one of the proposed solutions :).

    Add a custom network config file to the vendored CloudSigma image

This would definitely be a good idea if the default does not cause widespread problems for other platforms.

@jepio if added only to oem-cloudsigma it shouldn't affect other platforms, should it? And it potentially affects all CloudSigma deployments the way I read the summary.

@daMupfel I would argue that implementing this should be done as an OEM sysext so the change is also distributed to existing nodes when these update (@pothos please keep me honest).
Using an OEM sysext would also allow changing the config with future updates if required. As sysexts cover /usr, the config should go to /usr/lib/systemd/network/.
This is slightly (but only slightly) more complicated than just dropping a config file into the oem-cloudsigma provider. The biggest challenge is introducing OEM sysexts to the CloudSigma image, as this image is currently not using OEM sysexts afaict. But that shouldn't keep you from working on a PR: OEM sysexts are used for most other images, and the concept should be easily portable to CloudSigma.

I think the OEM sysext might get loaded too late? For most clouds the small network config files are part of the base image because they need to be in bootengine and in init.

Hmmm, good point, re-reading the summary it states that bootstrap configuration fails, so this is required in the initrd. No sysext then.

Hi, thanks for the feedback so far :).

When adding it to the oem image it won't be updated on existing installations (the oem partition seems to keep the state of the original install), is that correct? At least that was my observation so far.
If so, are there any options to make this work for existing installations which update?