siderolabs / omni

SaaS-simple deployment of Kubernetes - on your own hardware.

Support Omni join config via machine config

smira opened this issue · comments

Since v1.5.0, Talos supports partial machine config submitted as user-data, which can include the machine configuration for the SideroLink join and the event sink.

When Talos runs on a cloud platform with user-data support, we can boot a "vanilla" Talos image and submit the join config via user-data.

The flow goes like this:

  1. Vanilla Talos boots on the cloud.
  2. Talos discovers join config as partial machine config in the cloud user-data.
  3. Talos joins Omni.
  4. Omni pushes the full machine config to the machine; that config should include the join configuration.
  5. Omni resets the STATE partition, wiping the full machine config.
  6. Talos reboots and re-loads the join config from user-data, and the cycle repeats.
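
To make step 2 concrete, here is a rough sketch of what the partial machine config in user-data could look like, rendered from Python. The document kinds (`SideroLinkConfig`, `EventSinkConfig`) and field names (`apiUrl`, `endpoint`) are written as I recall them from the Talos config reference and should be checked against it; the helper name `render_join_config` and all values are hypothetical placeholders.

```python
import json


def render_join_config(api_url: str, event_sink_endpoint: str) -> str:
    """Render a two-document partial machine config as YAML text.

    Assumption: document kinds and field names follow the Talos config
    reference; verify them there before relying on this output.
    """
    documents = [
        {"apiVersion": "v1alpha1", "kind": "SideroLinkConfig", "apiUrl": api_url},
        {"apiVersion": "v1alpha1", "kind": "EventSinkConfig", "endpoint": event_sink_endpoint},
    ]
    rendered = []
    for doc in documents:
        # JSON strings are valid YAML scalars, so json.dumps gives safe quoting
        # for values that contain ':' or '[' (for example IPv6 endpoints).
        lines = [f"{key}: {json.dumps(value)}" for key, value in doc.items()]
        rendered.append("\n".join(lines))
    # YAML documents are separated by a '---' line.
    return "\n---\n".join(rendered) + "\n"


if __name__ == "__main__":
    # Placeholder values only; a real join URL and endpoint come from Omni.
    print(render_join_config(
        "https://omni.example.invalid:8090/?jointoken=PLACEHOLDER",
        "[fd00::1]:8090",
    ))
```

The output would be passed as cloud user-data. Because it lives outside the STATE partition, it survives the reset in step 5.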

First goal: make Omni respect the join config, that is, submit it to the machine along with the rest of the machine config (see also siderolabs/omni-archive#1459).

Note: in maintenance mode, there's no way to get the machine configuration from Talos, so Omni should synthesize the join config and submit it as one of the "config patches" whenever it discovers a machine that joined Omni without the proper kernel args.
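
A minimal sketch of the "without proper kernel args" check, assuming the join-related kernel args are `siderolink.api` and `talos.events.sink` (as I recall them from the Talos kernel parameter docs) and assuming Omni has the kernel command line available as a string. The function names are hypothetical and this is not how Omni actually implements it.

```python
# Assumed names of the kernel args that carry the join configuration.
REQUIRED_JOIN_ARGS = ("siderolink.api", "talos.events.sink")


def missing_join_args(cmdline: str) -> list[str]:
    """Return the required join args that are absent from the kernel cmdline."""
    present = {
        token.split("=", 1)[0]
        for token in cmdline.split()
        if "=" in token
    }
    return [arg for arg in REQUIRED_JOIN_ARGS if arg not in present]


def needs_join_config_patch(cmdline: str) -> bool:
    """True if the join config should be synthesized as a config patch."""
    return bool(missing_join_args(cmdline))


if __name__ == "__main__":
    # A machine booted from a vanilla image with no join kernel args.
    print(needs_join_config_patch("talos.platform=aws console=ttyS0"))
```

A machine that joined through user-data would have neither arg on its command line, so the check would flag it and Omni would include the join config among the patches.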

Second goal: make Omni publish expected join config in the UI/as a resource.

Putting on hold until the feature is released (v1.6.5, v1.5.7): siderolabs/talos#8326

Testing the full cycle requires a release with this fix: siderolabs/talos@1bb6027

Currently, with a partial machine config, the initial install works, but when the machine is reset and then re-added to a new cluster, the triggered maintenance upgrade causes Talos to crash because of the partial config it holds.

The fix was merged, so it should be unblocked?

Yep, I'm back at this, but with the way I'm doing it now, the integration test I'm adding won't even require the fix.