siderolabs / omni

SaaS-simple deployment of Kubernetes - on your own hardware.

Make `MachineStatusSnapshot` use `MachineStatus` to get the current state of the machine

Unix4ever opened this issue · comments

Problem

MachineStatus holds the current state of the machine. Talos sends updates to the MachineStatus resource as events, which are delivered to Omni over SideroLink.

If the machine can't deliver events, it may skip them, so an update can be lost.

We can also watch the MachineStatus resource over the Talos API, but this requires the API to be up. The Talos API is down, for example, while the initial install is being performed.

Solution

Do not rely on the machine events, but pull the current state from the runtime.MachineStatus resource.

We need to discuss how to get MachineStatus updates while the API is down, e.g. during the installing phase.

I'm not sure whether we should somehow "merge" updates from both sources.

Proposal

We have two event receiving mechanisms:

  1. Events sent over SideroLink - each has an ID that is an xid, which contains a timestamp.
  2. Events watched using the resource API - each has an updated_at timestamp.

Both timestamps are generated by Talos on the machine itself, so they are comparable with each other even if the machine's clock is not in sync.

So we can merge both sources:

  • if the update came over same channel as before, use it
  • if the update came over a different channel than before, and the timestamp is newer than previous update, use it
  • otherwise, drop it