Make `MachineStatusSnapshot` use `MachineStatus` to get the current state of the machine
Unix4ever opened this issue
Problem
`MachineStatus` is the current state of the machine. Updates to the `MachineStatus` resource are sent by Talos as events and delivered to Omni over SideroLink.
If the machine can't deliver an event, it might skip sending it, so an update might be lost.
We can also watch the `MachineStatus` resource over the Talos API, but this requires the API to be up. The Talos API is down, for example, while the initial install is being performed.
Solution
Do not rely on the machine events, but pull the current state from the `runtime.MachineStatus` resource.
We still need to discuss how to get `MachineStatus` updates when the API is down, e.g. during the installing phase.
I'm not sure whether we should somehow "merge" updates from both sources.
Proposal
We have two event receiving mechanisms:
- Events sent over SideroLink: they have an `ID`, which is an `xid` and contains a timestamp.
- Events watched using the resource API: they have an `updated_at` timestamp.
Both timestamps are generated by Talos, so they are comparable with each other even if the machine's clock is not in sync.
So we can merge both sources:
- if the update came over the same channel as before, use it
- if the update came over a different channel than before, and the timestamp is newer than the previous update, use it
- otherwise, drop it