onedr0p / home-ops

Wife approved HomeOps driven by Kubernetes and GitOps using Flux

Home Page: https://onedr0p.github.io/home-ops/



Ongoing updates to kube-vip after initial cluster bootstrap

lucidph3nx opened this issue · comments

Hi, I have been trying to figure out how kube-vip ends up being updated in your cluster setup.

I can see that it is initially deployed via the k3s server pod manifests, which copy over a static pod manifest with a pinned image tag. From there, though, it never seems to be imported into Flux, so it cannot be updated by something like Renovate opening PRs.
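For context, one quick way to check which tag a node is actually running is to read it straight out of the static pod manifest. This is a sketch, not part of the repo: the helper name is made up, and the path shown in the comment is the usual k3s static pod directory, which may differ per setup.

```shell
# Hypothetical helper: print the image reference from a static pod manifest.
current_kube_vip_image() {
  awk '$1 == "image:" {print $2; exit}' "$1"
}

# On a k3s server node the static pod manifests typically live here:
# current_kube_vip_image /var/lib/rancher/k3s/agent/pod-manifests/kube-vip.yaml
```

Because the tag lives only in that file on disk, nothing in the Git repo changes when it is bumped, which is why Renovate never sees it.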

I see that over at flux-cluster-template you have a (possibly unreferenced?) Ansible playbook for updating kube-vip, but I'm not sure whether that is how it is intended to be used.
Ideally, Flux would take over management of kube-vip so it can stay up to date along with everything else in the cluster.

Are you able to clarify my understanding here? Perhaps I have missed something.

Love your work, by the way; I've learned so much about Kubernetes from your projects.

Flux cannot manage static pods. The way I update isn't ideal: I just SSH into the master nodes and update the static pod manifest for kube-vip. The Ansible playbook in the template repo should also work, but I don't need to upgrade kube-vip often enough to really dig in and automate it any other way.
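The manual process described above can be sketched roughly as follows. This is a hedged illustration, not the author's exact commands: the function name, node names, manifest path, and target tag are all assumptions.

```shell
# Hypothetical helper: rewrite the pinned image tag in a kube-vip static
# pod manifest. kubelet watches the static pod manifest directory and
# restarts the pod when the file changes, so no further action is needed.
bump_kube_vip_tag() {
  local manifest="$1" new_tag="$2"
  sed -i "s|\(image: ghcr.io/kube-vip/kube-vip:\).*|\1${new_tag}|" "$manifest"
}

# Example per-node rollout (node names, path, and tag are assumptions):
# for node in master-0 master-1 master-2; do
#   ssh "$node" "sudo sed -i 's|\(kube-vip:\).*|\1v0.8.0|' \
#     /var/lib/rancher/k3s/agent/pod-manifests/kube-vip.yaml"
# done
```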

That makes sense and is consistent with my findings, and you're right, needing to update kube-vip is a rare occurrence.
I'm curious, though: can I ask why it isn't installed the same way as Cilium and CoreDNS, using server manifests which are then imported into Flux later? Is it because it isn't installed using a Helm chart? Or possibly that there is a bootstrap race condition that expects the control-plane VIP to already exist before that?

I don't have a great understanding of how the bootstrap process works since I tend to build a cluster and let it run for a very long time. I should probably do it more regularly.

> or possibly that there is a bootstrap race condition that expects the control plane vip to already exist before that?

This exactly. If kube-vip weren't a static pod, the following would happen: CoreDNS would be forever stuck in a pending state because Cilium isn't working, because kube-vip isn't working, because CoreDNS isn't working.

You can read through the comments on this PR where I made the changes.

onedr0p/cluster-template#740

Thanks for clarifying. I apologise; I completely missed that PR thread. It makes it quite clear why you've gone that particular route.

I do hope that in the future we can figure out a Flux-only solution for ongoing management, but it's not a big deal for now; it's worth it for the benefits Cilium provides.

Thanks again. Closing the issue.