rancher / rancher

Complete container management platform

Home Page: http://rancher.com


Unexpected node loss after intentionally rebooting other node

jberger opened this issue · comments

In the same situation as in #40249, I have deployed an RKE2 cluster using the Rancher UI. Although the cluster includes kube-vip, RKE2 itself is unaware of it, because I didn't know how to tell Rancher about this "fixed registration address" (if I had set it up manually, it would have gone in the "server" field).
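For reference, if I were installing RKE2 by hand I believe the joining servers' config would look roughly like this (the VIP and token below are placeholders, not values Rancher generated):

```yaml
# /etc/rancher/rke2/config.yaml on a joining server node (manual-install sketch)
# The address is my kube-vip VIP acting as the fixed registration address.
server: https://192.168.1.100:9345
token: <cluster-join-token>
```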

I was doing maintenance on node 3 of the cluster and had drained it; finally I rebooted it, and as soon as that node went down, node 4 became unavailable as well. Node 4 was alive, just not communicating with the cluster. I then (like in the previous issue) looked at /etc/rancher/rke2/config.yaml.d/50-rancher.yaml on node 4, and sure enough its "server" was node 3. As it happens, nodes 2 and 3 refer to node 1 as their server. If I bring node 1 down, will I lose the whole cluster? This seems unexpected and undesirable. As in that previous issue, I would like a way to tell Rancher that the cluster has a VIP it can use as the registration address. Is there any way to do that?
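For context, the relevant part of the Rancher-generated file on node 4 looked roughly like this (IPs illustrative, other generated fields omitted), versus what I would expect with a stable registration address:

```yaml
# /etc/rancher/rke2/config.yaml.d/50-rancher.yaml on node 4 (abridged, IPs illustrative)
server: https://10.0.0.3:9345   # points at node 3, so node 4 loses its join endpoint when node 3 reboots

# what I would like to be able to configure instead, e.g. the kube-vip VIP:
# server: https://10.0.0.100:9345
```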