kubernetes / kubeadm

Aggregator for issues filed against kubeadm

make the temporary directory used for backing up static pod manifests during upgrade configurable

shubham-yewalekar opened this issue

What keywords did you search in kubeadm issues before filing this one?

TempDirForKubeadm and KubernetesDir

I see that the values of these variables are hardcoded, so we cannot use a directory other than /etc/kubernetes/tmp for backing up the static pod manifest files during upgrade.

Is this a BUG REPORT or FEATURE REQUEST?

FEATURE REQUEST

We should allow a custom backup directory for the static pod manifests during upgrade. If /etc/kubernetes/tmp doesn't have enough space, the upgrade is blocked; kubeadm should instead accept a custom directory for the manifest backups.

Versions

Kubeadm version: 1.22 onwards (I observed it on kubeadm 1.22, but I think the problem affects all kubeadm releases)

NOTE: I would like to work on the code changes required for adding this functionality if this feature proposal is accepted as a valid request. Thanks!!

I see that the values of these variables are hardcoded, so we cannot use a directory other than /etc/kubernetes/tmp for backing up the static pod manifest files during upgrade.

i'm generally not in favor of such a change. you could use a symlink, correct?

No, a symlink doesn't work. It fails with an error:

[upgrade/etcd] Failed to upgrade etcd: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [rename /etc/kubernetes/manifests/etcd.yaml /etc/kubernetes/tmp/kubeadm-backup-manifests-2024-02-13-23-27-53/etcd.yaml: invalid cross-device link, rename /etc/kubernetes/tmp/kubeadm-backup-manifests-2024-02-13-23-27-53/etcd.yaml /etc/kubernetes/manifests/etcd.yaml: invalid cross-device link]
[upgrade/etcd] Waiting for previous etcd to become available
I0213 23:27:56.716523 52085 etcd.go:463] [etcd] attempting to see if all cluster endpoints ([https://1<IP>:<port> https://<IP>:<port> https://<IP>:<port>]) are available 1/10
[upgrade/etcd] Etcd was rolled back and is now available
[rename /etc/kubernetes/manifests/etcd.yaml /etc/kubernetes/tmp/kubeadm-backup-manifests-2024-02-13-23-27-53/etcd.yaml: invalid cross-device link, rename /etc/kubernetes/tmp/kubeadm-backup-manifests-2024-02-13-23-27-53/etcd.yaml /etc/kubernetes/manifests/etcd.yaml: invalid cross-device link]
couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced

This implies that /etc/kubernetes and the target of the /etc/kubernetes/tmp symlink are on different filesystems. For the same reason, mounting a separate filesystem at /etc/kubernetes/tmp would not alleviate the issue and would likely result in the same error.

I also faced a similar issue while using a symlink for the backup dir /etc/kubernetes/tmp: invalid cross-device link

It's an issue with the os.Rename function, which is used to move the manifest files to and from the backup directory during the upgrade. os.Rename cannot move a file across filesystems.

I also had a similar issue. A symlink doesn't work unless the root filesystem itself is big enough, and we cannot have a big / partition.

[upgrade/etcd] Failed to upgrade etcd: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [rename /etc/kubernetes/manifests/etcd.yaml /etc/kubernetes/tmp/kubeadm-backup-manifests-2024-02-13-23-27-53/etcd.yaml: invalid cross-device link, rename /etc/kubernetes/tmp/kubeadm-backup-manifests-2024-02-13-23-27-53/etcd.yaml /etc/kubernetes/manifests/etcd.yaml: invalid cross-device link]
[upgrade/etcd] Waiting for previous etcd to become available
I0213 23:27:56.716523

that seems like something we can fix with a better rename function?
https://gist.github.com/var23rav/23ae5d0d4d830aff886c3c970b8f6c6b?permalink_comment_id=4431960#gistcomment-4431960

another thing you can try is to use --rootfs (the kubeadm flag) to have your whole kubernetes tree on a different volume?
unclear if chroot (which --rootfs uses) would even work for this.

We should allow a custom backup directory for the static pod manifests during upgrade. If /etc/kubernetes/tmp doesn't have enough space, the upgrade is blocked; kubeadm should instead accept a custom directory for the manifest backups.

what disk sizes are we talking about here? how big is the disk and how much of it is the tmp dir taking?
manifest files don't contribute to size. is it the etcd backup that is the culprit here?
if you have old etcd backups you should delete them or manually move them somewhere.

$ sudo du -sh /etc/kubernetes/tmp/kubeadm-backup-etcd-2024-02-15-21-59-40
124M	/etc/kubernetes/tmp/kubeadm-backup-etcd-2024-02-15-21-59-40

$ sudo du -sh /etc/kubernetes/tmp/kubeadm-backup-manifests-2024-02-15-21-59-40
16K	/etc/kubernetes/tmp/kubeadm-backup-manifests-2024-02-15-21-59-40

we are in the process of developing a new UpgradeConfiguration in v1beta4.
kubernetes/kubernetes#123068

so now would be the time to add such new options, for example:

  • tmpPath String

but maybe we want the symlink working instead.
currently none of these paths for /etc/kubernetes and /var/lib/kubelet and /var/lib/etcd are configurable.

cc @pacoxu @calvin0327