rancher / rancher-docs

Rancher Documentation

Home Page: https://ranchermanager.docs.rancher.com/

Expand/Include more information on how to do rke2 restore for custom clusters

mrolmedo opened this issue · comments

Hi team! In our docs, when we need to perform an rke2 restore while all controlplane and etcd nodes are completely unavailable, we follow this guide: https://ranchermanager.docs.rancher.com/how-to-guides/new-user-guides/backup-restore-and-disaster-recovery/restore-rancher-launched-kubernetes-clusters-from-backup#restoring-a-cluster-from-a-snapshot-when-the-controlplaneetcd-are-completely-unavailable.
However, it might be worth including some additional information:

  1. Step 2: After all etcd nodes are removed, add a new etcd node that you are planning to restore from.

Can we include that the new node has to be etcd, controlplane and worker?
Cf. https://support.tools/post/rke2-with-rancher-disaster-recovery/

  2. Step 3: Restore from an etcd snapshot.
    For local snapshots, restore using the UI is not available. In the upper right corner, click ⋮ > Edit YAML. Define spec.cluster.rkeConfig.etcdSnapshotRestore.name as the filename of the snapshot on disk in /var/lib/rancher/<k3s/rke2>/server/db/snapshots/.

2.1 Because this is a clean node, the path does not exist and has to be created manually. It might be worth including that.

2.2 Define spec.cluster.rkeConfig.etcdSnapshotRestore.name as the filename of the snapshot on disk in /var/lib/rancher/<k3s/rke2>/server/db/snapshots/.

This has caused confusion many times: the name is often placed under the etcd configuration, with the entire path instead of just the filename. Can we include a screenshot or something like that?
[image: rke2-restore]

Thanks for your time.

Hello @mrolmedo! I am working on a PR to add this information and wanted to clarify the necessary YAML for the etcdSnapshotRestore configuration. I have as an example the following code block that could be added:

...
rkeConfig:
  etcdSnapshotRestore:
    name: <string> # Refers to the filename of the associated etcdsnapshot object.
    generation: <int> # Changing the generation initiates a snapshot restore.
    restoreRKEConfig: <string> # Set to either none (or empty string), all, or kubernetesVersion.
...

Are generation and restoreRKEConfig required for the restore, and are there any other variables needed to be set here? Any guidance much appreciated, thanks!

Hello @sunilarjun. I'm on vacation the entire week. I will get back to you on Monday the 6th.

Sounds good! Enjoy your break!

Hi @sunilarjun
Are generation and restoreRKEConfig required for the restore?
The generation field is not required.

The restoreRKEConfig field takes one of 3 values when using the Rancher UI to create a restore:

  1. none ==> Restore only etcd
  2. kubernetesVersion ==> Restore the Kubernetes version and etcd
  3. all ==> Restore the cluster config, Kubernetes version and etcd
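
As a rough illustration of the three options above, this is how the fragment might look when restoring both the Kubernetes version and etcd. This is only a sketch: the snapshot filename is a hypothetical placeholder, and the value descriptions in the comments simply repeat the list above.

```yaml
rkeConfig:
  etcdSnapshotRestore:
    name: example-snapshot-file   # hypothetical filename, replace with the real snapshot filename
    # One of the three values described above:
    #   none (or empty string) -> restore only etcd
    #   kubernetesVersion      -> restore the Kubernetes version and etcd
    #   all                    -> restore the cluster config, Kubernetes version and etcd
    restoreRKEConfig: kubernetesVersion
```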

I need to do more tests from my side. I will update with more info.

Hi @sunilarjun, regarding step 3: after the tests I've done, the only field that has to be set is the name, as the documentation currently states.

So the only change here would be the formatting, for clarity:

rkeConfig:
  etcdSnapshotRestore:
    name: <string> # Refers to the filename of the associated etcdsnapshot object.

In other words, replace the current wording of that section with:

rkeConfig:
  etcdSnapshotRestore:
    name: <string> # Refers to the filename of the associated etcdsnapshot object.
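
For context, here is a sketch of where that fragment might sit in the cluster object opened through ⋮ > Edit YAML. It assumes a provisioning.cattle.io/v1 Cluster object with the field under spec.rkeConfig. The cluster name, namespace, retention value and snapshot filename are hypothetical placeholders. It is meant to illustrate the two points of confusion raised above: the restore is a sibling of the etcd configuration rather than part of it, and name holds only the filename, not the full path.

```yaml
# Abridged sketch; names and values marked "hypothetical" are placeholders.
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: my-custom-cluster         # hypothetical cluster name
  namespace: fleet-default        # assumed namespace for downstream clusters
spec:
  rkeConfig:
    etcd:
      snapshotRetention: 5        # regular snapshot settings; the restore does not go here
    etcdSnapshotRestore:
      # Filename only, as found in /var/lib/rancher/<k3s/rke2>/server/db/snapshots/,
      # not the entire path.
      name: example-snapshot-file # hypothetical filename
```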

Thank you @mrolmedo for the clarification! Getting the subsequent PR updated with this information.