Backing up, restoring, and migrating Kubernetes clusters with Velero 📦
Velero is a lightweight tool to safely back up, restore, handle cluster failovers, and migrate Kubernetes cluster resources and persistent volumes.
Kind is a lightweight utility that creates a single-node Kubernetes cluster inside a Docker container for testing purposes. Using Kind allows us to quickly create two test clusters of different versions, which lets us simulate a cluster migration.
To install Kind:
- check out the kind quickstart guide
- or watch this video
CAUTION 🛑🛑:
- Make sure Docker is installed on your machine.
- If you are using Kubernetes 1.17, check that CoreDNS is working. To verify the status of CoreDNS, check this post here, or use the quick check below.
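For a quick sanity check from the shell, the commands below are a minimal sketch (they assume CoreDNS pods carry the usual k8s-app=kube-dns label, which can vary by distribution):
# list the CoreDNS pods and confirm they are Running
kubectl -n kube-system get pods -l k8s-app=kube-dns
# tail their logs to spot crash loops or lookup errors
kubectl -n kube-system logs -l k8s-app=kube-dns --tail=20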
Instructions for creating a service account with the necessary permissions are here.
Velero requires a storage location to push backup files to and retrieve them from during a restore. We'll be using a Google Cloud Storage bucket for this tutorial, but you can explore the wide variety of storage plugins offered by Velero here.
You can grab the Terraform CLI from here, or else use a Docker container that comes pre-installed with Terraform. The infrastructure files for Terraform are placed inside the storage folder. Make sure your credentials.json is present inside the gcpServiceAccount folder.
docker run -it --rm -v ${PWD}/storage:/storage -w /storage akshit8/terraform
Note: akshit8/terraform is a Debian Docker container with the Terraform CLI (v0.14.7) installed.
Once inside the running container, run the following commands to create a storage bucket on GCP.
# download the GCP provider and any dependencies
terraform init
# apply the changes, creating the bucket on GCP
terraform apply
If no error is thrown, you'll see the newly created bucket in your cloud console.
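You can also confirm from the shell, assuming the Google Cloud SDK (gsutil) is installed and authenticated, and that your bucket is named velero-akshit as later in this tutorial:
# list the bucket to confirm it exists
gsutil ls gs://velero-akshit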
With the storage bucket in place, let's create a test-cluster running Kubernetes version 1.18:
kind create cluster --name test-cluster --image kindest/node:v1.18.0
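To confirm the cluster was created, you can list all Kind clusters:
kind get clusters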
To install both CLIs (kubectl and velero), we can use a Debian Docker container.
docker run -it --rm -v ${HOME}:/root/ -v ${PWD}:/work -w /work --net host debian:buster
Mounting the $HOME directory gives the container access to the kubeconfig generated by the Kind CLI.
- Installing kubectl
curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl
chmod +x ./kubectl
mv ./kubectl /usr/local/bin/kubectl
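Optionally, confirm the binary works before touching the cluster:
# print only the client version, without contacting any cluster
kubectl version --client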
To verify kubectl and our test-cluster, run the following command:
root@my-vm:/work# kubectl get nodes
NAME                         STATUS   ROLES    AGE     VERSION
test-cluster-control-plane   Ready    master   5m15s   v1.18.0
- Installing the Velero CLI
curl -L -o /tmp/velero.tar.gz https://github.com/vmware-tanzu/velero/releases/download/v1.5.1/velero-v1.5.1-linux-amd64.tar.gz
tar -C /tmp -xvf /tmp/velero.tar.gz
mv /tmp/velero-v1.5.1-linux-amd64/velero /usr/local/bin/velero
chmod +x /usr/local/bin/velero
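A quick check that the Velero CLI is on your PATH (the --client-only flag skips contacting the cluster):
velero version --client-only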
The Kubernetes objects used for this tutorial are located in the k8s-objects folder.
kubectl create ns sample
kubectl -n sample apply -f ./k8s-objects
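Before taking a backup, it's worth confirming the objects came up:
kubectl -n sample get all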
Using the Velero CLI installed previously, we need to deploy the components that Velero uses inside our cluster and configure them so that Velero can access our cloud storage bucket.
# setting the bucket name
export BUCKET=velero-akshit
# installing velero with provider gcp
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.1.0 \
--bucket $BUCKET \
--secret-file ./gcpServiceAccount/credentials.json
Note: this creates a new namespace, velero, to hold all the components.
To verify the above installation, run the following commands:
root@my-vm:/work# kubectl -n velero get pods
NAME                      READY   STATUS    RESTARTS   AGE
velero-86bb45cdfb-987ps   1/1     Running   0          23s
kubectl logs deployment/velero -n velero
If the installation and the connection to our storage bucket succeeded, there will be no error messages in the deployment logs.
To add the sample namespace to the Velero backup pool:
velero backup create sample-namespace-backup --include-namespaces sample
velero backup describe sample-namespace-backup
If an error occurs, inspect the backup logs:
velero backup logs sample-namespace-backup
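To see exactly which resources were captured in the backup, the describe command also accepts a --details flag:
velero backup describe sample-namespace-backup --details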
Listing the backups:
root@my-vm:/work/velero# velero get backups
NAME                      STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
sample-namespace-backup   Completed   0        0          2021-02-24 07:44:11 +0000 UTC   29d       default            <none>
Verify on the Google Cloud console: our bucket now contains backup files for all the Kubernetes objects that were deployed inside the sample namespace.
To simulate a failure, let's delete everything we deployed in the sample namespace:
kubectl -n sample delete -f ./k8s-objects
Let's now recover the deleted objects with Velero:
velero restore create sample-namespace-backup --from-backup sample-namespace-backup
In case of any error, refer to the restore logs:
velero restore logs sample-namespace-backup
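You can also list all restores and check their status at a glance:
velero restore get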
Let's verify whether the sample namespace has been restored:
root@my-vm:/work/velero# kubectl get all -n sample
NAME                             READY   STATUS    RESTARTS   AGE
pod/sample-app-6ffc75c46-g6bbg   1/1     Running   0          24s
pod/sample-app-6ffc75c46-nsg8d   1/1     Running   0          24s

NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/sample-service   ClusterIP   10.104.123.76   <none>        3000/TCP   24s

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sample-app   2/2     2            2           24s

NAME                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/sample-app-6ffc75c46   2         2         2       24s
As before, we'll use Kind to spin up another lightweight cluster, this time with Kubernetes version 1.19:
kind create cluster --name test-cluster-2 --image kindest/node:v1.19.0
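Kind normally switches your kubeconfig context to the new cluster automatically; if it doesn't, switch manually (Kind contexts follow the kind-<cluster-name> naming convention):
kubectl config use-context kind-test-cluster-2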
Check if the cluster is ready and accessible:
root@my-vm:/work# kubectl get nodes
NAME                           STATUS   ROLES    AGE    VERSION
test-cluster-2-control-plane   Ready    master   6m1s   v1.19.0
- repeat the above steps to install Velero again (see the consolidated command below)
- make sure the deployment logs display no errors
- verify all components inside the velero namespace are running
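For reference, the install command is the same as before, pointing at the same bucket and credentials:
# remember to export BUCKET=velero-akshit again in the new shell
velero install \
    --provider gcp \
    --plugins velero/velero-plugin-for-gcp:v1.1.0 \
    --bucket $BUCKET \
    --secret-file ./gcpServiceAccount/credentials.json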
List the backups present inside the storage bucket:
root@my-vm:/work/velero# velero get backup
NAME                      STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
sample-namespace-backup   Completed   0        0          2021-02-24 07:44:11 +0000 UTC   29d       default            <none>
Starting the restore:
velero restore create sample-namespace-backup --from-backup sample-namespace-backup
Verifying the restore:
root@my-vm:/work/velero# velero restore describe sample-namespace-backup
Phase: Completed
Started: 2021-02-24 09:52:47 +0000 UTC
Completed: 2021-02-24 09:52:48 +0000 UTC
Checking if all the components have been recovered:
kubectl get all -n sample
Note: during a migration, Velero syncs with our storage bucket to get the list of all backups, but it doesn't apply or create these backups in your cluster automatically; you must create the restore explicitly, as we did above.
- We have successfully simulated a cluster failover and migration, restoring our cluster to its original state.
- Velero can also back up stateful workloads and volumes; the focus of this tutorial was backing up stateless workloads only.
Akshit Sadana akshitsadana@gmail.com
- Github: @Akshit8
- LinkedIn: @akshitsadana
Licensed under the MIT License