This document provides a comprehensive guide for taking backups of the Kubernetes etcd data and restoring it in case of failures. The steps ensure cluster data is securely saved and can be restored as needed.
Backups are stored in /data2/backup/etcd_backup on the syhydsrv001 server.
The etcd snapshot saves the current state of your Kubernetes cluster. Run the following command to take a backup:
ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key
This creates a snapshot file named snapshot.db in the current directory.
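To keep successive backups from overwriting each other, the snapshot can be written directly into the backup directory with a timestamp in its name. The following is a minimal sketch, not part of the original procedure: the function names snapshot_path and take_snapshot and the file naming pattern are illustrative assumptions.

```shell
# Hypothetical helpers: the names and the file naming scheme are assumptions.

# Build a timestamped snapshot path inside a backup directory.
# An explicit timestamp can be passed as the second argument.
snapshot_path() {
  local dir="$1"
  local stamp="${2:-$(date +%Y%m%d-%H%M%S)}"
  printf '%s/etcd-snapshot-%s.db\n' "${dir%/}" "$stamp"
}

# Save a snapshot into the given directory, using the same certificate
# paths as the command above.
take_snapshot() {
  local dir="$1" target
  target="$(snapshot_path "$dir")"
  mkdir -p "$dir" &&
  ETCDCTL_API=3 etcdctl snapshot save "$target" \
    --cacert /etc/kubernetes/pki/etcd/ca.crt \
    --cert /etc/kubernetes/pki/etcd/server.crt \
    --key /etc/kubernetes/pki/etcd/server.key
}

# Example (run on the etcd node):
# take_snapshot /data2/backup/etcd_backup
```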
Ensure the snapshot is successfully created by checking its status:
ETCDCTL_API=3 etcdctl snapshot status --write-out=table snapshot.db
The output will display the snapshot’s metadata, such as size and revision number, confirming a successful backup.
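In a scripted backup it can help to fail early when the snapshot file is missing or empty, before asking etcdctl for its status. A small sketch under that assumption follows; check_snapshot is a hypothetical helper name.

```shell
# Hypothetical helper: refuse to report on a missing or empty snapshot file,
# otherwise print the snapshot metadata as a table.
check_snapshot() {
  local file="$1"
  if [ ! -s "$file" ]; then
    echo "snapshot missing or empty: $file" >&2
    return 1
  fi
  ETCDCTL_API=3 etcdctl snapshot status --write-out=table "$file"
}

# Example:
# check_snapshot snapshot.db || echo "backup needs attention" >&2
```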
To ensure no certificate data is lost, compress and back up the etcd certificate files:
tar -zcvf etcd.tar.gz /etc/kubernetes/pki/etcd
The compressed file etcd.tar.gz contains all the necessary etcd certificates.
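One way to confirm the archive really holds the certificate files is to list its contents without extracting it. The sketch below assumes the three file names used in the commands in this guide; archive_has is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the archive lists a file with the given name.
archive_has() {
  local archive="$1" name="$2"
  tar -tzf "$archive" | grep "/${name}\$" >/dev/null
}

# Example: check for the CA certificate, server certificate and server key.
# for f in ca.crt server.crt server.key; do
#   archive_has etcd.tar.gz "$f" || echo "missing from archive: $f" >&2
# done
```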
Follow these steps to restore etcd data from a backup:
Unpack the previously compressed etcd certificate files to their original location:
tar -zxvf etcd.tar.gz -C /
This restores the etcd certificate files to /etc/kubernetes/pki/etcd.
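Before moving on to the snapshot restore, it may be worth confirming that the certificate files are present after extraction. This is a sketch only; missing_certs is a hypothetical helper, and the file list is taken from the commands in this guide.

```shell
# Hypothetical helper: print the name of each expected file that is absent
# from the given directory (defaults to the path used in this guide).
missing_certs() {
  local dir="${1:-/etc/kubernetes/pki/etcd}" f
  for f in ca.crt server.crt server.key; do
    [ -f "$dir/$f" ] || echo "$f"
  done
}

# Example: an empty result means all three files were restored.
# missing_certs
```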
Run the following command to restore the etcd snapshot:
ETCDCTL_API=3 etcdctl --data-dir="/var/lib/etcd_bkp" \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot restore snapshot.db
--data-dir specifies the target directory for the restored etcd data (/var/lib/etcd_bkp in this example).
Edit the etcd configuration file to point to the restored data directory:
Open the etcd manifest file:
nano /etc/kubernetes/manifests/etcd.yaml
Locate the --data-dir parameter and update it to the restored directory:
--data-dir=/var/lib/etcd_bkp
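If the manifest mounts the data directory through a hostPath volume (as kubeadm-generated manifests do with the etcd-data volume), update that volume's path and the matching mountPath to the same directory so that the etcd container can see the restored data.
Save and exit the file.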
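The same edit can be scripted instead of made by hand. The sketch below assumes the previous data directory was /var/lib/etcd (the kubeadm default, which this guide does not state) and that the path ends each line that references it; retarget_data_dir is a hypothetical helper. It prints the modified manifest rather than editing it in place, so the result can be reviewed first.

```shell
# Hypothetical helper: print a manifest with the old data directory replaced.
# Only matches the old path at the end of a line, so longer paths that merely
# contain similar text (for example certificate paths) are left alone.
retarget_data_dir() {
  local manifest="$1" old="$2" new="$3"
  sed "s|${old}\$|${new}|" "$manifest"
}

# Example (the old path /var/lib/etcd is an assumption):
# retarget_data_dir /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd /var/lib/etcd_bkp > /tmp/etcd.yaml
# diff /etc/kubernetes/manifests/etcd.yaml /tmp/etcd.yaml
```

Writing the output outside /etc/kubernetes/manifests and moving it into place after review avoids the kubelet picking up a half-edited file.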
After you save the manifest, the kubelet recreates the etcd static pod with the new configuration, which can take a few minutes. Verify the cluster status once etcd is back online.
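Rather than waiting a fixed amount of time, the health check can be polled until it succeeds. The following is a minimal sketch: wait_until and etcd_healthy are hypothetical helper names, the endpoint and certificate paths are the ones used earlier in this guide, and the retry count and delay are arbitrary.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Ask the local etcd member whether it is healthy.
etcd_healthy() {
  ETCDCTL_API=3 etcdctl endpoint health \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key
}

# Example: poll every 10 seconds, up to 30 times, then list the nodes.
# wait_until 30 10 etcd_healthy && kubectl get nodes
```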
By following these steps, you can back up etcd data and restore it after a failure, so that the state of your Kubernetes cluster can be recovered.