
Kubernetes Etcd Snapshot: Etcd Snapshot and Restore with Etcdctl, Verify Etcd Member Health; Etcdctl Commands

Kubernetes Kubectl Etcdctl
Kubernetes-Components - This article is part of a series.
Part 26: This Article

Overview
#

In this tutorial I’m using the following Kubernetes cluster deployed with Kubeadm:

NAME      STATUS   ROLES           AGE   VERSION    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION     CONTAINER-RUNTIME
ubuntu1   Ready    control-plane   69d   v1.28.11   192.168.30.10   <none>        Ubuntu 24.04 LTS   6.8.0-36-generic   containerd://1.7.18
ubuntu2   Ready    worker          69d   v1.28.11   192.168.30.11   <none>        Ubuntu 24.04 LTS   6.8.0-36-generic   containerd://1.7.18
ubuntu3   Ready    worker          69d   v1.28.11   192.168.30.12   <none>        Ubuntu 24.04 LTS   6.8.0-36-generic   containerd://1.7.18

Etcd Snapshot
#

Install Etcdctl
#

# Install etcdctl
sudo apt install etcd-client
# Verify etcdctl is installed / list version
etcdctl version

# Shell output:
etcdctl version: 3.4.30
API version: 3.4

Etcd Details
#

Etcdctl needs the following details to create an etcd snapshot:

  • etcd endpoint (--endpoints): Specified in the --listen-client-urls and/or --advertise-client-urls options
  • CA certificate (--cacert): Specified in the --trusted-ca-file option
  • Server certificate (--cert): Specified in the --cert-file option
  • Server key (--key): Specified in the --key-file option
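
The four values listed above can also be pulled out of the static pod manifest with a small helper instead of reading the whole file. The following is a minimal sketch: the function name etcd_flag is made up for this example, and the default path assumes a Kubeadm cluster like the one used in this tutorial.

```shell
# Hypothetical helper: print the value of one etcd flag from a static pod manifest.
# Usage: etcd_flag <flag-name-without-dashes> [manifest-path]
etcd_flag() {
  local flag="$1"
  local manifest="${2:-/etc/kubernetes/manifests/etcd.yaml}"
  # Match lines such as "    - --cert-file=/path" and print the part after "=".
  # The leading "--" is part of the pattern, so "cert-file" does not match "--peer-cert-file".
  sed -n "s/^[[:space:]]*-[[:space:]]*--${flag}=\(.*\)\$/\1/p" "$manifest"
}

# Only query the real manifest when it is readable (this usually requires root).
if [ -r /etc/kubernetes/manifests/etcd.yaml ]; then
  for flag in advertise-client-urls trusted-ca-file cert-file key-file; do
    echo "${flag}: $(etcd_flag "$flag")"
  done
fi
```

On the cluster from this tutorial, the loop should print the same values that appear in the manifest output in the next section.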

List Etcd Configuration Details
#

# List etcd server configuration
sudo cat /etc/kubernetes/manifests/etcd.yaml

# Shell output:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.168.30.10:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.168.30.10:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://192.168.30.10:2380
    - --initial-cluster=ubuntu1=https://192.168.30.10:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.30.10:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://192.168.30.10:2380
    - --name=ubuntu1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.12-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}

Etcd Pod Details
#

Alternatively, the Etcd configuration details can be found by describing the Etcd pod:

# List pods in "kube-system" namespace
kubectl get pod -n kube-system

# Shell output:
NAME                               READY   STATUS    RESTARTS      AGE
cilium-dd9vk                       1/1     Running   1 (35m ago)   69d
cilium-dvc8h                       1/1     Running   1 (34m ago)   69d
cilium-operator-579c6c96c4-6gtvz   1/1     Running   1 (35m ago)   69d
cilium-qvjpb                       1/1     Running   1 (35m ago)   69d
coredns-5dd5756b68-8smln           1/1     Running   1 (35m ago)   69d
coredns-5dd5756b68-jlmhf           1/1     Running   1 (35m ago)   69d
etcd-ubuntu1                       1/1     Running   1 (35m ago)   69d
kube-apiserver-ubuntu1             1/1     Running   1 (35m ago)   69d
kube-controller-manager-ubuntu1    1/1     Running   1 (35m ago)   69d
kube-proxy-6z669                   1/1     Running   1 (35m ago)   69d
kube-proxy-cfhvt                   1/1     Running   1 (34m ago)   69d
kube-proxy-rkg4z                   1/1     Running   1 (35m ago)   69d
kube-scheduler-ubuntu1             1/1     Running   1 (35m ago)   69d

# List Etcd master node pod details
kubectl describe pod etcd-ubuntu1 -n kube-system

# Shell output:
Name:                 etcd-ubuntu1
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 ubuntu1/192.168.30.10
Start Time:           Fri, 13 Sep 2024 09:58:52 +0000
Labels:               component=etcd
                      tier=control-plane
Annotations:          kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.168.30.10:2379
                      kubernetes.io/config.hash: aca294370ff6585d2a5cd9590e82f594
                      kubernetes.io/config.mirror: aca294370ff6585d2a5cd9590e82f594
                      kubernetes.io/config.seen: 2024-07-05T17:36:16.190325136Z
                      kubernetes.io/config.source: file
Status:               Running
SeccompProfile:       RuntimeDefault
IP:                   192.168.30.10
IPs:
  IP:           192.168.30.10
Controlled By:  Node/ubuntu1
Containers:
  etcd:
    Container ID:  containerd://f458f93327bb07ed1b2382d3d1b6b002e50356e43254b7d8b1886bb3a78119a4
    Image:         registry.k8s.io/etcd:3.5.12-0
    Image ID:      registry.k8s.io/etcd@sha256:44a8e24dcbba3470ee1fee21d5e88d128c936e9b55d4bc51fbef8086f8ed123b
    Port:          <none>
    Host Port:     <none>
    Command:
      etcd
      --advertise-client-urls=https://192.168.30.10:2379
      --cert-file=/etc/kubernetes/pki/etcd/server.crt
      --client-cert-auth=true
      --data-dir=/var/lib/etcd
      --experimental-initial-corrupt-check=true
      --experimental-watch-progress-notify-interval=5s
      --initial-advertise-peer-urls=https://192.168.30.10:2380
      --initial-cluster=ubuntu1=https://192.168.30.10:2380
      --key-file=/etc/kubernetes/pki/etcd/server.key
      --listen-client-urls=https://127.0.0.1:2379,https://192.168.30.10:2379
      --listen-metrics-urls=http://127.0.0.1:2381
      --listen-peer-urls=https://192.168.30.10:2380
      --name=ubuntu1
      --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
      --peer-client-cert-auth=true
      --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
      --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
      --snapshot-count=10000
      --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
      ...

Create Etcd Snapshot
#

# Create a snapshot directory
sudo mkdir -p /opt/backup/etcd/
# Create an Etcd Snapshot
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.30.10:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /opt/backup/etcd/etcd-$(date +%Y%m%d).db

# Shell output:
Snapshot saved at /opt/backup/etcd/etcd-20240913.db

  • ETCDCTL_API=3: Selects the etcd v3 API, the current and recommended API version
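
For repeated backups, the two commands above can be wrapped in a small script. This is only a sketch under the assumptions of this tutorial (endpoint, certificate paths and backup directory are taken from the commands above); the function names snapshot_path and take_snapshot are hypothetical.

```shell
# Hypothetical helper: build the snapshot path for a directory and an optional date stamp.
snapshot_path() {
  local dir="${1%/}"                    # strip one trailing slash, if any
  local stamp="${2:-$(date +%Y%m%d)}"   # default: today's date, as in the command above
  printf '%s/etcd-%s.db\n' "$dir" "$stamp"
}

# Hypothetical wrapper: create the directory, save the snapshot, then print its status.
# Each step only runs if the previous one succeeded.
take_snapshot() {
  local target
  target="$(snapshot_path "${1:-/opt/backup/etcd}")"
  mkdir -p "$(dirname "$target")" &&
    ETCDCTL_API=3 etcdctl \
      --endpoints=https://192.168.30.10:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key \
      snapshot save "$target" &&
    ETCDCTL_API=3 etcdctl --write-out=table snapshot status "$target"
}

# Example call (run as root on the control-plane node):
#   take_snapshot /opt/backup/etcd
```

Keeping the path logic in its own function makes it easy to switch to a finer-grained time stamp if more than one snapshot per day is needed.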

Verify the Etcd Snapshot
#

# List snapshot details
sudo ETCDCTL_API=3 etcdctl --write-out=table snapshot status /opt/backup/etcd/etcd-20240913.db

# Shell output:
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| c50fb43b |     6493 |       1916 |      10 MB |
+----------+----------+------------+------------+
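
In a backup script, the table above can be checked automatically, for example to catch an empty snapshot. The helper below is a sketch that parses the table layout shown in this section; its name snapshot_total_keys is hypothetical.

```shell
# Hypothetical helper: read a "snapshot status" table on stdin and print the TOTAL KEYS value.
snapshot_total_keys() {
  # Data rows start with "|"; the header row is skipped because its first cell contains "HASH".
  # Split on "|": field 2 is HASH, field 3 is REVISION, field 4 is TOTAL KEYS.
  awk -F'|' '/^\|/ && $2 !~ /HASH/ { gsub(/ /, "", $4); print $4 }'
}

# Example (run as root on the control-plane node):
#   keys=$(ETCDCTL_API=3 etcdctl --write-out=table snapshot status /opt/backup/etcd/etcd-20240913.db | snapshot_total_keys)
#   [ "${keys:-0}" -gt 0 ] || echo "snapshot contains no keys" >&2
```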



Restore Etcd
#

Restore Etcd from Snapshot
#

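Restore Etcd from the previously created snapshot. Because no --data-dir option is given, the data is restored into the default.etcd directory below the current working directory, as the log output shows: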

# Restore Etcd
sudo ETCDCTL_API=3 etcdctl snapshot restore /opt/backup/etcd/etcd-20240913.db

# Shell output:
{"level":"info","ts":1726223939.9055638,"caller":"snapshot/v3_snapshot.go:306","msg":"restoring snapshot","path":"/opt/backup/etcd/etcd-20240913.db","wal-dir":"default.etcd/member/wal","data-dir":"default.etcd","snap-dir":"default.etcd/member/snap"}
{"level":"info","ts":1726223939.9282627,"caller":"mvcc/kvstore.go:388","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":5465}
{"level":"info","ts":1726223939.938816,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"cdf818194e3a8c32","local-member-id":"0","added-peer-id":"8e9e05c52164694d","added-peer-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":1726223939.9411979,"caller":"snapshot/v3_snapshot.go:326","msg":"restored snapshot","path":"/opt/backup/etcd/etcd-20240913.db","wal-dir":"default.etcd/member/wal","data-dir":"default.etcd","snap-dir":"default.etcd/member/snap"}
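
The restored data in default.etcd is not yet used by the running cluster. One common approach on a Kubeadm cluster, sketched here as an assumption rather than as part of the original walkthrough, is to restore into a dedicated directory with --data-dir and then point the etcd-data hostPath of the static pod manifest at it. The directory /var/lib/etcd-restore and both function names are hypothetical.

```shell
# Hypothetical wrapper: restore a snapshot into an explicit data directory
# instead of ./default.etcd. The target should be a new, empty location.
restore_snapshot() {
  local snapshot="$1"
  local target_dir="${2:-/var/lib/etcd-restore}"
  ETCDCTL_API=3 etcdctl snapshot restore "$snapshot" --data-dir "$target_dir"
}

# Hypothetical helper: print a copy of the manifest whose etcd-data hostPath
# points at the new directory. The input file itself is not modified.
point_manifest_at() {
  local new_dir="$1"
  local manifest="${2:-/etc/kubernetes/manifests/etcd.yaml}"
  # Only the line "path: /var/lib/etcd" is rewritten; the --data-dir flag and the
  # mountPath stay unchanged because the path inside the container does not move.
  sed "s#^\([[:space:]]*path:[[:space:]]*\)/var/lib/etcd\$#\1${new_dir}#" "$manifest"
}

# Example (run as root on the control-plane node):
#   restore_snapshot /opt/backup/etcd/etcd-20240913.db /var/lib/etcd-restore
#   point_manifest_at /var/lib/etcd-restore > /tmp/etcd.yaml    # review the result first
#   cp /tmp/etcd.yaml /etc/kubernetes/manifests/etcd.yaml       # kubelet then recreates the etcd pod
```

Writing the modified manifest to a temporary file first keeps the original untouched until the change has been reviewed.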

Verify Etcd Health
#

List Etcd Members
#

This command lists all the members of the etcd cluster:

# List all members of the Etcd cluster
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.30.10:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list

# Shell output:
ca9d099d8255e45c, started, ubuntu1, https://192.168.30.10:2380, https://192.168.30.10:2379, false

# List all members of the Etcd cluster: Format output
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.30.10:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list --write-out=table

# Shell output:
+------------------+---------+---------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |  NAME   |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+---------+----------------------------+----------------------------+------------+
| ca9d099d8255e45c | started | ubuntu1 | https://192.168.30.10:2380 | https://192.168.30.10:2379 |      false |
+------------------+---------+---------+----------------------------+----------------------------+------------+

  • PEER ADDRS: URL used by other etcd members to communicate with this member

  • CLIENT ADDRS: URL used by etcd clients to interact with this member

  • IS LEARNER: false. A learner is a non-voting member of the cluster. Learners can receive data and help with read operations, but they do not contribute to the quorum and therefore do not affect the cluster's resilience.
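
The quorum mentioned in the last bullet is a majority of the voting members, that is floor(n/2) + 1 for n voting members. The following small sketch works through the arithmetic; the function names are made up for this example.

```shell
# Quorum for n voting members is floor(n/2) + 1; learners are not counted.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of voting members that may fail while a quorum is still available.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 5; do
  echo "voting members: $n, quorum: $(quorum_size "$n"), tolerated failures: $(fault_tolerance "$n")"
done
```

For the single-member cluster in this tutorial the quorum is 1 and no member failure is tolerated, which is one reason why regular snapshots matter here.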


Verify Etcd Member Health
#

# List Etcd member health
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.30.10:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health

# Shell output:
https://192.168.30.10:2379 is healthy: successfully committed proposal: took = 4.672119ms

# List Etcd member health: Format output
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.30.10:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health --write-out=table

# Shell output:
+----------------------------+--------+-----------+-------+
|          ENDPOINT          | HEALTH |   TOOK    | ERROR |
+----------------------------+--------+-----------+-------+
| https://192.168.30.10:2379 |   true | 4.44304ms |       |
+----------------------------+--------+-----------+-------+
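
For monitoring or cron jobs, the default (non-table) output can be turned into an exit status. The helper below is a sketch based on the "is healthy" line format shown above; the function name is hypothetical, and the exact wording of lines for unhealthy endpoints is an assumption, so every line without "is healthy" is simply treated as a failure.

```shell
# Hypothetical helper: read "endpoint health" output on stdin and succeed
# only if there is at least one line and every non-empty line reports "is healthy".
all_endpoints_healthy() {
  awk 'NF { total++; if ($0 ~ / is healthy/) ok++ }
       END { exit !(total > 0 && ok == total) }'
}

# Example (run as root on the control-plane node, with the flags from above):
#   ETCDCTL_API=3 etcdctl --endpoints=https://192.168.30.10:2379 \
#     --cacert=/etc/kubernetes/pki/etcd/ca.crt \
#     --cert=/etc/kubernetes/pki/etcd/server.crt \
#     --key=/etc/kubernetes/pki/etcd/server.key \
#     endpoint health 2>&1 | all_endpoints_healthy || echo "etcd is not healthy" >&2
```

Redirecting stderr into the pipe means error messages also count as failing lines.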



Etcd Commands
#

Export Environment Variables
#

Export the Etcd configuration details to keep the Etcd commands short:

# Switch to root user:
sudo su

# Export the environment variables
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://192.168.30.10:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
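
Before running the shortened commands, it can help to confirm that the variables are really set in the current shell, for example after opening a new terminal. A minimal sketch with a hypothetical function name:

```shell
# Hypothetical helper: print the names of the connection variables that are unset or empty.
missing_etcdctl_vars() {
  local name value
  for name in ETCDCTL_ENDPOINTS ETCDCTL_CACERT ETCDCTL_CERT ETCDCTL_KEY; do
    # Indirect lookup of the variable named in $name (portable alternative to bash's ${!name}).
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

missing="$(missing_etcdctl_vars)"
if [ -n "$missing" ]; then
  echo "Not set:" $missing >&2
fi
```

If nothing is reported, a command such as etcdctl member list --write-out=table should work without any connection flags.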

List Keys and their Value
#

# List Etcd keys
etcdctl get "" --prefix --keys-only

# List both the keys and their corresponding values
etcdctl get "" --prefix

  • get "": Uses the empty string as the key. Combined with --prefix, it matches every key in the etcd keyspace

  • --prefix: Returns all keys that start with the given prefix (here the empty string, so all keys)

  • --keys-only: Limits the output to the keys and omits the values
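
The full key listing is long, so a summary per resource type is often easier to read. The sketch below groups the keys by their third path segment (for example "pods" in /registry/pods/...); the function name count_keys_by_type is hypothetical.

```shell
# Hypothetical helper: read "--keys-only" output on stdin and count keys per resource type.
count_keys_by_type() {
  # Keys look like /registry/<type>/...; split on "/" so that field 3 is the type.
  # Blank lines between the keys have fewer than 3 fields and are skipped.
  awk -F/ 'NF >= 3 { print $3 }' | sort | uniq -c | sort -rn
}

# Example (with the environment variables from above exported):
#   etcdctl get "" --prefix --keys-only | count_keys_by_type
```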


Pod Key and Value Example
#

Create Example Namespace
#

# Create a new namespace with the name "example-namespace"
kubectl create ns example-namespace

Deploy Example Pod
#

# Create a pod manifest
vi example-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  namespace: example-namespace
  labels:
    app: nginx
spec:
  containers:
    - image: nginx:latest
      name: nginx
      ports:
        - containerPort: 80

# Deploy the pod
kubectl apply -f example-pod.yaml

List Pod Key and Value
#

# List Etcd keys: Pods
etcdctl get /registry/pods/ --prefix --keys-only

# Shell output:
/registry/pods/example-namespace/example-pod
/registry/pods/ingress-nginx/ingress-nginx-controller-6dfcb8658d-94vg6
...

# List Etcd keys and values: Pods in "example-namespace"
# (Kubernetes stores the values in a binary protobuf encoding, so the output is only partly readable)
etcdctl get /registry/pods/example-namespace/ --prefix