Kubernetes Horizontal Pod Autoscaling: Install Kubernetes Metrics Server, Example Deployment with Horizontal Pod Autoscaler (HPA)

I’m using a Kubernetes cluster with MetalLB that was deployed with Kubespray on Debian 12 servers running as VMware Workstation VMs. It was necessary to increase the number of CPU cores on the worker node VMs; otherwise the HPA could not start more than 5 pods because of insufficient CPU.

Metrics Server

Metrics Server Components

# Download official metrics server YAML manifest from SIGs (Kubernetes Special Interest Groups)
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Note: It was necessary to add the --kubelet-insecure-tls option to the “args” list of the metrics-server container in the “Deployment” resource. This option skips verification of the kubelet serving certificates.

vi components.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls  # Add this line
        image: registry.k8s.io/metrics-server/metrics-server:v0.7.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 10250
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
          seccompProfile:
            type: RuntimeDefault
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
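
Instead of editing the file by hand, the flag can also be added with a small script. The following is a minimal sketch and not part of the upstream instructions: the function name `add_insecure_tls_flag` and the output file name are made up for illustration, and it assumes the manifest still contains the `--metric-resolution` argument shown above, which may change between Metrics Server releases. Because the flag disables verification of the kubelet serving certificates, it is meant for test clusters.

```shell
# Hypothetical helper: print the manifest given as $1 with "--kubelet-insecure-tls"
# added right after the "--metric-resolution" argument, reusing its indentation.
add_insecure_tls_flag() {
  # Leave the manifest unchanged if the flag is already present
  if grep -q -- '--kubelet-insecure-tls' "$1"; then
    cat "$1"
    return
  fi
  awk '
    { print }
    /^ *- --metric-resolution=/ {
      match($0, /^ */)   # RLENGTH = length of the leading indentation
      printf "%s- --kubelet-insecure-tls\n", substr($0, 1, RLENGTH)
    }
  ' "$1"
}

# Usage (assumes components.yaml was downloaded as shown above):
#   add_insecure_tls_flag components.yaml > components-patched.yaml
#   kubectl apply -f components-patched.yaml
```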

Deploy the Metrics Server

# Deploy the metrics server
kubectl apply -f components.yaml

Verify the Metrics Server Resources

# List deployments: Wait till the "metrics-server" deployment is ready
kubectl get deployments --namespace kube-system

# Shell output:
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
...
metrics-server            1/1     1            1           28s
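
Rather than polling the deployment list, the wait can be scripted. This is a minimal sketch, assuming the `kube-system` namespace from the manifest above; the function name `wait_for_metrics_server` and the 120 second timeout are arbitrary choices for illustration.

```shell
# Hypothetical helper: block until the Metrics Server rollout has finished,
# then check that the metrics API is registered and returns node metrics.
wait_for_metrics_server() {
  kubectl rollout status deployment/metrics-server --namespace kube-system --timeout=120s &&
  kubectl get apiservice v1beta1.metrics.k8s.io &&
  kubectl top nodes
}

# Usage:
#   wait_for_metrics_server
```

`kubectl top nodes` may still report that metrics are not available for a short time after the rollout completes; in that case, run it again a few seconds later.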

Horizontal Pod Autoscaler

Example Deployment & Service

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

# Deploy the deployment and service
kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
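
The CPU request in this manifest matters for scheduling: a pod is only placed on a node that still has the requested CPU unallocated. A rough back-of-the-envelope check, assuming the 200m request from the manifest above and the maximum of 10 replicas used for the autoscaler later in this article (the variable names are only for illustration):

```shell
# CPU that must be allocatable across the worker nodes so that
# every php-apache replica can be scheduled.
cpu_request_millicores=200   # resources.requests.cpu of the php-apache container
max_replicas=10              # --max value passed to "kubectl autoscale"

total_millicores=$((cpu_request_millicores * max_replicas))
echo "CPU requested at max scale: ${total_millicores}m"
```

This prints `CPU requested at max scale: 2000m`, that is two full cores for this deployment alone, on top of what the system pods already request. The allocatable CPU per node can be listed with `kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu`.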

Verify the Deployment

# List deployments
kubectl get deployments

# Shell output:
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache   1/1     1            1           108s

# List services
kubectl get services

# Shell output:
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
php-apache   ClusterIP   10.233.27.169   <none>        80/TCP    110s

Horizontal Pod Autoscaler (HPA)

This will increase and decrease the number of replicas to maintain an average CPU utilization across all Pods of 50%:

# Deploy a pod autoscaler: Maintains between 1 and 10 pod replicas
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10 --name=hpa-example
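
The same autoscaler can also be described declaratively, which is easier to keep in version control. The sketch below writes what should be an equivalent `autoscaling/v2` manifest to a file; the file name `hpa-example.yaml` is an assumption, and the fields mirror the flags of the `kubectl autoscale` command above.

```shell
# Write an HPA manifest that mirrors:
#   kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10 --name=hpa-example
cat > hpa-example.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
```

Apply it with `kubectl apply -f hpa-example.yaml` instead of running `kubectl autoscale`. The `autoscaling/v2` API requires Kubernetes 1.23 or newer.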

Verify the HPA Status

# List the HPA status: Wait till the HPA gets a target output from the metrics server
kubectl get hpa


# Shell output: Before the HPA has received metrics
NAME          REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/php-apache   <unknown>/50%   1         10        0          7s

# Shell output: After the first metrics have arrived
NAME          REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/php-apache   0%/50%    1         10        1          26s

Increase the Load

The following pod acts as a client that is sending queries to the php-apache service:

# Deploy the pod that acts as a load generator: Run in a separate shell
kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

Watch the HPA Load

# Watch the load of the HPA
kubectl get hpa hpa-example --watch

# Shell output:
NAME          REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/php-apache   0%/50%     1         10        1          2m50s
hpa-example   Deployment/php-apache   223%/50%   1         10        1          3m15s
hpa-example   Deployment/php-apache   250%/50%   1         10        4          3m30s
hpa-example   Deployment/php-apache   177%/50%   1         10        5          3m45s
hpa-example   Deployment/php-apache   83%/50%    1         10        5          4m1s
hpa-example   Deployment/php-apache   80%/50%    1         10        7          4m16s
hpa-example   Deployment/php-apache   71%/50%    1         10        8          4m31s
hpa-example   Deployment/php-apache   54%/50%    1         10        8          4m46s
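
The replica counts in this output follow from the scaling rule given in the official documentation: desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization). Below is a minimal sketch of that calculation in shell integer arithmetic. The function name `desired_replicas` is made up for illustration, and the real controller additionally applies a tolerance (10% by default), scale-up rate limits and the min/max bounds, so the observed counts can lag behind or differ from the raw result.

```shell
# Hypothetical helper: raw HPA target for
#   $1 = current replicas, $2 = current CPU utilization (%), $3 = target utilization (%)
desired_replicas() {
  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

desired_replicas 1 223 50   # prints 5
desired_replicas 8 54 50    # prints 9
```

In the last line of the watch output, 54% against a 50% target is a ratio of 1.08, which is inside the default 10% tolerance, so the HPA stays at 8 replicas even though the raw formula yields 9.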

Verify the Example Deployment Pods

Verify the increased number of pods in the example deployment:

# List the example deployment
kubectl get deployment php-apache

# Shell output:
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache   8/8     8            8           4m58s

# List pods
kubectl get pods -l run=php-apache

# Shell output:
NAME                          READY   STATUS    RESTARTS   AGE
php-apache-598b474864-6vbjl   1/1     Running   0          96s
php-apache-598b474864-749nz   1/1     Running   0          5m8s
php-apache-598b474864-75k62   1/1     Running   0          81s
php-apache-598b474864-b88j7   1/1     Running   0          50s
php-apache-598b474864-cnqzh   1/1     Running   0          50s
php-apache-598b474864-fqsr7   1/1     Running   0          96s
php-apache-598b474864-kz6b4   1/1     Running   0          96s
php-apache-598b474864-m72vl   1/1     Running   0          35s

List HPA Details

# List HPA details
kubectl describe hpa hpa-example

# Shell output:
Events:
  Type     Reason                        Age                  From                       Message
  ----     ------                        ----                 ----                       -------
  Normal   SuccessfulRescale             2m13s (x2 over 20m)  horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale             118s (x2 over 20m)   horizontal-pod-autoscaler  New size: 5; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale             88s (x2 over 19m)    horizontal-pod-autoscaler  New size: 8; reason: cpu resource utilization (percentage of request) above target

Delete Resources

# Delete the HPA
kubectl delete hpa hpa-example

# Delete the example deployment and service
kubectl delete -f https://k8s.io/examples/application/php-apache.yaml

Links

# Official Documentation
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/

# Metrics Server Configuration from SIGs (Kubernetes Special Interest Groups)
https://github.com/kubernetes-sigs/metrics-server

Kubernetes-Components - This article is part of a series.
