Overview #
Kubernetes Cluster #
In this tutorial I’m using the following Kubernetes cluster with two worker nodes:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ubuntu1 Ready control-plane 124d v1.28.11 192.168.30.10 <none> Ubuntu 24.04 LTS 6.8.0-48-generic containerd://1.7.18
ubuntu2 Ready worker 124d v1.28.11 192.168.30.11 <none> Ubuntu 24.04 LTS 6.8.0-36-generic containerd://1.7.18
ubuntu3 Ready worker 124d v1.28.11 192.168.30.12 <none> Ubuntu 24.04 LTS 6.8.0-36-generic containerd://1.7.18
Labels & Annotations #
Both labels and annotations can be added to almost every Kubernetes resource, including:
- Pods
- Deployments
- Services
- Ingress
- ConfigMaps
- Secrets
- Namespaces
- Nodes
- Persistent Volume Claims (PVCs)
- DaemonSets, StatefulSets, Jobs, and CronJobs
Kubernetes Labels #
- Used for organizing and selecting resources, e.g. for querying and filtering.
- Labels on nodes can be used to assign pods to specific worker nodes via the nodeSelector field.
Kubernetes Annotations #
- Used to store metadata or non-identifying information that doesn’t influence selection or filtering.
- Annotations can hold larger and more complex data than labels.
- Useful for information like configuration details, logging data, documentation, or details about the creator or last modifier of a resource.
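Annotations can also be managed imperatively with kubectl annotate; a short sketch, assuming a pod named example-pod already exists:

```shell
# Add an annotation to the pod
kubectl annotate pod example-pod description="This is an Nginx webserver"

# Change an existing annotation (requires the --overwrite flag)
kubectl annotate pod example-pod description="Updated description" --overwrite

# Remove the "description" annotation
kubectl annotate pod example-pod description-
```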
Node Affinity #
- Controls which nodes a pod can be scheduled on, based on complex rules and on labels assigned to nodes.
Pod Anti-Affinity #
- Controls how pods are distributed across nodes to avoid placing certain pods together.
- Typically used to spread replicas of the same application across different nodes, thereby enhancing availability.
Taints and Tolerations #
- While labels and affinity rules are about placement preferences and selection based on conditions and key-value pairs, taints and tolerations provide a mechanism for enforcing exclusions, evictions, and strict isolation.
- They offer more control in complex scheduling scenarios where certain nodes need to be protected from running specific workloads.
Node Labeling #
Add Node Label #
# Add a label to a node
kubectl label node ubuntu3 env=dev
Remove Node Label #
# Remove the "env=dev" label from the node
kubectl label node ubuntu3 env-
List Nodes and Node Labels #
# List nodes and their labels
kubectl get nodes --show-labels
# Shell output:
NAME STATUS ROLES AGE VERSION LABELS
ubuntu1 Ready control-plane 124d v1.28.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=ubuntu1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node.kubernetes.io/exclude-from-external-load-balancers=
ubuntu2 Ready worker 124d v1.28.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=ubuntu2,kubernetes.io/os=linux,kubernetes.io/role=worker
ubuntu3 Ready worker 124d v1.28.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,env=dev,kubernetes.io/arch=amd64,kubernetes.io/hostname=ubuntu3,kubernetes.io/os=linux,kubernetes.io/role=worker
# List nodes with specific label
kubectl get nodes --show-labels | grep env=dev
# Shell output:
ubuntu3 Ready worker 124d v1.28.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,env=dev,kubernetes.io/arch=amd64,kubernetes.io/hostname=ubuntu3,kubernetes.io/os=linux,kubernetes.io/role=worker
Pod Labeling #
Create Pod with One or Several Labels #
# Create a pod with several labels
kubectl run example-pod --image=nginx --labels=app=nginx,environment=production
# Create a pod without labels
kubectl run example-pod2 --image=nginx
YAML version
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  labels: # Pod Labels
    app: nginx
    environment: production
spec:
  containers:
  - name: nginx
    image: nginx:latest
  nodeSelector: # Node Labels
    env: dev
Add Label After Pod Creation #
# Add another label after the pod was created
kubectl label pod example-pod role=web-server
Remove Label After Pod Creation #
# Remove the "role=web-server" label from the pod
kubectl label pod example-pod role-
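An existing label value cannot simply be reassigned; kubectl refuses to change it unless the --overwrite flag is set. A sketch, with app=httpd as an example value:

```shell
# Change the existing "app" label (fails without --overwrite)
kubectl label pod example-pod app=httpd --overwrite
```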
List Pods and Pod Labels #
# List pods and pod labels
kubectl get pods --show-labels
# Shell output:
NAME READY STATUS RESTARTS AGE LABELS
example-pod 1/1 Running 0 16s app=nginx,environment=production,role=web-server
# List pods and display specific labels
kubectl get pods -L app,environment
# Shell output:
NAME READY STATUS RESTARTS AGE APP ENVIRONMENT
example-pod 1/1 Running 0 51s nginx production
example-pod2 1/1 Running 0 14s
List Pods with Specific Label #
# List pods with "app=nginx" label
kubectl get pods -l app=nginx
# Shell output:
NAME READY STATUS RESTARTS AGE
example-pod 1/1 Running 0 2m11s
List Pods without Specific Label #
# List pods: exclude pods with "environment=production" label
kubectl get pods -l environment!=production
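Besides equality-based selectors, kubectl also supports set-based selectors; a few examples (quoted so the shell does not interpret the special characters):

```shell
# Pods whose "environment" label is one of the listed values
kubectl get pods -l 'environment in (production,staging)'

# Pods that have an "app" label, whatever its value
kubectl get pods -l 'app'

# Pods that do not have an "app" label
kubectl get pods -l '!app'
```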
Annotations #
Example Pod Annotation #
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  labels:
    app: nginx
  annotations:
    description: "This is an Nginx webserver"
    last-modified: "2024-11-07"
spec:
  containers:
  - name: nginx
    image: nginx:latest
  nodeSelector:
    env: dev
View Pod Annotations #
# Describe the pod
kubectl describe pod example-pod
# Shell output:
...
Labels: app=nginx
Annotations: description: This is an Nginx webserver
last-modified: 2024-11-07
# Alternative filter for annotations (display 2 lines following each matching line)
kubectl get pod example-pod -o yaml | grep "annotations" -A 2
# Shell output:
annotations:
description: This is an Nginx webserver
last-modified: "2024-11-07"
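A single annotation can also be read directly with a JSONPath expression (note that dots inside an annotation key would need to be escaped with a backslash):

```shell
# Print only the "description" annotation of the pod
kubectl get pod example-pod -o jsonpath='{.metadata.annotations.description}'
```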
NodeSelector #
Overview #
- Limited to exact matching for a single label or a set of labels.
- All specified labels must match for a pod to be scheduled on a node.
- Conditions like “OR” or “NOT” cannot be used.
Example #
Label Node #
# Add the following labels "environment: production" and "disk: ssd"
kubectl label node ubuntu2 environment=production disk=ssd
Example Pod #
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
  nodeSelector:
    environment: production
    disk: ssd
Verify Pod Placement #
# List pods
kubectl get pod -o wide
# Shell output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
example-pod 1/1 Running 0 4s 10.0.1.48 ubuntu2 <none> <none>
Node Affinity #
Example Pod: In Operator #
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: environment
            operator: In
            values: # Environment can be production or staging
            - production
            - staging
          - key: disk
            operator: In
            values:
            - ssd
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: region
            operator: In
            values:
            - us-east
            - us-west
- requiredDuringSchedulingIgnoredDuringExecution: Enforces that nodes must have either environment=production or environment=staging, and disk=ssd.
- preferredDuringSchedulingIgnoredDuringExecution: Makes region=us-east or region=us-west preferred but not required.
- weight: Assigns a preference level to the rule. A higher weight means a stronger preference; the range is 1 - 100.
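Unlike nodeSelector, match expressions also support negative operators such as NotIn and DoesNotExist. A minimal sketch that keeps a pod off nodes labeled env=dev (the env=dev label is the one used in the earlier node labeling example):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod-notin
spec:
  containers:
  - name: nginx
    image: nginx:latest
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: env
            operator: NotIn # avoid nodes where "env" has one of the listed values
            values:
            - dev
```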
Pod Anti-Affinity #
Example Deployment #
In this example, pods with the same label app: nginx are not scheduled on the same node:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx
            topologyKey: "kubernetes.io/hostname"
- podAntiAffinity: Defines the anti-affinity rule, which means the scheduler avoids placing pods with matching labels on the same node.
- requiredDuringSchedulingIgnoredDuringExecution: The scheduler will only place the pod on a node that satisfies this rule; if no node meets the anti-affinity requirement, the pod will not be scheduled.
- matchExpressions: The rule applies to pods with the label app=nginx.
- topologyKey: "kubernetes.io/hostname": Represents individual nodes in the scope of the anti-affinity rule.
Verify Pod Scheduling #
Since my Kubernetes cluster has only two worker nodes, the third pod remains pending because placing two app=nginx labeled pods on the same node would violate the rule.
# List pods
kubectl get pod -o wide
# Shell output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
example-deployment-5d6587cd58-5lf5p 0/1 Pending 0 6s <none> <none> <none> <none>
example-deployment-5d6587cd58-js75x 1/1 Running 0 6s 10.0.2.22 ubuntu3 <none> <none>
example-deployment-5d6587cd58-mbztj 1/1 Running 0 6s 10.0.1.213 ubuntu2 <none> <none>
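If a pending pod is undesirable, the hard requirement can be relaxed to a preference: with preferredDuringSchedulingIgnoredDuringExecution the scheduler still spreads the pods across nodes, but co-locates them when no other node is available. A sketch of the changed affinity section of the Deployment above:

```yaml
      # Soft anti-affinity: prefer spreading, but allow co-location if needed
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - nginx
              topologyKey: "kubernetes.io/hostname"
```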
Taints and Tolerations #
Overview #
# Taint syntax
kubectl taint nodes node-name taint-key=taint-value:taint-effect
- The taint effect defines how a tainted node reacts to a pod without appropriate toleration.
Taint Effects:
- NoSchedule: The pod will not be scheduled to the node without a matching toleration.
- NoExecute: Immediately evicts all pods without a matching toleration from the node, and also prevents new ones from being scheduled.
- PreferNoSchedule: A softer version of NoSchedule; the scheduler tries to avoid the node, but it is not a strict requirement.
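For NoExecute taints, a toleration can additionally specify tolerationSeconds, which lets the pod stay on the tainted node for a limited time before being evicted. A sketch, where the maintenance=true key/value pair is just an example name:

```yaml
      tolerations:
      - key: "maintenance"
        operator: "Equal"
        value: "true"
        effect: "NoExecute"
        tolerationSeconds: 300 # evicted 5 minutes after the taint is added
```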
Taint a Node #
This taint prevents pods from being scheduled on the “ubuntu2” node unless they have a toleration for gpu=true with the NoSchedule effect.
# Taint node "ubuntu2"
kubectl taint node ubuntu2 gpu=true:NoSchedule
Remove Node Taint #
# Remove taint from node
kubectl taint node ubuntu2 gpu:NoSchedule-
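If a key is tainted with several effects, the effect can be omitted to remove all taints with that key at once:

```shell
# Remove every taint with the key "gpu", regardless of effect
kubectl taint node ubuntu2 gpu-
```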
Verify Node Taint #
# List node details & grep for taints
kubectl describe node ubuntu2 | grep Taints
# Shell output:
Taints: gpu=true:NoSchedule
# List node taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# Shell output:
NAME TAINTS
ubuntu1 [map[effect:NoSchedule key:node-role.kubernetes.io/control-plane]]
ubuntu2 [map[effect:NoSchedule key:gpu value:true]]
ubuntu3 <none>
Example Toleration: Operator Equal #
Example Deployment #
This toleration allows the pod to tolerate / ignore the gpu=true:NoSchedule taint, so it can be scheduled on nodes with this taint:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
      tolerations:
      - key: "gpu"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
- operator: Equal: The toleration only applies if the taint’s key and value match exactly.
Verify the Pods #
The pods are scheduled on all worker nodes:
# List pods
kubectl get pods -o wide
# Shell output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
example-deployment-774595f7ff-4mch4 1/1 Running 0 5s 10.0.2.99 ubuntu3 <none> <none>
example-deployment-774595f7ff-8mwlz 1/1 Running 0 5s 10.0.1.64 ubuntu2 <none> <none>
example-deployment-774595f7ff-d7ns2 1/1 Running 0 5s 10.0.2.118 ubuntu3 <none> <none>
example-deployment-774595f7ff-p29lg 1/1 Running 0 5s 10.0.1.33 ubuntu2 <none> <none>
example-deployment-774595f7ff-z2vk6 1/1 Running 0 5s 10.0.1.110 ubuntu2 <none> <none>
Example Pod Toleration: Operator Exists #
Example Deployment #
This toleration tolerates any taint with the key gpu, regardless of the value (e.g. gpu=true, gpu=false):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
      tolerations:
      - key: "gpu"
        operator: "Exists"
        effect: "NoSchedule"
- operator: Exists: Requires only that the taint’s key matches, regardless of the value.
Deployment without Toleration #
Example Deployment #
Remove the toleration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
      # tolerations:
      # - key: "gpu"
      #   operator: "Exists"
      #   effect: "NoSchedule"
Verify the Pods #
The pods are only scheduled on the untainted worker node:
# List pods
kubectl get pods -o wide
# Shell output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
example-deployment-56fcf95486-5v56f 1/1 Running 0 7s 10.0.2.65 ubuntu3 <none> <none>
example-deployment-56fcf95486-857wr 1/1 Running 0 7s 10.0.2.143 ubuntu3 <none> <none>
example-deployment-56fcf95486-btl7p 1/1 Running 0 7s 10.0.2.15 ubuntu3 <none> <none>
example-deployment-56fcf95486-ff4bl 1/1 Running 0 7s 10.0.2.253 ubuntu3 <none> <none>
example-deployment-56fcf95486-pzhs7 1/1 Running 0 7s 10.0.2.61 ubuntu3 <none> <none>
Multiple Tolerations #
Taint the Second Worker Node #
# Taint node "ubuntu3"
kubectl taint node ubuntu3 project=intern:NoSchedule
Example Deployment #
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
      tolerations:
      - key: "gpu"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
      - key: "project"
        operator: "Equal"
        value: "intern"
        effect: "NoSchedule"
Verify the Pods #
The pods can be scheduled on nodes with:
- Only the gpu=true:NoSchedule taint
- Only the project=intern:NoSchedule taint
- Or both taints
# List pods
kubectl get pods -o wide
# Shell output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
example-deployment-6fc49584c8-459rh 1/1 Running 0 8s 10.0.2.73 ubuntu3 <none> <none>
example-deployment-6fc49584c8-9nxpd 1/1 Running 0 8s 10.0.2.124 ubuntu3 <none> <none>
example-deployment-6fc49584c8-bdx6g 1/1 Running 0 8s 10.0.2.149 ubuntu3 <none> <none>
example-deployment-6fc49584c8-dvfnx 1/1 Running 0 8s 10.0.1.124 ubuntu2 <none> <none>
example-deployment-6fc49584c8-x59fv 1/1 Running 0 8s 10.0.1.51 ubuntu2 <none> <none>
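As a final note, a toleration with operator Exists and no key at all matches every taint. This is the pattern used for workloads (e.g. DaemonSets) that must run on all nodes, so it should be used with care; a minimal sketch:

```yaml
      tolerations:
      - operator: "Exists" # tolerates all taints, including NoSchedule and NoExecute
```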