
Nomad: Deploy Example Cluster with One Server and Three Client Nodes, Example Job for Nginx Container


Overview & My Setup

This tutorial deploys a minimal Nomad cluster with one server and three client nodes.

This setup uses Debian 12 servers running on VMware Workstation Pro.

# Nomad cluster nodes
192.168.30.130 nomad01.jklug.work # Server / Leader (Controller in K8s)
192.168.30.131 nomad02.jklug.work # Client (Worker in K8s)
192.168.30.132 nomad03.jklug.work # Client (Worker in K8s)
192.168.30.133 nomad04.jklug.work # Client (Worker in K8s)

192.168.30.1 # DNS server
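
If no local DNS server is available, the same name-to-address mappings can alternatively be maintained in /etc/hosts on every node, for example:

# /etc/hosts entries as a DNS fallback (append on each node)
192.168.30.130 nomad01.jklug.work
192.168.30.131 nomad02.jklug.work
192.168.30.132 nomad03.jklug.work
192.168.30.133 nomad04.jklug.work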



Nomad Nodes Prerequisites

DNS Configuration

Install & Configure resolvconf

Verify that systemd-resolved is inactive:

# Check systemd-resolved status
systemctl is-active systemd-resolved

# Shell output:
inactive

Set up resolvconf for DNS resolution:

# Install resolvconf package
sudo apt install resolvconf

# Start and enable service
sudo systemctl start resolvconf &&
sudo systemctl enable resolvconf

# Verify the service status
systemctl status resolvconf.service

Configure the DNS servers:

  • The following message inside the /etc/resolvconf/resolv.conf.d/head configuration can be ignored; it is a leftover from the previous DNS resolver:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "resolvectl status" to see details about the actual nameservers.
# Edit DNS configuration
sudo vi /etc/resolvconf/resolv.conf.d/head

# Define DNS servers
nameserver 192.168.30.1
nameserver 1.1.1.1

# Apply the configuration
sudo resolvconf -u

# Verify the configuration
cat /etc/resolv.conf

# Shell output: (The old DNS servers are removed after a reboot)
nameserver 192.168.30.1
nameserver 1.1.1.1

Verify DNS Resolution

Make sure each node in the Nomad cluster can resolve the other nodes:

# Test the DNS resolution on the Nomad nodes:
nslookup nomad03.jklug.work

# Shell output:
Server:         192.168.30.1
Address:        192.168.30.1#53

Name:   nomad03.jklug.work
Address: 192.168.30.132
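
To check the whole cluster at once, a small loop like the following can be used (hostnames taken from the setup above):

# Resolve every cluster node in one pass
for host in nomad01 nomad02 nomad03 nomad04; do
  nslookup "$host.jklug.work" || echo "FAILED: $host.jklug.work"
done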



Docker Installation

  • Use the following script to install Docker on all nodes:
#!/bin/bash

# Install Docker Engine and the Docker Compose plugin on Debian
sudo apt update
sudo apt install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin -y

# Add the current user to the docker group
sudo usermod -aG docker $USER
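
After the script has finished, the installation can be verified; note that the group membership change only takes effect after logging out and back in:

# Verify the Docker installation
docker --version
sudo docker run --rm hello-world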



Nomad Cluster Setup

Nomad Installation (Deb)

  • Install Nomad on all cluster nodes:
# Install the required packages
sudo apt update && sudo apt install wget gpg coreutils -y

# Add the HashiCorp GPG key
wget -O- https://apt.releases.hashicorp.com/gpg | \
  sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg

# Add the official HashiCorp Linux repository
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" \
| sudo tee /etc/apt/sources.list.d/hashicorp.list

# Install Nomad
sudo apt update && sudo apt install nomad -y

Verify Installation & Check Status

# Verify installation / list version
nomad -v

# Shell output:
Nomad v1.10.0
BuildDate 2025-04-09T16:40:54Z
Revision e26a2bd2acac2dcdcb623f4d293bac096beef478

Server Node Configuration (Controller Node)

# Adapt the Nomad configuration for the Server node "nomad01.jklug.work"
sudo vi /etc/nomad.d/nomad.hcl

The original configuration looks like this:

# Copyright (c) HashiCorp, Inc.
# SPDX-License-Identifier: BUSL-1.1

# Full configuration options can be found at https://developer.hashicorp.com/nomad/docs/configuration

data_dir  = "/opt/nomad/data"
bind_addr = "0.0.0.0"

server {
  # license_path is required for Nomad Enterprise as of Nomad v1.1.1+
  #license_path = "/etc/nomad.d/license.hclic"
  enabled          = true
  bootstrap_expect = 1
}

client {
  enabled = true
  servers = ["127.0.0.1"]
}

Paste the new server configuration:

# Server Node
name        = "nomad01.jklug.work"
datacenter  = "jkw"
data_dir    = "/opt/nomad/data"
bind_addr   = "0.0.0.0"
region      = "global"

# Used by other Nomad nodes for communication
advertise {
  http = "192.168.30.130:4646"
  rpc  = "192.168.30.130:4647"
  serf = "192.168.30.130:4648"
}

server {
  enabled          = true
  bootstrap_expect = 1 # Set to number of servers

  server_join {
    retry_join = ["nomad01.jklug.work"]
  }
}

# Enable ACL
acl {
  enabled = true
}
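
Before starting the service, the file can be checked with Nomad's built-in validator (a quick sanity check; the nomad config validate subcommand is available in recent Nomad releases):

# Validate the agent configuration
nomad config validate /etc/nomad.d/nomad.hcl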

Client Nodes Configuration (Worker Nodes)

# Adapt the Nomad configuration for the Client nodes
sudo vi /etc/nomad.d/nomad.hcl

  • nomad02.jklug.work
name        = "nomad02.jklug.work"
datacenter  = "jkw"
data_dir    = "/opt/nomad/data"
bind_addr   = "0.0.0.0"
region      = "global"

advertise {
  http = "192.168.30.131:4646"
  rpc  = "192.168.30.131:4647"
  serf = "192.168.30.131:4648"
}

client {
  enabled = true
  servers = ["nomad01.jklug.work"]

  options = {
    "driver.raw_exec.enable" = "1"
    "driver.docker.enable"   = "1"
  }
}
  • nomad03.jklug.work
name        = "nomad03.jklug.work"
datacenter  = "jkw"
data_dir    = "/opt/nomad/data"
bind_addr   = "0.0.0.0"
region      = "global"

advertise {
  http = "192.168.30.132:4646"
  rpc  = "192.168.30.132:4647"
  serf = "192.168.30.132:4648"
}

client {
  enabled = true
  servers = ["nomad01.jklug.work"]

  options = {
    "driver.raw_exec.enable" = "1"
    "driver.docker.enable"   = "1"
  }
}
  • nomad04.jklug.work
name        = "nomad04.jklug.work"
datacenter  = "jkw"
data_dir    = "/opt/nomad/data"
bind_addr   = "0.0.0.0"
region      = "global"

advertise {
  http = "192.168.30.133:4646"
  rpc  = "192.168.30.133:4647"
  serf = "192.168.30.133:4648"
}

client {
  enabled = true
  servers = ["nomad01.jklug.work"]

  options = {
    "driver.raw_exec.enable" = "1"
    "driver.docker.enable"   = "1"
  }
}
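
Since the three client configurations differ only in the node name and advertise address, the file can also be rendered on each client from its own hostname and IP; a minimal sketch, assuming hostname -f returns the FQDN and the first address from hostname -I is the one listed above:

# Render the client config from the node's own hostname and primary IP
NODE_NAME=$(hostname -f)                   # e.g. nomad02.jklug.work
NODE_IP=$(hostname -I | awk '{print $1}')  # e.g. 192.168.30.131

sudo tee /etc/nomad.d/nomad.hcl > /dev/null <<EOF
name        = "$NODE_NAME"
datacenter  = "jkw"
data_dir    = "/opt/nomad/data"
bind_addr   = "0.0.0.0"
region      = "global"

advertise {
  http = "$NODE_IP:4646"
  rpc  = "$NODE_IP:4647"
  serf = "$NODE_IP:4648"
}

client {
  enabled = true
  servers = ["nomad01.jklug.work"]

  options = {
    "driver.raw_exec.enable" = "1"
    "driver.docker.enable"   = "1"
  }
}
EOF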

Create Data Directory

# Create the Nomad data directory on all nodes
sudo mkdir -p /opt/nomad/data

Start and Enable Nomad

# Start and enable the nomad service on all nodes
sudo systemctl start nomad &&
sudo systemctl enable nomad
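
If SSH access from one host to the others is configured, the service state can be checked for all nodes in a single loop (a sketch; assumes key-based SSH to each node):

# Check the Nomad service state on all nodes via SSH
for host in nomad01 nomad02 nomad03 nomad04; do
  echo -n "$host: "
  ssh "$host.jklug.work" systemctl is-active nomad
done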

Configuration Troubleshooting

# Run the agent in the foreground to surface configuration errors
sudo nomad agent -config /etc/nomad.d

# Check the logs
sudo journalctl -u nomad -n 50
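
The agent's live log stream can also be followed at higher verbosity with the monitor command (once ACLs are bootstrapped, this requires a token):

# Stream live agent logs at debug level
nomad monitor -log-level=DEBUG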

Verify Nomad Service Status

# Verify the Nomad status on all nodes
sudo systemctl status nomad

# Shell output:
● nomad.service - Nomad
     Loaded: loaded (/lib/systemd/system/nomad.service; enabled; preset: enabled)
     Active: active (running) since Sun 2025-04-20 13:38:33 CEST; 3min 27s ago
       Docs: https://nomadproject.io/docs/
   Main PID: 4176 (nomad)
      Tasks: 9
     Memory: 30.2M
        CPU: 531ms
     CGroup: /system.slice/nomad.service
             └─4176 /usr/bin/nomad agent -config /etc/nomad.d

Bootstrap ACLs

# Bootstrap ACLs on the server node "nomad01.jklug.work"
nomad acl bootstrap

# Shell output:
Accessor ID  = ce9ed149-dec9-55a6-4643-7992fdf00940
Secret ID    = 4424a7aa-a0d7-351a-5823-e483e0a0ddfd
Name         = Bootstrap Token
Type         = management
Global       = true
Create Time  = 2025-04-20 11:55:33.494735773 +0000 UTC
Expiry Time  = <none>
Create Index = 15
Modify Index = 15
Policies     = n/a
Roles        = n/a

# Export the "Secret ID" token
export NOMAD_TOKEN=4424a7aa-a0d7-351a-5823-e483e0a0ddfd
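
The bootstrap token is a management token and is best reserved for administration; for day-to-day use, a scoped policy and token can be created. A sketch with an illustrative read-only policy (policy name and file are examples, not from the original setup):

# Define an example read-only policy
cat > readonly.policy.hcl <<'EOF'
namespace "*" {
  policy = "read"
}
node {
  policy = "read"
}
EOF

# Apply the policy and create a token bound to it
nomad acl policy apply -description "Read-only access" readonly readonly.policy.hcl
nomad acl token create -name="readonly-token" -policy=readonly -type=client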

Verify Nomad Cluster Status

Verify Server Nodes

Verify the Nomad cluster status on the server node “nomad01.jklug.work”:

# List members: Export the token first
export NOMAD_TOKEN=4424a7aa-a0d7-351a-5823-e483e0a0ddfd
nomad server members

# List members: alternatively, add the token to the command
nomad server members -token=4424a7aa-a0d7-351a-5823-e483e0a0ddfd


# Shell output:
Name                       Address         Port  Status  Leader  Raft Version  Build   Datacenter  Region
nomad01.jklug.work.global  192.168.30.130  4648  alive   true    3             1.10.0  jkw         global

==> View and manage Nomad servers in the Web UI: http://127.0.0.1:4646/ui/servers

Verify Client Nodes

# List client nodes
nomad node status

# Shell output:
ID        Node Pool  DC   Name                Class   Drain  Eligibility  Status
665cce1c  default    jkw  nomad04.jklug.work  <none>  false  eligible     ready
f0804f36  default    jkw  nomad03.jklug.work  <none>  false  eligible     ready
f8d89d99  default    jkw  nomad02.jklug.work  <none>  false  eligible     ready

==> View and manage Nomad clients in the Web UI: http://127.0.0.1:4646/ui/clients



Nomad Web UI

By default, the web UI can be accessed on the server node on port 4646:

# Open the Nomad web UI
http://192.168.30.130:4646

# Login with the secret ID token "4424a7aa-a0d7-351a-5823-e483e0a0ddfd"
http://192.168.30.130:4646/ui/settings/tokens

Example Job

Create Nginx Job Configuration

# Create a configuration file for the example Nginx job
vi nginx-example.nomad

# Example Nginx job
job "nginx-example" {
  datacenters = ["*"]

  group "servers" {
    count = 3  # Scale to 3 instances

    network {
      port "www" {
        to = 80
      }
    }

    service {
      provider = "nomad"
      port     = "www"
    }

    task "web" {
      driver = "docker"
      config {
        image   = "nginx:latest"
        ports   = ["www"]
      }

      # Specify the maximum resources required to run the task
      resources {
        cpu    = 50
        memory = 64
      }
    }
  }
}
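
Before submitting, the job file can be validated and dry-run planned; plan shows the scheduler's intended placements without changing anything (run with the NOMAD_TOKEN exported as shown below):

# Validate the job file
nomad job validate nginx-example.nomad

# Dry run: preview the placements without submitting
nomad job plan nginx-example.nomad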

Run the Job

# Export the "Secret ID token"
export NOMAD_TOKEN=4424a7aa-a0d7-351a-5823-e483e0a0ddfd

# Create the Nginx example job
nomad job run \
  nginx-example.nomad

# Shell output:
  ✓ Deployment "cf68d4d1" successful

    2025-04-20T15:29:06+02:00
    ID          = cf68d4d1
    Job ID      = nginx-example
    Job Version = 0
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    servers     3        3       3        0          2025-04-20T15:39:05+02:00

Verify the Job Status

CLI

# Verify the job status
nomad job status nginx-example

# Shell output:
ID            = nginx-example
Name          = nginx-example
Submit Date   = 2025-04-20T15:28:38+02:00
Type          = service
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
servers     0       0         3        0       0         0     0

Latest Deployment
ID          = cf68d4d1
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
servers     3        3       3        0          2025-04-20T15:39:05+02:00

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
a19c866a  665cce1c  servers     0        run      running  58s ago  31s ago
a63f1961  f8d89d99  servers     0        run      running  58s ago  37s ago
e9fab417  f0804f36  servers     0        run      running  58s ago  47s ago

# Verify the job allocations to find the mapped container ports
nomad alloc status a19c866a
nomad alloc status a63f1961
nomad alloc status e9fab417

# Shell output: (a19c866a)
ID                  = a19c866a-68f0-1f1c-3da1-8e7d3f397214
Eval ID             = a244675c
Name                = nginx-example.servers[1]
Node ID             = 665cce1c
Node Name           = nomad04.jklug.work
Job ID              = nginx-example
Job Version         = 0
Client Status       = running
Client Description  = Tasks are running
Desired Status      = run
Desired Description = <none>
Created             = 1m19s ago
Modified            = 52s ago
Deployment ID       = cf68d4d1
Deployment Health   = healthy

Allocation Addresses:
Label  Dynamic  Address
*www   yes      192.168.30.133:28538 -> 80

Task "web" is "running"
Task Resources:
CPU       Memory          Disk     Addresses
0/50 MHz  4.7 MiB/64 MiB  300 MiB

Task Events:
Started At     = 2025-04-20T13:28:55Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                       Type        Description
2025-04-20T15:28:55+02:00  Started     Task started by client
2025-04-20T15:28:38+02:00  Driver      Downloading image
2025-04-20T15:28:38+02:00  Task Setup  Building Task Directory
2025-04-20T15:28:38+02:00  Received    Task received by client
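
The task's log streams can be inspected directly from an allocation (alloc ID and task name taken from the output above):

# Fetch the Nginx task's stdout log from the allocation
nomad alloc logs a19c866a web

# Follow the stderr stream, where the Nginx image writes its error log
nomad alloc logs -stderr -f a19c866a web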

Web UI

Verify the Nginx example job via the web UI:


Access the Nginx Container

# Access the Nginx container
curl http://192.168.30.133:28538

# Shell output:
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
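
Since the job registers a Nomad-native service, the addresses of all three instances can also be listed via service discovery; a sketch, assuming the default service name, which is derived from the job and group name:

# List services registered with Nomad's native service discovery
nomad service list

# Show all registered addresses ("nginx-example-servers" is the assumed default name)
nomad service info nginx-example-servers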

Delete the Job

# Delete the job
nomad job stop -purge nginx-example



Links

# Nomad official documentation: Installation
https://developer.hashicorp.com/nomad/docs/install
https://developer.hashicorp.com/nomad/tutorials/enterprise/production-deployment-guide-vm-with-consul#download-nomad