
Ceph Cluster: Deploy a Ceph Cluster with Cephadm, Add Nodes and OSDs; Setup a Storage Pool, Create and Mount RBD Image, Create User for RBD Image Mount


Overview
#

I’m using the following storage layout on all three nodes:

# List block devices
lsblk

# Shell output:
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sr0          11:0    1 1024M  0 rom
nvme0n1     259:0    0   20G  0 disk
├─nvme0n1p1 259:1    0    1G  0 part /boot
└─nvme0n1p2 259:2    0   19G  0 part
  ├─rl-root 253:0    0   17G  0 lvm  /var/lib/containers/storage/overlay
  │                                  /
  └─rl-swap 253:1    0    2G  0 lvm  [SWAP]
nvme0n2     259:3    0   20G  0 disk
nvme0n3     259:4    0   20G  0 disk

The VMs are based on Rocky Linux 9.4, with 4 CPU cores and 8 GB RAM.

192.168.30.100 rocky1 # Initial / Bootstrap Node
192.168.30.101 rocky2 # Node 2
192.168.30.102 rocky3 # Node 3

Prerequisites
#

Add hosts entries and install the dependencies on all nodes.

Hosts Entry
#

For Ceph it’s recommended to use simple hostnames rather than fully qualified domain names (FQDNs).

# Add hosts entry
sudo tee -a /etc/hosts<<EOF
192.168.30.100 rocky1
192.168.30.101 rocky2
192.168.30.102 rocky3
EOF

Install Dependencies
#

# Upgrade packages
sudo dnf upgrade -y

# Install dependencies
sudo dnf install python3 lvm2 podman -y
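
In addition to these packages, the cephadm documentation lists time synchronization (chrony or NTP) as a requirement. Rocky Linux 9 usually ships with chrony enabled, but it can be ensured as follows (an optional sanity check, not strictly required if chrony is already running):

# Optional: Make sure time synchronization is installed and running
sudo dnf install chrony -y
sudo systemctl enable --now chronyd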

Initialize Ceph Cluster
#

On the first node, install Cephadm and bootstrap the cluster.

Install Cephadm & Ceph-Common
#

# Download Cephadm & change permission
curl --silent --remote-name --location https://download.ceph.com/rpm-18.2.2/el9/noarch/cephadm &&
chmod +x cephadm

# Add Ceph "reef" repository
./cephadm add-repo --release reef
# Optional: Install Cephadm (installs the binary to "/usr/sbin/cephadm")
./cephadm install

# Verify the installation
which cephadm
# Install Ceph-Common (For CLI usage)
dnf update -y && dnf install ceph-common -y

# Verify the Ceph CLI (Ceph-Command) / check version
ceph --version

Bootstrap the Ceph Cluster
#

# Bootstrap the Cluster: Create initial monitor and manager node
cephadm bootstrap --mon-ip 192.168.30.100 \
                  --initial-dashboard-user admin \
                  --initial-dashboard-password my-secure-pw
# Shell output:
Ceph Dashboard is now available at:

             URL: https://rocky1:8443/
            User: admin
        Password: my-secure-pw

Enabling client.admin keyring and conf on hosts with "admin" label
Saving cluster configuration to /var/lib/ceph/c21564f0-3200-11ef-85f9-000c29ad85a9/config directory
Enabling autotune for osd_memory_target
You can access the Ceph CLI as following in case of multi-cluster or non-default config:

        sudo /usr/sbin/cephadm shell --fsid c21564f0-3200-11ef-85f9-000c29ad85a9 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Or, if you are only running a single cluster on this host:

        sudo /usr/sbin/cephadm shell

Please consider enabling telemetry to help improve Ceph:

        ceph telemetry on

For more information see:

        https://docs.ceph.com/en/latest/mgr/telemetry/

Bootstrap complete.

Verify the Cluster Status
#

# Check the cluster status
ceph status

# Shell output:
  cluster:
    id:     c21564f0-3200-11ef-85f9-000c29ad85a9
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum rocky1 (age 75s)
    mgr: rocky1.ybgqbk(active, starting, since 0.0665046s)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
# List Ceph services
ceph orch ps

# Shell output:
NAME                  HOST    PORTS             STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
alertmanager.rocky1   rocky1  *:9093,9094       running (9s)      6s ago  48s    15.5M        -  0.25.0   c8568f914cd2  a275295ebed3
ceph-exporter.rocky1  rocky1                    running (58s)     6s ago  58s    5855k        -  18.2.2   3c937764e6f5  1e8865d0ca47
crash.rocky1          rocky1                    running (57s)     6s ago  57s    6656k        -  18.2.2   3c937764e6f5  cc41d87bec04
grafana.rocky1        rocky1  *:3000            running (7s)      6s ago  31s    39.4M        -  9.4.7    954c08fa6188  907e90c75a03
mgr.rocky1.ybgqbk     rocky1  *:9283,8765,8443  running (91s)     6s ago  91s     458M        -  18.2.2   3c937764e6f5  fae8713cc499
mon.rocky1            rocky1                    running (92s)     6s ago  94s    31.0M    2048M  18.2.2   3c937764e6f5  ec9e8e556943
node-exporter.rocky1  rocky1  *:9100            running (54s)     6s ago  54s    15.4M        -  1.5.0    0da6a335fe13  2a631d670969
prometheus.rocky1     rocky1  *:9095            running (23s)     6s ago  23s    31.3M        -  2.43.0   a07b618ecd1d  454f15c54703

Ceph Dashboard
#

Custom TLS Certificate
#

As always I’m using Let’s Encrypt wildcard certificates.

# Upload custom TLS certificate
ceph dashboard set-ssl-certificate -i ./fullchain.pem &&
ceph dashboard set-ssl-certificate-key -i ./privkey.pem

# Shell output:
SSL certificate updated
SSL certificate key updated
# Restart Dashboard module
ceph mgr module disable dashboard &&
ceph mgr module enable dashboard
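
To double-check which URL the active manager advertises for the dashboard after the module restart, the manager service map can be queried (the exact output depends on the cluster):

# List the endpoints published by the active manager (includes the dashboard URL)
ceph mgr services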

Hosts Entry
#

# Create a hosts entry for the Ceph Dashboard
192.168.30.100 ceph.jklug.work

Access the Dashboard
#

# Access the Ceph Dashboard
https://ceph.jklug.work:8443

Use the username and password defined in the bootstrap command:

# User
admin

# Password
my-secure-pw



Add Cluster Resources
#

Add OSDs (Object Storage Daemons)
#

List Available Devices
#

# List available storage devices
ceph orch device ls

# Shell output:
HOST    PATH          TYPE  DEVICE ID                                   SIZE  AVAILABLE  REFRESHED  REJECT REASONS
rocky1  /dev/nvme0n2  ssd   VMware_Virtual_NVMe_Disk_VMware_NVME_0000  20.0G  Yes        32s ago
rocky1  /dev/nvme0n3  ssd   VMware_Virtual_NVMe_Disk_VMware_NVME_0000  20.0G  Yes        32s ago

Add all Available Devices
#

# Add any available and unused device
ceph orch apply osd --all-available-devices

# Shell output:
Scheduled osd.all-available-devices update...
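
Note that this creates a managed OSD service specification, so cephadm will keep turning any new eligible disk it finds into an OSD. If that behaviour is not wanted, the specification can be switched to unmanaged, as described in the cephadm OSD documentation:

# Optional: Stop cephadm from automatically creating OSDs on new available devices
ceph orch apply osd --all-available-devices --unmanaged=true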

Add specific Device
#

# Add specific OSD
ceph orch daemon add osd rocky1:/dev/nvme0n2
ceph orch daemon add osd rocky1:/dev/nvme0n3

# Shell output:
Created osd(s) 0 on host 'rocky1'
Created osd(s) 1 on host 'rocky1'

List OSD Devices
#

# List cluster OSDs
ceph osd tree

# Shell output
ID  CLASS  WEIGHT   TYPE NAME        STATUS  REWEIGHT  PRI-AFF
-1         0.03897  root default
-3         0.03897      host rocky1
 0    ssd  0.01949          osd.0        up   1.00000  1.00000
 1    ssd  0.01949          osd.1        up   1.00000  1.00000
# List cluster OSDs: Details
ceph osd status

# Shell output:
ID  HOST     USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  rocky1  26.4M  19.9G      0        0       0        0   exists,up
 1  rocky1  26.4M  19.9G      0        0       0        0   exists,up
# List details about specific OSD
ceph osd find 0

# Shell output:
{
    "osd": 0,
    "addrs": {
        "addrvec": [
            {
                "type": "v2",
                "addr": "192.168.30.100:6802",
                "nonce": 3273157086
            },
            {
                "type": "v1",
                "addr": "192.168.30.100:6803",
                "nonce": 3273157086
            }
        ]
    },
    "osd_fsid": "43749334-1fa2-49a3-abbe-da71d9ab1b12",
    "host": "rocky1",
    "crush_location": {
        "host": "rocky1",
        "root": "default"
    }
}
# List details about OSD performance stats
ceph osd perf

Add Nodes to Cluster
#

Copy SSH Key
#

# Copy the Ceph SSH key to the other Ceph nodes
ssh-copy-id -f -i /etc/ceph/ceph.pub root@192.168.30.101
ssh-copy-id -f -i /etc/ceph/ceph.pub root@192.168.30.102

Add Additional Nodes
#

# Add the other nodes to the Ceph cluster
ceph orch host add rocky2 192.168.30.101
ceph orch host add rocky3 192.168.30.102

# Shell output:
Added host 'rocky2' with addr '192.168.30.101'
Added host 'rocky3' with addr '192.168.30.102'
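
Optionally, the "_admin" label can be added to the new nodes so that cephadm distributes the ceph.conf and admin keyring to them as well. This is not required for the rest of this guide:

# Optional: Label the new nodes as admin nodes
ceph orch host label add rocky2 _admin
ceph orch host label add rocky3 _admin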

Verify the Cluster Nodes
#

# List cluster nodes
ceph orch host ls

# Shell output:
HOST    ADDR            LABELS  STATUS
rocky1  192.168.30.100  _admin
rocky2  192.168.30.101
rocky3  192.168.30.102
3 hosts in cluster
# List cluster nodes and roles
ceph node ls

# Shell output:
{
    "mon": {
        "rocky1": [
            "rocky1"
        ],
        "rocky2": [
            "rocky2"
        ],
        "rocky3": [
            "rocky3"
        ]
    },
    "osd": {
        "rocky1": [
            0,
            1
        ]
    },
    "mgr": {
        "rocky1": [
            "rocky1.ybgqbk"
        ],
        "rocky2": [
            "rocky2.megwzq"
        ]
    }
}
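
As the output shows, cephadm has already started to place additional monitor and manager daemons on the new hosts. If desired, the number of monitors and managers can be pinned explicitly (optional, shown here as a sketch):

# Optional: Pin the number of monitor and manager daemons
ceph orch apply mon 3
ceph orch apply mgr 2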

Add OSD
#

# List available devices
ceph orch device ls

# Shell output:
HOST    PATH          TYPE  DEVICE ID                                   SIZE  AVAILABLE  REFRESHED  REJECT REASONS
rocky1  /dev/nvme0n2  ssd   VMware_Virtual_NVMe_Disk_VMware_NVME_0000  20.0G  No         3m ago     Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected
rocky1  /dev/nvme0n3  ssd   VMware_Virtual_NVMe_Disk_VMware_NVME_0000  20.0G  No         3m ago     Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected
rocky2  /dev/nvme0n2  ssd   VMware_Virtual_NVMe_Disk_VMware_NVME_0000  20.0G  Yes        3m ago
rocky2  /dev/nvme0n3  ssd   VMware_Virtual_NVMe_Disk_VMware_NVME_0000  20.0G  Yes        3m ago
rocky3  /dev/nvme0n2  ssd   VMware_Virtual_NVMe_Disk_VMware_NVME_0000  20.0G  Yes        3m ago
rocky3  /dev/nvme0n3  ssd   VMware_Virtual_NVMe_Disk_VMware_NVME_0000  20.0G  Yes        3m ago
# Add OSDs from the other nodes: Manually define OSDs
ceph orch daemon add osd rocky2:/dev/nvme0n2
ceph orch daemon add osd rocky2:/dev/nvme0n3
ceph orch daemon add osd rocky3:/dev/nvme0n2
ceph orch daemon add osd rocky3:/dev/nvme0n3


# Alternative: Add all available and unused devices as OSDs
ceph orch apply osd --all-available-devices
# List cluster OSDs: Details
ceph osd status

# Shell output:
ID  HOST     USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  rocky1  27.2M  19.9G      0        0       0        0   exists,up
 1  rocky1  26.5M  19.9G      0        0       0        0   exists,up
 2  rocky3  27.2M  19.9G      0        0       0        0   exists,up
 3  rocky2  26.5M  19.9G      0        0       0        0   exists,up
 4  rocky3  26.5M  19.9G      0        0       0        0   exists,up
 5  rocky2  27.2M  19.9G      0        0       0        0   exists,up



Storage Pools
#

Overview
#

Number of Placement Groups “pg_num”:

  • Number of placement groups in a Ceph pool. Each placement group is essentially a bucket that data objects are placed into.

  • By using placement groups, Ceph can distribute and balance the data load across all OSDs in the cluster.

  • The number of placement groups affects how well the data is distributed and balanced across the OSDs. Too few placement groups can lead to suboptimal performance because the data is not evenly distributed; too many add overhead, because each OSD has to manage more placement groups. A rule-of-thumb check is sketched after this list.


Number of Placement Groups for Placement “pgp_num”:

  • Defines how many of the placement groups are actively used to map data placements in the pool.

  • Used to control the re-balancing of data when pg_num is changed (usually increased to scale with the cluster). It allows the cluster to adjust at a controlled pace without overwhelming the system with too much data movement at once.
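
A common rule of thumb from the Ceph documentation is to target roughly 100 placement groups per OSD, divided by the replica count and rounded to a power of two. Recent releases also enable the pg_autoscaler by default, so the recommended values can simply be inspected (a quick sanity check, assuming the six OSDs and replica size 3 used in this setup):

# Rule of thumb: (OSDs * 100) / replica size, rounded to a power of two
# Example: (6 * 100) / 3 = 200  ->  roughly 128 or 256 placement groups

# Check the recommendations of the PG autoscaler
ceph osd pool autoscale-status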


List Storage Pools
#

# List storage pools
ceph osd lspools

Create Storage Pool
#

# Create storage pool: With 64 placement groups
ceph osd pool create pool-1 64 64 replicated

# Set the replication factor
ceph osd pool set pool-1 size 3

Explanation:

  • Pool name: pool-1

  • Initial number of placement groups (pg_num): 64

  • Number of placement groups for placement (pgp_num): 64

  • Type: replicated. Data is replicated across multiple OSDs for redundancy.
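
Since the pool will be used for RBD images in a later step, it can also be initialized for the rbd application, which prevents an "application not enabled on pool" health warning:

# Initialize the pool for use with RBD (tags the "rbd" application)
rbd pool init pool-1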

Adjust PG and PGP Number
#

# If necessary adjust the number of placement groups for optimal performance
ceph osd pool set pool-1 pg_num 128
ceph osd pool set pool-1 pgp_num 128

List Pool Details
#

# List pool details
ceph osd pool get pool-1 all

# Shell output:
size: 3
min_size: 2
pg_num: 64
pgp_num: 64
crush_rule: replicated_rule
hashpspool: true
nodelete: false
nopgchange: false
nosizechange: false
write_fadvise_dontneed: false
noscrub: false
nodeep-scrub: false
use_gmt_hitset: 1
fast_read: 0
pg_autoscale_mode: on
eio: false
bulk: false

RBD Block Storage
#

Create Image
#

Create an RBD image on the previously created storage pool:

# Create RBD image: Syntax
rbd create --size {megabytes} {pool-name}/{image-name}

# Create RBD image: Example
rbd create --size 2048 --pool pool-1 image-1

List Images & Image Details
#

# List images in a pool
rbd ls pool-1
# List image details
rbd info pool-1/image-1

# Shell output:
rbd image 'image-1':
        size 2 GiB in 512 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: d3c53de5e1f
        block_name_prefix: rbd_data.d3c53de5e1f
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features:
        flags:
        create_timestamp: Mon Jun 24 20:36:13 2024
        access_timestamp: Mon Jun 24 20:36:13 2024
        modify_timestamp: Mon Jun 24 20:36:13 2024
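
Depending on the kernel version on the client, some of these image features (object-map, fast-diff, deep-flatten) may not be supported by the kernel RBD driver, and mapping the image can fail. In that case the unsupported features can be disabled on the image, for example:

# Optional: Disable features that the kernel RBD client may not support
rbd feature disable pool-1/image-1 object-map fast-diff deep-flatten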

RBD Mapping
#

Install Ceph Client
#

Install the Ceph client on the Linux host where you want to use the RBD:

# Install Ceph client
sudo apt install ceph-common -y

Configure Ceph access:

# Copy the cluster configuration and authentication credentials to the client
scp /etc/ceph/ceph.conf debian@192.168.30.20:~/
scp /etc/ceph/ceph.client.admin.keyring debian@192.168.30.20:~/

Note: Copying the admin keyring to a client is fine for a homelab playground, but it is not a production-ready approach.

# Move the files and set permissions
sudo mv ~/ceph.conf /etc/ceph/ceph.conf &&
sudo mv ~/ceph.client.admin.keyring /etc/ceph/ceph.client.admin.keyring &&
sudo chown root:root /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring
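
Optionally, tighten the keyring permissions and verify that the client can reach the cluster (a quick sanity check, assuming the admin keyring copied above):

# Restrict access to the keyring
sudo chmod 600 /etc/ceph/ceph.client.admin.keyring

# Verify the client can talk to the cluster
sudo ceph -s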

Map the RBD Image
#

Map the image to a local device file:

# Map the RBD to a local device
sudo rbd map image-1 --pool pool-1

# Shell output:
/dev/rbd0

Mount the RBD Image
#

# Create a file system on the RBD image
sudo mkfs.ext4 /dev/rbd0

# Create a mount point directory
sudo mkdir -p /mnt/ceph-image-1

# Mount the RBD image
sudo mount /dev/rbd0 /mnt/ceph-image-1
# Verify the mount
df -h

# Shell output:
Filesystem                   Size  Used Avail Use% Mounted on
udev                         1.9G     0  1.9G   0% /dev
tmpfs                        389M  736K  388M   1% /run
/dev/mapper/debian--vg-root   28G  2.6G   24G  10% /
tmpfs                        1.9G     0  1.9G   0% /dev/shm
tmpfs                        5.0M     0  5.0M   0% /run/lock
/dev/sda1                    455M  172M  259M  40% /boot
tmpfs                        389M     0  389M   0% /run/user/1000
/dev/rbd0                    2.0G   24K  1.8G   1% /mnt/ceph-image-1
# Unmount the RBD image
sudo umount /mnt/ceph-image-1
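
The mapping and mount shown above do not survive a reboot. If persistence is needed, the rbdmap helper shipped with ceph-common can be used. The following is a sketch, assuming the admin keyring from above and the default udev device path; the exact fstab options should be verified against the rbdmap man page of the distribution:

# Add the image to /etc/ceph/rbdmap (pool/image plus credentials)
echo "pool-1/image-1 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring" | sudo tee -a /etc/ceph/rbdmap

# Add a matching fstab entry for the mapped device
echo "/dev/rbd/pool-1/image-1 /mnt/ceph-image-1 ext4 noauto,_netdev 0 0" | sudo tee -a /etc/fstab

# Enable the rbdmap service so the image is mapped at boot
sudo systemctl enable --now rbdmap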

RBD Mapping with specific User
#

Create User
#

# Create user that can access "pool-1"
ceph auth get-or-create client.user1 osd "allow rwx pool=pool-1" mon "allow r" -o /etc/ceph/ceph.client.user1.keyring
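
The capabilities above grant plain read/write/execute access on the pool. A common alternative for RBD clients is to use the built-in rbd capability profiles, which also cover operations such as blocklisting stale clients; a sketch with a hypothetical user name "client.user2":

# Alternative: Create a user with the RBD capability profiles
ceph auth get-or-create client.user2 mon 'profile rbd' osd 'profile rbd pool=pool-1' -o /etc/ceph/ceph.client.user2.keyring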

Test User Authentication
#

# Test Authentication
sudo ceph -s --id user1 --keyring /etc/ceph/ceph.client.user1.keyring

Add Keyring
#

Copy the keyring of “client.user1” via SSH, or manually create and paste the keyring on the client where the image will be mapped:

# Create a keyring file for the user
sudo vi /etc/ceph/ceph.client.user1.keyring

# Paste the keyring
[client.user1]
        key = AQBN7INmtgrMNhAARX6CPwUvFPJneO6gARoZhg==

Map Image
#

# Map the image with the previously created user
sudo rbd --id user1 --keyring /etc/ceph/ceph.client.user1.keyring map image-1 --pool pool-1

# Shell output:
/dev/rbd0

Verify Image Mapping
#

# Verify the mapped image
rbd showmapped

# Shell output:
id  pool    namespace  image    snap  device
0   pool-1             image-1  -     /dev/rbd0

Unmap Image
#

# Unmap image
sudo rbd unmap /dev/rbd0

# Verify the unmapping
rbd showmapped

Links
#

# Install Cephadm
https://docs.ceph.com/en/latest/cephadm/install/
https://docs.ceph.com/en/latest/cephadm/install/#cephadm-install-curl

# Ceph Dashboard
https://docs.ceph.com/en/quincy/mgr/dashboard/

# Add OSD
https://docs.ceph.com/en/latest/cephadm/services/osd/#cephadm-deploy-osds