Proxmox Hypervisor - High-availability Cluster, Firewall, Proxmox Commands


Proxmox tutorial part 2

Prerequisites
#

Secure Boot
#

Proxmox does not support Secure Boot, so it is necessary to disable it in the BIOS.
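If a Linux system is already installed on the machine, you can check the current Secure Boot state from its shell before rebooting into the BIOS; a quick check, assuming the mokutil package is available:

# Check whether Secure Boot is currently enabled
mokutil --sb-state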


Nvidia GPU
#

I was not able to install Proxmox on a system that used an Nvidia GT 730 as its only GPU. There seems to be a problem with Proxmox 8 and Nvidia GPUs; I was able to install Proxmox after switching to an AMD GPU.


Nodes
#

In this tutorial I use the following nodes:

192.168.70.10 pm10.jueklu pm10
192.168.70.11 pm11.jueklu pm11
192.168.70.12 pm12.jueklu pm12
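Each node should be able to resolve the other nodes by hostname. A sketch of the matching /etc/hosts entries, using the example hostnames and IPs above (adjust them to your environment):

# /etc/hosts entries on every node
192.168.70.10 pm10.jueklu pm10
192.168.70.11 pm11.jueklu pm11
192.168.70.12 pm12.jueklu pm12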

Proxmox Cluster
#

Create Cluster
#

To create a cluster, connect to any node and run the following command:

# Create cluster
pvecm create jkw-pmx

# Shell output:
Corosync Cluster Engine Authentication key generator.
Gathering 2048 bits for key from /dev/urandom.
Writing corosync key to /etc/corosync/authkey.
Writing corosync config to /etc/pve/corosync.conf
Restart corosync and cluster filesystem

# Check cluster status
pvecm status
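As an optional sanity check, you can verify that the cluster services came up cleanly; a minimal check, assuming the default systemd units Proxmox ships:

# Check the cluster related services
systemctl status corosync pve-cluster --no-pager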

Add Node to Cluster
#

Connect to the node that should be added to the cluster and run the following command:

# Add node to cluster
pvecm add 192.168.70.10 # Use IP of node that is already in the cluster


# Shell output:
Please enter superuser (root) password for '192.168.70.10': # Enter your root PW

Are you sure you want to continue connecting (yes/no)? # yes

# List the nodes in the cluster
pvecm nodes

# Shell output
Membership information
----------------------
    Nodeid      Votes Name
         1          1 pm10 (local)
         2          1 pm12
         3          1 pm11
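To confirm that the freshly joined node also counts towards the quorum, you can filter the status output; a small sketch:

# Verify the cluster is quorate after the join
pvecm status | grep -i quorate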

Add Node to Cluster: Debugging
#

# Rejoin a member node with the same hostname and IP
pvecm add <existing_node_ip> -f

Note: This command rewrites the cluster configuration and recreates the SSH authentication key.


Shared Storage
#

To move virtual machines between nodes, it’s necessary to add a shared storage pool. In this tutorial I use an NFS export from an Ubuntu server.

NFS Settings on Fileserver
#

# Open NFS configuration
sudo vi /etc/exports

# Add the NFS export
/mnt/pmx_vms 192.168.70.10(rw,sync,no_root_squash)
/mnt/pmx_vms 192.168.70.11(rw,sync,no_root_squash)
/mnt/pmx_vms 192.168.70.12(rw,sync,no_root_squash)

# Restart NFS service
sudo systemctl restart nfs-server
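Before adding the storage in Proxmox, you can check from one of the nodes that the export is actually visible; a quick test, assuming the NFS client utilities are installed and the fileserver is reachable (replace the placeholder with your fileserver’s IP):

# List the NFS exports offered by the fileserver (run on a Proxmox node)
showmount -e <fileserver-ip>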

NFS Shared Storage
#

  • ID: Name of the storage pool
  • Server: IP or DNS name of the NFS server
  • Export: The NFS export path
  • Content: Defines what the storage is used for
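The same storage pool can also be added from the shell instead of the GUI; a sketch, assuming the storage ID pmx-vms and the export from above (adjust the server IP and content types to your setup):

# Add the NFS export as a shared storage pool
pvesm add nfs pmx-vms --server <fileserver-ip> --export /mnt/pmx_vms --content images,iso

# Verify the storage is available
pvesm status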

Create VM
#

Select the shared storage pool for the VM

If you used local storage for the ISO image, remove the ISO image from the VM after the installation of the OS. Otherwise you can’t migrate the VM to another node.
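The ISO can also be detached from the shell; a sketch, assuming the image is attached to the default ide2 CD-ROM drive of VM 101 (adjust the VMID and drive to your VM):

# Empty the CD-ROM drive so the VM no longer references local storage
qm set 101 --ide2 none,media=cdrom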


Migrate VM
#

GUI
#

Let’s migrate the VM from my pm10 node to node pm11:


Shell
#

Let’s migrate the VM back to node pm10, but this time using the shell:

# Migrate VM to another node
qm migrate 101 pm10 -online

# Shell Output:
2023-08-05 22:45:54 starting migration of VM 101 to node 'pm10' (192.168.70.10)
2023-08-05 22:45:54 starting VM 101 on remote node 'pm10'
2023-08-05 22:45:56 start remote tunnel
2023-08-05 22:45:56 ssh tunnel ver 1
2023-08-05 22:45:56 starting online/live migration on unix:/run/qemu-server/101.migrate
2023-08-05 22:45:56 set migration capabilities
2023-08-05 22:45:56 migration downtime limit: 100 ms
2023-08-05 22:45:56 migration cachesize: 256.0 MiB
2023-08-05 22:45:56 set migration parameters
2023-08-05 22:45:56 start migrate command to unix:/run/qemu-server/101.migrate
2023-08-05 22:45:57 migration active, transferred 111.9 MiB of 2.0 GiB VM-state, 117.2 MiB/s
2023-08-05 22:45:58 migration active, transferred 221.5 MiB of 2.0 GiB VM-state, 7.3 GiB/s
2023-08-05 22:45:59 average migration speed: 688.3 MiB/s - downtime 250 ms
2023-08-05 22:45:59 migration status: completed
2023-08-05 22:46:03 migration finished successfully (duration 00:00:09)

High Availability
#

  • Shut down the virtual machine before enabling HA
  • Go to the “HA” section of the cluster and add the VM
  • Max. Restart: Number of times Proxmox will restart the migrated virtual machine should a failure occur

  • Max. Relocate: Number of times Proxmox will try to migrate the virtual machine to another node

  • Request State: VM state after the migration to another node

  • After adding the VM to HA, the VM automatically turns on

  • Check the “HA State” of the VM:

  • Let’s disconnect the node (pm10) where the virtual machine is running
  • It should take about 60 seconds until Proxmox tries to migrate the virtual machine to another node

Note: After the node (pm10) is back online, Proxmox does not migrate the virtual machine back to the original node. This has to be done manually.
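The same HA resource can also be created from the shell; a sketch, assuming VM 101 and example values of 1 for the restart and relocate limits:

# Add VM 101 as an HA resource
ha-manager add vm:101 --state started --max_restart 1 --max_relocate 1

# Check the HA status
ha-manager status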


No quorum
#

A quorum represents a majority of members in a cluster, which is required to agree on updates to the cluster state. With three one-vote nodes, for example, at least two nodes must be online for the cluster to be quorate.

# Shell Output
No cluster network links passed explicitly, fallback to local node IP '192.168.70.11'
Request addition of this node
An error occurred on the cluster node: cluster not ready - no quorum?
Cluster join aborted!

Start VM when Node is offline
#

When you try to start a VM on a node while several other nodes in the cluster are offline - even if the VM is not configured as HA - the following error is shown:
cluster not ready - no quorum? (500)

Use the following command to bypass the error:

# Set the expected number of votes to reach a quorum to 1
pvecm expected 1
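Keep in mind that this lowers the expected votes only for troubleshooting. The value normally recovers once the other nodes rejoin, but you can also set it back explicitly; a sketch, assuming the three-node example cluster from this tutorial:

# Restore the expected number of votes
pvecm expected 3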

Miscellaneous
#

Hosts file structure
#

If you change the hostname of a node before adding it to a cluster, keep in mind to adapt the /etc/hosts file accordingly.

# Edit hosts file
vi /etc/hosts

# Adopt the following format: Add entry for your IP address
127.0.0.1 localhost.localdomain localhost
192.168.70.12 pm12.jueklu pm12


# Check the IP address
hostname --ip-address

# Shell output:
192.168.70.12

Proxmox clustered file system (pmxcfs)
#

All configuration files are shared by all nodes in the cluster; changes made to files inside the /etc/pve folder are replicated automatically to all nodes.

# Pmxcfs filesystem mountpoint
/etc/pve
# Cluster configuration
/etc/pve/corosync.conf

# /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pm10
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.70.10
  }
  node {
    name: pm11
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.70.11
  }
  node {
    name: pm12
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.70.12
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: jkw-pmx
  config_version: 3
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
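Because /etc/pve/corosync.conf is replicated by pmxcfs, the Proxmox documentation recommends not editing it in place; a sketch of the usual workflow (edit a copy, bump config_version, then move it back):

# Work on a copy of the cluster configuration
cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new

# Edit the copy and increment the config_version entry
vi /etc/pve/corosync.conf.new

# Activate the new configuration
mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf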

Firewall
#

Cluster Firewall
#

The cluster firewall rules automatically affect all the nodes in the cluster. They also affect the virtual machines, but the firewall must be enabled for each specific virtual machine.

  • Add rule for SSH

Note: Since I selected “Interface: vmbr0”, this rule does not apply to the virtual machines, so their SSH traffic is still blocked. To enable SSH for the virtual machines I create a new firewall rule on the security group level.

  • Add rule for the Proxmox webinterface
  • Enable the firewall
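The rules created in the GUI end up in the cluster-wide firewall file, which can also be inspected or edited directly; a rough sketch of what /etc/pve/firewall/cluster.fw looks like with the SSH and webinterface rules above (the SSH macro is standard, the vmbr0 interface comes from this example):

# /etc/pve/firewall/cluster.fw
[OPTIONS]
enable: 1

[RULES]
IN SSH(ACCEPT) -i vmbr0
IN ACCEPT -p tcp -dport 8006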

Security Group
#

Let’s create a security group for web applications (a sketch of the resulting configuration follows below):

  • Create a new security group
  • Add the necessary rules for port 80 and port 443
  • Add a firewall rule for SSH
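Security groups are stored in the same cluster.fw file; a sketch of the section the rules above create, assuming the group is named webapps (the name is just an example):

# Appended to /etc/pve/firewall/cluster.fw
[group webapps]
IN HTTP(ACCEPT)
IN HTTPS(ACCEPT)
IN SSH(ACCEPT)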

Virtual Machine
#

  • Add the new security group to the virtual machine
  • Select the security group from the dropdown menu and leave the “Interface” blank / without definition
  • Enable the firewall for the virtual machine
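On the VM level the same settings are stored in a per-VM firewall file; a sketch, assuming VM ID 101 and the group name from the example above:

# /etc/pve/firewall/101.fw
[OPTIONS]
enable: 1

[RULES]
GROUP webapps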

Proxmox Commands
#

VM related Commands
#

# List VMs: On node
qm list

# Shell output:
VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID
 100 JKW-Fileserver       stopped    4096              32.00 0
 101 Cluster-VM           running    2048              20.00 16667


# List VMs: On cluster
cat /etc/pve/.vmlist

# Shell output:
{
"version": 22,
"ids": {
"100": { "node": "pm10", "type": "qemu", "version": 1 },
"101": { "node": "pm10", "type": "qemu", "version": 20 }}
}

# Start VM
qm start <VMID>
# Reboot VM
qm reboot <VMID>

# Shutdown VM: gentle
qm shutdown <VMID>
# Stop VM: not gentle
qm stop <VMID>
# Delete VM and all used/owned volumes and firewall rules
qm destroy <VMID>

# Show the configured RAM of a VM
qm config <VMID> | grep ^memory


# Migrate VM to another Node
qm migrate <VMID> <Destination-Node>

# Migrate VM to another Node: Running VM
qm migrate <VMID> <Destination-Node> -online

Add & Remove Node to Cluster
#

# Add Node to Cluster
pvecm add <IP> # Use IP of any member of the Cluster

# Remove Node from Cluster
pvecm delnode <hostname>

# Check Cluster status
pvecm status

# List Proxmox Nodes
pvecm nodes

Proxmox Commands
#

# Check Proxmox version
pveversion

# Check Proxmox version: Detailed
pveversion --verbose

# Upgrade Proxmox
# (shows more output than dist-upgrade, such as whether a Proxmox-related reboot is required)
pveupgrade

# List CPU and disk performance
pveperf

# List attached storages
pvesm status

# List LVM volume groups on the current node
pvesm lvmscan

Troubleshooting
#

# Restart Proxmox GUI (Execute Commands in order)
service pvedaemon restart
service pveproxy restart
service pvestatd restart