Debian Jessie 8.3: Short howto for Corosync+Pacemaker Active/Passive Cluster with two nodes and DRBD/LVM

Hello,

since I had to migrate my old “heartbeat v1” setup to a more modern Corosync+Pacemaker setup, because “heartbeat v1” does not support systemd (at first it looks like it works, but it fails on service starts/stops), I want to share a simple setup:

  • Two nodes (node1-1 and node1-2)
  • Active/Passive setup
  • Shared IP (here: 123.123.123.123/24)
  • Internal network on eth1 (here: 192.168.99.0/24)
  • DRBD shared storage
  • LVM on top of DRBD
  • Multiple services, depending also on the DRBD/LVM storage

First you have to activate the jessie-backports repository, because the cluster stack is missing or broken in plain Debian Jessie. Install the required packages with:

apt-get install -t jessie-backports libqb0 fence-agents pacemaker corosync pacemaker-cli-utils crmsh drbd-utils
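
If jessie-backports is not yet enabled, an entry like the following has to be added before the install (the mirror and the file name are only examples, adjust them to your mirror setup), followed by apt-get update:

echo "deb http://ftp.debian.org/debian jessie-backports main" > /etc/apt/sources.list.d/jessie-backports.list
apt-get update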

After that, configure DRBD and create LVM (VG + LV) on top of it; there are plenty of tutorials for this, but a rough sketch follows below.
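
A minimal sketch of what this can look like. The backing disk /dev/sdb1, the port 7788 and the internal addresses 192.168.99.2/.3 are only examples; the resource name r0 and the VG/LV names data1/lv1 match the Pacemaker resources further down:

# /etc/drbd.d/r0.res (identical on both nodes)
resource r0 {
    device    /dev/drbd0;
    disk      /dev/sdb1;        # example backing device, adjust to your disks
    meta-disk internal;
    on node1-1 {
        address 192.168.99.2:7788;   # example internal address of node1-1
    }
    on node1-2 {
        address 192.168.99.3:7788;   # example internal address of node1-2
    }
}

# on both nodes
drbdadm create-md r0
drbdadm up r0
# on the node that should become primary first
drbdadm primary --force r0
# LVM and filesystem on top of DRBD (only on the primary; size is an example)
pvcreate /dev/drbd0
vgcreate data1 /dev/drbd0
lvcreate -n lv1 -L 50G data1
mkfs.ext4 /dev/mapper/data1-lv1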

Then deploy this configuration to /etc/corosync/corosync.conf:

totem {
    version: 2
    token: 3000
    token_retransmits_before_loss_const: 10
    clear_node_high_bit: yes
    crypto_cipher: none
    crypto_hash: none
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.99.0
    }
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
    wait_for_all: 1
}

nodelist {
    node {
        ring0_addr: node1-1
    }
    node {
        ring0_addr: node1-2
    }
}
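
Since the nodelist references the nodes by name, both hostnames should resolve to their addresses on the internal network, e.g. via /etc/hosts (the .2/.3 addresses are only examples, as above). Afterwards corosync and pacemaker can be (re)started on both nodes:

# /etc/hosts on both nodes (example addresses)
192.168.99.2   node1-1
192.168.99.3   node1-2

# on both nodes, after deploying corosync.conf
systemctl restart corosync
systemctl restart pacemaker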

Both nodes require a passwordless keypair, with the public key copied to the other node, so that you can ssh from each node to the other.
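
For example (run on both nodes, copying the key to the respective other node):

ssh-keygen -t rsa            # accept the defaults, empty passphrase
ssh-copy-id root@node1-2     # on node1-1; on node1-2 copy to node1-1 instead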

Then you can start with crm configure:

property stonith-enabled=no
property no-quorum-policy=ignore
property default-resource-stickiness=100

primitive DRBD_r0 ocf:linbit:drbd params drbd_resource="r0" op start interval="0" timeout="240" \
op stop interval="0" timeout="100" \
op monitor role=Master interval=59s timeout=30s \
op monitor role=Slave interval=60s timeout=30s
primitive LVM_r0 ocf:heartbeat:LVM params volgrpname="data1" op monitor interval="30s"
primitive SRV_MOUNT_1 ocf:heartbeat:Filesystem params device="/dev/mapper/data1-lv1" directory="/srv/storage" fstype="ext4" options="noatime,nodiratime,nobarrier" op monitor interval="40s"

primitive IP-rsc ocf:heartbeat:IPaddr2 params ip="123.123.123.123" nic="eth0" cidr_netmask="24" meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive IPInt-rsc ocf:heartbeat:IPaddr2 params ip="192.168.99.4" nic="eth1" cidr_netmask="24" meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart

primitive MariaDB-rsc lsb:mysql meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive Redis-rsc lsb:redis-server meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive Memcached-rsc lsb:memcached meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive PHPFPM-rsc lsb:php5-fpm meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive Apache2-rsc lsb:apache2 meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive Nginx-rsc lsb:nginx meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart

group APCLUSTER LVM_r0 SRV_MOUNT_1 IP-rsc IPInt-rsc MariaDB-rsc Redis-rsc Memcached-rsc PHPFPM-rsc Apache2-rsc Nginx-rsc
ms ms_DRBD_APCLUSTER DRBD_r0 meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

colocation APCLUSTER_on_DRBD_r0 inf: APCLUSTER ms_DRBD_APCLUSTER:Master
order APCLUSTER_after_DRBD_r0 inf: ms_DRBD_APCLUSTER:promote APCLUSTER:start

commit

The last two lines (the colocation and order constraints) gave me some headache. In short, they ensure that DRBD is promoted to primary on the active node and that the “APCLUSTER” group is only started on that same host afterwards, since the LVM volume, the filesystem and the services need access to its data.
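
To check that everything came up as expected, and to test a failover by hand, something like the following can be used (putting a node into standby is one simple way to force a switch):

crm status                  # all resources should run on one node, with DRBD as Master there
crm node standby node1-1    # resources should move to node1-2
crm node online node1-1     # node1-1 rejoins; the stickiness keeps the resources on node1-2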

Just a short copy-and-paste howto for a simple use case, without going into much deeper explanation.
