Be careful: Upgrading Debian Jessie to Stretch with Pacemaker, DRBD and a nested ext4 LVM hosted on VMware products

Detached DRBD (diskless)

Some time ago I set up some new Pacemaker cluster nodes with a fresh Debian Stretch installation. I followed our standard installation guide and also created the shared, replicated DRBD storage, but whenever I tried to mount the ext4 filesystem, DRBD detached the disks on both nodes with I/O errors. Recreating the storage, using other storage volumes and testing my ProLiant hardware (which I suspected of being defective..) did not help, the problem still occurred. But somewhere in the middle of testing, a quicker setup without LVM worked fine, hum..

Much later I found this on the DRBD-user mailing list (the only post about it at that time): [0]
This means that if you use the combination VMware product -> Debian Stretch -> local storage -> DRBD -> LVM -> ext4, you will be affected by this bug. It happens because VMware always reports to the guest that the “WRITE SAME” feature is supported, which is wrong. The DRBD version shipped with Stretch now also supports WRITE SAME, so it tries to use this feature, which then fails.
This is, by the way, the same reason why VMware users see this in their dmesg:

WRITE SAME failed. Manually zeroing.

As a workaround I now use systemd (its tmpfiles.d mechanism) to disable “WRITE SAME” for all attached block devices in the guest. Simply run the following:

for i in `find /sys/block/*/device/scsi_disk/*/max_write_same_blocks`; do echo "w $i - - - - 0" ; done > /etc/tmpfiles.d/write_same.conf
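Before writing the file, it can help to preview the lines that would end up in it. The following is only a sketch: the function name write_same_tmpfiles_lines and its optional sysfs root argument are made up for this illustration, and it writes nothing to disk.

```shell
#!/bin/sh
# Print one tmpfiles.d "w" line per max_write_same_blocks attribute found
# below the given sysfs root (defaults to /sys). Nothing is written to disk.
# The function name and the root argument are hypothetical helpers.
write_same_tmpfiles_lines() {
    sysfs_root="${1:-/sys}"
    for attr in "$sysfs_root"/block/*/device/scsi_disk/*/max_write_same_blocks; do
        # Skip the unexpanded glob when no SCSI disk exposes the attribute.
        [ -e "$attr" ] || continue
        printf 'w %s - - - - 0\n' "$attr"
    done
}

# Preview what would be written:
#   write_same_tmpfiles_lines
# Show the values that are currently active:
#   grep . /sys/block/*/device/scsi_disk/*/max_write_same_blocks
```

Once the file exists, running systemd-tmpfiles --create /etc/tmpfiles.d/write_same.conf should apply it without a reboot; after that the grep above is expected to show 0 for every disk.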

[0]: http://lists.linbit.com/pipermail/drbd-user/2017-January/022931.html

Pacemaker failovers with DRBD+LVM do not work

If you use DRBD with a nested LVM, you already had to add the following lines to your /etc/lvm/lvm.conf in past Debian releases (assuming that sdb and sdc are the backing disks of your DRBD devices):

filter = [ "r|/dev/sdb.*|/dev/sdc.*|" ]
write_cache_state = 0

With Debian Stretch this is not enough. Your failovers will end in a broken state on the second node, because it cannot find your VGs and LVs. I found out that killing lvmetad helps. So I also added a global_filter (which should apply to all LVM services):

global_filter = [ "r|/dev/sdb.*|/dev/sdc.*|" ]

But this also didn’t help.. My only solution was to disable lvmetad (which I do not use anyway). Adding all of this in combination works for me now, and failovers are as smooth as with Jessie:

filter = [ "r|/dev/sdb.*|/dev/sdc.*|" ]
global_filter = [ "r|/dev/sdb.*|/dev/sdc.*|" ]
write_cache_state = 0
use_lvmetad = 0

Do not forget to update your initrd, so that the updated LVM configuration is also used when your server boots:

update-initramfs -k all -u
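A quick sanity check of the file before rebooting can save a failed failover test. This is a minimal sketch that only greps the file: the function name check_lvm_conf is invented for this example, and it does not ask LVM for its effective configuration.

```shell
#!/bin/sh
# Return 0 if the given lvm.conf (default: /etc/lvm/lvm.conf) contains an
# active "use_lvmetad = 0" line and an active global_filter line.
# Commented-out lines are ignored. Hypothetical helper: it only inspects
# the file and does not query LVM itself.
check_lvm_conf() {
    conf="${1:-/etc/lvm/lvm.conf}"
    grep -Eq '^[[:space:]]*use_lvmetad[[:space:]]*=[[:space:]]*0' "$conf" || {
        echo "use_lvmetad is not set to 0 in $conf" >&2
        return 1
    }
    grep -Eq '^[[:space:]]*global_filter[[:space:]]*=' "$conf" || {
        echo "no global_filter set in $conf" >&2
        return 1
    }
    return 0
}

# Usage:
#   check_lvm_conf && echo "lvm.conf looks fine"
```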

Reboot, that’s it :)

Debian Jessie 8.3: Short howto for Corosync+Pacemaker Active/Passive Cluster with two nodes and DRBD/LVM

Hello,

I had to replace my old “heartbeat v1” setup with a more modern Corosync+Pacemaker setup, because “heartbeat v1” does not support systemd (at first it looks like it is working, but it fails on service starts/stops). So I want to share a simple setup:

  • Two nodes (node1-1 and node1-2)
  • Active/Passive setup
  • Shared IP (here: 123.123.123.123/24)
  • Internal network on eth1 (here: 192.168.99.0/24)
  • DRBD shared storage
  • LVM on top of DRBD
  • Multiple services, depending also on the DRBD/LVM storage

First you have to activate the jessie-backports repository, because the cluster stack is missing or broken in Debian Jessie itself. Install the required packages with:

apt-get install -t jessie-backports libqb0 fence-agents pacemaker corosync pacemaker-cli-utils crmsh drbd-utils

After that, configure your DRBD device and the LVM (VG+LV) on top of it (there are enough tutorials for that).

Then deploy this configuration to /etc/corosync/corosync.conf:

totem {
    version: 2
    token: 3000
    token_retransmits_before_loss_const: 10
    clear_node_high_bit: yes
    crypto_cipher: none
    crypto_hash: none
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.99.0
    }
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
    wait_for_all: 1
}

nodelist {
    node {
        ring0_addr: node1-1
    }
    node {
        ring0_addr: node1-2
    }
}
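The nodelist above uses host names, so both names have to resolve on each node. A small sketch for checking that follows; the function name corosync_ring_addrs is an assumption of this example and not part of corosync.

```shell
#!/bin/sh
# Print every ring0_addr value found in a corosync.conf
# (default: /etc/corosync/corosync.conf), one per line.
# Hypothetical helper, not shipped with corosync.
corosync_ring_addrs() {
    conf="${1:-/etc/corosync/corosync.conf}"
    sed -n 's/^[[:space:]]*ring0_addr:[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf"
}

# Usage on a node: report names that do not resolve.
#   for host in $(corosync_ring_addrs); do
#       getent hosts "$host" >/dev/null || echo "cannot resolve $host"
#   done
```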

Both nodes require a passwordless key pair whose public key is copied to the other node, so that you can ssh from each node to the other.

Then you can start with crm configure:

property stonith-enabled=no
property no-quorum-policy=ignore
property default-resource-stickiness=100

primitive DRBD_r0 ocf:linbit:drbd params drbd_resource="r0" op start interval="0" timeout="240" \
op stop interval="0" timeout="100" \
op monitor role=Master interval=59s timeout=30s \
op monitor role=Slave interval=60s timeout=30s
primitive LVM_r0 ocf:heartbeat:LVM params volgrpname="data1" op monitor interval="30s"
primitive SRV_MOUNT_1 ocf:heartbeat:Filesystem params device="/dev/mapper/data1-lv1" directory="/srv/storage" fstype="ext4" options="noatime,nodiratime,nobarrier" op monitor interval="40s"

primitive IP-rsc ocf:heartbeat:IPaddr2 params ip="123.123.123.123" nic="eth0" cidr_netmask="24" meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive IPInt-rsc ocf:heartbeat:IPaddr2 params ip="192.168.99.4" nic="eth1" cidr_netmask="24" meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart

primitive MariaDB-rsc lsb:mysql meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive Redis-rsc lsb:redis-server meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive Memcached-rsc lsb:memcached meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive PHPFPM-rsc lsb:php5-fpm meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive Apache2-rsc lsb:apache2 meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
primitive Nginx-rsc lsb:nginx meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart

group APCLUSTER LVM_r0 SRV_MOUNT_1 IP-rsc IPInt-rsc MariaDB-rsc Redis-rsc Memcached-rsc PHPFPM-rsc Apache2-rsc Nginx-rsc
ms ms_DRBD_APCLUSTER DRBD_r0 meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

colocation APCLUSTER_on_DRBD_r0 inf: APCLUSTER ms_DRBD_APCLUSTER:Master
order APCLUSTER_after_DRBD_r0 inf: ms_DRBD_APCLUSTER:promote APCLUSTER:start

commit

The last two lines before the commit (the colocation and order constraints) gave me some headache. In short, they define that the DRBD device on the active node has to be the primary one, and that it has to be promoted before the “APCLUSTER” group is started on that host, since the LVM, the filesystem and the services need access to its data.

Just a short copy-and-paste howto for a simple use case, without much in-depth explanation..

BASH fix Debian Lenny (5.0) CVE-2014-6271, CVE-2014-7169 aka Shellshock

Hello,

I have decided to create fixed bash packages for Debian Lenny. I have applied the upstream patch sets from 052 to 057, so some other issues are also addressed. :-)
And here they are:

Source .dsc: http://misc.linux-dev.org/bash_shellshock/bash_3.2-4.1.dsc
amd64 package: http://misc.linux-dev.org/bash_shellshock/bash_3.2-4.1_amd64.deb
i386 package: http://misc.linux-dev.org/bash_shellshock/bash_3.2-4.1_i386.deb

Much fun with it!

What an ugly (PHP) work..

We still have some web applications which are more or less incompatible with PHP versions higher than 5.2.x, which is the only blocker for upgrading the last Lenny servers to Squeeze.. I do not think that I am alone with this ****** topic :)

So the new “masterplan” is to deploy those applications on separate Wheezy servers, with PHP 5.2.x running as FastCGI, so that most parts of the system are “security supported”.

First: I didn’t document my steps and I am not 100% done (more like 99%), but I have done the following to keep it “as clean as possible”:

  • Fetch the original 5.2.17 sources and build them, urgs.. this fails completely with the new multiarch paths in Wheezy, after a few hours of patching I gave up..
  • Use the dotdeb Lenny sources as reference, they also have 5.2.17 sources, but what the fuck? The orig sources from their mirror could not be built either, because the patch series FAILS (not hunky, they fail!), how did they build them???
  • After some sanitizing of the dotdeb packages I thought it better to smoke a cigarette and delete them, urgs..
  • My next step was to fetch the latest 5.2.12-x packaging from snapshot.debian.org, and here the story continues… again…:

PHP 5.2.x is just not able to detect the new multiarch paths, it fails at most “dir” options. Since patching the whole build system would be *too* much work, I decided to hack around this (ln -s /usr/lib/x86…./foo.so /usr/lib/). After some adjustments to the build dependencies, disabling some modules like SSL and libdb (incompatible versions), disabling merged patches and refreshing the suhosin hardening patch, I got a working PHP 5.2.17 package on Wheezy.
But this is too easy!
I want packages which I can co-install with the PHP 5.4 packages from Wheezy; 5.2 should only be used within vHosts where I have enabled it..

So I rewrote the whole packaging (*burg* IMHO at all) to use “php52” instead of “php5” as packaging namespace, and everything is installed with “/opt” as prefix. Much painful work, but yeah, it works.. :)
Some packages, like php52-dev or php-pear, are broken, but those were not the goal of this exercise.

If someone is interested in those packages please send me an email.
Since PHP 5.2 is not supported any longer (and since this is a big hack overall) I will not publish the sources and binaries.

Playing with Apache mod_geoip

If you want to add some rules to your Apache based on the client’s country, mod_geoip is perfect for it.

Installation

On Squeeze the following is enough: # apt-get install libapache2-mod-geoip geoip-database/squeeze-backports

Note that you should use the geoip-database version from squeeze-backports to get the most up-to-date database; I update it every month.

Configuration

You can add the rules to your VirtualHost, Directory, Location directives and also to your apache2.conf (“serverwide”). So you are flexible with where to use it.

Blocking countries

On some servers more than 90 percent of the spam requests come from only three countries, so I blocked them with:

<DirectoryMatch "^/var/www/.*/html">
SetEnvIf GEOIP_COUNTRY_CODE RU BlockCountry
SetEnvIf GEOIP_COUNTRY_CODE CN BlockCountry
SetEnvIf GEOIP_COUNTRY_CODE UA BlockCountry
Deny from env=BlockCountry
</DirectoryMatch>

Allow only specific countries

The other way around, you can also allow only specific countries to access your website. This may also be a good idea for extranets, where you know where your customers are located:

<Directory "/var/www/my.site.com/html/login">
SetEnvIf GEOIP_COUNTRY_CODE DE AllowCountry
SetEnvIf GEOIP_COUNTRY_CODE CH AllowCountry
Deny from all
Allow from env=AllowCountry
</Directory>

Very easy!

Rewrite Rules

You can also use it with mod_rewrite. In one project, customers from CN and TW had to be redirected to the Chinese page:

RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(CN|TW)$
RewriteRule ^(.*)$ http://some.example.cn/site.php [L]

mod_geoip with proxy frontends

Normally mod_geoip also works behind load balancers and proxy servers, since it also takes care of the HTTP_X_FORWARDED_FOR header.

But with haproxy it is problematic, since haproxy does not add the HTTP_X_FORWARDED_FOR header to KeepAlive’d requests :( Disabling KeepAlive is a bad idea on this cluster, so we decided to also use php5-geoip in our application, and now everything works nicely..

What mod_geoip is NOT

mod_geoip helps you to block/allow specific countries, but it does not protect you from them.
Also keep in mind that the database is only ~99.8% accurate, so you may get false positives/negatives. If you only allow German users, a German IP could be listed as Russian.
This is much more problematic with mobile/satellite connections, and of course you cannot access your own page either if you are on vacation in another country. ;)
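To see what the database says about a particular address before blaming your rules, the geoiplookup command line tool can help. The sketch below makes several assumptions: that geoiplookup is installed (on Debian it should come with the geoip-bin package), that its output for the country database looks like the line shown in the comment, and the function name country_code is made up for this example.

```shell
#!/bin/sh
# Read geoiplookup output on stdin and print only the two-letter country
# code. Assumed input line format (GeoIP legacy country database):
#   GeoIP Country Edition: DE, Germany
# Lines that do not match (for example "IP Address not found") print nothing.
country_code() {
    sed -n 's/^GeoIP Country Edition: \([A-Z][A-Z]\),.*$/\1/p'
}

# Usage (192.0.2.1 is a documentation address, replace it with a real one):
#   geoiplookup 192.0.2.1 | country_code
```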

Hide process information for other users

Debian GNU/Linux 7.0 (aka Wheezy) will be a “generally hardened” distribution in my eyes. Not only does it now enable hardened building of packages (see http://wiki.debian.org/Hardening), the kernel team also backported the IMO very interesting hidepid option with 3.2.20-1 (available in Wheezy for some weeks already)!

What is the job of “hidepid”?

hidepid is a new mount option for procfs (/proc). With it you can hide processes and their information from other users, such as other shell users and web scripts.

hidepid accepts three different values:

  • hidepid=0 (default): The default setting, which gives you the previous behaviour: every user can see all processes.
  • hidepid=1: A normal user cannot see the details of processes other than their own in ps, top etc., but is still able to see the process IDs in /proc.
  • hidepid=2: Users are only able to see their own processes (as with hidepid=1), and in addition the other process IDs are hidden from them in /proc!

Additionally you can specify a group ID with the gid option whose members are still able to look up all processes. So if you want to hide all processes from other users, except root (uid=0) and, in this example, gid=1001 (some semi-administrative group), your /etc/fstab has to look like this:

proc            /proc           proc    defaults,hidepid=2,gid=1001        0       0

It was a good decision to backport this feature IMO, but be careful, it *may* break programs. I did not find any server-related application that breaks with hidepid=2, but we had to adjust our Nagios monitoring to run some process checks with another UID, since the nagios user itself could no longer see whether processes A and B were still running.
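To check which setting is actually active, you can read the mount options of /proc. The following is a small sketch that parses a mounts table; the function name proc_hidepid is invented for this illustration.

```shell
#!/bin/sh
# Print the hidepid value of the /proc mount as listed in a mounts table
# (default: /proc/mounts). Prints 0 when the option is absent, which is the
# kernel default. Hypothetical helper name.
proc_hidepid() {
    mounts="${1:-/proc/mounts}"
    awk '$2 == "/proc" && $3 == "proc" {
        value = 0
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^hidepid=/) value = substr(opts[i], 9)
        print value
        exit
    }' "$mounts"
}

# Usage:
#   proc_hidepid
```

To try the option without a reboot, remounting with mount -o remount,hidepid=2,gid=1001 /proc as root should be enough; the gid value is the one from the fstab example.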

UPDATE 1:
Since a few people asked (thanks for that): with hidepid=2 the process IDs are not just invisible, they are unavailable:
$ ls /proc/1
ls: cannot access /proc/1: No such file or directory
$

Logging packets with iptables and ULOG

Imagine you have got the following iptables rule set:

*filter
:INPUT ACCEPT [2:130]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [119:14185]
-A INPUT -s 127.0.0.0/8 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
-A INPUT -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -j DROP
COMMIT

This would allow all traffic from 127.0.0.0/8 and all TCP traffic to ports 22 and 80. All other TCP SYN packets (that is, all other incoming connection attempts) would be dropped.
Now you see that the counter of the SYN DROP rule is increasing and you want to know what is being dropped, but how?

The simple answer is ULOG and ulogd, the netfilter userspace logging daemon.
Debian ships several variants of it: the local logging one (which I will use here, simply called ulogd) and the -postgres, -mysql and -sqlite3 ones (these are not the exact package names), with which you can also log everything to a (remote) database.
A special variant is the -pcap one: it writes the logs in .pcap format, so that you can analyze the full traffic.

So for our example it is enough to install the package:

apt-get install ulogd

And then add another rule BEFORE our SYN DROP:

-A INPUT -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -j ULOG
-A INPUT -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -j DROP

Now you will find a log of all connections that are about to be dropped in /var/log/ulog/syslogemu.log. The log looks like this:

Aug 13 14:42:07 srv1 IN=eth0 OUT= MAC=00:0c:29:8c:2b:6c:00:d0:02:eb:e8:0a:08:00  SRC=75.125.70.194 DST=XXX.XXX.XXX.XXX LEN=40 TOS=00 PREC=0x00 TTL=54 ID=9566 PROTO=TCP SPT=57144 DPT=445 SEQ=2770468863 ACK=0 WINDOW=512 SYN URGP=0
Aug 13 14:45:29 srv1 IN=eth0 OUT= MAC=00:0c:29:8c:2b:6c:00:d0:02:eb:e8:0a:08:00  SRC=75.125.70.194 DST=XXX.XXX.XXX.XXX LEN=40 TOS=00 PREC=0x00 TTL=55 ID=13702 PROTO=TCP SPT=58528 DPT=445 SEQ=1217789951 ACK=0 WINDOW=512 SYN URGP=0

So now you have the full date, the MAC address (mostly the one of your gateway), source and destination IP, source and destination port, length, protocol, etc.
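Once the log grows, a short summary per source address and destination port is easier to read than the raw lines. This is only a sketch that assumes the log format shown above; the function name summarize_ulog is made up for this example.

```shell
#!/bin/sh
# Count logged packets per source address and destination port.
# Reads a syslogemu-style log (default: /var/log/ulog/syslogemu.log) and
# prints "count source port" lines, most frequent first.
# Hypothetical helper; it relies on the SRC= and DPT= fields shown above.
summarize_ulog() {
    log="${1:-/var/log/ulog/syslogemu.log}"
    awk '{
        src = ""; dpt = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^SRC=/) src = substr($i, 5)
            if ($i ~ /^DPT=/) dpt = substr($i, 5)
        }
        # Lines without both fields (for example ICMP) are skipped.
        if (src != "" && dpt != "") count[src " " dpt]++
    }
    END {
        for (key in count) print count[key], key
    }' "$log" | sort -k1,1nr -k2
}

# Usage:
#   summarize_ulog | head
```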

You also could use it to log outgoing connections to port 80 and the IRC ports:

-A OUTPUT -p tcp -m tcp --dport 80 -j ULOG

-A OUTPUT -p tcp -m tcp --dport 6666:6669 -j ULOG

Whatever you want.