I am going to configure two machines in an Active/Passive failover setup, which means that if the primary machine dies, the secondary takes over its identity and continues functioning as before. Here I describe how to install and configure DRBD with a Heartbeat cluster.
DRBD (Distributed Replicated Block Device) is a technology that replicates block devices over TCP/IP. It is used to build HA clusters and can be seen as a RAID-1 (mirroring) implementation over the network.
Image:drbd.png
Install Servers
Shared IP: 172.16.1.254
On lb1, confirm the hostname with uname -n, then add both nodes to /etc/hosts:
uname -n
vi /etc/hosts
172.16.1.247 lb1
172.16.1.248 lb2
On lb2, do the same:
uname -n
vi /etc/hosts
172.16.1.247 lb1
172.16.1.248 lb2
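Since both DRBD and Heartbeat rely on these names, it can help to confirm the entries before going further. The following is a minimal sketch, not part of the original setup; hosts_has_entry and check_cluster_hosts are hypothetical helper names, and the addresses are the ones used in this guide.

```shell
#!/bin/sh
# hosts_has_entry FILE IP NAME
# Succeeds if FILE maps IP to NAME (comments and extra aliases are tolerated).
hosts_has_entry() {
    awk -v ip="$2" -v name="$3" '
        { sub(/#.*/, "") }                      # strip trailing comments
        $1 == ip {
            for (i = 2; i <= NF; i++) if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# check_cluster_hosts FILE
# Verifies that both node entries from this guide are present in FILE.
check_cluster_hosts() {
    hosts_has_entry "$1" 172.16.1.247 lb1 &&
    hosts_has_entry "$1" 172.16.1.248 lb2
}
```

Usage on each node would be something like: check_cluster_hosts /etc/hosts || echo "hosts entries missing"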
Create Partition for DRBD Disk
On each node, use fdisk to create a type 83 linux partition.
[root@lb1 ~]# fdisk /dev/sda
The number of cylinders for this disk is set to 26564.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): p
Command (m for help): n
Command action
E extended
P primary partition (1-4)
3
Command (m for help):w
Command (m for help): p
Command (m for help):q
Repeat all of the above on lb2.
SSH Shared keys
To allow the two nodes to talk to each other without having to use a password, use SSH shared keys.
On lb1:
root@lb1 ~# ssh-keygen -t dsa
Hit enter at the prompts (don’t set a password on the key).
On lb2:
root@lb2 ~# ssh-keygen -t dsa
Hit enter at the prompts (don’t set a password on the key).
The above command generates a file called “id_dsa.pub” in ~/.ssh/, which is the public key that needs to be copied to the other node. Note that the scp commands below overwrite any existing authorized_keys2 file on the target:
root@lb1 ~# scp .ssh/id_dsa.pub root@lb2:~/.ssh/authorized_keys2
root@lb2 ~# scp .ssh/id_dsa.pub root@lb1:~/.ssh/authorized_keys2
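If the target node already has keys in authorized_keys2, copying over it with scp would discard them. A hedged alternative is to append the key only when it is not already present; append_pubkey below is a hypothetical helper written for illustration, not something the original setup uses.

```shell
#!/bin/sh
# append_pubkey PUBKEY_FILE AUTH_FILE
# Appends the public key to AUTH_FILE unless an identical line already exists,
# then restricts the file to the owner.
append_pubkey() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -F: fixed string, -x: match the whole line, -q: quiet
    if ! grep -qxF -e "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    chmod 600 "$auth"
}
```

On the receiving node this could be called as append_pubkey /tmp/id_dsa.pub ~/.ssh/authorized_keys2 after copying the public key to /tmp (an assumed location).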
DRBD Configuration
Download the latest DRBD source from http://oss.linbit.com/drbd/.
[root@lb1]# cd /usr/src/redhat/SOURCES/
[root@lb1 SOURCES]# wget http://oss.linbit.com/drbd/8.2/drbd-8.2.7.tar.gz
[root@lb1 SOURCES]# tar -xvzf drbd-8.2.7.tar.gz
[root@lb1 SOURCES]# cd drbd-8.2.7
[root@lb1 SOURCES]# make && make rpm
[root@lb1 SOURCES]# cd dist/RPMS/i386
[root@lb1 SOURCES]# ls
drbd-8.2.7-3.i386.rpm
drbd-debuginfo-8.2.7-3.i386.rpm
drbd-km-2.6.18_92.el5-8.2.7-3.i386.rpm
[root@lb1 SOURCES]# rpm -Uvh drbd-8.2.7-3.i386.rpm
[root@lb1 SOURCES]# rpm -Uvh drbd-debuginfo-8.2.7-3.i386.rpm
[root@lb1 SOURCES]# rpm -Uvh drbd-km-2.6.18_92.el5-8.2.7-3.i386.rpm
Copy these RPM files to lb2 and install them there as well. Now edit /etc/drbd.conf on lb1:
global {
minor-count 2;
dialog-refresh 5; # 5 seconds
}
resource www {
protocol C;
on lb2 {
device /dev/drbd0;
disk /dev/sda5;
address 172.16.1.248:7788;
meta-disk internal;
}
on lb1 {
device /dev/drbd0;
disk /dev/sda5;
address 172.16.1.247:7788;
meta-disk internal;
}
disk {
on-io-error detach;
}
net {
max-buffers 2048;
ko-count 4;
}
syncer {
rate 10M;
al-extents 257; # must be a prime number
}
startup {
wfc-timeout 0;
degr-wfc-timeout 120; # 2 minutes.
}
}
resource mysql {
protocol C;
on lb2 {
device /dev/drbd1;
disk /dev/sda6;
address 172.16.1.248:7789;
meta-disk internal;
}
on lb1 {
device /dev/drbd1;
disk /dev/sda6;
address 172.16.1.247:7789;
meta-disk internal;
}
disk {
on-io-error detach;
}
net {
max-buffers 2048;
ko-count 4;
}
syncer {
rate 10M;
al-extents 257; # must be a prime number
}
startup {
wfc-timeout 0;
degr-wfc-timeout 120; # 2 minutes.
}
}
Copy it to lb2:/etc
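Each "on <host>" block in drbd.conf has to carry that host's own address, so a quick cross-check against /etc/hosts can catch swapped entries. The sketch below assumes the flat layout used in this guide (one "on" line, one "address ip:port;" line per block); drbd_host_addresses and check_drbd_addresses are hypothetical helpers, not DRBD tools.

```shell
#!/bin/sh
# drbd_host_addresses CONF
# Prints "host ip" for every "on <host> { ... address ip:port; ... }" block.
drbd_host_addresses() {
    awk '
        $1 == "on" { host = $2 }
        $1 == "address" && host != "" {
            ip = $2
            sub(/:.*/, "", ip)          # drop the ":port;" suffix
            print host, ip
        }
        $1 == "}" { host = "" }         # end of the current block
    ' "$1"
}

# check_drbd_addresses CONF HOSTS
# Fails if an "on" block uses an address that HOSTS does not map to that host.
check_drbd_addresses() {
    conf=$1
    hosts=$2
    rc=0
    while read -r host ip; do
        [ -n "$host" ] || continue
        if ! awk -v ip="$ip" -v h="$host" '
                { sub(/#.*/, "") }
                $1 == ip { for (i = 2; i <= NF; i++) if ($i == h) ok = 1 }
                END { exit (ok ? 0 : 1) }
            ' "$hosts"; then
            echo "mismatch: on $host uses $ip, not mapped to $host in $hosts" >&2
            rc=1
        fi
    done <<EOF
$(drbd_host_addresses "$conf")
EOF
    return $rc
}
```

For example: check_drbd_addresses /etc/drbd.conf /etc/hosts. The syntax of the file itself can additionally be checked with drbdadm dump all, which parses the configuration and prints it back.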
Bringing DRBD Services UP
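Run the following commands on both nodes (shown here for lb1) to create the metadata, attach the backing disks and connect the resources:
[root@lb1]# drbdadm create-md www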
[root@lb1]# drbdadm create-md mysql
[root@lb1]# drbdadm attach www
[root@lb1]# drbdadm attach mysql
[root@lb1]# drbdadm connect www
[root@lb1]# drbdadm connect mysql
[root@lb1]# cat /proc/drbd
Then, on the primary node only:
[root@lb1]# drbdadm -- --overwrite-data-of-peer primary all
[root@lb1]# drbdadm connect all
On the secondary node:
[root@lb2]# drbdadm secondary all
[root@lb2]# drbdadm connect all
Now, I need to perform the initial full synchronization from lb1:
[root@lb1 ~]# drbdadm -- --overwrite-data-of-peer primary www
[root@lb1 ~]# drbdadm -- --overwrite-data-of-peer primary mysql
[root@lb1 ~]# service drbd start
[root@lb1 ~]# watch cat /proc/drbd
Then create the filesystems on the primary node:
[root@lb1 ~]# mkfs.ext3 /dev/drbd0
[root@lb1 ~]# mkfs.ext3 /dev/drbd1
This won’t work on the secondary node, because a DRBD device can only be written to on the node that holds the primary role:
[root@lb2 ~]# mkfs.ext3 /dev/drbd0
mke2fs 1.40.4 (31-Dec-2007)
mkfs.ext3: Wrong medium type while trying to determine filesystem size
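The initial synchronization can take a while, so it may be convenient to wait for it in a script instead of watching /proc/drbd by hand. This is only a sketch: it assumes the DRBD 8.x status format, where each device line carries a ds:local/peer field, and drbd_all_uptodate is a hypothetical helper name.

```shell
#!/bin/sh
# drbd_all_uptodate
# Reads /proc/drbd-style text on stdin and succeeds only if at least one
# device is listed and every device line reports ds:UpToDate/UpToDate.
drbd_all_uptodate() {
    awk '
        /^ *[0-9]+: cs:/ {
            devices++
            if ($0 ~ /ds:UpToDate\/UpToDate/) synced++
        }
        END { exit ((devices > 0 && synced == devices) ? 0 : 1) }
    '
}
```

A polling loop could then look like: until drbd_all_uptodate < /proc/drbd; do sleep 10; done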
Configuring A High Availability Cluster (Heartbeat)
Heartbeat is a clustering solution that is part of the Linux High Availability project, developed to increase the reliability, availability and serviceability of systems. The program sends heartbeat packets to the other members of the cluster. If a node fails to respond to a packet, it is assumed to be dead, all services on that server are stopped and another node takes over its services.
Enable IPVS On The Load Balancers
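First I enable IPVS on both load balancers. IPVS (IP Virtual Server) implements transport-layer load balancing inside the Linux kernel, so-called Layer-4 switching. The following commands register the IPVS modules in /etc/modules so that they are loaded at boot (/etc/modules is the Debian-style location; Red Hat-style systems typically use /etc/rc.modules instead):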
echo ip_vs_dh >> /etc/modules
echo ip_vs_ftp >> /etc/modules
echo ip_vs >> /etc/modules
echo ip_vs_lblc >> /etc/modules
echo ip_vs_lblcr >> /etc/modules
echo ip_vs_lc >> /etc/modules
echo ip_vs_nq >> /etc/modules
echo ip_vs_rr >> /etc/modules
echo ip_vs_sed >> /etc/modules
echo ip_vs_sh >> /etc/modules
echo ip_vs_wlc >> /etc/modules
echo ip_vs_wrr >> /etc/modules
Then I load the modules right away, without waiting for a reboot:
modprobe ip_vs_dh
modprobe ip_vs_ftp
modprobe ip_vs
modprobe ip_vs_lblc
modprobe ip_vs_lblcr
modprobe ip_vs_lc
modprobe ip_vs_nq
modprobe ip_vs_rr
modprobe ip_vs_sed
modprobe ip_vs_sh
modprobe ip_vs_wlc
modprobe ip_vs_wrr
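The two lists above can also be driven from a single module list, which avoids duplicate lines in /etc/modules when the commands are run more than once. This is a sketch only; register_modules and load_modules are hypothetical helpers, and the module names are simply those listed above.

```shell
#!/bin/sh
# Module names taken from the lists above.
IPVS_MODULES="ip_vs ip_vs_dh ip_vs_ftp ip_vs_lblc ip_vs_lblcr ip_vs_lc ip_vs_nq ip_vs_rr ip_vs_sed ip_vs_sh ip_vs_wlc ip_vs_wrr"

# register_modules FILE
# Appends each module name to FILE unless it is already listed there.
register_modules() {
    touch "$1"
    for m in $IPVS_MODULES; do
        grep -qx "$m" "$1" || echo "$m" >> "$1"
    done
}

# load_modules
# Loads each module with modprobe (needs root); stops at the first failure.
load_modules() {
    for m in $IPVS_MODULES; do
        modprobe "$m" || return 1
    done
}
```

Typical use would be register_modules /etc/modules followed by load_modules.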
Enable Packet Forwarding On The Load Balancers
The load balancers must be able to route traffic to the Apache nodes, so I enable packet forwarding on them. Add the following lines to /etc/sysctl.conf on lb1 and lb2:
vi /etc/sysctl.conf
# Enables packet forwarding
net.ipv4.ip_forward = 1
Then apply the setting on lb1 and lb2:
sysctl -p
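Many default sysctl.conf files already contain a net.ipv4.ip_forward line, so appending a second one leaves two conflicting entries. A hedged sketch of an in-place update follows; set_sysctl_key is a hypothetical helper that replaces an existing uncommented "key = value" line or appends one if none exists.

```shell
#!/bin/sh
# set_sysctl_key FILE KEY VALUE
# Rewrites FILE so that it contains exactly one uncommented "KEY = VALUE" line.
set_sysctl_key() {
    file=$1
    key=$2
    value=$3
    touch "$file"
    tmp=$(mktemp) || return 1
    awk -v key="$key" -v value="$value" '
        BEGIN { FS = "=" }
        {
            k = $1
            gsub(/[ \t]/, "", k)        # key name without surrounding blanks
            if (k == key) {
                if (!seen) print key " = " value
                seen = 1
                next                     # drop further duplicates
            }
            print
        }
        END { if (!seen) print key " = " value }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

For this guide the call would be set_sysctl_key /etc/sysctl.conf net.ipv4.ip_forward 1, followed by sysctl -p.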
172.16.1.254 is the shared (virtual) IP address of our Apache webserver, i.e., Apache will listen on that address.
1. Install the heartbeat packages:
heartbeat-2.08
heartbeat-pils-2.08
heartbeat-stonith-2.08
2. Now I configure Heartbeat on the two-node cluster. Three files need to be configured:
authkeys
ha.cf
haresources
3. Copy the sample versions of these files to the /etc/ha.d directory (adjust the version in the path to match the installed heartbeat package):
cp /usr/share/doc/heartbeat-2.1.2/authkeys /etc/ha.d/
cp /usr/share/doc/heartbeat-2.1.2/ha.cf /etc/ha.d/
cp /usr/share/doc/heartbeat-2.1.2/haresources /etc/ha.d/
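4. Start with the authkeys file. The listing below uses method 1 (crc), which only detects corrupted packets and provides no authentication; on a network that is not fully trusted, use method 2 (sha1) with a shared secret instead.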
vi /etc/ha.d/authkeys
auth 1
1 crc
Change the permissions of the authkeys file so that only root can read it (Heartbeat refuses to start otherwise):
chmod 600 /etc/ha.d/authkeys
5. Next, edit the second file, ha.cf, with vi:
vi /etc/ha.d/ha.cf
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
initdead 120
bcast eth0
udpport 694
auto_failback off
node lb1
node lb2
ping 172.16.1.1
apiauth ipfail gid=haclient uid=hacluster
Note: the node names lb1 and lb2 must match the output of the following command on each node:
uname -n
6. The final piece of work in the configuration is to edit the haresources file. This file lists the resources that I want to make highly available. In my case these are the shared IP address, the two DRBD-backed filesystems, MySQL (mysqld) and the webserver (httpd):
vi /etc/ha.d/haresources
Add the following line:
lb1 IPaddr::172.16.1.254 \
drbddisk::www Filesystem::/dev/drbd0::/var/www/html::ext3 \
drbddisk::mysql Filesystem::/dev/drbd1::/var/lib/mysql::ext3 mysqld httpd
7. Edit /etc/ha.d/resource.d/drbddisk and set:
DEFAULTFILE="/etc/drbd.conf"
Copy the httpd and mysqld startup scripts into the Heartbeat resource directory:
[root@lb1 ~]# cp /etc/rc.d/init.d/httpd /etc/ha.d/resource.d/
[root@lb1 ~]# cp /etc/rc.d/init.d/mysqld /etc/ha.d/resource.d/
8. Copy the /etc/ha.d/ directory from lb1 to lb2:
scp -r /etc/ha.d/ root@lb2:/etc/
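Step 4 mentions the sha1 method, which needs a shared secret in authkeys. The following is a minimal sketch of generating such a file, assuming head, od and tr are available; write_authkeys is a hypothetical helper, and the 16 random bytes are an arbitrary choice. It should be run before /etc/ha.d is copied to lb2, so that both nodes end up with the same key.

```shell
#!/bin/sh
# write_authkeys FILE
# Writes a Heartbeat authkeys file that selects method 2 (sha1) with a
# random 32-hex-digit secret, readable by the owner only.
write_authkeys() {
    key=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
    (
        umask 077                       # never create the file world-readable
        printf 'auth 2\n2 sha1 %s\n' "$key" > "$1"
    )
    chmod 600 "$1"
}
```

For example: write_authkeys /etc/ha.d/authkeys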
Integration & Testing
1. Stop all services:
[root@lb1 ~]# service drbd stop
[root@lb1 ~]# service httpd stop
[root@lb1 ~]# service mysqld stop
[root@lb1 ~]# service heartbeat stop
[root@lb2 ~]# service drbd stop
[root@lb2 ~]# service httpd stop
[root@lb2 ~]# service mysqld stop
[root@lb2 ~]# service heartbeat stop
2. Configure which services start automatically at boot (DRBD and Heartbeat on; httpd and mysqld off, because Heartbeat starts them):
[root@lb1 ~]# chkconfig drbd on
[root@lb2 ~]# chkconfig drbd on
[root@lb1 ~]# chkconfig httpd off
[root@lb2 ~]# chkconfig httpd off
[root@lb1 ~]# chkconfig mysqld off
[root@lb2 ~]# chkconfig mysqld off
[root@lb1 ~]# chkconfig heartbeat on
[root@lb2 ~]# chkconfig heartbeat on
3. Start drbd on both machines
[root@lb1 ~]# service drbd start
[root@lb2 ~]# service drbd start
4. Start heartbeat on both machines
[root@lb1 ~]# service heartbeat start
[root@lb2 ~]# service heartbeat start
Fail-over Testing
Stop the heartbeat service on lb1:
[root@lb1 ~]# service heartbeat stop
On lb1, the httpd and MySQL services are now stopped and DRBD has become secondary:
[root@lb1 ~]# service httpd status
httpd is stopped
[root@lb1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.0.3 (api:86/proto:86)
SVN Revision: 2881 build by root@lb1, 2007-06-04 10:11:24
0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
ns:55564 nr:984 dw:56548 dr:14845 al:5 bm:529 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/127 hits:13886 misses:5 starving:0 dirty:0 changed:5
On lb2, httpd is now running and DRBD has become primary:
[root@lb2 ~]# service httpd status
httpd (pid 9318) is running...
[root@lb2 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.0.3 (api:86/proto:86)
SVN Revision: 2881 build by root@lb2, 2007-06-04 10:49:29
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:960 nr:55564 dw:56524 dr:8638 al:1 bm:12 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/127 hits:239 misses:1 starving:0 dirty:0 changed:1
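The role check can be scripted by pulling the st: field out of the status text. The sketch below assumes the status format shown above; as far as I know, later DRBD releases print ro: instead of st:, so both prefixes are accepted. drbd_roles is a hypothetical helper name.

```shell
#!/bin/sh
# drbd_roles
# Reads /proc/drbd-style text on stdin and prints one line per device:
#   <minor> <local role> <peer role>
drbd_roles() {
    awk '
        /^ *[0-9]+: cs:/ {
            minor = $1
            sub(/:/, "", minor)
            for (i = 2; i <= NF; i++) {
                # st: in the output above; ro: is assumed for later releases
                if ($i ~ /^(st|ro):/) {
                    split(substr($i, 4), r, "/")
                    print minor, r[1], r[2]
                }
            }
        }
    '
}
```

After the failover test, drbd_roles < /proc/drbd on lb2 should list Primary as the local role for each device.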
Useful commands
To resynchronize after a split brain, on the server you have declared as master:
drbdadm connect all
On the server you have declared as slave:
drbdadm -- --discard-my-data connect all
This discards the slave's changes, reconnects the nodes and starts the sync from the master to the slave.
A device can also be configured directly with the low-level drbdsetup tool. On one node:
$ drbdsetup --on-io-error detach /dev/drbd0 disk /dev/hda5
$ drbdsetup /dev/drbd0 net 10.0.0.2 10.0.0.1 C
On the other node (ubuntu01.domain.de):
$ drbdsetup --on-io-error detach /dev/drbd0 disk /dev/hda5
$ drbdsetup /dev/drbd0 net 10.0.0.1 10.0.0.2 C
$ drbdsetup /dev/drbd0 primary
If drbdadm create-md fails with this error:
Device size would be truncated, which
would corrupt data and result in
‘access beyond end of device’ errors.
You need to either
* use external meta data (recommended)
* shrink that filesystem first
* zero out the device (destroy the filesystem)
Operation refused.
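Then, if the data on the partition is not needed, zero out the start of the device (this destroys the filesystem) and recreate the metadata. Replace /dev/sdXYZ with the backing partition, $r with the resource name and /dev/drbdY with the DRBD device:
dd if=/dev/zero bs=1M count=1 of=/dev/sdXYZ; sync
drbdadm create-md $r
drbdadm -- -o primary $r
mkfs /dev/drbdY
How to verify
# drbdadm verify all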
DRBD split-brain problems
As part of a high-availability Linux firewall I have set up, I have used DRBD to share a filesystem, to ensure that we don’t lose too much information in the event of a failure. However, the primary server of the system has been pushed into production while the secondary system is still in the process of being moved around and configured.
As a result, the secondary system lost its network connection for longer than the timeout period, which led to the primary server thinking that it was the only node (StandAlone), and DRBD on the secondary wouldn’t come up: a form of split brain, from what I’ve read.
Before continuing, make sure that you have a look at the documentation. I won’t be held responsible if you completely fry your DRBD setup or (even worse) overwrite your primary with the wrong data.
Apparently, this is a known problem with 0.7.x and has been fixed in 8.0.x, the successor to 0.7 (feel free to correct me). In my instance I am using 0.7.21. On the primary server, I get:
primary:~# cat /proc/drbd
version: 0.7.21 (api:79/proto:74)
SVN Revision: 2326 build by root@vajra, 2007-07-09 16:39:51
0: cs:StandAlone st:Primary/Unknown ld:Consistent
ns:534782720 nr:6220 dw:534944920 dr:291509157 al:33812 bm:47 lo:0 pe:0 ua:0 ap:0
and on the secondary server, I get:
secondary:~# cat /proc/drbd
version: 0.7.21 (api:79/proto:74)
SVN Revision: 2326 build by root@vajraSecondary, 2007-07-12 09:31:09
0: cs:WFConnection st:Secondary/Unknown ld:Consistent
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
I tried to reconfigure this setup by running
primary:~# drbdadm primary all
on the primary server and
secondary:~# drbdadm secondary all
on the secondary server in order to tell the system which one was the primary and which was the secondary, then running
primary:~# drbdadm connect all
in order to specify that they should reconnect. When I did this, I got an error that read like this:
Jan 25 11:16:18 secondary kernel: drbd0: Secondary/Unknown --> Secondary/Primary
Jan 25 11:16:18 secondary kernel: drbd0: sock was shut down by peer
Jan 25 11:16:18 secondary kernel: drbd0: drbd0_receiver [4175]: cstate BrokenPipe --> BrokenPipe
Jan 25 11:16:18 secondary kernel: drbd0: short read expecting header on sock: r=0
Jan 25 11:16:18 secondary kernel: drbd0: worker terminated
Jan 25 11:16:18 secondary kernel: drbd0: drbd0_receiver [4175]: cstate BrokenPipe --> Unconnected
Jan 25 11:16:18 secondary kernel: drbd0: Connection lost.
Jan 25 11:16:18 secondary kernel: drbd0: drbd0_receiver [4175]: cstate Unconnected --> WFConnection
In the end, I decided that the only way I was going to get this back to a reasonable state was to flush the data on the secondary server and resync everything from the primary server. The correct command to do this is invalidate or invalidate-remote (depending on which machine you want to invalidate). Make sure that you run this on the correct server! When I tried to run this command on the secondary server, I got the following cryptic message:
secondary:~# drbdadm invalidate all
can not open /dev/drbd0: No such file or directory
Command 'drbdsetup /dev/drbd0 invalidate' terminated with exit code 20
drbdsetup exited with code 20
secondary:~#
After a bit of hunting, I found the solution on an archived mailing list ( http://archives.free.net.ph/message/20060619.131041.fd07cb48.en.html ). I was able to resync the two filesystems with the following commands on the _secondary_ server (there is another slightly more destructive method in the thread):
secondary:~# /etc/init.d/drbd stop
Now, reconnect DRBD on the primary server:
primary:~# drbdadm connect all
primary:~# cat /proc/drbd
version: 0.7.21 (api:79/proto:74)
SVN Revision: 2326 build by root@vajra, 2007-07-09 16:39:51
0: cs:WFConnection st:Primary/Unknown ld:Consistent
ns:0 nr:0 dw:537240380 dr:291534637 al:34125 bm:360 lo:0 pe:0 ua:0 ap:0
This puts the primary server back into a state where it’s waiting for a connection. Then, back on the secondary server:
secondary:~# rmmod drbd
ERROR: Module drbd does not exist in /proc/modules
secondary:~# modprobe drbd
secondary:~# drbdadm attach r0
secondary:~# drbdadm invalidate r0
secondary:~# drbdadm adjust r0
After a couple of minutes, you should be able to see the following output if you cat /proc/drbd on both the servers:
primary:~# cat /proc/drbd
version: 0.7.21 (api:79/proto:74)
SVN Revision: 2326 build by root@vajra, 2007-07-09 16:39:51
0: cs:SyncSource st:Primary/Secondary ld:Consistent
ns:24768676 nr:0 dw:537576684 dr:315968437 al:34170 bm:1852 lo:0 pe:0 ua:0 ap:0
[=======>............] sync'ed: 37.5% (39870/63773)M
finish: 1:04:55 speed: 10,476 (9,448) K/sec
primary:~#
secondary:~# cat /proc/drbd
version: 0.7.21 (api:79/proto:74)
SVN Revision: 2326 build by root@vajraSecondary, 2007-07-12 09:31:09
0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
ns:0 nr:24778832 dw:24778832 dr:0 al:0 bm:5479 lo:6 pe:206 ua:6 ap:0
[=======>............] sync'ed: 37.5% (39860/63773)M
finish: 1:04:59 speed: 10,412 (9,448) K/sec
1: cs:Unconfigured
secondary:~#
Once that’s done, you should get the following on each of the servers:
primary:~# cat /proc/drbd
version: 0.7.21 (api:79/proto:74)
SVN Revision: 2326 build by root@vajra, 2007-07-09 16:39:51
0: cs:Connected st:Primary/Secondary ld:Consistent
secondary:~# cat /proc/drbd
version: 0.7.21 (api:79/proto:74)
SVN Revision: 2326 build by root@vajraSecondary, 2007-07-12 09:31:09
0: cs:Connected st:Secondary/Primary ld:Consistent
Now is _not_ the time to work out which is the SyncTarget and which is the SyncSource! Hopefully this fixes the problem for someone else. It worked for me, but individual mileage may vary.