The CLN Failover Kit HOWTO

Ed L. Cashin

2009-04-22

Table of Contents

Introduction
High Availability Concepts
Active/Passive
Cluster IP
STONITH
Single Point of Failure
Installing Software
Heartbeat
Supplemental Software
Hardware Installation
Configuration
Preparing the WTI IPS-800
Preparing the Storage
Removing Heartbeat-Managed Services from System Initialization
Cluster Member Authentication
Heartbeat Configuration
High Availability Resources
Syslog Messages
NFS Exports
Finishing Up
Initial Testing
Testing STONITH
Testing Serial Connection
Starting Heartbeat
Accessing the Storage
Manually Failing Over
Pulling the Plug
Further Steps
Resources

Introduction

This document is intended to demonstrate the use of the Coraid CLN Failover Kit with two Coraid Linux NAS units to improve storage service availability. After digesting the principles demonstrated by the simple examples here, the reader will have an outline which can be filled out by reading the documentation for the software they choose to use. With the acquired expertise, the reader should be able to create highly available services as desired.

Administering a highly available system using open source software is safe when the system administrator understands the technology. The Coraid support team is happy to serve as a resource as you acquire this understanding, especially if you are going beyond the scope of this HOWTO.

The reader should keep in mind that high availability is different from archival backup. A highly available system minimizes downtime, but an archival backup allows data to be restored after a catastrophic event. The two are complementary. Data on a highly available system should be backed up regularly.

The CLN is a general-purpose Linux system tuned toward exporting AoE storage via NFS and CIFS. This document builds on and assumes familiarity with the content of the CLN-HOWTO. Please do not assume that everything covered in the CLN HOWTO is safe to use on a heartbeat cluster in the same way as on a single host.

The Failover Kit allows two CLN units to provide storage services with higher availability than a single unit could provide. With one CLN actively serving clients, the other passive CLN can monitor the active CLN's "heartbeat." If the heartbeat stops, it indicates that the active CLN has stopped working, and the passive CLN takes over after turning off the dead CLN. Clients should not notice more than a brief pause in ongoing service.

In this HOWTO, two CLN units, "pete" and "mclaws," form a simple "Active/Passive" two-node cluster. Together, pete and mclaws provide a more reliable NFS service than either one could alone.

The two nodes share one AoE device with shelf address zero and slot address zero. An XFS filesystem resides on this e0.0 device and is exported via NFS.

This example is deliberately simple for two reasons. First, a short HOWTO helps the important principles stand out. Second, simple systems are usually more stable than complex ones, and most readers will have availability as a priority.

For instance, by forgoing the flexibility provided by LVM, the admin can omit a large amount of software, and all software (not just LVM) sometimes has bugs. In general, reducing the number of components in a system increases its availability. So although the software stack in the example below is minimal, it is not unrealistic.

High Availability Concepts

Active/Passive

A simple and reliable way to achieve high availability is to let one computer perform a task while another computer stands ready to step in should the first one fail. This situation is called "Active/Passive," because only one of the machines is performing the service.

It is possible, but significantly more difficult, to achieve high availability by instead having both machines share the same task, so that if one fails the other simply keeps going. This "Active/Active" arrangement requires more complex software, such as cluster filesystems.

The Active/Passive application of the Heartbeat software package is a solid, well-tested way of achieving high availability. It does not rely on rapidly changing or complex software, and it is relatively easy to understand and configure.

This document describes the use of Heartbeat to configure an Active/Passive cluster of two CLNs.

Cluster IP

When two machines support one another in offering a single service, they are acting as a cluster. Each machine is distinct and has its own IP. In our example, there are names for all the IP addresses.

pete:~# host pete
pete.coraid.com has address 205.185.197.218
pete:~# host mclaws
mclaws.coraid.com has address 205.185.197.217

The computers in a cluster are called nodes. The clients using the service provided by the cluster don't care which node is the "real" server. They connect to a special IP address that is used for the cluster.

pete:~# host clusterb
clusterb.coraid.com has address 205.185.197.220

In addition to its own IP address, whichever node has the active role will assume the cluster IP on its front side network interface.

STONITH

When two cluster nodes are mounting a traditional (single host) filesystem from a shared block device, they must be sure that only a single host mounts the filesystem at any given time.

Cluster filesystems like GFS depend on programs that coordinate the different nodes, ensuring that access to the shared block storage proceeds in a consistent way. For GFS, the Distributed Lock Manager and the Cluster Manager software perform this coordination.

Traditional filesystems lack such cluster management because they were designed with the assumption that only one host will be able to access the block storage device, like a hard disk in a computer case.

Two CLNs can safely share a traditional filesystem only as long as they don't both attempt to use the filesystem at the same time.

Imagine that pete is the active node and mclaws notices that pete's heartbeat stops. Now mclaws wants to take over, but how can mclaws be sure that pete no longer has the XFS mounted? Well, mclaws can turn pete off. With pete's power disconnected, mclaws can be certain that it's the only node accessing the shared AoE storage.

So to be sure that the other machine can't be accessing the shared block storage device, mclaws can "Shoot The Other Node In The Head." This technique is known by its acronym, STONITH.

Single Point of Failure

The objective is a highly available service. A whole system of parts makes the service available. To increase availability, we can attempt to design a system where any single part can fail without interrupting the service itself.

It's important to concentrate on points of failure that cause a service interruption. Trying to make every part redundant is a wild goose chase.

Installing Software

Heartbeat

Heartbeat is a popular, well-tested high-availability software package that can be installed with APT.

pete:~# apt-get update
pete:~# apt-get install heartbeat-2

It is normal to see an error at the end like the one below, because you haven't yet configured heartbeat.

Setting up heartbeat-2 (2.0.3-2) ...
Heartbeat not configured: /etc/ha.d/ha.cf not found.
 Heartbeat failure [rc=1]. Failed.

If you see an error like the one in the example below, please read the subsection immediately following this one. It will help you to use the package from the stable distribution.

pete:~# apt-get install heartbeat-2
Reading package lists... Done
Building dependency tree... Done
Package heartbeat-2 is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
E: Package heartbeat-2 has no installation candidate

There are CLN-specific scripts that work with Heartbeat to ensure that failover works smoothly. They're installed with the command below.

pete:~# apt-get install coraid-ft-scripts

Temporarily borrowing heartbeat-2 from debian stable

Currently the heartbeat-2 package is maintained in the "stable" debian distribution but not in the "testing" distribution. This situation is new and is expected to change. Coraid Linux is based on the testing distribution, but you can selectively use packages from the stable distribution.

When heartbeat-2 does appear in the testing distribution, the procedure below will not be necessary. You can check the debian website to see whether heartbeat-2 has already appeared in testing. After going to the URL below, click "Search package directories" and select "any" distribution from the drop-down list before searching for "heartbeat-2".

http://www.debian.org/distrib/packages

While heartbeat-2 remains only in stable, you can install it using the following method. First, make sure that your /etc/apt/apt.conf file specifies that testing is the default distribution.

pete:~# grep -i default /etc/apt/apt.conf
APT::Default-Release "testing";

If a default line does not exist, simply add it to that file (creating it if necessary).
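
For example, assuming apt.conf does not already contain a Default-Release setting, the line can be appended with a single command.

pete:~# echo 'APT::Default-Release "testing";' >> /etc/apt/apt.conf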

Next, tell APT where to find a stable distribution repository.

pete:~# grep stable /etc/apt/sources.list
deb http://ftp.us.debian.org/debian/ stable main

You will probably have to add this line to your sources.list file, because the CLN ships without it.
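
Assuming the line is missing, it can be appended like this.

pete:~# echo 'deb http://ftp.us.debian.org/debian/ stable main' >> /etc/apt/sources.list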

Update your APT repository information next.

pete:~# apt-get update        # repeat if needed

If you see a message about key B5D0C804ADB11277 not being available, it's because you don't have the new stable distribution key installed yet. It is easy to upgrade your debian-archive-keyring package, though, so that the key is present.

pete:~# apt-get install debian-archive-keyring
pete:~# apt-get update

Install the heartbeat-2 from the stable distribution by specifying it with the "-t" option as shown in the command below.

pete:~# apt-get -t stable install heartbeat-2

You can then free some local storage by using "apt-get clean".

Supplemental Software

The kermit program is a terminal emulator that can speak to the WTI IPS-800 over a null-modem cable. We'll use it to set the IP address on the IPS-800.

pete:~# apt-get install coraid-kermit

Install telnet on the CLNs in order to configure the WTI IPS-800 Internet Power Switch.

pete:~# apt-get install telnet

In general, you can get rid of old .deb files to free up some space after installing or upgrading.

pete:~# apt-get clean

Hardware Installation

Install the WTI IPS-800 in your rack. After cleanly shutting down the CLNs with shutdown -h now, plug the power cable of each CLN into the IPS-800 so that each is on a different power circuit.

For the example, we plug pete into A1 and mclaws into B5.

The IPS-800's RJ-45 ethernet port is plugged into a switch on the CLNs' "front" network. This connection is not a single point of failure: if it fails, the machines cannot turn one another off, but something else would also have to fail before service was interrupted.

We'll connect the serial ports of the two CLNs together later, after using the null-modem cable to first configure the WTI IPS-800 in the next section.

Configuration

Preparing the WTI IPS-800

Assign an IP

In the steps below, we give the IPS-800 an IP address so that the CLNs can connect to it over your network. The initial configuration is performed over a serial connection.

Using the null-modem cable supplied in the Failover Kit, connect the 9-pin serial port of one CLN to the 9-pin serial port of the IPS-800.

Using kermit, connect to the IPS-800 as follows.

pete:~# kermit
C-Kermit 8.0.211, 10 Apr 2004, for Linux
 Copyright (C) 1985, 2004,
  Trustees of Columbia University in the City of New York.
Type ? or HELP for help.
(/root/) C-Kermit>set line /dev/ttyS0
(/root/) C-Kermit>set speed 9600
/dev/ttyS0, 9600 bps
(/root/) C-Kermit>set carrier-watch off
(/root/) C-Kermit>connect
Connecting to /dev/ttyS0, speed 9600
 Escape character: Ctrl-\ (ASCII 28, FS): enabled
Type the escape character followed by C to get back,
or followed by ? to see other options.
----------------------------------------------------

Hit the enter key, and you'll see the IPS-800 menu.

Internet Power Switch v1.41h    Site ID: (undefined)
Plug | Name             | Password    | Status | Boot/Seq. Delay | Default |
-----+------------------+-------------+--------+-----------------+---------+
 1   | (undefined)      | (undefined) |   ON   |     0.5 Secs    |   ON    |
 2   | (undefined)      | (undefined) |   ON   |     0.5 Secs    |   ON    |
 3   | (undefined)      | (undefined) |   ON   |     0.5 Secs    |   ON    |
 4   | (undefined)      | (undefined) |   ON   |     0.5 Secs    |   ON    |
 5   | (undefined)      | (undefined) |   ON   |     0.5 Secs    |   ON    |
 6   | (undefined)      | (undefined) |   ON   |     0.5 Secs    |   ON    |
 7   | (undefined)      | (undefined) |   ON   |     0.5 Secs    |   ON    |
 8   | (undefined)      | (undefined) |   ON   |     0.5 Secs    |   ON    |
-----+------------------+-------------+--------+-----------------+---------+
"/H" for help.
IPS>

To configure the IP address,

  • enter "/n" to change network settings,
  • enter "1" (the number one) to change the IP address, and
  • enter your chosen IP address in dotted decimal notation, e.g., "205.185.197.219".

Press the escape key to return to the IPS> prompt. You can disconnect by holding down the control key and pressing the backslash key to get kermit's attention, and then hitting the "c" key. That key sequence returns you to the kermit prompt, at which point you can quit kermit.

(Back at pete.coraid.com)
----------------------------------------------------
(/root/) C-Kermit>quit
Closing /dev/ttyS0...OK
pete:~#

The rest of the IPS-800 configuration may be performed over the network, now that it has an IP address.

Setting a Password on the IPS-800

Heartbeat's STONITH plugin for the WTI IPS-800 assumes that a password is required to use the device.

To set the password, telnet to the IP you assigned to the IPS-800 in the previous section.

pete:~# host benjamin
benjamin.coraid.com has address 205.185.197.219
pete:~# telnet benjamin

At the IPS> prompt, use "/g" to set the general parameters. Enter the number "1" to set the password. Type in a password that you won't forget or lose, and hit the enter key. I'm using "changeme" for this example.

Only one telnet session at a time is allowed on the IPS-800, so you'll have to log out in order to try out your new password.

Use "/x" to exit, and be sure to select "1" (the number) to save your changes.

Now that you're sure that you can get to the IPS-800 over your IP network, you can use the null-modem cable to connect the serial ports of your two CLNs. This cable will carry the actual heartbeat messages. On each CLN, connect the serial port that is right next to the VGA video port.

Setting Outlet Names on the IPS-800

Heartbeat's STONITH plugin uses host names to control the CLN power outlets.

Telnet to the IPS-800 and set a name for plug number 1. Enter "/p1", and then enter "1". Type in the name of the host ("pete" in our example) connected to plug 1.

Now set the name for your other CLN on plug 5 after entering "/p5".

Save with "/e" and exit with "/x".

Preparing the Storage

Before configuring Heartbeat, the storage itself should be configured so that it can be started up smoothly. For this example, we create an XFS on an AoE device that both mclaws and pete can use.

pete:~# aoe-stat | fgrep e0.0
      e0.0      3298.534GB   eth2 up
pete:~# modprobe xfs
pete:~# mkfs -t xfs /dev/etherd/e0.0
meta-data=/dev/etherd/e0.0       isize=256    agcount=32, agsize=25165824 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=805306368, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
pete:~#

Now mclaws and pete can both see this new XFS on e0.0, but only one of them should have it mounted at any given time. The Heartbeat resource scripts will take care of that.
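
As a quick sanity check, confirm that mclaws also sees the AoE device. (The size and interface in the output will reflect your own hardware.)

mclaws:~# aoe-stat | fgrep e0.0
      e0.0      3298.534GB   eth2 up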

This step only needs to be performed on one host (because there's only one AoE device that they're both using). The rest of the configuration steps must be performed on both hosts.

Removing Heartbeat-Managed Services from System Initialization

It's important to understand that when Heartbeat is responsible for starting and stopping services, those services must not also be started or stopped by the normal system initialization when the rest of the system comes up or goes down.

init.d

The general system initialization scripts reside in /etc/init.d. They are run based on the presence of symbolic links ("symlinks") in the directories that are named after runlevels.

pete:~# ls -d /etc/rc*.d
/etc/rc0.d  /etc/rc2.d  /etc/rc4.d  /etc/rc6.d
/etc/rc1.d  /etc/rc3.d  /etc/rc5.d  /etc/rcS.d

The symlinks would be difficult to manage without tools, and the update-rc.d tool makes it easy to remove all the symlinks for NFS. That's necessary to make sure that only Heartbeat controls NFS.

pete:~# update-rc.d -f nfs-common remove
 Removing any system startup links for /etc/init.d/nfs-common ...
   /etc/rc0.d/K79nfs-common
   /etc/rc1.d/K79nfs-common
   /etc/rc2.d/S21nfs-common
   /etc/rc3.d/S21nfs-common
   /etc/rc4.d/S21nfs-common
   /etc/rc5.d/S21nfs-common
   /etc/rc6.d/K79nfs-common
pete:~# update-rc.d -f nfs-kernel-server remove
 Removing any system startup links for /etc/init.d/nfs-kernel-server ...
   /etc/rc0.d/K80nfs-kernel-server
   /etc/rc1.d/K80nfs-kernel-server
   /etc/rc2.d/S20nfs-kernel-server
   /etc/rc3.d/S20nfs-kernel-server
   /etc/rc4.d/S20nfs-kernel-server
   /etc/rc5.d/S20nfs-kernel-server
   /etc/rc6.d/K80nfs-kernel-server

In the future, after updating your nfs-common and nfs-kernel-server packages, make a habit of running these commands to make sure that the symlinks haven't been recreated.

On mclaws the system-wide symlinks are removed in the same way.
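
A quick way to verify that no NFS startup links remain is to list the runlevel directories; the command below should produce no output.

pete:~# ls /etc/rc?.d | grep nfs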

/etc/aoe

Any Software RAID that is managed by Heartbeat should not be listed in /etc/aoe/md.conf, nor should any filesystem managed by Heartbeat be listed in /etc/aoe/fs.conf. (This HOWTO does not cover Linux Software RAID, but it should be clear from the last statement, and from a basic understanding of Heartbeat, that Heartbeat itself, and not the rest of the system, must control all of the layers above a resource shared by both nodes in an HA cluster.)
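
As a check, neither file should mention the shared device. The grep below should print nothing (the 2>/dev/null simply hides the error message if a file does not exist).

pete:~# grep e0.0 /etc/aoe/md.conf /etc/aoe/fs.conf 2>/dev/null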

Cluster Member Authentication

The heartbeat itself consists of messages that go between cluster members. To help each node identify legitimate cluster members, some cryptography is used. You can pick a secret that will be shared by the members of the cluster and put it in the /etc/ha.d/authkeys file.

pete:~# touch /etc/ha.d/authkeys
pete:~# chmod 600 /etc/ha.d/authkeys

Edit the file next. In the example contents below, the secret is "ABetterDayToStoreDataBetterIsToday".

pete:~# cat /etc/ha.d/authkeys
auth 1
1 sha1 ABetterDayToStoreDataBetterIsToday

This configuration file must be the same on both hosts.

Heartbeat Configuration

The central configuration file for Heartbeat is /etc/ha.d/ha.cf. The configuration file on pete is short, but Heartbeat comes with a long example configuration file with comments explaining each of its parts. You can read it with zless, which runs the less pager on compressed files. (Hit "q" to quit zless. Use arrow keys to navigate the text.)

pete:~# zless /usr/share/doc/heartbeat-2/ha.cf.gz

Here is pete's configuration file. (The remote power switch for STONITH is connected to the network and is reachable at benjamin.coraid.com.)

pete:~# cat /etc/ha.d/ha.cf
keepalive 1
deadtime 10
warntime 5
baud 9600
serial /dev/ttyS0
auto_failback off
stonith_host pete wti_nps benjamin.coraid.com changeme
stonith_host mclaws wti_nps benjamin.coraid.com changeme
node pete
node mclaws
use_logd yes

The ha.cf on mclaws is the same.

Ethernet Heartbeat

A pair of CLNs with Intel dual-port PCI-X NICs can use the onboard network interface eth1 for heartbeat over ethernet instead of using a null modem cable for the heartbeat messages. Such a configuration would have a ha.cf like the one below instead of the one shown above.

pete:~# cat /etc/ha.d/ha.cf
keepalive 1
deadtime 10
warntime 5
ucast eth1 192.168.1.2
auto_failback off
stonith_host pete wti_nps benjamin.coraid.com changeme
stonith_host mclaws wti_nps benjamin.coraid.com changeme
node pete
node mclaws
use_logd yes

In addition, the eth1 interface would require IP configuration, and the ucast line in mclaws's ha.cf would name pete's address, 192.168.1.1, instead. (On these CLNs, eth2 is the "front" network and eth3 is the "back" network, leaving eth1 free for handling heartbeat messages.)

pete:~# sed -n '/eth1/,$p' /etc/network/interfaces
auto eth1
iface eth1 inet static
        address 192.168.1.1
        netmask 255.255.255.0
        network 192.168.1.0
        broadcast 192.168.1.255
mclaws:~# sed -n '/eth1/,$p' /etc/network/interfaces
auto eth1
iface eth1 inet static
        address 192.168.1.2
        netmask 255.255.255.0
        network 192.168.1.0
        broadcast 192.168.1.255
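
After adding these stanzas, the dedicated interface can be brought up and the link verified. Here is a quick sketch from pete's side, using the addresses above.

pete:~# ifup eth1
pete:~# ping -c 2 192.168.1.2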

High Availability Resources

When any machine boots, it performs a sequence of tasks in order to get to a usable state. Assuming an active role in the cluster is much like booting, and the /etc/ha.d/haresources file defines the order of tasks that must be performed on takeover.

Make sure that the lines ending with backslashes don't really end with spaces or tabs.

pete:~# cat /etc/ha.d/haresources
pete IPaddr::205.185.197.220 \
        Filesystem::/dev/etherd/e0.0::/mnt/e0.0::xfs \
        killnfsd \
        nfs-common \
        nfs-kernel-server

This example file is the same on pete and mclaws. It says that pete is the default active server. It lists scripts (found in /etc/ha.d/resource.d or, for standard init scripts such as nfs-kernel-server, in /etc/init.d) to run forwards when assuming the active role or backwards when giving it up.
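
You can see which resource scripts are available by listing that directory; the exact contents depend on which packages are installed.

pete:~# ls /etc/ha.d/resource.d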

Notice that everything is under the control of heartbeat, from the NFS export down to the AoE target. It is beyond the scope of this HOWTO, but if you were using md, LVM, or anything else, these additional layers would, of course, need to be represented in the above configuration.

Syslog Messages

If you haven't already configured syslog on each CLN, now is a good time to do that. It's explained in the CLN-HOWTO. By sending syslog messages to a remote host, you can more easily monitor both machines.
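
As a reminder of the remote-logging setup covered in the CLN-HOWTO, a line like the one below in /etc/syslog.conf forwards all messages to a central log host. (The name "loghost" is only a placeholder; use your own log server.)

pete:~# grep '@' /etc/syslog.conf
*.*                             @loghost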

The messages from Heartbeat are copious and make for dry reading, but by reading the logged messages you can get an excellent understanding of how the failover process is working. You can also more easily identify and correct problems.

After changing /etc/syslog.conf on the CLN units, you can restart syslog services to make sure the new configuration takes effect.

pete:~# /etc/init.d/sysklogd restart
Restarting system log daemon: syslogd.
pete:~# /etc/init.d/klogd restart
Restarting kernel log daemon: klogd.

NFS Exports

Configure NFS to export the XFS on both hosts.

Create a mount point on each host for the XFS. Don't mount the XFS, though. That will be done by Heartbeat.

pete:~# mkdir /mnt/e0.0

Tell the NFS server to export the /mnt/e0.0 filesystem by adding a line to /etc/exports.

pete:~# cat /etc/exports
# /etc/exports: the access control list for filesystems which may be exported
#               to NFS clients.  See exports(5).
/mnt/e0.0 205.185.197.0/24(rw,sync,no_root_squash)

The exports manpage and the CLN-HOWTO have more information about the /etc/exports file.

Finishing Up

Finally, double check to make sure that the configuration is the same on both CLNs.
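
One simple way to compare is to checksum the relevant files on each host; run the same command on mclaws and verify that the hashes match.

pete:~# md5sum /etc/ha.d/ha.cf /etc/ha.d/authkeys /etc/ha.d/haresources /etc/exports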

Initial Testing

It's important to be familiar with the workings of Heartbeat. Although Heartbeat is simple and reliable, it can occasionally be confusing. Being confused during testing isn't always comfortable, but it's always more comfortable than being confused when the system is in production.

Testing STONITH

From each node, you should be able to turn off the other node using the stonith tool that's part of heartbeat-2. Before performing this test, you should consider temporarily removing the symbolic links that cause heartbeat to come up at boot.

mclaws:~# update-rc.d -f heartbeat remove
 Removing any system startup links for /etc/init.d/heartbeat ...
   /etc/rc0.d/K05heartbeat
   /etc/rc1.d/K05heartbeat
   /etc/rc2.d/S75heartbeat
   /etc/rc3.d/S75heartbeat
   /etc/rc4.d/S75heartbeat
   /etc/rc5.d/S75heartbeat
   /etc/rc6.d/K05heartbeat

To test it, first sync filesystem data on the node you're going to power off.

mclaws:~# sync

Next turn off the target node from the other node. In this example, we're turning off mclaws from pete. Note that this will suddenly kill the power that's going to mclaws.

pete:~# stonith -t wti_nps -p "benjamin.coraid.com changeme" mclaws
** INFO: Successful login to WTI Network Power Switch.
stonith: wti_nps device OK.
connect() failed: Connection refused
** INFO: Successful login to WTI Network Power Switch.
** INFO: Host is being rebooted: mclaws
** INFO: Power restored to host: mclaws

The "Connection refused" message doesn't matter, because mclaws does get powered down and powered back up. CLNs ship with the BIOS set so that power is restored to the last status (off or on) after power failure.

Testing Serial Connection

Before starting the heartbeat service on the CLNs, make sure that the serial connection between the two CLNs is working properly. The cat and echo commands are sufficient for a quick check.

First run cat on mclaws to listen for messages from pete.

mclaws:~# cat /dev/ttyS0

Now echo a message into the serial port device to send it to mclaws.

pete:~# echo hello there > /dev/ttyS0

The cat process on mclaws prints out the message. Mine shows an extra blank line after the message. Next, kill the cat process by hitting control-c on mclaws and try sending a message the other way, with cat first listening on pete.
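
The reverse test looks like this, with pete listening and mclaws sending.

pete:~# cat /dev/ttyS0
mclaws:~# echo hello there > /dev/ttyS0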

Starting Heartbeat

If the parts are all working, it's time to start Heartbeat. Keep an eye on the syslog messages coming from the CLNs.

If you have previously removed the startup symlinks, now is a good time to add them again on each of the two hosts.

pete:~# update-rc.d heartbeat defaults
 Adding system startup for /etc/init.d/heartbeat ...
   /etc/rc0.d/K20heartbeat -> ../init.d/heartbeat
   /etc/rc1.d/K20heartbeat -> ../init.d/heartbeat
   /etc/rc6.d/K20heartbeat -> ../init.d/heartbeat
   /etc/rc2.d/S20heartbeat -> ../init.d/heartbeat
   /etc/rc3.d/S20heartbeat -> ../init.d/heartbeat
   /etc/rc4.d/S20heartbeat -> ../init.d/heartbeat
   /etc/rc5.d/S20heartbeat -> ../init.d/heartbeat

I try to start heartbeat at about the same time on both hosts, using two xterm windows on my desktop.

pete:~# sync; /etc/init.d/heartbeat start
logd is already running
Starting High-Availability services:
2006/04/14_12:35:23 INFO:  IPaddr Resource is stopped
Done.
mclaws:~# sync; /etc/init.d/heartbeat start
logd is already running
Starting High-Availability services:
2006/04/14_12:35:25 INFO:  IPaddr Resource is stopped
Done.

Now one of the hosts should have the cluster IP as well as its own "personal" IP. I use the arping command (on a third system) as an easy way to check that.

root@kokone ~# arping clusterb
ARPING 205.185.197.220
60 bytes from 00:30:48:88:36:d0 (205.185.197.220): index=0 time=128.984 usec
60 bytes from 00:30:48:88:36:d0 (205.185.197.220): index=1 time=132.084 usec
--- 205.185.197.220 statistics ---
2 packets transmitted, 2 packets received,   0% unanswered
root@kokone ~# arping pete
ARPING 205.185.197.218
60 bytes from 00:30:48:88:36:d0 (205.185.197.218): index=0 time=130.892 usec
60 bytes from 00:30:48:88:36:d0 (205.185.197.218): index=1 time=132.084 usec
--- 205.185.197.218 statistics ---
2 packets transmitted, 2 packets received,   0% unanswered

Notice that when I arping "clusterb," which is the name that resolves to the cluster's IP address, I get the same MAC address, 00:30:48:88:36:d0, that I get when I arping pete's IP.

From that I can tell that pete's front-side network interface now has two IP addresses: its own and the cluster's.
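
You can also check directly on pete. Running ifconfig there and searching for the cluster address should show it configured; with the IPaddr resource script it typically appears on an alias interface such as eth2:0.

pete:~# ifconfig | grep 205.185.197.220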

Accessing the Storage

With Heartbeat running and managing the NFS services on both nodes, you should be able to use the storage by way of the cluster IP. Be sure to use the cluster IP, not just the IP of one node.

root@kokone ~# mkdir /mnt/clusterb
root@kokone ~# mount -t nfs clusterb:/mnt/e0.0 /mnt/clusterb
root@kokone ~# df -h /mnt/clusterb
Filesystem            Size  Used Avail Use% Mounted on
clusterb:/mnt/e0.0    3.0T  512K  3.0T   1% /mnt/clusterb
root@kokone ~# cp -a /usr/share/doc/irb /mnt/clusterb
root@kokone ~# find /mnt/clusterb
/mnt/clusterb
/mnt/clusterb/irb
/mnt/clusterb/irb/changelog.Debian.gz
/mnt/clusterb/irb/copyright

Manually Failing Over

There are some commands you can use to cause the active and passive nodes to change roles. Now that pete is the active node, we can run hb_standby on pete, changing pete to the passive role. Running hb_takeover on mclaws would trigger the same role switch.

I like to make sure that NFS services aren't interrupted when I perform this test. On kokone, my NFS client, I unmount and then remount the NFS filesystem, so that the local cache isn't consulted. Then after triggering failover, I run a find command on kokone, requiring NFS service. After a brief pause, the find command should run to completion.

Step one: Get a fresh NFS mount on kokone.

root@kokone ~# umount /mnt/clusterb
root@kokone ~# mount -t nfs clusterb:/mnt/e0.0 /mnt/clusterb

Step two: Trigger a failover on pete, performing step three immediately afterwards.

pete:~# /usr/lib/heartbeat/hb_standby
2006/04/14_13:18:55 Going standby [all].

Step three: Use the NFS service on the client.

root@kokone ~# wc /mnt/clusterb/irb/copyright
  73  429 3062 /mnt/clusterb/irb/copyright

Step four: Verify that the roles have been reversed.

root@kokone ~# arp | egrep 'pete|mclaws|clusterb'
pete.coraid.com      ether  00:30:48:88:36:D0  C  eth0
mclaws.coraid.com    ether  00:30:48:85:F3:2E  C  eth0
clusterb.coraid.com  ether  00:30:48:85:F3:2E  C  eth0

I can see that mclaws has the cluster IP. So mclaws is certainly the one answering kokone's NFS requests, but kokone never noticed a thing.

Pulling the Plug

A final test is to actually cut the power of the active node, letting the standby node take over. Don't use the WTI IPS-800 to do it; just pull the power cable physically. The IPS-800 only allows one telnet session at a time, and Heartbeat needs to use it.

For testing purposes, you have the luxury of issuing the sync command on the active node before you suddenly turn it off. Even though XFS is a robust journaling filesystem, yanking the power is a rude thing to do to your CLN.

Still, this test demonstrates conclusively that the failover works. NFS clients should experience only a several-second pause in service after you yank the active node's power.

Further Steps

After you gain some initial experience with Heartbeat, it becomes clear that, despite providing a very real increase in fault tolerance, this "bare bones" configuration will not handle all faults. If the active host keeps performing the heartbeat, then failover won't occur, even if the highly available service is interrupted.

When the highly available service has been interrupted but the heartbeat has not, there is no straightforward and reliable way for the servers themselves to tell that failover is necessary.

A machine using the highly available service is, however, in the perfect position to know when service has been interrupted. Such a machine can then tell the passive server to take over.

Software is available for performing this service monitoring function. A popular package is "Mon," the service monitoring daemon. Once you have successfully installed and configured a redundant pair of CLNs, it might be your next step.

Resources

In setting up a highly available system, it is important to know what software you are using and how it works. A good place to visit next would be the Linux High Availability project website, http://www.linux-ha.org/.

There is an interesting Linux Journal article written by a CLN Failover Kit expert user, Daniel Bartholomew. It has a good "References" section of its own.

