Using AoE on FreeBSD


This document assumes you have already added AoE support to your system. It describes how to interface with the AoE driver and how to configure the system to mount AoE devices on boot.


To understand the interface to the AoE driver, a few simple AoE concepts must be covered first. We then describe the sysctl interface to the driver and, finally, show how to use AoE devices automatically on boot.

Table of contents:

  1. The AoE unit abstraction
  2. AoE device discovery
  3. AoE security
  4. The sysctl interface
  5. FreeBSD 4.x and MAKEDEV
  6. Software RAID
  7. Using AoE devices on boot
  8. Common mistakes

1. The AoE unit abstraction

Each AoE device possesses an {aoemajor, aoeminor} pair. For the Coraid EtherDrive® blade, this pair corresponds to the shelf and slot of the blade. In a FreeBSD system, each disk device must have a unit number that is unique within its device class. When the device is named in /dev, the unit number follows the device name. For example, with the ATA driver, the ATA disk for unit 0 is ad0, unit 1 is ad1, and so on. The AoE driver maps {aoemajor, aoeminor} to the unit number as follows:

  unit = aoemajor * 10 + aoeminor

As an example, the following {aoemajor, aoeminor} pairs correspond to the disk device names listed:

  {aoemajor, aoeminor}    device name
  {0, 5}                  aoed5
  {10, 8}                 aoed108
  {39, 0}                 aoed390
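
The mapping can be checked with a few lines of shell. This is a minimal sketch: the function name aoe_devname is hypothetical and is not part of the driver or of FreeBSD. Note that the formula only yields distinct unit numbers for distinct pairs when aoeminor is below 10.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the driver): print the /dev name
# for an {aoemajor, aoeminor} pair using unit = aoemajor * 10 + aoeminor.
aoe_devname() {
    echo "aoed$(( $1 * 10 + $2 ))"
}

aoe_devname 0 5     # aoed5
aoe_devname 10 8    # aoed108
aoe_devname 39 0    # aoed390
```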

2. AoE device discovery

AoE device discovery happens in two ways. First, when an AoE device initializes, it broadcasts a message so that potential users know it exists. Second, a client may send out a message to probe for devices. The AoE driver provides a settable sysctl variable, net.aoe.discover, that triggers the sending of such a probe.

3. AoE security

In order to keep rogue machines on the network from pretending they are AoE devices, the AoE driver restricts AoE traffic to a specific list of interfaces. This list must be set before any traffic may occur to any AoE device.

4. The sysctl interface

The AoE driver provides a few sysctl(8) variables for configuring the driver and finding out what devices the driver knows about:

  net.aoe.iflist      A settable string containing a space-separated list of interface names valid for AoE network traffic.
  net.aoe.discover    A settable integer that triggers a discover beacon to be sent to all valid AoE interfaces.
  net.aoe.maxwait     A settable integer containing the number of seconds to wait for an AoE ATA request to be completed.
  net.aoe.wc          A settable integer specifying whether or not the write cache should be enabled on all AoE devices.
  net.aoe.devices     A read-only string listing the AoE devices the driver knows about.

net.aoe.discover always reads as 0; setting it merely triggers the beacon to be sent.

Each ATA read/write message to an AoE device is given a fixed amount of time to complete before the device is considered gone or failed. net.aoe.maxwait is the number of seconds to wait before failing all outstanding I/O and marking the device as DOWN. Once a device has entered the DOWN state, it cannot be used for I/O. A device can come back UP only after it has been closed by all prior users and has completed a new discovery sequence.

Each line of net.aoe.devices gives the device name, the interface it is on, and its state.

Example:

$ sysctl net.aoe
net.aoe.iflist: 
net.aoe.discover: 0
net.aoe.maxwait: 180
net.aoe.wc: 1
net.aoe.devices: 
$ sysctl net.aoe.iflist=fxp0
net.aoe.iflist:  -> fxp0
$ sysctl net.aoe.discover=1
net.aoe.discover: 0 -> 0
$ sysctl net.aoe
net.aoe.iflist: fxp0
net.aoe.discover: 0
net.aoe.maxwait: 180
net.aoe.wc: 1
net.aoe.devices:
aoed7    fxp0    UP
aoed4    fxp0    UP
aoed0    fxp0    UP
aoed6    fxp0    UP
aoed10   fxp0    UP
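
Because a DOWN device cannot be used for I/O, it can be handy to pick such devices out of the list. The following is a minimal sketch that assumes the three-column layout shown in the example above (device name, interface, state); the function name aoe_not_up is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a net.aoe.devices listing on standard input and
# print the name of every device whose state column is not UP.
# Assumes three whitespace-separated columns: name, interface, state.
aoe_not_up() {
    awk 'NF == 3 && $1 ~ /^aoed[0-9]+$/ && $3 != "UP" { print $1 }'
}

# Usage on a system with the driver loaded:
#   sysctl -n net.aoe.devices | aoe_not_up
```

Lines that do not match the assumed layout, such as blank lines, are ignored.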

5. FreeBSD 4.x and MAKEDEV

In FreeBSD 5.x and later, device nodes are created automatically by the system at device discovery time. In 4.x, device node files are created statically for the devices used by the system. This is usually done automatically at system installation because there are very few devices per device class. With AoE, it is possible to use all 512 unit numbers available to a device class. The total potential number of AoE device nodes, including partitions and slices, is approximately 16,384. As a result, the system administrator must manually create the AoE device nodes for the devices used in the system.

This is accomplished with the MAKEDEV script in /dev:

$ cd /dev
$ ls aoed*
$ ./MAKEDEV aoed0
$ ls aoed*
aoed0           aoed0d          aoed0h          aoed0s1c        aoed0s1g        aoed0s4
aoed0a          aoed0e          aoed0s1         aoed0s1d        aoed0s1h
aoed0b          aoed0f          aoed0s1a        aoed0s1e        aoed0s2
aoed0c          aoed0g          aoed0s1b        aoed0s1f        aoed0s3
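
When several units are in use, the MAKEDEV call can be wrapped in a loop. This is only a sketch for FreeBSD 4.x: the function name make_aoe_nodes and the MAKEDEV override variable are assumptions introduced here so that the command can be replaced for a dry run.

```shell
#!/bin/sh
# Hypothetical helper: create device nodes for each AoE unit number given.
# The MAKEDEV variable is an assumption of this sketch; it defaults to the
# ./MAKEDEV script and can be set to "echo" to preview the device names.
make_aoe_nodes() {
    for unit in "$@"; do
        "${MAKEDEV:-./MAKEDEV}" "aoed${unit}" || return 1
    done
}

# Usage (as root):
#   cd /dev && make_aoe_nodes 0 4 6 7 10
# Dry run:
#   MAKEDEV=echo make_aoe_nodes 0 4
```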

6. Software RAID

Software RAID on FreeBSD is done with a program called vinum. Learn more about vinum by visiting The Vinum Volume Manager Homepage.

7. Using AoE devices on boot

Using AoE devices on boot is a little tricky because the network must be up before the system can access an AoE device. The current rc boot method for automounting filesystems will not work with AoE devices; vinum and mount -a are both run before bringing up the network.

To use AoE devices on boot, a few rc.conf variables have been defined that let the boot script initialize systems using AoE devices after the network is up. They are as follows:

  aoe_enable (5.x)     A string set to "NO" or "YES" to tell rcorder whether or not to run the aoe script.
  aoe_iflist           A space-separated string of interfaces valid for AoE.
  aoe_wc               An integer specifying whether or not the write cache should be on.
  aoe_vinum_drives     A space-separated string of AoE devices used by vinum.
  aoe_mounts           A space-separated string of AoE mounts in /etc/fstab that should be mounted after initializing the iflist and the vinum drives.

Boot-time initialization proceeds as follows:

  1. If aoe_iflist is set, 'sysctl net.aoe.iflist=$aoe_iflist' and 'sysctl net.aoe.discover=1' are run.
  2. If start_vinum is set to YES and the iflist was initialized as above, 'vinum read $aoe_vinum_drives' is run.
  3. If aoe_iflist is set, the mounts in aoe_mounts are mounted, after any vinum initialization in step 2.

Each of these mounts must have the noauto option specified in /etc/fstab to disable auto-mounting and must have the final (sixth) field set to 0 to prohibit boot-time fsck. This sequence permits filesystems on AoE devices to be mounted automatically on boot.
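
The sequence described above can be sketched in shell as follows. This is an illustration only, not the aoe boot script itself: the function names run and aoe_boot_sketch are hypothetical, and run merely prints each command instead of executing it. The sketch leaves out aoe_wc because the description above does not say where it is applied.

```shell
#!/bin/sh
# Dry-run sketch of the boot-time sequence. Not the shipped rc script.
# run() only prints the command; it does not execute anything.
run() {
    echo "$@"
}

aoe_boot_sketch() {
    # Nothing happens unless aoe_iflist is set.
    [ -n "${aoe_iflist:-}" ] || return 0

    # Step 1: set the interface list and trigger discovery.
    run sysctl net.aoe.iflist="$aoe_iflist"
    run sysctl net.aoe.discover=1

    # Step 2: hand the AoE drives to vinum if start_vinum is YES.
    case "${start_vinum:-NO}" in
    [Yy][Ee][Ss]) run vinum read $aoe_vinum_drives ;;
    esac

    # Step 3: mount each entry listed in aoe_mounts.
    for m in ${aoe_mounts:-}; do
        run mount "$m"
    done
}

# Usage:
#   aoe_iflist="fxp0" aoe_mounts="/mnt/aoe"; aoe_boot_sketch
```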

If aoe_iflist is set, the aoe module is loaded if it is not loaded already.
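
As an illustration, an /etc/rc.conf fragment for a single AoE filesystem without vinum might look like the following. All values are examples: fxp0 is taken from the sysctl example earlier, while the mount point /mnt/aoe and the matching fstab line are hypothetical. The fragment also assumes that aoe_mounts takes mount points; check the aoe script if it expects device names instead.

```shell
# /etc/rc.conf fragment (illustrative values only)
aoe_enable="YES"        # 5.x only
aoe_iflist="fxp0"       # interfaces valid for AoE traffic
aoe_wc="1"              # write cache on
aoe_mounts="/mnt/aoe"   # hypothetical mount point, assumed to match fstab

# Matching hypothetical /etc/fstab entry, with noauto and a sixth field of 0:
#   /dev/aoed0s1a  /mnt/aoe  ufs  rw,noauto  0  0
```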

8. Common mistakes

The only commonly reported mistake is forgetting to bring up the interface selected for AoE traffic. Since AoE does not require an interface to have an IP address, it is easy to forget to configure the interface to come up on boot.
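
One way to avoid this is to mark the interface up in /etc/rc.conf without assigning it an address. The interface name fxp0 is only an example carried over from the sysctl listing earlier.

```shell
# /etc/rc.conf fragment: bring fxp0 up on boot with no IP address.
ifconfig_fxp0="up"

# To bring the interface up by hand on a running system (as root):
#   ifconfig fxp0 up
```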

 

Please contact support@coraid.com with questions/comments.


BSD Daemon © 1988 by Marshall Kirk McKusick. All Rights Reserved.