Creating a Filesystem

Once you have selected or created a block device that you would like to use as your storage medium, you can create a filesystem on that block device. For the block device you might use …

In this example we're using an LVM logical volume. The mkfs command creates the filesystem. When the type is XFS, mkfs prints a verbose summary of the new filesystem's geometry.

makki:~# mkfs -t xfs /dev/vg0/lv0
meta-data=/dev/vg0/lv0           isize=256    agcount=32, agsize=8473312 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=271145984, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
makki:~#

To use this filesystem, we just have to mount it on a directory somewhere.

makki:~# mkdir /mnt/alpha
makki:~# mount /dev/vg0/lv0 /mnt/alpha
makki:~# echo testing > /mnt/alpha/test.txt
makki:~# cat /mnt/alpha/test.txt
testing

It's important to make sure this filesystem is mounted on boot and cleanly unmounted when the system goes down. On the CLN we add it to /etc/aoe/fs.conf. Just add a line with the block device and the mountpoint. The line from the above example looks like this:

/dev/vg0/lv0 /mnt/alpha

You can use an editor to do it. If you're careful you can just append (with >>) to the file as shown below.

makki:~# bkp /etc/aoe/fs.conf
bkp "/etc/aoe/fs.conf" --> "/etc/aoe/fs.conf.20051122"
makki:~# echo /dev/vg0/lv0 /mnt/alpha >> /etc/aoe/fs.conf
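
Appending with >> twice leaves a duplicate line in the file. As a minimal sketch, the append can be wrapped so that it only happens when the exact line is not already there. The add_fs_entry function is a hypothetical helper written for this example; it is not part of Coraid Linux.

```shell
#!/bin/sh
# add_fs_entry is a hypothetical helper, not a Coraid Linux command.
# It appends "device mountpoint" to an fs.conf-style file unless that
# exact line is already present.
add_fs_entry() {
    conf=$1
    dev=$2
    mnt=$3
    entry="$dev $mnt"
    # -q quiet, -x match the whole line, -F treat the entry as a fixed string
    if grep -qxF -- "$entry" "$conf" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$entry" >> "$conf"
}

# Example (as root, after backing up the file):
#   add_fs_entry /etc/aoe/fs.conf /dev/vg0/lv0 /mnt/alpha
```

Running the function a second time with the same arguments leaves the file unchanged.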

For more complex mount commands (e.g., ones with special options) the mount command itself may be placed on a single line like this in /etc/aoe/fs.conf, with the mountpoint last.

mount -o uquota /dev/md0 /mnt/md0

The /etc/fstab file is not used because filesystems on AoE devices are not mounted as early in the boot sequence as the filesystems in /etc/fstab.

Growing a Filesystem

The xfs_growfs command is used to expand an existing XFS. Let's say we have just extended the logical volume /dev/vg0/lv0. Now we want to make sure that the XFS can use all the new space.

XFS is grown while it is mounted. Before we grow it, the filesystem is still the size that the logical volume had when the filesystem was created, 1.1 terabytes in this example.

makki:~# df -h | grep alpha
/dev/mapper/vg0-lv0   1.1T  528K  1.1T   1% /mnt/alpha
makki:~#

We've already grown the logical volume by adding another terabyte of space (using vgextend and lvextend as shown in the Growing a Logical Volume section). All we need to do is tell XFS to use the new space. The only argument to xfs_growfs is the directory where the filesystem is mounted.

PLEASE NOTE: There is a bug in the xfs driver in Coraid Linux that causes growth in increments of over two terabytes to fail. We are working to fix this bug, but you can work around the problem by growing repeatedly in increments of one terabyte (268435456 blocks of 4096 bytes each). Use the "-D" option to specify the absolute total size in 4096-byte data blocks, adding 268435456 for each step. When less than a terabyte remains, run xfs_growfs once more without the "-D" option to grow by the remainder.

makki:~# xfs_growfs /mnt/alpha
meta-data=/mnt/alpha             isize=256    agcount=32, agsize=8473312 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=271145984, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
data blocks changed from 271145984 to 539580416

Notice that the number of data blocks has roughly doubled, from 271145984 to 539580416.

makki:~# df -h | grep alpha
/dev/mapper/vg0-lv0   2.1T  1.1M  2.1T   1% /mnt/alpha

Now the extra terabyte is ready to use. No downtime!
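
The one-terabyte workaround from the PLEASE NOTE paragraph above comes down to simple arithmetic on block counts. The sketch below only prints the xfs_growfs commands; it does not run them. The growfs_steps function is a hypothetical helper, and it assumes you already know the current data block count (from earlier xfs_growfs or mkfs output) and the block count that the extended device can hold.

```shell
#!/bin/sh
# growfs_steps is a hypothetical helper. It prints commands and runs nothing.
# One terabyte in 4096-byte blocks: 268435456 * 4096 = 2^40 bytes.
TB_BLOCKS=268435456

growfs_steps() {
    mnt=$1        # mount point, e.g. /mnt/alpha
    cur=$2        # current number of data blocks
    target=$3     # data blocks available after extending the block device
    # Take whole one-terabyte steps while more than a terabyte remains.
    while [ $((target - cur)) -gt "$TB_BLOCKS" ]; do
        cur=$((cur + TB_BLOCKS))
        echo "xfs_growfs -D $cur $mnt"
    done
    # Last step: no -D, so XFS takes whatever space is left.
    echo "xfs_growfs $mnt"
}

# Example:
#   growfs_steps /mnt/alpha 271145984 539580416
```

For the growth shown in this section the difference is 268434432 blocks, just under one terabyte, so the function prints a single xfs_growfs command with no "-D" option.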

Repairing a Filesystem

Having a single large filesystem is attractive in its convenience. A more conservative approach is to use many small filesystems, so that each independent filesystem has its own metadata, and problems in one filesystem can't cause problems in any other. But many administrators prefer to use one very large filesystem instead.

If you ever have to repair an XFS, you can use the xfs_repair tool. Be aware that it takes a lot of memory to repair a huge XFS filesystem.

Here is a warning about using xfs_repair: If xfs_repair rescues your files, placing them in a lost+found directory, it is a good idea to rename the directory from "lost+found" to something else. The reason is that running xfs_repair again could result in the deletion and recreation of lost+found.
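
One way to follow the advice above is to move the directory aside under a dated name right after the repair. This is only a sketch: preserve_lost_found is a hypothetical helper, and the "rescued." prefix is an arbitrary choice.

```shell
#!/bin/sh
# preserve_lost_found is a hypothetical helper, not an XFS tool.
# It renames <mountpoint>/lost+found so a later xfs_repair run cannot
# delete the rescued files, and prints the new directory name.
preserve_lost_found() {
    mnt=$1
    src="$mnt/lost+found"
    dst="$mnt/rescued.$(date +%Y%m%d-%H%M%S)"
    [ -d "$src" ] || return 0      # nothing to preserve
    mv -- "$src" "$dst" && echo "$dst"
}

# Example (with the filesystem mounted again after the repair):
#   preserve_lost_found /mnt/alpha
```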

To run this tool, the XFS experts say that you'll need …

  • about 2GB of memory per terabyte of filesystem space, and
  • 100-200MB of memory per million inodes in the filesystem.

That memory can be virtual memory, so by adding temporary swap space to the CLN, you can more easily run xfs_repair on large filesystems. In general, it's best not to use AoE devices for swap. It can be an attractive option, though, when the CLN is just performing an xfs_repair.
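
To decide how much temporary swap to add, the rule of thumb above can be turned into a quick estimate. The sketch below assumes that 2GB means 2048MB and takes the filesystem size in whole terabytes; repair_mem_mb is a hypothetical helper. The inode count can be read from the IUsed column of df -i.

```shell
#!/bin/sh
# repair_mem_mb is a hypothetical helper. It prints a low and a high
# estimate, in MB, of the memory xfs_repair may need:
#   about 2GB per terabyte, plus 100-200MB per million inodes.
repair_mem_mb() {
    tb=$1          # filesystem size in whole terabytes
    minodes=$2     # number of inodes, in millions
    base=$((tb * 2048))
    echo "$((base + minodes * 100)) $((base + minodes * 200))"
}

# Example: a 2 terabyte filesystem holding 5 million inodes
#   repair_mem_mb 2 5
```

For the example in the comment, the estimate comes out to between 4596MB and 5096MB.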

Using spare disks on Coraid SR appliances helps to make their LUNs more resilient to disk failures. It also provides a potential source of emergency virtual memory for your CLNs.

The strategy is as follows. (Please double check your commands!)

  • On the SR unit, borrow the spare by using the rmspare command.
  • On the SR unit, export the former spare disk using the jbod command.
  • On the CLN, initialize the new LUN as a swap device with the mkswap command.
  • On the CLN, activate the swap device with the swapon command.
  • On the CLN, perform the xfs_repair.
  • On the CLN, finish using the swap device with the swapoff command.
  • On the SR unit, remove the temporary jbod LUN and make the disk a spare again.
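
The CLN side of the strategy above can be sketched as a short script. The SR commands are left out because they are entered on the SR unit itself. The repair_with_temp_swap function is a hypothetical helper, and the device name /dev/etherd/e1.0 is only an example of how the borrowed disk might appear on the CLN. By default the script only prints the commands, so they can be double checked before anything is run.

```shell
#!/bin/sh
# CLN-side steps only: mkswap, swapon, xfs_repair, swapoff.
# RUN defaults to "echo" so commands are printed, not executed.
# Set RUN to the empty string (RUN= ) to run them for real, as root.
RUN=${RUN-echo}

# repair_with_temp_swap is a hypothetical helper.
repair_with_temp_swap() {
    swapdev=$1   # AoE device for the borrowed spare (example: /dev/etherd/e1.0)
    fsdev=$2     # block device holding the XFS; it must be unmounted first
    $RUN mkswap "$swapdev" || return 1
    $RUN swapon "$swapdev" || return 1
    $RUN xfs_repair "$fsdev"
    status=$?
    # Always release the swap device, even if the repair failed.
    $RUN swapoff "$swapdev"
    return "$status"
}

# Example:
#   repair_with_temp_swap /dev/etherd/e1.0 /dev/vg0/lv0
```

The swapoff step runs whether or not xfs_repair succeeds, so the disk can always be returned to the SR unit as a spare afterwards.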

If you don't have an SR unit with a spare disk, another option would be to use a vblade process to make space from a Linux-based system available to the CLN.
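
As a rough sketch of the vblade option, the helper below prints the two commands that would be entered on the other Linux system: one to create a backing file and one to export it with vblade, whose arguments are shelf, slot, network interface and file. The vblade_swap_cmds function, the file path, the size, the shelf and slot numbers and the interface name are all assumptions chosen for illustration.

```shell
#!/bin/sh
# vblade_swap_cmds is a hypothetical helper. It prints commands and runs nothing.
vblade_swap_cmds() {
    file=$1      # backing file to create (example path only)
    size_mb=$2   # size of the backing file in MB
    shelf=$3     # AoE shelf number to export as
    slot=$4      # AoE slot number to export as
    netif=$5     # network interface facing the CLN
    echo "dd if=/dev/zero of=$file bs=1M count=$size_mb"
    echo "vblade $shelf $slot $netif $file"
}

# Example:
#   vblade_swap_cmds /var/tmp/aoeswap 4096 9 0 eth0
```

With the values in the example, the CLN would then see the exported file as AoE device e9.0 and could use it with mkswap and swapon as described above.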


www.coraid.com