CORAID - Affordably Fast SAN

ESX/vSphere EtherDrive HBA FAQ

  • Q 1: When I attempt to create a datastore on one of my SR[X]'s LUNs, the operation takes an inordinate amount of time to complete, does not complete at all, or I receive the error: "Unable to create Filesystem." Is there something I should check?
  • Q 2: After installing the EtherDrive HBA and driver my ESX server sees my SR LUNs as 2 TB, even though the LUNs I created on the SR are larger. Is something wrong?
  • Q 3: I've installed ESXi 3.5 update 4 (or any other incremental update for ESXi 3.5). After installing the update and rebooting ESXi, my CORAID HBA is no longer recognized by the ESXi server. What happened?
  • Q 4: The "Available" space of one of my SR LUNs is not listed on the VMware Infrastructure Client's storage configuration page. When I try to create a datastore on the LUN I receive an error stating "Failed to update the disk partition information." What can I do to make this LUN available for use?
  • Q 5: An AoE LUN has gone offline and now my ESX box hangs. Is this normal?
  • Q 6: Is RDM available for EtherDrive storage?
  • Q 7: Why is the EtherDrive HBA’s description listed as ‘Unknown’ on the Storage Adapters page in vSphere?
  • Q 8: I still need help. What should I do next?

  • Q 1: When I attempt to create a datastore on one of my SR[X]'s LUNs, the operation takes an inordinate amount of time to complete, does not complete at all, or I receive the error: "Unable to create Filesystem." Is there something I should check?

    This is a common symptom when the path between the ESX host and the SR[X] lacks support for 9000 MTU, generally because a switch in the path is not configured for jumbo frames or does not support them at all. The EtherDrive HBA requires support for 9000 MTU along the whole path, so a good first step is to check your switch's configuration and capabilities. To rule the switch out, try the same operation with the EtherDrive HBA connected directly to the SR[X], bypassing the switch; if the operation succeeds over the direct connection, the problem likely lies with the switch or its configuration.

  • Q 2: After installing the EtherDrive HBA and driver my ESX server sees my SR LUNs as 2 TB, even though the LUNs I created on the SR are larger. Is something wrong?

    Due to VMware's implementation of the SCSI-2 industry standard, the largest extent that ESX and ESXi can use is 2 TB. For this reason the EtherDrive HBA automatically presents a LUN larger than 2 TB as multiple LUNs segmented at 2 TB boundaries. For example, a 5 TB LUN is presented to the ESX server as two 2 TB LUNs and one 1 TB LUN. The segmented LUNs can be joined back together in Virtual Infrastructure by using the "Extend Datastore" feature.

    For further information regarding this 2 TB size limitation see VMware's knowledge base.
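
    The segmentation arithmetic described above can be sketched as a small shell function. This is only an illustration: segment_lun is a hypothetical helper, not part of the EtherDrive driver, and it works in whole units (TB by default) because the exact byte boundary the HBA uses is not stated here.

```shell
#!/bin/sh
# Illustration only: segment_lun is a hypothetical helper, not a CORAID tool.
#   $1 = LUN size as a whole number (TB by default)
#   $2 = optional segment limit in the same unit (defaults to 2, i.e. 2 TB)
# Prints the size of each presented segment, one per line.
segment_lun() {
    remaining=$1
    limit=${2:-2}
    while [ "$remaining" -gt 0 ]; do
        if [ "$remaining" -gt "$limit" ]; then
            # A full segment is cut off at the limit.
            echo "$limit"
            remaining=$((remaining - limit))
        else
            # Whatever is left fits in one final segment.
            echo "$remaining"
            remaining=0
        fi
    done
}
```

    Running 'segment_lun 5' prints 2, 2 and 1 on separate lines, matching the 5 TB example above. A LUN of exactly 2 TB yields a single segment.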

  • Q 3: I've installed ESXi 3.5 update 4 (or any other incremental update for ESXi 3.5). After installing the update and rebooting ESXi, my CORAID HBA is no longer recognized by the ESXi server. What happened?

    This is normal behavior for ESXi with regard to third-party drivers. Reinstall the HBA driver for ESXi following the normal procedure in our EtherDrive SAN Integration manual and the HBA will again be recognized, along with your previously created datastores. Remember that the HBA driver installation also requires a reboot of the ESXi server.

  • Q 4: The "Available" space of one of my SR LUNs is not listed on the VMware Infrastructure Client's storage configuration page. When I try to create a datastore on the LUN I receive an error stating "Failed to update the disk partition information." What can I do to make this LUN available for use?

    This error can occur when there is partition information at the beginning of the LUN, typically on disks that were previously used with other filesystems or LUN configurations. VMware has a helpful article in its Knowledge Base that outlines how to resolve this issue.

    After you follow the instructions to identify the device and recreate the partition table, and then rescan for storage at "Configuration -> Storage Adapters" (or with 'vmkfstools -s' at the COS), the previously inaccessible LUN should be available for use on the Infrastructure Client's storage configuration page.

    Occasionally, creating a new partition with 'fdisk' as described in the VMware KB article above will not render the LUN usable again. If this is the case, you can use the information in /proc/ethdrv.devices (or /proc/scsi/ethdrv/ethrv.devices) and the ESX command 'esxcfg-mpath -l' to find out which SCSI "device" is associated with the SR's LUN (output below is snipped for brevity):

    [root@neptune root]# cat /proc/ethdrv.devices 
    'ESX Device' 'AoE Target' Size
    vmhba2:5:0 32.3 299.999GB
    vmhba2:6:0 32.2 199.999GB
    vmhba2:7:0 32.1 299.999GB
    vmhba2:8:0 32.14 99.999GB
    [root@neptune root]# esxcfg-mpath -l
    Disk vmhba2:0:0 /dev/sdb (286102MB) has 1 paths and policy of Fixed
     FC 9:2.0 N/A<->e003200001 vmhba2:0:0 On active preferred

    CD-ROM Drive vmhba32:0:0 /dev/scd0 (0MB) has 1 paths and policy of Fixed
     Local 0:31.2 vmhba32:0:0 On active preferred

    Disk vmhba2:8:0 /dev/sdj (95367MB) has 1 paths and policy of Fixed
     FC 9:2.0 N/A<->e000320014 vmhba2:8:0 On active preferred

    Disk vmhba2:6:0 /dev/sdh (190734MB) has 1 paths and policy of Fixed
     FC 9:2.0 N/A<->e000320002 vmhba2:6:0 On active preferred

    Disk vmhba2:7:0 /dev/sdi (286102MB) has 1 paths and policy of Fixed
     FC 9:2.0 N/A<->e000320001 vmhba2:7:0 On active preferred

    Disk vmhba2:5:0 /dev/sdg (286102MB) has 1 paths and policy of Fixed
     FC 9:2.0 N/A<->e000320003 vmhba2:5:0 On active preferred
    [root@neptune root]#

    In this example, if LUN 32.1 is the problematic one, it maps to vmhba2:7:0 and therefore to /dev/sdi, and 'dd' can be used to zero out the first 256 KB of the LUN like so:

    [root@neptune root]# dd if=/dev/zero of=/dev/sdi bs=1024 count=256 
    256+0 records in
    256+0 records out
    [root@neptune root]# 
    

    Again, after rescanning the HBA the LUN should now be usable.
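
    The lookup from AoE target to /dev node described above can also be scripted, which may help avoid pointing 'dd' at the wrong disk. The following is a minimal sketch that assumes both listings have the layout shown in the example output; aoe_to_dev is a hypothetical helper name, not a CORAID or VMware command.

```shell
#!/bin/sh
# Illustration only: aoe_to_dev is a hypothetical helper, not a CORAID tool.
#   $1 = AoE target as shown in the 'AoE Target' column (e.g. 32.1)
#   $2 = file holding a copy of /proc/ethdrv.devices
#   $3 = file holding the output of 'esxcfg-mpath -l'
# Prints the /dev node of the matching disk, or nothing if there is no match.
aoe_to_dev() {
    awk -v target="$1" '
        # First file: remember the ESX device (vmhbaA:T:L) for the target.
        # Appending "" forces a string comparison, so 32.1 never matches 32.10.
        NR == FNR { if (($2 "") == (target "")) esxdev = $1; next }
        # Second file: print the /dev node from the matching "Disk" line.
        $1 == "Disk" && $2 == esxdev { print $3 }
    ' "$2" "$3"
}

# Example use at the service console (the /tmp file names are arbitrary):
#   cat /proc/ethdrv.devices > /tmp/ethdrv.txt
#   esxcfg-mpath -l > /tmp/mpath.txt
#   aoe_to_dev 32.1 /tmp/ethdrv.txt /tmp/mpath.txt
```

    With the example listings above, looking up target 32.1 prints /dev/sdi. It is still worth confirming the result by eye before overwriting anything.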

  • Q 5: An AoE LUN has gone offline and now my ESX box hangs. Is this normal?

    Unfortunately, yes. In ESX, storage is not considered hot-pluggable because VMs do not behave well when their boot storage disappears. We have been advised that ESX expects its disks never to disappear and never to return a failure unless the disk has actually failed. We have complied with this, but starting with driver 1.0.2 we added the ability to enable failures should the administrator desire it. If you find yourself in a situation where ESX has hung waiting on I/O to a disk that will not return, you can run the following at the Service Console or Console OS to enable failures:

    echo failenable on >/proc/ethdrv.ctl
    cat /proc/ethdrv.ctl

    This will, however, enable failures on all AoE LUNs and should not be used as a long-term running configuration. Ultimately a reboot is necessary to clean up the missing device state.

  • Q 6: Is RDM available for EtherDrive storage?

    In ESX 3.5, yes, RDM (Raw Device Mapping) works as expected.

    In ESX 4.0 the RDM system was changed to permit RDM only on non-local storage. Currently only iSCSI, Fibre Channel, and SAS storage types are considered non-local. VMware has implemented a generic SAN type that we are utilizing, but VMware has not finished implementing the details necessary to consider the generic SAN type non-local. There is a workaround to enable RDM for local storage types, but it comes with the following caveats from VMware:

    1. RDM is a mechanism that VMware provides in order to support MSCS, as described in our public manuals. Other uses of RDM are not tested. Using RDM to map any other volume that contains sensitive customer data (i.e., outside ESX Server's root file-system, and not MSCS, but still important to the customer) is strictly at the customer's own risk. If the customer loses data in a non-MSCS configuration (i.e., something that we've never tested), we will reject any service or support request.
    2. The workaround makes it potentially possible to RDM a volume that holds your ESX Server's root file-system. This is absolutely NOT supported or recommended by VMware. If a customer tries this, accidentally destroys his system as a result, then files a Field Service request or Support Request with us, we will reject the request out of hand.

    If you understand these caveats and would still like to enable RDM on ESX 4.0, please contact support@coraid.com for more information.

  • Q 7: Why is the EtherDrive HBA’s description listed as ‘Unknown’ on the Storage Adapters page in vSphere?

    This is a known issue with ESXi 4. Even though the EtherDrive HBA is described as ‘Unknown’, there are no functional issues with ESXi or the EtherDrive HBA as a result of this problem. This issue will be addressed by VMware in a future release of ESXi 4.

  • Q 8: I still need help. What should I do next?

    Our Fast Support HOWTO has hints that will help you get the fastest and best support we can provide. If you still need assistance with your CORAID EtherDrive HBA, please send the information requested in the Fast Support HOWTO to support@coraid.com.


www.coraid.com