NetApp SE Lab Report:
SAN Boot with VMware ESX 3.0.0
By Nick Triantos
Enterprises with large numbers of servers are increasingly turning to diskless servers that boot from SAN (FC or IP) to reduce costs, consolidate storage, and streamline provisioning. Although SAN boot is not new, the introduction of the blade server has helped accelerate its adoption. Blade servers provide greater manageability, reduced hardware costs, and simpler cable management in addition to power, cooling, and real-estate savings.
One of the most popular platforms that lends itself to booting from SAN is VMware ESX Server. More and more enterprises are deploying VMware ESX Server to consolidate hundreds of physical servers into a few diskless blades in a single-blade chassis. Others have elected to deploy VMware using racks of standalone 1U Intel-based servers.
Most servers today ship with internal SATA drives, which are not supported for hosting the ESX Server image, so SAN boot becomes an attractive choice. Storage-based snapshots, cloning, and replication add a further advantage: you can quickly recover a corrupted ESX image from a clone or a snapshot and restore it on the same physical server, or replicate it to a remote site for DR purposes.
Like most technology vendors, Network Appliance has labs in the vast majority of its sales offices. These labs are primarily used for technology demonstrations, and they utilize a variety of operating systems and third-party software. The NetApp Dallas office also has a systems engineering lab, where NetApp SEs can delve deep into specific technologies. This lab has included a VMware ESX Server environment since the 2.5.1 release, and we have been booting ESX Server from a fabric since the 2.5.3 release.
In August I decided to upgrade to the new ESX 3.0.0 release to get a feel for the changes, and I'm definitely impressed with the results.
With ESX 3.0.0, VMware has made significant advancements in supporting boot from SAN. Several requirements from the previous release have been relaxed. Based on my experience, the setup process is quick and easy and, at least for testing purposes, the environment has been working flawlessly.
Setting Up ESX 3.0.0 for SAN Boot
Setting up VMware for SAN boot with NetApp storage was a breeze. The whole process didn't take more than 20 or 25 minutes from the time I provisioned the boot LUN to the time the ESX image installation had completed.
The following table shows our setup.
| Component | Configuration |
|---|---|
| Server | IBM x346 |
| CPU | 2x Xeon 3.2GHz |
| Memory | 8GB |
| FC HBA | 2x QLA 2340 |
| FC Switch | MDS 9120 |
| External Storage | NetApp FAS3050c |

Table 1. NetApp SE Lab Setup
After the installation, it was time to create virtual machines (VMs) and install the guest operating system. I chose to install VMs on LUNs over iSCSI so I could get a feel for VMware's implementation. Configuring the iSCSI initiator was a breeze, and I was able to install the guest operating system with no issues. Given that ESX currently does not provide a multipathing mechanism for iSCSI LUNs, I chose to implement NIC teaming, which essentially serves the same purpose.
Suggested Edits to the Default ESX 3.0.0 Configuration
If you're interested in SAN boot with ESX 3.0.0, there are some things you need to consider. First, I highly recommend that before making any HBA purchasing decisions you contact your storage vendor and carefully review the VMware I/O Compatibility Guide for ESX Server 3.0. You will find that certain model HBAs are not supported for SAN booting.
Additionally, there are a number of tweaks and customizations that you can make to achieve higher performance and nondisruptive failovers if hardware failures occur. I recommend three simple changes to the default ESX Server 3.0.0 settings:
- Enable the BIOS on only one HBA.
- Modify the Execution Throttle/Queue Depth.
- Modify the PortDownRetryCount parameter.
I go into more detail on each point in the following sections. However, keep in mind that this advice hasn't been fully tested or approved by NetApp engineering, so I can't claim that this is the right answer for all environments.
Tip #1: Enable the BIOS on only one HBA.
You need to enable the BIOS on the second HBA only if you have to reboot the server while the HBA originally used for booting, its cable, or its FC switch has failed. In this scenario, you would use QLogic Fast!UTIL to select the active HBA, enable the BIOS, scan the bus to discover the boot LUN, and assign the WWPN and LUN ID to the active HBA. When both HBA connections are functional, only one needs to have its BIOS enabled.
Tip #2: Modify the Execution Throttle/Queue Depth.
The Execution Throttle/Queue Depth signifies the maximum number of outstanding commands that can execute on any one HBA port. The default for ESX 3.0.0 is 32, but the best value for your environment depends on a couple of factors:
- Total number of LUNs exposed through the array target ports
- Array target port queue depth
The formula to determine the value is:
Queue Depth = Target Queue Depth / Total number of LUNs mapped from the array
This formula guarantees that a simultaneous heavy load on every LUN will not flood the Target Port and result in QFULL conditions. A QFULL condition signals that a Target port has been sent more I/O than it is able to queue. In most operating systems, upon receiving a QFULL condition from the Target, the HBA driver will typically decrease a LUN's maximum queue depth to the minimum value, typically "1", thereby throttling I/O to the Target port. When the Target stops issuing QFULL conditions, the HBA driver will start gradually increasing the LUN queue depth value, thereby slowly increasing I/O to the Target port.
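The throttling behavior described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not the logic of any particular HBA driver: the function names, the minimum of 1, and the ramp step of 1 per interval are assumptions made for the example.

```python
def next_lun_queue_depth(current, configured_max, qfull, minimum=1, ramp_step=1):
    """Return the LUN queue depth for the next interval.

    Assumed policy (hypothetical, for illustration only):
    - on QFULL, drop straight to the minimum depth;
    - otherwise, ramp up by `ramp_step` until the configured maximum is reached.
    """
    if qfull:
        return minimum
    return min(configured_max, current + ramp_step)


def simulate(configured_max, qfull_events):
    """Replay a sequence of QFULL signals and record the resulting depths."""
    depth = configured_max
    history = []
    for qfull in qfull_events:
        depth = next_lun_queue_depth(depth, configured_max, qfull)
        history.append(depth)
    return history


# One QFULL in the second interval, followed by three quiet intervals.
print(simulate(32, [False, True, False, False, False]))
# [32, 1, 2, 3, 4]
```

The point of the sketch is the asymmetry: a single QFULL collapses the depth immediately, while recovery is gradual, which is why avoiding QFULL in the first place matters for throughput.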
Here's an example of how the above formula can help you avoid a QFULL condition. If a Target Port has a queue depth of 1024 and 64 LUNs are exposed through that port, then the Queue Depth on each host should be set to 16 outstanding I/Os per LUN. This is the safest approach and guarantees no QFULL conditions:
16 Outstanding I/Os per LUN x 64 LUNs = 1024 = Target Port Queue Depth
But be careful. If you perform a separate queue depth calculation for each host using the above formula, counting only the LUNs mapped to that host, you still have the potential for QFULL conditions.
Here's why. Let's expand the previous example and assume that we have a total of 64 LUNs and four ESX hosts, each of which has 16 LUNs mapped.
Performing the calculation for each ESX host separately would yield: Queue Depth = 1024 / 16 LUNs = 64 Outstanding I/Os per LUN. However, a simultaneous heavy load on all 64 LUNs across the four ESX servers would yield: 64 Outstanding I/Os per LUN x 64 LUNs = 4096, which is four times the queue depth of the physical array Target Port (1024). This is an undesirable condition that, under certain circumstances, will generate a QFULL and throttle I/O.
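The arithmetic in this example is easy to check with a short Python sketch. The function names are made up for illustration, and rounding down with integer division is an assumption (it errs on the safe side when the numbers do not divide evenly).

```python
def per_lun_queue_depth(target_queue_depth, lun_count):
    """Queue Depth = Target Queue Depth / number of LUNs (rounded down)."""
    if lun_count <= 0:
        raise ValueError("lun_count must be positive")
    return target_queue_depth // lun_count


def worst_case_outstanding(per_lun_depth, total_luns):
    """Outstanding I/Os at the target port if every LUN is fully loaded."""
    return per_lun_depth * total_luns


TARGET_QUEUE_DEPTH = 1024   # queue depth of the array target port
TOTAL_LUNS = 64             # all LUNs exposed through that port
ESX_HOSTS = 4
LUNS_PER_HOST = TOTAL_LUNS // ESX_HOSTS  # 16 LUNs mapped to each host

# Dividing by every LUN behind the port keeps the worst case within the limit.
safe_depth = per_lun_queue_depth(TARGET_QUEUE_DEPTH, TOTAL_LUNS)
print(safe_depth, worst_case_outstanding(safe_depth, TOTAL_LUNS))
# 16 1024

# Dividing by only one host's LUNs oversubscribes the port four times over.
per_host_depth = per_lun_queue_depth(TARGET_QUEUE_DEPTH, LUNS_PER_HOST)
print(per_host_depth, worst_case_outstanding(per_host_depth, TOTAL_LUNS))
# 64 4096
```

In other words, the divisor should be the total number of LUNs behind the target port, regardless of how those LUNs are distributed across hosts.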
To Change the Queue Depth on a QLogic HBA
- Create a copy of /etc/vmware/esx.conf.
- Locate the following entry for each HBA:
/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"
/device/002:02.0/options = ""
- Modify as shown:
/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"
/device/002:02.0/options = "ql2xmaxqdepth=xxx"
Where xxx is the queue depth value.
- Reboot.
Tip #3: Modify the PortDownRetryCount parameter.
The PortDownRetryCount parameter must be set, using Fast!UTIL, to the value recommended by your storage vendor. This setting specifies the number of times the adapter's driver retries a command to a port that is returning Port Down status. On ESX Server, the effective value is 2*n + 5, where n is the value of PortDownRetryCount from the HBA BIOS.
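As a quick illustration of the 2*n + 5 relationship, the following Python sketch computes the effective value for a few BIOS settings. The function name and the sample values of n are hypothetical; use the figure your storage vendor recommends.

```python
def esx_port_down_retry(bios_port_down_retry_count):
    """Effective ESX Server value: 2*n + 5, where n is the HBA BIOS setting."""
    return 2 * bios_port_down_retry_count + 5


# Sample BIOS values chosen only to illustrate the formula.
for n in (8, 16, 30):
    print(n, esx_port_down_retry(n))
# 8 21
# 16 37
# 30 65
```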
You can change this value directly in the HBA, or you can do it after you've installed ESX by editing the /etc/vmware/esx.conf file. To edit the file, locate the "options=" entry under the HBA model you are using and make the following change.
To Change the PortDownRetryCount Parameter
- Create a copy of /etc/vmware/esx.conf.
- Locate the following entry for each HBA:
/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"
/device/002:02.0/options = ""
- Modify as shown:
/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"
/device/002:02.0/options = "qlport_down_retry=xxx"
Where xxx is the value recommended by your storage vendor. The equivalent setting for Emulex HBAs is "lpfc_nodev_tmo". The default is "30".
- Reboot.
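Both the queue depth and the PortDownRetryCount procedures edit the same "options" line, so applying both means putting two settings in one string. The Python sketch below shows one way to do that against a copy of the file's text. It is not a VMware tool: the function name is made up, the values 16 and 30 are placeholders, and it assumes that multiple driver options can share the string separated by spaces, as is usual for Linux module option strings.

```python
import re


def set_module_option(conf_text, device_path, name, value):
    """Return conf_text with name=value set on the device's options line.

    Existing options on the line are kept; an earlier setting of the same
    name is replaced. Assumes options are space-separated (an assumption).
    """
    pattern = re.compile(
        r'^(' + re.escape(device_path) + r'/options\s*=\s*")([^"]*)(")[ \t]*$',
        re.MULTILINE,
    )

    def rewrite(match):
        # Normalize "name= value" to "name=value" before splitting on spaces.
        existing = re.sub(r"\s*=\s*", "=", match.group(2)).split()
        kept = [opt for opt in existing if not opt.startswith(name + "=")]
        kept.append(f"{name}={value}")
        return match.group(1) + " ".join(kept) + match.group(3)

    new_text, count = pattern.subn(rewrite, conf_text)
    if count == 0:
        raise KeyError(f"no options line found for {device_path}")
    return new_text


# Sample text mirroring the esx.conf entries shown in the steps above.
sample = (
    '/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"\n'
    '/device/002:02.0/options = ""\n'
)

updated = set_module_option(sample, "/device/002:02.0", "ql2xmaxqdepth", 16)
updated = set_module_option(updated, "/device/002:02.0", "qlport_down_retry", 30)
print(updated)
```

Working on the text of a backup copy, rather than the live file, keeps the original available if the edited line turns out to be wrong. A reboot is still required for the new options to take effect.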
Overall Assessment
So far, my experience with SAN booting VMware ESX 3.0.0 has been nothing but positive. From a procedural perspective, the process is certainly much easier than with previous releases. In addition, I've found the ESX host's reliability during storage controller failover testing to be rock solid so far.