I am not biased toward one virtualization solution or another, but I know a great product and amazing features when I see them. VMware ESX Server and the VMware Infrastructure suite have a lot of amazing features that really “set the bar” for other virtualization products. One of those features is VMware’s High Availability feature, dubbed VMHA.
When a physical server goes down or loses all network connectivity, VMHA steps in and restarts the virtual guest machines from that server on another server. This way, the virtual machines can be up and running again in just the time that it takes them to reboot.
Figure 1: VMware High Availability (VMHA) – Image Courtesy of VMware.com
This is a very powerful feature because it means that any operating system and appliance can have high availability just by running inside the VMware Infrastructure.
There are a number of requirements to make this happen and there are both good and bad qualities of VMHA. I will cover all of that and show you how to configure VMHA in this article.
Let’s get started.
What is required to make VMHA work?
There are a number of requirements that you will have to meet to make VMHA work. Those requirements are:
VMware Infrastructure Suite Standard or Enterprise (no, you cannot do it with the free ESXi, nor can you do it with the VMware Infrastructure Foundation edition).
At least 2 ESX host systems.
A shared SAN or NAS between the ESX Servers where the virtual machines will be stored. Keep in mind that with VMHA the virtual disks for the VMs covered by VMHA never move. What happens when a host system fails is that the ownership of those virtual machines is transferred from the failed host to the new host.
CPU compatibility between the hosts. The easiest way to test this is to attempt a VMotion of a VM from one server to another and see what happens. Here is what CPU incompatibility looks like when it fails:
Figure 2: CPU Incompatibility
If you cannot achieve CPU compatibility between hosts in the HA cluster, then you will have to configure CPU Masking (see VMworld: VMotion between Apples and Oranges).
Highly recommended: VMware management network redundancy (at least two NICs associated with the VMware port used for VMotion and iSCSI). If you do not have this, you will see:
Figure 3: Configuration issues because there is no VMware management network redundancy
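The CPU compatibility requirement above can be pictured as a comparison of CPU feature sets. The Python sketch below is only an illustration under that assumption: the feature flags and both function names are hypothetical, and VMware's real compatibility check and CPU masking operate on CPUID register bits rather than on named flags.

```python
def missing_features(source, target):
    """Return the features the source host exposes that the target host lacks."""
    return sorted(set(source) - set(target))


def common_mask(*hosts):
    """Return the features shared by every host (what a mask would leave visible)."""
    feature_sets = [set(h) for h in hosts]
    return sorted(set.intersection(*feature_sets))


# Hypothetical feature flags for two hosts (not taken from real hardware).
esx3 = {"sse2", "sse3", "nx"}
esx4 = {"sse2", "sse3", "ssse3", "nx"}

# Features that would block a move from esx4 to esx3.
print(missing_features(esx4, esx3))
# Feature set that both hosts can offer to a virtual machine.
print(common_mask(esx3, esx4))
```

In this toy model, an empty result from missing_features in both directions means the hosts are compatible. Otherwise, exposing only the common_mask features to the VM is the idea behind CPU masking.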
What is great about VMHA?
Here are some of the great features of VMHA:
Provides high availability for all virtual machines at a low cost (compared to purchasing an HA solution on a per-machine basis).
Works for any OS that runs inside VMware ESX. That means that even if I create a Vyatta virtual router running inside ESX Server, as long as that ESX Server is in an HA cluster, the Vyatta virtual router will be restarted on another ESX host system if the server it is running on goes down.
VMHA is easy to configure. If you have the right equipment, licenses, and VMware Infrastructure already set up, you can configure VMHA in just a few minutes.
Works with DRS (Distributed Resource Scheduler), such that when VMs are going to be brought up on other hosts in the cluster due to a host failure, DRS is used to determine where that load should be placed and to balance that load.
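To make the placement idea in the last point concrete, here is a minimal Python sketch of restarting a failed host's VMs on the surviving host with the most free memory. It is a toy model and not VMware's HA or DRS algorithm. The function name, the memory-only accounting, and the host "esx5" and VM "web01" are assumptions made for illustration.

```python
def place_failed_vms(failed_host, hosts, vms):
    """Reassign VMs from failed_host to surviving hosts, most free memory first.

    hosts: dict of host name -> memory capacity in MB
    vms:   dict of VM name -> (host name, memory in MB)
    Returns a dict of VM name -> (host, memory); VMs that fit nowhere get host None.
    """
    survivors = {h: cap for h, cap in hosts.items() if h != failed_host}
    used = {h: 0 for h in survivors}
    placement = {}
    # VMs on surviving hosts stay where they are.
    for name, (host, mem) in vms.items():
        if host != failed_host:
            used[host] += mem
            placement[name] = (host, mem)
    # Restart the largest orphaned VMs first so they get the most free room.
    orphans = sorted(
        ((name, mem) for name, (host, mem) in vms.items() if host == failed_host),
        key=lambda item: item[1],
        reverse=True,
    )
    for name, mem in orphans:
        candidates = [h for h in survivors if survivors[h] - used[h] >= mem]
        if not candidates:
            placement[name] = (None, mem)  # not enough capacity anywhere
            continue
        target = max(candidates, key=lambda h: survivors[h] - used[h])
        used[target] += mem
        placement[name] = (target, mem)
    return placement


# Hypothetical three-host cluster in which esx4 fails.
hosts = {"esx3": 8192, "esx4": 8192, "esx5": 8192}
vms = {"win2008": ("esx4", 2048), "vyatta": ("esx4", 512), "web01": ("esx3", 4096)}
result = place_failed_vms("esx4", hosts, vms)
for name, (host, mem) in result.items():
    print(name, "->", host)
```

Because the virtual disks sit on shared storage, nothing is copied in this model. Only the host that owns each VM changes.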
What is “not so great” about VMHA?
Just like with any solution, there are some features of VMHA that are not as great. Those features are:
CPUs on each host must be compatible (almost exactly) or you will have to configure CPU masking on every virtual machine.
Virtual machines that are on the host system that goes down WILL have to be restarted.
VMHA is unaware of the underlying applications on those VMs. That means that if the underlying application data is corrupt from an application crash and server reboot, then even though the VM migrates and reboots from a crashed machine, the application still may be unusable (not that this is necessarily VMware’s fault).
How do I configure VMHA?
Configuration of VMHA is easy; just follow these steps:
The following assumes that you already have two ESX Server host systems and the VMware Infrastructure Suite (VI Suite), that the CPUs on the host systems are compatible, that you have a shared storage system, and that all licensing related to VMHA is in place.
In the VI Client's Inventory view, right-click on your datacenter and click New Cluster.
Figure 4: Adding a New HA Cluster
This brings up the New Cluster Wizard. Give the cluster a name and, assuming you are only creating an HA cluster, check the VMware HA cluster feature.
Figure 5: Naming the HA Cluster
Next, you will be given a chance to configure the HA options for this cluster. There is a lot to consider here: how many host failures to tolerate, whether guests will be powered on if the proper amount of resources is not available, host isolation, restart priority, and virtual machine monitoring. To learn more about these settings, please read the VMware 3.5 Documentation.
Figure 6: Configuring HA Options
Select the swapfile location – either with the VM on your shared storage or on the host. I recommend keeping the swapfile with the VM on your shared storage.
And finally, you are shown the “ready to complete” screen where you can review what you are about to do and click Finish.
Once the HA cluster is created, you need to move ESX host systems into the cluster by clicking on them and dragging them into the cluster. You can also move VMs to the cluster in the same way. Here are my results:
Figure 7: HA Cluster created with ESX Server hosts and VMs inside
At this point, you should click on the cluster to see if there are any configuration issues (as you see in Figure 3). Also, notice how the cluster has its own tabs for Summary, Virtual Machines, Hosts, Resource Allocation, Performance, Tasks & Events, Alarms, and Permissions.
Even though I had configuration issues (no redundant management network), my VMHA cluster was still functional. To get around the “insufficient resources to satisfy configured failover level for HA” error message when powering up a VM, I changed the HA configuration to “Allow VMs to be powered on even if they violate availability constraints”.
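The “insufficient resources” message comes from HA checking whether the cluster could still run its VMs after the configured number of host failures. The Python sketch below shows the shape of such a check under heavy simplification: it compares total VM memory with the capacity left after the largest hosts fail. VMware's actual calculation is more involved, and the function names and the strict flag (standing in for the “Allow VMs to be powered on” option) are hypothetical.

```python
def can_tolerate_failures(host_capacity_mb, vm_memory_mb, failures_allowed):
    """Return True if all VMs still fit after the largest hosts fail.

    Worst case assumed: the `failures_allowed` largest hosts fail at once.
    Only aggregate memory is compared; per-host fit is ignored.
    """
    if failures_allowed >= len(host_capacity_mb):
        return False  # no host would be left to run anything
    keep = len(host_capacity_mb) - failures_allowed
    surviving = sorted(host_capacity_mb)[:keep]
    return sum(vm_memory_mb) <= sum(surviving)


def may_power_on(host_capacity_mb, running_vm_mb, new_vm_mb, failures_allowed, strict=True):
    """Decide whether a new VM may be powered on.

    strict=True models refusing power-on when failover capacity would be violated;
    strict=False models allowing power-on despite the violation.
    """
    fits = can_tolerate_failures(
        host_capacity_mb, running_vm_mb + [new_vm_mb], failures_allowed
    )
    return fits or not strict


# Hypothetical two-host cluster with 4 GB per host, tolerating one host failure.
hosts_mb = [4096, 4096]
print(may_power_on(hosts_mb, [2048], 1024, failures_allowed=1))
print(may_power_on(hosts_mb, [3072], 2048, failures_allowed=1))
print(may_power_on(hosts_mb, [3072], 2048, failures_allowed=1, strict=False))
```

With strict=False the VM powers on, but the cluster no longer guarantees that every VM can be restarted after a host failure. That is the trade-off of the setting I changed.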
Let’s test it.
How do I know if VMHA worked?
To test VMHA, I had two low-end Dell servers in my cluster. I had one Windows Server 2008 system running on ESX host “esx4”. To perform a simple HA test, I rebooted host “esx4” without going into maintenance mode. This caused the Windows Server 2008 VM to move from “esx4” to “esx3” and be rebooted. Here is the “before and after”:
Figure 8: Before causing the failure of server ESX4
Figure 9: After the failure of server ESX4 – proving the VMHA was successful
In this test, we saw that the Windows Server 2008 VM was moved from “esx4” to “esx3” when “esx4” was restarted.
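If you record which host each VM runs on before and after a test like this, the comparison is easy to script. The short Python sketch below assumes you have captured both snapshots by hand as simple dictionaries. The VM name “win2008” and the function name are hypothetical.

```python
def moved_vms(before, after):
    """Return {vm: (old_host, new_host)} for every VM whose host changed."""
    return {
        vm: (before[vm], after[vm])
        for vm in before
        if vm in after and before[vm] != after[vm]
    }


# Snapshots of VM -> host, written down before and after rebooting esx4.
before = {"win2008": "esx4"}
after = {"win2008": "esx3"}
print(moved_vms(before, after))
```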
In this article, you learned what VMware’s High Availability solution is and how to configure it. We started off with the requirements to use VMHA. From there, you saw what was good and what was not so good about VMHA. After showing you how to configure VMHA, I demonstrated exactly how it works in a real server failure. VMHA is really the leader when it comes to virtualization high availability.