by Tom Finnis – January 20, 2010
In my previous article on vSphere Data Recovery, you learned how to deploy the DR plug-in for the vSphere4 client and how to add the appliance to your virtual infrastructure. You also learned that one of its key features is an intuitive, wizard-driven management interface that is integrated with the vSphere client to allow for simple configuration of your backup jobs. Assuming you followed the steps described in that article, you should now be ready to learn how to use that management interface. In this article we will cover creating a backup schedule for a virtual machine, running a backup job and then restoring that VM from the backup.
Data Recovery Basic Principles
The vSphere Client Data Recovery plug-in is used to configure the Data Recovery virtual machine, which then takes care of backup and restore jobs. In theory the DR VM can back up as many as eight VMs concurrently, although its CPU utilization must be under 90% for it to start a backup job; otherwise it will wait until utilization drops. It works by using ESX’s snapshot feature to freeze a point-in-time copy of the target VM’s disks, which gives it a locked image to back up whilst the VM continues to operate, as any disk changes are instead written to an interim snapshot file. Once the backup has completed, the DR VM releases the snapshot so that the intervening disk changes are replayed from the interim snapshot file into the frozen disk image, bringing it back to a live state.
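The snapshot mechanism described above can be illustrated with a toy model. This is only a sketch: the `ToyDisk` class and its method names are invented for illustration and say nothing about how ESX actually stores snapshot files.

```python
class ToyDisk:
    """Toy model of a virtual disk with one snapshot (not VMware's on-disk format)."""

    def __init__(self, blocks):
        self.base = dict(blocks)  # block number -> data (the disk image)
        self.delta = None         # interim snapshot file; None when no snapshot exists

    def take_snapshot(self):
        # Freeze the base image; later writes are redirected to the delta.
        self.delta = {}

    def write(self, block, data):
        if self.delta is not None:
            self.delta[block] = data
        else:
            self.base[block] = data

    def read(self, block):
        # The running VM sees the delta first, then the frozen base.
        if self.delta is not None and block in self.delta:
            return self.delta[block]
        return self.base[block]

    def frozen_image(self):
        # What the backup reads: the base image only.
        return dict(self.base)

    def release_snapshot(self):
        # Replay the interim changes into the base image.
        if self.delta is not None:
            self.base.update(self.delta)
            self.delta = None


disk = ToyDisk({0: "boot", 1: "data-v1"})
disk.take_snapshot()
disk.write(1, "data-v2")        # the VM keeps running during the backup
backup = disk.frozen_image()    # the backup still sees the point-in-time copy
disk.release_snapshot()         # the disk returns to a single live image
```

After the release the backup holds the old contents of block 1 while the disk itself holds the new contents, which is the behaviour the paragraph above describes.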
Data Recovery supports writing backups to a variety of locations: either a local ESX datastore or network targets using CIFS-based file sharing such as SAMBA or Windows folder shares. However, due to memory constraints only two separate storage locations can be written to concurrently; more than two locations can be specified, but the jobs have to be scheduled to run separately. There is a limit of 100 virtual machines that can be backed up by a DR VM. Although it will let you create backup jobs for more than that number of VMs, it will simply not back up the excess. Additional DR VMs can be installed to work around this limitation, but extra care needs to be taken when configuring the backup jobs as the appliances are not aware of each other.
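When planning around these limits it can help to check a proposed layout before creating any jobs. The sketch below is a planning aid only; the function names are hypothetical, and it assumes the excess VMs are simply those at the end of your list, since which VMs the appliance would actually skip is not something you can rely on.

```python
MAX_VMS_PER_APPLIANCE = 100      # limit quoted above
MAX_CONCURRENT_DESTINATIONS = 2  # limit quoted above


def split_by_vm_limit(vm_names):
    """Return (protected, excess) for a single appliance, in list order."""
    protected = vm_names[:MAX_VMS_PER_APPLIANCE]
    excess = vm_names[MAX_VMS_PER_APPLIANCE:]
    return protected, excess


def destination_batches(destinations):
    """Group destinations so that no more than two are written to at once."""
    step = MAX_CONCURRENT_DESTINATIONS
    return [destinations[i:i + step] for i in range(0, len(destinations), step)]


vms = [f"vm{n:03d}" for n in range(120)]
protected, excess = split_by_vm_limit(vms)
batches = destination_batches(["store-a", "store-b", "store-c"])
```

Any VMs in `excess` would need a second appliance, and each batch of destinations would need its own non-overlapping backup window.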
It is important to note that, to ensure a fully restorable backup of a virtual machine’s state, Data Recovery attempts to make a “quiesced” snapshot. This requires the OS and any applications running on it to write any essential memory-resident data to disk so that it is included in the snapshot for backup; otherwise applications may lose important data. To do this VMware Tools has to be installed on the guest operating system. Data Recovery then instructs it to quiesce the system for snapshot creation and to de-quiesce when the process is completed. With Windows guest OSes that support the Volume Shadow Copy Service this is actioned by the VMware VSP service; otherwise VMware uses whatever quiescing support is available in the OS. You should therefore always ensure you have installed the most up-to-date version of VMware Tools available on all your virtual machines wherever possible. Not having VMware Tools installed will not stop you from backing up a VM, but your backups will only be “crash consistent” and may need a forced reboot after a restore.
vSphere 4.0 ESX hosts include optimisations for virtual machines created on them that enable advanced change tracking for the virtual disk states; these optimisations are not present on VMs created on older versions of ESX (3.5 and earlier). You can easily check which version your VMs are from the Summary tab in the vSphere client:
Virtual machines created on vSphere4 should be version 7, which supports the advanced data change tracking features, but if you have VMs created on ESX 3.5 or earlier then they will be version 4 or less. Fortunately you can easily upgrade the VM version, and it is well worth doing: just shut down the VM, then right-click it in the left-hand pane and select “Upgrade Virtual Machine version”.
However before you do this make sure you have the latest version of VMware Tools installed on your VM, as the version upgrade also changes some of the virtual hardware, e.g. the NICs, which require new drivers included in VMware Tools.
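If you have many VMs, checking each Summary tab by hand becomes tedious. The sketch below assumes you have already obtained a list of VM names and hardware version strings in the form `vmx-07` (the form I would expect the vSphere API to report, but treat that as an assumption); the function names and the sample inventory are made up for illustration.

```python
import re

CHANGE_TRACKING_MIN_VERSION = 7  # version 7 supports change tracking, per the text above


def hardware_version(version_string):
    """Parse a string such as 'vmx-07' into the integer 7."""
    match = re.fullmatch(r"vmx-(\d+)", version_string.strip())
    if match is None:
        raise ValueError(f"unrecognised hardware version: {version_string!r}")
    return int(match.group(1))


def needs_upgrade(version_string):
    """True if the VM is older than the version that supports change tracking."""
    return hardware_version(version_string) < CHANGE_TRACKING_MIN_VERSION


# Hypothetical inventory: VM name -> hardware version string.
inventory = {"web01": "vmx-07", "legacy-db": "vmx-04"}
to_upgrade = sorted(name for name, ver in inventory.items() if needs_upgrade(ver))
```

Here `to_upgrade` would contain only the VM that was created on the older host.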
This change-tracking function allows the Data Recovery VM to analyse the changes since the previous backup and thus will accelerate the backup process. Data Recovery also applies data de-duplication to each storage location so where information is repeated across VM backups it will only store that information once. This can lead to significant space savings, particularly when several VMs running the same OS are backed up to the same storage location, so should be taken into account when designing your backup strategy.
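To see why de-duplication saves so much space when similar VMs share a store, consider a minimal content-addressed store. This is a generic illustration of the technique using fixed-size chunks and SHA-256 hashes; it is not VMware's actual algorithm, and the `DedupStore` class and the tiny chunk size are assumptions chosen to keep the example readable.

```python
import hashlib


class DedupStore:
    """Minimal fixed-size-chunk de-duplicating store (illustrative only)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 digest -> chunk bytes, stored once

    def put(self, data):
        """Store data and return the list of digests needed to rebuild it."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # keep only the first copy
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.chunks[d] for d in recipe)

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
vm_a = b"OS__OS__AAAA"  # two VMs sharing the same "OS" blocks
vm_b = b"OS__OS__BBBB"
recipe_a = store.put(vm_a)
recipe_b = store.put(vm_b)
```

The two images total 24 bytes, but the store holds only three unique chunks, so the shared OS data is kept once. Splitting the same VMs across two stores would lose that saving.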
Setting Up Data Recovery
In the previous article you deployed the VMware Data Recovery appliance onto your vSphere infrastructure; now we need to finish configuring it and create a backup schedule. Open your vSphere Client and, if it is not there already, navigate to the “Home” page. You should now see a new icon under the “Solutions and Applications” section for “VMware Data Recovery”; click this to start managing your appliance. Should you not see the icon there, refer to the previous article for how to install the management plug-in, which has to be installed on each vSphere Client system you intend to use rather than on the vCenter Server. Since the release of version 1.1 VMware have simplified the interface and initial setup process, so now you can just select your VMware Data Recovery appliance from the list on the left and click “Connect”. The “Getting Started” wizard should then begin; if it doesn’t, you can start it manually by clicking the “Configuration” tab and then the “Getting Started” link.
On the first page you will be prompted for the credentials the VM-DR appliance will use to connect to your vCenter Server. Depending on your security requirements you may want to create a separate user account for it. The VM-DR appliance initiates various tasks in order to perform its backups, such as creating VM snapshots, so giving it its own login lets you easily see which tasks are its own when checking the vCenter logs.
The second step of the wizard configures the backup destination storage, for this guide we are assuming that you are using a VMFS store for your backup store, either on a SAN or local storage, which you attached to your VM-DR VM in the previous article. However if you want to use network based storage the process is the same, except you will first have to click the “Add Network Share” link here and provide the location of your storage.
Note that the VM-DR appliance is a Linux-based system and as such only supports CIFS/SAMBA shares. Technically this should include Windows shares, but there are a number of potential issues you may encounter. The first thing to check is that you are using the IP address of your network target rather than a name; if you are still having trouble connecting after that, I suggest a quick web search, which will turn up several things to check. If you are using a VMFS store you won’t have to worry about this and you should see the disk you added to your appliance listed in the wizard already:
If under “Type” it says “unmounted” then you will need to click “Mount” first, then you need to “Format” the disk, once this has completed you can click “Next” and then complete the wizard. On the final page check that you are happy with the settings you have chosen, check the “Setup new backup job” option and then click “Close”.
The new backup job wizard should now start, if it doesn’t you may start it manually by clicking the “Backup” tab and then clicking “New”. The first page of the wizard will list all the virtual machines on your vSphere infrastructure, with check boxes so you can select them for backup:
Tick the boxes of the VMs you want backed up. If you wish, you can expand a specific VM and choose to back up only certain disks, or you can just select the cluster/datacenter to back up all the VMs it contains. Click “Next” and then on the next page select the backup store which you wish the backups to go to; there should only be one to choose from at this stage. VMware Data Recovery supports multiple stores, although it can only back up to two different stores simultaneously. Bear in mind, however, that you will not maximise the benefits of the data de-duplication if you split your backups across several stores.
On the next page you need to define the “backup window” for the job, i.e. when it is allowed to run on each day of the week. The virtual machine backups themselves do not have a great impact on performance but the initial “quiesce” operation when the VM is snapshotted at the start of the backup can cause it to freeze for a while, especially if it has a high data throughput. As a result you should schedule your backup window so the backups start when your users aren’t online and when the processing demands on the VMs are at their lowest. In practice once the initial full backup of a VM has been completed the subsequent incrementals are much smaller and so are completed in a fraction of the time, so you may want to specify as large a window as possible to start with and then reduce it later on.
It’s here that one of the limitations of VMware Data Recovery compared to other commercial backup solutions becomes apparent: you have fairly limited control over your backup scheduling. You cannot run more than one backup a day and the precise timing of that backup starting is hard to control, although usually it will start at the first window of opportunity each day. You can, however, restrict the backup frequency to less than daily by defining your backup windows appropriately.
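It can be useful to write your intended backup window down explicitly before entering it in the wizard. The sketch below models a window as permitted hour ranges per weekday and checks whether a given moment falls inside it. The data layout and the function name are my own invention for planning purposes, not anything the appliance exposes.

```python
from datetime import datetime

# Weekday numbers follow Python's convention: Monday is 0, Sunday is 6.
# Each day maps to a list of (start_hour, end_hour) ranges, end exclusive.
WEEKNIGHTS = {day: [(0, 6), (22, 24)] for day in range(0, 5)}

# Leaving days out restricts the frequency to less than daily.
WEEKENDS_ONLY = {5: [(0, 24)], 6: [(0, 24)]}


def in_backup_window(window, when):
    """True if the datetime falls inside one of that weekday's hour ranges."""
    ranges = window.get(when.weekday(), [])
    return any(start <= when.hour < end for start, end in ranges)


wednesday_night = datetime(2010, 1, 20, 23, 0)
wednesday_noon = datetime(2010, 1, 20, 12, 0)
saturday_night = datetime(2010, 1, 23, 23, 0)
```

With these definitions, the Wednesday night falls inside `WEEKNIGHTS` while the Wednesday noon and the Saturday night do not, and only the Saturday falls inside `WEEKENDS_ONLY`.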
On the last page you have to define your retention policy, i.e. how many historic backups you want to keep and for how long. What you choose here will be a compromise between the amount of storage space you need for your backups and how far back you will be able to go if you need to recover systems or files from the past. At this stage it is virtually impossible to judge how much space each backup will consume, since it depends on the daily changed data offset by the savings achieved by de-duplication. I would therefore advise selecting a fairly conservative policy (e.g. “more” or “many”) for now, and then adjusting it if necessary in a few weeks’ time when you are able to judge your storage consumption more accurately. Here you will discover another shortcoming of VM-DR: the reporting in general is rather concise and it can be fiddly to work out how much storage your backups are consuming. This is partly a side effect of the de-duplication. The logs will indicate figures for each backup, but these are the “theoretical” totals, so the best option is to monitor how the free space on your backup store declines with usage.
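Since watching free space is the most reliable measure, a simple straight-line projection from a few readings can tell you roughly how long the store will last. This is a rough sketch with made-up figures; it assumes consumption stays steady, which will not hold once the retention policy starts expiring old backups.

```python
def days_until_full(readings):
    """Estimate days of capacity left from (day_number, free_gb) readings.

    Uses the first and last readings only. Returns None if free space
    is not shrinking.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    (first_day, first_free), (last_day, last_free) = readings[0], readings[-1]
    elapsed = last_day - first_day
    if elapsed <= 0:
        raise ValueError("readings must span at least one day")
    daily_use = (first_free - last_free) / elapsed
    if daily_use <= 0:
        return None
    return last_free / daily_use


# Hypothetical weekly readings: day number and free space in GB.
readings = [(0, 500.0), (7, 465.0), (14, 430.0)]
remaining = days_until_full(readings)
```

In this example the store loses 5 GB a day, so the 430 GB remaining would last about 86 days at the current rate.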
The final page of the wizard will confirm the settings you have chosen, so check these are ok and then click “Finish” to save the backup job. Depending on whether you are currently within your backup window it may start running the backup job immediately, in which case you will see the snapshot tasks appear in the task pane at the bottom. The backup jobs themselves do not appear here but you can monitor their progress by clicking the “Reports” tab and selecting “Running Tasks”. After a few days of operation if all has gone well you should see a list of successful backup tasks on this page, and if you click “Virtual Machines” you should be able to see the daily backup points for each VM.
Carrying Out a Restore Rehearsal
It’s never a good idea to find out that there is a problem with your backup solution only when you need to restore something in a disaster situation, hence regular testing is recommended. VMware Data Recovery addresses this rather well with its “Restore Rehearsal” option, which allows you to restore a virtual machine from backup without affecting the live version. It is simple to run: just right-click the virtual machine in the left-hand pane and select “Restore Rehearsal”, then follow the wizard’s instructions to restore another copy of the VM to your vSphere datacenter. Once the restore is complete you can change to the Inventory view in your vSphere client and you will see the new VM listed. Double-check that the NIC is not connected, then power it on to check that everything is working correctly. When you are happy that the restore has been successful you can shut down the VM again and delete it from the datastore to release the storage space.
Assuming you have followed the steps laid out in this article you should now have your Data Recovery appliance up and running regular backups for you, and you can test that it is working correctly with restore rehearsals. Unfortunately it does not have any automated reporting or alerting features, so the only way you can check your backups are completing successfully is to check regularly yourself, and remember to keep an eye on the free space in the backup store.
VMware Data Recovery lacks many of the features you would expect from commercial backup applications, but considering that it is included with most of the vSphere bundles it can be a useful addition to your disaster recovery provisions. In its present state I would not recommend it as your only backup solution, but it can provide you with an additional level of protection and an alternative recovery option. Assuming you don’t already have a system image backup application, it gives you the capability to rapidly restore complete virtual machine images when required, and the incremental backups combined with the data de-duplication mean its storage requirements are not excessive.
You can try out VMware Data Recovery for free by evaluating VMware vSphere at this link.