Virtualization Changed Everything
While we may all be familiar with virtualization at this point, we are still feeling its significant impact on our data centers. What was once sprawling racks of servers has been consolidated into ever-smaller hardware packages. With the advent of cloud computing, your hardware footprint could in fact be zero! However, despite the technology making our lives so much easier, it’s important to consider that our responsibility for these workloads remains the same.
But Don’t Overlook Backups
In my experience talking with customers who are creating a new virtual environment, one of the most frequently overlooked components is backup. Unfortunately, there is nothing glamorous about backup, and it is often relegated to an afterthought compared to the novelty of a new environment. However, even the best storage arrays can’t protect against accidental (or willful) deletions, and there is always a chance of corruption affecting your VMs. As any administrator will tell you, backups are not glamorous, but not having them is a career ender.
It is important to remember, however, that backups of virtual machines can be very different from traditional backups. Instead of backing up a physical machine, you are essentially backing up a combination of files that represents a machine. This abstraction opens up many new possibilities, and complications, for backing up those workloads. Some of the largest things to consider are:
5 Key VM Backup Considerations
1. Agent vs. Agent-Less
Traditional backups almost always require an agent. The backup server needs some way to communicate with the backup client, and the easiest way to do that has always been to install some form of agent on the client. Once a machine is virtualized, however, it can be backed up quite easily without an agent. The physical host that it runs on can be used as a proxy to access the virtual machine, which is now represented by files. Though this can simplify things on the guest OS, it’s important to remember that backup agents do more than simply provide a means of communication. Backup agents for databases and applications, such as Oracle, Exchange, and SQL Server, also make sure that the backups are consistent, so that after a restore the database can actually be read. Though virtualization vendors have gotten much better at doing this at the virtualization layer, many legacy databases will still require an agent to get a good backup.
2. File Level Backups
As useful as backing up an entire VM may be, most restore requests will be for one file or a small handful of files. This may pose a problem if you are doing only VM level backups, as the restore process would require restoring the entire VM, extracting the files that you need, and then transferring them over. Backup software has gotten better at handling file level restores through native hypervisor integration points, but this usually amounts to automating that same process, and the space required to restore the VM usually still exists. Even if it does not, it’s important to read the fine print regarding the backup software’s capabilities. Especially for Linux VMs, file level backups can have many caveats that make the feature impractical. For example, one of my clients attempted to do file level restores of their Linux VMs, only to learn that the procedure required the client machine to have Java and a web browser. Given that their Linux machines did not have the GUI components installed, they ended up having to do a full VM restore.
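The space requirement mentioned above can be checked before a restore is attempted. The following is a minimal Python sketch, not any vendor's procedure: the function names, the staging path, and the 10 percent headroom are all assumptions made for illustration.

```python
import shutil


def can_stage_full_restore(vm_size_bytes: int, free_bytes: int,
                           headroom_percent: int = 10) -> bool:
    """Return True if the staging area can hold the whole VM plus headroom.

    The headroom figure is a hypothetical safety margin, not a vendor rule.
    Integer arithmetic is used so the result is exact for byte counts.
    """
    required = vm_size_bytes + vm_size_bytes * headroom_percent // 100
    return free_bytes >= required


def staging_free_bytes(path: str) -> int:
    """Free space, in bytes, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free
```

For example, `can_stage_full_restore(vm_size, staging_free_bytes("/restore/staging"))` would tell you whether a full VM restore is even possible before you commit to it, where `/restore/staging` is a placeholder for wherever your backup software stages restored VMs.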
3. Network vs. SAN
One of the really cool parts of doing backups at a VM level is that it opens up new avenues for the backup server to access the data that is being backed up. Most virtualized environments will have some form of shared storage, and as such, will be part of a SAN. Traditionally, backups are done over the regular IP network, with the backup server relying on the client to interpret and access the storage that is being backed up. However, with all of the VM data on the SAN, giving the backup infrastructure direct access can significantly accelerate backups. This model eliminates the network as a bottleneck, allowing you to run backups outside the traditional nightly backup window. Doing SAN level backups, however, will require that you give the backup hosts the same access to the SAN as your virtual infrastructure, which may pose additional security concerns. This also means that you’ll need to keep a closer eye on SAN zoning, which is typically outside the purview of the backup administrator.
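To get a rough feel for how much the transport path matters, you can estimate how long a full backup would take at a given sustained throughput. This is a back-of-the-envelope Python sketch; the function name and the throughput figures in the usage note are illustrative assumptions, not measurements from any particular environment.

```python
def backup_hours(data_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move `data_tb` terabytes at a sustained throughput.

    Uses decimal units (1 TB = 1,000,000 MB) and ignores compression,
    deduplication, and incremental backups, all of which change the result.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    data_mb = data_tb * 1_000_000
    return data_mb / throughput_mb_per_s / 3600
```

With these assumptions, `backup_hours(10, 100)` (10 TB over a link sustaining 100 MB/s) comes to roughly 27.8 hours, while `backup_hours(10, 800)` (the same data over a hypothetical SAN path sustaining 800 MB/s) comes to roughly 3.5 hours. Substitute throughput numbers you have actually measured before drawing conclusions about your own backup window.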
4. Replication
Though replication is not technically a backup, it is still a valuable tool to protect against data loss, and has the added benefit that it can be used for disaster recovery. When selecting replication technologies, it’s important to keep in mind at which layer the replication occurs. If you are replicating at the storage level, this will typically require that the storage technologies at the source and destination sites are similar, if not identical. However, an increasing number of solutions take advantage of replication at the hypervisor level. This means that the virtual machine files are being replicated rather than the storage blocks. This gives much greater flexibility in terms of the hardware being used, but also means that the replication will be exclusive to your VMware environment, as it will not replicate storage data outside of the virtualization layer. Whichever method you choose, make sure you conduct a test failover. There’s nothing worse than thinking you have access to your data when you don’t.
5. Cloud vs. Local
One of the more recent developments is the increased availability of cloud-based backup solutions. This is particularly attractive for virtual environments, as you are able to offload entire machines to the cloud, and in case of a disaster, can even spin the workload back up in the cloud. This gives you access to a disaster site without having to maintain the hardware that it would usually entail. Of course, the biggest concern here is security, and although I argued in my previous article that the cloud is only as safe as you make it out to be, there will always be those uncomfortable with losing control of their data. However, if you are able to use them, cloud backups provide an excellent way to send backup data offsite. Traditional solutions that remain local to the data center, if improperly maintained, provide more headaches than peace of mind to backup administrators. Let someone else have the headaches.
In many ways I’m just scratching the surface when it comes to backing up VMs. I hope that these key considerations give you an idea of the options available and act as a springboard for discussion with your vendors about their backup capabilities.
Remember, backups are only as good as their restores, so test, test, and test again. When the time comes, you’ll be glad you did.
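One simple way to test a restore is to compare a checksum of the restored file against the original, or better, against a checksum recorded when the backup was taken. The Python sketch below uses only the standard library; the function names are my own for illustration and are not part of any backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)
```

Keep in mind that a live file may legitimately change after the backup runs, so a mismatch against the current original does not always mean the restore failed. Comparing against a digest saved at backup time avoids that ambiguity.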
About the Author
Michael Chen is a Consultant at Daymark Solutions and specializes in virtualization, as well as backup and data recovery for enterprise companies in Financial Services, Telecom, and Healthcare. Michael holds various certifications from VMware, EMC, Hitachi, and Symantec.