What is the best backup solution for a VMware Infrastructure system that hosts a wide variety of VMs?
In a situation where you are running:
- VMware Infrastructure 4.x with multiple hosts
- Over 150 VMs with a wide variety of operating systems (Linux in a half dozen distros, Solaris, every MS version, etc.) in multiple languages with almost every mix of installed software (luckily, no Exchange mail servers)
- Using an EMC Fibre Channel SAN
- The VMs that need to be backed up use about 2 terabytes of data (total)
- The goal is to keep backups for about 3 months
At this rough scale, what backup solutions have worked well for you? And, as an add-on question, did any of them have de-duplication that you thought was effective and useful?
vSphere has this new-fangled backup API (the vStorage APIs for Data Protection) that kinda does away with the VCB proxy if you want to do direct SAN-based data transfers on your backups. There are several vendors with products that support this, and my experiences so far have been very positive.
The main advantages of SAN/vStorage-based backup jobs are:
- Agent-less backup for the majority of VMs. Backups are taken by snapshotting a VM, backing up the static disk that this generates, then releasing the snapshot. If software within the VM is OK with snapshots, then it'll be OK with agentless backup via this method. VSS-aware apps like Exchange and SQL are, I believe, OK with snapshots... so you don't need agents unless you want granular (item-level) recovery of stuff like individual emails and table rows.
- SAN-based backups can be really fast, especially if you're pushing that data at quiet times. We're maxing out all SP interfaces on our iSCSI SAN when backups are running overnight.
- Changed Block Tracking makes incremental/differential backup of a whole VM possible, fast, and very small.
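To show why changed block tracking keeps incrementals so small, here's a toy sketch in Python. It is not the VMware API and doesn't model the snapshot step at all; `TrackedDisk`, `backup` and `restore` are names I made up purely for illustration. The idea is just that the disk remembers which blocks were written since the last backup, so the next backup only has to copy those.

```python
class TrackedDisk:
    """Toy virtual disk that records which blocks change between backups."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        # Before the first backup every block counts as changed.
        self.changed = set(range(num_blocks))

    def write(self, index, data):
        """Write one block and remember that it is now dirty."""
        self.blocks[index] = data
        self.changed.add(index)

    def backup(self):
        """Return only the blocks changed since the last backup, then reset tracking."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta


def restore(backups, num_blocks):
    """Rebuild a disk image by replaying the full backup and each incremental in order."""
    image = [b""] * num_blocks
    for delta in backups:
        for index, data in delta.items():
            image[index] = data
    return image


disk = TrackedDisk(num_blocks=8)
for i in range(8):
    disk.write(i, b"base")
full = disk.backup()          # first backup copies all 8 blocks

disk.write(2, b"new")
disk.write(5, b"new")
incremental = disk.backup()   # second backup copies only blocks 2 and 5

print(len(full), len(incremental))
```

Restoring means replaying the full backup and then every incremental in order, which is why a chain with a missing or broken link is a problem.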
There are a few caveats:
- You really need to be going Disk-to-Disk with this, not Disk-to-Tape. So a few TBs of storage on your backup host is definitely recommended. Without this you just won't see the throughput.
- Your physical backup host needs to be plumbed into your SAN and be able to see all your LUNs in order to back them up. In practical terms this means you usually have a Windows box with an HBA and a ton of 'unidentified volumes' in Disk Management, which it wants to initialize for you every time you peek in there. If you let it, it'll trash your VMFS volumes. Kinda sucks.
- The Changed Block Tracking feature seems to go a bit wonky if you, for whatever reason, don't get a backup for a few days in a row.
- As mentioned briefly, you will probably want agents for SQL, Exchange and AD hosts even if they're VMs, to give you granular recovery.
- Your ESX or ESXi hosts must be licensed. ESXi Free doesn't enable the vStorage stuff. Having a VirtualCenter is also advisable but not, I believe, required.
Individual products are hard to recommend since I've only really worked with Backup Exec, but I'm finding the 2010 R2 version of this to be stable and good to work with.
De-duplication is an interesting topic. We evaluated it for BE2010 (first release) and found it to be very buggy. It also wasn't saving us that much space (since incrementals work so well)... so it wasn't worth the additional hassle or cost. We notified our supplier at the time and they seemed eager to work with us to resolve the issues, but we dropped it because the benefits weren't there.
I have a similar environment and use Veeam Backup to do both backups and replicas. Veeam uses the changed block tracking function of vSphere, which shrinks your backup window. You can set backup expirations to keep 3 months of backups on-line and available. Total amount of backup storage will depend on the amount of changed blocks each day, and whether you are going to keep daily incrementals or have a weekly or monthly backup rotation.
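For a rough feel of the storage numbers, here's a back-of-the-envelope sketch in Python. The 2 TB and the roughly 90 days come from the question; the 2% daily change rate is purely my assumption (measure your own), and the function name and parameters are made up for illustration. It ignores compression and de-duplication, so treat the result as an upper-ish bound rather than a quote.

```python
def backup_storage_tb(full_tb, daily_change_rate, retention_days, full_every_days=None):
    """Estimate raw backup storage in TB, before compression or de-duplication.

    full_every_days=None models one full backup followed by daily incrementals;
    a number models a periodic full (e.g. 7 for weekly) with incrementals between.
    """
    if full_every_days is None:
        fulls = 1
    else:
        fulls = -(-retention_days // full_every_days)  # ceiling division
    incrementals = retention_days - fulls
    return fulls * full_tb + incrementals * full_tb * daily_change_rate


FULL_TB = 2.0        # from the question
RETENTION_DAYS = 90  # "about 3 months"
CHANGE_RATE = 0.02   # ASSUMPTION: 2% of the data changes per day

incremental_only = backup_storage_tb(FULL_TB, CHANGE_RATE, RETENTION_DAYS)
weekly_fulls = backup_storage_tb(FULL_TB, CHANGE_RATE, RETENTION_DAYS, full_every_days=7)

print(f"one full + daily incrementals: {incremental_only:.2f} TB")
print(f"weekly fulls + incrementals:   {weekly_fulls:.2f} TB")
```

Under those assumptions, a single full plus daily incrementals comes to roughly 5.6 TB, while weekly fulls push it to roughly 29 TB, so the rotation scheme matters a lot more than the change rate.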
Veeam is licensed per host, not per VM, so it's pretty cost-effective if you have a high consolidation ratio.
The backup solution will depend on RPO and RTO goals for the organisation, available bandwidth, hardware and backup windows. One large VMware environment I know relies on a scripted VMware backup solution for OS recovery (full OS disk image snapshotted every week) plus in-OS agents for backing up data. Having dedicated boot disks in the machines helps with this approach.
I wouldn't recommend backing up the ESX servers themselves; it's much easier to prepare a scripted install CD and recover the server by non-interactive re-install. The interesting part is the virtual machines.
Because you have such a wildly heterogeneous environment, I see two approaches:
1) Continue doing whatever you do, treating the machines as if they were separate physical boxes. This may be an administrative pain in the nether regions because of n separate backup and restore procedures, but if it has worked so far, it works, right?
2) Standardise on one product that can handle all the environments. I'm hearing a lot of good things about TSM, but I may not be fully objective here, because my employer sells and supports TSM.