Do I still need a backup if I have a redundant storage system with rollback capabilities?

What you describe is essentially a geographically distributed RAID, and RAID was never a backup.

Online sync usually means everything you do on the primary storage gets immediately replicated to the backup system, including operations like the deletion of (all) snapshots and/or volumes by an attacker or simply an admin error.
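To make the point concrete, here is a minimal toy sketch of why synchronous replication does not protect against deletions. The `ReplicatedVolume` class, the file name, and the `offline_backup` variable are all hypothetical and only model the behavior described above; they do not correspond to any real storage product.

```python
import copy


class ReplicatedVolume:
    """Toy model: every operation on the primary is mirrored to the replica at once."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, name, data):
        self.primary[name] = data
        self.replica[name] = data  # sync replication copies writes...

    def delete(self, name):
        self.primary.pop(name, None)
        self.replica.pop(name, None)  # ...and deletions just as faithfully


volume = ReplicatedVolume()
volume.write("payroll.db", "good data")

# An independent backup is a point-in-time copy that replication cannot reach.
offline_backup = copy.deepcopy(volume.primary)

# An attacker or an admin error removes the file on the primary.
volume.delete("payroll.db")

print("on primary:", "payroll.db" in volume.primary)
print("on replica:", "payroll.db" in volume.replica)
print("in offline backup:", "payroll.db" in offline_backup)
```

Only the copy that sits outside the replication path still holds the file after the delete.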


The 30-day rollback is a great capability, but what if "critically-important-file-xyz" became corrupt or damaged and this was not detected until 31+ days later? This situation is the difference between backup and archival schedules, but your description does not mention the latter. Archives are usually kept on very low-cost tape. Also, there is no information on whether the business has regulatory or other requirements to retain data for longer than 30 days, which is frequently the case.

If none of this applies to your situation, then you should be good.
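The 30-day gap described above can be sketched as a simple date check. This is only an illustration: the function name `recoverable_from_rollback` is made up, and it assumes daily snapshots and that a clean snapshot from the day of the corruption still counts.

```python
from datetime import date, timedelta

ROLLBACK_WINDOW = timedelta(days=30)  # the 30-day rollback from the question


def recoverable_from_rollback(corrupted_on, detected_on, window=ROLLBACK_WINDOW):
    """Return True if a pre-corruption snapshot is still retained at detection time.

    Assumption: snapshots older than `window` have been discarded, and a
    snapshot from the corruption date itself is still clean.
    """
    return (detected_on - corrupted_on) <= window


corrupted = date(2024, 1, 1)

# Detected after 10 days: the rollback still reaches a clean copy.
print(recoverable_from_rollback(corrupted, date(2024, 1, 11)))

# Detected after 31 days: every retained snapshot already contains the damage.
print(recoverable_from_rollback(corrupted, date(2024, 2, 1)))
```

In the second case, only an archive with a longer retention period would still hold an undamaged copy.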


Having the data on geographically separated machines is good.

What happens when you have multiple failures involving both or all of your sites? A fire at one and theft of the servers at the other? Or there is a problem with the line between them, then the primary location's server goes out, and the HD controller goes ape and writes junk? Or some insider performs malicious acts on both? Or the FBI confiscates your servers at both locations because of suspected wrongdoing (you would never, but maybe you are co-hosted in a datacenter with schmucks)? Or... I am reminded of several high-profile "cloud" outages where everything was redundant and analyzed to the nth degree, but things still went wrong. I'll grant you these are all unlikely, but you've acknowledged that unlikely things can happen.

So, it comes down to how important/valuable is that data? What will the organization do if it ends up gone?


The question here seems to be about just how disconnected and geographically distinct a replicated copy of your data needs to be before it's a backup and not high availability/redundancy infrastructure. My gut is that you're close, but still need a backup.

To bring together (cherry-pick) some thoughts from the other answers and comments: you can go really far down the path of "well, X technology doesn't cover Y disaster scenario, so it's not a backup," and at some point you need to decide what's reasonable for you, which seems to be why you're asking. My feeling on this, and I think the feeling of many of the commenters, is that your backup needs to exist on a separate technological infrastructure from your in-use data, so that failures, accidents, and malicious actions either can't propagate or have a much higher hurdle to cross. An example given in the comments is someone deleting the volumes, which is a valid, not pie-in-the-sky scenario in my opinion.

Additionally, here is a real-world example from my work. The university I work for (thankfully, I don't manage this infrastructure) has some serious high-availability virtualization infrastructure that supports a lot of the campus facilities. It's at multiple sites, but it all runs on one vendor's platform. An obscure bug cropped up one day and caused a failure cascade: it first took down a single server; then, when the load shifted, it took out the rest of that site; and when the load shifted again, it took out the other sites hosting that infrastructure. (I believe they've resolved this issue since then.) The data wasn't lost in this case, but it's easy to imagine a scenario involving your data where it was.

You want your backup to be immune to all of that, and even accessible while that infrastructure is down. If the data is unavailable for a week while your RAID rebuilds, being able to recover business-critical documents from backup is nice (though not required). If your RAID disappears and that loss then replicates to your other site, you'll really want that backup to be from a separate vendor or on some isolated media like tape.

All this said, I'll repeat that your backup should be on a separate infrastructure from your data. There are many levels of isolation here, but I think anything connected through direct replication is too close to be a backup. You'll want something in addition.