The Problem

We would like to back up our critical files from several network shares to a removable hard drive. We want to automate the backup so we don't have to remember to run it, and it needs to finish overnight. We also want to preserve multiple versions of each file so we can back out of our users' mistakes more easily.

Background Information

I work in a large Windows-based enterprise with a centralized IT section that is responsible for all backups. Their backups are geared towards disaster recovery rather than user error, and any non-disaster recovery requires upper-level management approval. Several times our backups have failed and we weren't notified. I do not have administrative rights to the server or to my desktop. We are trying to back up some 198,000 files spanning about 240 gigabytes. These files rarely change. Our backup drive is one terabyte.

My Proposed Solution

What I would like to do is write a batch file that uses Robocopy with the /mir option, along with Mercurial SCM to store all versions of the files. I would do an hg add followed by an hg commit before each execution of Robocopy to save the current state, and then make a mirrored copy of the file structures. The problem is that /mir deletes anything in the destination that is not present in the source, and Mercurial stores its repository in a .hg folder in the destination folder.
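
Roughly, the batch file I have in mind looks like this (the share and drive paths here are made up):

@echo off
rem Hypothetical paths -- substitute the real share and drive letter.
set SRC=\\fileserver\critical
set DST=E:\backup

rem Save the current state of the destination before mirroring over it.
cd /d %DST%
hg add
hg commit -m "Nightly snapshot %DATE% %TIME%"

rem Mirror the share -- but /mir also purges the .hg folder,
rem because it does not exist in the source.
robocopy %SRC% %DST% /mir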

Does anybody know how I could either convince Mercurial to store the .hg folder elsewhere, or convince Robocopy not to delete it from the destination?

I'm trying to avoid writing a custom program to do the copying.


To perform a mirror while ignoring certain file extensions, try this:

robocopy c:\temp\source c:\temp\dest /e /purge /xf *.hg

I tested it a bit here and it appears to leave the .hg files alone on the destination.

I also recommend the /z flag (restartable mode, for large files), and I have had to resort to the /fft flag in the past when dealing with servers with different timestamp granularity: http://www.readynas.com/forum/viewtopic.php?t=3573

Edit: my example above assumed files named *.hg. To do the same while ignoring directories named .hg (the *.hg wildcard matches them), use this:

robocopy c:\temp\source c:\temp\dest /e /z /purge /xd *.hg
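
Combined with your Mercurial step, the whole nightly job could be something like this (untested end to end, and the paths are placeholders):

@echo off
rem Placeholder paths -- adjust for your environment.
set SRC=\\fileserver\critical
set DST=E:\backup

cd /d %DST%
rem addremove also records files that the previous mirror deleted
hg addremove
hg commit -m "Snapshot before mirror %DATE% %TIME%"

rem /e /purge mirrors like /mir; /xd .hg keeps the repository
rem directory (literally named .hg) out of the copy and the purge;
rem /z restarts interrupted copies; /fft tolerates coarse timestamps
robocopy %SRC% %DST% /e /z /purge /xd .hg /fft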

First, I would like to point out that you're probably approaching the problem from the wrong end: in theory, you should write up a business case for a better backup solution, have your management submit it to central IT, and then let them sort it out themselves.

Now, with that out of the way, you're going to face a number of issues with that proposal. The first is that Mercurial isn't built for what you want to do, and neither is Robocopy. You can work around your problem by declaring the backup location itself a repository, but that's not really efficient.

I would instead investigate two different tracks.

The first is to use Volume Shadow Copy to create a snapshot history that you store on a different disk array on the server (you seem to be running Windows on your file server, and I'm assuming you have a recent version, 2008 or higher). That will give you an easy way to recover user files (actually, users will be able to recover their files themselves) while keeping the backup size close to the data size. The downside is that you will need admin access on the server to set this up, so there is no way around involving your IT administrators (which I personally think is a good thing, but that's only me).
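
For reference, the setup is roughly this, run from an elevated prompt on the server (the drive letters are placeholders: D: holding the data, E: a separate disk for snapshot storage):

rem Reserve snapshot storage on a separate disk, then take a snapshot.
vssadmin add shadowstorage /for=D: /on=E: /maxsize=50GB
vssadmin create shadow /for=D:

Regular snapshots are normally scheduled from the Shadow Copies tab in the volume's properties, or by running the create shadow command from a scheduled task.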

The second option is to bite the bullet and get "real" backup software. If your data is important enough, you can probably make a business case for that investment. If not, you can always use Cobian Backup to create incremental copies of the files as a user-level task or, if you really want to use Mercurial, use Cobian Backup instead of Robocopy to create the repository copy.
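
If scheduling is the sticking point, creating a scheduled task under your own account usually doesn't need admin rights; something like this (task name and script path are made up):

schtasks /create /sc daily /st 01:00 /tn NightlyBackup /tr "C:\scripts\backup.cmd"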


Since you are already talking about Windows and Robocopy, I have another solution that needs neither Robocopy nor Mercurial. The off-site part gets a little tricky, but you would at least have near-real-time backups of all of your critical files.

You have two options to start:

  1. Enroll in the Microsoft Partner Network and shell out the $300 or so for an Action Pack subscription
  2. Buy a Microsoft DPM enabled server from Dell or HP

Either way, you end up with a full DPM box or at least the software, which you can then install on a spare machine. DPM is great at taking network share backups, with backups as often as every minute. It also uses shadow copies, so users can have files open while they are being backed up. Lastly, it allows for end-user recovery, which gets you out of the mix to some extent, if you want.

If you need off-site capabilities, you could also purchase a tape drive, or even a tape drive built into your fancy new server. If you would rather use removable disk storage, take a look at a product called Firestreamer, which lets DPM treat removable disks as if they were tape.

Either way, long term this is a much better solution and much more reliable than a scheduled job running Robocopy against some storage location, trust me! Overall, your costs are anywhere from $500 to $5,000 depending on what you want to do, but it's worth the peace of mind and stability.