Using tar and rsync for high availability

I have Ubuntu cloud servers that I can only reach over ssh. I'm using 'tar' to clone a server so that I have a standby for high availability, following the tutorial at [link text][1]. I tested this on a freshly installed server of the same version. When I extracted the archive (tar -xvpzf ~/clone.tgz -C /) on the new (destination) server, the output ended with lines like the ones below (I don't know whether they are errors).

tar: var/run: time stamp 2010-11-09 17:09:11 is 7335.159880406 s in the future
tar: var/spool/postfix/usr/lib/zoneinfo: time stamp 2010-11-09 17:08:26 is 7290.159730037 s in the future
tar: var/lib: time stamp 2010-11-09 17:27:51 is 8455.159349527 s in the future
tar: usr/bin: time stamp 2010-11-09 17:28:02 is 8466.159254097 s in the future
tar: usr/share/sgml: time stamp 2010-11-09 17:27:47 is 8451.158909506 s in the future
tar: usr/share/man/man7: time stamp 2010-11-09 17:27:50 is 8454.158393583 s in the future
tar: usr/share/man/man1: time stamp 2010-11-09 17:28:02 is 8466.158166556 s in the future
tar: usr/share/man/man8: time stamp 2010-11-09 17:27:51 is 8455.158057701 s in the future
tar: usr/share/omf/time-admin: time stamp 2010-11-09 17:27:52 is 8456.157830449 s in the future
---------------------------------------------
---------------------------------------------
---------------------------------------------

I'm using the following command to create a tar file of the specified directories on the source system.

tar -cvzf ~/clone.tgz --exclude ~/clone.tgz --exclude /etc/hosts --exclude /etc/hostname --exclude /etc/udev/ --exclude /etc/network/interfaces --exclude /etc/resolv.conf  /etc /home /opt /tmp /usr /var /mnt
  • Are there any precautions I should take before using tar? (The tar archive is created only once; from then on I'll be using rsync.)
  • Should I include any more directories, such as bin or lib?
  • Should I exclude any directories? For example, I had a problem with the network device (eth0 failed to start), so I excluded "/etc/udev/" in the command above, and after that it seemed fine. Is there anything else I should exclude from /etc/ or from the other directories I've included?
  • How can I schedule rsync (incremental backup) over ssh to sync the directories listed in the tar command to a remote location (say /mnt/newdir), which I could tar and extract later in case of system failure? Rsync can be scheduled to run as the root user, but ssh will prompt for a password. FYI, sudo is completely disabled, and direct ssh login as root is disabled too.

If there is a better way to achieve this without harming the server, please suggest it.

[1]: http://ubuntuforums.org/showthread.php?t=525660


I would recommend using rsync instead. It lets you synchronize one live system directly to another without temporary files, and it can do incremental updates whenever you need to refresh the clone.

I would exclude only /proc, /sys, /dev, /tmp and /mnt. On the clone system you will need to make sure /etc/fstab and /boot/grub/grub.cfg are updated with the UUIDs of the clone system's partitions.
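As a rough sketch of what such a copy could look like, the script below assembles an rsync command line with the exclusions listed above and prints it rather than running it. The destination `root@clone.example.com` is a placeholder that does not come from the question, and the flag set (`-aHAX --numeric-ids`) is one common choice for whole-system copies, not the only valid one.

```shell
#!/bin/sh
# Sketch: assemble an rsync command that copies / to a clone over ssh.
# DEST is a hypothetical placeholder; replace it with your own user and host.
DEST="root@clone.example.com"

# -a archive mode, -H hard links, -A ACLs, -X extended attributes;
# --numeric-ids keeps uid/gid numbers instead of mapping them by name.
set -- -aHAX --numeric-ids

# Skip virtual filesystems and scratch/mount directories.
for path in /proc/ /sys/ /dev/ /tmp/ /mnt/; do
    set -- "$@" "--exclude=$path"
done

# Source is the local root; destination is the root of the clone.
set -- "$@" / "$DEST:/"

CLONE_CMD="rsync $*"
echo "$CLONE_CMD"

# To actually run the copy instead of printing it:
#   rsync "$@"
```

Adding `-n` (dry run) to the flags is a cheap way to preview what would be transferred before touching the clone.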

If you have a database like MySQL, you will need to be careful and stop the database before performing the copy.


First off, many of the IaaS cloud providers offer powerful snapshot capabilities that solve this quite easily.

On EC2, if you run an EBS-based system, you can just periodically snapshot it. If something awful happens to the source instance, you can roll back to the previous snapshot on a brand new instance. If you want to archive a snapshot, you can boot another instance with it attached and use something like tar+s3 without negatively impacting the production box.

There are a number of problems with the clone-and-sync approach which may not be apparent right now.

  1. You are locking yourself into a single technology. If you get this working on Ubuntu 10.10 and you want to go to 11.04, you have to upgrade the source system, then snapshot it again. Likewise, if you use EC2's EBS snapshots, you need a new solution if you move to Rackspace Cloud.
  2. You have no change history if you use rsync. If you modify something on system 1, then something breaks, you'll likely break your backup system too when you rsync.
  3. Rsync can be extremely high-impact on your production system.

What you really want is a config management system, and data high availability.

I'd recommend you choose a config management system, like Puppet (it's in Ubuntu's main repository!), Chef, or CFEngine. Start doing all of your configuration in the config management system, and then you can just boot a generic system and apply the config management to it. Add in 'etckeeper' and you have history.

For data high availability, rsync should work and be much more straightforward, as you can copy just the data you want. There's also DRBD, which amounts to a "network RAID1". Neither is a replacement for data backups, which should include historical snapshots (whether through block device snapshots or something like tar) rather than only syncing to a recovery host. (What if somebody deletes all the data and the deletion gets rsynced to the recovery box, wiping it there too?)
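To make the difference between a sync and a backup concrete, here is a minimal sketch of date-stamped tar archives, one way to keep the historical snapshots mentioned above. The function name `snapshot_dir` and the paths in the usage comment are made up for illustration; retention (deleting old archives) is deliberately left out.

```shell
#!/bin/sh
# snapshot_dir SRC BACKUP_DIR
# Write a date-stamped, gzip-compressed tar archive of SRC into BACKUP_DIR
# and print the archive path. Earlier archives are left untouched, so each
# run adds one more point in history.
snapshot_dir() {
    src=$1
    backup_dir=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$backup_dir/$(basename "$src")-$stamp.tgz"

    mkdir -p "$backup_dir" || return 1
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    printf '%s\n' "$archive"
}

# Usage (hypothetical paths):
#   snapshot_dir /var/www /mnt/backups
```

Because every run writes a new file, an accidental deletion on the source only affects archives made after it; the older ones can still be restored.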


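The messages most likely appear because the new server's clock is behind the old server's, so the extracted files carry time stamps that lie in the new server's future. They are warnings, not errors.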
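One way to check this is to compare the two clocks directly. The helper below is only a sketch: `skew_seconds` is a made-up name, and `user@old-server` in the usage comment stands for whatever ssh login you have on the source machine.

```shell
#!/bin/sh
# skew_seconds REMOTE_EPOCH
# Print how many seconds the given clock reading (seconds since the epoch)
# is ahead of the local clock. A negative value means it is behind.
skew_seconds() {
    echo $(( $1 - $(date +%s) ))
}

# Usage on the new server (user@old-server is a hypothetical login):
#   skew_seconds "$(ssh user@old-server date +%s)"
```

If the value is in the thousands, as in the tar output above, the clocks really do disagree, and synchronizing them (for example with NTP) should make the warnings go away on the next extraction.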

If you are cloning the package manager configuration and database (and you are), you should also clone /bin, /sbin and /lib, or the destination system will be left in an inconsistent state. Another approach would be to exclude /etc/dpkg.info, /etc/apt, /var/lib/apt and /var/lib/dpkg and reinstall all the packages on the target system.

The files in /var/lib/dpkg and /var/lib/apt contain information about what is installed on your system. If you don't exclude them, the package manager will believe that all the programs and dependencies of the parent system are installed on the target. But if you didn't copy /bin, /sbin, etc., they will not be. It's very likely that something will break on the next install or update.
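For the "reinstall the packages" route, a common pattern is to export the package selections on the source and replay them on the target. The sketch below wraps that in two functions; the function names and the file name are invented here, and the import step assumes you can get a root shell on the target by some means.

```shell
#!/bin/sh
# export_selections FILE
# Run on the source system: save the list of packages and their desired
# state (install, deinstall, ...) to FILE.
export_selections() {
    dpkg --get-selections > "$1"
}

# import_selections FILE
# Run as root on the target system: mark the same packages for installation
# and let apt-get install whatever is missing.
import_selections() {
    dpkg --set-selections < "$1" && apt-get -y dselect-upgrade
}

# Usage (the file name is arbitrary):
#   export_selections packages.list    # on the source
#   import_selections packages.list    # on the target, after copying the file over
```

This only reproduces which packages are installed; configuration under /etc and application data still have to be copied separately.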

To keep them synced with rsync I have always used certificate-based (public key) authentication, not passwords. It's quite easy to set up; I remember doing it the first time just by reading the man page. Here is a quick guide; if you want more information, I believe this deserves a new question.
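A minimal sketch of such a setup follows, under several assumptions that are not in the question: an unprivileged account `backup` on a standby host `standby.example.com`, a dedicated key at `~/.ssh/id_rsync`, and a nightly run at 02:00. The script only prints the crontab entry; the one-time key setup is shown as comments because it is interactive. Note that an unprivileged local user can only copy the files it is allowed to read.

```shell
#!/bin/sh
# All names below (key path, remote user, host) are hypothetical.
KEY="$HOME/.ssh/id_rsync"
REMOTE="backup@standby.example.com"

# One-time setup, run interactively on the source server:
#   ssh-keygen -t rsa -b 4096 -N '' -f "$KEY"   # key without passphrase, for unattended use
#   ssh-copy-id -i "$KEY.pub" "$REMOTE"         # asks for the password one last time

# sync_cmd DIR
# Print the rsync command that copies DIR to /mnt/newdir/DIR on the remote host.
sync_cmd() {
    printf 'rsync -az -e "ssh -i %s" %s/ %s:/mnt/newdir%s/\n' \
        "$KEY" "$1" "$REMOTE" "$1"
}

# cron_line DIR
# Print a crontab entry that runs the sync for DIR every night at 02:00.
cron_line() {
    printf '0 2 * * * %s\n' "$(sync_cmd "$1")"
}

# Show the entry for /home; paste it into the file opened by `crontab -e`.
cron_line /home
```

Since the key has no passphrase, it is worth keeping it readable only by its owner and using it for nothing but this job.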