Time Machine size explodes when copied to new drive
I'm trying to replace a 640GB USB2 drive, which I use as my Time Machine backup device, with a 1TB FireWire 400 drive without losing the current backup history. Given that the 640GB drive is nearly full and that this combination is limited to 400Mbps, I realized that this process could take quite some time and might get interrupted in the middle. As a result, I decided to try doing it with rsync instead of Finder (as Apple suggests). After some false starts and some searching online I settled on the following rsync command:
rsync -aHXSvPh --hfs-compression --protect-decmpfs /Volumes/Macintosh\ BK/Backups.backupdb /Volumes/Untitled
However, this command is still causing significant bloat on the destination drive (to the point where I don't expect the contents of the old drive to fit on the new one, despite the new drive being about 1.5 times larger). Are there any rsync options I'm missing that would eliminate this bloat? (I'm using rsync v3.1.2, protocol version 31.)
It has also occurred to me that perhaps I'm using the wrong tool for the job. Would a block copy tool like dd be more appropriate? If so, how would I set that up so as to make the process resumable in the event of an interruption (such as one caused by a full system crash, which happened to me twice while running the rsync command)? I've never used dd before and so am unfamiliar with its abilities (but I do have access to both the version that comes packaged with Mac OS X and the GNU version, 8.25).
dd doesn't adjust the various volume data structures based on the target volume size, so it's only appropriate if the source and destination volumes are exactly the same size. Generally, I recommend asr instead, but it doesn't have a good way to continue after a crash.
So, one possibility would be to shrink the target volume to match the source, use dd to copy the raw volume, then expand the target back to the full 1TB. I haven't tested this, but I think this is the process you'd need:
- diskutil list will list the volumes' device identifiers (e.g. disk2s3 is the third slice (partition) of physical disk #2).
- diskutil info <sourcevolumeid> will list the source volume size in bytes (along with lots of other info).
- diskutil resizeVolume <targetvolumeid> <sourcevolumesize>B will shrink the target volume to match (the "B" means "bytes" -- see the "SIZES" section of the diskutil man page).
- diskutil unmount <sourcevolumeid> and diskutil unmount <targetvolumeid> -- don't use dd on mounted volumes!
- sudo dd if=/dev/r<sourcevolumeid> of=/dev/r<targetvolumeid> to do the copy. Note the "r" prefix on the device names -- this bypasses the OS disk buffers, and in my experience makes dd run much faster. Be very careful to get the volume IDs right, or you may copy a blank volume over your backup!
- Finally, use either diskutil resizeVolume or Disk Utility to expand the target volume out to the full disk's size.
Oh, and a warning: this process assumes neither the source nor the destination is being managed by Core Storage. If they are (e.g. if they're encrypted), things get a bit more complicated.
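To make the order of operations easier to check before touching any disk, the steps above can be sketched as a dry run that only prints the commands. This is a rough sketch, not a tested procedure: print_plan is a hypothetical helper name, and the device identifiers and byte count are made-up placeholders that would need to be replaced with the values reported by diskutil list and diskutil info.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for each step instead of running them.
# SOURCE_ID, TARGET_ID and SOURCE_BYTES are hypothetical placeholder values;
# replace them with what `diskutil list` and `diskutil info` report.
SOURCE_ID="disk2s2"
TARGET_ID="disk3s2"
SOURCE_BYTES="639791038464"

# print_plan SOURCE TARGET BYTES
# Echoes the commands in the order they would be run. Nothing is executed.
print_plan() {
    src=$1
    tgt=$2
    bytes=$3
    # Shrink the target volume to the source volume's size in bytes.
    printf '%s\n' "diskutil resizeVolume $tgt ${bytes}B"
    # Unmount both volumes before any raw copy.
    printf '%s\n' "diskutil unmount $src"
    printf '%s\n' "diskutil unmount $tgt"
    # Raw copy using the unbuffered "r" device nodes.
    printf '%s\n' "sudo dd if=/dev/r$src of=/dev/r$tgt"
    # The final expansion is left as a reminder rather than a command.
    printf '%s\n' "# then expand $tgt with diskutil resizeVolume or Disk Utility"
}

print_plan "$SOURCE_ID" "$TARGET_ID" "$SOURCE_BYTES"
```

Reading the printed dd line against the diskutil list output before running anything is a cheap way to catch swapped source and target identifiers.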
The short answer is that Time Machine makes heavy use of hardlinks, which is where one underlying file – one stream of bytes – appears as multiple filenames in multiple directories on the filesystem. If you do a copy that isn't savvy about hardlinks, then that underlying file will get copied multiple times (once for each time it's hardlinked).
So let's say you have a 1GB disk image file that's been there from the beginning, so it shows up as a hardlink in all of your periodic Time Machine backups, and you have 100 Time Machine backups. If you copy that without being smart about hardlinks, you'll end up with 100 copies of that 1GB file, for 100GB total, instead of just 1GB with 100 hardlinks pointing at it.
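To see the effect on a small scale, here is a minimal sketch that creates one file with two names and then makes a hardlink-unaware copy of it. The directory and file names are invented for the demonstration and have nothing to do with the real Time Machine layout.

```shell
#!/bin/sh
# Show that a hard link is a second name for the same file, while a plain
# copy is a separate file that takes its own space.
set -eu

demo_dir=$(mktemp -d)    # scratch directory for the demonstration
mkdir "$demo_dir/backup1" "$demo_dir/backup2" "$demo_dir/naive_copy"

# One real file in the first "backup"...
printf 'pretend this is a 1GB disk image\n' > "$demo_dir/backup1/image.dmg"
# ...and a hard link to it in the second "backup" (no data is duplicated).
ln "$demo_dir/backup1/image.dmg" "$demo_dir/backup2/image.dmg"
# A copy that ignores hard links creates an independent file.
cp "$demo_dir/backup1/image.dmg" "$demo_dir/naive_copy/image.dmg"

# The second column of `ls -l` is the number of names the file has.
link_count=$(ls -l "$demo_dir/backup1/image.dmg" | awk '{print $2}')

# The first column of `ls -i` is the inode number: same inode, same file.
inode_a=$(ls -i "$demo_dir/backup1/image.dmg" | awk '{print $1}')
inode_b=$(ls -i "$demo_dir/backup2/image.dmg" | awk '{print $1}')
inode_c=$(ls -i "$demo_dir/naive_copy/image.dmg" | awk '{print $1}')

echo "link count of original: $link_count"
echo "backup1 inode: $inode_a  backup2 inode: $inode_b  naive copy inode: $inode_c"

rm -rf "$demo_dir"
```

The two backup paths should report the same inode number, while the naive copy gets its own. Scaled up to the example above, that difference is what turns 1GB into 100GB.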
I'm sure Apple knows that their Finder-copy advice preserves hardlinks properly. I would expect that Disk Utility's "Restore" functionality does as well. Don't let the "Restore" name fool you; it's not just for restoring from backups. It's Disk Utility's way to do a block copy from one volume (or disk image) to another.
Updated to add:
Hmm, rsync -H should have preserved hardlinks. Are you sure your destination volume is HFS+, or another filesystem that OS X trusts with hardlinks?
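One rough way to check whether a copy kept its hardlinks is to count the files that have more than one name on each side and compare the numbers. The sketch below assumes a POSIX find; count_multilinked is a hypothetical helper name, not a standard tool, and it only sees links between regular files, so it would miss any links made at the directory level.

```shell
#!/bin/sh
# count_multilinked DIR
# Prints how many regular files under DIR have more than one hard link.
# (Hypothetical helper for illustration; it does not look at directories.)
count_multilinked() {
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}

# Example use, with the paths from the question:
#   count_multilinked "/Volumes/Macintosh BK/Backups.backupdb"
#   count_multilinked "/Volumes/Untitled/Backups.backupdb"
```

If the source reports many multiply-linked files and the destination reports close to none, the links were probably expanded into separate copies during the transfer.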