Are Time Machine backups incremental? And is Time Machine any better on Snow Leopard?

I would like to understand how Time Machine backups work. Has this been improved at all in Snow Leopard?


Solution 1:

Yes, Time Machine is incremental. OS X uses an event-driven mechanism, FSEvents, to track which files change (so there is no need to scan the whole disk every hour). Each backup then copies only the files that changed; files that did not change are represented by modified hard links, called multi-links, that point to the copy already on the backup disk. Time Machine keeps hourly backups for the past 24 hours, daily backups for the past month, and weekly backups for everything older than a month.
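To make the hourly/daily/weekly schedule concrete, here is a minimal Python sketch of that kind of thinning. It is only an illustration, not Apple's code: the function name `thin_snapshots`, the 30-day definition of "a month", and the choice to keep the newest backup in each day or week are all my assumptions.

```python
from datetime import datetime, timedelta

def thin_snapshots(snapshots, now):
    """Return the snapshot times an hourly/daily/weekly schedule would keep.

    Assumptions for illustration only: "a month" is 30 days, and the
    newest backup in each day or ISO week is the one that survives.
    """
    keep = []
    seen_days = set()
    seen_weeks = set()
    for snap in sorted(snapshots, reverse=True):   # newest first
        age = now - snap
        if age <= timedelta(hours=24):
            keep.append(snap)                      # keep every hourly backup
        elif age <= timedelta(days=30):
            day = snap.date()
            if day not in seen_days:               # one backup per day
                seen_days.add(day)
                keep.append(snap)
        else:
            week = tuple(snap.isocalendar()[:2])   # (ISO year, ISO week)
            if week not in seen_weeks:             # one backup per week
                seen_weeks.add(week)
                keep.append(snap)
    return sorted(keep)
```

Feeding it two months of hourly timestamps returns every backup from the last day, then one per day, then one per week.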

To solve both problems, Time Machine does something new and different that actually required Apple to make changes to the underlying Mac file system, HFS+. The new change is referred to multi-links, which are similar to "hard links" common to Unix users and potentially available when using NTFS on Windows. Hard links differ from "soft links" (also known as symbolic links), which simply act as placeholders pointing to another file. The Mac OS has long used aliases as a way to create a soft link stand-in for another file or directory. Windows calls soft links "shortcuts." {source}
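The hard-link idea can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how Time Machine is implemented: it detects changes by comparing file contents rather than using FSEvents, it links individual files rather than whole directories, and the name `make_snapshot` is hypothetical. It is closer in spirit to what `rsync --link-dest` does.

```python
import filecmp
import os
import shutil

def make_snapshot(source, previous, target):
    """Build a full-looking snapshot of source in target.

    Files identical to the copy in the previous snapshot are hard-linked
    (no extra data stored); new or changed files are copied.
    Pass previous=None for the first backup.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            new = os.path.join(dest_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)        # unchanged: share the existing data
            else:
                shutil.copy2(src, new)   # new or changed: store a fresh copy
```

Every snapshot directory looks like a complete copy of the source, yet an unchanged file occupies disk space only once, and deleting an old snapshot does not break newer ones because the data stays until its last link is removed.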

The real magic of Time Machine, however, is the simplicity of its UI: you can step back to whichever backup date you want and use Spotlight to search back in time for your files. That is what makes TM so useful to most users.

In Snow Leopard, the time needed for the initial backup to a Time Capsule (and, I assume, other network-attached drives) has been substantially improved, but I think the underlying technology is unchanged.

The next technological innovation for Time Machine would be within-file deltas. Currently it is a file-based, not block-based, technology, so it is inefficient with large files such as Entourage databases: any change, however small, means the whole file is copied again. ZFS, when it finally comes to OS X client, will be the best tool to improve Time Machine functionality...
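To show why within-file deltas matter, here is a small Python sketch that compares two versions of a file block by block. It is purely illustrative and is neither Time Machine's nor ZFS's actual mechanism; the 4096-byte block size and the function names are assumptions of mine.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration

def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of a byte string."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return the indices of blocks in new that differ from old."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, h in enumerate(new_h)
            if i >= len(old_h) or old_h[i] != h]

# A 1,000,000-byte "database" in which a single byte changes.
old = bytes(1_000_000)
new = bytearray(old)
new[500_000] = 1
print(changed_blocks(old, bytes(new)))  # [122]
```

A file-based backup has to store the full 1,000,000 bytes again for that one-byte edit, while a block-based scheme would only need to store the single 4096-byte block that changed.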

Update:

John Siracusa's as-always fantastic Snow Leopard review has this golden nugget:

Time Machine itself was given support for overlapping i/o. Spotlight indexing, which happens on Time Machine volumes as well, was identified as another time-consuming task involved in backups, so its performance was improved. The networking code was enhanced to take advantage of hardware-accelerated checksums where possible, and the software checksum code was hand-tuned for maximum performance. The performance of HFS+ journaling, which accompanies each file system metadata update, was also improved. For Time Machine backups that write to disk images rather than native HFS+ file systems, Apple added support for concurrent access to disk images. The amount of network traffic produced by AFP during backups has also been reduced.

All of this adds up to a respectable 55% overall improvement in the speed of an initial Time Machine backup. And, of course, the performance improvements to the individual subsystems benefit all applications that use them, not just Time Machine. {source}

And on the ZFS magic to come that I suggested above:

That's a shame because Time Machine veritably cries out for some ZFS magic. What's more, Apple seems to agree, as evidenced by a post from an Apple employee to a ZFS mailing list last year. When asked about a ZFS-savvy implementation of Time Machine, the reply was encouraging: "This one is important and likely will come sometime, but not for SL." ("SL" is short for Snow Leopard.) {source}