What is the most robust archive format?

It depends. The two most popular options are tarballs and zip files, but both fall short:

  • .tar tape archives are a very popular option for most Linux users. They preserve UNIX file permissions (important for a back-up) and hard links, and they are supported out of the box on every Linux distro I've tested, as well as by some Windows programs like 7-zip. However, tar has several limitations and drawbacks for the back-up use case, as explained by the Duplicity developers. It can be very slow: even to get a list of the filenames stored in the archive, the entire archive must be read (see the listing example just after this list). It also doesn't handle the detailed metadata that some newer filesystems record.
  • .zip zip files act as both an archive and a compression format. For speed, you can disable compression entirely. Zip files improve on tape archives by storing a table of contents (the central directory), which lets programs jump straight to the specific file they need to extract. They also store a checksum for the contents of each file, making corruption easy to detect. Zip files are extremely popular; unfortunately, they are not suitable for Linux back-ups because they do not store UNIX file permissions.
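
A quick way to see the listing difference in practice (backup.tar.gz and backup.zip are placeholder names of mine, not from the answer above):

tar tzf backup.tar.gz    # must decompress and stream the whole archive just to list filenames
unzip -l backup.zip      # reads the central directory at the end, so it returns almost instantly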

Here are two more options that are, sadly, also lacking:

  • .7z 7z compressed archives have some excellent features, such as encryption and support for very large files. Unfortunately, they do not store UNIX file permissions, so they are not suitable for Linux back-ups.
  • .ar classic UNIX archives are the predecessors of tar archives, and suffer from the same limitations.

In my opinion, there is no completely robust archive format for Linux back-ups, or at least none sufficiently well-known to warrant my trust.

One way to overcome the limitations of each of these formats is to combine them: for example, archive each file individually in its own tar archive, and then bundle all of those tarballs into one zip file. A rough sketch of this follows.
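
A minimal sketch of that tar-inside-zip idea, assuming a /tmp/tarballs staging directory (the paths and names here are placeholders of mine, not from the original suggestion):

mkdir -p /tmp/tarballs
find directory-to-backup -type f -print0 |
  while IFS= read -r -d '' f; do
      # one tarball per file, so tar keeps the UNIX permissions and ownership
      tar cf "/tmp/tarballs/$(echo "$f" | tr '/' '_').tar" "$f"
  done
zip -0 -r /path/to/save/backup.zip /tmp/tarballs    # -0 stores without compressing

The zip layer then provides the central-directory index and per-file checksums, while each inner tarball keeps the UNIX metadata.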

If you want a really robust back-up, you should probably look into these solutions instead:

  • Back up directly onto an external hard disk, with the same file system on both source and destination. This ensures that each file's permissions and metadata are stored exactly as intended (see the rsync sketch just after this list). (As an aside, the owners and group owners of files are stored using their userid and groupid numbers, not their names.)

  • Use full-disk imaging and cloning software, like CloneZilla. You can't retrieve a single file from one of these back-ups, but you can be absolutely sure that you have saved everything you possibly can. (A bare-bones dd alternative is sketched below.)
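
For the external-disk route, a minimal sketch using rsync (my suggestion, not named in the answer above; /mnt/backup is a placeholder mount point):

sudo rsync -aHAX --delete /home/ /mnt/backup/home/

Here -a preserves permissions, ownership, and timestamps; -H keeps hard links; -A and -X carry ACLs and extended attributes; --delete makes the destination an exact mirror of the source.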

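For full-disk imaging, a bare-bones sketch with dd (a stand-in of mine for dedicated tools like CloneZilla; /dev/sdX and the image path are placeholders):

sudo dd if=/dev/sdX of=/mnt/backup/disk.img bs=4M status=progress    # raw image of the whole disk
sudo dd if=/mnt/backup/disk.img of=/dev/sdX bs=4M status=progress    # restore it later
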
And remember, always remember: you can only be confident of your back-ups if you have attempted to restore them. If the worst came to the worst and your source hard-drive was completely destroyed, could you restore everything you need to a new hard-drive? Would it work as you expect? Try restoring your back-up to a new hard-disk and running from that disk for a couple of days. If you notice anything missing, you know your back-up wasn't thorough enough.
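
One quick sanity check along those lines (a hedged example; /mnt/restore-test is a placeholder mount point of mine for the restored copy):

diff -rq /home /mnt/restore-test/home    # reports any files that differ or are missing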

Also think about where you keep your back-ups. You need at least some back-ups that are not in the same building as the source disks, to protect yourself from theft or fire. Some options for this are the cloud, or a friend's house.


A tarball (a .tar file) would be the way to go. Use gzip compression for a lower compression ratio but good speed; bzip2 is much slower but provides a better compression ratio. For binary data, there is not a big difference, though.

The command for compressing a directory using gzip compression:

tar czf /path/to/save/backup.tar.gz directory-to-backup

To extract a gzip-compressed tarball while preserving file permissions:

tar xzpf /path/to/save/backup.tar.gz

Replace z with j for bzip2 compression, and add v before f (e.g. czvf and xzpvf) to print the filenames as they're archived / extracted.
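
Putting that together, the bzip2 equivalents of the two commands above (same placeholder paths, with v added for verbose output):

tar cjvf /path/to/save/backup.tar.bz2 directory-to-backup
tar xjpvf /path/to/save/backup.tar.bz2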