Check integrity of .tar.gz backup

If you back up your system and create a .tar.gz with thousands of files and directories, how do you easily check its integrity (i.e. without having to decompress it)?

It's good to keep backups, but it's even better to actually make sure your backups will work when the time comes... and I'd like to know an easy way to do this.


In addition to the test suggested by Ignacio Vazquez-Abrams you might also want to create separate checksum files for your tarballs. While it doesn't catch tarballs which got corrupted during their initial creation, it will allow you to later verify that the files are still in their original state, and that they haven't gotten corrupted by disk or file system problems.

Assuming the backup backup_20110724.tar.gz has been created, you might also want to run a command along these lines:

sha256sum backup_20110724.tar.gz > backup_20110724.tar.gz.sha256

It will then create a file containing something like this:

6d5fc8993d88739247ab76c01ddf3164e585b3ed02a63362cd209d2ba463eda5  backup_20110724.tar.gz

You can then later verify the integrity with something along these lines:

sha256sum -c backup_20110724.tar.gz.sha256
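
If you do this regularly, the two commands above could be wrapped roughly like this. It's only a sketch, assuming GNU coreutils; the function names make_checksum and verify_checksum are made up for illustration.

```shell
# Hypothetical helper: write ARCHIVE.sha256 next to the archive.
make_checksum() {
    archive=$1
    # Work inside the archive's directory so the checksum file records a
    # bare filename and still verifies if the directory is moved.
    ( cd "$(dirname "$archive")" &&
      sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Hypothetical helper: exit status 0 if the archive still matches its checksum.
verify_checksum() {
    archive=$1
    # --quiet suppresses the per-file "OK" line; failures are still reported.
    ( cd "$(dirname "$archive")" &&
      sha256sum -c --quiet "$(basename "$archive").sha256" )
}
```

Since verify_checksum returns non-zero on a mismatch, it can gate a cron job or a restore script.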

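The tar format has no way of checking the integrity of file contents other than reading the archive, since it stores checksums only for its headers, not for the file data. The gzip layer does store a CRC-32 of the uncompressed stream, but it is only verified when the stream is decompressed. Fortunately, listing the contents is usually enough to check, because it runs the whole archive through gzip: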

tar ztvf foo.tar.gz > /dev/null
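
For unattended use, the listing test can be turned into a pass/fail check along these lines. This is a sketch, not a definitive script; check_tarball is a hypothetical name, and running gzip -t first is my own addition to the test above.

```shell
# Hypothetical helper: exit status 0 if the archive reads cleanly end to end.
check_tarball() {
    archive=$1
    # gzip -t decompresses to nowhere and verifies the CRC-32 and length
    # stored in the gzip trailer.
    gzip -t "$archive" || return 1
    # Listing makes tar walk every header in the (decompressed) archive.
    tar -tzf "$archive" > /dev/null || return 1
}
```

Something like `check_tarball foo.tar.gz || echo "foo.tar.gz is damaged"` would then report a truncated or corrupted file.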

If you do want per-file checksums stored inside the archive, then you'll need to use another format, such as zip or 7-zip.


Generate the SHA checksum (whichever variant you prefer) at the same time you create the backup. It's a little safer, and usually faster too:

tar cJflS - /home | mtee 'cat > backup_20201207.tar.xz' 'sha256sum - > backup_20201207.sha256'

mtee is available from https://stromberg.dnsalias.org/~strombrg/mtee.html. Full disclosure: I wrote it.
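
If mtee isn't installed, a similar single-pass effect can probably be had with plain tee. The following is a sketch under assumptions: it swaps in tee for mtee, uses gzip rather than xz, and the function name backup_with_checksum is hypothetical.

```shell
# Hypothetical helper: archive SRC into OUT and write OUT.sha256 in one pass.
backup_with_checksum() {
    src=$1
    out=$2
    # tee writes the archive to disk while passing the same bytes on to
    # sha256sum, so the data is read only once.
    # sha256sum labels stdin as "-", so awk swaps in the real filename to
    # make the result usable with "sha256sum -c".
    tar -czf - "$src" | tee "$out" | sha256sum |
        awk -v name="$(basename "$out")" '{ print $1 "  " name }' > "$out.sha256"
}
```

Note that the pipeline's exit status is that of the last command, so a tar failure goes unnoticed unless you check for it separately (for example with bash's `set -o pipefail`).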