Checksumming to verify rsync transfers

So rsync runs some checksums in the course of deciding what to transfer (i.e. what blocks within a file). But is there any reason to trust the file you end up with on the receive side any more than you would for a normal network transfer? Should I run checksums after rsync finishes to verify the data? Is rerunning rsync with the pre-check (i.e. --checksum option) turned on an accepted way to accomplish this?


Solution 1:

In general the rsync checksum mechanism is fairly reliable. The tradeoff here is the usual one: you can do more verification, but it will take more time. If you are really worried that a set of files must be exactly the same on two machines, you should run a separate verification. For example, you can run md5sum over the same file list on both sides and compare the results. As long as the files don't change in the meantime (as log files would, for example), that will give you very high confidence that the files are identical on both sides.
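A minimal sketch of that md5sum comparison, assuming GNU coreutils and that both trees are reachable as local paths. The function names manifest and compare_trees are made up for illustration and are not part of rsync or md5sum:

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print a sorted md5sum manifest of every regular file under directory $1.
# Paths are recorded relative to $1 so manifests from two trees line up.
manifest() {
    ( cd "$1" && find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2 )
}

# Compare two trees by manifest. Prints the differing entries (diff output)
# and returns 0 if identical, 1 if they differ, 2 on a usage problem.
compare_trees() {
    [ -d "$1" ] && [ -d "$2" ] || return 2
    src_list=$(mktemp) || return 2
    dst_list=$(mktemp) || { rm -f "$src_list"; return 2; }
    manifest "$1" > "$src_list"
    manifest "$2" > "$dst_list"
    diff "$src_list" "$dst_list"
    rc=$?
    rm -f "$src_list" "$dst_list"
    return "$rc"
}
```

If the destination is on another machine, one option is to run the same find/md5sum/sort pipeline there over ssh, copy the resulting manifest back, and diff the two files locally.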

Solution 2:

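Rerun rsync as a dry run (-n) with --checksum and save the list of files it reports:

    rsync -Pahn --checksum /path/to/source /path/to/destination | sed '/\/$/d' | tee migration.txt

Because of -n, nothing is transferred. With --checksum, rsync compares files by checksum instead of by size and modification time, so a file that shows up in the list is one rsync considers different. sed deletes the lines that end in / (directory entries) from the output, and tee writes the result to the screen and to migration.txt at the same time.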

Keep in mind that this might not be a suitable method if you have very large files: --checksum reads every file in full on both sides, so the verification will take a long time.
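For use in a script, the same dry-run idea can be wrapped so that the exit status says whether anything differs. This is only a sketch under a few assumptions: the function name verify_sync is made up, it uses -r rather than -a so that timestamp and permission differences are ignored, and it relies on rsync's --out-format='%n' to print just the name of each item that would be updated:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Returns 0 if no file under $1 differs from its copy under $2,
# 1 if some files differ or are missing (their names are printed),
# 2 if rsync itself failed.
# Note: files that exist only in the destination are not reported.
verify_sync() {
    # -r recurse, -c compare by checksum, -n dry run (nothing is copied)
    out=$(rsync -rcn --out-format='%n' "$1/" "$2/") || return 2
    # Drop directory entries (trailing /) and blank lines.
    diffs=$(printf '%s\n' "$out" | sed -e '/\/$/d' -e '/^$/d')
    [ -z "$diffs" ] && return 0
    printf '%s\n' "$diffs"
    return 1
}
```

A caller could then write something like verify_sync /path/to/source /path/to/destination || echo "verification failed". The trailing slashes added inside the function make rsync compare the contents of the two directories directly, which differs slightly from the command above, where the source directory itself is placed inside the destination.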
