Verifying Integrity of Data

Always run:

cd /filesystem; \
find . -type f -exec md5sum {} \; > /filesystem-md5.log

before copying a large amount of data, and then run

cd /filesystem-new; \
md5sum -c /filesystem-md5.log

against the copy afterwards. (Redirect only stdout into the log; stray error messages mixed into it would break md5sum -c.)
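
On a clean copy, md5sum -c prints an OK line for every file and exits zero; anything that changed in flight gets flagged. The file names below are made up for illustration:

./photos/img0001.jpg: OK
./photos/img0002.jpg: FAILED
md5sum: WARNING: 1 computed checksum did NOT match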

You'll be surprised how much random data corruption you experience in the real world.

When you find a corrupt file, run cmp -l badfile goodfile to understand the nature of the corruption.
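
cmp -l prints one line per differing byte: the 1-based offset in decimal, then the octal byte value from each file. Hypothetical output for a single corrupt byte just past the 1 MiB mark:

cmp -l badfile goodfile
1048577 101 105

Octal 101 and 105 differ in exactly one bit, which points at a flaky memory or bus error; long runs of differing bytes suggest a torn, stale, or misplaced block instead.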

This is why I beg for end-to-end integrity checking in all cases. Unfortunately, filesystem and OS vendors do not take this seriously.


You can check out AIDE. I guess there are other integrity tools out there, too.

It creates a database from the regular-expression rules it finds in the config file. Once this database has been initialized, it can be used to verify the integrity of the files. Several message-digest algorithms (md5, sha1, rmd160, tiger, haval, etc.) are available for checking file integrity, and more can be added with relative ease. All of the usual file attributes can also be checked for inconsistencies, and it can read databases created by older or newer versions. See the manual pages within the distribution for further information; work on a full manual has also begun.
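
As a rough sketch of the AIDE workflow (the paths, the MyRule group name, and the attribute set are illustrative assumptions; check your distribution's defaults):

# minimal illustrative /etc/aide.conf
# database read by --check; database written by --init
database=file:/var/lib/aide/aide.db
database_out=file:/var/lib/aide/aide.db.new

# MyRule is a made-up group: permissions, inode, user, group, size, sha256
MyRule = p+i+u+g+s+sha256

# selection lines are regular expressions; ! excludes a path
/etc MyRule
/bin MyRule
!/etc/mtab

Initializing the database and checking against it later:

aide --init
mv /var/lib/aide/aide.db.new /var/lib/aide/aide.db
aide --check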