Is there a copy-and-verify command in Ubuntu/Linux?
Solution 1:
From man rsync, under the -c option:
-c, --checksum: skip based on checksum, not mod-time & size
Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred, but that automatic after-the-transfer verification has nothing to do with this option’s before-the-transfer "Does this file need to be updated?" check.
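If you want to see the copy-then-verify idea spelled out without rsync, here is a minimal Python sketch. It is not how rsync implements its check; the function names (file_digest, copy_and_verify) and the choice of SHA-256 are my own assumptions for illustration. It copies the file, re-reads both sides, and raises an error if the digests differ.

```python
import hashlib
import shutil


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst (hypothetical helper), then compare digests of both files."""
    shutil.copy2(src, dst)  # copies file contents and metadata
    src_digest = file_digest(src)
    dst_digest = file_digest(dst)
    if src_digest != dst_digest:
        raise IOError(f"checksum mismatch copying {src} -> {dst}")
    return dst_digest
```

One caveat: right after a copy, the re-read of the destination may be served from the kernel's page cache rather than from the disk, so this mainly catches errors in the copy itself, not necessarily errors in what finally landed on the medium.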
Solution 2:
Several years ago I had the same requirements as you do. The solution I chose was to use ZFS via the ZFS-FUSE driver on my storage server. My thinking was that my personal photos, scanned documents, and other similar files were things that I may access only occasionally, so it may be a very long time, say a year or more, before I notice that a file has been corrupted due to a drive error or the like. By that time, all of the backup copies I have may contain the bit-rotted version of the file(s).
ZFS has a benefit over RAID-5 in that it can detect and repair errors in the data stored on the individual discs, even if the drives do not report a read error while reading the data. It will detect, via checksums, that one of the discs returned corrupted information and will use the redundancy data to repair that disc.
Because of the way the checksumming in ZFS is designed, I felt that I could rely on it to store infrequently used data for long periods of time. Every week I run a "zpool scrub" which goes through and re-reads all the data and verifies checksums.
ZFS-FUSE has performed quite well for me over the last few years.
In the distant past, for a client, I implemented a database system that stored checksum information on all files stored under a particular directory. I then had another script that would run periodically and check the file against the checksum stored in the database. With that we could quickly detect a corrupted file and restore from backups. We were basically implementing the same sorts of checks that ZFS does internally.
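For anyone who wants to try that approach without ZFS, a rough sketch of the idea follows. This is not the system I built for the client; it swaps the database for a JSON file, and every name here (sha256_of, build_manifest, verify_manifest, the manifest path) is a hypothetical choice for illustration.

```python
import hashlib
import json
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def save_manifest(manifest, manifest_path):
    """Store the manifest as JSON (stand-in for the database)."""
    with open(manifest_path, "w", encoding="utf-8") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)


def load_manifest(manifest_path):
    """Load a manifest previously written by save_manifest."""
    with open(manifest_path, "r", encoding="utf-8") as f:
        return json.load(f)


def verify_manifest(root, manifest):
    """Return sorted relative paths that are missing or whose digest changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_of(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

You would run build_manifest and save_manifest once after storing the files, then call verify_manifest periodically (from cron, say) and restore anything it reports from backup. Keep the manifest outside the directory being checked, and note that a sketch like this cannot tell a legitimate edit from corruption; it only reports that the content no longer matches.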
Solution 3:
I found this utility (Linux and Windows) that does just what you want (hashed copy+hashed verification with log): http://sourceforge.net/projects/quickhash/
The only downside is that it exists only as a GUI (no command-line access).
Since v1.5.0, a selected source folder can be hashed, then copied & reconstructed to a destination folder where the content is again hashed for verification. Since 1.5.5, selected file masks can be used, too (*.doc; *.xls etc).
Solution 4:
crcsum extends Linux cp & mv with checksum verification: https://sourceforge.net/projects/crcsum/