File copy program that generates checksums of the data while copying too

My question in short: is there a tool that copies a file from directory A to B and, in the same pass, generates the checksum of the data it has read, without doing an extra read just to compute that checksum?

I will be copying a few TBs of files from one HDD to another and instead of:

  1. Copy files from HDD1 -> HDD2 (X hours)
  2. Generate checksums of files on HDD1 (Y hours)
  3. Verify checksums of files on HDD2 (~Y hours)

I was thinking of a more streamlined process:

i. Copy files from HDD1 -> HDD2 and generate checksums of the files copied as well (Z hours)

ii. Verify checksums of files on HDD2 (~Y hours)

My assumption is that Z ~= X because a program that can do this will already have read the complete file (as it's copying it from one HDD to another) and hence does not need to read the file again just to generate its checksum.
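To make the idea concrete, here is a minimal Python sketch of a single-pass copy that hashes each chunk as it is read. It is only an illustration, not one of the tools I am asking about: the function name `copy_with_checksum`, the 1 MiB chunk size, and the choice of SHA-256 are all my own assumptions.

```python
import hashlib


def copy_with_checksum(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path and return the SHA-256 hex digest of the bytes read.

    The file is read once, in chunks; every chunk is fed to the hash and then
    written to the destination, so no second read of the source is needed.
    chunk_size (1 MiB here) is an arbitrary illustrative value.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            digest.update(chunk)
            dst.write(chunk)
    return digest.hexdigest()
```

The returned digest describes what was read from HDD1, so the copy on HDD2 still has to be read back once to confirm that what was written matches.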

I know this idea might not work if, for example, the OS uses DMA to copy the file, and I am not sure what technique Windows 7 uses to copy files from one HDD to another.

Any suggestions to this effect will be appreciated, especially for speeding up the copying process and making sure the transfer is 1:1 without corruption or missing files.
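For the verification step on HDD2, this is roughly what I have in mind, again as a hedged Python sketch rather than a real tool. The names `file_sha256` and `verify_tree`, and the manifest format (a dict mapping relative paths to expected SHA-256 hex digests), are hypothetical.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tree(dest_root, manifest):
    """Check every manifest entry against the files under dest_root.

    manifest maps a relative path to the checksum recorded for the source
    file. Returns two sorted lists: paths missing from the destination and
    paths whose checksum differs.
    """
    missing, corrupted = [], []
    for rel_path, expected in sorted(manifest.items()):
        full_path = os.path.join(dest_root, rel_path)
        if not os.path.isfile(full_path):
            missing.append(rel_path)
        elif file_sha256(full_path) != expected:
            corrupted.append(rel_path)
    return missing, corrupted
```

Two empty lists would mean every listed file exists on HDD2 with the expected checksum. One caveat: a read-back done soon after writing may be served from the OS cache rather than from the disk itself.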


Solution 1:

Your assumption is not entirely correct. Large files are never held in memory in full; to speed up copying, files are transferred in fixed-size chunks (on Linux you can tune the chunk size to speed up file operations), although the data is cached in memory along the way. As for DMA: the whole point of the technology is to move data from the disk into RAM without tying up the CPU, so the data does not go directly from HDD to HDD. DMA stands for Direct Memory Access.
I would suggest booting a Linux LiveCD and using rsync or a few very simple scripts, but I understand that this would probably cost more time than it saves, so you are better off sticking with Windows. Try these:
http://technet.microsoft.com/en-us/magazine/2006.11.utilityspotlight.aspx
http://www.karenware.com/powertools/ptreplicator.asp
http://sourceforge.net/projects/rsyncwin32/
http://codesector.com/teracopy

EDIT
There is a newer, more powerful edition of Microsoft's Robocopy: http://technet.microsoft.com/en-us/magazine/2009.04.utilityspotlight.aspx

EDIT 2
If you find during replication that something was corrupted, I would doubt that HDD2 is safe to use for data storage in the long run (more sectors will only become corrupted over time).