Over gigabit connection, Teracopy does 31MB/s, but Windows 8 does it at ~109MB per second?

I got my brain-melting first taste of Gigabit networking today, between my 2011 Mac mini and Windows 8 Pro desktop, connected via Cat 5e to a Linksys WRT320N (sporting DD-WRT).

After making sure that the line speed on both systems showed 1Gbps, I proceeded to copy a 2.4GB MP4 from the Mini to the Win 8 desktop (SMB sharing). Although satisfied with the 30-34 MB/s that Teracopy was showing (that was a proper step-up for me from 10 MB/s), I was still curious about this massive difference between the advertised and real-world speed.

2 hours of Googling had me believing that there were other factors that resulted in less speed, SMB being one. So just for the sake of doing it, I iPerf'd both the systems and guess what that showed - around 875 Mbps on both systems!

I then stumbled upon this little piece of info after which I turned off Teracopy and copied the same file through Windows 8's regular copier. 109 MB/s. Molten brains :)

What exactly is causing this? And can I enable such speeds via Teracopy? I really dig the extra features that Teracopy has, will surely miss them now :D


What exactly is causing this? And can I enable such speeds via Teracopy? I really dig the extra features that Teracopy has, will surely miss them now :D

Two words: verification and cache

Technical Explanation

This is the general procedure for copying a file with Windows Explorer:

  1. Read a chunk of data from the source drive into memory
  2. Send the chunk through the system(s)
  3. Write the chunk to the destination drive
  4. If not done yet, return to step 1
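
The loop above can be sketched in a few lines of Python. This is only an illustration of the read/write cycle, not how Explorer is actually implemented; the function name `copy_stream` and the 1 MiB chunk size are assumptions made for the example, and in-memory streams stand in for the two drives.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB per chunk; an arbitrary choice for illustration


def copy_stream(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst chunk by chunk and return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)  # step 1: read a chunk into memory
        if not chunk:                 # an empty read means end of file
            break
        dst.write(chunk)              # steps 2-3: send the chunk and write it
        total += len(chunk)
    return total                      # step 4 is the loop itself


# In-memory streams stand in for the source and destination drives.
source = io.BytesIO(b"x" * (3 * CHUNK_SIZE + 123))
destination = io.BytesIO()
copied = copy_stream(source, destination)
print(copied)
```

Each byte is touched exactly twice here, once by `read` and once by `write`.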

This seems simple and short enough. With this transfer algorithm, each byte of the file gets processed only two times: one read, one write.

But in addition, Windows uses memory (as does the drive itself) to cache some data. So instead of waiting for the previous chunk to finish getting written, and then reading the next chunk, a new chunk can be read while the previous one is still being written. Obviously this can’t keep up forever, but Windows can use all remaining free memory as a temporary buffer to hold most, if not all, of the file being read.

You can see the caching in action by copying a large file—or a folder containing a lot of files—from one drive to another, then immediately comparing the two. The comparison will be much faster at that point than if you do it later on because the file(s) are still in memory, so it is not actually reading them from the drive(s).

Since memory is very fast, and read speeds tend to be a little faster than write speeds, the ultimate transfer rate ends up being limited only by the write speed of the destination drive.

Teracopy can do two things that can slow down a file transfer which Explorer does not do:

  • Forgo the cache and read directly from the drive

  • Verify that the destination was written correctly

Unlike Explorer, which only checks for basic errors during the transfer, Teracopy can actually verify that the data was written correctly to the destination drive. This prevents data corruption that could happen due to problems in the transfer media (network, drive cable, etc.) or the drive itself (bad sector, etc.). Doing this means that it has to read the file back from the destination to compare it to the original.

Depending on the algorithm used and the size of the file, verification can be optimized as low as (but no less than) three drive operations for each file/file-chunk as opposed to Explorer’s two: read the file from the source, write the file to the destination, and read the file from the destination.

Look at what happens when you copy a file with Teracopy (with an HDD-optimized algorithm):

  1. Read a chunk from the source drive
  2. Hash the chunk read from the source drive
  3. Send the chunk through the system(s)
  4. Write it to the destination drive
  5. Clear caches
  6. Read the chunk back from the destination drive
  7. Hash the chunk from the destination drive
  8. Compare hashes
  9. Determine next step
    • If hashes don’t match, give error and prompt user for action
    • If hashes matched and not finished, go back to step 1
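
As a rough sketch of the per-chunk verification steps above (not TeraCopy's actual code), the following Python copies a file and checks each chunk immediately after writing it. SHA-256 is an assumed stand-in for whatever hash TeraCopy uses, and the function name is hypothetical. Note that `os.fsync` only pushes the data to the device; it does not evict the operating system's cache, so the "clear caches" step is not really reproduced here.

```python
import hashlib
import os
import tempfile


def copy_with_inline_verify(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src to dst, verifying each chunk right after it is written.

    Returns True if every chunk read back from the destination matched.
    """
    offset = 0
    with open(src_path, "rb") as src, open(dst_path, "w+b") as dst:
        while True:
            chunk = src.read(chunk_size)                  # step 1
            if not chunk:
                return True                               # finished, all matched
            src_hash = hashlib.sha256(chunk).digest()     # step 2
            dst.seek(offset)
            dst.write(chunk)                              # steps 3-4
            dst.flush()
            os.fsync(dst.fileno())                        # step 5 stand-in only
            dst.seek(offset)
            readback = dst.read(len(chunk))               # step 6
            dst_hash = hashlib.sha256(readback).digest()  # step 7
            if dst_hash != src_hash:                      # step 8
                return False                              # step 9: report mismatch
            offset += len(chunk)


with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "source.bin")
    dst_file = os.path.join(tmp, "copy.bin")
    with open(src_file, "wb") as f:
        f.write(os.urandom(300_000))
    verified = copy_with_inline_verify(src_file, dst_file)
    same_size = os.path.getsize(src_file) == os.path.getsize(dst_file)

print(verified, same_size)
```

Because the read-back here is very likely served from memory, a real verifier would need unbuffered I/O or an explicit cache flush, which is exactly the cost discussed in this answer.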

The problem is that if you cache the files during transfer, the comparison becomes useless because you are not reading the actual data on the destination drive; you are reading the copy cached in memory from the source. Therefore, to properly verify, you must clear the cache. This can be done after each read and write (which is an extra operation that would end up getting done countless times for files of any significant size), or just once after the whole file has been transferred.

According to the screenshot below, TeraCopy performs verification after the files are transferred, not during. This means that it uses this CPU/RAM-optimized transfer algorithm instead:

  1. Read a chunk from the source drive
  2. Send the chunk through the system(s)
  3. Write it to the destination drive
  4. If not done yet, return to step 1
  5. Copy finished, so clear caches and begin verification
  6. Read a chunk from the source drive
  7. Hash the chunk from the source drive
  8. Read a chunk from the destination drive
  9. Hash the chunk from the destination drive
  10. Compare hashes
  11. Determine next step
    • If hashes don’t match, give error and prompt user for action
    • If hashes matched and not finished, go back to step 6
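
The copy-then-verify variant can be sketched like this in Python. Again, this is an illustration under assumptions rather than TeraCopy's real implementation: `shutil.copyfile` stands in for the copy phase, SHA-256 stands in for the unknown hash, whole-file digests replace the chunk-by-chunk comparison, and the cache-clearing step is omitted. The function names are hypothetical.

```python
import hashlib
import os
import shutil
import tempfile


def file_digest(path, chunk_size=64 * 1024):
    """Hash a file chunk by chunk so it never has to fit in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_then_verify(src_path, dst_path):
    """Copy the whole file first, then re-read both sides and compare hashes."""
    shutil.copyfile(src_path, dst_path)                    # copy phase: read + write
    return file_digest(src_path) == file_digest(dst_path)  # verify phase: 2 more reads


with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "source.bin")
    dst_file = os.path.join(tmp, "copy.bin")
    with open(src_file, "wb") as f:
        f.write(os.urandom(300_000))

    copy_ok = copy_then_verify(src_file, dst_file)

    # Simulate corruption by flipping every bit of the first byte of the copy.
    with open(dst_file, "r+b") as f:
        first = f.read(1)
        f.seek(0)
        f.write(bytes([first[0] ^ 0xFF]))
    corrupted_ok = file_digest(src_file) == file_digest(dst_file)

print(copy_ok, corrupted_ok)
```

The verify phase reads the source a second time, which is the extra pass that matters when the source sits on the other side of a network link.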

While this algorithm puts slightly less strain on the CPU and RAM, it also puts a lot more strain on the drive(s) because now each file has to be processed four times: read the entire file from the source, write it to the destination, then read it again from the source, and again from the destination.

(If TeraCopy were optimized for network transfers, it could avoid sending the whole file a second time for verification and send only the hashes, which are much smaller. That does not currently appear to be the case: network drives are treated the same as local drives, so it actually re-reads the source.)

By default, TeraCopy does not verify or use the cache. Not verifying would speed the transfer up (or more accurately, not slow it down), while not using cache would slow it down.

Application

To determine your specific speed results, you would have to check your settings to see if you have changed them. Then you can try to approximate what kind of speeds you would get with the different settings (be aware that they will likely interact, so it is not a simple matter of adding or subtracting).

That said, let’s do a cursory calculation using your numbers:

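  • iPerf result: 875Mbps (this is measured network throughput, not drive speed; it works out to roughly 109MBps in decimal units)
  • Network speed: 1Gbps = 125MBps in decimal units, or 119.21MBps in binary units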
  • Copy through Windows Explorer: 109MBps
  • Copy through TeraCopy: 34MBps

Right off the bat, we see that Explorer’s file-transfer is nearly maxing out your network throughput. A 1Gbps link carries 1,000,000,000 bits per second, which is 125,000,000 bytes per second, or about 119MBps in binary units. Explorer is clocking 109MBps, and the remaining 10MBps (which amusingly enough was your previous max :^Þ) can easily be accounted for by overhead, background load, and fragmentation.
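
For anyone who wants to check the unit conversion, here is the arithmetic as a small Python snippet. The helper name `bits_to_megabytes` is made up for this example; the 1Gbps and 875Mbps figures are the link speed and iPerf result from the question.

```python
LINK_BITS_PER_SECOND = 1_000_000_000   # nominal gigabit link
IPERF_BITS_PER_SECOND = 875_000_000    # iPerf result reported in the question


def bits_to_megabytes(bits_per_second, binary=False):
    """Convert bits/s to megabytes/s (decimal MB) or mebibytes/s (binary MiB)."""
    bytes_per_second = bits_per_second / 8
    divisor = 1024 * 1024 if binary else 1_000_000
    return bytes_per_second / divisor


link_decimal = bits_to_megabytes(LINK_BITS_PER_SECOND)                 # 125.0
link_binary = bits_to_megabytes(LINK_BITS_PER_SECOND, binary=True)     # about 119.21
iperf_decimal = bits_to_megabytes(IPERF_BITS_PER_SECOND)               # 109.375
iperf_binary = bits_to_megabytes(IPERF_BITS_PER_SECOND, binary=True)   # about 104.31

print(f"1 Gbps   = {link_decimal:.2f} MB/s = {link_binary:.2f} MiB/s")
print(f"875 Mbps = {iperf_decimal:.2f} MB/s = {iperf_binary:.2f} MiB/s")
```

Which unit a given copy dialog reports is not something this snippet can tell you, so treat comparisons within a few percent as equal.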

Since the transfer rate is almost equal to the network speed, we can surmise that Explorer’s file transfer is one-way and that only a single copy of each file is getting sent. Explorer gets 109MBps with two file accesses per file (one read, one write).

Now for TeraCopy. It seems that TeraCopy is getting almost exactly one-third of Explorer’s speed.

Depending on whether or not its preferences dialog accurately indicates the algorithm it uses, TeraCopy could actually be sending the entire source file twice so that it can check the copy. Right away, this cuts the throughput in half. If the cache is off, then that too reduces the speed because it has to wait for each write to finish before it can send a new chunk. When combined with verification, it can knock it down even more.
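
A back-of-the-envelope version of that reasoning, as a Python sketch. The model is deliberately crude and is an assumption of this example: it treats Explorer's 109MBps as the best single-pass rate and simply divides by the number of times the file crosses the network.

```python
EXPLORER_MBPS = 109   # Explorer's reported rate, from the question
TERACOPY_MBPS = 34    # TeraCopy's reported rate (upper figure), from the question


def effective_rate(single_pass_rate, network_passes):
    """Rate the user sees if the file has to cross the network several times."""
    return single_pass_rate / network_passes


# Copy plus verification re-read of the source means two passes over the network.
two_pass_rate = effective_rate(EXPLORER_MBPS, 2)
observed_vs_estimate = TERACOPY_MBPS / two_pass_rate

print(f"estimate with two network passes: {two_pass_rate:.1f} MBps")
print(f"observed / estimate: {observed_vs_estimate:.2f}")
```

Under this model, two passes alone would give about 54.5MBps, so the observed 34MBps is only about 62% of that. The rest of the gap would have to come from other factors, such as the cache setting.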

Your transfer rate of 34MBps seems reasonable if you have verification on and caching off. If you turn verification off and caching on, you should get about the same as Explorer (you may still get slightly less or even slightly more, depending on how different TeraCopy’s file-transfer code is from Explorer’s).

If you’re in the mood to transfer nearly 10GB, then you could also just try altering the settings and redoing the transfer for each of the four permutations and note down the speeds you get (to be safe, order it so that the cache is off between runs: V+C+, V+C-, V-C+, V-C-).


Screenshot of TeraCopy preferences dialog with default settings

I would suspect it's related to the fact that Windows 8 uses SMB 3, and I doubt TeraCopy is compatible with SMB 3, so it is falling back to SMB 2. Just a guess...