Copying terabytes of data across hundreds of thousands of files in a folder is slow

I am currently running FreeNAS and using SMB 3 on Windows machines to copy folders with 80000+ files that are each about 35MB. Here is the config:

FreeNAS

  • 2x40Gbps connections bonded
  • SMB share with SMB 3.1 enabled
  • 1 Xeon 8 core with 512GB of RAM
  • 400TB of storage in RAID-Z1 using 4TB drives for more IOPS
  • 23 groups of 5 drives per RAID group
  • 3x LSI 3008 SAS 3.0 12Gb/s Host Bus Adapters
  • Similar config can be built on thinkmate.com using the SUPERSTORAGE SERVER 6048R-E1CR72L as a base and then adding expansion chassis
  • Jumbo Frames enabled
  • during transfers CPU usage is at about 50%
  • during transfers RAM usage is at 60%

Workstations

  • Windows 10 Pro
  • i7 3.6GHz and 16GB of RAM
  • 512GB m.2 drive
  • 40Gbps card in a PCIe 3.0 x16 slot
  • Jumbo frames enabled
  • TCP Offload disabled
  • External RAID 0 (3 or 4 disk) drives are connected via USB-C
  • CPU usage during transfers is at 20%
  • RAM usage during transfers is at 15%

So I have these RAID 0 drives with about 4TB of files each, and each file is 35MB. Each folder has about 80000 files. There are 8 simultaneous transfers across 8 workstations.

When I use robocopy to copy the files over, I get about 1.8Gbps. Then, as the copy gets deeper and deeper into those files, the speed drops to about 600Mbps. This happens whether I'm using /MT:10 or /MT:1 on robocopy. EMCopy hasn't fared much better, and FreeFileSync wants to die after about 3 hours. I want the transfer to at least stay stable at 1.8Gbps instead of constantly dropping. Browsing the shares from the workstations also becomes unresponsive during these transfers. Has anyone else experienced this?


The root cause of the slow transfer rate may be that the workstation M.2 drives have to do a lot of random reads.

Fast NVMe M.2 drives (which you are most likely using, I think) are advertised with read/write speeds of multiple GB/s. That is true for sequential reads of big files, but in your situation the workload consists of random reads instead. Random read rates for common consumer/prosumer NVMe M.2 SSDs range from about 70MB/s to 110MB/s, which is right in line with your rate of 600Mbps. Reviews of SSDs often include random read speed results, which is where I got that range from.
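To see why 600Mbps lands inside that range, convert megabits per second to megabytes per second:

```shell
# 600 Mbps (megabits/s) to MB/s (megabytes/s): divide by 8 bits per byte.
echo "$(( 600 / 8 )) MB/s"    # prints "75 MB/s"
```

75MB/s sits squarely in the 70-110MB/s random read range of common NVMe SSDs.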

There are SSDs, such as Intel Optane drives, that can deliver random read speeds in the ballpark of 500MB/s.

Furthermore, you state that you connect the drives via USB-C. Depending on the underlying technology (USB 3.0, 3.1, 3.2, or Thunderbolt), this connection might cause slowdowns as well. Internal NVMe M.2 drives (or other fast PCIe-based ones) might resolve the problem.

To confirm or rule out this assumption, you can use the Windows 10 Task Manager or the Performance Monitor. The Task Manager shows a percentage of how busy each drive is. If the drive(s) in question sit at 100%, or anything above 80%, they are likely limiting the speed. If, on the other hand, a drive is idling, it is not the limit. Disclaimer: I do not know how reliable the busy percentages in the Windows Task Manager are, especially for external drives.
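You can also cross-check the Task Manager figure from PowerShell with the performance counters. A sketch (the counter path assumes an English-language Windows install; on localized systems the counter names differ):

```shell
# Sample "% Disk Time" for every physical disk, 5 samples 2 seconds apart;
# values pinned near 100% suggest that disk is the bottleneck.
Get-Counter -Counter '\PhysicalDisk(*)\% Disk Time' -SampleInterval 2 -MaxSamples 5
```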

If it turns out that the drives on the source side are not busy at all, check the destination side and see how the drives are doing there (you can use the tool iostat for that).
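On the FreeNAS (FreeBSD) side that could look like the following; the flags are from FreeBSD's iostat, where -x prints extended per-device statistics and -w sets the sampling interval in seconds:

```shell
# Extended per-disk statistics every 5 seconds; watch the %b (busy) column.
iostat -x -w 5
```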

If none of this helps because you were able to exclude the drives on both the source and destination sides as the root cause, then I suggest you start with basic troubleshooting steps. For example, transfer one big file and see whether it suffers the same limitation. You could also reverse the transfer direction and copy some of the small files back onto the workstations. If just the reversal leads to much better speeds, then perhaps one component limits only reads and not writes, or vice versa.
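As a sketch, those two tests could look like this with robocopy (the paths and file names here are placeholders, not your actual share layout):

```shell
REM Test 1: copy one large file; /J uses unbuffered I/O, which suits big files.
robocopy D:\testdata \\freenas\share bigfile.bin /J

REM Test 2: reverse direction - copy a batch of the small files back.
robocopy \\freenas\share\somefolder D:\reversetest /MT:10
```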

Or try to rule out components by attaching devices directly, with no extra switch in between, or by removing whatever else you can from the scenario for testing.


OK, it looks like the issue is resolved now. Here was the solution.

In /etc/samba/smb-shares.conf.local, this line was added to the share we are using:

case sensitive = yes

Now we are transferring at a stable 200MBps. While not the ideal speed, it is no longer decreasing over time, which fixes the slowdown issue.
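For anyone hitting the same problem: with Samba's default case-insensitive name matching, the server may have to scan a directory to resolve each client filename, which gets expensive in folders with 80000+ entries; `case sensitive = yes` allows a direct lookup instead. A sketch of what the share section might then look like (the share name and path below are placeholders, not the actual config):

```
; "archive" and the path below are placeholders for your actual share
[archive]
    path = /mnt/tank/archive
    read only = no
    ; avoids per-file case-insensitive directory scans in huge folders
    case sensitive = yes
```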