Achieving very fast (300+MB/sec) file transfer on Linux

I'm working with high-end hardware; however, I'm hitting CPU bottlenecks in every scenario when attempting to move large amounts of data.

Specifically, I'm moving 2TB virtual machine image (VHD) files between two Ubuntu hosts.

My latest attempt took 200 minutes to transfer 2TB, which works out to a throughput of about 170MB/sec.

I've tried techniques such as netcat, and scp with the basic arcfour cipher.
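The invocations were roughly along these lines (hostnames, paths, and the port are placeholders; arcfour also has to be re-enabled in newer OpenSSH releases, and netcat flags vary between variants):

    # scp with a lightweight cipher (arcfour is disabled by default in current OpenSSH)
    scp -c arcfour /data/vm.vhd user@dest:/data/vm.vhd

    # plain netcat: start the listener on the receiver first, then push from the sender
    nc -l 5000 > /data/vm.vhd      # on the destination
    nc dest 5000 < /data/vm.vhd    # on the source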

The hardware on each end is 6 x enterprise-grade SSDs in RAID 10 on a hardware RAID controller, 256GB of memory, and Xeon v4 CPUs. The network is 20GbE (2 x 10GbE LACP).

In all cases the network and disk I/O have plenty of capacity left; the bottleneck is a single CPU core constantly pegged at 100%.

I've performed basic benchmarks using various methods, as follows:

30GB test file transfer

scp: real 5m1.970s

nc: real 2m41.933s

nc & pigz: real 1m24.139s
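The nc & pigz run was a pipeline along these lines (host, path, and port are placeholders again; pigz uses one thread per core unless capped with -p):

    # destination: listen, decompress in parallel, write to disk
    nc -l 5000 | pigz -d > /data/test.img

    # source: compress in parallel, stream over netcat
    pigz -c /data/test.img | nc dest 5000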

However, because I dd'd an empty file for testing, I don't believe pigz had to work very hard. When I tried pigz on a production VHD file, it hit 1200% CPU load, and I believe that became the bottleneck. Therefore my fastest time was set by nc on its own.
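For reference, the empty test file was created with something like the following (block size and count are illustrative), which is why it compresses so trivially:

    # 30GB of zeros -- pigz barely has to work on this
    dd if=/dev/zero of=/data/test.img bs=1M count=30720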

nc hits 100% CPU on each end, I assume simply from shuttling the I/O from disk to network.

I did think about splitting the file into chunks and running multiple nc instances to make use of more cores, but someone may have a better suggestion.
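A rough sketch of the chunked idea, using dd skip/seek to carve the image into ranges and one nc stream per range (ports, chunk size, and paths are illustrative, and the block arithmetic assumes the image divides evenly into four):

    # destination: start one listener per chunk, each writing at its own offset
    CHUNK=524288   # 1MiB blocks per chunk (512GiB, i.e. a quarter of a 2TiB image)
    for i in 0 1 2 3; do
      nc -l $((5000+i)) | dd of=/data/vm.vhd bs=1M seek=$((i*CHUNK)) conv=notrunc &
    done
    wait

    # source: stream each quarter of the file to its own port in parallel
    CHUNK=524288
    for i in 0 1 2 3; do
      dd if=/data/vm.vhd bs=1M skip=$((i*CHUNK)) count=$CHUNK | nc dest $((5000+i)) &
    done
    wait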


Solution 1:

A few things to try:

  • use a program that uses sendfile (e.g. apache)
  • tune the Linux network stack and NIC (see the example commands at the end of this answer)
  • enable a larger MTU
  • enable NIC offloading
  • use a better performing filesystem (xfs or zfs)

The ESnet Fasterdata Knowledge Base is a great resource for optimizing data movement across fast networks.
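As a hedged starting point for the network-related bullets above, assuming an interface named eth0 and buffer sizes in the ballpark of ESnet's suggestions (verify the values against your own NIC, switch, and kernel before applying them):

    # jumbo frames -- the switch and both hosts must agree on the MTU
    ip link set dev eth0 mtu 9000

    # make sure the common NIC offloads are switched on
    ethtool -K eth0 tso on gso on gro on

    # larger TCP buffers for a 10/20GbE path (example values only)
    sysctl -w net.core.rmem_max=134217728
    sysctl -w net.core.wmem_max=134217728
    sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 67108864"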