Multi-threaded file sync between two Linux servers

At the moment I'm running rsync over 2.2 million files totalling 250 GB, and it just takes ages: 700K files in 6 hours.

Does anyone know of an rsync-like tool that can do this with multiple threads, so it goes faster?


Solution 1:

I doubt CPU is the limiting factor here. You're most likely limited by network bandwidth for the transfer and by disk I/O, especially the latency of all those stat() calls.

Can you break down the filesystem hierarchy into smaller chunks to process in parallel?

What are the source files, and what's writing or modifying them? Would it be possible to send changes as they happen at the application level?

Solution 2:

If the disk subsystem of the receiving server is an array with multiple disks, running multiple rsync processes can improve performance. I am running 3 rsync processes to copy files to an NFS server (RAID 6 with 6 disks per RAID group) to saturate Gigabit Ethernet.

This post describes a basic Python harness that spawns multiple rsync processes: http://www.reliam.com/company/featured_geek