How to speed up rsync for small files
I'm trying to transfer thousands of small files from one server to another using the following command:
rsync -zr --delete /home/user/ [email protected]::backup
Currently the transfer takes a long time (I haven't timed it). Is there a way to make this faster? Should I be using another tool? Should I be using rsync over ssh rather than the rsync protocol?
Solution 1:
You need to determine the bottleneck. It isn't rsync. It probably isn't your network bandwidth. As @Zoredache suggested, it is most likely the huge number of IOPS generated by all the stat() calls. Any syncing tool is going to need to stat the files. While syncing, run iostat to verify.
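One way to watch the disks during a sync is sketched below. It assumes the sysstat version of iostat is installed; watch_disk is a made-up helper name, not part of any tool.

```shell
# Hypothetical helper: print extended per-device statistics (-d -x)
# every N seconds (default 5) until interrupted with Ctrl-C.
watch_disk() {
    iostat -dx "${1:-5}"
}

# Run in a second terminal on each host while rsync is working:
# watch_disk 5
```

If the utilization column stays close to 100% and the await columns climb while the transferred kB/s stay low, the disks are probably saturated by many small operations rather than by bandwidth.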
So the question becomes: how do I optimize stat? Two easy answers:
- get a faster disk subsystem (on both hosts if need be) and
- tune your filesystem (e.g. for ext3, mount with noatime and add a dir_index).
If by some chance it isn't your disk IOPS that is the limit, then you could experiment with splitting the directory tree into multiple distinct trees and running multiple rsyncs.
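The multiple-rsync idea could be sketched roughly as follows. This is only an illustration: parallel_sync is a hypothetical helper, it starts one rsync per top-level subdirectory with no limit on concurrency, and it assumes the destination accepts a subdirectory path appended to it.

```shell
# Hypothetical helper: one rsync per top-level subdirectory, all in parallel.
parallel_sync() {
    src=${1%/}     # source root, trailing slash stripped
    dest=${2%/}    # destination root (placeholder, e.g. host::module)
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        name=$(basename "$dir")
        # "$dir" ends in a slash, so rsync copies the directory's contents.
        rsync -r --delete "$dir" "$dest/$name/" &
    done
    wait    # block until every background rsync has exited
}

# Example (destination is a placeholder):
# parallel_sync /home/user backuphost::backup
```

Note what this sketch leaves out: files sitting directly in the source root and hidden (dot) directories are not matched by the glob, and --delete only applies inside each subtree, so a top-level directory removed on the source is not removed on the destination.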
Solution 2:
Compression is not very useful for small files (say, less than 100 bytes). For small files, the compressed version can sometimes be even bigger than the original. Try the rsync command without the -z flag.
ssh is good for security, but will not make the transfer faster. In fact, it would make the transfer slower due to the need for encryption/decryption.
rsync may not seem fast the first time it is run because there is a lot of data to transfer. However, if you plan on running this command periodically, subsequent runs may be much faster since rsync is smart about not transferring files that have not changed.
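One caveat about repeat runs: rsync's default quick check compares file size and modification time, and the rsync man page notes that without -t (or -a) modification times are not preserved, so the next run treats every file as needing an update. The command in the question uses only -zr. A minimal sketch that adds -t and drops -z, with sync_tree as a hypothetical helper name:

```shell
# Hypothetical helper: recursive sync that also preserves modification
# times (-t), so the size+mtime quick check can skip unchanged files.
sync_tree() {
    # $1: source directory (a trailing slash copies its contents)
    # $2: destination (local path, host:path, or host::module)
    rsync -rt --delete "$1" "$2"
}

# Example (destination is a placeholder):
# sync_tree /home/user/ backuphost::backup
```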
Solution 3:
In case ext3 or ext4 filesystems are involved, check that both have the dir_index feature enabled! This tripled rsync throughput in my case.
See details in my answer at: https://serverfault.com/a/759421/80414
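A rough way to run that check is sketched below. It assumes e2fsprogs is installed; has_dir_index is a hypothetical helper and /dev/sdXN is a placeholder device name, not something taken from the question.

```shell
# Hypothetical helper: reads `tune2fs -l <device>` output on stdin and
# succeeds if the "Filesystem features:" line lists dir_index.
has_dir_index() {
    grep '^Filesystem features:' | grep -qw 'dir_index'
}

# Check (as root; /dev/sdXN is a placeholder):
# tune2fs -l /dev/sdXN | has_dir_index && echo "dir_index enabled"

# Enable it and rebuild existing directory indexes. The filesystem
# should be unmounted for the e2fsck step:
# tune2fs -O dir_index /dev/sdXN
# e2fsck -fD /dev/sdXN
```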
Solution 4:
What version of rsync are you using? Anything older than 3.0.0 (on both ends) doesn't have the incremental file list feature, which speeds up large transfers.
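To check both ends, something like the following sketch may help. It assumes the first line of rsync --version looks like "rsync  version 3.1.2  protocol version 31"; rsync_is_v3 is a hypothetical helper name and backuphost is a placeholder.

```shell
# Hypothetical helper: reads `rsync --version` output on stdin and
# succeeds if the major version number is 3 or higher.
rsync_is_v3() {
    major=$(sed -n 's/^rsync  *version v\{0,1\}\([0-9][0-9]*\)\..*/\1/p' | head -n 1)
    [ -n "$major" ] && [ "$major" -ge 3 ]
}

# Local end:
# rsync --version | rsync_is_v3 && echo "local rsync is 3.x or newer"
# Remote end (host name is a placeholder):
# ssh backuphost rsync --version | rsync_is_v3 && echo "remote rsync is 3.x or newer"
```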