Solution 1:

I just had a similar problem: I had to move several TB from one NAS to a different NAS, with no backup/restore capability that would let me simply feed one set to the other.

So I wrote this script, which uses xargs to run several rsyncs, one for each directory it encounters. It depends on being able to list the source directories (be careful to escape ARG 3), but I think you could set that stage with a non-recursive rsync that just copies the files and directories to the appropriate level.

It also determines how many rsyncs to run based on the number of processors, but you might want to tweak that.

#!/bin/bash
# Usage: parallel_rsync.sh SRC_DIR DEST_DIR ["dir1 dir2 ..."]
SRC_DIR=$1
DEST_DIR=$2
LIST=$3
# scale the number of parallel jobs to the number of processors
CPU_CNT=$(grep -c processor /proc/cpuinfo)
# pseudo random heuristic
let JOB_CNT=CPU_CNT*4
# if no directory list was given, use the immediate subdirectories of SRC_DIR
[ -z "$LIST" ] && LIST=$(find "$SRC_DIR" -mindepth 1 -maxdepth 1 -type d -printf '%f\n')
echo "rsyncing From=$SRC_DIR To=$DEST_DIR DIR_LIST=$LIST"
# specific to my NAS layout; adjust or remove for your paths
mkdir -p /{OLD,NEW}_NAS/home
[ -z "$RSYNC_OPTS" ] && RSYNC_OPTS="-tPavW --delete-during --exclude .snapshot --exclude hourly.?"
cd "$SRC_DIR" || exit 1
# run one rsync per directory name, JOB_CNT of them at a time
echo $LIST | xargs -n1 echo | xargs -n1 -P $JOB_CNT -I% rsync ${RSYNC_OPTS} "${SRC_DIR}/%/" "${DEST_DIR}/%/"
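
For example, a hypothetical invocation (the script name, paths, and directory names below are placeholders):

# copy three home directories in parallel from the old NAS to the new one
./parallel_rsync.sh /OLD_NAS/home /NEW_NAS/home "alice bob carol"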

Solution 2:

GNU Parallel has a solution. 

I have moved 15 TB over a 1 Gbps link this way, and it can saturate the 1 Gbps link.

The following will start one rsync per big file (larger than roughly 50 MB, since find's -size unit defaults to 512-byte blocks) in src-dir to dest-dir on the server fooserver; {//} is GNU Parallel's replacement string for the directory part of each path:

cd src-dir; find . -type f -size +100000 | \
parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
  rsync -s -Havessh {} fooserver:/dest-dir/{}

The directories created may end up with the wrong permissions, and smaller files are not transferred. To fix both, run rsync a final time:

rsync -Havessh src-dir/ fooserver:/dest-dir/
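
If your data is spread across many directories rather than a few huge files, a per-directory variant of the same idea also works; this is a minimal sketch, assuming the same fooserver and dest-dir as above and 8 parallel transfers:

cd src-dir; ls -d */ | parallel -j8 -v rsync -a {} fooserver:/dest-dir/{}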

Solution 3:

Yes. Such a feature exists.

There is a utility called pssh that provides the described functionality.

The package provides parallel versions of the OpenSSH tools. Included in the distribution:

  • Parallel ssh (pssh)
  • Parallel scp (pscp)
  • Parallel rsync (prsync)
  • Parallel nuke (pnuke)
  • Parallel slurp (pslurp)

I'm not sure how easy it is to set up, but it might just do the trick!
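
As a rough sketch, a prsync invocation might look like the following (hosts.txt is a hypothetical file listing the target machines, one per line; check prsync's man page for the exact options in your version):

# push /src/dir recursively to every host listed in hosts.txt
prsync -h hosts.txt -r /src/dir /dest/dir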

Solution 4:

I cannot comment yet, so I have added a new answer, with slightly improved code compared to the previous (nice and smart) code.

Check the rsync line, because it contains an optional ionice tweak.

#!/bin/bash
start_time=$(date +%s.%N)
# Transfer files in parallel using rsync (simple script)
# MAXCONN: maximum number of "rsync" processes running at the same time
MAXCONN=6
# Source and destination base paths (no need to end with "/")
SRC_BASE=/home/user/public_html/images
DST_BASE=user@remotehost:/home/user/public_html/images
RSYNC_OPTS="-ah --partial"
# Main loop: one background rsync per subdirectory of SRC_BASE
for FULLDIR in "$SRC_BASE"/*; do
    [ -d "$FULLDIR" ] || continue
    # wait until fewer than MAXCONN rsync processes are running
    NUMRSYNC=$(ps -Ao comm | grep -c '^rsync$')
    while [ "$NUMRSYNC" -ge "$MAXCONN" ]; do
        NUMRSYNC=$(ps -Ao comm | grep -c '^rsync$')
        sleep 1
    done
    DIR=$(basename "$FULLDIR")
    echo "Start: $DIR"
    # ionice -c2 -n5 runs rsync in the best-effort I/O class at low priority
    ionice -c2 -n5 rsync $RSYNC_OPTS "$SRC_BASE/$DIR/" "$DST_BASE/$DIR/" &
    # rsync $RSYNC_OPTS "$SRC_BASE/$DIR/" "$DST_BASE/$DIR/" &
    # give the new rsync a moment to appear in the process list
    sleep 5
done

# wait for the remaining transfers to finish before reporting the total time
wait
execution_time=$(echo "$(date +%s.%N) - $start_time" | bc)
printf "Done. Execution time: %.6f seconds\n" "$execution_time"

Solution 5:

It looks like someone has already written this utility for you. It breaks the transfer into parallel chunks, and it is a better implementation than the "parallel big file" version in the GNU Parallel answer (Solution 2):

https://gist.github.com/rcoup/5358786

Also, lftp can parallelize file transfers via ftp, ftps, http, https, hftp, fish, and sftp. Using lftp often has advantages, because managing permissions, restricted access, and so on for rsync can be challenging.
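
For example, a minimal lftp sketch (assuming an SFTP-capable destination; the user, host, and both paths are placeholders, and --parallel controls how many files are transferred at once):

# mirror a local tree to the remote host, transferring up to 4 files in parallel
lftp -e 'mirror --reverse --parallel=4 /local/dir /remote/dir; quit' sftp://user@host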