Inverse multiplexing to speed up file transfer
Proof that it all adds up: I present the 'holy grail' of remote mirror commands. Thanks to davr for the lftp suggestion.
lftp -c "mirror --use-pget-n=10 --verbose sftp://username:[email protected]/directory"
The above will recursively mirror a remote directory, splitting each file into 10 segments (parallel connections) as it transfers!
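If you'd rather not put the password on the command line (it can end up in shell history and in ps output), the same command should work with SSH key authentication, assuming a key for the remote host is already set up; a minimal sketch with the same placeholder names:

lftp -c "mirror --use-pget-n=10 --verbose sftp://[email protected]/directory"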
There are a couple of tools that might work.

- LFTP - supports FTP, HTTP, and SFTP, and can use multiple connections to download a single file. Assuming you want to transfer a file from remoteServer to localServer, install LFTP on localServer and run:
lftp -e 'pget -n 4 sftp://[email protected]/some/dir/file.ext'
The '-n 4' is how many connections to use in parallel.
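If a transfer gets interrupted, pget's -c option should resume from the partial file instead of starting over; a sketch reusing the placeholders above:

lftp -e 'pget -c -n 4 sftp://[email protected]/some/dir/file.ext'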
Then there are the many 'download accelerator' tools, but they generally only support HTTP or FTP, which you might not want to have to set up on the remote server. Some examples are Axel, aria2, and ProZilla.
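For comparison, if the remote files do happen to be reachable over HTTP, a typical aria2 invocation looks like the sketch below (the URL is hypothetical); -x caps the connections per server and -s sets how many segments the file is split into:

aria2c -x 4 -s 4 http://remoteserver.example.com/some/dir/file.ext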
If you have a few large files, use lftp -e 'mirror --parallel=2 --use-pget-n=10 <remote_dir> <local_dir>' <ftp_server>: you will download 2 files at a time, with each file split into 10 segments, for a total of 20 FTP connections to <ftp_server>.
If you have a large number of small files, use lftp -e 'mirror --parallel=100 <remote_dir> <local_dir>' <ftp_server>: you'll download 100 files in parallel without segmentation. A total of 100 connections will be open; this may exhaust the available client slots on the server, or get you banned from some servers.
You can use --continue to resume the job :) and the -R option to upload instead of download (switching the argument order to <local_dir> <remote_dir>).
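Putting those options together, a resumable parallel upload might look like the following sketch (directory names are placeholders, and --use-pget-n is left out because segmented pget transfers only apply to downloads):

lftp -e 'mirror -R --continue --parallel=10 <local_dir> <remote_dir>' <ftp_server>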
You may be able to tweak your TCP settings to avoid this problem, depending on what's causing the 320KB/s per connection limit. My guess is that it is not explicit per-connection rate limiting by the ISP. There are two likely culprits for the throttling:
- Some link between the two machines is saturated and dropping packets.
- The TCP windows are saturated because the bandwidth delay product is too large.
In the first case, each TCP connection would effectively compete equally in standard TCP congestion control. You could improve this by changing the congestion control algorithm or by reducing the amount of backoff.
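On Linux, for example, the congestion control algorithm can be inspected and switched with sysctl; a sketch, assuming root access and that the chosen algorithm is compiled into (or loadable by) your kernel:

# list the algorithms this kernel can use
sysctl net.ipv4.tcp_available_congestion_control
# show the current choice
sysctl net.ipv4.tcp_congestion_control
# switch to a different algorithm, e.g. cubic
sudo sysctl -w net.ipv4.tcp_congestion_control=cubic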
In the second case you aren't limited by packet loss. Adding extra connections is a crude way of expanding the total window size. If you can manually increase the window sizes the problem will go away. (This might require TCP window scaling if the connection latency is sufficiently high.)
You can tell approximately how large the window needs to be by multiplying the round-trip "ping" time by the total speed of the connection: 1280KB/s needs 1280 bytes per millisecond of round trip (or 1311 bytes/ms if 1KB = 1024 bytes). A 64K buffer will be maxed out at about 50 ms of latency, which is fairly typical. A 16K buffer would then saturate at around 320KB/s.
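To act on that arithmetic on Linux, you can raise the kernel's socket buffer limits to at least the bandwidth-delay product so a single connection can use the full window; a sketch assuming root access, with 4MB as an arbitrary generous ceiling:

# confirm window scaling is enabled (needed for windows larger than 64K)
sysctl net.ipv4.tcp_window_scaling
# raise the maximum receive and send buffer sizes to 4MB
sudo sysctl -w net.core.rmem_max=4194304
sudo sysctl -w net.core.wmem_max=4194304
# let TCP autotuning grow the receive window up to the same limit (min default max, in bytes)
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"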