A few days ago I noticed something rather odd (at least to me). I ran rsync to copy the same data to an NFS mount, /nfs_mount/TEST, and deleted it afterwards. /nfs_mount/TEST is exported from nfs_server-eth1. The MTU on both network interfaces is 9000, and the switch in between supports jumbo frames as well. If I do rsync -av dir /nfs_mount/TEST/ I get a network transfer speed of X MBps. If I do rsync -av dir nfs_server-eth1:/nfs_mount/TEST/ I get a network transfer speed of at least 2X MBps. My NFS mount options are nfs rw,nodev,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountvers=3,mountproto=tcp.

Bottom line: both transfers go over the same network subnet, same wires, same interfaces, read the same data, write to the same directory, etc. The only difference is that one goes via NFSv3 and the other via rsync over SSH.

The client is Ubuntu 10.04, the server Ubuntu 9.10.

How come rsync is that much faster? How can I make NFS match that speed?

Thanks

Edit: please note that I use rsync either to write to the NFS share or to SSH into the NFS server and write locally there. Both times I run rsync -av, starting with an empty destination directory. Tomorrow I will try with a plain copy.

Edit2 (additional info): File sizes range from 1 KB to 15 MB. The files are already compressed; I tried to compress them further with no success. I made a tar.gz file from that dir. Here is the pattern:

  • rsync -av dir /nfs_mount/TEST/ = slowest transfer;
  • rsync -av dir nfs_server-eth1:/nfs_mount/TEST/ = fastest rsync with jumbo frames enabled; without jumbo frames it is a bit slower, but still significantly faster than the one directly to NFS;
  • rsync -av dir.tar.gz nfs_server-eth1:/nfs_mount/TEST/ = about the same as its non-tar.gz equivalent.

Tests with cp and scp:

  • cp -r dir /nfs_mount/TEST/ = slightly faster than rsync -av dir /nfs_mount/TEST/ but still significantly slower than rsync -av dir nfs_server-eth1:/nfs_mount/TEST/.
  • scp -r dir nfs_server-eth1:/nfs_mount/TEST/ = fastest overall, slightly faster than rsync -av dir nfs_server-eth1:/nfs_mount/TEST/;
  • scp -r dir.tar.gz nfs_server-eth1:/nfs_mount/TEST/ = about the same as its non-tar.gz equivalent.

Conclusion, based on these results: for this test there is no significant difference between transferring one large tar.gz file and many small files. Jumbo frames on or off also make almost no difference. cp and scp are faster than their respective rsync -av equivalents. Writing directly to the exported NFS share is significantly slower (at least 2 times) than writing to the same directory over SSH, regardless of the method used.

Differences between cp and rsync are not relevant in this case. I decided to try cp and scp just to see if they show the same pattern and they do - 2X difference.
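For anyone who wants to reproduce the comparison with numbers rather than "X" and "2X", here is a rough Python sketch that times a recursive copy and converts the result to MB/s. The function names are made up for illustration, it measures wall-clock time only, and it assumes the destination directory does not exist yet; caching on either side can make a single run look better than it is, so repeat it a few times.

```python
import os
import shutil
import time


def tree_size_bytes(root):
    """Sum the sizes of all regular files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def throughput_mb_per_s(num_bytes, seconds):
    """Convert a byte count and a duration into MB/s (1 MB = 10**6 bytes)."""
    return num_bytes / seconds / 1e6


def timed_copytree(src_dir, dst_dir):
    """Copy src_dir to a not-yet-existing dst_dir; return (bytes, seconds)."""
    start = time.perf_counter()
    shutil.copytree(src_dir, dst_dir)
    elapsed = time.perf_counter() - start
    return tree_size_bytes(dst_dir), elapsed


if __name__ == "__main__":
    # Hypothetical paths: adjust to your own source dir and NFS mount.
    copied, seconds = timed_copytree("dir", "/nfs_mount/TEST/dir")
    print(f"{throughput_mb_per_s(copied, seconds):.1f} MB/s")
```

Pointing dst_dir at the NFS mount gives the "direct to NFS" figure; the SSH variants still have to be timed from the shell.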

As I use rsync or cp in both cases, I can't understand what prevents NFS from reaching the transfer speed of the same commands over SSH.

How come writing to NFS share is 2X slower than writing to the same place over SSH?

Edit3 (NFS server /etc/exports options): rw,no_root_squash,no_subtree_check,sync. The client's /proc/mounts shows: nfs rw,nodev,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountvers=3,mountproto=tcp.

Thank you all!


Solution 1:

Maybe it's not slower transfer speed, but increased write latency. Try exporting the NFS share with async instead of sync (your /etc/exports currently uses sync) and see if that closes the speed gap. When you rsync over SSH, the remote rsync process writes asynchronously (quickly): the writes land in the server's cache and return right away. But when writing to the synchronously exported NFS share, the writes aren't confirmed immediately: the NFS server waits until they've hit disk (or more likely the controller cache) before sending confirmation to the NFS client that the write was successful.
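To get a feel for how much per-file commit latency can cost, here is a small local experiment in Python. It is only an analogy, not a model of the NFS protocol: it writes the same set of files twice, once forcing each file to stable storage with fsync before moving on (loosely like a sync export) and once leaving the data in the page cache (loosely like async). The function names and the file count and size are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def write_files(directory, count, size, commit_each_file):
    """Write `count` files of `size` bytes; return the elapsed seconds.

    With commit_each_file=True every file is fsync'ed before it is closed,
    so the loop cannot continue until the storage layer confirms the write.
    """
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"file{i:04d}.bin"), "wb") as f:
            f.write(payload)
            if commit_each_file:
                f.flush()              # push Python's buffer to the kernel
                os.fsync(f.fileno())   # ask the kernel to commit to disk
    return time.perf_counter() - start


def compare(count=200, size=64 * 1024):
    """Return (seconds_committed, seconds_cached) for identical data."""
    with tempfile.TemporaryDirectory() as committed_dir, \
            tempfile.TemporaryDirectory() as cached_dir:
        committed = write_files(committed_dir, count, size, True)
        cached = write_files(cached_dir, count, size, False)
    return committed, cached


if __name__ == "__main__":
    committed, cached = compare()
    print(f"fsync per file: {committed:.3f}s, cached: {cached:.3f}s")
```

The size of the gap depends heavily on the disk and controller cache, so treat the numbers as a qualitative hint only.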

If 'async' fixes your problem, be aware that if something happens to the NFS server mid-write you may well end up with inconsistent data on disk. As long as this NFS mount isn't the primary storage for this (or any other) data, you'll probably be fine. Of course, you'd be in the same boat if you pulled the plug on the NFS server during or shortly after the rsync-over-SSH run (e.g. rsync returns having 'finished', the NFS server crashes, and uncommitted data in the write cache is lost, leaving inconsistent data on disk).

Although not an issue with your test (rsyncing new data), do be aware that rsync over SSH can make significant CPU and IO demands on the remote server before a single byte is transferred, while it calculates checksums and generates the list of files that need to be updated.
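To illustrate the file-list step, here is a simplified Python sketch of the kind of comparison rsync's default quick check is based on: a file counts as changed if it is missing on the destination or if its size or modification time differs. This is not rsync's actual implementation, the function names are invented, and it leaves out the checksum work rsync does when it delta-transfers files that differ. It does show why every file on both sides has to be stat'ed before any data moves.

```python
import os


def build_file_list(root):
    """Map each relative file path under root to (size, whole-second mtime)."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root)
            entries[rel] = (st.st_size, int(st.st_mtime))
    return entries


def files_needing_update(src_root, dst_root):
    """Return sorted relative paths that are missing or differ on dst_root."""
    src = build_file_list(src_root)
    dst = build_file_list(dst_root)
    return sorted(path for path, meta in src.items() if dst.get(path) != meta)
```

With an empty destination, as in your test, every source file ends up on the list, so this phase is cheap compared with the transfer itself.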

Solution 2:

NFS is a sharing protocol, while Rsync is optimized for file transfers; there are lots of optimizations which can be done when you know a priori that your goal is to copy files around as fast as possible instead of providing shared access to them.

This should help: http://en.wikipedia.org/wiki/Rsync