NFS performance woes on Debian
I am having very inconsistent performance with NFS between two wheezy machines, and I can't seem to nail it down.
Setup:
Machine 1 'video1': Dual 5506 with 12 GB RAM, XFS on an 8x3TB RAID6, exported as 'video1' from /mnt/storage
Machine 2 'storage1': Phenom X2 @ 3.2 GHz with 8 GB RAM, ZFS on 5x2TB, exported as 'storage1' from /mnt/storage1-storage
Local write performance:
mackek2@video1:/mnt/storage/testing$ dd if=/dev/zero of=localwrite10GB bs=5000k count=2000
2000+0 records in
2000+0 records out
10240000000 bytes (10 GB) copied, 16.7657 s, 611 MB/s
Local read performance:
Both machines are connected to the same HP gigabit switch, and iperf gives a rock-solid 940 Mbps both ways.
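(For completeness, the throughput test was just stock iperf run in both directions, along these lines; exact flags may vary with your iperf version:)

mackek2@video1:~$ iperf -s
mackek2@storage1:~$ iperf -c video1 -t 30
mackek2@storage1:~$ iperf -c video1 -t 30 -r    # tradeoff mode: also tests the reverse direction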
My problem is that when I write to the video1 export from storage1, performance is all over the place. For the first few (5-7) gigs of a transfer (I'm hoping to move 30-120 GB AVCHD or MJPEG files around as quickly as possible), performance holds near 900 Mbps, then drops to 150-180 Mbps, and sometimes as low as 30 Mbps. If I restart the NFS kernel server, performance picks back up for a few more gigs.
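(To be specific, by "restart the NFS kernel server" I mean bouncing the kernel NFS service on video1, assuming the stock Debian nfs-kernel-server package:)

mackek2@video1:~$ sudo service nfs-kernel-server restart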
mackek2@storage1:/mnt/video1/testing$ dd if=/dev/zero of=remoteWrite10GB count=2000 bs=5000K
2000+0 records in
2000+0 records out
10240000000 bytes (10 GB) copied, 223.794 s, 45.8 MB/s
mackek2@storage1:/mnt/video1/testing$ dd if=/dev/zero of=remoteWrite10GBTest2 count=2000 bs=5000K
2000+0 records in
2000+0 records out
10240000000 bytes (10 GB) copied, 198.462 s, 51.6 MB/s
mackek2@storage1:/mnt/video1/testing$ dd if=/dev/zero of=bigfile776 count=7000 bs=2000K
7000+0 records in
7000+0 records out
14336000000 bytes (14 GB) copied, 683.78 s, 21.0 MB/s
mackek2@storage1:/mnt/video1/testing$ dd if=/dev/zero of=remoteWrite15GB count=3000 bs=5000K
3000+0 records in
3000+0 records out
15360000000 bytes (15 GB) copied, 521.834 s, 29.4 MB/s
When things are going fast, nfsiostat on the client reports average RTTs of a few milliseconds, but the RTT shoots up to over 1.5 seconds as soon as performance drops. Additionally, the CPU run queue depth jumps to over 8 while the write is happening.
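(For reference, those RTT numbers come from watching the mount during a transfer, roughly like this; the 5-second interval and the mount point are just what I used:)

mackek2@storage1:~$ nfsiostat 5 /mnt/video1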
Now, when reading from the same export, I get a beautiful 890 Mbps, give or take a few Mbps, for the entire read.
mackek2@storage1:/mnt/video1/testing$ dd if=remoteWrite10GBTest2 of=/dev/null
20000000+0 records in
20000000+0 records out
10240000000 bytes (10 GB) copied, 89.82 s, 114 MB/s
mackek2@storage1:/mnt/video1/testing$ dd if=remoteWrite15GB of=/dev/null
30000000+0 records in
30000000+0 records out
15360000000 bytes (15 GB) copied, 138.94 s, 111 MB/s
The same thing happens the other way around with storage1 as the NFS server: the run queue depth jumps, speeds drop to a crawl, and I pull my hair out.
I have tried increasing the number of NFS daemons to as many as 64, and it still sputters out after a few gigs.
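(For reference, the daemon count is set on Debian in /etc/default/nfs-kernel-server; a sketch of what I changed, followed by the same restart as above and a check of the running thread count:)

# /etc/default/nfs-kernel-server on video1
RPCNFSDCOUNT=64

mackek2@video1:~$ sudo service nfs-kernel-server restart
mackek2@video1:~$ cat /proc/fs/nfsd/threads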
You don't include your mount or export options, so there are a number of NFS settings that could be impacting performance. Based on my experience, I'd recommend trying the following options for the best combination of NFS performance and reliability:
Mount Options:
tcp,hard,intr,nfsvers=3,rsize=32768,wsize=32768
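For example, on storage1 the mount of the video1 export would look something like this in /etc/fstab (a sketch based on the paths in your question; adjust to match your actual mount point). Unmount and remount afterwards so the new options take effect:

video1:/mnt/storage  /mnt/video1  nfs  tcp,hard,intr,nfsvers=3,rsize=32768,wsize=32768  0  0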
Export Options:
async
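On the server side (video1), the matching /etc/exports entry might look like the following; rw and no_subtree_check are my additions here, not something from your setup, so adjust to taste. Run exportfs -ra afterwards to apply the change without a full restart:

/mnt/storage  storage1(rw,async,no_subtree_check)

video1$ sudo exportfs -ra

Keep in mind that async lets the server acknowledge writes before they reach disk, which is where much of the write speedup comes from, at the cost of possible data loss if the server goes down mid-write.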