Appropriate network file system for huge (5+ GB) files

Solution 1:

Since you are doing performance analysis, the first question should be: "What data am I basing this assumption on? Are there network traces or other performance measurements that would support this hypothesis?"

There are a lot of possible bottlenecks in such a system, and the network filesystem would be the last thing I questioned, especially since you do not appear to write significant amounts of data, and locking/concurrency with its accompanying latency would be the most likely cause of bottlenecks with NFS.

On the other hand, 32 concurrent requests for 8 GB of data each are likely to overload any single SATA disk, given the rather limited IOPS rating of a single spindle. A simple calculation assuming a read block size of 64 KB per request and 100 IOPS for the disk yields a rate of just 6.4 MB/s for random read requests - which is what you will get with that number of simultaneous readers unless you are caching the data heavily.
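The arithmetic behind that figure is simple; a minimal sketch (assuming the 100 IOPS and 64 KB block size stated above, with decimal KB/MB units):

```python
# Back-of-the-envelope random-read throughput for a single SATA disk.
# Assumed inputs (from the text above): ~100 IOPS per disk, 64 KB reads.

def random_read_throughput_mb_s(iops: int, block_size_kb: int) -> float:
    """Sustained random-read throughput in MB/s (decimal units)."""
    return iops * block_size_kb / 1000.0  # KB/s -> MB/s

throughput = random_read_throughput_mb_s(iops=100, block_size_kb=64)
print(f"{throughput:.1f} MB/s")  # 6.4 MB/s, shared among all 32 readers
```

With 32 readers competing for that one disk, each client sees only a fraction of even this modest number, which is why caching (or more spindles) matters so much.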

You should take a good look at the performance indicators provided by iostat to see whether your disk is being overloaded. If it is, take appropriate measures (i.e. get a decent storage subsystem capable of coping with the load) to remedy the situation.

Solution 2:

This is most likely not a limitation of NFS you are encountering here.

Also take into account that those 5 GB take at the very least 40 seconds to transfer at gigabit wire speed - for each client. You have 32 of them hammering head2, and they are not likely to request the same blocks at the same time. Add Ethernet, TCP/UDP and NFS overhead, and you will soon reach the minutes you described.
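You can verify the "minutes" claim with pure arithmetic; a sketch assuming an ideal gigabit link with zero protocol overhead and the 5 GB file size and 32 clients from the question:

```python
# Best-case wire-speed transfer times on gigabit Ethernet (ignoring all
# Ethernet/TCP/NFS overhead, so real numbers will be worse).
GIGABIT_BYTES_PER_S = 1e9 / 8    # 125 MB/s theoretical payload ceiling
file_size_bytes = 5e9            # the 5 GB file from the question
clients = 32

# One client with the whole link to itself:
seconds_per_client = file_size_bytes / GIGABIT_BYTES_PER_S   # 40 s

# All 32 clients sharing the server's single gigabit uplink:
seconds_when_shared = seconds_per_client * clients           # 1280 s

print(f"alone: {seconds_per_client:.0f} s, "
      f"shared: {seconds_when_shared / 60:.0f} min")
```

So even before any protocol overhead, 32 clients pulling distinct 5 GB files through one gigabit port need on the order of 20 minutes each.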

So, before you try to swap out NFS for anything else (yes, there are protocols with less overhead), check each part of the path the data takes, starting at the disk subsystem, for possible bottlenecks. Benchmark if in doubt.

Removing those bottlenecks (if any) with additional or better hardware will be easier than changing your whole software setup.

Solution 3:

I have an environment that is quite similar (lots of blade servers as worker nodes, and huge files, each several GB or even TB). I use the Hadoop Distributed File System (HDFS). Check out:

http://en.wikipedia.org/wiki/Hadoop_Distributed_File_System#Hadoop_Distributed_File_System

http://hadoop.apache.org/docs/r0.18.0/hdfs_design.pdf

You might find it a bit more complex to set up than NFS though.