Handling user uploads to a web server cluster

GFS2 or OCFS2 on top of DRBD lets a pair of servers run dual-primary as clustered storage, with your web frontends pulling from that shared pair. Either filesystem would also let multiple heads share a single FC-attached LUN. Alternatively, you could use NFS to present a single shared filesystem to each of the web frontends. If you use NFS on top of DRBD, remember that you can only run it in primary/secondary mode because there is no cluster locking; that could cut your potential throughput in half.
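
As a rough sketch of the dual-primary side, a DRBD resource for this looks something like the following (DRBD 8.x syntax; the resource name, hostnames, devices, and addresses are all placeholders, and the GFS2/OCFS2 cluster stack still has to be configured separately):

    # /etc/drbd.d/web.res -- hypothetical resource definition
    resource web {
        protocol C;               # synchronous replication; required for dual-primary
        net {
            allow-two-primaries;  # both nodes may be Primary at the same time
        }
        on web1 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.1:7788;
            meta-disk internal;
        }
        on web2 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }

After "drbdadm create-md web" and "drbdadm up web" on both nodes, you promote both with "drbdadm primary web" and put a cluster-aware filesystem such as OCFS2 on /dev/drbd0; a non-cluster filesystem here will eat your data.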

GlusterFS sounds more like what you're looking for. It has some unique quirks, e.g. when a file is requested on a node that doesn't hold a copy yet, the metadata lookup says it exists, so it gets transferred from one of the replica nodes and then served. The first request on a given node will therefore see some lag, depending on the file size.
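
For a sense of how little setup the basic case needs, here's a sketch of a two-node replicated volume (hostnames and brick paths are placeholders; pick your own replica count):

    # Run once, on web1 (web2 being the hypothetical second node):
    gluster peer probe web2
    gluster volume create uploads replica 2 web1:/bricks/uploads web2:/bricks/uploads
    gluster volume start uploads

    # On each web frontend, mount via the native FUSE client:
    mount -t glusterfs web1:/uploads /var/www/uploads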

OpenAFS is another possibility. You have shared storage, and each client keeps a local pool (cache) of recently used items. If the storage engine goes down, those local pools can still serve what they already hold.
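
The "local pool" is the AFS client cache, sized in the client's cacheinfo file, and it's read-only volume replicas that keep data reachable when a fileserver dies. A rough sketch, with hypothetical paths, sizes, and volume/server names:

    # /usr/vice/etc/cacheinfo -- format is mountpoint:cachedir:size in 1K blocks
    /afs:/usr/vice/cache:2000000

    # Replicate a volume so reads survive a fileserver outage:
    vos addsite fs2 /vicepa web.uploads   # add a read-only site on server fs2
    vos release web.uploads               # push the data out to the replica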

Hadoop's HDFS is another alternative that just 'works'. It's a bit complicated to set up, but it would also meet your requirements. Bear in mind that with any replicated distributed filesystem you're going to store a lot of duplicated content.
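
Day-to-day use once the cluster is up looks roughly like this (the paths and replication factor are arbitrary; note that HDFS isn't POSIX, so serving the files to a web server usually means a FUSE mount or fetching over HTTP/WebHDFS):

    hdfs dfs -mkdir -p /uploads
    hdfs dfs -put avatar.jpg /uploads/         # blocks replicated per dfs.replication
    hdfs dfs -setrep -w 3 /uploads/avatar.jpg  # force three copies of this one file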

Another quick-and-dirty method would be to keep all static/uploaded content on a single machine and run Varnish on each of your web frontends to maintain a RAM-cached copy of that single canonical copy. If the single machine fails, Varnish will keep serving the items it has cached until their grace period expires; new items would simply be lost.
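
A minimal VCL sketch of that grace behaviour, assuming Varnish 4+ and a placeholder backend address:

    vcl 4.0;

    backend uploads {
        .host = "10.0.0.10";    # the single machine holding the canonical copies
        .port = "80";
    }

    sub vcl_backend_response {
        set beresp.ttl = 1h;    # how long an object is considered fresh
        set beresp.grace = 24h; # how long a stale copy may still be served
                                # if the backend can't be reached
    }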

Much of this comes down to how reliable a backend you need. Distributed filesystems where your web frontends are themselves replica nodes will probably have the edge on speed, since local reads don't involve a network round trip; but with gigE and 10G cards being cheap, you probably won't experience significant latency either way.


All clustered filesystems have one central weakness: if enough nodes go offline, the cluster loses quorum and the whole filesystem becomes unusable, and the nodes that are still up may not deal with that gracefully.

For example, assume that you have 30 servers in a rack and want to share their local space. You build a cluster filesystem, and you even build it so that if any one node goes down, the data is replicated on enough other nodes that there is no problem. So far, so good. Then a card in the Ethernet switch that interconnects all of the nodes in your cluster dies, cutting off communication to 15 of your 30 nodes. The questions you need to ask yourself are:

  1. Is this scenario acceptable to you, and if so, how graceful is the failure? Do processes hang until communication is restored, or do you have to log in and reboot every system to regain control?
  2. Will your customers hang you out to dry when you suffer a switch or power failure in the rack? If so, consider spreading your nodes out across the data center, or having each node feed into two switches with bonded interfaces (see the sketch after this list). Some switch magic will need to occur too, so find a network admin.
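
As a sketch of the bonding half of that advice (Debian-style ifupdown syntax; interface names and addresses are placeholders, and active-backup is used here because it needs no switch cooperation, unlike LACP):

    # /etc/network/interfaces -- eth0 and eth1 each uplink to a different switch
    auto bond0
    iface bond0 inet static
        address 10.0.0.5
        netmask 255.255.255.0
        bond-slaves eth0 eth1
        bond-mode active-backup   # pure failover; 802.3ad/LACP is where the
                                  # "switch magic" comes in
        bond-miimon 100           # check link state every 100 ms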

Think forward a couple of steps and consider what the system will do under failure of any major component, including network and power cables.

Now you are ready for a clustered file system and all the million new jargon words you didn't know before.


We use either NFS from NetApp boxes or OCFS2 volumes on FC LUNs. I don't know if they're the best options, but they've worked for us for years. Both perform perfectly adequately, though I personally prefer the OCFS2-on-FC-LUNs option, as I'm more of a storage guy. I guess it really comes down to what shared-storage infrastructure you already have and are comfortable with.
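
For what it's worth, the frontend side of both options is just a mount (server names, exports, and device paths below are placeholders):

    # NFS export from the filer:
    mount -t nfs netapp1:/vol/uploads /var/www/uploads

    # OCFS2 on a shared FC LUN (the o2cb cluster stack must already be running):
    mount -t ocfs2 /dev/mapper/uploads-lun /var/www/uploads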