Filesystem redundancy + speed

Looking for some input, and to hear whether anyone has conquered this problem with a solution they feel confident about.

Looking to set up a fault-tolerant web environment. So the setup is a few nodes behind a load balancer, and the web devs can ssh into 1 server to edit the code and such.

I'm thinking of glusterfs, but putting a glusterfs filesystem under the doc root led to around a 20-30% decrease in the pages the webserver could serve. I expected this, since I'm only on Ethernet and not InfiniBand or such.

So I was thinking about using glusterfs + inotify. I'd have an inotify script running that monitors the docroot and the gluster mount for changes and does an rsync on whatever file/dir gets changed. This way Apache can serve from the local disk and not gluster, but it gives the effect of being served from a clustered filesystem.
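
To illustrate the idea, here's a minimal sketch of the gluster-to-local direction, using the third-party watchdog library rather than raw inotify. The paths and rsync flags are placeholders, not my actual setup; a second copy of this, watching the local docroot and pushing to the gluster mount, would cover the other direction.

```python
import os
import subprocess
import time

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

GLUSTER_MOUNT = "/mnt/gluster/docroot"  # shared copy (placeholder path)
LOCAL_DOCROOT = "/var/www/docroot"      # local copy Apache serves (placeholder path)

class SyncHandler(FileSystemEventHandler):
    """On any change under the gluster mount, rsync the affected directory
    down to the local docroot so Apache keeps serving from local disk."""

    def on_any_event(self, event):
        # Sync the parent directory of whatever changed; --delete makes
        # removals propagate as well as creations and edits.
        changed = event.src_path if event.is_directory else os.path.dirname(event.src_path)
        rel = os.path.relpath(changed, GLUSTER_MOUNT)
        src = os.path.join(GLUSTER_MOUNT, rel, "")
        dst = os.path.join(LOCAL_DOCROOT, rel, "")
        subprocess.call(["rsync", "-a", "--delete", src, dst])

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(SyncHandler(), GLUSTER_MOUNT, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```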

My only issue with that is I'd need to run 2 of the inotify scripts, and for the file count we're running, adding all the inotify watches means I'd be using around 700 MB of RAM for them both.
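
For a sense of scale, that figure is at least consistent with a back-of-the-envelope estimate like this; the per-path cost and the file count below are assumptions, not measurements of my actual setup.

```python
# Rough arithmetic behind the watcher memory cost. Both numbers here
# are assumptions for illustration, not measured figures.
BYTES_PER_WATCHED_PATH = 1024        # assumed: kernel inotify watch + script bookkeeping
watched_paths_per_script = 350_000   # hypothetical file/dir count under the docroot
scripts = 2                          # one watcher per sync direction

total_bytes = scripts * watched_paths_per_script * BYTES_PER_WATCHED_PATH
print(f"~{total_bytes / 2**20:.0f} MiB")  # ~684 MiB, i.e. roughly that 700 MB figure
```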

So anyone have any advice or pointers?

thanks!

EDIT

Think of it like a webhost. Clients ssh into 1 server, but the files they create/edit/delete need to show up on all the other nodes.

The reverse also needs to be true: if the webserver creates files, they need to end up on all the nodes too.

So that throws a straight rsync out since it's too slow.


Solution 1:

Oh wow, I'm having flashbacks to a past job, where GFS was used for the exact rationale you're describing. The scenario: in excess of 2000 customers running their apps on a number of large-scale clouds.

Basically, you can't do what you think you want to do. You cannot get a clustered or network filesystem that will work at anywhere near the speed of a local filesystem. Let me emphasise that for a second: CANNOT. If you think you can, you're deluding yourself. If someone else says they can, they're lying. It's simple mathematics: disk speed + controller IO + network latency + cluster fu must be greater than disk speed + controller IO.

Now, down to the reasons you might be building this, and why what you want to do is useless:

  • Simplicity of deployment: It's not simpler to deploy to one machine and have it automatically run everywhere, because -- get this -- it's not one machine you're deploying to. Sure, you might think it's convenient to only have to copy one instance of the code, but there are plenty of per-appserver things you'll need to do in various situations. In many cases I've personally had to deal with, installing onto a shared filesystem ended up making the deployment process more complicated than it would otherwise have been. The correct answer to the question "how do I deploy to multiple machines" is "automate deployment" (there's a minimal sketch of that after this list).
  • Clustering for reliability: This one borders on a joke for me. Modern hardware is so reliable, and clustering technologies so unreliable, that clustering (especially clustering filesystems) INCREASES your downtime. I've got enough data for a white paper on that topic. Now, some people will say "I've had an EMC SAN running for four years without a production outage", but how much did they pay for that reliability? I've never heard of anyone getting that sort of reliability for less than 7 figures (on a TCO basis), and there was plenty of expertise cost in there, too. The fact that you're asking this question says you don't have that expertise, and I'll bet you're not looking to put 7 figures down, either.
  • Clustering for capacity: This goes back to my opening statements -- any sort of clustered filesystem is slower than a local one. Trying to extract large amounts of performance out of a clustered or network filesystem is a lost cause. It will drive you mad (it certainly did the trick for me).
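
Since "automate deployment" can sound hand-wavy, here's a minimal sketch of the shape it usually takes: loop over the appservers, push the release, run the per-node steps. The host names, paths, and reload command below are assumptions for illustration, not a drop-in script.

```python
# A minimal sketch of "automate deployment": push one release to every
# appserver and run the per-node steps there. Host names, paths, and the
# reload command are illustrative assumptions only.
import subprocess

APPSERVERS = ["web01", "web02", "web03"]  # hypothetical node names
RELEASE_DIR = "/srv/releases/current/"    # local build to ship (assumed)
DOCROOT = "/var/www/docroot/"             # per-node docroot (assumed)

def deploy(host: str) -> None:
    # Copy the release, then do the per-appserver work a shared
    # filesystem can't do for you (here, just a webserver reload).
    subprocess.check_call(["rsync", "-a", "--delete", RELEASE_DIR, f"{host}:{DOCROOT}"])
    subprocess.check_call(["ssh", host, "sudo", "systemctl", "reload", "apache2"])

if __name__ == "__main__":
    for host in APPSERVERS:
        deploy(host)
        print(f"deployed to {host}")
```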

Now that I've been a negative nelly for a screen or so, what can you do? Well, it basically comes down to helping your customers on an individualised level.

You can't build a cookie-cutter, one-size-fits-all hosting infrastructure for infinite scale. That's what my GFS-loving past employer tried to do, and by gum it didn't work then, and I'm confident it cannot be done with currently available development and operational technologies.

Instead, take a bit of time to assess your customers' requirements and help them towards a solution that meets their requirements. You don't necessarily have to do a full analysis of every customer; after the first few, you'll (hopefully) start to see patterns emerge, which will guide you towards a range of "standard" solutions. It becomes "OK, you've got requirements F, P, and Aleph-1, so you'd be best off with solution ZZ-23-plural-Z-alpha -- and here's our comprehensive set of documentation on deploying this solution, and our prices for custom consulting on this solution if you can't implement it yourself are at the bottom".

As far as specifics go, there are too many to list, but I'll give you a few hints:

  • Deploy code to each appserver individually.
  • If you really have to go with shared dynamic assets, use NFS. It's stupidly simple, and has by far the lowest breakage rate. Note that I said shared ASSETS -- not the code, not the config, not the logs, nothing but customer-provided assets.
  • NFS doesn't scale forever though (NetApp propaganda notwithstanding); at some point, your customers will need to move to something else (an example of which I've given before), and you can assist them in moving to a more scalable solution with good documentation and other ready-made assistance.
  • If you're thinking this can be a fire-and-forget business, you're wrong. You've got commodity web hosting (with all of the good points -- low maintenance -- and bad points -- no margins -- that that implies), and specialised web hosting (which is what you're trying to do), and the latter is high maintenance (but with correspondingly high margins).

Solution 2:

Read @Zypher's comment. Read it over and over until you understand the wisdom of those words, see the light, and chase your developers off your production servers and into an appropriate sandbox.
You can borrow my pointy stick. :-)


Reframing your question in that light: "How do I keep the code on my webservers consistent?"
Answer: puppet (or Chef), radmind, or any of the many wonderful configuration/deployment systems out there.

These tools give you a much simpler way of achieving your goal, take up a lot less RAM/CPU, and can be set up to guarantee consistency across all your nodes.
(This portion of the answer is withdrawn based on the edit to the original question.)

There's really only one solution for you that I can think of, and that's a SAN (or NAS device serving up files over NFS).
The reason I'd suggest this route is that you need to have files created by each of the servers available to all the others. Doing massive N-way synchronization will become unwieldy and slow. Centralizing on to a SAN will give better performance, good redundancy (SANs are pretty bullet-proof if you don't cheap out), and the ability to scale up easily as your needs increase.

It isn't without downsides: unless you do a pair of mirrored, redundant SANs with redundant fabric, you will be introducing a single point of failure. SANs are also not cheap, and redundancy just adds more expense.


Note that none of this obviates the need to keep the developers off the production box, unless you're guaranteed that they'll never call you when they break something. At the very least you should be strongly suggesting that they rent a dev environment from you (at a reasonable profit, obviously; something to help pay for the cost of the SAN...)