I am trying to get my head around load balancing as a way to ensure availability and redundancy, keeping users happy when things go wrong, rather than load balancing for the sake of offering blistering speed to millions of users.

We're on a budget and trying to stick to stuff where there's plenty of knowledge available, so running Apache on Ubuntu VPSes seems like the strategy until some famous search engine acquires us (Saturday irony included, please note).

At least to me, it's a complete jungle of different solutions out there. Apache's own mod_proxy and HAProxy are two that we found in a quick Google search, but having zero experience with load balancing, I have no idea what would be appropriate for our situation, or what we should look for when choosing a solution to solve our availability concerns.

What is the best option for us? What should we do to keep availability high while staying within our budget?


The solution I use, which can easily be implemented on VPSes, is the following:

  • DNS is round-robined across 6 different valid IP addresses.
  • I have 3 load balancers with identical configuration, using corosync/pacemaker to distribute the 6 IP addresses evenly (so each machine gets 2 addresses).
  • Each of the load balancers runs an nginx + Varnish configuration. Nginx accepts the connections, does rewrites and some static serving, and passes requests back to Varnish, which does the load balancing and caching.
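As a rough illustration, the nginx half of one of those balancers could look something like this (a minimal sketch; the domain, paths, and the assumption that Varnish listens locally on port 6081 are all mine):

    # /etc/nginx/sites-available/frontend -- hypothetical sketch
    server {
        listen 80;
        server_name example.com;

        # big static files (video, audio) served straight from disk or NFS,
        # bypassing the cache entirely
        location /media/ {
            root /var/www;
        }

        # everything else goes to Varnish on the same box, which caches
        # and balances across the application backends
        location / {
            proxy_pass http://127.0.0.1:6081;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }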

This architecture has the following advantages, in my biased opinion:

  1. corosync/pacemaker will redistribute the IP addresses if one of the LBs fails.
  2. nginx can be used to terminate SSL and to serve certain types of files directly from the filesystem or NFS without going through the cache (big videos, audio, or other large files).
  3. Varnish is a very good load balancer, supporting weights and backend health checking, and it does an outstanding job as a reverse proxy (see the VCL sketch after this list).
  4. If more LBs are needed to handle the traffic, just add more machines to the cluster and the IP addresses will be rebalanced across all of them. You can even do it automatically (adding and removing load balancers). That's why I use 6 IPs for 3 machines: to leave some room for growth.
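To make point 3 concrete, here is roughly what the Varnish side could look like, sketched in VCL 4 syntax with the directors vmod (the backend addresses and the /health probe URL are invented for the example):

    # /etc/varnish/default.vcl -- hypothetical sketch
    vcl 4.0;
    import directors;

    probe healthcheck {
        .url = "/health";      # assumed health endpoint on the backends
        .interval = 5s;
        .timeout = 2s;
        .window = 5;
        .threshold = 3;
    }

    backend app1 { .host = "10.0.0.11"; .port = "8080"; .probe = healthcheck; }
    backend app2 { .host = "10.0.0.12"; .port = "8080"; .probe = healthcheck; }

    sub vcl_init {
        # the random director takes per-backend weights;
        # directors.round_robin() is the unweighted alternative
        new pool = directors.random();
        pool.add_backend(app1, 1.0);
        pool.add_backend(app2, 1.0);
    }

    sub vcl_recv {
        set req.backend_hint = pool.backend();
    }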

In your case, having physically separated VPSes is a good idea, but it makes the IP sharing more difficult. The objective is a fault-resistant, redundant system, and some load balancing/HA configurations end up defeating that by adding a single point of failure (like a single load balancer receiving all the traffic).

I also know you asked about Apache, but these days we have specific tools better suited to the job (like nginx and Varnish). Leave Apache to run the applications on the backend and front it with other tools (not that Apache can't do good load balancing or reverse proxying; it's just a question of offloading different parts of the job to more services so each part can do its share well).


HAProxy is a good solution. The config is fairly straightforward.

You'll need another VPS instance to sit in front of at least 2 other VPSes, so for load balancing/failover you need a minimum of 3 VPSes.
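For reference, a minimal haproxy.cfg for that three-VPS layout might look roughly like this (the IPs, server names, and /health check URL are placeholders):

    # /etc/haproxy/haproxy.cfg -- hypothetical sketch
    global
        daemon
        maxconn 2048

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend www
        bind *:80
        default_backend webfarm

    backend webfarm
        balance roundrobin
        option httpchk GET /health        # assumed health endpoint
        server web1 10.0.0.11:80 check
        server web2 10.0.0.12:80 check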

A few other things to think about:

  1. SSL termination. If you use HTTPS, the connection should terminate at the load balancer; behind the load balancer, traffic can pass over an unencrypted connection (see the sketch after this list).

  2. File storage. If a user uploads an image, where does it go? Does it just sit on one machine? You need some way to share files instantly between machines. You could use Amazon's S3 service to store all your static files, or you could have another VPS that acts as a file server, but I would recommend S3 because it's redundant and insanely cheap.

  3. Session info. Each machine in your load balancer config needs to be able to access the user's session info, because you never know which machine they will hit (sticky sessions, shown in the sketch below, are one workaround).

  4. DB. Do you have a separate DB server? If you only have one machine right now, how will you make sure your new machine has access to the DB server? And if it's a separate VPS DB server, how redundant is that? It doesn't necessarily make sense to have highly available web front ends and a single point of failure in one DB server; now you need to consider DB replication and slave promotion as well.
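To make points 1 and 3 concrete, here is how they might look in HAProxy (version 1.5 or later, which can terminate SSL itself; the certificate path and addresses are made up). Cookie-based stickiness pins each user to one backend, which sidesteps shared session storage, though a shared session store is the more robust answer:

    frontend www
        bind *:443 ssl crt /etc/ssl/private/site.pem   # TLS terminates here
        default_backend webfarm

    backend webfarm
        balance roundrobin
        # insert a cookie so each user keeps hitting the backend
        # that holds their session
        cookie SRV insert indirect nocache
        server web1 10.0.0.11:80 check cookie web1
        server web2 10.0.0.12:80 check cookie web2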

So I've been in your shoes. That's the trouble with going from a website that does a few hundred hits a day to a real operation: it gets complex quickly. Hope that gave you some food for thought :)


My vote is for Linux Virtual Server as the load balancer. This makes the LVS director a single point of failure as well as a bottleneck, but:

  1. The bottleneck is not, in my experience, a problem; the LVS redirection step is layer-3, and extremely (computationally) cheap.
  2. The single point of failure should be dealt with by having a second director, with the two controlled by Linux HA.

Cost can be kept down by having the first director be on the same machine as the first LVS node, and the second director on the same machine as the second LVS node. Third and subsequent nodes are pure nodes, with no LVS or HA implications.

This also leaves you free to run any web server software you like, as the redirection's taking place below the application layer.
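For a flavor of what this involves, configuring the director with ipvsadm is just a few commands (a hypothetical direct-routing setup; the addresses are placeholders):

    # create the virtual service on the shared IP, round-robin scheduling
    ipvsadm -A -t 192.0.2.10:80 -s rr
    # attach the real servers using direct routing (-g)
    ipvsadm -a -t 192.0.2.10:80 -r 10.0.0.11 -g
    ipvsadm -a -t 192.0.2.10:80 -r 10.0.0.12 -g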


How about this chain?

round-robin DNS > HAProxy on both machines > nginx to separate out static files > Apache

Possibly also use ucarp or heartbeat to ensure HAProxy always answers. Stunnel would sit in front of HAProxy if you need SSL too.
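ucarp in particular is simple to run; something like this on each balancer, with a lower --advskew on the preferred master (the interface, IPs, password, and scripts are placeholders):

    # float 192.0.2.10 between the two HAProxy boxes
    ucarp --interface=eth0 --srcip=10.0.0.2 --vhid=1 --pass=secret \
          --addr=192.0.2.10 --advskew=0 \
          --upscript=/etc/ucarp/vip-up.sh \
          --downscript=/etc/ucarp/vip-down.sh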


You may want to consider using proper clustering software. Red Hat's (or CentOS's) Cluster Suite, or Oracle's Clusterware, can be used to set up active/passive clusters, restart services, and fail over between nodes when there are serious issues. This is essentially what you're looking for.

All of these cluster solutions are included in the respective OS licenses, so you're probably fine on cost. They do require some manner of shared storage: either an NFS mount, or a physical disk accessed by both nodes with a clustered file system. An example of the latter would be SAN disks with multiple-host access allowed, formatted with OCFS2 or GFS. I believe you can use VMware shared disks for this.

The cluster software is used to define 'services' that run on the nodes all the time, or only when a given node is 'active'. The nodes communicate via heartbeats and also monitor those services; they can restart a service when they notice a failure, and reboot the node if it can't be fixed.

You would basically configure a single 'shared' IP address that traffic is directed to. Then Apache, and any other necessary services, can be defined as well and run only on the active server. The shared disk would hold all your web content, any uploaded files, and your Apache configuration directories (httpd.conf, etc.).
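RHCS and Clusterware each have their own configuration format, so purely as an illustration of the idea, here is the same shared-IP-plus-Apache service sketched with Pacemaker's crm shell (the stack the first answer mentions); the IP and config path are placeholders:

    # a floating IP and an Apache instance, grouped so they always
    # run together on whichever node is currently active
    crm configure primitive vip ocf:heartbeat:IPaddr2 \
        params ip=192.0.2.10 cidr_netmask=24 \
        op monitor interval=10s
    crm configure primitive web ocf:heartbeat:apache \
        params configfile=/etc/httpd/conf/httpd.conf \
        op monitor interval=30s
    crm configure group webservice vip web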

In my experience, this works incredibly well.

  • There's no need for DNS round robin or any other single-point-of-failure load balancer: everything hits one IP/FQDN.
  • User uploaded files go into that shared storage, and thus don't care if your machine fails over.
  • Developers upload content to that single IP/FQDN with zero additional training, and it's always up to date if it fails over.
  • The administrator can take the offline machine, patch the heck out of it, reboot, etc., then fail the active node over, making an upgrade take minimal downtime.
  • That now out-of-date node can be kept unpatched for a while, making a fail-back an equally easy process (quicker than VMware snapshots).
  • Changes to Apache's configuration are shared, so nothing weird happens during a failover because an admin forgot to make changes on the offline box.


--Christopher Karel