Load balancing / Failover for multiple VPS's set across different datacenters

Solution 1:

Notes:

How much are you willing to spend? I have yet to see anyone who relies on VPSs and is actually willing to pay for surviving a datacenter failure.

Regarding your drawings:

The failure in the first one holds if (and only if) the load balancer is a single machine. If it is a single system (as in, a system built from multiple hosts), it no longer holds.

SPA (Shortest possible answer):

  • Datacenter power failure failover

Really short answer: You need a service IP that is available in all your locations, and you need to set up BGP routing.

A little bit longer: Typically this is done by using BGP and announcing the IP from 2 different locations. You can set it up so that the IP is announced from both all the time, but one announcement has a lower preference than the other. This way, under normal circumstances your traffic goes to only one site; if that site fails, its BGP route is dropped and traffic switches over to the announcement that is still available.
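To make the preference idea concrete, here is a toy model in Python. It is only a sketch: real BGP best-path selection has many more steps, and how you actually influence the preference depends on your upstreams. The names `Announcement`, `active_site`, `dc-a` and `dc-b` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Announcement:
    site: str
    preference: int  # higher wins in this toy model (an assumption, not real BGP semantics)
    up: bool = True  # False once the site's BGP session has dropped


def active_site(announcements):
    """Return the site that currently receives traffic, or None if no route is left."""
    live = [a for a in announcements if a.up]
    if not live:
        return None
    return max(live, key=lambda a: a.preference).site


# Both sites announce the same service IP; dc-a is preferred.
routes = [Announcement("dc-a", 200), Announcement("dc-b", 100)]
before = active_site(routes)

# dc-a fails, its route is dropped, and the remaining announcement takes over.
routes[0].up = False
after = active_site(routes)

print(before, after)
```

The point is that nothing has to be reconfigured at failover time: the backup announcement is already there and simply becomes the best remaining route.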

We have a few setups similar to this; the typical layout is:

(per location):

  • 2 loadbalancers

    This is also where BGP runs and announces its IPs. Usually Quagga plus some IPVS setup (we use keepalived).

  • n servers to handle the load (FE)

The failure cases:

  • Any 1 Loadbalancer (at a single site) fails

    • Handled by keepalived; the other LB just continues its work
  • Up to k of the n FEs fail (k being the number of FEs that can fail without us experiencing issues)

    • Handled by the LBs; a health check removes them from the pool and they won't receive any more traffic
  • k+1 or more FEs fail (at a single site)

    • Handled by BGP. We kill the BGP session on the LBs at the site where too many FEs have failed. The other location takes over
  • any major outage at a single site

    • Handled by BGP, the BGP session will be dropped and the other location jumps in
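The failure cases above boil down to a small per-site decision, roughly sketched below. This is not our actual tooling, just an illustration; `site_action` and its return values are hypothetical names, and k is taken as the number of FEs that can fail without issues.

```python
def site_action(lbs_up: int, fes_failed: int, k: int) -> str:
    """Decide what a single site should do, given its current health.

    lbs_up:     number of load balancers still alive at this site
    fes_failed: number of FEs currently failing their health check
    k:          number of FEs that may fail without causing issues
    """
    if lbs_up == 0:
        # Major outage: nobody is left to hold the BGP session,
        # so the route disappears and the other site takes over.
        return "route-lost"
    if fes_failed > k:
        # Too many FEs are gone: deliberately kill the BGP session
        # so the other location takes over.
        return "withdraw-route"
    # One LB down is covered by keepalived; up to k failed FEs are
    # simply taken out of the pool by the LB health checks.
    return "serve"


print(site_action(lbs_up=1, fes_failed=2, k=2))
print(site_action(lbs_up=2, fes_failed=3, k=2))
```

In a real setup this logic is spread across keepalived, the LB health checks, and whatever script stops the BGP daemon; the function only shows where the thresholds sit.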

I'm sorry, I'm not in the mood right now to go further into the details of doing this manually. My guess is you'll be better (and cheaper) off renting a load balancer service that does the magic for you. I've read that Amazon provides these, but I don't know whether they can be used without the rest of their infrastructure.