How can I automatically change the DNS A record to point my site to a secondary server in case of a failure?

My hosting company is charging me a $50/month fee to put my servers on the same VLAN so that I can create a cluster using the Network Load Balancing feature.

I don't really need to split the load between the servers; I am looking for an easy way to create a failover scenario to protect against server failures. However, I consider this fee a little expensive.

Is there any way that I can create a cluster without using the NLB? Maybe something that monitors my primary server and changes the DNS for the domain when it goes down?


Solution 1:

DNS is a poor choice for simulating failover. The reasons are:

  • DNS records are valid for a period of time (the TTL). To achieve near-real-time failover you would need to lower the TTL on a DNS entry so far that almost every request to your site results in a DNS lookup, which will seriously slow the experience for your visitors.
  • There are some suspicions that, even today, DNS entries with very low TTL values are not respected by some ISPs and by older, broken name servers.
  • Using round-robin DNS will not provide failover: with two IPs listed, on average 50% of the requests go to each one, so half of them still reach the failed server. It would work only if higher-level protocols such as HTTP had retrying built in.
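To make the round-robin point concrete, here is a minimal Python sketch. It assumes a strict rotation between the listed addresses and a client that never retries a different address; real resolvers order and cache answers differently, so treat the numbers as an illustration only. The function name and the addresses (taken from the documentation ranges) are made up for the example.

```python
from itertools import cycle


def simulate_round_robin(addresses, dead, requests):
    """Hand out addresses in strict rotation and count requests that hit a dead one.

    Models a client that does not retry another address after a failure.
    """
    rotation = cycle(addresses)
    failed = 0
    for _ in range(requests):
        if next(rotation) in dead:
            failed += 1
    return failed


# Hypothetical addresses; the second server is assumed to be down.
servers = ["192.0.2.10", "192.0.2.20"]
failed = simulate_round_robin(servers, dead={"192.0.2.20"}, requests=1000)
print(f"{failed} of 1000 requests failed")  # 500 of 1000 requests failed
```

With three servers and one of them down, the same model loses a third of the requests, so adding more addresses reduces the damage but never removes it.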

My suggestions would be:

  • If your second server is at the same hosting company, look into renting a real load balancer, either for your own use or as a share of the company's load-balancing infrastructure, which most hosting companies can rent out.
  • Use something like Spread or Linux-HA to assign the service to a floating virtual IP that is passed between your physical servers. The servers in the cluster monitor each other and decide which one currently owns the virtual IP.
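The floating-IP idea can be sketched in a few lines of Python. This is a toy model and not how Spread or Linux-HA actually implement it: it assumes each node records the time of its last heartbeat, and that the preferred node with a fresh heartbeat owns the virtual IP. The `Node` class, `elect_vip_owner`, and the 5-second threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int          # lower number = preferred owner
    last_heartbeat: float  # seconds, same clock as `now`


def elect_vip_owner(nodes, now, dead_after=5.0):
    """Return the name of the preferred node whose heartbeat is still fresh, or None."""
    alive = [n for n in nodes if now - n.last_heartbeat <= dead_after]
    if not alive:
        return None
    return min(alive, key=lambda n: n.priority).name


nodes = [
    Node("web1", priority=1, last_heartbeat=100.0),
    Node("web2", priority=2, last_heartbeat=100.0),
]
print(elect_vip_owner(nodes, now=102.0))   # both healthy: web1
nodes[0].last_heartbeat = 90.0             # web1 stopped sending heartbeats
print(elect_vip_owner(nodes, now=102.0))   # web1 is stale: web2
```

In a real cluster the winner would also have to bring the address up on its own interface and announce it to the network, which is the part the HA software handles for you.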

I highly recommend reading Scalable Internet Architectures by Theo Schlossnagle, as he covers this in great detail.

Solution 2:

You could set up a separate load-balancing box with ldirectord, which monitors your web servers and load-balances between whichever ones are currently live (and, by extension, keeps the site up when one server goes down). We use this solution so that we can reboot either one of our two web servers without affecting the site's uptime.

In fact, if your web servers are Linux boxes, you can run ldirectord on the web servers themselves and use heartbeat to keep ldirectord running on the live box.

This solution allows you to share a common IP address (or more than one) between two or more boxes, and avoid the DNS issue altogether.
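As a rough illustration of what a health-checking balancer does, the Python sketch below filters a pool down to the backends that pass a check and rotates requests over the survivors. It is not ldirectord's code or configuration; the function names and hostnames are hypothetical, and the demo uses a stubbed check so that it runs without network access.

```python
import socket
from itertools import cycle


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def live_backends(backends, check=tcp_check):
    """Filter (host, port) pairs down to those that pass the health check."""
    return [b for b in backends if check(*b)]


def dispatch(backends, requests, check=tcp_check):
    """Assign each request to a live backend in rotation (health checked once up front)."""
    live = live_backends(backends, check)
    if not live:
        raise RuntimeError("no live backends")
    rotation = cycle(live)
    return [(req, next(rotation)) for req in requests]


# Demo with a stubbed health check: pretend web1 is down for a reboot.
backends = [("web1.example.com", 80), ("web2.example.com", 80)]
fake_check = lambda host, port: host != "web1.example.com"
for req, backend in dispatch(backends, ["GET /", "GET /about"], check=fake_check):
    print(req, "->", backend[0])
```

Both requests land on web2 while web1 is out of the pool. A real director repeats the check on a timer and moves traffic at the network level instead of returning a list.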

Solution 3:

One way would be to have your site on two different servers, then have a site monitor at a third location that monitors the connection to the main server (where your DNS usually points). If it detects that the site is down, have the monitor execute a script or hit the update URL of a dynamic DNS provider (such as DtDNS [which I operate] or DynDNS.com) with the IP address of your backup server specified for your domain/hostname. This will update the DNS record and direct traffic to your backup web server. When the main site is back online, the monitor can run another script or URL that puts the "real" IP back.

One key here is to have DNS hosted at a provider that has an API into their system so that your domain can be updated quickly, and that keeps the TTL low enough so that visitors will be redirected relatively quickly.

Another is that the site has to be monitored from an outside, objective location. You cannot run the monitoring on the same web server that hosts the site: if the whole server goes down, the monitoring/update method will go right down with it.
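A monitor along these lines could look like the Python sketch below. It is only a sketch under stated assumptions: `set_a_record` stands in for whatever update URL or API your DNS provider offers (no real provider API is shown), the IP addresses come from the documentation ranges, and the rule of three consecutive failures before switching is an arbitrary choice to avoid flapping on a single missed check.

```python
import http.client
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        return False


class FailoverMonitor:
    """Repoints DNS to the backup after repeated failures, and back on recovery."""

    def __init__(self, check, set_a_record, primary_ip, backup_ip, failures_needed=3):
        self.check = check                # callable returning True when the site is up
        self.set_a_record = set_a_record  # hypothetical hook into the DNS provider
        self.primary_ip = primary_ip
        self.backup_ip = backup_ip
        self.failures_needed = failures_needed
        self.failures = 0
        self.active_ip = primary_ip

    def tick(self):
        """Run one health check and update DNS only if the target changed."""
        if self.check():
            self.failures = 0
            target = self.primary_ip
        else:
            self.failures += 1
            target = self.backup_ip if self.failures >= self.failures_needed else self.active_ip
        if target != self.active_ip:
            self.set_a_record(target)
            self.active_ip = target
        return self.active_ip


# Demo with simulated check results instead of real HTTP requests.
results = iter([True, False, False, False, True])
updates = []
monitor = FailoverMonitor(
    check=lambda: next(results),
    set_a_record=updates.append,
    primary_ip="192.0.2.10",
    backup_ip="198.51.100.20",
)
for _ in range(5):
    monitor.tick()
print(updates)  # ['198.51.100.20', '192.0.2.10']
```

In real use, `check` would be something like `lambda: http_ok("http://www.example.com/")`, `tick()` would run in a loop with a sleep between checks, and the whole thing would live on a machine outside both web servers. The time visitors spend on the dead server is then roughly the detection time plus the record's TTL.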