Solution 1:

First off, I completely agree with @Alnitak that DNS isn't designed for this sort of thing, and best practice is to not (ab)use DNS as a poor man's load balancer.

My question now is... are there any best practices / methods / rules of thumb to weight round robin distribution using the TTL attribute of DNS records?

To answer the question on its own premise, the basic approach to weighted round robin using DNS is to:

  • Adjust the relative occurrence of records in authoritative DNS responses. For example, if Server A is to have 1/3 of traffic and Server B is to have 2/3, then 1/3 of authoritative DNS responses to DNS proxies would contain only A's IP, and 2/3 of responses only B's IP. (If 2 or more servers share the same 'weight', they can be bundled into one response.)
  • Keep the DNS TTL low so that unbalanced load is evened out relatively quickly. Because the downstream DNS proxies have very uneven numbers of clients behind them, you'd want to re-shuffle records frequently.

Amazon's Route 53 DNS service uses this method.
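
To make the first bullet concrete, here is a minimal Python sketch of picking which single IP goes into each authoritative response. It only illustrates the idea; it is not how Route 53 or any particular DNS server implements it. The addresses, the weights and the names `WEIGHTED_RECORDS`, `pick_answer` and `simulate` are all made up for the example.

```python
import random
from collections import Counter

# Hypothetical weights: server A should get 1/3 of responses, server B 2/3.
WEIGHTED_RECORDS = {"192.0.2.10": 1, "192.0.2.20": 2}


def pick_answer(records, rng=random):
    """Return the one IP to put in a single authoritative response."""
    ips = list(records)
    weights = [records[ip] for ip in ips]
    # random.choices draws proportionally to the given weights.
    return rng.choices(ips, weights=weights, k=1)[0]


def simulate(records, n, seed=0):
    """Count how often each IP is handed out over n responses."""
    rng = random.Random(seed)
    return Counter(pick_answer(records, rng) for _ in range(n))


if __name__ == "__main__":
    print(simulate(WEIGHTED_RECORDS, 30000))
```

Note that this only controls the mix of responses. How much traffic each server really receives still depends on how many clients sit behind each caching proxy, which is why the low TTL matters.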

The amount of Bandwidth (not requests) exceeds what one single server with ethernet can handle. So I need a balancing solution that distributes the bandwidth to several servers.

Right. So as I understand it, you have some sort of 'cheap' download / video distribution / large-file service, where the total service bitrate exceeds 1 Gbit/s.

Without knowing the exact specifics of your service and your server layout, it's hard to be precise. But a common solution in this case is:

  • DNS round robin to two or more TCP/IP or HTTP level load balancer instances.
  • Each load balancer instance being highly available (2 identical load balancers cooperating on keeping one IP address always on).
  • Each load balancer instance using weighted round robin or weighted random connection handling to the backend servers.
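
As a rough illustration of the last bullet, here is a small Python sketch of weighted round robin at the load balancer. It uses the "smooth" variant, which interleaves picks instead of sending bursts to the heaviest backend. Real load balancers have their own implementations; the class name and the backend names and weights below are assumptions for the example.

```python
class SmoothWeightedRoundRobin:
    """Pick backends in proportion to their weights, interleaved evenly."""

    def __init__(self, weights):
        # weights: mapping of backend name -> positive integer weight
        self.weights = dict(weights)
        self.current = {name: 0 for name in self.weights}
        self.total = sum(self.weights.values())

    def next(self):
        # Every backend earns credit equal to its weight on each round...
        for name, weight in self.weights.items():
            self.current[name] += weight
        # ...the backend with the most credit is chosen...
        chosen = max(self.current, key=self.current.get)
        # ...and pays back the total weight so the others can catch up.
        self.current[chosen] -= self.total
        return chosen


if __name__ == "__main__":
    # Hypothetical backends: one large server and two small ones.
    lb = SmoothWeightedRoundRobin({"backend-1": 5, "backend-2": 1, "backend-3": 1})
    print([lb.next() for _ in range(7)])
```

Over any 7 consecutive picks with these weights, backend-1 is chosen 5 times and the other two once each.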

This kind of setup can be built with open-source software, or with purpose-built appliances from many vendors. The load balancing tag here is a great starting point, or you could hire sysadmins who have done this before to consult for you...

Solution 2:

My question now is... are there any best practices / methods / rules of thumb to weight round robin distribution using the TTL attribute of DNS records?

Yes, best practice is: don't do it!

Please repeat after me

  • DNS is not for load balancing
  • DNS does not provide resiliency
  • DNS does not provide fail-over facilities

DNS is for mapping a name to one or more IP addresses. Any subsequent balancing you get is through luck, not design.

Solution 3:

Take a look at PowerDNS. It allows you to create a custom pipe backend. I've modified an example load-balancer DNS backend written in Perl to use the Algorithm::ConsistentHash::Ketama module. This lets me set arbitrary weights like so:

my $ketamahe = Algorithm::ConsistentHash::Ketama->new();

# Configure servers and weights
$ketamahe->add_bucket("192.168.1.2", 50);
$ketamahe->add_bucket("192.168.1.25", 50);

And another one:

# multi-colo hash
my $ketamamc = Algorithm::ConsistentHash::Ketama->new();

# Configure servers and weights
$ketamamc->add_bucket("192.168.1.2", 33);
$ketamamc->add_bucket("192.168.1.25", 33);
$ketamamc->add_bucket("192.168.2.2", 17);
$ketamamc->add_bucket("192.168.2.2", 17);

I've added a CNAME from my desired top level domain to a subdomain I call gslb, for Global Server Load Balancing. From there, I invoke this custom DNS server and send out A records according to my desired weights.

Works like a champ. The ketama hash has the nice property of minimal disruption to existing configuration as you add servers or adjust weights.
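
To show why that property holds, here is a simplified weighted hash ring in Python. It is not the Ketama algorithm itself and not the Perl module's API, just a sketch of the same idea: each bucket gets a number of points on a ring proportional to its weight, and a key maps to the next point clockwise. The class name, `points_per_weight` and the idea of hashing a per-client key such as the resolver's address are assumptions for the example.

```python
import bisect
import hashlib


class WeightedHashRing:
    """Simplified weighted consistent-hash ring (illustrative, not Ketama)."""

    def __init__(self, points_per_weight=40):
        self.points_per_weight = points_per_weight
        self._ring = []  # sorted list of (point_hash, bucket)

    @staticmethod
    def _hash(key):
        # Use the first 8 bytes of MD5 as a 64-bit position on the ring.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_bucket(self, bucket, weight):
        # More weight means more points, hence a larger share of keys.
        for i in range(weight * self.points_per_weight):
            point = (self._hash(f"{bucket}-{i}"), bucket)
            bisect.insort(self._ring, point)

    def lookup(self, key):
        if not self._ring:
            raise LookupError("no buckets configured")
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = WeightedHashRing()
    ring.add_bucket("192.168.1.2", 50)
    ring.add_bucket("192.168.1.25", 50)
    print(ring.lookup("resolver-203.0.113.7"))
```

Adding a bucket only adds points to the ring, so a key either keeps its old bucket or moves to the new one. It never gets shuffled between the existing buckets.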

I recommend reading Alternative DNS Servers, by Jan-Piet Mens. He has many good ideas in there as well as example code.

I'd also recommend abandoning the TTL modulation. You are getting pretty far afield already and adding another kludge on top will make troubleshooting and documentation extremely difficult.

Solution 4:

You can use PowerDNS to do weighted round robin, although distributing load in such an unbalanced fashion (100:1?) may get very interesting, at least with the algorithm I used in my solution. There, each RR entry has a weight between 1 and 100 associated with it, and a random value is used to include or exclude records.
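
One plausible reading of that include/exclude step, sketched in Python rather than in the SQL used by the MySQL backend: draw a random number from 1 to 100 per record and keep the record if the number does not exceed its weight. The function name, the addresses and the fallback for an empty result are assumptions for this sketch, not taken from the article.

```python
import random


def select_records(records, rng=random):
    """records: list of (ip, weight) pairs with weight in 1..100.

    Returns the IPs to include in one DNS response.
    """
    chosen = [ip for ip, weight in records if rng.randint(1, 100) <= weight]
    if not chosen:
        # Assumed fallback: never answer with an empty record set,
        # so return the highest-weighted record instead.
        chosen = [max(records, key=lambda rec: rec[1])[0]]
    return chosen


if __name__ == "__main__":
    # Hypothetical 100:1 setup.
    rrset = [("192.0.2.10", 100), ("192.0.2.20", 1)]
    print(select_records(rrset))
```

Under these assumptions, a weight-1 record next to a weight-100 record shows up in roughly 1% of responses, and always alongside the other record. A client that picks uniformly among the returned IPs would then reach it only about half as often as that, which is one way extreme ratios can behave differently from what the weights suggest.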

Here's an article I wrote on using the MySQL backend in PowerDNS to do weighted RR DNS: http://www.mccartney.ie/wordpress/2008/08/wrr-dns-with-powerdns/

R.I.Pienaar also has some Ruby-based examples (using the PowerDNS pipe backend): http://code.google.com/p/ruby-pdns/wiki/RecipeWeightedRoundRobin

Solution 5:

To deal with this sort of setup you need to look at a real load balancing solution. Read up on Linux Virtual Server and HAProxy. You get the additional benefit that servers are automatically removed from the pool if they fail, and the effects are much more easily understood. Weighting is simply a setting to be tweaked.
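
As a toy model of what such a balancer does for you (not LVS or HAProxy code, and with made-up class, method and backend names), here is a Python sketch of a weighted pool where failed servers drop out of rotation. In a real balancer, periodic health checks would drive the `mark_down` / `mark_up` calls.

```python
import random


class WeightedPool:
    """Weighted random selection over the currently healthy backends."""

    def __init__(self, weights):
        # weights: mapping of backend name -> positive weight
        self.weights = dict(weights)
        self.healthy = set(self.weights)

    def mark_down(self, name):
        # Called when a health check fails; the backend leaves the rotation.
        self.healthy.discard(name)

    def mark_up(self, name):
        # Called when a health check succeeds again.
        if name in self.weights:
            self.healthy.add(name)

    def pick(self, rng=random):
        candidates = [n for n in self.weights if n in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy backends available")
        weights = [self.weights[n] for n in candidates]
        return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    pool = WeightedPool({"backend-1": 3, "backend-2": 1})
    pool.mark_down("backend-1")
    print(pool.pick())
```

Changing a server's share of traffic is then just a matter of editing its weight, with no DNS caching in between to blur the result.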