Redundancy with multiple servers on the same DNS name

I have a domain name like api.test.com. I also have three servers.

I want redundancy between them, so that if one server is offline, api.test.com no longer points to it and only points to the others.

I found that we can have multiple A/AAAA records on the same domain name, but DNS doesn't take into account whether the machine responds to ping or not; it just randomly gives out one of the available IPs.

How can I get redundancy with this setup? Should I add another server that forwards all requests, such as a proxy?


Solution 1:

No, this is not how it works. You can configure more than one record of the same type and name in a zone. The client is generally unaware of that; it requests a certain name and type (for example, a browser requests the name you entered into the address bar, with type A or AAAA).

If more than one record is returned, the client picks one at random and tries to connect to it. It's up to the client to retry with another record if there is one. Servers also encourage this behaviour by returning the records in a different order for each query. So even if a client always picks the first one, the randomisation on the server still spreads the load. This way a primitive form of load balancing is achieved.
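To illustrate what "it's up to the client to retry" means in practice, here is a minimal Python sketch of a client that walks through every address returned for a name instead of giving up after the first one. The function names (`resolve_all`, `connect_with_fallback`) and the `opener` parameter are made up for this example; only the `socket` calls are real standard-library APIs.

```python
import socket

def resolve_all(host, port):
    """Return every (family, sockaddr) pair the resolver gives for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]

def _open_tcp(family, sockaddr, timeout):
    """Open one TCP connection, closing the socket again on failure."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock

def connect_with_fallback(candidates, timeout=3.0, opener=None):
    """Try each candidate in order; return the first connection that succeeds."""
    opener = opener or _open_tcp  # hypothetical hook, handy for testing
    last_error = None
    for family, sockaddr in candidates:
        try:
            return opener(family, sockaddr, timeout)
        except OSError as exc:
            last_error = exc  # this address is down, move on to the next one
    raise ConnectionError(f"no address answered: {last_error}")
```

Typical use would be `connect_with_fallback(resolve_all("api.test.com", 443))`. Python's own `socket.create_connection` runs the same kind of loop internally, so a client built on it already gets this fallback; clients that only ever try the first address do not.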

Special DNS load balancers return A records with a very low TTL, so they expire quickly; the server can therefore react quickly to sudden load spikes or to backend node outages by omitting the affected addresses from its replies. You can see this in action if you try to resolve Zoom's video conferencing servers; they use this technology. But that requires special software, a DNS load balancer, to serve the DNS for this name, and this is just the beginning of the story.
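The core decision such a load balancer makes can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular product works: `build_answer`, the `is_healthy` callback and the 30-second TTL are all assumptions made for the example.

```python
import random

def build_answer(backends, is_healthy, ttl=30):
    """Return (ttl, addresses), leaving out backends that fail the health check."""
    alive = [ip for ip in backends if is_healthy(ip)]
    # If every check fails, fall back to the full set rather than an empty
    # reply (a policy choice for this sketch, not a rule of DNS).
    pool = alive or list(backends)
    random.shuffle(pool)  # vary the record order from reply to reply
    return ttl, pool
```

Because the TTL is short, resolvers come back for a fresh answer within seconds, so an address dropped from `pool` stops receiving new clients almost immediately.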

Generally, if the first IP tried from the A records doesn't answer, web clients usually report failure, even if there were other IPs to try. Some software really does retry; for example, OpenVPN can keep retrying indefinitely, but this is a special case.


DNS provides for redundancy through an entirely different mechanism which, again, expects a smart enough client. There is a special type of DNS record called SRV (service), which has four data fields: priority, weight, port, name.

Name is the simplest: it is the host name this SRV record points to. It must be a name that has an A or AAAA record associated with it; CNAME is not allowed. Port is the TCP or UDP port on which the requested service lives on the server of that name. If there is more than one A or AAAA record for that name, we get the usual "try once" DNS behaviour for this particular SRV record (but the client should try the other SRV records if there are any, e.g. those with higher priority values).

Weight gives finer control over load balancing: if there are several records with the same priority, the client should spread load according to their weights. This is often done probabilistically.

Priority is for redundancy: the records with the lowest value must be tried first, then those with the next priority value, and so on. But again, retrying is up to the client.
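The priority and weight rules described above can be sketched as a small Python function that returns SRV records in the order a client would try them. `SrvRecord` and `order_srv` are names invented for this example, and the weighted pick is a simplified take on the probabilistic selection, not a line-by-line copy of any client.

```python
import random
from collections import namedtuple

SrvRecord = namedtuple("SrvRecord", "priority weight port target")

def order_srv(records, rng=random):
    """Return records in try order: lowest priority first, weighted within a priority."""
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            weights = [r.weight for r in group]
            if sum(weights) == 0:
                pick = rng.choice(group)  # all weights zero: plain random pick
            else:
                # a record with a higher weight is more likely to come first
                pick = rng.choices(group, weights=weights, k=1)[0]
            ordered.append(pick)
            group.remove(pick)
    return ordered
```

A client would then walk the returned list from the top and connect to `target` on `port`, moving to the next entry only when a connection fails.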

The record looks like:

_kerberos._tcp.example.net. SRV 0 100 88 dc.example.net.

The underscores are literal underscores in the record name. The record says that the "kerberos" service is served over TCP at dc.example.net, port 88. dc.example.net must have A or AAAA records. This example is from MS Active Directory, which relies heavily on DNS for proper operation and uses it for ldap (directory) and kerberos (security framework). If you have more than one AD domain controller, there will be more such records, pointing to different DCs.
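To make the layout of that record concrete, here is a small Python sketch that splits such a line into its parts. `parse_srv_line` is a hypothetical helper written for this example; it handles the simple one-line form shown above (optionally with a TTL or class before `SRV`), not full zone-file syntax.

```python
def parse_srv_line(line):
    """Split a one-line SRV record into service, protocol, domain and data fields."""
    fields = line.split()
    type_index = fields.index("SRV")  # anything between owner and SRV is skipped
    owner = fields[0]
    priority, weight, port = (int(x) for x in fields[type_index + 1:type_index + 4])
    target = fields[type_index + 4]
    # owner name has the form _service._proto.domain
    service, proto, domain = owner.split(".", 2)
    return {
        "service": service.lstrip("_"),
        "proto": proto.lstrip("_"),
        "domain": domain,
        "priority": priority,
        "weight": weight,
        "port": port,
        "target": target,
    }
```

For the record above, this yields service `kerberos`, protocol `tcp`, priority 0, weight 100, port 88 and target `dc.example.net.`.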

This type of record is used for, e.g., ldap, kerberos, kpasswd (kerberos password change), xmpp (jabber), sip (IP telephony) and some other services.

MX is like a special case of SRV: it is tied to port 25 and only has a "priority" field, with no "weight". It's an older style, invented before SRV (and the inspiration for it), and it's used for email only.

SRV can't help you with web services. It only helps for services whose clients know to use SRV records to discover the server; web clients never do this.