MX records: which setup is better for load balancing and failover?

Take the domain example.com. It has two mail servers, mail1.example.com and mail2.example.com, both already configured. I would usually go with the following setup:

example.com.           1200    IN      MX      10 mail1.example.com.
example.com.           1200    IN      MX      10 mail2.example.com.
mail1.example.com.     1200    IN      A       172.16.10.1
mail2.example.com.     1200    IN      A       172.16.10.2

A co-worker suggested the following setup:

example.com.           1200    IN      MX      10 mail.example.com.
mail.example.com.      1200    IN      A       172.16.10.1
mail.example.com.      1200    IN      A       172.16.10.2

mail1.example.com.     1200    IN      A       172.16.10.1
mail2.example.com.     1200    IN      A       172.16.10.2

That is, a single new hostname with two A records that point to the two servers. He says some clients do not correctly round-robin between MX records of the same priority. It should be a legitimate setup, but does it still support failover correctly? For example, if 172.16.10.1 fails, is 172.16.10.2 tried for delivery? Or would a setup like this be even better:

example.com.           1200    IN      MX      10 mail.example.com.
example.com.           1200    IN      MX      20 mail1.example.com.
example.com.           1200    IN      MX      20 mail2.example.com.
mail.example.com.      1200    IN      A       172.16.10.1
mail.example.com.      1200    IN      A       172.16.10.2

mail1.example.com.     1200    IN      A       172.16.10.1
mail2.example.com.     1200    IN      A       172.16.10.2

Thanks.


The RFCs that specify how an MTA should handle MX records are RFC974, RFC1123 section 5.3.4, RFC2821 section 5 and RFC5321 section 5.

RFC974's status is now HISTORIC. According to it, MTAs are expected to query the list of MX records associated with a domain and are "encouraged" to try all (or a fixed number of) SMTP servers, in ascending order of preference value. If there are multiple MX records with an equal preference value, MTAs must try all of them before moving on to any record with a higher value. The order of attempts within that group is the MTA's choice; that is, the RFC doesn't say whether SMTP servers must be contacted at random or in the order given by the DNS server. In addition, the RFC doesn't say how to handle an MX record whose hostname has multiple A records.

(...) If the list of MX RRs is not empty, the mailer should try to deliver
the message to the MXs in order (lowest preference value tried
first). The mailer is required to attempt delivery to the lowest
valued MX. Implementors are encouraged to write mailers so that they
try the MXs in order until one of the MXs accepts the message, or all
the MXs have been tried. A somewhat less demanding system, in which
a fixed number of MXs is tried, is also reasonable. Note that
multiple MXs may have the same preference value. In this case, all
MXs at with a given value must be tried before any of a higher value
are tried. In addition, in the special case in which there are
several MXs with the lowest preference value, all of them should be
tried before a message is deemed undeliverable. (...)

RFC1123's status is INTERNET STANDARD. Section 5.3.4 aims to "refine" the RFC974 procedures for handling MX records. It now requires MTAs to be able to try all SMTP servers in ascending order of preference value until one succeeds, although it still allows a configurable limit on the number of tries. If there are multiple MX records with an equal preference value, the RFC recommends (but doesn't require) that MTAs select one record at random. However, if an MX record's hostname has multiple A records (IPv4 addresses), the RFC requires MTAs to contact these addresses in the order given by the resolver until one succeeds.

(...) When it succeeds, the mapping can result in a list of
alternative delivery addresses rather than a single address,
because of (a) multiple MX records, (b) multihoming, or both.
To provide reliable mail transmission, the sender-SMTP MUST be
able to try (and retry) each of the addresses in this list in
order, until a delivery attempt succeeds. However, there MAY
also be a configurable limit on the number of alternate
addresses that can be tried. In any case, a host SHOULD try at
least two addresses.

The following information is to be used to rank the host
addresses:

(1) Multiple MX Records -- these contain a preference
indication that should be used in sorting. If there are
multiple destinations with the same preference and there
is no clear reason to favor one (e.g., by address
preference), then the sender-SMTP SHOULD pick one at
random to spread the load across multiple mail exchanges
for a specific organization; note that this is a
refinement of the procedure in [DNS:3].

(2) Multihomed host -- The destination host (perhaps taken
from the preferred MX record) may be multihomed, in which
case the domain name resolver will return a list of
alternative IP addresses. It is the responsibility of the
domain name resolver interface (see Section 6.1.3.4 below)
to have ordered this list by decreasing preference, and
SMTP MUST try them in the order presented.

(...)

[DNS:3] "Mail Routing and the Domain System," C. Partridge, RFC-974,
January 1986.

RFC2821's status is PROPOSED STANDARD. This RFC obsoletes RFC974 and, as far as MX record handling goes, differs slightly from RFC1123: RFC2821 REQUIRES (MUST) random selection among multiple MX records with an equal preference value, whereas RFC1123 only RECOMMENDS (SHOULD) it.

(...) Multiple MX records contain a preference indication that MUST be used
in sorting (see below). Lower numbers are more preferred than higher
ones. If there are multiple destinations with the same preference
and there is no clear reason to favor one (e.g., by recognition of an
easily-reached address), then the sender-SMTP MUST randomize them to
spread the load across multiple mail exchangers for a specific
organization.

The destination host (perhaps taken from the preferred MX record) may
be multihomed, in which case the domain name resolver will return a
list of alternative IP addresses. It is the responsibility of the
domain name resolver interface to have ordered this list by
decreasing preference if necessary, and SMTP MUST try them in the
order presented. (...)

RFC5321's status is DRAFT STANDARD. This RFC obsoletes RFC2821 and, in the context of DNS resolution, it essentially restates the same server lookup procedure and adds a new section that briefly discusses handling of MX records that reference IPv6 addresses.

(...) When a domain name associated with an MX RR is looked up and the
associated data field obtained, the data field of that response MUST
contain a domain name. That domain name, when queried, MUST return
at least one address record (e.g., A or AAAA RR) that gives the IP
address of the SMTP server to which the message should be directed.

(...) When the lookup succeeds, the mapping can result in a list of
alternative delivery addresses rather than a single address, because
of multiple MX records, multihoming, or both. To provide reliable
mail transmission, the SMTP client MUST be able to try (and retry)
each of the relevant addresses in this list in order, until a
delivery attempt succeeds.

(...)  MX records contain a preference indication that MUST be used in
sorting if more than one such record appears (see below). Lower
numbers are more preferred than higher ones. If there are multiple
destinations with the same preference and there is no clear reason to
favor one (e.g., by recognition of an easily reached address), then
the sender-SMTP MUST randomize them to spread the load across
multiple mail exchangers for a specific organization.

The destination host (perhaps taken from the preferred MX record) may
be multihomed, in which case the domain name resolver will return a
list of alternative IP addresses. It is the responsibility of the
domain name resolver interface to have ordered this list by
decreasing preference if necessary, and the SMTP sender MUST try them
in the order presented. (...)

I assume a modern mail transfer agent follows at least the RFC2821 or RFC5321 procedures, so all three DNS setups provide failover. However, only the first setup is likely to provide good load balancing. If you try the second or the third setup, you will have to make sure your DNS server returns the A records in a random order. Besides, DNS records may be cached either by the MTAs themselves or by recursive DNS servers, so randomness can't be guaranteed. I think mail1.example.com will receive most of the messages.
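To make the ordering rules quoted above concrete, here is a minimal Python sketch of how a sender that follows RFC2821/RFC5321 might build its list of delivery candidates. It is only an illustration under stated assumptions: the function name delivery_order and its input shapes are made up for this example, no real DNS lookups are performed, and actual MTAs add retry limits, caching and timeouts on top of this.

```python
import random
from itertools import groupby


def delivery_order(mx_records, addresses, rng=random):
    """Return (hostname, ip) pairs in the order a sender would try them.

    mx_records: list of (preference, hostname) tuples.
    addresses:  dict mapping hostname -> list of IPs, already in the
                order the resolver returned them (hypothetical input).
    rng:        anything with a shuffle() method, e.g. random.Random(seed).
    """
    candidates = []
    # Lower preference values are tried first.
    by_pref = sorted(mx_records, key=lambda rec: rec[0])
    for _, group in groupby(by_pref, key=lambda rec: rec[0]):
        hosts = [host for _, host in group]
        # Equal-preference MX hosts are randomized to spread the load.
        rng.shuffle(hosts)
        for host in hosts:
            # Addresses of a multihomed host are tried in resolver order.
            for ip in addresses.get(host, []):
                candidates.append((host, ip))
    return candidates


# First setup: two MX records with equal preference.
setup1 = [(10, "mail1.example.com."), (10, "mail2.example.com.")]
# Second setup: one MX record whose hostname has two A records.
setup2 = [(10, "mail.example.com.")]
addresses = {
    "mail.example.com.": ["172.16.10.1", "172.16.10.2"],
    "mail1.example.com.": ["172.16.10.1"],
    "mail2.example.com.": ["172.16.10.2"],
}
```

In both cases the candidate list contains both IP addresses, which is why failover is expected to work either way. The difference is who decides the order: with setup1 the sender shuffles the two hosts, while with setup2 the order is whatever the resolver handed back.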

Another reason I am against the second and third setups is that they point multiple names at one IP address. Mail servers on the internet commonly reject messages from hosts whose mapping IP address => PTR => hostname => A => IP address doesn't match (as the Postfix restriction reject_unknown_client_hostname does), so you will have to take special care when setting PTR records.
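The IP address => PTR => hostname => A => IP address check can be sketched as follows. This is a simplified illustration and not Postfix's actual implementation: the function name forward_confirmed, the dictionaries standing in for DNS data, and the name mx.example.com. are all hypothetical.

```python
def forward_confirmed(client_ip, ptr_names, a_records):
    """Return True if any PTR name of client_ip resolves back to client_ip.

    ptr_names: dict mapping IP -> list of PTR hostnames (stand-in for
               reverse DNS lookups).
    a_records: dict mapping hostname -> list of IPs (stand-in for
               forward DNS lookups).
    """
    for name in ptr_names.get(client_ip, []):
        if client_ip in a_records.get(name, []):
            return True
    return False


A_RECORDS = {
    "mail.example.com.": ["172.16.10.1", "172.16.10.2"],
    "mail1.example.com.": ["172.16.10.1"],
    "mail2.example.com.": ["172.16.10.2"],
}
# Each IP has a PTR that points at a name which resolves back to it.
GOOD_PTR = {
    "172.16.10.1": ["mail1.example.com."],
    "172.16.10.2": ["mail2.example.com."],
}
# Hypothetical misconfiguration: the PTR names a host with no A record.
BAD_PTR = {"172.16.10.1": ["mx.example.com."]}
```

With GOOD_PTR both addresses pass the check. With BAD_PTR, 172.16.10.1 fails because its PTR name has no matching A record, and 172.16.10.2 fails because it has no PTR at all.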

Clients that don't try equal-preference MX records in a random order are already violating RFC2821 and RFC5321. So I think there is no guarantee that these clients will try the secondary IP address automatically either. Because of that, I prefer the simplest DNS configuration:

example.com.           1200    IN      MX      10 mail1.example.com.
example.com.           1200    IN      MX      10 mail2.example.com.
mail1.example.com.     1200    IN      A       172.16.10.1
mail2.example.com.     1200    IN      A       172.16.10.2

EDIT: Added references to RFC1123.


The second setup doesn't support failover. Let's say mail.example.com has been resolved to 172.16.10.1 and it fails. Then 172.16.10.2 won't be tried as there is only one MX record.

The third setup generates twice as much DNS traffic as the first one. Aside from traffic, both of them behave the same way: as you said, some clients won't correctly do round-robin between MX records of the same priority.

In order to have both load balancing and failover I would try:

example.com.           1200    IN      MX      10 mail1.example.com.
example.com.           1200    IN      MX      10 mail2.example.com.
example.com.           1200    IN      MX      20 mail3.example.com.
example.com.           1200    IN      MX      30 mail4.example.com.
mail1.example.com.     1200    IN      A       172.16.10.1
mail2.example.com.     1200    IN      A       172.16.10.2
mail3.example.com.     1200    IN      A       172.16.10.1
mail4.example.com.     1200    IN      A       172.16.10.2
  • 10 MX records ------> Some kind of load balancing
  • 20, 30 MX records --> Failover