Is the 10-DNS-lookup limit in the SPF spec typically enforced?

Both libspf2 (C) and Mail::SPF::Query (perl, used in sendmail-spf-milter) implement a limit of 10 DNS-causing mechanisms, but the latter does not (AFAICT) apply the MX or PTR limits. libspf2 limits each of mx and ptr to 10 also.

Mail::SPF (perl) has a limit of 10 DNS-causing mechanisms, and a limit of 10 lookups per mechanism, per MX and per PTR. (The two perl packages are commonly, though not by default, used in MIMEDefang.)

pyspf has limits of 10 on all of: "lookups", MX, PTR, CNAME; but it explicitly multiplies MAX_LOOKUPS by 4 during operation. Unless in "strict" mode, it also multiplies MAX_MX and MAX_PTR by 4.

I can't comment on commercial/proprietary implementations, but the above (except pyspf) clearly implement an upper limit of 10 DNS-triggering mechanisms (more on that below), give or take, though in most cases it can be overridden at run-time.

In your specific case you are correct: it is 12 includes, and that exceeds the limit of 10. I would expect most SPF software to return "PermError". However, failures will only affect the final "included" provider(s), because the count is calculated as a running total: SPF mechanisms are evaluated left-to-right and checks will "early-out" on a pass, so it depends on where in the sequence the sending server appears.
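A minimal sketch of that running total, under a simplifying assumption that is mine and not any particular library's behaviour: each included record costs exactly one DNS-causing mechanism and contains only ip4/ip6 terms itself. The function name and the provider names are hypothetical.

```python
MAX_DNS_MECHANISMS = 10  # RFC 4408 section 10.1


def evaluate_includes(providers, sender_provider, limit=MAX_DNS_MECHANISMS):
    """Walk include: terms left to right, keeping a running total.

    Assumption: every include costs one DNS-causing mechanism and the
    included record holds only ip4/ip6 terms (no nested includes).
    """
    used = 0
    for name in providers:
        used += 1  # the include itself counts against the limit
        if used > limit:
            return "permerror"  # limit exceeded before a match was found
        if name == sender_provider:
            return "pass"  # early-out: later terms are never evaluated
    return "neutral"  # nothing matched and there is no trailing "all" term


# Twelve hypothetical providers, as in the question.
providers = [f"p{i}.example" for i in range(1, 13)]
print(evaluate_includes(providers, "p3.example"))
print(evaluate_includes(providers, "p11.example"))
```

In this model mail sent through any of the first ten providers still passes, and only the eleventh and twelfth hit the PermError. Real records usually nest further includes, so the limit tends to be reached earlier than this suggests.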

The way around this is to use mechanisms which do not trigger DNS lookups, e.g. ip4 and ip6, and then use mx if possible as that gets you up to 10 further names, each of which can have more than one IP.

Since SPF results in arbitrary DNS requests with potentially exponential scaling, it could easily be exploited for DoS/amplification attacks. It has deliberately low limits to prevent this: it does not scale the way you want.


10 mechanisms (strictly mechanisms + the "redirect" modifier) causing DNS look-ups is not exactly the same thing as 10 DNS look-ups though. Even "DNS lookups" is open to interpretation: you don't know in advance how many discrete lookups are required, and you don't know how many discrete lookups your recursive resolver may need to perform (see below).

RFC 4408 §10.1:

SPF implementations MUST limit the number of mechanisms and modifiers that do DNS lookups to at most 10 per SPF check, including any lookups caused by the use of the "include" mechanism or the "redirect" modifier. If this number is exceeded during a check, a PermError MUST be returned. The "include", "a", "mx", "ptr", and "exists" mechanisms as well as the "redirect" modifier do count against this limit. The "all", "ip4", and "ip6" mechanisms do not require DNS lookups and therefore do not count against this limit.

[...]

When evaluating the "mx" and "ptr" mechanisms, or the %{p} macro, there MUST be a limit of no more than 10 MX or PTR RRs looked up and checked.

So you may use up to 10 mechanisms/modifiers which trigger DNS lookups. (The wording here is poor: it seems to state only the upper bound of the limit, so a conforming implementation could have a limit of 2.)
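To make the counting rule concrete, here is a rough sketch that counts the terms in a single SPF record which the quoted text says count against the limit. The function name and the sample record are hypothetical, and it deliberately does not follow "include" or "redirect" into other records, so a real check would add the counts of every record it visits.

```python
# Mechanisms that RFC 4408 section 10.1 lists as counting against the limit.
DNS_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}


def count_dns_terms(record):
    """Count terms in one SPF record that count against the limit of 10."""
    count = 0
    for term in record.split()[1:]:  # skip the "v=spf1" version tag
        term = term.lstrip("+-~?").lower()  # drop any qualifier
        if term.startswith("redirect="):
            count += 1  # the one modifier that counts
            continue
        # The mechanism name ends at ":" (domain-spec) or "/" (CIDR length).
        name = term.split(":", 1)[0].split("/", 1)[0]
        if name in DNS_MECHANISMS:
            count += 1
    return count


sample = "v=spf1 a mx/24 ip4:192.0.2.0/24 include:_spf.example.com -all"
print(count_dns_terms(sample))  # a, mx and include count; ip4 and all do not
```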

§5.4 for the mx mechanism, and §5.5 for the ptr mechanism each have a limit of 10 lookups of that kind of name, and that applies to the processing of that mechanism only, e.g.:

To prevent Denial of Service (DoS) attacks, more than 10 MX names MUST NOT be looked up during the evaluation of an "mx" mechanism (see Section 10).

i.e. you may have 10 mx mechanisms, each with up to 10 MX names, so each of those may cause 20 DNS operations (10 MX + 10 A DNS lookups each) for a total of 200. It's similar for ptr or %{p}: you can have 10 ptr mechanisms, hence 10x10 PTRs, and each PTR also requires an A lookup, again a total of 200.

This is exactly what the 2009.10 test suite checks, see the "Processing Limits" tests.

There is no clearly stated upper limit on the total number of client DNS lookup operations per SPF check; I calculate it as implicitly 210, give or take. There is also a suggestion to limit the volume of DNS data per SPF check, though no actual limit is suggested. You can get a rough estimate, as SPF records are limited to 450 bytes (which is sadly shared with all other TXT records), but the total could exceed 100 KiB if you're generous. Both those values are clearly open to potential abuse as an amplification attack, which is exactly what §10.1 says you need to avoid.

Empirical evidence suggests a total of 10 lookup mechanisms is commonly implemented in records (check out the SPF for microsoft.com who seem to have gone to some lengths to keep it to exactly 10). It's hard to collect evidence of too-many-lookups failure because the mandated error code is simply "PermError", which covers all manner of problems (DMARC reporting might help with that though).

The OpenSPF FAQ perpetuates the limit of a total of "10 DNS lookups", rather than the more precise "10 DNS causing mechanisms or redirects". This FAQ is arguably wrong since it actually says:

Since there is a limit of 10 DNS lookups per SPF record, specifying an IP address [...]

which is in disagreement with the RFC which imposes the limits on an "SPF check" operation, does not limit DNS lookup operations in this way, and clearly states an SPF record is a single DNS text RR. The FAQ would imply that you restart the count when you process an "include" as that is a new SPF record. What a mess.


DNS Lookups

What is a "DNS lookup" anyway? As a user, I would consider "ping www.microsoft.com" to involve a single DNS "lookup": there's one name that I expect to turn into one IP. Simple? Sadly not.

As an administrator I know that www.microsoft.com might not be a simple A record with a single IP, it might be a CNAME that in turn needs another discrete lookup to obtain an A record, albeit one that my upstream resolver will probably perform rather than the resolver on my desktop. Today, for me, www.microsoft.com is a chain of 3 CNAMEs that finally end up as an A record on akamaiedge.net, that's (at least) 4 DNS query operations for someone. SPF may see CNAMEs with the "ptr" mechanism, an MX record should not be a CNAME though.

Finally, as a DNS administrator I know that answering (almost) any question involves many discrete DNS operations, individual question and answer transactions (UDP datagrams). Assuming an empty cache, a recursive resolver needs to start at the DNS root and work its way down: . → com → microsoft.com → www.microsoft.com, asking for specific types of records (NS, A etc) as required, and dealing with CNAMEs. You can see this in action with dig +trace www.microsoft.com, though you probably won't get the exact same answer due to geolocation trickery. (There's even a little bit more to this complexity since SPF piggybacks on TXT records, and obsolete limits of 512 bytes on DNS answers might mean retrying queries over TCP.)

So what does SPF consider as a lookup? It's really closest to the administrator point of view, it needs to be aware of the specifics of each type of DNS query (but not to the point where it actually needs to count individual DNS datagrams or connections).


RFC 4408 §10.1 does, as you have noted, put some limits on DNS activity. Specifically:

SPF implementations MUST limit the number of mechanisms and modifiers that do DNS lookups to at most 10 per SPF check, including any lookups caused by the use of the "include" mechanism or the "redirect" modifier. If this number is exceeded during a check, a PermError MUST be returned. The "include", "a", "mx", "ptr", and "exists" mechanisms as well as the "redirect" modifier do count against this limit. The "all", "ip4", and "ip6" mechanisms do not require DNS lookups and therefore do not count against this limit. The "exp" modifier does not count against this limit because the DNS lookup to fetch the explanation string occurs after the SPF record has been evaluated.

and moreover

When evaluating the "mx" and "ptr" mechanisms, or the %{p} macro, there MUST be a limit of no more than 10 MX or PTR RRs looked up and checked.

Note that the former is a limit on the number of mechanisms, not the number of lookups performed; but it is still a limit.

As far as I can tell, yes, these limits are enforced fairly hard. They're designed to stop people constructing arbitrarily complex SPF records and using those to DoS servers that check their record by grinding them to a halt in a huge chain of DNS lookups, so it's in the best interests of anyone who implements an SPF checker to honour them.

You are right to note that nested includes are likely to cause the biggest problem with these limits, and if you decide to include several domains each of which makes heavy use of includes themselves, then you can fairly quickly go over them. It's not too hard to find examples of people for whom this has created concrete issues.

The upshot seems to be that problems generally arise when people decide to use both SPF and several distinct and disparate companies to handle their outgoing email. I infer from your question that you fit into that category. SPF does not seem to be designed to serve people who choose to do this. If you insist on doing this, you will likely have to have some kind of cron job on your DNS server that constantly evaluates all the SPF records you would have wished to include, expresses them as a series of ip4: and ip6: mechanisms (on the number of which there is no limit), and republishes the result as your SPF record.

Don't forget to finish with a -all, or the whole exercise was pointless.
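A minimal sketch of what such a flattening job might look like. Everything here is illustrative: lookup_txt is a hypothetical callable (domain in, SPF record string out) that a real job would back with a DNS library, and the domains and addresses are documentation placeholders. It only follows include: and keeps ip4:/ip6: terms; a, mx, ptr, exists, redirect and macros are ignored, so it is not a complete SPF evaluator.

```python
def flatten_spf(domain, lookup_txt, seen=None):
    """Collect the ip4:/ip6: terms reachable from domain via include:.

    lookup_txt is a hypothetical callable: domain -> SPF record string.
    """
    seen = set() if seen is None else seen
    if domain in seen:  # guard against include loops and repeats
        return []
    seen.add(domain)
    networks = []
    for term in lookup_txt(domain).split()[1:]:  # skip "v=spf1"
        term = term.lstrip("+")  # "+" is the default qualifier
        lower = term.lower()
        if lower.startswith(("ip4:", "ip6:")):
            if term not in networks:
                networks.append(term)
        elif lower.startswith("include:"):
            target = term[len("include:"):]
            for net in flatten_spf(target, lookup_txt, seen):
                if net not in networks:
                    networks.append(net)
        # Other mechanisms and modifiers are deliberately ignored here.
    return networks


def build_record(domain, lookup_txt):
    """Assemble the flattened record, finishing with -all."""
    return " ".join(["v=spf1", *flatten_spf(domain, lookup_txt), "-all"])


# Stand-in for DNS: a dict of made-up records.
records = {
    "example.com": "v=spf1 ip4:192.0.2.10 include:_spf.a.example include:_spf.b.example ~all",
    "_spf.a.example": "v=spf1 ip4:198.51.100.0/24 ip6:2001:db8::/32 -all",
    "_spf.b.example": "v=spf1 include:_spf.a.example ip4:203.0.113.5 -all",
}
print(build_record("example.com", records.__getitem__))
```

The job would need to rerun whenever a provider changes its records, and a long flattened record may need splitting to stay within TXT size limits, which this sketch does not attempt.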