RFC that requires DNS servers to respond to unknown domain requests

My domain registrar and DNS provider currently ignores DNS requests for unknown domains. By "ignores" I mean it black-holes the query and never responds, which causes my DNS clients and resolver libraries to retry, back off, and finally time out.

dig @NS3.DNSOWL.COM somedomainthatdoesntexist.org
...
;; connection timed out; no servers could be reached

In surveying other popular domain name services, I see that this behavior is unusual: other providers respond immediately, typically with an RCODE of 5 (REFUSED):

dig @DNS1.NAME-SERVICES.COM somedomainthatdoesntexist.org
dig @NS-284.AWSDNS-35.COM somedomainthatdoesntexist.org
dig @NS21.DOMAINCONTROL.COM somedomainthatdoesntexist.org

All return something like the following:

;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 64732

or

;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 31219

Returning REFUSED or NXDOMAIN immediately is appropriate IMHO as opposed to just dropping the request on the server room floor.
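To make the difference between "answered with an error" and "dropped" concrete, here is a minimal Python sketch of the same check that dig performs above. It assumes plain UDP to port 53 and a non-recursive query; the function names (build_query, rcode_of, probe) are made up for illustration, and it does not verify that the response ID matches the query.

```python
import socket
import struct

# RCODE values from RFC 1035 section 4.1.1.
RCODE_NAMES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
               3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet (QTYPE 1 = A, QCLASS 1 = IN)."""
    # Header: ID, flags (all zero = standard query, recursion not desired),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def rcode_of(response):
    """Return the RCODE name carried in the low 4 bits of the flags word."""
    _msg_id, flags = struct.unpack("!HH", response[:4])
    rcode = flags & 0x000F
    return RCODE_NAMES.get(rcode, "RCODE%d" % rcode)


def probe(server_ip, name, timeout=5.0, port=53):
    """Send one UDP query; return the RCODE name, or "DROPPED" on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server_ip, port))
        try:
            response, _addr = sock.recvfrom(512)
        except socket.timeout:
            return "DROPPED"
    return rcode_of(response)
```

Calling probe() with the IP address of a name server and somedomainthatdoesntexist.org would return a status name such as REFUSED or NXDOMAIN from a server that answers, and DROPPED from one that black-holes the query.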

When I complain to my provider about their servers not responding, they ask me to quote the RFC that their servers are violating. I know it's strange that they are asking me to prove that their servers should respond to all requests but so be it.

Questions:

  • It is my contention that unless there are duplicate request IDs or the server is reacting to some sort of DoS attack, a server should always respond to the request. Is this correct?
  • What RFC and specific section should I quote to support my contention?

To me, it is bad to not respond to a DNS query. Most clients will back off and then retransmit the same query to either the same DNS server or another server. Not only are they slowing clients down but they are causing the same query to be done again by their own or other servers depending on the authoritative name servers and NS entries.
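As a rough illustration of that cost, the following Python sketch estimates how long a client waits, and how many queries it sends, when every query is silently dropped. The model is deliberately simplified and is my own assumption rather than the behavior of any particular resolver library: each round tries every server once, waits the full timeout for each, and optionally multiplies the timeout by a backoff factor between rounds.

```python
def worst_case_wait(servers, attempts, timeout, backoff=1.0):
    """Seconds spent before giving up when every query is dropped.

    Simplified model (an assumption, not a specific resolver's algorithm):
    each round queries every server once and waits the full timeout for
    each; the timeout is multiplied by `backoff` after every round.
    """
    total = 0.0
    per_query = float(timeout)
    for _ in range(attempts):
        total += servers * per_query
        per_query *= backoff
    return total


def queries_sent(servers, attempts):
    """Number of queries the client emits before giving up."""
    return servers * attempts


if __name__ == "__main__":
    # Illustrative values only: 2 name servers, 2 rounds, 5 second timeout.
    print(worst_case_wait(2, 2, 5))   # 20.0 seconds of waiting
    print(queries_sent(2, 2))         # 4 queries instead of 1
```

With an immediate REFUSED or NXDOMAIN, the same client would have its answer after a single query and one round trip.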

In RFC 1536 and RFC 2308 I see a lot of information about negative caching for performance reasons and to stop retransmission of the same query. In RFC 4074 I see information about returning an empty answer with an RCODE of 0 so the client knows there is no IPv6 information, which should cause the client to ask for A RRs; this is another example of an empty response.

But I can't find an RFC which says that a DNS server should respond to a request, probably because it is implied.

The specific problem happens when I migrate my domain (and the associated DNS records) to their servers or the first X minutes after I register a new domain with their service. There is a lag between the time the authoritative name servers change (which is pretty damn fast these days) and the time their servers start serving my DNS records. During this lag time, DNS clients think that their servers are authoritative but they never respond to a request -- even with a REFUSED. I understand the lag, which is fine, but I disagree with the decision to not respond to the DNS requests. For the record, I understand how to work around these limitations in their system but I'm still working with them to improve their services to be more in line with the DNS protocol.

Thanks for the help.


Edit:

Within a couple of months of posting this and following up with my provider, they changed their servers to return NXDOMAIN for unknown domains.


Shane's advice is correct. Failure to migrate data from one authoritative server to another prior to initiating a cutover is an invitation for an outage. Regardless of what happens from that point onward, this is an outage initiated by the person who swung the NS records. This explains why more people are not making this complaint to your provider.

That said, this is still an interesting question to answer so I'm going to take my crack at it.


Basic functionality of DNS servers is covered by RFC 1034 and RFC 1035, which collectively form STD 13. The answer must either come from these two RFCs, or be clarified by a later RFC which updates them.

Before we continue, there's a massive pitfall here outside the scope of DNS that needs to be called out: both of these RFCs predate BCP 14 (1997), the document which clarified the language of MAY, MUST, SHOULD, etc.

  • Standards which were authored before this language was formalized MAY have used clear language, but in several cases did not. This led to divergent implementations of software, mass confusion, etc.
  • STD 13 is unfortunately guilty of being interpretive in several areas. If language is not firm on an area of operation, it is frequently necessary to find a clarifying RFC.

With that out of the way, let's start with what RFC 1034 §4.3.1 has to say:

  • The simplest mode for the server is non-recursive, since it can answer queries using only local information: the response contains an error, the answer, or a referral to some other server "closer" to the answer. All name servers must implement non-recursive queries.

...

If recursive service is not requested or is not available, the non-recursive response will be one of the following:

  • An authoritative name error indicating that the name does not exist.

  • A temporary error indication.

  • Some combination of:

    RRs that answer the question, together with an indication whether the data comes from a zone or is cached.

    A referral to name servers which have zones which are closer ancestors to the name than the server sending the reply.

  • RRs that the name server thinks will prove useful to the requester.

The language here is reasonably firm. There is no "should be", but a "will be". This means that the end result must either be 1) defined in the list above, or 2) allowed for by a later document on the Standards Track which amends the functionality. I am not aware of any such verbiage existing for ignoring the request and I would say that the onus is on the developer to find language which disproves the research.

Given the frequent role of DNS in network abuse scenarios, let it not be said that DNS server software doesn't provide the knobs to selectively drop traffic on the floor, which would technically be a violation of this. That said, these are either not default behaviors or come with very conservative defaults; examples of both would be the user requiring the software to drop a specific name (rpz-drop.), or certain numerical thresholds being exceeded (BIND's max-clients-per-query). It is almost unheard of in my experience for the software to completely alter the default behavior for all packets in a way that violates the standard, unless the option is one that increases tolerance for older products violating a standard. That is not the case here.

In short, this RFC can and does get violated at the discretion of operators, but usually this is done with some manner of precision. It is extremely uncommon to completely disregard sections of the standard as is convenient, especially when the professional consensus (example: BCP 16 §3.3) errs in favor of it being undesirable to generate unnecessary load on the DNS system as a whole. Unnecessary retries caused by dropping all requests for which no authoritative data is present are less than desirable with this in mind.


Update:

Regarding it being undesirable to drop queries on the floor as a matter of course, @Alnitak shared with us that there is currently a Draft BCP covering this topic in detail. It's a bit premature to use this as a citation, but it does help to reinforce that community consensus aligns with what is being expressed here. In particular:

Unless a nameserver is under attack, it should respond to all queries directed to it as a result of following delegations. Additionally code should not assume that there isn't a delegation to the server even if it is not configured to serve the zone. Broken delegations are a common occurrence in the DNS and receiving queries for zones that the server is not configured for is not necessarily an indication that the server is under attack. Parent zone operators are supposed to regularly check that the delegating NS records are consistent with those of the delegated zone and to correct them when they are not [RFC1034]. If this was being done regularly, the instances of broken delegations would be much lower.

This answer will be updated when the status of this document changes.


When you're moving authoritative DNS for a domain to a new provider, you should always (always!) test explicitly against the new provider (and ensure they're sending accurate, configured records) before you alter your domain registration (whois) information to point to the new authoritative DNS servers.

Roughly, the steps you'll take:

  1. Set everything up on the new DNS provider. You should create and populate all the zones.
  2. Make sure the new authoritative servers are working correctly. Query them explicitly:

    dig @new-ns.example.com mydomain.com
    

    From your question, it sounds like they're not responding to these queries? But you said "unknown domains", which yours shouldn't be at this point: it should be fully configured in their system (and responding with the records you configured).

    If you have already configured the domain in their system, it has to be responding with the correct records at this point. If it's not, then they're not hosting the zone properly and you should yell at them; whether or not it responds for a domain it doesn't have configured should be inconsequential. (If I'm still somehow missing what you're saying, please let me know.)

  3. Switch authoritative name servers with your domain registrar (whois), leaving the old DNS provider up and running until traffic is no longer hitting it (give it at least 24 hours).
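The check in step 2 can also be automated. Below is a minimal Python sketch that inspects the raw bytes of a DNS response and decides whether the new server looks ready. The readiness rule (a response with the AA flag set, RCODE 0, and at least one answer record) is my own assumption about what "working correctly" means, and the function names are made up; obtaining the response bytes from the new server is left out.

```python
import struct


def parse_header(response):
    """Unpack the fixed 12-byte DNS header (RFC 1035 section 4.1.1)."""
    if len(response) < 12:
        raise ValueError("DNS response shorter than the 12-byte header")
    msg_id, flags, _qdcount, ancount, _nscount, _arcount = struct.unpack(
        "!HHHHHH", response[:12]
    )
    return {
        "id": msg_id,
        "qr": bool(flags & 0x8000),   # 1 = this message is a response
        "aa": bool(flags & 0x0400),   # 1 = authoritative answer
        "rcode": flags & 0x000F,      # 0 = NOERROR, 3 = NXDOMAIN, 5 = REFUSED
        "ancount": ancount,           # number of answer records
    }


def ready_for_cutover(response):
    """Assumed readiness rule: authoritative, NOERROR, and non-empty answer."""
    header = parse_header(response)
    return (
        header["qr"]
        and header["aa"]
        and header["rcode"] == 0
        and header["ancount"] > 0
    )
```

Only when every new name server passes a check like this for the records you care about would I swing the NS records at the registrar.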

If the new provider absolutely cannot have the records populated before you make the switch, then how they respond is really not going to matter - pointing users to an authoritative server that refuses the query completely will incur downtime for your domain just the same as if you were getting no response at all.