When are NS records at the apex of a DNS domain queried?

Solution 1:

For the casual user, it really shouldn't matter. If what you're after is consistent delivery and uptime for your domain, the rules are pretty simple:

  • The NS records on your nameservers should point only at names that have A and AAAA records (not CNAME records, etc.).
  • Make sure to include IPv4 addresses (A records), or it will not be possible for DNS servers running single-stack IPv4 to obtain information about your domain.
  • The names and IP addresses of those nameservers should match how your domain is configured within the registrar's control panel.
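As a rough illustration, the three rules above can be checked mechanically. This is a minimal sketch over hand-written data, not a real DNS lookup: the function name `check_delegation`, the data layout, and every hostname and address in it are hypothetical examples.

```python
# Toy check of the three rules above against hand-written data.
# Nothing here queries real DNS; all names and addresses are made up.

def check_delegation(zone_ns, records, registrar_ns):
    """Return a list of problems; an empty list means the rules hold.

    zone_ns:      set of NS target names published in the zone itself
    records:      {name: [(rtype, value), ...]} for those target names
    registrar_ns: {name: set of IPs} as entered in the registrar's panel
    """
    problems = []
    for ns in sorted(zone_ns):
        rtypes = {rtype for rtype, _ in records.get(ns, [])}
        # Rule 1: NS targets must resolve via A/AAAA, never via CNAME.
        if "CNAME" in rtypes:
            problems.append(f"{ns}: NS target is a CNAME")
        # Rule 2: IPv4-only resolvers need at least one A record.
        if "A" not in rtypes:
            problems.append(f"{ns}: no A record")
    # Rule 3: names and addresses must match what the registrar holds.
    if set(zone_ns) != set(registrar_ns):
        problems.append("NS names differ between zone and registrar")
    for ns, ips in sorted(registrar_ns.items()):
        zone_ips = {v for t, v in records.get(ns, []) if t in ("A", "AAAA")}
        if ns in zone_ns and zone_ips != ips:
            problems.append(f"{ns}: addresses differ between zone and registrar")
    return problems


if __name__ == "__main__":
    print(check_delegation(
        {"ns1.example.com."},
        {"ns1.example.com.": [("A", "192.0.2.1"), ("AAAA", "2001:db8::1")]},
        {"ns1.example.com.": {"192.0.2.1", "2001:db8::1"}},
    ))
```

A real check would pull the zone side from your authoritative servers and the registrar side from the parent zone's delegation; the comparison logic stays the same.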

That's it. The how and why of the implementation doesn't matter so much. If you diverge from this recommendation, the result will be a great deal of inconsistent and unpredictable behavior. Scary phrases such as "undefined behavior" and "implementation specific behavior" both apply here.

With that said, the question being asked by the OP is a completely fair one. Excluding explicit requests by clients and excluding indirect referencing within the authority section of other answers, when are these NS records explicitly requested by recursive nameservers?


You've inadvertently walked into one of the more ambiguous areas of how recursive DNS servers operate. To the best of my knowledge, we still don't have an amendment to the governing Internet Standard clarifying how this is "supposed" to work.

A high level overview of how a recursive DNS server learns about your domain goes like this:

  • Recursive server gets a request for www.example.com. IN A.
  • If this DNS record is in cache, it is answered from cache.
  • If the DNS record is not in cache, it needs to find a nameserver which can provide the answer. It starts by checking its memory to see if it has already identified nameservers related to the domain. It will consult the nameservers for the most specific zone (aka domain) it is aware of. If referrals for more specific zones are encountered, those referrals will be followed until a server identifies itself as authoritative for www.example.com. IN A. (or until an error prevents it from following the path further)
  • In a "cold cache" scenario (imagine a freshly restarted DNS server), it would have to start from scratch with the least specific zone and work its way toward the most specific. For our example of www.example.com. IN A, it would follow this set of referrals:

    • .: AKA "root" nameservers.
    • com.: The Top Level Domain nameservers for com., learned from . nameservers.
    • example.com.: The nameservers listed for example.com. in the com. registry, learned from the com. nameservers.
    • www.example.com: This happens only if the example.com nameservers provided a referral to a different set of nameservers for www. For this example let's assume that's not the case. Our answer for the A record will have come directly from the nameservers for example.com.

At each step along this path, the recursive server asked whether these servers were responsible for www.example.com and received a referral to a more specific set of DNS servers. At no point in this walk did we need to ask for the NS records. We learned about the more specific servers through referrals until one set of servers finally replied with an authoritative answer for www.example.com. (in this case, the example.com. nameservers had our answer)
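The walk described above can be sketched with a toy model. This is only an illustration under heavy assumptions: the "servers" are plain dictionaries, the `ZONES` table and `resolve` function are hypothetical, and real resolvers also deal with caching, glue, timeouts and much more. The point it shows is that every question sent is for the original name and type, never for NS.

```python
# Toy cold-cache walk. Each zone either delegates to a child zone or holds
# the final answer. All data below is invented for the example.
ZONES = {
    ".": {"delegations": ["com."], "answers": {}},
    "com.": {"delegations": ["example.com."], "answers": {}},
    "example.com.": {
        "delegations": [],
        "answers": {("www.example.com.", "A"): "192.0.2.10"},
    },
}


def resolve(qname, qtype, zones=ZONES):
    """Follow referrals from the root; return (answer, zones asked in order)."""
    zone = "."
    path = []
    while True:
        path.append(zone)
        server = zones[zone]
        # The question asked at every hop is (qname, qtype) itself.
        if (qname, qtype) in server["answers"]:
            return server["answers"][(qname, qtype)], path
        # No answer here: look for a referral to a zone that covers qname.
        matches = [
            d for d in server["delegations"]
            if qname == d or qname.endswith("." + d)
        ]
        if not matches:
            return None, path  # nobody to refer us to
        zone = max(matches, key=len)  # follow the most specific referral


if __name__ == "__main__":
    answer, path = resolve("www.example.com.", "A")
    print(answer, path)
```

In this model the NS sets are implied by the `delegations` lists: the resolver only ever learns them as a side effect of being referred onward.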

This is where things get weird.

The NS records we have in memory at this point were learned through referrals. For the purposes of the nameserver this is "good enough", but we now have two problems:

  • What happens when the TTL associated with the NS records in the referral expires?

  • What happens when someone asks us for the value of these NS records?

We'll explore each of these.

The TTL of NS records learned through referrals has expired. Now what?

This is where nameserver behavior diverges greatly. While it has some age on it (March 2011), I strongly recommend reading Ólafur Guðmundsson's presentation that covers the topic. Slides 11 - 13 introduce us to several patterns of nameserver behavior. I'm going to borrow the same terms from Ólafur's presentation:

Child Centric non-sticky
PPPCCCPPPCCCPPPCCCPP

Child Centric sticky
PPPCCCCCCCCCCCCCCCCC

Parent Centric
PPPPPPPPPPPPPPPPPPPP

In this instance, "parent" is referring to the NS records that we learned of through the referral. "child" is referring to the NS records that we learned through the authoritative answer we receive when we query the first set of NS records for the value of example.com. IN NS. (i.e. when we are asking those nameservers to return their own NS record...in theory)

The commonality with all of these patterns is that the NS data in memory is first learned from the parent. This is a given, as it's fundamental to how the process works. Where implementations differ is what they do afterwards:

  • Child centric non-sticky will initially prefer parent, then swap to the child. Once the child expires, the NS records are "forgotten" and re-learned from scratch in order to provide an opportunity for changes on the parent nameservers to be incorporated. Without this, changes in nameservers related to expired domains would not be caught -- both the expiration and renewal thereof. The disadvantage is that occasionally these NS record definitions do not agree, resulting in the recursive server returning different responses for a specific DNS record (i.e. www.example.com. IN A) depending on which servers it is currently hitting.

  • Child centric sticky is a very problematic implementation where the nameservers get "stuck" on the child side of the definition and the parent side is not re-evaluated until the cache is purged or the server is restarted. It is generally considered to be the worst of these implementations due to the very obvious problems that are associated with it. (an example would be this Q&A where someone is observing the behavior)

  • Parent centric is an interesting implementation that eschews the value of the child/authoritative NS records entirely. The general idea behind it is that alternating between the values of the parent and the child causes much more trouble and confusion than it's worth. By ignoring the "authoritative" version of the NS records completely and always preferring the referral (without which it's not possible to learn about the authoritative records anyway), you avoid the "flip-flop" problem of child centric non-sticky entirely. The main disadvantage is a handful of edge cases where the NS records from the child side can help expedite a migration off of old nameservers prior to the change being made at the registry. This can be beneficial when you're dealing with certain boneheaded registrars who also provide DNS services, but immediately kill all of your DNS data when you change the servers for your domain to point somewhere else.
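To make the P/C strings easier to read, here is a toy simulation of the three policies. It is a sketch under simplifying assumptions, not a model of any real resolver: time moves in fixed ticks, parent and child NS sets share one TTL, and the function name `ns_source_timeline` and the policy labels are invented for this example.

```python
def ns_source_timeline(policy, ticks=20, ttl=3):
    """Return a string of 'P'/'C': whose NS set is in use at each tick."""
    out = []
    source, age = "P", 0  # NS data is always first learned from the parent
    for _ in range(ticks):
        if age == ttl:  # the cached NS set has just expired
            if policy == "parent-centric":
                source = "P"  # always go back to the referral
            elif policy == "child-sticky":
                source = "C"  # once on the child, never leave
            elif policy == "child-non-sticky":
                # alternate: parent -> child -> forget and re-learn from parent
                source = "C" if source == "P" else "P"
            else:
                raise ValueError(f"unknown policy: {policy}")
            age = 0
        out.append(source)
        age += 1
    return "".join(out)


if __name__ == "__main__":
    for policy in ("child-non-sticky", "child-sticky", "parent-centric"):
        print(f"{policy:18} {ns_source_timeline(policy)}")
```

With the defaults (20 ticks, TTL of 3), the non-sticky policy alternates in blocks of three, the sticky policy shows three P followed by C for the rest of the run, and the parent centric policy never leaves P.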

As you can see, this is a complicated topic and one that is extremely difficult to document without extensive testing. It works this way because the standards remain loose in this area to this day, at least to the best of my knowledge.

What happens when a client asks a recursor for the value of NS records?

Once again, it depends.

RFC 2181 strongly discourages nameservers from returning cached nameserver data learned from referrals in the answer section, but does not outright forbid it (the wording is "should not"):

Unauthenticated RRs received and cached from the least trustworthy of     
those groupings, that is data from the additional data section, and
data from the authority section of a non-authoritative answer, should
not be cached in such a way that they would ever be returned as
answers to a received query.  They may be returned as additional
information where appropriate.  Ignoring this would allow the
trustworthiness of relatively untrustworthy data to be increased
without cause or excuse.

[...] Note that throughout this document, "authoritative" means a
reply with the AA bit set.

Despite this warning, we can return the NS records observed from the referral in our answer as it's not explicitly forbidden. I suspect it's more likely to happen with parent centric implementations, but I don't have any good data in front of me at the moment. I'll do some testing on my own when I find the time and update this answer.
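The RFC 2181 passage quoted above boils down to a small decision rule. The sketch below encodes only what that quote says, read strictly; since the RFC wording is "should not", a real server is free to behave differently. The function name `may_answer_from_cache` and its parameters are hypothetical.

```python
def may_answer_from_cache(section, aa):
    """Strict reading of the quoted rule.

    section: where the RRset was seen: "answer", "authority" or "additional"
    aa:      True if the reply carrying it had the AA bit set
    Returns True if the cached RRset may be returned in an answer section.
    """
    if section == "additional":
        return False  # additional-section data: least trustworthy group
    if section == "authority" and not aa:
        return False  # this is what a referral looks like
    return True


if __name__ == "__main__":
    # NS records learned from a referral (authority section, no AA bit)
    print(may_answer_from_cache("authority", aa=False))
    # NS records from the child's own authoritative answer to an NS query
    print(may_answer_from_cache("answer", aa=True))
```

Under this reading, a recursor that only holds referral-learned NS records has nothing it is supposed to answer with, which is why an explicit NS query can force it to go and ask the child.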

What happens if the server has the nameservers from the referral cached, and does honor RFC 2181? In the case of ISC BIND (at least in the 9.10 and 9.11 implementations that I have the most experience with), the explicit request for the NS records from the client triggers an immediate refresh against the child nameservers. It's easiest to observe when the child nameservers are serving something that BIND considers broken, such as NS records that point at CNAME records. BIND will initially be able to answer for the domain using the information it received from the initial referral (glue included), but the domain will immediately stop working the moment the NS record request is received and the nameserver attempts to re-learn the nameserver information it needs to communicate with.


Closing Disclaimer: This is an extremely vague and confusing area of recursive server operation. Some things may have changed since I last explored the topic in-depth. I'm happy to amend any information provided here, but please provide specific data citations where possible.