Windows DNS Server 2008 R2 fallaciously returns SERVFAIL
I have a Windows 2008 R2 domain controller which is also a DNS server. When resolving certain TLDs, it returns a SERVFAIL:
$ dig bogus.
; <<>> DiG 9.8.1 <<>> bogus.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 31919
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;bogus. IN A
I get the same result for a real TLD like com.
when querying the DC as shown above. Compare to a BIND server that is working as expected:
$ dig bogus. @128.59.59.70
; <<>> DiG 9.8.1 <<>> bogus. @128.59.59.70
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 30141
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;bogus. IN A
;; AUTHORITY SECTION:
. 10800 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2012012501 1800 900 604800 86400
;; Query time: 18 msec
;; SERVER: 128.59.59.70#53(128.59.59.70)
;; WHEN: Wed Jan 25 14:09:14 2012
;; MSG SIZE rcvd: 98
Similarly, when I query my Windows DNS server with dig . any
, I get a SERVFAIL but the BIND servers return the root zone as expected.
This sounds similar to the issue described in http://support.microsoft.com/kb/968372 except I am using two forwarders (128.59.59.70 from above as well as 128.59.62.10) and falling back to root hints so the preconditions to expose the issue are not the same. Nevertheless, I also applied the MaxCacheTTL
registry fix as described and restarted DNS and the whole server as well but the problem persists. The problem occurs on all domain controllers in this domain and has occurred since half a year ago, even though the servers are getting automatic Windows updates.
EDIT
Here is a debug log. The client is 160.39.114.110, which is my workstation.
1/25/2012 2:16:01 PM 0E08 PACKET 000000001EA6BFD0 UDP Rcv 160.39.114.110 2e94 Q [0001 D NOERROR] A (5)bogus(0)
UDP question info at 000000001EA6BFD0
Socket = 508
Remote addr 160.39.114.110, port 49710
Time Query=1077016, Queued=0, Expire=0
Buf length = 0x0fa0 (4000)
Msg length = 0x0017 (23)
Message:
XID 0x2e94
Flags 0x0100
QR 0 (QUESTION)
OPCODE 0 (QUERY)
AA 0
TC 0
RD 1
RA 0
Z 0
CD 0
AD 0
RCODE 0 (NOERROR)
QCOUNT 1
ACOUNT 0
NSCOUNT 0
ARCOUNT 0
QUESTION SECTION:
Offset = 0x000c, RR count = 0
Name "(5)bogus(0)"
QTYPE A (1)
QCLASS 1
ANSWER SECTION:
empty
AUTHORITY SECTION:
empty
ADDITIONAL SECTION:
empty
1/25/2012 2:16:01 PM 0E08 PACKET 000000001EA6BFD0 UDP Snd 160.39.114.110 2e94 R Q [8281 DR SERVFAIL] A (5)bogus(0)
UDP response info at 000000001EA6BFD0
Socket = 508
Remote addr 160.39.114.110, port 49710
Time Query=1077016, Queued=0, Expire=0
Buf length = 0x0fa0 (4000)
Msg length = 0x0017 (23)
Message:
XID 0x2e94
Flags 0x8182
QR 1 (RESPONSE)
OPCODE 0 (QUERY)
AA 0
TC 0
RD 1
RA 1
Z 0
CD 0
AD 0
RCODE 2 (SERVFAIL)
QCOUNT 1
ACOUNT 0
NSCOUNT 0
ARCOUNT 0
QUESTION SECTION:
Offset = 0x000c, RR count = 0
Name "(5)bogus(0)"
QTYPE A (1)
QCLASS 1
ANSWER SECTION:
empty
AUTHORITY SECTION:
empty
ADDITIONAL SECTION:
empty
Every option in the debug log box was checked except "filter by IP". By contrast, when I query, say, accounts.google.com, I can see the DNS server go out to its forwarder (128.59.59.70, for example). In this case, I didn't see any packets going out from my DNS server even though bogus.
was not in the cache (the debug log was already running and this is the first time I queried this server for bogus.
or any TLD). It just returned SERVFAIL without consulting any other DNS server, as in the Microsoft KB article linked above.
We have had similar issue on Microsoft DNS servers with no forwarders configured. Looks like this hotfix is related: http://support.microsoft.com/kb/2508835 . I wouldnt say that using forwarders would negate the applicability.