Cannot resolve websites intermittently (mostly .gov)

We are using Windows Server 2012. Intermittently, we are unable to resolve .gov websites. When we query a public resolver directly with the following command, the names do resolve, so we know that the .gov websites themselves are available.

nslookup www.fda.gov 8.8.8.8
Server:  google-public-dns-a.google.com
Address:  8.8.8.8

Non-authoritative answer:
Name:    a1715.dscb.akamai.net
Addresses:  2607:f7d8:801:100::40ba:2f29
          2607:f7d8:801:100::40ba:2f30
          23.3.96.168
          23.3.96.89
Aliases:  www.fda.gov
          www.fda.gov.edgesuite.net

Using our DNS forwarder:

nslookup www.fda.gov [IP address of DNS forwarder]
Server:  [FQDN of DNS forwarder]
Address:  [IP address of DNS forwarder]

Non-authoritative answer:
Name:    a1715.dscb.akamai.net
Addresses:  2607:f7d8:801:100::40ba:2f30
          2607:f7d8:801:100::40ba:2f29
          23.3.96.168
          23.3.96.89
Aliases:  www.fda.gov
          www.fda.gov.edgesuite.net

Using our DNS Server:

nslookup -d2 www.fda.gov
------------
SendRequest(), len 43
    HEADER:
        opcode = QUERY, id = 1, rcode = NOERROR
        header flags:  query, want recursion
        questions = 1,  answers = 0,  authority records = 0,  additional = 0

    QUESTIONS:
        11.1.168.192.in-addr.arpa, type = PTR, class = IN

------------
------------
Got answer (126 bytes):
    HEADER:
        opcode = QUERY, id = 1, rcode = NXDOMAIN
        header flags:  response, auth. answer, want recursion, recursion avail
        questions = 1,  answers = 0,  authority records = 1,  additional = 0

    QUESTIONS:
        11.1.168.192.in-addr.arpa, type = PTR, class = IN
    AUTHORITY RECORDS:
    ->  1.168.192.in-addr.arpa
        type = SOA, class = IN, dlen = 49
        ttl = 3600 (1 hour)
        primary name server = iss3.iss.local
        responsible mail addr = hostmaster.iss.local
        serial  = 163
        refresh = 900 (15 mins)
        retry   = 600 (10 mins)
        expire  = 86400 (1 day)
        default TTL = 3600 (1 hour)

------------
Server:  UnKnown
Address:  [server IP]

------------
SendRequest(), len 29
    HEADER:
        opcode = QUERY, id = 2, rcode = NOERROR
        header flags:  query, want recursion
        questions = 1,  answers = 0,  authority records = 0,  additional = 0

    QUESTIONS:
        www.fda.gov, type = A, class = IN

------------
DNS request timed out.
    timeout was 2 seconds.
timeout (2 secs)
SendRequest failed
------------
SendRequest(), len 29
    HEADER:
        opcode = QUERY, id = 3, rcode = NOERROR
        header flags:  query, want recursion
        questions = 1,  answers = 0,  authority records = 0,  additional = 0

    QUESTIONS:
        www.fda.gov, type = AAAA, class = IN

------------
DNS request timed out.
    timeout was 2 seconds.
timeout (2 secs)
SendRequest failed
*** Request to UnKnown timed-out

When I use set vc in nslookup (which forces the query over TCP instead of UDP):

Server:  UnKnown
Address:  192.168.1.11

*** UnKnown can't find www.fda.gov: Server failed

We did have IPv6 disabled, so I re-enabled it, but the problem persists. We would appreciate any advice or troubleshooting steps to resolve this issue.

We are in the United States. Here is a ping from a time when we could reach the FDA website (as mentioned, the problem is intermittent):

Pinging a1715.dscb.akamai.net [23.3.96.168] with 32 bytes of data:
Reply from 23.3.96.168: bytes=32 time=97ms TTL=57
Reply from 23.3.96.168: bytes=32 time=14ms TTL=57
Reply from 23.3.96.168: bytes=32 time=36ms TTL=57
Reply from 23.3.96.168: bytes=32 time=20ms TTL=57

Ping statistics for 23.3.96.168:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 14ms, Maximum = 97ms, Average = 41ms

Here is a ping when it is not available:

ping www.fda.gov
Ping request could not find host www.fda.gov. Please check the name and try again.

ping www.fda.gov.
Ping request could not find host www.fda.gov.. Please check the name and try again.

Others have had similar problems. In one such thread, changing MaxCacheTTL to 30 minutes (1800 seconds) resolved the problem, so we tried the same change, but the problem persists for us.

We also just tried changing MaxCacheTTL to 0. That did not work either. However, we discovered that we cannot access www.paypal.com at the same times that we cannot access the .gov websites, and whenever we can access www.fda.gov, we can also access www.paypal.com. That indicates to me that TTL cannot be the cause, since TTL applies on a per-record basis. The fact that the first MaxCacheTTL adjustment had no effect should already have made that evident.

We enabled detailed DNS debug logging for www.fda.gov. The results are fascinating, but we don't know what to do with them. It appears that the DNS server is first asked for the name as a subdomain of our own domain: www.fda.gov.[domain].local.

3/9/2017 11:33:10 AM 448C PACKET  000000010655E8A0 UDP Rcv [server IP]    0002   Q [0001   D   NOERROR] A      (3)www(3)fda(3)gov(3)[domain](5)local(0)
UDP question info at 000000010655E8A0
  Socket = 492
  Remote addr [server IP], port 60700
  Time Query=2151068, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x0027 (39)
  Message:
    XID       0x0002
    Flags     0x0100
      QR        0 (QUESTION)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        0
      Z         0
      CD        0
      AD        0
      RCODE     0 (NOERROR)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   0
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(3)[domain](5)local(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
      empty
    ADDITIONAL SECTION:
      empty

3/9/2017 11:33:10 AM 448C PACKET  000000010655E8A0 UDP Snd [server IP]    0002 R Q [8385 A DR NXDOMAIN] A      (3)www(3)fda(3)gov(3)[domain](5)local(0)
UDP response info at 000000010655E8A0
  Socket = 492
  Remote addr [server IP], port 60700
  Time Query=2151068, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x0064 (100)
  Message:
    XID       0x0002
    Flags     0x8583
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        1
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     3 (NXDOMAIN)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   1
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(3)[domain](5)local(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
    Offset = 0x0027, RR count = 0
    Name      "(3)[domain](5)local(0)"
      TYPE   SOA  (6)
      CLASS  1
      TTL    3600
      DLEN   40
      DATA   
        PrimaryServer: (4)servername[C027](3)[domain](5)local(0)
        Administrator: (10)hostmaster[C027](3)[domain](5)local(0)
        SerialNo     = 2735
        Refresh      = 900
        Retry        = 600
        Expire       = 86400
        MinimumTTL   = 3600
    ADDITIONAL SECTION:
      empty
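
The NXDOMAIN above for www.fda.gov.[domain].local is consistent with a client appending its DNS suffix search list before trying the name as typed. A minimal Python sketch of that candidate ordering follows. The function name and the example.local suffix are hypothetical, and the ordering is an assumption taken from this log excerpt (the suffixed query, XID 0x0002, precedes the bare one, XID 0x0004), not a statement of how every Windows resolver behaves.

```python
def candidate_names(name, search_list):
    """Return the fully qualified names a suffix-appending client might try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as already fully qualified,
        # so no suffix is appended.
        return [name]
    # Assumed order: each search suffix first, then the name as typed.
    candidates = [f"{name}.{suffix.strip('.')}." for suffix in search_list]
    candidates.append(name + ".")
    return candidates


# "example.local" stands in for the redacted [domain].local.
print(candidate_names("www.fda.gov", ["example.local"]))
print(candidate_names("www.fda.gov.", ["example.local"]))
```

If this is what is happening, the NXDOMAIN for the suffixed name is expected and harmless; the failure that matters is the SERVFAIL or timeout on the bare www.fda.gov query.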

When it is working:

3/9/2017 11:33:10 AM 448C PACKET  000000010672E9F0 UDP Snd [server IP]    0004 R Q [8081   DR  NOERROR] A      (3)www(3)fda(3)gov(0)
UDP response info at 000000010672E9F0
  Socket = 492
  Remote addr [server IP], port 60702
  Time Query=2151068, Queued=2151068, Expire=2151071
  Buf length = 0x0200 (512)
  Msg length = 0x0077 (119)
  Message:
    XID       0x0004
    Flags     0x8180
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     0 (NOERROR)
    QCOUNT    1
    ACOUNT    3
    NSCOUNT   0
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
    Offset = 0x001d, RR count = 0
    Name      "[C00C](3)www(3)fda(3)gov(0)"
      TYPE   CNAME  (5)
      CLASS  1
      TTL    128
      DLEN   25
      DATA   (3)www(3)fda(3)gov(7)edgekey(3)net(0)
    Offset = 0x0042, RR count = 1
    Name      "[C029](3)www(3)fda(3)gov(7)edgekey(3)net(0)"
      TYPE   CNAME  (5)
      CLASS  1
      TTL    3992
      DLEN   25
      DATA   (6)e11872(4)dscb(10)akamaiedge[C03D](3)net(0)
    Offset = 0x0067, RR count = 2
    Name      "[C04E](6)e11872(4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    20
      DLEN   4
      DATA   184.31.201.196
    AUTHORITY SECTION:
      empty
    ADDITIONAL SECTION:
      empty

3/9/2017 11:33:32 AM 4988 PACKET  00000001050E88F0 UDP Rcv [server IP]   9658   Q [0001   D   NOERROR] A      (3)www(3)fda(3)gov(0)
UDP question info at 00000001050E88F0
  Socket = 492
  Remote addr [server IP], port 62657
  Time Query=2151089, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x001d (29)
  Message:
    XID       0x9658
    Flags     0x0100
      QR        0 (QUESTION)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        0
      Z         0
      CD        0
      AD        0
      RCODE     0 (NOERROR)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   0
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
      empty
    ADDITIONAL SECTION:
      empty

3/9/2017 11:40:36 AM 0F98 PACKET  0000000102B32600 UDP Snd [server IP]    23f2 R Q [8081   DR  NOERROR] A      (3)www(3)fda(3)gov(0)
UDP response info at 0000000102B32600
  Socket = 492
  Remote addr [server IP], port 55901
  Time Query=2151514, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x0184 (388)
  Message:
    XID       0x23f2
    Flags     0x8180
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     0 (NOERROR)
    QCOUNT    1
    ACOUNT    3
    NSCOUNT   9
    ARCOUNT   5
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
    Offset = 0x001d, RR count = 0
    Name      "[C00C](3)www(3)fda(3)gov(0)"
      TYPE   CNAME  (5)
      CLASS  1
      TTL    300
      DLEN   25
      DATA   (3)www(3)fda(3)gov(7)edgekey(3)net(0)
    Offset = 0x0042, RR count = 1
    Name      "[C029](3)www(3)fda(3)gov(7)edgekey(3)net(0)"
      TYPE   CNAME  (5)
      CLASS  1
      TTL    15195
      DLEN   25
      DATA   (6)e11872(4)dscb(10)akamaiedge[C03D](3)net(0)
    Offset = 0x0067, RR count = 2
    Name      "[C04E](6)e11872(4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    20
      DLEN   4
      DATA   23.194.99.134
    AUTHORITY SECTION:
    Offset = 0x0077, RR count = 0
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n6dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x008c, RR count = 1
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n7dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00a1, RR count = 2
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)a0dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00b6, RR count = 3
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n0dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00cb, RR count = 4
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n1dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00e0, RR count = 5
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n2dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00f5, RR count = 6
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n3dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x010a, RR count = 7
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n4dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x011f, RR count = 8
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n5dscb[C05A](10)akamaiedge[C03D](3)net(0)
    ADDITIONAL SECTION:
    Offset = 0x0134, RR count = 0
    Name      "[C0D7](6)n1dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    807
      DLEN   4
      DATA   69.22.155.207
    Offset = 0x0144, RR count = 1
    Name      "[C0EC](6)n2dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    3922
      DLEN   4
      DATA   69.22.155.209
    Offset = 0x0154, RR count = 2
    Name      "[C101](6)n3dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    1418
      DLEN   4
      DATA   24.143.193.180
    Offset = 0x0164, RR count = 3
    Name      "[C083](6)n6dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    3973
      DLEN   4
      DATA   23.220.96.109
    Offset = 0x0174, RR count = 4
    Name      "[C098](6)n7dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    279
      DLEN   4
      DATA   23.220.96.86

When it is not working:

3/9/2017 11:50:47 AM 2988 PACKET  00000001058C3ED0 UDP Snd [server IP]    44af R Q [8281   DR SERVFAIL] A      (3)www(3)fda(3)gov(0)
UDP response info at 00000001058C3ED0
  Socket = 492
  Remote addr [server IP], port 54261
  Time Query=2152117, Queued=2152121, Expire=2152124
  Buf length = 0x0200 (512)
  Msg length = 0x001d (29)
  Message:
    XID       0x44af
    Flags     0x8182
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     2 (SERVFAIL)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   0
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
      empty
    ADDITIONAL SECTION:
      empty
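
For comparing the working and failing responses, the 16-bit Flags values in these log excerpts can be decoded with a few bit shifts. The sketch below follows the standard DNS header layout (RFC 1035, with the AD and CD bits from the DNSSEC RFCs); the function name is made up for illustration.

```python
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def decode_flags(flags):
    """Split a 16-bit DNS header flags word into its named fields."""
    rcode = flags & 0xF
    return {
        "QR": (flags >> 15) & 1,       # 0 = question, 1 = response
        "OPCODE": (flags >> 11) & 0xF,
        "AA": (flags >> 10) & 1,       # authoritative answer
        "TC": (flags >> 9) & 1,        # truncated
        "RD": (flags >> 8) & 1,        # recursion desired
        "RA": (flags >> 7) & 1,        # recursion available
        "Z": (flags >> 6) & 1,
        "AD": (flags >> 5) & 1,
        "CD": (flags >> 4) & 1,
        "RCODE": RCODES.get(rcode, str(rcode)),
    }


# Flags values copied from the log excerpts in this question.
for value in (0x8583, 0x8180, 0x8182):
    print(hex(value), decode_flags(value))
```

Decoded this way, 0x8583 is an authoritative NXDOMAIN from the local zone, 0x8180 is a successful recursive answer, and 0x8182 is a SERVFAIL with TC = 0, so the failing reply was not flagged as truncated.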

I discovered RAS had received an IP address that was then being reported as a DNS Name Server. I have modified those settings to remove that IP address, but the problem remains.

Below is a snapshot of the DNS properties for the Forward Zone of [domain].local:

[Screenshot: Forward Zone of [domain].local]

Below is a snapshot of the DNS server properties:

[Screenshot: DNS server properties]


Solution 1:

Based on the fact that disabling your DNS forwarders and using only the root hint servers eliminated the problem, it is reasonable to believe the problem is related to your forwarders. Your extensive search for a misconfiguration of your DNS server has turned up nothing. A clear explanation of why you're experiencing this problem doesn't seem forthcoming, so you may need to go with what works.

That said, you have a couple of options:

  1. Continue using the root hint servers exclusively. While using DNS Forwarders can potentially provide faster lookup times (e.g. due to being "closer" to your network and having cached records of popularly accessed sites), there's nothing wrong with using the root hints.
  2. Try different forwarders. You could use Google's DNS servers (8.8.8.8 and 8.8.4.4), Verisign's Public DNS servers (64.6.64.6 and 64.6.65.6), or pick one from a list of public DNS servers.

Solution 2:

I made a change a few months ago and wanted to confirm that it worked before answering my own question. It turns out that the problem was not the DNS server; it was the firewall. We use a Cisco ASA 5500, and it was not set up to allow EDNS0 (Extension Mechanisms for DNS). We used the workaround described in the article to resolve the problem. The idea is to raise the “Maximum Packet Length” that the firewall permits for DNS packets from 512 to 4096 bytes. Apparently, the .gov servers use DNS extensions, so their responses can be larger than 512 bytes. We haven't had issues since, and I intend to change the DNS settings back to the IP addresses of our ISP's DNS servers in the near future.
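
For anyone who wants to check whether a device on the path handles EDNS0, here is a rough Python sketch that builds the same A query with and without an OPT record advertising a 4096-byte UDP payload (RFC 6891). It uses only the standard library. The function names, the fixed query ID, and the `ask` helper are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import socket
import struct


def encode_name(name):
    """Encode a host name as length-prefixed DNS labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=1, xid=0x1234, edns_payload=None):
    """Build a recursive query; add an EDNS0 OPT record if edns_payload is set."""
    arcount = 1 if edns_payload else 0
    # Header: ID, flags (RD set), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    packet = struct.pack("!HHHHHH", xid, 0x0100, 1, 0, 0, arcount)
    packet += encode_name(name) + struct.pack("!HH", qtype, 1)
    if edns_payload:
        # OPT pseudo-record: root name, TYPE 41, CLASS = UDP payload size,
        # TTL = extended RCODE/version/flags (all zero), RDLEN 0.
        packet += b"\x00" + struct.pack("!HHIH", 41, edns_payload, 0, 0)
    return packet


def ask(server, packet, timeout=2.0):
    """Send one UDP query; return (reply length, RCODE) or raise on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 53))
        data, _ = sock.recvfrom(4096)
    flags = struct.unpack("!H", data[2:4])[0]
    return len(data), flags & 0xF


plain = build_query("www.fda.gov")
with_edns = build_query("www.fda.gov", edns_payload=4096)
print(len(plain))      # 29, the same length nslookup reported for this query
print(len(with_edns))  # 40: the OPT record adds 11 bytes
```

Calling `ask(server, plain)` and `ask(server, with_edns)` against the same resolver from behind the firewall gives a simple comparison: if the plain query is answered while the EDNS0 one times out, something on the path may be dropping EDNS0 queries or replies larger than 512 bytes. That is only a hint, since other causes can produce the same symptom.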