named/bind is refusing to serve some domains after resolving them itself

Why is bind refusing some of my queries? This only happens for certain domains.

A query through named fails:

$ dig -t A fedoraproject.org @127.0.0.1
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 33117

$ journalctl -n10
...
Aug 01 17:07:11 ns3.r3.mclarkdev.com named[10807]: resolver priming query complete
Aug 01 17:09:57 ns3.r3.mclarkdev.com named[10807]: timed out resolving 'fedoraproject.org/DNSKEY/IN': 8.8.8.8#53
Aug 01 17:09:59 ns3.r3.mclarkdev.com named[10807]: timed out resolving 'fedoraproject.org/DNSKEY/IN': 8.8.8.8#53

However, a direct query to the forwarder works:

$ dig -t A fedoraproject.org @8.8.8.8
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42249

  ... records ...

BIND is using a nearly default configuration.
The only changes I made were allowing queries from anywhere and adding a zones file that serves some local records.

options {
    listen-on port 53 { any; };
    allow-query     { any; };
    forwarders      { 8.8.8.8; };
    recursion yes;
    ...
    dnssec-enable yes;
    dnssec-validation yes; // also tried auto
};

...

// includes two additional `zone` definitions
include "/opt/dns/named.zones";

OS Version: CentOS Linux release 8.4.2105
Kernel Version: 4.18.0-305.10.2.el8_4.x86_64
Named Version: BIND 9.11.26-RedHat-9.11.26-4.el8_4

Watching tcpdump, I can see that named is reaching out to the forwarder and retrieving the A records, but is refusing to serve them to the client after doing some additional queries.

localhost.49683 > localhost.domain: 14274+ A? fedoraproject.org. (35)
ns3.r3.mclarkdev.com.56668 > 8.8.8.8.domain: 21852+% [1au] A? fedoraproject.org. (58)
localhost.39587 > localhost.domain: 53253+ PTR? 8.8.8.8.in-addr.arpa. (38)
ns3.r3.mclarkdev.com.55378 > 8.8.8.8.domain: 61019+% [1au] PTR? 8.8.8.8.in-addr.arpa. (61)
8.8.8.8.domain > ns3.r3.mclarkdev.com.56668: 21852$ 12/0/1 fedoraproject.org. A 140.211.169.206, fedoraproject.org. A 152.19.134.198, fedoraproject.org. A 8.43.85.73, fedoraproject.org. A 152.19.134.142, fedoraproject.org. A 38.145.60.21, fedoraproject.org. A 140.211.169.196, fedoraproject.org. A 209.132.190.2, fedoraproject.org. A 8.43.85.67, fedoraproject.org. A 67.219.144.68, fedoraproject.org. A 38.145.60.20, fedoraproject.org. RRSIG, fedoraproject.org. RRSIG (528)
  /\ bind has the A records

ns3.r3.mclarkdev.com.52120 > 8.8.8.8.domain: 7073+% [1au] DNSKEY? fedoraproject.org. (58)
8.8.8.8.domain > ns3.r3.mclarkdev.com.55378: 61019 1/0/1 8.8.8.8.in-addr.arpa. PTR dns.google. (73)
ns3.r3.mclarkdev.com.55309 > 8.8.8.8.domain: 23607+% [1au] DS? 8.in-addr.arpa. (55)
localhost.48388 > localhost.domain: 55328+ PTR? 201.23.16.172.in-addr.arpa. (44)
  /\ bind makes some extra queries

localhost.domain > localhost.48388: 55328 NXDomain* 0/1/0 (98)
  /\ bind serves NXDomain to client

Why is named refusing to serve the result to the client? It happens only for about 1% of domains.


The tcpdump shows that named successfully gets the A records for fedoraproject.org, but it then also asks for the DNSKEY record, which it needs for DNSSEC validation. That query never gets a response, which matches the "timed out resolving 'fedoraproject.org/DNSKEY/IN'" lines in your log.

I queried 8.8.8.8 for this DNSKEY record, and it worked fine.

$ dig fedoraproject.org dnskey @8.8.8.8

; <<>> DiG 9.10.6 <<>> fedoraproject.org dnskey @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63666
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;fedoraproject.org.     IN  DNSKEY

;; ANSWER SECTION:
fedoraproject.org.  108 IN  DNSKEY  256 3 5 AwEAAcCWNQWl5pCI3iOOP2r8nStL60Zjb/2JQLQytamVap0L44z0YWft u7pu0hx3cnIM1ejQOsEwbg2/10IyC+38cYqJDXbSdFg1zGztOS5xNz7r 9hzSRK5N2jkycdJ/BoByJ4Y+XGpDqfG4I97++8sIzSrw60TmGAKTvM9v iL3ByeCN
fedoraproject.org.  108 IN  DNSKEY  257 3 5 AwEAAdTXJc0joiKGfTvLXi+LXxGpKvPvOoJEst9PR8TCCvXGVp7h3BY3 uXLkjckuT0aopCp2KF8zHgNgpMK03p1fd94pn9JZSuxfqvKsiYH2KvNO a/655oPj06jRhqAP5grX01Iz4BH411ZhGxIQ1BzZtOr1wAazojMJzLUg ChRJs8GVt3LU0e6T8z1RQF33Dt9UMHIR5EAsFAqfZ/tsbfJDYktGoZi3 nFlW7A745+ObM1LNXOWq3FcYPVzhH08Q7/7WpxmzM6/ET8VeqWIsvh8E nZNDNMfJyPbY9B1BOIrFCpE03ALgFMejaBZwmeQaX+D4Duup5xGOmdtC O4GSpM1YH6c=
fedoraproject.org.  108 IN  DNSKEY  257 3 14 7ttmhus8JD56ybsvMVZVsXa3U2R+2+WmOPIP7BU6t2LicosMZ2Ju3pfv ijsa5LvBvVCB4xVtLSqEdLSvW4vJPLSAB2uyJwHPJMezh0SzGmVCImLU 6qDxsxjHqtZ76/Sf
fedoraproject.org.  108 IN  DNSKEY  256 3 14 04ZsDOgyzs3kJsJ4jEY3MYufkCOWm1OI8N4M+dlBOBmweln0TSaKfafH zNCkaPiVG4bdgdnrzwxmjpK5GQgsiB47np+I8850Ea3EJG5ORDl3f//l rr92HiYh5DxCNhkG

So I suspect something in your environment is blocking either the query or the response. It could be a firewall that filters certain DNS record types or drops large responses.
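
One caveat about my test above: the OPT line shows `flags:;`, so dig did not set the DNSSEC OK (DO) bit. A validating named does set it, so the answer it asks for also carries RRSIG records and is larger than the one I got. To narrow things down, you could repeat the lookup from the BIND host itself in a way that is closer to what named sends. The following is only a rough sketch: `probe_dnskey` is a name I made up, and the buffer sizes (4096 and 1232) are values I picked for the comparison, not something taken from your configuration.

```shell
#!/bin/sh
# Hypothetical helper: probe_dnskey ZONE SERVER
# Runs the same DNSKEY lookup three ways and prints the status dig reports.
#   1. UDP, DO bit set, large EDNS buffer (closest to a validating resolver)
#   2. UDP, DO bit set, small EDNS buffer (large answers come back truncated)
#   3. TCP, DO bit set (not affected by UDP size or fragmentation)
probe_dnskey() {
    zone=$1
    server=$2
    for opts in \
        "+dnssec +bufsize=4096 +ignore" \
        "+dnssec +bufsize=1232 +ignore" \
        "+dnssec +tcp"
    do
        # $opts is deliberately unquoted so it splits into separate dig flags.
        # +ignore stops dig from silently retrying a truncated answer over TCP.
        status=$(dig $opts +time=3 +tries=1 "$zone" DNSKEY "@$server" 2>/dev/null \
            | sed -n 's/.*status: \([A-Z]*\),.*/\1/p')
        # No status line at all means dig never received a reply.
        printf '%-34s %s\n' "$opts" "${status:-no response}"
    done
}
```

You would call it as `probe_dnskey fedoraproject.org 8.8.8.8`. If the large-buffer UDP line prints "no response" while the TCP line prints NOERROR, that points at large or fragmented UDP responses being dropped on the way back. If all three fail from that host, a filter on the record type or on DNSSEC-enabled queries seems more likely.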