Tell bind to ignore SOA domain check on forwarder zones
I have a weird issue with BIND.
Premise: I'm using BIND (version 9.16_11) installed on pfSense, but even so I can change almost anything in the BIND configuration.
I've configured a simple forward zone; the configuration looks something like this:
zone "dom001.my-domain.com" {
    type forward;
    forward only;
    forwarders { 192.168.29.10; };
};
Now, if I run nslookup for a host in this domain, I see an error. Example:
Non-authoritative answer:
Name: mail2.dom001.my-domain.com
Address: 192.168.210.126
** server can't find mail2.dom001.my-domain.com: SERVFAIL
The weird thing is that the answer is received (you can see the address in the response), yet I still see the SERVFAIL error.
Another weird thing: dig doesn't report any error:
; <<>> DiG 9.16.6 <<>> mail2.dom001.my-domain.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53129
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 3218b8a1b8f64565eb9bd6636124bf73640809a4347f3bcf (good)
;; QUESTION SECTION:
;mail2.dom001.my-domain.com. IN A
;; ANSWER SECTION:
mail2.dom001.my-domain.com. 30 IN A 192.168.210.126
;; Query time: 30 msec
;; SERVER: 172.16.0.2#53(172.16.0.2)
;; WHEN: Tue Aug 24 11:44:19 CEST 2021
;; MSG SIZE rcvd: 110
During these queries I see some 'warnings' in bind's logs:
Aug 24 10:42:58 named 19540 lame-servers: info: FORMERR resolving 'mail2.dom001.my-domain.com/AAAA/IN': 192.168.29.10#53
Aug 24 10:42:58 named 19540 resolver: notice: DNS format error from 192.168.29.10#53 resolving mail2.dom001.my-domain.com/AAAA for client 10.16.16.41#38299: Name cluster.local (SOA) not subdomain of zone dom001.my-domain.com -- invalid response
I've checked further and it seems that the issue is related to the SOA record returned by the forwarder server:
;; QUESTION SECTION:
;mail2.dom001.my-domain.com. IN SOA
;; ANSWER SECTION:
cluster.local. 30 IN SOA ns.dns.cluster.local. hostmaster.cluster.local. 1629766398 7200 1800 86400 30
In fact, the answer is cluster.local instead of dom001.my-domain.com.
This issue causes strange behavior depending on the OS used. For example, most Linux servers work fine, while some versions of Alpine Linux cannot resolve hostnames in that domain.
And even with the servers that work fine, bind's logs are full of errors due to this issue.
Unfortunately, I cannot control the forwarder server and change the SOA record.
My question is: how can I configure bind to ignore the SOA record from that forwarder and accept the answer even if the SOA is not consistent with the zone?
I know that's not the best solution, but I need to work around the misconfigured forwarder.
Thanks in advance for your help!
Solution 1:
I don't believe there are any options in BIND that will make it accept that answer, as it appears unrelated to the query.
Seeing that type of inconsistent answer is definitely not expected. If you truly want to (temporarily?) accept and pass on these answers (clients may not like them either, of course), you may have to look at other software that does not inspect the response contents in the same way.
(I suspect that dnsdist, being a proxy rather than a recursor, could do this for you.)
That said, I think I can somewhat clear up some of the confusion...
The nslookup situation comes from how nslookup sends two separate queries by default, one for A and one for AAAA.
The A query is successful and clearly had the relevant A record as the answer. The AAAA query was not successful: presumably there is no AAAA record, and because negative responses always come with the (supposedly) relevant SOA record, that response probably triggers the exact problem you described.
I expect that you can also reproduce the problem with dig just fine if you make it send the same query that failed, so you would need to send a query for AAAA to get the same failure that nslookup got for one of its two queries.
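To make the comparison concrete, here is a minimal sketch that builds the two dig command lines corresponding to the queries nslookup sends by default. The helper name dig_commands is hypothetical, and 172.16.0.2 is only assumed to be the BIND instance because it appears as the SERVER in the dig output from the question. The sketch only prints the commands; run them by hand and compare the status field of each response.

```python
import shlex


def dig_commands(name, server, rtypes=("A", "AAAA")):
    """Return one dig argument list per record type (hypothetical helper)."""
    # "dig @server name type" is the standard dig invocation form.
    return [["dig", f"@{server}", name, rtype] for rtype in rtypes]


# Host name and resolver address are taken from the question's output.
for cmd in dig_commands("mail2.dom001.my-domain.com", "172.16.0.2"):
    # Print a copy-pasteable command line; nothing is executed here.
    print(shlex.join(cmd))
```

If the explanation above is right, the A query should come back NOERROR as before, and the AAAA query is the one expected to fail.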
As for the behavior of the other nameserver, it's not really a case of "editing the SOA"; it's more likely some kind of logic bug in the nameserver software. It should not actually be possible to find a cluster.local record when looking up mail2.dom001.my-domain.com, since that is in a whole different branch of the tree.
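The "different branch of the tree" point can be illustrated with a small sketch of a subdomain test like the one named in the log message ("Name cluster.local (SOA) not subdomain of zone dom001.my-domain.com"). This is not BIND's actual implementation, only an assumed simplification: the function names are hypothetical, and it ignores escaped dots and other edge cases of real DNS name handling.

```python
def labels(name):
    """Split a DNS name into lowercase labels, ignoring a trailing dot."""
    return [label.lower() for label in name.rstrip(".").split(".") if label]


def is_subdomain(name, zone):
    """True if name is equal to zone or located below it in the DNS tree."""
    n, z = labels(name), labels(zone)
    # The rightmost labels of name must match all labels of zone.
    return len(n) >= len(z) and n[len(n) - len(z):] == z


zone = "dom001.my-domain.com"
# The queried host is inside the forward zone.
print(is_subdomain("mail2.dom001.my-domain.com.", zone))
# The SOA owner returned by the forwarder is not.
print(is_subdomain("cluster.local.", zone))
```

A check of this kind passes for the queried name but fails for the SOA owner name, which matches the "invalid response" that named logs for the AAAA query.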