BIND/DNS - dig +trace = Bad Referral and Bad Horizontal Referral

Solution 1:

Per @andrew-b's comment, this is usually due to a mismatch in delegation.

I came across the same error when a developer attempted a +trace lookup of a record along the lines of host.subdomain.example.org. The exact cause in your case will likely differ, but it will probably follow a similar theme.

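The cause in our case was a firewall rule that captures and redirects* DNS lookups sent to "unauthorised" servers. Each request instead reached our own DNS server, which then performed a recursive lookup. The client believed it was sending each successive query to a server on the Internet, but every query was actually answered by our internal server. Because that server returned the same NS set for subdomain.example.org no matter which address dig targeted, each response looked like a referral back to the same zone rather than one level further down, which dig reports as a horizontal referral.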
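To make the "horizontal" part concrete: a referral is normally expected to point at a zone strictly below the one the previous hop delegated. The following is a minimal Python sketch of that comparison, written from the behaviour seen in the output rather than taken from dig's source; the names `labels` and `classify_referral` are made up for illustration.

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    name = name.rstrip(".").lower()
    return name.split(".") if name else []  # the root zone "." has no labels


def classify_referral(current_zone, referred_zone):
    """Compare the zone a hop refers us to with the zone we were already in."""
    cur = labels(current_zone)
    ref = labels(referred_zone)
    if ref == cur:
        # Same zone again: no progress towards the answer.
        return "horizontal"
    if len(ref) > len(cur) and ref[len(ref) - len(cur):] == cur:
        # Strictly deeper in the tree: a normal delegation.
        return "downward"
    # Sideways to an unrelated zone, or back up the tree.
    return "bad"


if __name__ == "__main__":
    print(classify_referral(".", "subdomain.example.org."))
    print(classify_referral("subdomain.example.org.", "subdomain.example.org."))
```

In our case every intercepted query fell into the "horizontal" branch, since the internal server kept handing back the NS set for subdomain.example.org.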

The fix was to remind the developer that DNS requests are intercepted, and that they could test from a server whitelisted to bypass the DNS redirect rule.

The redacted error, as the developer received it:

tricky-desktop:~ tricky$ dig +trace host.subdomain.example.org

; <<>> DiG 9.8.3-P1 <<>> +trace host.subdomain.example.org
;; global options: +cmd
.           3600    IN  NS  g.root-servers.net.
.           3600    IN  NS  l.root-servers.net.
.           3600    IN  NS  j.root-servers.net.
.           3600    IN  NS  k.root-servers.net.
.           3600    IN  NS  b.root-servers.net.
.           3600    IN  NS  m.root-servers.net.
.           3600    IN  NS  d.root-servers.net.
.           3600    IN  NS  i.root-servers.net.
.           3600    IN  NS  e.root-servers.net.
.           3600    IN  NS  c.root-servers.net.
.           3600    IN  NS  h.root-servers.net.
.           3600    IN  NS  a.root-servers.net.
.           3600    IN  NS  f.root-servers.net.
;; Received 477 bytes from 192.168.1.2#53(192.168.1.2) in 87 ms

subdomain.example.org.  0   IN  NS  ns-outside-1.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-2.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-3.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-4.example.org.
;; Received 295 bytes from 199.43.133.53#53(199.43.133.53) in 14 ms

subdomain.example.org.  0   IN  NS  ns-outside-2.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-3.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-4.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-1.example.org.
;; BAD (HORIZONTAL) REFERRAL
;; Received 295 bytes from 199.43.135.53#53(199.43.135.53) in 5 ms

... 29 REPEATS REDACTED ...

subdomain.example.org.  0   IN  NS  ns-outside-4.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-1.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-2.example.org.
subdomain.example.org.  0   IN  NS  ns-outside-3.example.org.
;; BAD (HORIZONTAL) REFERRAL
dig: too many lookups
tricky-desktop:~ tricky$
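If you have a long trace like this and want to see the pattern at a glance, it can be boiled down to one entry per hop. The snippet below is a rough Python sketch that assumes the output layout shown above (NS lines, an optional ";; BAD" line, then a ";; Received ... from address#port" line); `summarize_trace` is a hypothetical helper, not part of dig or BIND.

```python
import re

# Matches the per-hop footer, e.g.
# ";; Received 295 bytes from 199.43.133.53#53(199.43.133.53) in 14 ms"
RECEIVED = re.compile(r"^;; Received \d+ bytes from ([^#\s]+)#")


def summarize_trace(text):
    """Return one dict per hop: NS owner names, responding server, BAD flag."""
    hops = []
    zones, flagged = set(), False
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[2] == "IN" and fields[3] == "NS":
            zones.add(fields[0].lower())      # owner name of the NS record
        elif line.startswith(";; BAD"):
            flagged = True                    # dig marked this hop as a bad referral
        else:
            match = RECEIVED.match(line)
            if match:
                hops.append({"zones": sorted(zones),
                             "server": match.group(1),
                             "flagged": flagged})
                zones, flagged = set(), False
    if zones:
        # Last hop was cut short (e.g. "dig: too many lookups"), so no footer.
        hops.append({"zones": sorted(zones), "server": None, "flagged": flagged})
    return hops
```

Run against the trace above, the second and third hops list the same zone, subdomain.example.org., while claiming to come from different server addresses. Different "servers" returning an identical referral for the zone you are already in is a strong hint that something in the path is answering on their behalf.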

* The firewall rule was originally needed because BYOD staff could not look up private internal services after "Smart DNS" services changed their DNS configuration.