Search and Domain in resolv.conf = slow lookups on Ubuntu

I have a machine running Ubuntu 10.04 Server. I've started getting long (5-10 second) delays when making connections to (some) sites outside of the LAN using tools like curl and wget.

Using tcpdump and Wireshark, I've found the problem to be in the DNS lookups that are done to set up the connection:

EXAMPLE

When I run:

wget www.site1.com

I see the following behavior:

LOOKUP: AAAA www.site1.com       
        # => fail, no delay, site1 doesn't have an IPv6 AAAA record
LOOKUP: AAAA www.site1.com.mydomain.lan
        # => fail, BIG DELAY, crazy domain doesn't exist
LOOKUP: A www.site1.com
        # => success, no delay, resolves as expected (site1 has IPv4 A record)
CONNECTION PROCEEDS ...

MY SETUP

My server's resolv.conf looks like this:

nameserver 192.168.0.1  # my router
domain mydomain.lan    # made up domain name, for my lan
search mydomain.lan

My server's hosts file looks like this:

127.0.0.1       localhost.localdomain   localhost
192.168.0.10    server1.mydomain.lan   server1
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

RESOLUTIONS?

Why is my resolv.conf search list being used in constructing the name for the 2nd lookup, when the resolv.conf man page suggests that it is only used when looking up host names (no dots):

"Resolver queries having fewer than ndots dots (default is 1) in them will be attempted using each component of the search path in turn until a match is found."

I am under the impression that the 2nd lookup is erroneous and should not be performed at all...

If I remove the domain and search lines from resolv.conf, the 2nd lookup is no longer done and my delays go away.

(Also, if I force wget to use only IPv4, the AAAA lookups aren't done and the delays disappear:)

wget --inet4-only www.site1.com

This behavior is by design.

IPv6 is being preferred - so the status of the resource in AAAA terms is determined first. An NXDOMAIN response comes back - so the client figures it needs to append the search path.

Note that the ndots remark you've made is correct - but not the whole story. If the ndots value is higher than the number of dots in the name being queried (as it would be for a single-label name), the only difference to the query behavior is that the query with the suffix appended will occur before the raw name is attempted. Since you're over the ndots threshold, the name is tried first, as provided. See further down in the man page:

"The default for n is 1, meaning that if there are any dots in a name, the name will be tried first as an absolute name before any search list elements are appended to it."

That query failed, so the search list must be used. Note the difference in query behavior with wget http://site1/.
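To make the ordering concrete, here is a minimal Python sketch of the rule described in the man page excerpts. It is only an illustration of the documented behavior, not the resolver's actual code; the function name candidate_names is made up for this example, and it ignores details such as IPv4/IPv6 ordering and the limits on search list length.

```python
def candidate_names(name, search, ndots=1):
    """Return the names a stub resolver would try, in order.

    Simplified model of the resolv.conf(5) description:
    - a name ending in "." is absolute and is tried only as given;
    - a name with at least `ndots` dots is tried as given first,
      then with each search domain appended;
    - a name with fewer dots gets the search domains first,
      and is tried as given last.
    """
    if name.endswith("."):
        return [name]
    with_suffix = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + with_suffix
    return with_suffix + [name]


# The case from the question: two dots, so the raw name goes first.
print(candidate_names("www.site1.com", ["mydomain.lan"]))
# A single-label name: the search domain goes first.
print(candidate_names("site1", ["mydomain.lan"]))
```

In this model, www.site1.com is tried as provided and only then as www.site1.com.mydomain.lan, which matches the order seen in the capture. A trailing dot (www.site1.com.) suppresses the search list entirely.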


What you're seeing is intended behavior - what I think you need to fix is the confluence of factors that's causing this slow lookup.

  • Fix your DNS server, or fix the upstream it's recursing from. A recurser should easily cache the NXDOMAIN it gets from the roots when it tries to look up a non-existent TLD. Since turning off IPv6 fixes it, you may have a DNS server in the path that's failing miserably at caching when AAAA lookups are involved. Try changing your resolver to 8.8.8.8 to verify.
  • Stop adding a search path for a DNS zone that you apparently can't do lookups for. If your DNS server were authoritative for that zone (which is what it would need to be for that search setting to be of any use, since it's not a valid name in the public hierarchy), it would respond immediately. You probably don't need that search configuration - but set it to something that will resolve, so that it doesn't try to guess it from the machine's hostname. search com should do nicely.
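After changing the nameserver or the search line, one rough way to check whether the delay is gone is to time lookups per address family through the system resolver. The sketch below uses Python's socket.getaddrinfo for that; the helper name time_lookup is made up for this example, and the numbers it produces will depend on your resolver configuration and any caching along the way.

```python
import socket
import time


def time_lookup(host, family):
    """Time one getaddrinfo() call restricted to the given address family.

    Returns (elapsed_seconds, outcome). A failed lookup is reported in
    the outcome string rather than raised, since a slow failure is
    exactly what we want to measure.
    """
    start = time.monotonic()
    try:
        socket.getaddrinfo(host, 80, family, socket.SOCK_STREAM)
        outcome = "ok"
    except socket.gaierror as exc:
        outcome = f"failed ({exc})"
    return time.monotonic() - start, outcome


def report(host):
    """Print lookup timings for IPv4-only, IPv6-only and unrestricted queries."""
    families = (
        ("IPv4 only", socket.AF_INET),
        ("IPv6 only", socket.AF_INET6),
        ("any", socket.AF_UNSPEC),
    )
    for label, family in families:
        elapsed, outcome = time_lookup(host, family)
        print(f"{label:10s} {elapsed:6.2f}s  {outcome}")
```

Calling report("www.site1.com") before and after the change should show whether the IPv6 lookup is still the slow one. Comparing that with report("www.site1.com.") (trailing dot, so no search list) helps separate the search-path effect from a slow upstream server.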