How is DNS used by individual processes?

When resolving FQDNs or machine names to IP addresses on my local network (mycompany.internal) I can use dig on the command line (linux/mac) or nslookup (windows) to query the configured server and get a response. But trying to enter the FQDN or even just the machine name in a ping command or in a web browser results in 'Unknown Host' or DNS errors. Here's a sample, this one from the Mac:

mac:~ atroon$ dig server.mycompany.internal


; <<>> DiG 9.6.0-APPLE-P2 <<>> server.mycompany.internal
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5219
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;server.mycompany.internal.  IN A

;; ANSWER SECTION:
server.mycompany.internal. 1200 IN A 172.16.254.36

;; Query time: 0 msec
;; SERVER: 172.16.254.8#53(172.16.254.8)
;; WHEN: Wed Dec 16 11:39:15 2009
;; MSG SIZE  rcvd: 55

mac:~ atroon$ ping server.mycompany.internal
ping: cannot resolve server.mycompany.internal: Unknown host

I cannot for the life of me figure this one out. The DNS server is a SBS 2003 box which handles AD, some file/print, etc for a small company network. This issue happens to me about three times a week, and when I'm connected to the local network directly, the same switch as the server even. I can make any connection I want with IP addresses, I just can't make DNS work. Additionally, at the same time I'm experiencing this, other users are fine, which makes me think it's a problem on my Mac. But what sort of problem? How can dig send a query and get a reply, and ping say 'unknown host'?

I'm posting here vs. serverfault because I think this is a local problem not a server problem...but if anyone can point me at the server, I guess we'll head down the street a domain or two.


Solution 1:

The way the system handles DNS depends on which version of Mac OS X you're using.

Essentially there are two DNS resolving mechanisms in Mac OS X: the standard UNIX approach (/etc/resolv.conf), which is used by dig, and the approach used by the rest of the system.

In Mac OS X 10.4 and 10.5 the two approaches were much more tightly tied together; refreshing one tended to refresh them both. However, in 10.6 (and to a much lesser extent in 10.5) it's possible for dig to give you the right value while the system resolving mechanism still has a bad value.

To flush the DNS cache for each of the versions of Mac OS X:

  • 10.4: lookupd -flushcache
  • 10.5: dscacheutil -flushcache
  • 10.6: sudo dscacheutil -flushcache or sudo killall -HUP mDNSResponder (The first command should perform the second command for you now but in earlier versions of 10.6 it didn't appear to)
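
If you want to script the list above, a small helper can pick the flush command from the OS version. This is only a sketch: the function name flush_cmd_for_version is made up for illustration, and the mapping covers just the three releases listed here.

```shell
#!/bin/sh
# Print the DNS cache flush command for a given Mac OS X version string.
# flush_cmd_for_version is a hypothetical helper; the mapping mirrors the list above.
flush_cmd_for_version() {
    case "$1" in
        10.4|10.4.*) echo "lookupd -flushcache" ;;
        10.5|10.5.*) echo "dscacheutil -flushcache" ;;
        10.6|10.6.*) echo "sudo dscacheutil -flushcache" ;;
        *) echo "unknown version: $1" >&2; return 1 ;;
    esac
}

# Example (on a Mac): flush_cmd_for_version "$(sw_vers -productVersion)"
```

The helper only prints the command instead of running it, so you can check what it picked before you flush anything.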

If I recall correctly, ping uses the system lookup, so it goes through a different resolving mechanism than dig. /etc/resolv.conf will always use the DNS servers in order, whereas mDNSResponder tries to be 'smart', which can bite you in the rear depending on your setup.
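
One way to see the two mechanisms disagree is to ask both for the same name. A rough sketch, assuming dscacheutil -q host prints its answers as "ip_address: ..." lines (that's what I'd expect on 10.5/10.6, but check on your machine); system_ips and compare_resolvers are names I made up.

```shell
#!/bin/sh
# Pull the addresses out of `dscacheutil -q host` output read from stdin.
# Assumes lines of the form "ip_address: 172.16.254.36".
system_ips() {
    awk '$1 == "ip_address:" { print $2 }'
}

# Compare what dig sees with what the system resolver sees for one name.
# compare_resolvers is a hypothetical helper, not a standard tool.
compare_resolvers() {
    name="$1"
    echo "dig:    $(dig +short "$name" | tr '\n' ' ')"
    echo "system: $(dscacheutil -q host -a name "$name" | system_ips | tr '\n' ' ')"
}

# Example (on a Mac): compare_resolvers server.mycompany.internal
```

If the dig line shows an address and the system line is empty, you're looking at the same situation as the dig/ping mismatch in the question.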

Also, do you have multiple DNS servers specified on your Mac and/or via DHCP? Snow Leopard has introduced a different behaviour (bug?) where the order of the DNS servers will change. This plays havoc with split DNS (internally you use one IP, but externally a different one), as there are times it will stop asking the internal DNS server first and go to the second server in line (the external one) instead. It's supposedly a method to contact the fastest DNS server to avoid DNS-related delays. The easiest fix prior to 10.6.3 is to only serve the internal DNS server via DHCP and make sure your forwarding settings on the DNS server are set accordingly.

As of 10.6.3 it's possible to tell mDNSResponder to always use the proper order and not try to optimize DNS request times. You can do this by adding the key StrictUnicastOrdering, set to true, to mDNSResponder's Launch Daemon plist (and reloading it as necessary).

In Mac OS X v10.6, the default DNS server searching behavior is that when a server does not return a result (returning SERV_FAIL for a query), and other servers are available to query, the server is temporarily disabled in the search order for about thirty seconds. If there is more than one server for the query and all of them have returned SERV_FAIL, the servers will be queried in the order that they were disabled (that is, the server that has been disabled the longest will be used first).

(Source: support.apple.com and thanks to Yar who put this up before I did.)

You can automate this (a little faster and easier than Apple's commands) by running the following command:

sudo /usr/libexec/PlistBuddy -c "Add :StrictUnicastOrdering bool true" /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist

and reverse it by running:

sudo /usr/libexec/PlistBuddy -c "Delete :StrictUnicastOrdering" /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist

After either change, you'll need to reload the job in launchd to restart mDNSResponder by running:

sudo launchctl unload /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
and then
sudo launchctl load /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
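
If you'd rather see what's about to happen before touching a system plist, the same steps can go behind a dry-run switch. This is just a sketch; run, set_strict_ordering and the DRY_RUN variable are my own names, not anything Apple ships.

```shell
#!/bin/sh
# Hypothetical wrapper around the commands above. With DRY_RUN=1 (the default
# here) it only prints what it would run; set DRY_RUN=0 to execute for real.
PLIST=/System/Library/LaunchDaemons/com.apple.mDNSResponder.plist

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

set_strict_ordering() {
    run sudo /usr/libexec/PlistBuddy -c "Add :StrictUnicastOrdering bool true" "$PLIST"
    run sudo launchctl unload "$PLIST"
    run sudo launchctl load "$PLIST"
}

# Example: DRY_RUN=1 set_strict_ordering
```

Defaulting to dry-run is deliberate, since a typo in a LaunchDaemon plist is not something you want to find out about after the reload.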

Solution 2:

Finally, a fix thanks to 10.6.3 and a little fiddling. Basically you modify com.apple.mDNSResponder.plist and then restart mDNSResponder.

I could be wrong, but I think that the instructions are off, and should say sudo cp where they say sudo mv at the beginning.

Solution 3:

Look at what's in /etc/resolv.conf to see which nameservers your Mac is using. You can also add nameservers in the network preferences -- choose the adaptor you're using, click the "Advanced..." button and then click on the "DNS" tab.

On a Mac the command line tool to flush the DNS cache is:

dscacheutil -flushcache

Update:

I found lots of good stuff in this thread on discussions.apple.com. For example:

dig(1) (and host(1) and nslookup(1)) all directly use the DNS resolver and as such the DNS server ordering as present in /etc/resolv.conf.

However, ping(8) uses the internal Mac OS X name resolution system which uses a "super DNS search client" which uses the results that are listable via scutil --dns to order queries.
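
To compare the two orderings side by side, you can pull the nameserver lists out of both sources. A sketch only, assuming scutil --dns prints entries like "nameserver[0] : 172.16.254.8"; resolv_conf_servers and scutil_servers are hypothetical helper names.

```shell
#!/bin/sh
# List nameservers in the order given by resolv.conf-style text on stdin.
resolv_conf_servers() {
    awk '$1 == "nameserver" { print $2 }'
}

# List nameservers in the order shown by `scutil --dns` output on stdin.
# Assumes lines of the form "  nameserver[0] : 172.16.254.8".
scutil_servers() {
    awk '$1 ~ /^nameserver\[[0-9]+\]$/ && $2 == ":" { print $3 }'
}

# Example (on a Mac):
#   resolv_conf_servers < /etc/resolv.conf
#   scutil --dns | scutil_servers
```

If the two lists come back in a different order, dig and ping may well be asking different servers first.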

You can see into what has been cached by running sudo killall -INFO mDNSResponder and then looking in /var/log/system.log