Should I run my own DNS recursor or local cache daemon?

Solution 1:

nscd does more than just caching DNS requests; it also caches lookups for usernames and groups along with some other less common uses. It's standard on Linux systems (it's packaged as part of glibc) and is probably already installed, and it uses very little memory, so there's no reason not to run it. It will provide good caching behavior without needing any further configuration.

Since EC2 charges for external traffic, and traffic to 8.8.8.8 (the Google resolver) is going to be much slower than traffic internal to the datacenter, you should prefer EC2 DNS unless you have a very specific reason not to. You can set up the Google DNS (8.8.8.8 and 8.8.4.4) as backups for the Amazon DNS if you like, but it's very unlikely that they'll be down when the rest of the zone is working.

My recommendations for your EC2 virtual machines:

  • Use nscd, which should be set up by default (/usr/sbin/nscd; you should check your distribution's run configuration to make sure the service is started at boot).
  • Use the Amazon DNS servers as your defaults.
  • Add the Google servers as backups if you like. How you do this will vary based on your distribution. If you're not sure, check /etc/resolv.conf, which is the file that glibc (nscd) looks at, and there will usually be a comment telling you how it was configured. Servers are checked in the order they're listed in resolv.conf, so adding the Amazon IPs first and then the Google IPs will let nscd fall back to Google if for some reason Amazon isn't working.
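
If you want to confirm the order the resolver will try servers in, you can read the nameserver lines back out of the file. Here's a minimal Python sketch of that, not an official tool: the function name is made up for illustration, and 10.0.0.2 is only a placeholder for whatever Amazon resolver address your instances are actually given.

```python
def nameservers_in_order(resolv_conf_text):
    """Return the nameserver addresses in the order they are listed.

    Lines starting with '#' or ';' are comments (see resolv.conf(5)).
    Note that glibc only uses the first three nameserver entries.
    """
    servers = []
    for line in resolv_conf_text.splitlines():
        if line.startswith(("#", ";")):
            continue  # comment line
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical file contents: Amazon resolver first (placeholder address),
# then the two Google resolvers as fallbacks.
SAMPLE = """\
# managed by your distribution's network tooling
nameserver 10.0.0.2
nameserver 8.8.8.8
nameserver 8.8.4.4
"""

if __name__ == "__main__":
    for position, server in enumerate(nameservers_in_order(SAMPLE), start=1):
        print(position, server)
```

On a real machine you would pass in the contents of /etc/resolv.conf instead of SAMPLE.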

Sources: man pages for nscd(8) and resolv.conf(5)

Solution 2:

Install dnsmasq or dnscache on three or more machines in your network. I'd recommend using AWS VPCs for the whole infrastructure, but that is a somewhat separate issue.

Point all your hosts to these three nameservers.

Configure your resolv.conf with the following:

nameserver IP_ADDRESS_1
nameserver IP_ADDRESS_2
nameserver IP_ADDRESS_3
options rotate 
options timeout:1

The above setup has many advantages. First, you have resiliency at the recursive nameserver level by having a minimum of three hosts. Second, you gain the benefits of caching: when one server does a lookup against IP_ADDRESS_1 for the first time, the nameserver on IP_ADDRESS_1 will cache the result, and when another server does the same lookup, the result will be returned much faster on a cache hit. Third, by setting the rotate option you balance the load across your recursive DNS infrastructure. Finally, by setting timeout:1 you minimize the impact of having one of your DNS servers down for maintenance.
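
To make the rotate and timeout points concrete, here is a toy Python model of the failover behavior. It is a deliberate simplification and not how any particular resolver library is implemented: it assumes a strict round-robin starting server per query when rotate is on, and it charges exactly one timeout for every dead server that gets tried. The function name and the server labels are hypothetical.

```python
from collections import Counter


def simulate(nameservers, queries, rotate=True, timeout=1, down=frozenset()):
    """Toy model: count which server answers each query and how many
    seconds are spent waiting on servers that are down."""
    answered = Counter()
    seconds_wasted = 0
    for q in range(queries):
        # With rotate, each query starts at the next server in the list;
        # without it, every query starts at the first server.
        start = q % len(nameservers) if rotate else 0
        for step in range(len(nameservers)):
            server = nameservers[(start + step) % len(nameservers)]
            if server in down:
                seconds_wasted += timeout  # wait out the timeout, then move on
                continue
            answered[server] += 1
            break
    return answered, seconds_wasted


if __name__ == "__main__":
    servers = ["IP_ADDRESS_1", "IP_ADDRESS_2", "IP_ADDRESS_3"]
    # First server down for maintenance, six queries in each case.
    print(simulate(servers, 6, rotate=False, timeout=5, down={"IP_ADDRESS_1"}))
    print(simulate(servers, 6, rotate=True, timeout=1, down={"IP_ADDRESS_1"}))
```

In this model, with the first server down, the no-rotate case with a 5 second timeout stalls on every query, while rotate with timeout:1 only stalls on the queries that happen to start at the dead server, and for a much shorter time.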

Solution 3:

Ubuntu installs dnsmasq by default, and it should provide a reasonably secure and fast way to set up a DNS cache, without any drawbacks.

More details at https://unix.stackexchange.com/a/59424

Solution 4:

The GoDaddy article you linked to outlines the problems of running an open recursive nameserver. Indeed, that would be a can of worms, and you wouldn't want to do it. As long as your recursor listens only on loopback or on an internal interface, and/or is firewalled so that no one else can reach it, the article doesn't apply.
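
As a rough sanity check on the "loopback or internal only" point, you can classify the address your recursor is configured to listen on. This is only a sketch using Python's standard ipaddress module; the function name is made up, and it looks at the address alone, so it says nothing about whether a firewall or security group is in front of it.

```python
import ipaddress


def is_internal_only(bind_address):
    """Return True if the address is loopback or in a private range.

    A wildcard address (0.0.0.0 or ::) means "all interfaces", so it is
    treated as NOT internal-only even though it is not a public address.
    """
    addr = ipaddress.ip_address(bind_address)
    if addr.is_unspecified:
        return False
    return addr.is_loopback or addr.is_private


if __name__ == "__main__":
    # Example listen addresses; 10.0.0.5 is a hypothetical internal IP.
    for candidate in ["127.0.0.1", "10.0.0.5", "0.0.0.0", "8.8.8.8"]:
        print(candidate, is_internal_only(candidate))
```

If the listen address is a wildcard or a public IP, the firewall is the only thing keeping the recursor from being open, so check that too.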

Your line of thinking is excellent, and all the options you're considering are great. If you trust either EC2's or Google's recursor, by all means go ahead.

Indeed, it is quite common for mid-to-large-sized organizations to run their own recursors.

For performance, I would install a pair of recursors in each availability zone, and configure them to be the first two nameservers in /etc/resolv.conf, then append the EC2 recursor. This way, you can be sure that

Installing your own recursor ensures minimal latency (as opposed to going to 8.8.8.8) and means your cache is not shared with others (which has both pros and cons).

For a modern, well-maintained, lightweight, and high-performance recursor, I would highly recommend Unbound (see an independent recommendation here: http://info.menandmice.com/blog/bid/37244/10-Reasons-to-use-Unbound-DNS).