Some DHCP clients end up with wrong DNS server
The scenario:
- DC running Windows Server 2008 R2 providing DNS + DHCP
- Cisco 1811 Router as the gateway
- 30 Windows XP DHCP clients on the LAN
The problem:
- Some workstations are spontaneously switching to an incorrect DNS server. Specifically, ipconfig /all shows that they start using the gateway as a DNS server.
- This happens about 5-10 times a day to various computers, sometimes more than once per day.
The workaround:
- Repairing the connection on the XP client always fixes the problem, and the correct DNS server address is obtained.
We lost our main DNS/DHCP machine a week ago, and had to bring this one online as a spare. We've been having this issue since then. DHCP leases on the old and new servers are configured for "wired" (8 day) duration. There are definitely no other DHCP servers active on the LAN. So far there is no discernible pattern about which clients will show this problem, or when.
When I ran DCDIAG /test:DNS it came back clean. Manual inspection of the DNS zone shows that all the records are appearing as expected, with no traces of the previous machine in there.
Update Feb 27: Added screenshots.
Here is a screenshot of the DHCP scope options on the 2008 R2 server. http://nicwaller.com/screens/dhcpscope.png
And here is a screenshot of ipconfig /all running on a healthy host. I don't have any ailing hosts at the moment, but will grab a screencap next time it happens.
http://nicwaller.com/screens/ipconfigall.png
Update Feb 28: More screenshots.
Here's a screenshot of DHCP and DNS traffic from a healthy client when repairing the local area connection. There's definitely only one server responding, but it does seem strange that the negotiation takes place twice. I'll try to get a similar capture from a sick machine this coming week. http://nicwaller.com/screens/dhcprenew_screen.png
Update Mar 01: Caught a bad ipconfig.
Here's a screenshot of ipconfig /all from a client that had this issue. It says the lease was issued this morning, but it doesn't even have an entry for the secondary DNS I set up yesterday. Both DNS servers were discovered properly when repairing the connection.
http://nicwaller.com/screens/bad_dns.png
Update Mar 01: It even got the sysadmin!
This issue finally affected my personal workstation this morning. Unfortunately I had just rebooted and wasn't running a packet dump at the time. I set up a secondary server yesterday, and was logging all DNS traffic to it. My machine had not contacted the secondary DNS in over half an hour, so that says to me that it's just spontaneously reverting to the gateway without even failing over to secondary DNS first.
Today I swapped the order of the DNS servers in DHCP, so the secondary is primary and vice versa. I will update again once I know how that goes.
Solution 1:
I would run a packet dump on a few of these boxes until it happens. See if you can find anything network related. Maybe you will see some packets that give you an idea if it is not that.
Can a group policy in Windows set the DNS server? Maybe a strange GP has somehow been applied on the domain.
Update:
I have never done this, but since it seems like you are getting a little desperate, what about blowing away the current DHCP database? These instructions say how to back up the mdb file, so maybe moving it somewhere else will make DHCP create a new one after restarting. That might fix the problem...
The thing that doesn't jibe in my mind :-) is why clients would be getting new information if their lease hasn't expired yet and they haven't rebooted... is this what is happening?
Solution 2:
Check your router to make sure that it isn't providing any sort of DHCP service. If you telnet into the router and it has lines in the configuration that start with "ip dhcp", then it is providing some sort of DHCP response.
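If you would rather not eyeball the whole running config, the same check can be scripted against a saved copy of it. This is only a rough sketch: the sample config text and the function name find_dhcp_lines are made up for illustration, and it only looks for the "ip dhcp" prefix described above.

```python
def find_dhcp_lines(config_text):
    """Return the config lines that begin with 'ip dhcp' (leading spaces ignored)."""
    return [
        line.strip()
        for line in config_text.splitlines()
        if line.strip().lower().startswith("ip dhcp")
    ]


# Hypothetical excerpt of a saved router config, not taken from the poster's router.
sample_config = """\
hostname gw1811
ip dhcp excluded-address 192.168.1.1 192.168.1.50
ip dhcp pool LAN
 network 192.168.1.0 255.255.255.0
 dns-server 192.168.1.1
interface FastEthernet0
 ip address 192.168.1.1 255.255.255.0
"""

matches = find_dhcp_lines(sample_config)
if matches:
    print("Router config mentions DHCP:")
    for line in matches:
        print("  " + line)
else:
    print("No 'ip dhcp' lines found.")
```

Any hit is worth reading in context, since a pool on the router with its own dns-server entry would explain clients being handed the gateway as DNS.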
Solution 3:
Clients will change from primary to secondary DNS if the primary DNS doesn't respond in a timely manner, and they won't switch back until the lease is renewed. I think DNS will just fail if the secondary doesn't respond, i.e. I don't think they then switch to the gateway, but it's possible. This could be tested pretty easily.
The options that I can think of:
- The scope actually has the incorrect IP as a DNS option. The DHCP scope may be corrupt: delete and recreate it, or post a screenshot or export of the DHCP scope options.
- There is another DHCP server running. ipconfig /all lists the IP of the DHCP server the client obtained the lease from and the timestamp when it was obtained.
- The clients have one or more DNS servers set up under the static or Alternate Configuration settings.
- The clients are changing between wired and wireless and getting a different lease on the wireless network.
If ipconfig shows the unexpected DNS server and the IP of the DHCP server is the one you expect, then something on that IP is giving out the bad leases.
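To compare a lot of clients quickly, the relevant fields can be pulled out of saved ipconfig /all output. The sketch below is an assumption-heavy illustration: the sample text, the addresses (192.168.1.10 as the DC, 192.168.1.1 as the gateway) and the function names are all hypothetical, and it assumes English-language output with a single adapter.

```python
def parse_ipconfig(text):
    """Map each field label of 'ipconfig /all' output to a list of its values."""
    fields = {}
    last_label = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        if not raw[0].isspace():
            # Unindented lines are section headers such as "Ethernet adapter ...:".
            last_label = None
            continue
        # Pad so that a field with an empty value (line ending in " :") still splits.
        label, sep, value = (raw.rstrip() + " ").partition(" : ")
        if sep:
            last_label = label.strip().rstrip(" .")
            fields[last_label] = [value.strip()] if value.strip() else []
        elif last_label is not None:
            # Indented line without a label: extra value for the previous field,
            # which is how additional DNS servers are listed.
            fields[last_label].append(raw.strip())
    return fields


def check_lease(fields, expected_dhcp, expected_dns):
    """Return a list of human-readable problems found in the parsed fields."""
    problems = []
    dhcp = fields.get("DHCP Server", [])
    if dhcp != [expected_dhcp]:
        problems.append("unexpected DHCP server: %s" % dhcp)
    for ip in fields.get("DNS Servers", []):
        if ip not in expected_dns:
            problems.append("unexpected DNS server: %s" % ip)
    return problems


# Hypothetical output from an affected client (all addresses are made up).
sample = "\n".join([
    "Ethernet adapter Local Area Connection:",
    "",
    "        Dhcp Enabled. . . . . . . . . . . : Yes",
    "        IP Address. . . . . . . . . . . . : 192.168.1.101",
    "        Default Gateway . . . . . . . . . : 192.168.1.1",
    "        DHCP Server . . . . . . . . . . . : 192.168.1.10",
    "        DNS Servers . . . . . . . . . . . : 192.168.1.1",
    "        Lease Obtained. . . . . . . . . . : Monday, March 01, 2010 8:02:11 AM",
])

parsed = parse_ipconfig(sample)
for problem in check_lease(parsed, "192.168.1.10", ["192.168.1.10", "192.168.1.11"]):
    print(problem)
```

In this made-up case the DHCP server is the expected one but the DNS entry is the gateway, which is the combination that points at the server (or something answering on its IP) handing out bad options.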
As Kyle suggested, Wireshark/Netmon on the server will confirm whether the lease with the bad info is actually coming from that server.
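When reading such a capture, the two values to compare are DHCP option 54 (server identifier) and option 6 (DNS servers) in the offer or ACK. As a rough illustration of what the capture tool is decoding, here is a minimal sketch that walks the options field of a DHCP message per RFC 2132. The sample bytes and addresses are invented, and it assumes you have already sliced the options out of the packet (they start at byte 236 of the UDP payload, beginning with the magic cookie).

```python
import socket

MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the DHCP options field


def parse_dhcp_options(options):
    """Return {option_code: raw_bytes} for a DHCP options field."""
    if options[:4] != MAGIC_COOKIE:
        raise ValueError("missing DHCP magic cookie")
    result = {}
    i = 4
    while i < len(options):
        code = options[i]
        if code == 0:        # pad option: single byte, no length
            i += 1
            continue
        if code == 255:      # end option
            break
        if i + 1 >= len(options):
            break            # truncated option, stop parsing
        length = options[i + 1]
        result[code] = options[i + 2:i + 2 + length]
        i += 2 + length
    return result


def to_ips(data):
    """Decode a byte string holding one or more packed IPv4 addresses."""
    usable = len(data) - len(data) % 4
    return [socket.inet_ntoa(data[j:j + 4]) for j in range(0, usable, 4)]


# Hypothetical options from a DHCPACK: message type 5 (ACK),
# server identifier 192.168.1.10, DNS server 192.168.1.1.
sample_options = MAGIC_COOKIE + bytes([
    53, 1, 5,
    54, 4, 192, 168, 1, 10,
    6, 4, 192, 168, 1, 1,
    255,
])

opts = parse_dhcp_options(sample_options)
print("server identifier:", to_ips(opts.get(54, b"")))
print("DNS servers:", to_ips(opts.get(6, b"")))
```

If the server identifier in the ACK is the DC but option 6 carries the gateway, the bad data really is coming from that server's scope or server options.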