PXE boot failing with E11 (ARP timeout) on upgraded Linux server

I just physically replaced a VERY old (2005) IBM ThinkCentre NIS/DNS/DHCP/gateway server running Fedora Core 4 with a slightly less old Sun server running CentOS 6. The newer server has the same IP address as the old one, and the gateway/NIS/DNS functions are all running fine.

An IBM BladeCenter is behind the gateway, and we use PXE boot and a read-only root file system for the blades. I've tested TFTP from another host on the private net, and downloading pxelinux.0 from /tftpboot on the new server works fine too.

I copied dhcpd.conf directly from the old server to the new one. dhcpd starts normally, and when I try to boot a blade I see a DHCP request and a DHCP ACK in /var/log/messages, but after that the blade fails with PXE-E11 (ARP timeout).

I took the old server, gave it a new IP address on the private network, tweaked dhcpd.conf appropriately, and started dhcpd on it. Booting from the old server works fine again.

I then cloned another old IBM ThinkCentre running Fedora Core 4 as a DHCP server using the files from the original server and the blades boot fine from it too!

I've turned off iptables and ip6tables on the new server to no effect. SELinux is configured as permissive. Any ideas on what's going on would be appreciated.


Solution 1:

1) Reset the ARP tables on the new server by toggling ARP on the interface facing the blades (eth0 here):

ip link set arp off dev eth0
ip link set arp on dev eth0

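2) Get a Wireshark traffic capture on that interface while a blade boots and see what's really going on. In particular, check whether the blade's ARP requests reach the server and whether the server answers them.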
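
If Wireshark isn't installed on the server itself, a capture file can be made there with tcpdump and opened in Wireshark elsewhere. This is only a rough sketch: the function names, the default file name pxe-boot.pcap and the IFACE variable are made up for illustration, and eth0 is assumed to be the interface on the blade network, as in step 1.

```shell
#!/bin/sh
# Interface facing the blades. eth0 is an assumption carried over from step 1;
# override it with IFACE=ethX if the private network is on another NIC.
IFACE="${IFACE:-eth0}"

# Capture filter: ARP, DHCP (UDP 67/68) and TFTP (UDP 69), i.e. the traffic a
# PXE client sends between getting its lease and downloading pxelinux.0.
PXE_FILTER='arp or udp port 67 or udp port 68 or udp port 69'

show_neighbors() {
    # List the kernel's current ARP (neighbor) entries for the interface.
    ip neigh show dev "$IFACE"
}

capture_pxe() {
    # -w writes raw packets to a file that Wireshark can open.
    # The first argument is the output file (hypothetical default: pxe-boot.pcap).
    tcpdump -i "$IFACE" -w "${1:-pxe-boot.pcap}" "$PXE_FILTER"
}
```

Source the file as root, run capture_pxe, power on a blade, and press Ctrl-C once the E11 message appears. Running show_neighbors before and after the boot attempt shows whether the server ever learned the blade's MAC address. The capture can also be read on the server with tcpdump -n -e -r pxe-boot.pcap, which prints the MAC addresses on each packet.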