Failover cluster failed to fail over due to mysterious IP conflict?

The IP Address conflict error occurs when more than one node in a cluster attempts to bring a resource group (and its associated IP(s)) online at the same time.

This can happen if the cluster nodes momentarily lose contact with each other. Each node assumes the other has failed, so the 'passive' node brings all resource groups online while they are in fact still online on the 'active' node.
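
To make the race concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not how MSCS actually decides ownership: the node names, the single heartbeat timeout, and the function names are all made up for illustration, and quorum is deliberately left out.

```python
def owners_after_partition(outage_s, heartbeat_timeout_s,
                           active="node-a", passive="node-b"):
    """Return the set of nodes holding the service IP after a link outage.

    Toy assumption: the active node never releases the IP, and the passive
    node claims it as soon as heartbeats have been missing for longer than
    the timeout.
    """
    owners = {active}
    if outage_s > heartbeat_timeout_s:
        # The passive node concludes the active node is dead and brings
        # the resource group (and its IP) online as well.
        owners.add(passive)
    return owners


def has_ip_conflict(owners):
    """An IP conflict exists when more than one node holds the address."""
    return len(owners) > 1


# Hypothetical numbers: a 5 second heartbeat timeout.
print(has_ip_conflict(owners_after_partition(2, 5)))  # short blip
print(has_ip_conflict(owners_after_partition(6, 5)))  # outage exceeds timeout
```

In this model a blip shorter than the timeout is harmless, while anything longer leaves both nodes answering for the same address.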

I have seen this problem in our VMware environment when one of the ESX(i) hosts is overloaded. Sometimes even an HBA bus rescan is enough: the MSCS nodes very briefly lose contact with each other and this mess occurs.


Use the script on this page to query VM MAC addresses:

http://www.virtuallyghetto.com/2011/05/how-to-query-for-macs-on-internal.html

Match its output against your misbehaving MAC address and examine that machine carefully.
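
The matching step is easy to get wrong by eye, because tools print MACs with different separators and case. A small Python sketch, assuming you have pasted the VM name and MAC pairs into a list yourself (the inventory below and the function names are hypothetical, not output from the linked script):

```python
import re


def normalize_mac(mac):
    """Reduce a MAC address to 12 lowercase hex digits, ignoring separators."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def find_owners(conflicting_mac, inventory):
    """Return the names in inventory whose MAC equals conflicting_mac."""
    target = normalize_mac(conflicting_mac)
    return [name for name, mac in inventory if normalize_mac(mac) == target]


# Hypothetical inventory of (VM name, MAC) pairs in mixed notations.
inventory = [
    ("sql-node-1", "00:50:56:aa:bb:01"),
    ("sql-node-2", "00:50:56:AA:BB:02"),
    ("test-clone", "00-50-56-aa-bb-02"),
]

print(find_owners("0050.56AA.BB02", inventory))
# ['sql-node-2', 'test-clone']
```

Two hits for one MAC, as in this made-up data, would point at a cloned VM that kept the original's address.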


IMHO any logical service-IP should have a subnet-mask of /32. The network should be served by the physical IP, which should have a subnet-mask matching the subnet used.
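
To illustrate what the two masks mean, here is a short sketch using Python's standard `ipaddress` module. The addresses are documentation-range examples I picked for illustration, not values from the question.

```python
import ipaddress

# Assumed example addresses (192.0.2.0/24 is reserved for documentation).
physical = ipaddress.ip_interface("192.0.2.10/24")  # node's own address
service = ipaddress.ip_interface("192.0.2.50/32")   # logical service IP


def covers(physical_if, service_if):
    """True if the service address lies inside the physical interface's subnet."""
    return service_if.ip in physical_if.network


print(physical.network)               # 192.0.2.0/24
print(service.network)                # 192.0.2.50/32
print(service.network.num_addresses)  # 1
print(covers(physical, service))      # True
```

With a /32 the service address describes only itself, so reachability of the rest of the subnet comes from the physical address with the real mask.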