Clearing ARP cache on ESXi 4.1

We recently migrated our entire VMware cluster from ESX over to ESXi. For the most part, the transition was seamless, and I haven't missed having access to the service console (SC). Until now.

We're trying to diagnose some odd unicast flooding behavior that's happening during vMotions, and we suspect that the cause may be related to a discrepancy between switchgear CAM table cache expiry and the ARP table expiry on each ESXi host. As such, I've been trying to figure out how to view and clear the ARP table in ESXi.

On ESX (with the full SC), this would have been a cinch: just SSH in and run arp -a. Unfortunately, the neutered shell within ESXi doesn't include the arp command, and I haven't been able to find a single piece of documentation on this in VMware's KB.

I do have a support request in with VMware on this (going on 30 hours without an answer), but figured I'd toss it over here first to see if anyone has ideas. Thanks!


Solution 1:

Without the service console, you need to use vCLI. It works with both ESX and ESXi hosts.

Right now, I can't find a documented way to clear the ARP tables via RemoteCLI. The best I can find is here: Top Five New vCLI commands in vSphere 4.1

List all active connections: esxcli network connection list

List all ARP table entries: esxcli network neighbor list
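As a rough sketch of how these would be invoked remotely through vCLI: the host name and user below are placeholders I've made up, and vcli_cmd is just a hypothetical helper that prints the full command line (using vCLI's --server and --username connection options) so you can check it before running it against a host.

```shell
#!/bin/sh
# Sketch only: ESXI_HOST and ESXI_USER are placeholder values, not from this thread.
ESXI_HOST="${ESXI_HOST:-esxi01.example.com}"
ESXI_USER="${ESXI_USER:-root}"

# Hypothetical helper: print the full vCLI command line for an esxcli sub-command.
vcli_cmd() {
    printf 'esxcli --server %s --username %s %s\n' "$ESXI_HOST" "$ESXI_USER" "$*"
}

vcli_cmd network neighbor list     # ARP table entries
vcli_cmd network connection list   # active connections
```

Paste a printed line into a machine with vCLI installed to actually run it. Since no password is given on the command line, vCLI should prompt for one.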

Hope this helps. Let us know what support says.

Solution 2:

After discussing with VMware, I learned that there is no way to clear or otherwise manipulate the ARP table on ESXi 4.1. I feel strongly that being able to perform these actions can be critical for troubleshooting, and I sure hope that they add this functionality in future versions of the product.

Solution 3:

ESXi 4.1 has the Remote CLI you can use, or if that doesn't support what you need, there's always the unsupported way. However, the best part is that, because you're using the latest and greatest 4.1, you can actually officially enable SSH.

Solution 4:

Make sure you have all your VMkernel ports on separate subnets, e.g. separate subnets for vMotion, management, and iSCSI. Failure to do this can cause lots of flooding during vMotion, because the physical switch does not learn the MAC address for the vMotion port correctly and instead continuously floods traffic to find it.