Network speed between a VM and another machine that does not reside on the same host is 11MB/s at most

Problem
Network speed between a VM and another machine that does not reside on the same host is 11MB/s at most.

Topology

[Image: network topology diagram]

Facts

  • ESXi5 version is 5.0.0.504890
  • VM has the latest VMware Tools installed
  • VM is using E1000 network driver
  • Physical box has Win Srv 2008 R2 as the OS
  • CrystalDiskMark says the drive on physical box can read/write 100MB/s
  • vCenter is another VM on the same ESX host
  • Both the VM and the physical box are showing 1Gbps link speed
  • Configuration > Networking shows vmnic0 as 1000 Full
  • NTttcp is a client/server tool from Microsoft for measuring pure network throughput

Here's what I've done so far:

Test1:

  • VM is running Filezilla FTP Server (default settings, one user account made)
  • Physical box is running Filezilla FTP Client (default settings)
  • Physical box is uploading a big file to FTP server
  • Transfer speed (as observed by Windows Task Manager on both machines): ~11MB/s (bad)
  • Physical box is downloading that file from FTP server
  • Transfer speed (as observed by Windows Task Manager on both machines): still ~11MB/s (bad)

Could it be disk performance issue?

Test2:

  • Physical box is running ntttcpr.exe -a 6 -m 6,0,VM_IP_ADDRESS
  • VM is running ntttcps.exe -a 6 -m 6,0,PHY_BOX_IP_ADDRESS
  • Transfer speed (as observed by Windows Task Manager on both machines): ~11MB/s (bad)

Could it be switch performance issue?

Test3:

  • physical box is running vSphere Client
  • I open Summary > Storage > datastore > Browse Datastore... from physical box and upload a file to datastore
  • Transfer speed (as observed by Windows Task Manager on physical box): ~26-36MB/s (good)

Could it be a vm specific issue?

Test4:

  • Installed ntttcp to another vm on the same esx server
  • Measured network performance between vms on the same esx server with NTttcp
  • Transfer speed (as observed by Windows Task Manager on physical box): ~90-120MB/s (excellent :)

Test5:
I have another ESX server on the same site, connected to the same datastore and the same switch. Both ESX servers have 2 NICs each: one NIC goes to the switch, while the other goes directly to the other ESX server.

  • vMotioned one of the testing vms off to the other ESX host
  • Measured network performance between vms on different esx servers with NTttcp
  • Transfer speed (as observed by Windows Task Manager on physical box): ~11MB/s (bad)

While I'm aware of these:

  • ESXi 4.1 slow file transfer
  • ESXi 5 network performance is slow
  • Debian Etch and ESXi slow network speeds
  • VMWare ESXi slow file copy to guest

they did not help (or I must have missed something).


Solution 1:

11MB/s is too close to 100Mbps to be only a coincidence (100Mbps is 12.5MB/s before protocol overhead). It's clear that you have a problem with one of the network ports, either on the switch or on one of your servers' NICs, not being set to 1Gbps/full duplex. There's no doubt about that. The question is which one.

Make sure all your NICs are set to 1Gbps/full duplex, and that every single port of every network device (switches and routers) between your servers and storage devices is also set to 1Gbps/full duplex.
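
To see why ~11MB/s points at a 100Mbps hop, here is a rough back-of-the-envelope sketch. It is not part of the original answer and assumes a standard 1500-byte MTU, IPv4 and TCP headers without options, and full-size frames; the function name `max_tcp_throughput_mb_s` is made up for this illustration.

```python
# Rough conversion from Ethernet line rate to best-case TCP payload throughput.
# Assumptions (not from the original post): standard 1500-byte MTU, IPv4 and
# TCP headers without options, and every frame carrying a full-size packet.

ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble + header + FCS + inter-frame gap, bytes
MTU = 1500                        # IP packet size carried per frame, bytes
IP_TCP_HEADERS = 20 + 20          # IPv4 + TCP headers without options, bytes


def max_tcp_throughput_mb_s(link_mbps: float) -> float:
    """Return the best-case TCP payload rate in MB/s (10^6 bytes per second)."""
    raw_bytes_per_s = link_mbps * 1_000_000 / 8
    efficiency = (MTU - IP_TCP_HEADERS) / (MTU + ETH_OVERHEAD)
    return raw_bytes_per_s * efficiency / 1_000_000


for rate in (100, 1000):
    print(f"{rate} Mbps link: about {max_tcp_throughput_mb_s(rate):.1f} MB/s of TCP payload")
```

Under these assumptions a 100Mbps link tops out just under 12MB/s and a 1Gbps link just under 120MB/s, which lines up with the ~11MB/s seen in Tests 1, 2 and 5 and the ~90-120MB/s seen in Test 4.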

Solution 2:

I put a crossover cable between the physical box and a laptop and witnessed excellent speed. I then put a switch between them, and the speed was still great. When I then changed the IP addresses from 192.168.0.x to the real IP addresses that I had for the physical box and the VM, it occurred to me: while the ESX host and the physical box are just one switch apart, their different IP subnets dictate that all traffic between them has to go through the ISP router, which is also connected to the same switch.

So, due to different subnets, the traffic went through my ISP's box which shaped it down to 100Mbps!
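
A quick way to check for this situation is to test whether two addresses fall in the same subnet. The sketch below uses Python's standard `ipaddress` module. The public-looking addresses and the /24 prefix are hypothetical placeholders (documentation ranges), not the real ones from my setup, and `needs_router` is just an illustrative helper name.

```python
import ipaddress


def needs_router(src_ip: str, dst_ip: str, prefix_len: int) -> bool:
    """Return True if dst_ip lies outside src_ip's subnet.

    In that case the sender hands the packet to its default gateway
    instead of delivering it directly over the local switch.
    """
    src_network = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    return ipaddress.ip_address(dst_ip) not in src_network


# Same private subnet, as in the crossover-cable and switch tests: direct delivery.
print(needs_router("192.168.0.10", "192.168.0.20", 24))    # False

# Hypothetical "real" addresses in different subnets: traffic goes via the router.
print(needs_router("203.0.113.10", "198.51.100.20", 24))   # True
```

If this returns True for two machines hanging off the same switch, every packet between them takes a detour through the gateway, and the slowest port on that path sets the speed.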