We've got two HP DL360 G5s, each with quad-core 2.6GHz Xeons and 32GB of memory, running XenServer 5.5. For storage they access an OpenFiler box (with 8 x 320GB 10K SAS drives) over a single 1Gb/s copper Cat5 link.

We've used this setup for testing a lot of stuff, which has worked out perfectly, but now we're moving it into production and are experiencing performance issues. There are currently 27 VMs split across the two servers, all in use (albeit not doing a lot of work), but they seem "slow", especially our employee thin clients: users always complain that login times and accessing files over the network are slow.

Personally, I think it's a throughput issue and we should go to SCSI or FC for our storage, but I need some evidence to back my theory up, and I'm quite new to Xen (it was set up by a previous employee).

My questions: from the info I've given, is it possible that the storage box is overloaded, trying to squeeze too much over that one cable? And how do I monitor network access in real time from the XenServers themselves?

Thanks :-)


I have seen this issue many times. I really love XenServer; however, it's like an unpolished gem...

You should check with ifconfig -a (in dom0, the XenServer console) and look for dropped packets.

You can use: ifconfig -a | grep dropped | awk '{print $3}' | grep -v ":0"
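
That one-liner only prints the raw drop counters. If you also want to see which interface is dropping, a small variation like this should work too (a sketch, assuming the classic Linux ifconfig output you get in a 5.5-era dom0):

    # print "interface dropped:N" for every interface with a non-zero drop counter
    ifconfig -a | awk '/^[a-zA-Z0-9]/ {iface=$1} /dropped/ {print iface, $3}' | grep -v ':0$'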

If you see dropped packets, you should do the following:

  1. On the virtual machines, click Start, click Run, type regedit, and then click OK.
  2. Locate and then click the following registry subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
  3. In the right pane, make sure that the DisableTaskOffload registry entry exists. If it does not, add it: on the Edit menu, point to New, click DWORD Value, type DisableTaskOffload, and then press ENTER.
  4. Click DisableTaskOffload.
  5. On the Edit menu, click Modify.
  6. Type 1 in the Value data box, and then press ENTER.
  7. Exit Registry Editor.
  8. Restart all the virtual machines.
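
With 27 VMs, you may prefer to script that instead of clicking through regedit on each one. This single command should be equivalent to steps 1-7 (run it in an elevated command prompt inside each VM, then reboot):

    reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters" /v DisableTaskOffload /t REG_DWORD /d 1 /f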

And on the XenServer host console:

Get the UUID of the physical interface: xe pif-list host-name-label=XEN1

Disable checksum offload on the interfaces (substitute the UUIDs returned by pif-list; the one below is just an example):

xe pif-param-set other-config:ethtool-tx="off" uuid=3281b044-2a93-2f1b-e8e1-eaf0faccbd1f
xe pif-param-set other-config:ethtool-rx="off" uuid=3281b044-2a93-2f1b-e8e1-eaf0faccbd1f
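
If you'd rather not copy UUIDs around by hand, a loop like this should cover every physical interface on the host (an untested sketch; xe's --minimal flag prints a comma-separated UUID list):

    # disable TX/RX checksum offload on all PIFs of host XEN1
    for uuid in $(xe pif-list host-name-label=XEN1 --minimal | tr ',' ' '); do
        xe pif-param-set other-config:ethtool-tx="off" uuid=$uuid
        xe pif-param-set other-config:ethtool-rx="off" uuid=$uuid
    done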


During high load or a period of perceived performance drop, run top on the server.

You're looking for three warning signs that can help you find your bottleneck (a quick way to sample all three is shown after this list):

  1. %wa (near the top middle) - This is the IOWait measurement: the amount of time the CPU has to wait for I/O (storage) requests to finish before it can continue working. If this is consistently above 10-20%, you're going to start seeing problems. If this is the case, you need to upgrade your storage.
  2. load average (the set of three numbers at the top) - This is the average demand on your CPUs over the last 1, 5, and 15 minutes. It's a very rough number to troubleshoot with, but a good rule of thumb is to stay under 1.0 x cores, so an 8-core system shouldn't go much above a load average of 8.0. Anything higher means that applications are probably being limited by CPU (absent %wa issues). More info on load averages linked here.
  3. Mem and Swap (usually lines 4 and 5) - If you're running out of RAM, you'll see it here. The warning sign is the combination of low buffers, low free Mem, and a high amount of Swap used. Low/high here is relative to your total memory.
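
If you want to grab those numbers without sitting and watching a terminal, both of these are available in dom0 (standard Linux tools, nothing XenServer-specific):

    # one batch-mode snapshot of top; load average, %wa, Mem and Swap are all in the header
    top -b -n 1 | head -n 5

    # or sample CPU and I/O-wait (the "wa" column) plus swap activity every 5 seconds
    vmstat 5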

If you want to measure network access in real time, I suggest starting with something like bmon (linked here) to see just how much traffic is being generated.
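
If bmon isn't available in dom0, you can get a crude real-time view straight from the kernel's own counters:

    # refresh the per-interface byte/packet counters once a second
    watch -n 1 cat /proc/net/dev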

One question though: Are you running storage and client access over a single network interface? You may want to separate those two out if you are.


Don't worry! All of your problems can be solved! All you need to do is upgrade to XenServer 5.6 SP2. The only downside of upgrading to this release is that you will no longer be able to use mdadm software RAID on local disks. From what you've said, it seems you aren't using it anyway.

Citrix introduced its "IntelliCache" technology with XenServer 5.6. This technology has personally revolutionized my XenServer infrastructure, removing all the slowness in my VMs. IntelliCache works by caching reads from network-attached storage on a local disk. The first time you boot a VM it will be just as slow as before, but on the next boot the reads will hit local storage, saving IOPS on your OpenFiler. If you shut a VM down and start it on a different XenServer host, a new read cache is built automatically in the background.

If you really want to see your VMs scream, I would suggest installing an SSD in each XenServer host and configuring it for IntelliCache. This will give you excellent performance.
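
From memory, the CLI side boils down to something like the sketch below, run per host; treat the CTX article linked under "install instructions" as the authoritative reference, and note that XEN1 and the SR UUID here are placeholders for your own host name and local (EXT) SR:

    # put the host into maintenance mode, point caching at the local SR, re-enable
    xe host-disable host=XEN1
    xe host-enable-local-storage-caching host=XEN1 sr-uuid=<local-ext-sr-uuid>
    xe host-enable host=XEN1
    # individual VDIs may also need caching switched on (xe vdi-param-set allow-caching=true)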

For install instructions, see:

http://support.citrix.com/article/CTX129387 (pages 21-24 of the PDF)

For more information see:

http://www.youtube.com/watch?v=i-6ojYDdrLA
http://support.citrix.com/article/CTX129052