Assign the lowest number of vCPUs your servers need to perform their function. Don't over-allocate them, or you could easily slow down your VMs.


Typically, HT works well on workloads that are heavier on IO -- the CPU can schedule in more processing tasks from the queue of the other virtual CPU while the first virtual CPU waits on the IO. Really, all the HT subsystem gets you is hardware-accelerated context switching -- which is the workload pattern that's also used when switching between VMs. So HT will (usually) reduce the slowdown a bit when you have more VMs than cores, provided each VM gets one virtual core.

Assigning multiple vCPUs to a VM can improve performance if the apps in the VM are written for threading, but it also makes life harder for the hypervisor; it has to allocate time on 2 or 4 CPUs at once -- so if you have a quad-core CPU and a quad-vCPU VM, only one VM can get scheduled during that timeslice (whereas it can run 4 different single-vCPU VMs at once).
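
To make the scheduling constraint concrete, here is a minimal Python sketch of one timeslice under strict co-scheduling, where a VM only runs if all of its vCPUs can be placed on free cores at the same moment. This is a toy model, not how any particular hypervisor is implemented: the function name `schedule_timeslice`, the VM names, and the greedy first-come ordering are all assumptions made for illustration, and real schedulers are considerably more sophisticated.

```python
def schedule_timeslice(physical_cores, vm_vcpus):
    """Greedily pick VMs for a single timeslice (toy model, hypothetical).

    vm_vcpus maps a VM name to its vCPU count. A VM is scheduled only if
    ALL of its vCPUs fit on the cores that are still free.
    Returns (list of VMs that run, number of cores left idle).
    """
    free = physical_cores
    running = []
    for name, vcpus in vm_vcpus.items():
        if vcpus <= free:
            running.append(name)
            free -= vcpus
    return running, free


# Quad-core host, one quad-vCPU VM first in the queue: nothing else fits.
print(schedule_timeslice(4, {"big": 4, "a": 1, "b": 1}))
# -> (['big'], 0)

# Same host, four single-vCPU VMs: all of them run in the same timeslice.
print(schedule_timeslice(4, {"a": 1, "b": 1, "c": 1, "d": 1}))
# -> (['a', 'b', 'c', 'd'], 0)

# A dual-vCPU VM cannot use a single leftover core, so that core sits idle.
print(schedule_timeslice(4, {"a": 1, "b": 1, "c": 1, "big2": 2}))
# -> (['a', 'b', 'c'], 1)
```

The last case is the one that hurts in practice: the multi-vCPU VM waits even though a core is free, which is why over-allocating vCPUs can slow everything down.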