Is the gang scheduling employed by VMware a serious drawback?
I was reading some TechNet articles, as well as this one, regarding the differences between the way VMware and Hyper-V do CPU scheduling.
I was wondering if I could get some objective info on this. It would seem that the gang scheduling used by VMware is a HUGE disadvantage, but I don't want to just drink the Kool-Aid. Does it seriously impact performance, or do the latest iterations of VMware's hypervisors resolve this?
Edit: When I say disadvantage, I mean relative to Hyper-V's "free processor scheduling" or however KVM does it. The material I was reading didn't mention any problems with "free processor scheduling" that are avoided by gang scheduling.
Like chanting Bloody Mary into a dimly lit bathroom mirror, let's see if we can get Jake Oshins to show up...
Gang scheduling is also referred to as co-scheduling. I think VMware prefers the term co-scheduling to gang scheduling.
In ESX versions prior to 3.x, VMware used "strict" co-scheduling, which had the synchronization drawbacks you're asking about. In ESX 3.x and above, VMware switched to "relaxed" co-scheduling.
Relaxed co-scheduling replaced the strict co-scheduling in ESX 3.x and has been refined in subsequent releases to achieve better CPU utilization and to support wide multiprocessor virtual machines. Relaxed co-scheduling has a few distinctive properties compared to the strict co-scheduling algorithm. Most important of all, while in the strict co-scheduling algorithm the existence of a lagging vCPU causes the entire virtual machine to be co-stopped, in the relaxed co-scheduling algorithm a leading vCPU decides whether it should co-stop itself based on the skew against the slowest sibling vCPU. If the skew is greater than a threshold, the leading vCPU co-stops itself. Note that a lagging vCPU is one that makes significantly less progress than the fastest sibling vCPU, while a leading vCPU is one that makes significantly more progress than the slowest sibling vCPU. By tracking the slowest sibling vCPU, it is now possible for each vCPU to make its own co-scheduling decision independently.

Like co-stop, the co-start decision is also made individually. Once the slowest sibling vCPU starts progressing, the co-stopped vCPUs are eligible to co-start and can be scheduled depending on pCPU availability. This solves the CPU fragmentation problem in the strict co-scheduling algorithm by not requiring a group of vCPUs to be scheduled together. In the previous example of the 4-vCPU virtual machine, the virtual machine can make forward progress even if there is only one idle pCPU available. This significantly improves CPU utilization.
The above snippet is from VMware's own documentation.
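To make the skew idea concrete, here's a toy sketch of that per-vCPU decision (plain C; the progress numbers, threshold, and names are all made up by me, and ESXi actually measures skew as time, so this is not VMware's code):

    #include <stdio.h>

    #define NUM_VCPUS      4
    #define SKEW_THRESHOLD 3   /* invented units; ESXi tracks skew as time */

    /* Cumulative "progress" of each sibling vCPU; vCPU 3 is lagging. */
    static int progress[NUM_VCPUS] = {10, 10, 9, 4};

    static int slowest_progress(void)
    {
        int min = progress[0];
        for (int i = 1; i < NUM_VCPUS; i++)
            if (progress[i] < min)
                min = progress[i];
        return min;
    }

    int main(void)
    {
        int min = slowest_progress();

        /* Relaxed co-scheduling: each vCPU decides for itself.  A leading
         * vCPU co-stops only if its skew against the slowest sibling
         * exceeds the threshold; the lagging vCPU keeps running and can
         * catch up as soon as a pCPU is free. */
        for (int i = 0; i < NUM_VCPUS; i++) {
            int skew = progress[i] - min;
            printf("vCPU %d: skew %d -> %s\n", i, skew,
                   skew > SKEW_THRESHOLD ? "co-stop self" : "keep running");
        }

        /* Strict co-scheduling, by contrast, would have co-stopped the
         * whole VM as soon as any sibling lagged past the threshold. */
        return 0;
    }

Run as-is, vCPUs 0 through 2 stop themselves while the lagging vCPU 3 keeps running, which is the "each vCPU decides independently" behavior the documentation describes.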
So VMware is not using strict gang scheduling anymore. I would treat documentation directly from the vendor as being more authoritative.
The only thing that will give you hard numbers is a benchmark, and it will be entirely dependent on the kinds of code the CPUs are running. But I can tell you that if VMware were at such a disadvantage, they would not still have the lion's share of the virtualization market.
Okay, Ryan, you made my day. I don't read this forum as much as I used to, but I happened to check in.
Red888, you should know up front that I'm a software architect who works on Hyper-V at Microsoft. I assume most people reading this are perfectly capable of clicking on my name link below this and discovering that, or even Googling me, but for this answer it's useful to be entirely certain that the people reading this have no doubt about my perspective.
In general, gang scheduling is useful if the hypervisor doesn't have any way to influence the behavior of the OS running within the VM. This is, of course, why VMware started out this way. They don't own any operating systems and so their goal was to make existing operating systems work well. If I were them, this is where I would have started.
Gang scheduling, and VMware would probably say that I'm right about this, imposes a lot of limitations on how you can use the physical processors within the machine. The hypervisor often can't find the right resource fit for the moment. So they've modified their algorithm over the years, looking for ways to do scheduling that work better.
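To picture the limitation (my own toy example, not how any shipping scheduler is actually written): with strict gang scheduling, a VM can only be dispatched when enough physical CPUs are idle at the same instant to hold every one of its vCPUs.

    #include <stdbool.h>
    #include <stdio.h>

    #define NUM_PCPUS 8

    /* Made-up snapshot of which physical CPUs are idle right now. */
    static bool pcpu_idle[NUM_PCPUS] = {true, false, true, false,
                                        false, true, false, false};

    static int idle_count(void)
    {
        int n = 0;
        for (int i = 0; i < NUM_PCPUS; i++)
            if (pcpu_idle[i])
                n++;
        return n;
    }

    int main(void)
    {
        int vm_vcpus = 4;   /* a 4-vCPU guest */

        /* Strict gang scheduling: run the VM only if ALL of its vCPUs
         * can be placed on idle pCPUs at the same instant. */
        if (idle_count() >= vm_vcpus)
            printf("gang: dispatch all %d vCPUs now\n", vm_vcpus);
        else
            printf("gang: only %d idle pCPUs, the whole VM waits\n",
                   idle_count());

        /* A scheduler that places vCPUs independently could already be
         * running three of the four vCPUs on the idle pCPUs above. */
        return 0;
    }

Three pCPUs are idle in that snapshot, so the gang-scheduled VM waits even though most of it could be running. That's the kind of resource fit I'm talking about.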
Microsoft (and probably several other companies) started off with a different view. We own Windows. We'll make Windows behave well when virtualized. And thus gang scheduling won't be necessary. We won't even bother to build a gang scheduler.
Interestingly, we at Microsoft care more about Windows running well in comparison to other operating systems than we care about Hyper-V looking better than VMware, or KVM, or Xen, or Oracle, or Unisys, etc. So we published the interfaces that Windows uses to cooperate with a hypervisor. Here's a link if you're curious, though I don't recommend it as bedtime reading:
http://www.bing.com/search?q=Hypervisor+Top-Level+Functional+Specification+3.0a%3A+Windows+Server+2012&src=IE-SearchBox&FORM=IESR02
So any hypervisor vendor can expose the stuff that will trigger cooperative behavior from Windows. Several of them have. I honestly don't know if VMware has, or does, or will expose this. You'd have to ask them, or somebody who pays a lot of attention to them. And if they do, I'd be very surprised if they hadn't modified their scheduler to relax even more. That last statement, of course, is pure speculation.
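If you want a tiny taste of what that guest/hypervisor boundary looks like (this is just generic x86 detection, not the enlightenments themselves, and the wrapper below is my own sketch), a guest can tell that it is virtualized, and by whom, with a couple of CPUID leaves:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Raw CPUID wrapper (x86/x86-64, GCC/Clang inline asm). */
    static void cpuid(uint32_t leaf, uint32_t *a, uint32_t *b,
                      uint32_t *c, uint32_t *d)
    {
        __asm__ volatile("cpuid"
                         : "=a"(*a), "=b"(*b), "=c"(*c), "=d"(*d)
                         : "a"(leaf), "c"(0));
    }

    int main(void)
    {
        uint32_t a, b, c, d;

        /* CPUID leaf 1, ECX bit 31: the "hypervisor present" bit. */
        cpuid(1, &a, &b, &c, &d);
        if (!(c & (1u << 31))) {
            puts("no hypervisor reported");
            return 0;
        }

        /* CPUID leaf 0x40000000: EBX/ECX/EDX hold a 12-byte vendor
         * signature ("Microsoft Hv" for Hyper-V, "VMwareVMware" for
         * VMware); the spec linked above documents the Hyper-V leaves
         * that follow this one. */
        char sig[13] = {0};
        cpuid(0x40000000, &a, &b, &c, &d);
        memcpy(sig + 0, &b, 4);
        memcpy(sig + 4, &c, 4);
        memcpy(sig + 8, &d, 4);
        printf("hypervisor signature: %s\n", sig);
        return 0;
    }

Windows does a much richer version of this at boot; the interesting cooperative behavior comes from the leaves and hypercalls the spec defines beyond that point.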
So my bottom line answer is that I doubt that you should make a purchasing decision in 2014 based on how the hypervisor scheduler works. I suspect that they're all pretty good by now. A few years ago, that might not have been true.
You should try your workloads on the various systems and see how they work. I'll bet your ultimate performance comes down to whether your storage and networking meet your needs.