Microsoft recently published a video of their upcoming Longhorn virtualization. Interestingly, they demonstrate a feature that they claim no competitor can match, namely an 8-core virtual machine. I’m really curious to see the final product. Being in the field, I know exactly what kind of problem virtual machines run into with scalability.
A key problem is relatively simple to explain, and having explained it to many HP customers, I see no reason not to talk about it on this blog. When you schedule more than one virtual CPU, there are essentially two ways of doing it.
- You can use something called gang scheduling. In that case, all virtual CPUs run at the same time. One major benefit is that if a virtual CPU is waiting for a lock, the virtual CPU that holds the lock is also running. So there is no major increase in lock contention, a key factor in scalability. The major drawback is that if you have an 8-way virtual machine with 1 CPU busy, it actually consumes 8 CPUs. That is not very good use of the CPUs.
- An alternative is to schedule each virtual CPU individually. This is much more difficult to get right, in particular because of the lock contention issue outlined above. On the other hand, the benefit in terms of CPU utilization for virtual machines that use only some of their virtual CPUs are enormous.
So here is a test I’m interested in running on Longhorn virtualization when it becomes available: if I have two 4-way virtual machines on an 4-way hardware, and if I start one “spinner” process in each that counts as fast as possible, do I see:
- 4 processors pegged at 100%, with the two spinners running at half their nominal rate, since they each get essentially 50% of one CPU? As far as I know, this is the behavior for VMware, and it’s characteristic of gang scheduling.
- 2 processors pegged at 100% and 2 processors sitting mostly idle, with each spinner getting 100% of a CPU? This is the behavior of Integrity Virtual Machines.
The bottom line is: there is a serious trade off between scalability and efficient use of resources. If Microsoft managed to solve that problem, kudos to them.