TurboTalk
Management in the Age of Virtualization

Anyone looking for a virtualization moment of zen today needed to look no farther than Cody Bunch of ProfessionalVMware.com, when he opined on Twitter:
%RDY %RDY %RDY – It’s like having customers waiting for donuts. When you have more customers than donuts, you have a problem. #donutzen
This is a great way to describe the CPU co-scheduling problems that can crop up when using VMware, and based on what we have seen in the field, these problems are quite common. The co-scheduling problem arises when ESX Servers need to schedule multiple processors to service a virtual symmetric multiprocessor (vSMP). In order to emulate the semantics of an SMP, these processors must be co-scheduled concurrently to service the vSMP.
So, if your SQL Server VM is configured as a vSMP with 4 vCPUs, it will need to grab 4 physical processor cores in order to execute. The ESX co-scheduling mechanisms will first try to run it even if 4 cores are not available. However, as soon as it hits an event requiring all 4 vCPUs, ESX will place the vSMP into the CPU ready queue until 4 cores become available to service all 4 vCPUs. The vSMP may remain in the ready queue for a long time, because other VMs requiring only 1 vCPU can grab cores as soon as they become available, starving the vSMP VM.
Of course we can think about this in terms of Cody’s donuts. Suppose the donut store keeps your order of 4 donuts waiting until it has served every customer who wants only 1 donut. During the morning rush, when the stream of single-donut customers seems never-ending, you could wait indefinitely for your 4 donuts.
OK, OK, I think you get the point…so let me continue…
This problem can be diagnosed by examining the %rdy values on the ESX Server, but solving it is another matter entirely. If you are willing to give up all of your vSMP virtual machines, you could make this problem disappear instantly. However, many mission-critical applications require the performance benefits of SMP architectures. Forcing them to avoid virtual infrastructures would significantly limit the value of virtualization.
Worse, as traffic demands increase, one often wants to allocate more resources to vSMP virtual machines. Consider an application running on a 2 vCPU VM. Suppose you want to speed up processing of peak traffic by doubling the allocation to 4 vCPUs. What do you do if, instead of improving, processing speed declines dramatically? Such a decline is due to increased waiting time in the CPU ready queue; finding 4 free cores at once may take substantially longer than finding 2.
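To make the 2 vCPU versus 4 vCPU comparison concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not the actual ESX scheduler: the host has 4 cores, single-vCPU VMs always grab cores first, and the vSMP VM runs during a tick only if enough cores are free for all of its vCPUs at once. The function name `ready_percent` and the sample demand list are made up for illustration.

```python
def ready_percent(small_demand, vcpus, cores=4):
    """Percent of ticks a vSMP VM spends waiting in this toy model.

    small_demand[t] is how many 1-vCPU VMs want a core during tick t.
    Assumption: strict co-scheduling, and 1-vCPU VMs are served first.
    """
    if not small_demand:
        return 0.0
    waiting = 0
    for demand in small_demand:
        free = cores - min(demand, cores)  # single-vCPU VMs grab cores first
        if free < vcpus:                   # not enough cores for all vCPUs at once
            waiting += 1
    return 100.0 * waiting / len(small_demand)


# Morning rush: between 1 and 3 single-vCPU VMs are busy on every tick.
rush = [1, 2, 3, 2, 1, 2, 3, 1]
print(ready_percent(rush, vcpus=4))  # 100.0: never 4 free cores at once
print(ready_percent(rush, vcpus=2))  # 25.0: waits only when 3 cores are taken
```

In this toy workload the 2 vCPU VM waits a quarter of the time, while the 4 vCPU VM never gets to run at all, which is the donut line in miniature.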
Most customers we speak with are running some fairly beefy virtual machines that require more CPU horsepower than a single vCPU virtual machine can provide. So, they try to strike a balance and periodically examine vSMP virtual machines to see if the %rdy values on a given ESX Server are high. If these values are high, they usually try to VMotion the VMs to different hosts to address the problem.
Unfortunately, solving this issue is a real-time problem. CPU ready will fluctuate as demand on the virtual infrastructure changes, and as demands on the applications running on that infrastructure change. Acting on a point-in-time snapshot of the environment may solve the problem right now, but it won’t cure it for good.
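One way to see why a single snapshot can mislead is to track %rdy over a rolling window instead of reading one value. The Python sketch below is an illustration only: the `ReadyMonitor` class, the window size, and the 10% threshold are assumptions chosen for the example, not VMware guidance, and the samples are made-up numbers rather than real esxtop output.

```python
from collections import deque


class ReadyMonitor:
    """Track recent %RDY samples for one VM and flag sustained pressure."""

    def __init__(self, window=5, threshold=10.0):
        # Both defaults are illustrative assumptions, not VMware guidance.
        self.samples = deque(maxlen=window)  # oldest sample drops off automatically
        self.threshold = threshold

    def add(self, ready_pct):
        """Record one sample; return True if the rolling average is high."""
        self.samples.append(ready_pct)
        return self.average() > self.threshold

    def average(self):
        """Mean of the samples currently in the window (0.0 if empty)."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0


monitor = ReadyMonitor(window=3, threshold=10.0)
for pct in [2.0, 4.0, 30.0, 25.0, 3.0]:  # hypothetical samples
    flagged = monitor.add(pct)
    print(f"%RDY={pct:5.1f} rolling avg={monitor.average():5.1f} flagged={flagged}")
```

The last sample on its own (3.0) looks healthy, yet the rolling average is still well above the threshold, so a check made at that instant would miss the contention that just occurred.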
Now is probably a good time to mention that one of the most popular features in VMTurbo’s virtual appliance is real-time virtual machine rightsizing. Our virtual appliance ensures that all of your virtual machines are running at the right size, at the right time, even as demands on your applications and infrastructure change. If you’d like to rightsize your environment, let us know and we’ll give you a link to download our virtual appliance. Download and be rightsized in just minutes!
Category: Performance