Co-scheduling Visualized

May 21st, 2011 by jason

I stumbled onto this time lapse video of 51 airplanes taking off (and others taxiing) at Boston’s Logan International Airport.  One thought immediately popped into my mind: co-scheduling, a function of the VMware vSphere CPU Scheduler.  The accelerated speed of the video really underscores the precision required of the scheduler, which in this case is the air traffic controller (or controllers).


How does this video relate to co-scheduling?

  • Imagine the planes represent CPU execution (or more accurately CPU execution requests).
  • Imagine the various runways & taxiways represent the number of vCPUs in a VM.

The scheduler is responsible for managing the traffic, making sure there’s a clear path for each plane to move forward and to be on time. 

  • With fewer runways and taxiways (vCPUs in a VM), scheduling complexity is reduced.
  • Adding runways and taxiways (vCPUs in a VM) increases scheduling complexity, but with a limited number of planes (guest OS CPU execution requests), scheduling will still be manageable and planes will arrive on time.
  • Now add a significant number of planes (4-vCPU, 8-vCPU VMs) to our multitude of crisscrossing runways and taxiways.  The precision required to avoid accidents and maintain fairness becomes extremely complex.  The result is high %RDY time for VMs on the host.
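The airport analogy can be sketched as a toy simulation. This is a minimal, hypothetical model of my own (the function name and the strict "all vCPUs land at once" placement rule are assumptions for illustration; the actual vSphere scheduler uses relaxed co-scheduling and is far more sophisticated), but it shows why wider VMs tend to accrue more ready time on a busy host:

```python
import random

def simulate(host_cores, vm_vcpus, ticks=10_000, seed=42):
    """Toy strict co-scheduling model: a VM runs in a tick only if
    ALL of its vCPUs can be placed on free cores at the same time."""
    rng = random.Random(seed)
    ready = [0] * len(vm_vcpus)  # ticks each VM spent waiting to run
    for _ in range(ticks):
        free = host_cores
        # Randomize arrival order each tick so no VM is permanently starved.
        order = list(range(len(vm_vcpus)))
        rng.shuffle(order)
        for i in order:
            if vm_vcpus[i] <= free:
                free -= vm_vcpus[i]   # co-scheduled: all vCPUs placed together
            else:
                ready[i] += 1         # not enough simultaneous cores: ready time grows
    return [100.0 * r / ticks for r in ready]

# 8-core host with a mix of 1-, 4-, and 8-vCPU VMs, all CPU-bound
widths = [1, 1, 4, 8]
pct_rdy = simulate(host_cores=8, vm_vcpus=widths)
for width, rdy in zip(widths, pct_rdy):
    print(f"{width}-vCPU VM: simulated ready time {rdy:.0f}%")
```

Running it, the 8-vCPU "plane" spends noticeably more of its time waiting for eight simultaneous free "runways" than the single-vCPU VMs do for one.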

How do we deal with scheduling complexity?

  1. Right-size VMs, whether they are new builds or P2V conversions.  A minimalist approach to resource guarantees is the best place to start when we’re working with consolidated infrastructure and shared resources.
  2. If you’ve already right sized VMs and you’re running into high %RDY times:
    • Balance workloads by mixing VMs with lower and higher vCPU counts on the same host/cluster
    • Add cores to the host/cluster by:
      • Scaling up (increasing the core count in the host)
      • Scaling out (increasing the number of hosts in the cluster)
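As a quick sanity check when watching for high %RDY, note that vCenter’s CPU Ready counter is a summation reported in milliseconds, while esxtop’s %RDY is a percentage. A small conversion sketch, assuming the commonly documented formula and the 20-second real-time stats interval (verify the interval for your chart’s rollup level before relying on the numbers):

```python
def cpu_ready_to_pct(ready_ms, interval_s=20):
    """Convert a vCenter 'CPU Ready' summation value (milliseconds)
    to a %RDY-style percentage.  interval_s is the stats sample
    interval: 20 seconds for real-time charts."""
    return ready_ms / (interval_s * 1000) * 100

# 1000 ms of ready time in a 20 s real-time sample
print(cpu_ready_to_pct(1000))  # 5.0 (per cent)
```

A per-vCPU %RDY that stays in the low single digits is generally tolerable; sustained double digits is the signal to revisit the balancing and scaling steps above.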

(Video source: @GuyKawasaki‘s Holy Kaw!)


Comments

  1. Pankaj says:

    Nice, good explanation with the help of this video….

  2. Leonardo says:

    Brilliant as an example, bravo.

  3. Interesting example.

    When you have 2+ vCPUs per VM, you will need to have 2+ runways free at the same time, so the planes can land/take off at the same time.

    This could be the most decisive factor impacting %RDY time for VMs when a lot of planes need to land/take off.

  4. There is a lot of “panic” around co-scheduling.

    You are right about the importance of watching %RDY times. But how often do you see high ready times?

    This nice post shows how heavily you can overbook pCPUs (cores) with vCPUs without fearing a significant penalty.

    The airport is not a really good example (even if visually impressive). My “ESX 2.x SMP” example for strict co-scheduling was a restaurant with one table serving singles and couples: it takes more time to free up a place for a couple than for a single.

    In the “relaxed” co-scheduling world, the couples are not required to start and end each course together: if the guy has finished the soup, he will wait on the dessert until the girl has finished the main course.