vSphere 4.1: Multicore Virtual CPUs

July 25th, 2010 by jason

With the release of vSphere 4.1, VMware has introduced Multicore Virtual CPU technology to its bare metal flagship hypervisor.  This is an interesting feature which had already existed in current versions of VMware Workstation.  VMware has consistently baked in new features in its Type 2 hypervisor products, such as Workstation, Player, Fusion, etc., more or less as a functionality/stability test before releasing the same features in ESX(i).  VMware highlights this new feature as follows:

User-configurable Number of Virtual CPUs per Virtual Socket: You can configure virtual machines to have multiple virtual CPUs reside in a single virtual socket, with each virtual CPU appearing to the guest operating system as a single core. Previously, virtual machines were restricted to having only one virtual CPU per virtual socket. See the vSphere Virtual Machine Administration Guide.

VMware multicore virtual CPU support lets you control the number of cores per virtual CPU in a virtual machine. This capability lets operating systems with socket restrictions use more of the host CPU’s cores, which increases overall performance.

Using multicore virtual CPUs can be useful when you run operating systems or applications that can take advantage of only a limited number of CPU sockets. Previously, each virtual CPU was, by default, assigned to a single-core socket, so that the virtual machine would have as many sockets as virtual CPUs.

You can configure how the virtual CPUs are assigned in terms of sockets and cores. For example, you can configure a virtual machine with four virtual CPUs in the following ways:

  • Four sockets with one core per socket (legacy, this is how we’ve always done it prior to vSphere 4.1)
  • Two sockets with two cores per socket (new in vSphere 4.1)
  • One socket with four cores per socket (new in vSphere 4.1)
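The three layouts above are simply the ways of factoring the total vCPU count into sockets times cores per socket. Here is a minimal Python sketch of that arithmetic. The function name `socket_core_layouts` is made up for this illustration, and the code knows nothing about product or licensing limits; it only does the division.

```python
def socket_core_layouts(total_vcpus: int) -> list[tuple[int, int]]:
    """Return every (sockets, cores_per_socket) pair whose product is total_vcpus."""
    if total_vcpus < 1:
        raise ValueError("total_vcpus must be at least 1")
    layouts = []
    for cores_per_socket in range(1, total_vcpus + 1):
        # A layout is only possible when the cores divide evenly into sockets.
        if total_vcpus % cores_per_socket == 0:
            layouts.append((total_vcpus // cores_per_socket, cores_per_socket))
    return layouts


for sockets, cores in socket_core_layouts(4):
    print(f"{sockets} socket(s) x {cores} core(s) per socket")
```

For four vCPUs this yields the same three combinations listed above: 4x1, 2x2, and 1x4.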

VMware defines a CPU as:

The portion of a computer system that carries out the instructions of a computer program and is the primary element carrying out the computer’s functions.

VMware defines a Core as:

A logical execution unit containing an L1 cache and functional units needed to execute programs. Cores can independently execute programs or threads.

VMware defines a Socket as:

A physical connector on a computer motherboard that accepts a single physical chip. Many motherboards can have multiple sockets that can in turn accept multicore chips.

One benefit multicore brought to physical computing was increased hardware density. VMs do not share this advantage: they are virtual to begin with and have no rack footprint to speak of.

VMware’s benefit statement for this feature is a legitimate one and is the primary use case. It’s the same benefit which applied when multicore (as well as hyperthreading to some extent) technology was introduced to physical servers. What VMware doesn’t advertise is that the limitation being discussed usually revolves around software licensing: a per-socket license model, to be precise, which many software vendors still use. For example, if I own a piece of software and I have a single socket license, traditionally I was only able to use this software inside of a single vCPU VM. With Multicore Virtual CPUs, virtual machines have now caught up with their physical hardware counterparts in that a single socket VM can be created which has 4 cores per socket. Using the working example, the advantage I have now is that I can run my application inside a VM which still has 1 socket, but 4 cores, for a net result of 4 vCPUs instead of just 1 vCPU. I didn’t have to pay my software vendor additional money for the added CPU power. To show how this translates into dollars and cents, let’s assume a per socket license cost of $1,000 for my application and then extrapolate those numbers using VMware’s example above of how CPUs can be assigned in terms of sockets and cores:

  • Four sockets with one core per socket = $1,000 x 4 sockets = $4,000 net license cost, 4 CPUs
  • Two sockets with two cores per socket = $1,000 x 2 sockets = $2,000 net license cost, 4 CPUs
  • One socket with four cores per socket = $1,000 x 1 socket = $1,000 net license cost, 4 CPUs
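The license math above can be reproduced with a few lines of Python. This is only a sketch: the $1,000 figure is the hypothetical per-socket price from the example, and `license_cost` is a name invented for this illustration, not anything a vendor or VMware provides.

```python
PER_SOCKET_LICENSE_USD = 1_000  # hypothetical example price, not a real quote


def license_cost(sockets: int, price_per_socket: int = PER_SOCKET_LICENSE_USD) -> int:
    """Cost under a per-socket license model: cores do not affect the price."""
    return sockets * price_per_socket


# (sockets, cores per socket) for the three 4-vCPU layouts discussed above
layouts = [(4, 1), (2, 2), (1, 4)]
for sockets, cores in layouts:
    total_vcpus = sockets * cores
    print(f"{sockets} socket(s) x {cores} core(s): "
          f"{total_vcpus} vCPUs, ${license_cost(sockets):,} license cost")
```

The vCPU count stays at 4 in every row while the cost tracks only the socket count, which is the whole point of the per-socket argument.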

    Now, all of this said, the responsibility is on the end user to be in license compliance with his or her software vendors.  Just because you can do this doesn’t mean you’re legally permitted to do so.  Be sure to read your EULA and check with your software vendor or reseller before implementing VMware Multicore Virtual CPUs.

    Implementation of Multicore Virtual CPUs was quite straightforward in VMware Workstation.  Upon creating a new VM or editing an existing VM’s settings, the following interface was presented for configuring vCPUs and cores per vCPU in VMware Workstation.  In this example, a 2xDC (Dual Core) configuration is being applied which results in a total of 4 CPU cores which will serve the VM’s operating system, applications, and users. Note that here, the term “processors” on the first line translates to “sockets”:

    [Screenshot: VMware Workstation processor settings for the VM]

    Making the same 2xDC CPU configuration in vSphere 4.1 isn’t difficult but nonetheless it is done differently.  Configuring total vCPUs and cores per vCPU is achieved by applying configurations in two different areas of the VM configuration. The combination of the two configurations produces a mathematical calculation which ultimately determines cores per vCPU.

    First of all, the total number of cores (processors) is selected in the VM’s CPU configuration.  This hasn’t changed and should be familiar to you.  The number of cores (processors) available for selection here is going to be 1 through 4, or 1 through 8 if you have Enterprise Plus licensing.  I’ve purposely included the notation of the VM hardware version 7, which is required. An inconsistency here compared to VMware Workstation is that the term “virtual processors” translates to “cores”, not “sockets”:

    [Screenshot: vSphere 4.1 VM CPU settings showing the number of virtual processors and VM hardware version 7]

    Configuring the number of cores per processor is where VMware has deviated from the VMware Workstation implementation.  In ESX and ESXi, this configuration is made as an advanced setting in the .vmx file.  Edit the VM settings, navigate to the Options tab, and choose General in the Advanced options list. Click the Configuration Parameters button, which allows you to edit the .vmx file on a row by row basis.  Click the Add Row button and add the line item cpuid.coresPerSocket. For the value, you’re going to supply the number of cores per processor, which is generally going to be a value of 2, 4, or 8 (Enterprise Plus licensing required).  Note, using a value of 1 here would serve no practical purpose because it would configure a single core vCPU, which is what we’ve had all along up until this point:

    [Screenshot: Configuration Parameters dialog with the cpuid.coresPerSocket row added]
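To make the two-part configuration a little more concrete, here is a small Python sketch that prints the pair of settings for the 2xDC example. Treat it as an illustration only: `cpuid.coresPerSocket` is the parameter named above, but the `numvcpus` key for the total vCPU count is my assumption about how that value is stored in the .vmx, and the helper `vmx_cpu_entries` is a made-up name. It does not touch any real file.

```python
def vmx_cpu_entries(total_vcpus: int, cores_per_socket: int) -> list[str]:
    """Build the two .vmx-style lines that together define the CPU topology."""
    if total_vcpus < 1 or cores_per_socket < 1:
        raise ValueError("both values must be at least 1")
    if total_vcpus % cores_per_socket != 0:
        raise ValueError("total vCPUs must divide evenly by cores per socket")
    return [
        f'numvcpus = "{total_vcpus}"',                  # assumed key for the total vCPU count
        f'cpuid.coresPerSocket = "{cores_per_socket}"',  # advanced parameter from the post
    ]


# 2xDC: 4 total vCPUs, 2 cores per socket, so the guest sees 4 / 2 = 2 sockets
for line in vmx_cpu_entries(total_vcpus=4, cores_per_socket=2):
    print(line)
```

The number of sockets is never written down anywhere; it falls out of dividing the total vCPU count by the cores per socket value.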

    As a supplement, here are the requirements for implementing Multicore Virtual CPUs:

    • VMware vSphere 4.1 (vCenter 4.1, ESX 4.1 or ESXi 4.1).
    • Virtual Machine hardware version 7 is required.
    • The VM must be powered off to configure Multicore Virtual CPUs.
    • The total number of vCPUs for the VM divided by the number of cores per socket must be a positive integer.
    • The cpuid.coresPerSocket value must be a power of 2. The documentation explicitly states a value of 2, 4, or 8 is required, but 1 works as well, although as stated before it would serve no practical purpose.
      • 2^0=1 (anything to the power of 0 always equals 1)
      • 2^1=2 (anything to the power of 1 always equals itself)
      • 2^2=4
      • 2^3=8
    • When you configure multicore virtual CPUs for a virtual machine, CPU hot Add/Remove is disabled (previously called CPU hot plug).
    • You must be in compliance with the requirements of the operating system EULA.
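The technical requirements in this list can be expressed as a quick pre-flight check. The following Python sketch is hypothetical: the function and parameter names are invented, it treats "hardware version 7 is required" as "7 or later" (an assumption on my part), and it cannot check licensing or EULA compliance for you.

```python
def multicore_config_problems(total_vcpus: int, cores_per_socket: int,
                              hardware_version: int, powered_on: bool) -> list[str]:
    """Return a list of requirement violations; an empty list means none were found."""
    problems = []
    if hardware_version < 7:  # assumption: versions newer than 7 also qualify
        problems.append("VM hardware version 7 is required")
    if powered_on:
        problems.append("the VM must be powered off to change this setting")
    # Power-of-2 test: such numbers have exactly one bit set.
    is_power_of_two = cores_per_socket >= 1 and (cores_per_socket & (cores_per_socket - 1)) == 0
    if not is_power_of_two:
        problems.append("cpuid.coresPerSocket must be a power of 2")
    elif total_vcpus % cores_per_socket != 0:
        problems.append("total vCPUs divided by cores per socket must be a whole number")
    return problems


print(multicore_config_problems(4, 2, hardware_version=7, powered_on=False))
print(multicore_config_problems(6, 4, hardware_version=7, powered_on=True))
```

Returning a list of messages rather than raising on the first failure makes it easier to report every issue with a VM in one pass.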

    This feature rocks and I think customers have been waiting a long time for it.  Duncan mentioned it quite some time ago but obviously it was unsupported at that time.  I am a little puzzled by the implementation mechanisms, mainly the configuration of the .vmx to specify cores per CPU.  I suppose it lends itself to scriptability and thus automation, but in that sense, we lack the flexibility to configure cores per CPU with guest customization when deploying VMs from a template.  Essentially this means cores per CPU needs to be hard coded in each of my templates or cores per CPU needs to be manually tuned after deploying each VM from a template.  When I take a step back, I guess that’s no different than any other virtual hardware configuration stored in templates, but with the cores per CPU setting being buried in the .vmx as an advanced setting, it’s that much more of a manual/administrative burden to configure cores per CPU for each VM deployed than it is to simply change the number of CPUs or amount of RAM.  It would be nice if the guest customization process offered a quick way to configure cores per processor.



    1. VitaRedux says:

      Nice feature, saves me from those mistakes where I provision a single vCPU and get stuck with that HAL when I really need more processors.

      What I (and probably others) REALLY want to know is, can I get round my per processor SQL license costs? Probably not, knowing Microsoft. Their EULAs are so grey you have to ask them, and you just know what they’re gonna say.

    2. Justin Paul says:

      That’s pretty sweet, I’m sure that software makers will start changing their licensing model, but here is my question… Will this work with Fault Tolerance? 1 physical CPU and multiple cores in it? That would be pretty awesome if it could.

    3. Hari says:

      A naive question – for a windows server guest or any of the major linux server (SuSE/RedHat) guests, is there a difference if I configure 2 vCPUs or 1 vCPU with 2 cores? is there any performance implication? how does the cpu scheduling (relaxed co-scheduling) work?

    4. jason says:

      The universal answer to “What’s the performance difference between 2 traditional single core CPUs and 2 cores?” applies from a hardware perspective. Co-scheduling is performed at the logical CPU level. A logical CPU may be a single core CPU, a single core within a multicore CPU, or a hyperthreaded CPU (although the scheduler is HTT aware and will attempt to exhaust non-HTT resources before utilizing HTT resources).

      http://en.wikipedia.org/wiki/Multi-core_processor
      “A multi-core processor implements multiprocessing in a single physical package. Designers may couple cores in a multi-core device together tightly or loosely. For example, cores may or may not share caches, and they may implement message passing or shared memory inter-core communication methods. Common network topologies to interconnect cores include bus, ring, 2-dimensional mesh, and crossbar. Homogeneous multi-core systems include only identical cores, unlike heterogeneous multi-core systems. Just as with single-processor systems, cores in multi-core systems may implement architectures like superscalar, VLIW, vector processing, SIMD, or multithreading.”

    5. jason says:

      Multicore technology has been stealing license revenue from the software manufacturers for years. I’m sure they are not happy about it but they must adapt, and that doesn’t necessarily mean simply flipping to the “per core” chargeback model. That will create quite a rub.

      FT, today, is incompatible with vSMP. How you arrive at vSMP (by sockets or cores) is irrelevant.

    6. Jason,

      Good info, but did you know this also works with vSphere 4.0 not just vSphere 4.1? At least as of ESXi 4.0.0 261974 (the version running in my test lab).

      And you’re right, the licensing issue is a big one – especially where virtualizing SQL and some lower levels of OS where CPU vs CORE is concerned. It doesn’t really make sense for VMware not to have full parity with now-common hardware – and here’s another step closer.

      Of course, co-scheduling rules still apply, but low latency processors from Intel and AMD are making that “penalty” less of a problem.

      Cheers!

    7. jason says:

      Yes, I knew it, but it wasn’t officially supported by VMware. In 4.1, VMware boasts this as a newly supported “feature”.

    8. Todd says:

      Jason – Can you confirm that Enterprise *PLUS* licensing is required and not just Enterprise?

    9. jason says:

      Enterprise Plus licensing is required for 8 vCPUs

    10. Bryan says:

      Hi Jason, this is good stuff. Do you know if there is an official VMware doc available that outlines the procedures on setting this up?

    11. Mike says:

      Hate to be a nitpick, but 2^4=16, not 8. 2^3 however, is much closer to 8.

    12. jason says:

      Thanks for calling out the error Mike. I’ve updated the blog post with the correction.

    13. Philippe says:

      Hi Jason,
      Great feature, but how can I configure a single virtual processor with dual cores?
      I want to install MS SQL Server on a virtual machine with one processor (the cost of one licence) but dual core for performance.
      My ESXi 4.1 host has a single processor with 4 cores.

      Thank you for any help.

    14. James says:

      Is HA still available with multiple cores assigned?

    15. Jay Rogers says:

      http://bit.ly/aUWglE is KB showing this. Should guest show all cores or just # of vCPU assigned the guest?

    16. jason says:

      @Jay
      The VM is going to see cores and sockets. The VM will be able to take advantage of multiple cores per socket and realize all of the benefits that go along with that (namely licensing).

    17. Actually, odd numbers of cores do work, despite the documentation:

      http://solori.wordpress.com/2010/11/08/short-take-vsphere-multi-core-virtual-machines/

      While functional, I suspect such configurations have limited support by VMware. However, in AMD 2300/6100 and Intel 5600 systems, the ability to match the virtual core count to the physical core count in NUMA systems could be a very important deployment consideration.

    18. ugh, AMD 2400/6100… not 2300/6100

    19. Parker Race says:

      We are having weird results with this. I created a machine for Red Hat Linux 5.6. I assigned two vCPUs and 2 cores per socket. When the OS was installed it saw 1 CPU with 2 cores. We had to assign 4 CPUs with 2 cores per socket.

    20. jason says:

      2 vCPUs and 2 cores per socket should have resulted in 2 vCPUs (cores) in 1 socket which is what you saw. To use this feature correctly in vSphere 4, you provide the absolute number of cores (vCPUs) in the pull down box. The extra parameter works as a divider to tell the guest OS and its applications how many sockets the environment lives on. Here are a few examples:

      Number of virtual processors: 1
      cpuid.coresPerSocket: 1 (single core)
      Number of cores OS/apps see: 1
      Number of sockets OS/apps see: 1

      Number of virtual processors: 2
      cpuid.coresPerSocket: 1 (single core)
      Number of cores OS/apps see: 2
      Number of sockets OS/apps see: 2

      Number of virtual processors: 2
      cpuid.coresPerSocket: 2 (dual core)
      Number of cores OS/apps see: 2
      Number of sockets OS/apps see: 1

      Number of virtual processors: 4
      cpuid.coresPerSocket: 1 (single core)
      Number of cores OS/apps see: 4
      Number of sockets OS/apps see: 4

      Number of virtual processors: 4
      cpuid.coresPerSocket: 2 (dual core)
      Number of cores OS/apps see: 4
      Number of sockets OS/apps see: 2

      Number of virtual processors: 4
      cpuid.coresPerSocket: 4 (quad core)
      Number of cores OS/apps see: 4
      Number of sockets OS/apps see: 1

      Number of virtual processors: 8
      cpuid.coresPerSocket: 1 (single core)
      Number of cores OS/apps see: 8
      Number of sockets OS/apps see: 8

      Number of virtual processors: 8
      cpuid.coresPerSocket: 2 (dual core)
      Number of cores OS/apps see: 8
      Number of sockets OS/apps see: 4

      Number of virtual processors: 8
      cpuid.coresPerSocket: 4 (quad core)
      Number of cores OS/apps see: 8
      Number of sockets OS/apps see: 2

      Hopefully you see the patterns here, the biggest one being “Number of virtual processors” will always equal “Number of cores OS/apps see”.
      The calculation piece is the cores per socket configuration.

    21. Parker Race says:

      Thanks for the explanation.