Monster VMs & ESX(i) Heap Size: Trouble In Storage Paradise

September 12th, 2012 by jason

While running Microsoft Exchange Server Jetstress on vSphere 5 VMs in the lab, tests were failing about midway through initializing several TBs of databases.  This was a real head scratcher.  Symptoms pointed to unwritable storage or a lack of storage capacity.  Troubleshooting yielded errors such as “Cannot allocate memory”.  After some tail chasing, the road eventually led to VMware KB article 1004424: An ESXi/ESX host reports VMFS heap warnings when hosting virtual machines that collectively use 4 TB or 20 TB of virtual disk storage.

As it turns out, ESX(i) versions 3 through 5 have a statically defined per-host VMFS heap size:

  • 16MB for ESX(i) 3.x through 4.0: Allows a max of 4TB open virtual disk capacity (again, per host)
  • 80MB for ESX(i) 4.1 and 5.x: Allows a max of 8TB open virtual disk capacity (per host)

This issue isn’t specific to Jetstress, Exchange, Microsoft, or any particular fabric type, storage protocol, or storage vendor.  Exceeding the virtual disk capacities listed above, per host, results in the symptoms discussed earlier along with memory allocation errors.  In fact, if you take a look at the KB article, there’s quite a laundry list of possible symptoms depending on what task is being attempted (a quick way to check your own host logs follows the list):

  • An ESXi/ESX 3.5/4.0 host has more than 4 terabytes (TB) of virtual disks (.vmdk files) open.
  • After virtual machines are migrated by vSphere HA from one host to another due to a host failover, the virtual machines fail to power on with the error: vSphere HA unsuccessfully failed over this virtual machine. vSphere HA will retry if the maximum number of attempts has not been exceeded. Reason: Cannot allocate memory.
  • You see warnings in /var/log/messages or /var/log/vmkernel.log similar to: vmkernel: cpu2:1410)WARNING: Heap: 1370: Heap_Align(vmfs3, 4096/4096 bytes, 4 align) failed. caller: 0x8fdbd0
    vmkernel: cpu2:1410)WARNING: Heap: 1266: Heap vmfs3: Maximum allowed growth (24) too small for size (8192)
    cpu15:11905)WARNING: Heap: 2525: Heap cow already at its maximum size. Cannot expand.
    cpu15:11905)WARNING: Heap: 2900: Heap_Align(cow, 6160/6160 bytes, 8 align) failed. caller: 0x41802fd54443
    cpu4:1959755)WARNING:Heap: 2525: Heap vmfs3 already at its maximum size. Cannot expand.
    cpu4:1959755)WARNING: Heap: 2900: Heap_Align(vmfs3, 2099200/2099200 bytes, 8 align) failed. caller: 0x418009533c50
    cpu7:5134)Config: 346: “SIOControlFlag2” = 0, Old Value: 1, (Status: 0x0)
  • Adding a VMDK to a virtual machine running on an ESXi/ESX host where heap VMFS-3 is maxed out fails.
  • When you try to manually power on a migrated virtual machine, you may see the error: The VM failed to resume on the destination during early power on.
    Reason: 0 (Cannot allocate memory).
    Cannot open the disk ‘<<Location of the .vmdk>>’ or one of the snapshot disks it depends on.
  • The virtual machine fails to power on and you see an error in the vSphere Client: An unexpected error was received from the ESX host while powering on VM vm-xxx. Reason: (Cannot allocate memory)
  • A similar error may appear if you try to migrate or Storage vMotion a virtual machine to a destination ESXi/ESX host on which heap VMFS-3 is maxed out.
  • Cloning a virtual machine using the vmkfstools -i command fails and you see the error: Clone: 43% done. Failed to clone disk: Cannot allocate memory (786441)
  • In the /var/log/vmfs/volumes/DatastoreName/VirtualMachineName/vmware.log file, you may see error messages similar to: 2012-05-02T23:24:07.900Z| vmx| FileIOErrno2Result: Unexpected errno=12, Cannot allocate memory
    2012-05-02T23:24:07.900Z| vmx| AIOGNRC: Failed to open ‘/vmfs/volumes/xxxx-flat.vmdk’ : Cannot allocate memory (c00000002) (0x2013).
    2012-05-02T23:24:07.900Z| vmx| DISKLIB-VMFS : “/vmfs/volumes/xxxx-flat.vmdk” : failed to open (Cannot allocate memory): AIOMgr_Open failed. Type 3
    2012-05-02T23:24:07.900Z| vmx| DISKLIB-LINK : “/vmfs/volumes/xxxx.vmdk” : failed to open (Cannot allocate memory).
    2012-05-02T23:24:07.900Z| vmx| DISKLIB-CHAIN : “/vmfs/volumes/xxxx.vmdk” : failed to open (Cannot allocate memory).
    2012-05-02T23:24:07.900Z| vmx| DISKLIB-LIB : Failed to open ‘/vmfs/volumes/xxxx.vmdk’ with flags 0xa Cannot allocate memory (786441).
    2012-05-02T23:24:07.900Z| vmx| DISK: Cannot open disk “/vmfs/volumes/xxxx.vmdk”: Cannot allocate memory (786441).
    2012-05-02T23:24:07.900Z| vmx| Msg_Post: Error
    2012-05-02T23:24:07.900Z| vmx| [msg.disk.noBackEnd] Cannot open the disk ‘/vmfs/volumes/xxxx.vmdk’ or one of the snapshot disks it depends on.
    2012-05-02T23:24:07.900Z| vmx| [msg.disk.configureDiskError] Reason: Cannot allocate memory.
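
If you suspect a host is bumping into this limit, a quick way to confirm is to search its logs for the heap warnings shown above before digging any further.  Here is a minimal sketch from the ESXi shell (log locations vary by version, so treat the exact paths as assumptions for your build):

    # ESXi 5.x: search the kernel log for VMFS heap warnings
    grep -i "Heap" /var/log/vmkernel.log

    # ESX(i) 3.x/4.x: the same warnings typically land in /var/log/messages or /var/log/vmkernel
    grep -i "Heap" /var/log/messages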

While VMware continues to raise the scale and performance bar for its vCloud Suite, this per-host limit on open virtual disk capacity becomes a real constraint for monster VMs or vApps.  Fortunately, there’s a fairly painless resolution (at least up to a certain point): increase the heap size beyond its default value on each host in the cluster and reboot each host.  The advanced host setting to configure is VMFS3.MaxHeapSizeMB.
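
For those who prefer the command line over the vSphere Client’s Advanced Settings dialog, here’s a rough sketch of the change on an ESXi 5.x host.  The esxcli namespace below applies to 5.x; older ESX(i) releases use esxcfg-advcfg instead, and a reboot is required either way:

    # Check the current VMFS heap size (ESXi 5.x)
    esxcli system settings advanced list -o /VMFS3/MaxHeapSizeMB

    # Raise it to the 5.x maximum of 256MB, then reboot the host
    esxcli system settings advanced set -o /VMFS3/MaxHeapSizeMB -i 256

    # Older ESX(i) releases use esxcfg-advcfg (maximum is 128MB on 3.x-4.0, 256MB on 4.1)
    esxcfg-advcfg -s 128 /VMFS3/MaxHeapSizeMB

Keep in mind the setting is per host, so every host in the cluster that might run the affected VMs needs the same change.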

Let’s take another look at the default heap sizes, this time alongside the maximum allowable values:

  • ESX(i) 3.x through 4.0:
    • Default value: 16MB – Allows a max of 4TB open virtual disk capacity
    • Maximum value: 128MB – Allows a max of 32TB open virtual disk capacity per host
  • ESX(i) 4.1 and 5.x:
    • Default value: 80MB – Allows a max of 8TB open virtual disk capacity
    • Maximum value: 256MB – Allows a max of 25TB open virtual disk capacity per host

After increasing the heap size and rebooting, the ESX(i) kernel consumes additional memory overhead equal to the amount of the heap size increase.  For example, on vSphere 5, increasing the heap size from 80MB to 256MB consumes an extra 176MB of base memory which cannot be shared with virtual machines or other processes running on the host.

Readers may have also noticed an overall decrease in the amount of open virtual disk capacity per host supported in newer generations of vSphere.  While I’m not overly concerned at the moment, I’d bet someone out there has a corner case requiring greater than 25TB or even 32TB of powered-on virtual disk per host.  With two of VMware’s core value propositions being innovation and scalability, I would tip-toe lightly around the phrase “corner case”: it shouldn’t be used as an excuse for gaps while VMware pushes for 100% data virtualization and vCloud adoption.  Short term, the answer may be RDMs.  Longer term: vVols.

Updated 9/14/12: There are some questions in the comments section about what types of storage the heap size constraint applies to.  VMware has confirmed that heap size and max virtual disk capacity per host applies to VMFS only.  The heap size constraint does not apply to RDMs nor does it apply to NFS datastores.

Updated 4/4/13: VMware has released patch ESXi500-201303401-BG to address heap issues.  This patch improves both the default and maximum limits of open VMDK files per vSphere host.  After applying the patch to each host, the default heap size for VMFS-5 datastores becomes 640MB, which supports 60TB of open VMDK files per host.  These new defaults are also the maximum values.  For additional reading on other fine blogs, see A Small Adjustment and a New VMware Fix will Prevent Heaps of Issues on vSphere VMFS Heap and The Case for Larger Than 2TB Virtual Disks and The Gotcha with VMFS.
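
To verify the change after patching and rebooting, the same esxcli query shown earlier should now reflect the larger value (assuming, of course, the host has actually taken the patch):

    # After ESXi500-201303401-BG and a reboot, expect the default/max heap size to read 640
    esxcli system settings advanced list -o /VMFS3/MaxHeapSizeMB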

Updated 4/30/13: VMware has released vSphere 5.1 Update 1 and as Cormac has pointed out here, heap issue resolution has been baked into this release as follows:

  1. VMFS heap can grow up to a maximum of 640MB compared to 256MB in earlier releases. This is identical to the way that VMFS heap size can grow up to 640MB in a recent patch release (patch 5) for vSphere 5.0. See this earlier post.
  2. Maximum heap size for VMFS in vSphere 5.1U1 is set to 640MB by default for new installations. For upgrades, it may retain the values set before upgrade. In such cases, please set the values manually.
  3. There is also a new heap configuration, “VMFS3.MinHeapSizeMB”, which allows administrators to reserve the memory required for the VMFS heap during boot time. Note that “VMFS3.MinHeapSizeMB” cannot be set higher than 255MB, but if additional heap is required it can grow up to 640MB. This alleviates the heap consumption issue seen in previous versions, allowing ~60TB of open storage on VMFS-5 volumes per host to be accessed (see the sketch following this list).
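
Here’s a minimal sketch of what those two settings might look like from the shell of a 5.1 Update 1 host, assuming the same esxcli advanced-settings namespace used earlier applies to both options:

    # Reserve VMFS heap memory at boot time (cannot be set higher than 255)
    esxcli system settings advanced set -o /VMFS3/MinHeapSizeMB -i 255

    # Allow the heap to grow to the new 640MB maximum (upgraded hosts may need this set manually)
    esxcli system settings advanced set -o /VMFS3/MaxHeapSizeMB -i 640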

When reached for comment, Monster VM was quoted as saying “I’m happy about these changes and look forward to a larger population of Monster VMs like myself.”

Comments

  1. Duncan says:

    “consume an extra 176MB of base memory which cannot be shared with virtual machines or other processes running on the host.”

    I probably wouldn’t worry about the cost of 176MB when I have virtual machines running on a single host with more than 25TB of storage open 🙂

  2. jason says:

    I wouldn’t worry about it either in today’s host density. An additional 176MB of overhead per host is a non-issue. However, Monster VMs (now my friend on Facebook – who’s responsible for that account anyway?) could see much larger memory overhead values and on a per VM basis. And while it doesn’t matter how many Monster VMs fit per host, at the end of the day they are still going to require cluster resources to satisfy their individual overhead entitlement values.

  3. Mark B says:

    Great synopsis of the problem and resolution I ran into. Glad you are letting the world know!

  4. Shaun says:

    So I assume that this is only an issue with VMDKs on VMFS-3 datastores? VMDKs residing on VMFS-5 datastores do not have this problem?

  5. Killian says:

    Does this apply to all storage or just VMFS? Does NFS have any of these limitations?

  6. jason says:

    Applies to VMFS-5 also. Don’t be thrown off by the “VMFS-3” advanced configuration label. That applies to 5 as well.

  7. jason says:

    @Shaun

    That’s a really good question. The KB article solely mentions VMFS but at the same time it calls out often that it’s an open virtual disk per host issue which would apply across the board. I will do some more digging and see if I can get an answer on this.

  8. Shaun says:

    Found this interesting article about the subject:
    http://virtualkenneth.com/2011/02/15/vmfs3-heap-size-maxheapsizemb/

    It’s saying that the 25TB max size is from using a 1MB block size (VMFS-5 would be using this). Using an 8MB block size will allow up to 160TB using the default 80MB heap.

    I’m not sure how true this scenario is or if this changes with ESXi 5.1

  9. Drew says:

    Great article. I too am curious if this also affects NFS. I have several 5.0 hosts that have more than 8 TB of storage open on them and I haven’t had any issues yet… But we are running NFS, so I wonder if NFS is not affected by this.

    Anyways, thanks again for a great article and pointing this out to the community.

  10. jason says:

    There are some questions in the comments section about what types of storage the heap size constraint applies to. VMware has confirmed that heap size and max virtual disk capacity per host applies to VMFS only. The heap size constraint does not apply to RDMs nor does it apply to NFS datastores.

  11. Joe Tietz says:

    Starting to think that NFS might be the wave of the future for VMware. With software-defined storage around the corner, there’s no reason not to simplify and present it all as NFS.

  12. Roger lund says:

    Gah, this is a serious problem in my eyes. I have a DRV program that requires up-front partitioned space (thick) when on block. With the space retention requirement the client has, the disk requirement is 45TB. A single 45TB drive is preferred. On some storage platforms RDMs are not an option, and multiple VMDKs would be the only choice, then requiring a very large dynamic disk within Windows.

    The lack of large disk support like this is staggering.

    Roger Lund.

  13. Joe Tietz says:

    I have to agree with Roger, this is a good thing to know. I just ran into this issue in a SQL POC. Wanted to stress test the hosts with a failed host. The last 5 VMs could not be moved because of a heap error. We are all VMDKs and all block storage at this time.

  14. Eric Miller says:

    This certainly is disconcerting, especially since backing up VMs requires mounting VMDKs to a backup virtual appliance. I wonder if snapshotting and mounting the disks to the backup appliance doubles the amount of “open VMDK file space” for a given VMDK.

    Scalability seems to have always been low on the VMware priority list, in my opinion.

    Eric

  15. Eric says:

    Is there any reason “not” to just max out the heap size? I presume there’s a reason VMware set a smaller value. Is memory usage the only con to increasing it? Meaning, is there any performance overhead with increasing the setting?