Posts Tagged ‘ESXi’

Monster VMs & ESX(i) Heap Size: Trouble In Storage Paradise

September 12th, 2012

While running Microsoft Exchange Server Jetstress on vSphere 5 VMs in the lab, tests were failing about mid way through initializing its several TBs of databases.  This was a real head scratcher.  Symptoms were unwritable storage or lack of storage capacity.  Troubleshooting yielding errors such as “Cannot allocate memory”.  After some tail chasing, the road eventually lead to VMware KB article 1004424: An ESXi/ESX host reports VMFS heap warnings when hosting virtual machines that collectively use 4 TB or 20 TB of virtual disk storage.

As it turns out, ESX(i) versions 3 through 5 have a statically defined per-host heap size:

  • 16MB for ESX(i) 3.x through 4.0: Allows a max of 4TB open virtual disk capacity (again, per host)
  • 80MB for ESX(i) 4.1 and 5.x: Allows a max of 8TB open virtual disk capacity (per host)

This issue isn’t specific to Jetstress, Exchange, Microsoft, or a specific fabric type, storage protocol or storage vendor.  Exceeding the virtual disk capacities listed above, per host, results in the symptoms discussed earlier and memory allocation errors.  In fact, if you take a look at the KB article, there’s quite a laundry list of possible symptoms depending on what task is being attempted:

  • An ESXi/ESX 3.5/4.0 host has more that 4 terabytes (TB) of virtual disks (.vmdk files) open.
  • After virtual machines are migrated by vSphere HA from one host to another due to a host failover, the virtual machines fail to power on with the error:vSphere HA unsuccessfully failed over this virtual machine. vSphere HA will retry if the maximum number of attempts has not been exceeded. Reason: Cannot allocate memory.
  • You see warnings in /var/log/messages or /var/log/vmkernel.logsimilar to:vmkernel: cpu2:1410)WARNING: Heap: 1370: Heap_Align(vmfs3, 4096/4096 bytes, 4 align) failed. caller: 0x8fdbd0
    vmkernel: cpu2:1410)WARNING: Heap: 1266: Heap vmfs3: Maximum allowed growth (24) too small for size (8192)
    cpu15:11905)WARNING: Heap: 2525: Heap cow already at its maximum size. Cannot expand.
    cpu15:11905)WARNING: Heap: 2900: Heap_Align(cow, 6160/6160 bytes, 8 align) failed. caller: 0x41802fd54443
    cpu4:1959755)WARNING:Heap: 2525: Heap vmfs3 already at its maximum size. Cannot expand.
    cpu4:1959755)WARNING: Heap: 2900: Heap_Align(vmfs3, 2099200/2099200 bytes, 8 align) failed. caller: 0x418009533c50
    cpu7:5134)Config: 346: “SIOControlFlag2” = 0, Old Value: 1, (Status: 0x0)
  • Adding a VMDK to a virtual machine running on an ESXi/ESX host where heap VMFS-3 is maxed out fails.
  • When you try to manually power on a migrated virtual machine, you may see the error:The VM failed to resume on the destination during early power on.
    Reason: 0 (Cannot allocate memory).
    Cannot open the disk ‘<<Location of the .vmdk>>’ or one of the snapshot disks it depends on.
  • The virtual machine fails to power on and you see an error in the vSphere client:An unexpected error was received from the ESX host while powering on VM vm-xxx. Reason: (Cannot allocate memory)
  • A similar error may appear if you try to migrate or Storage vMotion a virtual machine to a destination ESXi/ESX host on which heap VMFS-3 is maxed out.
  • Cloning a virtual machine using the vmkfstools -icommand fails and you see the error:Clone: 43% done. Failed to clone disk: Cannot allocate memory (786441)
  • In the /var/log/vmfs/volumes/DatastoreName/VirtualMachineName/vmware.log file, you may see error messages similar to:2012-05-02T23:24:07.900Z| vmx| FileIOErrno2Result: Unexpected errno=12, Cannot allocate memory
    2012-05-02T23:24:07.900Z| vmx| AIOGNRC: Failed to open ‘/vmfs/volumes/xxxx-flat.vmdk’ : Cannot allocate memory (c00000002) (0x2013).
    2012-05-02T23:24:07.900Z| vmx| DISKLIB-VMFS : “/vmfs/volumes/xxxx-flat.vmdk” : failed to open (Cannot allocate memory): AIOMgr_Open failed. Type 3
    2012-05-02T23:24:07.900Z| vmx| DISKLIB-LINK : “/vmfs/volumes/xxxx.vmdk” : failed to open (Cannot allocate memory).
    2012-05-02T23:24:07.900Z| vmx| DISKLIB-CHAIN : “/vmfs/volumes/xxxx.vmdk” : failed to open (Cannot allocate memory).
    2012-05-02T23:24:07.900Z| vmx| DISKLIB-LIB : Failed to open ‘/vmfs/volumes/xxxx.vmdk’ with flags 0xa Cannot allocate memory (786441).
    2012-05-02T23:24:07.900Z| vmx| DISK: Cannot open disk “/vmfs/volumes/xxxx.vmdk”: Cannot allocate memory (786441).
    2012-05-02T23:24:07.900Z| vmx| Msg_Post: Error
    2012-05-02T23:24:07.900Z| vmx| [msg.disk.noBackEnd] Cannot open the disk ‘/vmfs/volumes/xxxx.vmdk’ or one of the snapshot disks it depends on.
    2012-05-02T23:24:07.900Z| vmx| [msg.disk.configureDiskError] Reason: Cannot allocate memory.

While VMware continues to raise the scale and performance bar for it’s vCloud Suite, this virtual disk and heap size limitation becomes a limiting constraint for monster VMs or vApps.  Fortunately, there’s a fairly painless resolution (at least up until a certain point):  Increase the Heap Size beyond its default value on each host in the cluster and reboot each host.  The advanced host setting to configure is VMFS3.MaxHeapSizeMB.

Let’s take another look at the default heap size and with the addition of its maximum allowable heap size value:

  • ESX(i) 3.x through 4.0:
    • Default value: 16MB – Allows a max of 4TB open virtual disk capacity
    • Maximum value: 128MB – Allows a max of 32TB open virtual disk capacity per host
  • ESX(i) 4.1 and 5.x:
    • Default value: 80MB – Allows a max of 8TB open virtual disk capacity
    • Maximum value: 256MB – Allows a max of 25TB open virtual disk capacity per host

After increasing the heap size and performing a reboot, the ESX(i) kernel will consume additional memory overhead equal to the amount of heap size increase in MB.  For example, on vSphere 5, the increase of heap size from 80MB to 256MB will consume an extra 176MB of base memory which cannot be shared with virtual machines or other processes running on the host.

Readers may have also noticed an overall decrease in the amount of open virtual disk capacity per host supported in newer generations of vSphere.  While I’m not overly concerned at the moment, I’d bet someone out there has a corner case requiring greater than 25TB or even 32TB of powered on virtual disk per host.  With two of VMware’s core value propositions being innovation and scalability, I would tip-toe lightly around the phrase “corner case” – it shouldn’t be used as an excuse for its gaps while VMware pushes for 100% data virtualization and vCloud adoption.  Short term, the answer may be RDMs. Longer term: vVOLS.

Updated 9/14/12: There are some questions in the comments section about what types of stoarge the heap size constraint applies to.  VMware has confirmed that heap size and max virtual disk capacity per host applies to VMFS only. The heap size constraint does not apply to RDMs nor does it apply to NFS datastores.

Updated 4/4/13: VMware has released patch ESXi500-201303401-BG to address heap issues.  This patch makes improvements to both default and maximum limits of open VMDK files per vSphere host.  After applying the above patch to each host, the default heap size for VMFS-5 datastores becomes 640MB which supports 60TB of open VMDK files per host.  These new default configurations are also the maximum values as well.  For additional reading on other fine blogs, see A Small Adjustment and a New VMware Fix will Prevent Heaps of Issues on vSphere VMFS Heap and The Case for Larger Than 2TB Virtual Disks and The Gotcha with VMFS.

Updated 4/30/13: VMware has released vSphere 5.1 Update 1 and as Cormac has pointed out here, heap issue resolution has been baked into this release as follows:

  1. VMFS heap can grow up to a maximum of 640MB compared to 256MB in earlier release. This is identical to the way that VMFS heap size can grow up to 640MB in a recent patch release (patch 5) for vSphere 5.0. See this earlier post.
  2. Maximum heap size for VMFS in vSphere 5.1U1 is set to 640MB by default for new installations. For upgrades, it may retain the values set before upgrade. In such cases, please set the values manually.
  3. There is also a new heap configuration “VMFS3.MinHeapSizeMB” which allows administrators to reserve the memory required for the VMFS heap during boot time. Note that “VMFS3.MinHeapSizeMB” cannot be set more than 255MB, but if additional heap is required it can grow up to 640MB. It alleviates the heap consumption issue seen in previous versions, allowing the ~ 60TB of open storage on VMFS-5 volumes per host to be accessed.

When reached for comment, Monster VM was quoted as saying “I’m happy about these changes and look forward to a larger population of Monster VMs like myself.”

photo

VMworld 2012 Announcements – Part I

August 27th, 2012

VMworld 2012 is underway in San Francisco.  Once again, a record number of attendees is expected to gather at the Moscone Center to see what VMware and their partners are announcing.  From a VMware perspective, there is plenty.

Given the sheer quantity of announcements, I’m actually going to break up them up into a few parts, this post being Part I.  Let’s start with the release of vSphere 5.1 and some of its notable features.

Enhanced vMotion – the ability to now perform a vMotion as well as a Storage vMotion simultaneously. In addition, this becomes an enabler to perform vMotion without the shared storage requirement.  Enhanced vMotion means we are able to migrate a virtual machine stored on local host storage, to shared storage, and then to local storage again.  Or perhaps migrate virtual machines from one host to another with each having their own locally attached storage only.  Updated 9/5/12 The phrase “Enhanced vMotion” should be correctly read as “vMotion that has been enhanced”.  “Enhanced vMotion” is not an actual feature, product, or separate license.  It is an improvement over the previous vMotion technology and included wherever vMotion is bundled.

Snagit Capture

Enhanced vMotion Requirements:

  • Hosts must be managed by same vCenter Server
  • Hosts must be part of same Datacenter
  • Hosts must be on the same layer-2 network (and same switch if VDS is used)

Operational Considerations:

  • Enhanced vMotion is a manual process
  • DRS and SDRS automation do not leverage enhanced vMotion
  • Max of two (2) concurrent Enhanced vMotions per host
  • Enhanced vMotions count against concurrent limitations for both vMotion and Storage vMotion
  • Enhanced vMotion will leverage multi-NIC when available

Next Generation vSphere Client a.k.a. vSphere Web Client – An enhanced version of the vSphere Web Client which has already been available in vSphere 5.0.  As of vSphere 5.1, the vSphere Web Client becomes the defacto standard client for managing the vSphere virtualized datacenter.  Going forward, single sign-on infrastructure management will converge into a unified interface which any administrator can appreciate.  vSphere 5.1 will be the last platform to include the legacy vSphere client. Although you may use this client day to day while gradually easing into the Web Client, understand that all future development from VMware and its partners now go into the Web Client. Plug-ins currently used today will generally still function with the legacy client (with support from their respective vendors) but they’ll need to be completely re-written vCenter Server side for the Web Client.  Aside from the unified interface, the architecture of the Web Client has scaling advantages as well.  As VMware adds bolt-on application functionality to the client, VMware partners will now have the ability to to bring their own custom objects objects into the Web Client thereby extending that single pane of glass management to other integrations in the ecosystem.

 

Here is a look at that vSphere Web Client architecture:

Snagit Capture

Requirements:

  • Internet Explorer / FireFox / Chrome
  • others (Safari, etc.) are possible, but will lack VM console access

A look at the vSphere Web Client interface and its key management areas:

Snagit Capture

Where the legacy vSphere Client fall short and now the vSphere Web Client solves these issues:

  • Single Platform Support (Windows)
    • vSphere Web Client is Platform Agnostic
  • Scalability Limits
    • Built to handle thousands of objects
  • White Screen of Death
    • Performance
  • Inconsistent look and feel across VMware solutions
    • Extensibility
  • Workflow Lock
    • Pause current task and continue later right where you left off (this one is cool!)
    • Browser Behavior
  • Upgrades
    • Upgrade a Single serverside component

 vCloud Director 5.1

In the recent past, VMware aligned common application and platform releases to ease issues that commonly occurred with compatibility.  vCloud Director, the cornerstone of the vCloud Suite, is obviously the cornerstone in how VMware will deliver infrastructure, applications, and *aaS now and into the future. So what’s new in vCloud Director 5.1?  First an overview of the vCloud Suite:

Snagit Capture

And a detailed list of new features:

  • Elastic Virtual Datacenters – Provider vDCs can span clusters leveraging VXLAN allowing the distribution and mobility of vApps across infrastructure and the growing the vCloud Virtual Datacenter
  • vCloud Networking & Security VXLAN
  • Profile-Driven Storage integration with user and storage provided capabilities
  • Storage DRS (SDRS) integration
    • Exposes storage Pod as first class storage container (just like a datastores) making it visible in all workflows where a datastore is visible
    • Creation, modification, and deletion of spods not possible in vCD
    • Member datastore operations not permissible in VCD
  • Single level Snapshot & Revert support for vApps (create/revert/remove); integration with Chargeback
  • Integrated vShield Edge Gateway
  • Integrated vShield Edge Configuration
  • vCenter Single Sign-On (SSO)
  • New Features in Networking
    • Integrated Organization vDC Creation Workflow
    • Creates compute, storage, and networking objects in a single workflow
    • The Edge Gateway are exposed at Organization vDC level
    • Organization vDC networks replace Organization networks
    • Edge Gateways now support:
      • Multiple Interfaces on a Edge Gateway
      • The ability to sub-allocate IP pools to a Edge Gatewa
      • Load balancing
      • HA (not the same as vSphere HA)
        • Two edge VMs deployed in Active-Passive mode
        • Enabled at time of gateway creation
        • Can also be changed after the gateway has been completed
        • Gets deployed with first Organizational network created that uses this gateway
      • DNS Relay
        • Provides a user selectable checkbox to enable
        • If DNS servers are defined for the selected external network, DNS requests will be sent to the specified server. If not, then DNS requests will be sent to the default gateway of the external network.
      • Rate limiting on external interface
    • Organization networks replaced by Organization vDC Networks
      • Organization vDC Networks are associated with an Organization vDC
      • The network pool associated with Organization vDC is used to create routed and isolated Organization vDC networks
      • Can be shared across Organization vDCs in an Organization
    • Edge Gateways
      • Are associated with an Organization vDC, can not be shared across Organization vDCs
      • Can be connected to multiple external networks
        • Multiple routed Organization vDC networks will be connected to the same Edge Gateway
      • External network connectivity for the Organization vDC Network can be changed after creation by changing the external networks which the edge gateway is connected.
      • Allows IP pool of external networks to be sub-allocated to the Edge Gateway
        • Needs to be specified in case of NAT and Load Balancer
    • New Features in Gateway Services
      • Load balancer service on Edge Gateways
      • Ability to add multiple subnets to VPN tunnels
      • Ability to add multiple DHCP IP pools
      • Ability to add explicit SNAT and DNAT rules providing user with full control over address translation
      • IP range support in Firewall and NAT services
      • Service Configuration Changes
        • Services are configured on Edge Gateway instead of at the network level
        • DHCP can be configured on Isolated Organization vDC networks.
  • Usability Features
    • New default branding style
      • Cannot revert back to the Charcoal color scheme
      • Custom CSS files will require modification
    • Improved “Add vApp from Catalog” wizard workflow
    • Easy access to VM Quota and Lease Expirations
    • New dropdown menu that includes details and search
    • Redesigned catalog navigation and sub-entity hierarchy
    • Enhanced help and documentation links
  • Virtual Hardware Version 9
    • Supports features presented by HW9 (like 64 CPU support)
    • Supports Hardware Virtualization Calls
    • VT-x/EPT or AMD-V/RVI
    • Memory overhead increased, vMotion limited to like hardware
    • Enable/Disable exposed to users who have rights to create a vApp Template
  • Additional Guest OS Support
    • Windows 8
    • Mac OS 10.5, 10.6 and 10.7
  • Storage Independent of VM Feature
    • Added support for Independent Disks
    • Provides REST API support for actions on Independent Disks
      • As these consume disk space, the vCD UI was updated to show user when they are used:
      • Organizations List Page
      • A new Independent Disks count column is added.
      • Organization Properties Page
      • Independent Disks tab is added to show all independent disks belonging to vDC
      • Tab is not shown if no independent disk exists in the vDC.
      • Virtual Machine Properties Page
      • Hardware tab->Hard Disks section, attached independent disks are shown by their names and all fields for the disk are disabled as they are not editable.

That’s all I have time for right now.  As I said, there is more to come later on topics such as vDS enhancements, VXLAN, SRM, vCD Load Balancing, and vSphere Replication.  Stay tuned!

StarWind and Cirrus Tech Partner to Deliver Cutting Edge Technologies to the Cloud Computing Market

August 12th, 2012

Press Release

StarWind Solutions Become Available Through a Leading Canadian Web Hosting Company

Burlington, MA – 6 August 2012StarWind Software Inc., an innovative provider of storage virtualization software and VM backup technology, announced today a new partnership agreement with Cirrus Tech Ltd., a Canadian web hosting company specializing in VPS, VM and cloud hosting services. Companies collaborate to deliver best-in-breed Cloud services that help customers accelerate their businesses.

According to the agreement, Cirrus Tech extends its portfolio with StarWind storage virtualization software and will offer it to their customers as a dedicated storage platform that delivers a highly available and high performance scalable storage infrastructure that is capable of supporting heterogeneous server environments; as Cloud storage for private clouds as well as a robust solution for building Disaster Recovery (DR) plans.

StarWind SAN solutions deliver a wide variety of enterprise-class features, such as High Availability (HA), Synchronous Data Mirroring, Remote Asynchronous Replication, CDP/Snapshots, Thin Provisioning, Global Deduplication, etc., that make the stored data highly available, simplify storage management, and ensure business continuity and disaster recovery.

“Companies are increasingly turning to cloud services to gain efficiencies and respond faster to today’s changing business requirements.” said Artem Berman, Chief Executive Officer of StarWind Software, Inc. “We are pleased to combine our forces with Cirrus Tech in order to deliver our customers a wide range of innovative cloud services that will help their transition to a flexible and efficient shared IT infrastructure.”

“Every business needs to consider what would happen in the event of a disaster,” shares Cirrus CEO Ehsan Mirdamadi. “By bringing StarWind’s SAN solution to our customers, we are helping them to ease the burden of disaster recovery planning by offering powerful and affordable storage options. You never want to think of the worst, but when it comes to your sensitive data and business critical web operations, it’s always better to be safe than sorry. Being safe just got that much easier for Cirrus customers.”

To find out more about Cirrus’ web hosting services visit http://www.cirrushosting.com or call 1.877.624.7787.
For more information about StarWind, visit www.starwindsoftware.com

About Cirrus Hosting
Cirrus Tech Ltd. has been a leader in providing affordable, dependable VHS and VPS hosting services in Canada since 1999. They have hosted and supported hundreds of thousands of websites and applications for Canadian businesses and clients around the world. As a BBB member with an A+ rating, Cirrus Tech is a top-notch Canadian web hosting company with professional support, rigorous reliability and easily upgradable VPS solutions that grow right alongside your business. Their Canadian data center is at 151 Front Street in Toronto.

About StarWind Software Inc.
StarWind Software is a global leader in storage management and SAN software for small and midsize companies. StarWind’s flagship product is SAN software that turns any industry-standard Windows Server into a fault-tolerant, fail-safe iSCSI SAN. StarWind iSCSI SAN is qualified for use with VMware, Hyper-V, XenServer and Linux and Unix environments. StarWind Software focuses on providing small and midsize companies with affordable, highly availability storage technology which previously was only available in high-end storage hardware. Advanced enterprise-class features in StarWind include Automated HA Storage Node Failover and Failback (High Availability), Replication across a WAN, CDP and Snapshots, Thin Provisioning and Virtual Tape management.

Since 2003, StarWind has pioneered the iSCSI SAN software industry and is the solution of choice for over 30,000 customers worldwide in more than 100 countries and from small and midsize companies to governments and Fortune 1000 companies.

For more information on StarWind Software Inc., visit: www.starwindsoftware.com

Storage: Starting Thin and Staying Thin with VAAI UNMAP

June 28th, 2012

For me, it’s hard to believe nearly a year has elapsed since vSphere 5 was announced on July 12th.  Among the many new features that shipped was an added 4th VAAI primitive for block storage.  The primitive itself revolved around thin provisioning and was the sum of two components: UNMAP and STUN.  At this time I’m going to go through the UNMAP/Block Space Reclamation process in a lab environment and I’ll leave STUN for a later discussion.

Before I jump into the lab, I want frame out a bit of a chronological timeline around the new primitive.  Although this 4th primitive was formally launched with vSphere 5 and built into the corresponding platform code that shipped, a few months down the road VMware issued a recall on the UNMAP portion of the primitive due to a discovery made either in the field or in their lab environment.  With the UNMAP component recalled, the Thin Provisioning primitive as a whole (including the STUN component) was not supported by VMware.  Furthermore, storage vendors could not be certified for the Thin Provisioning VAAI primitive although the features may have been functional if their respective arrays supported it.  A short while later, VMware released a patch which, once installed on the ESXi hosts, disabled the UNMAP functionality globally.  In March of this year, VMware released vSphere 5.0 Update 1.  With this release, VMware implemented the necessary code to resolve the performance issues related to UNMAP.  However, VMware did not re-enable the automatic UNMAP mechanism.  Instead and in the interim, VMware implemented a manual process for block space reclamation on a per datastore basis regardless of the global UNMAP setting on the host.  I believe it is VMware’s intent to bring back “automatic” UNMAP long term but that is purely speculation.  This article will walk through the manual process of returning unused blocks to a storage array which supports both thin provisioning and the UNMAP feature.

I also want to point out some good information that already exists on UNMAP which introduces the feature and provides a good level of detail.

  • Duncan Epping wrote this piece about a year ago when the feature was launched.
  • Cormac Hogan wrote this article in March when vSphere 5.0 Update 1 was launched and the manual UNMAP process was re-introduced.
  • VMware KB 2014849 Using vmkfstools to reclaim VMFS deleted blocks on thin-provisioned LUNs

By this point, if you are unaware of the value of UNMAP, it is simply keeping thin provisioned LUNs thin.  By doing so, raw storage is consumed and utilized in the most efficient manner yielding cost savings and better ROI for the business. Arrays which support thin provisioning have been shipping for years.  What hasn’t matured is just as important as thin provisioning itself: the ability to stay thin where possible.  I’m going to highlight this below in a working example but basically once pages are allocated from a storage pool, they remain pinned to the volume they were originally allocated for, even after the data written to those pages has been deleted or moved.  Once the data is gone, the free space remains available to that particular LUN and the storage host which owns it and will continue to manage it – whether or not that free space will ever be needed again in the future for that storage host.  Without UNMAP, the pages are never released back to the global storage pool where they may be allocated to some other LUN or storage host whether it be virtual or physical.  Ideal use cases for UNMAP:  Transient data, Storage vMotion, SDRS, data migration. UNMAP functionality requires the collaboration of both operating system and storage vendors.  As an example, Dell Compellent Storage Center has supported the T10 UNMAP command going back to early versions of the 5.x Storage Center code, however there has been very little adoption on the OS platform side which is responsible for issuing the UNMAP command to the storage array when data is deleted from a volume.  RHEL 6 supports it, vSphere 5.0 Update 1 now supports it, and Windows Server 2012 is slated to be the first Windows platform to support UNMAP.

UNMAP in the Lab

So in the lab I have a vSphere ESXi 5.0 Update 1 host attached to a Dell Compellent Storage Center SAN.  To demonstrate UNMAP, I’ll Storage vMotion a 500GB virtual machine from one 500GB LUN to another 500GB LUN.  As you can see below from the Datastore view in the vSphere Client, the 500GB VM is already occupying lun1 and an alarm is thrown due to lack of available capacity on the datastore:

Snagit Capture

Looking at the volume in Dell Compellent Storage Center, I can see that approximately 500GB of storage is being consumed from the storage page pool. To keep the numbers simple, I’ll ignore actual capacity consumed due to RAID overhead.

Snagit Capture

After the Storage vMotion

I’ve now performed a Storage vMotion of the 500GB VM from lun1 to lun2.  Again looking at the datastores from a vSphere client perspective, I can see that lun2 is now completely consumed with data while lun1 is no longer occupied – it now has 500GB  capacity available.  This is where operating systems and storage arrays which do not support UNMAP fall short of keeping a volume thin provisioned.

Snagit Capture

Using the Dell Compellent vSphere Client plug-in, I can see that the 500GB of raw storage originally allocated for lun1 remains pinned with lun1 even though the LUN is empty!  I’m also occupying 500GB of additional storage for the virtual machine now residing on lun2.  The net here is that as a result of my Storage vMotion, I’m occupying nearly 1TB of storage capacity for a virtual machine that’s half the size.  If I continue to Storage vMotion this virtual machine to other LUNs, the problem is compounded and the available capacity in the storage pool continues to drain, effectively raising the high watermark of consumed storage.  To add insult to injury, this will more than likely be stranded Tier 1 storage – backed by the most expensive spindles in the array.

Snagit Capture

Performing the Manual UNMAP

Using a PuTTY connection to the ESXi host, I’ll start with identifying the naa ID of my datastore using esxcli storage core device list |more

Snagit Capture

Following the KB article above, I’ll make sure my datastore supports the UNMAP primitive using esxcli storage core device vaai status get -d <naa ID>.  The output shows UNMAP is supported by Dell Compellent Storage Center, in addition to the other three core VAAI primitives (Atomic Test and Set, Copy Offload, and Block Zeroing).

Snagit Capture

I’ll now change to the directory of the datastore and perform the UNMAP using vmkfstools -y 100.  It’s worth pointing out here that using a value of 100, although apparently supported, ultimately fails.  I reran the command using a value of 99% which successfully unmapped 500GB in about 3 minutes.

Snagit Capture

Also important to note is VMware recommends the reclaim be run after hours or during a maintenance window with maximum recommended reclaim percentage of 60%.  This value is pointed out by Duncan in the article I linked above and it’s also noted when providing a reclaim value outside of the acceptable parameters of 0-100.  Here’s the reasoning behind the value:  When the manual UNMAP process is run, it balloons up a temporary hidden file at the root of the datastore which the UNMAP is being run against.  You won’t see this balloon file with the vSphere Client’s Datastore Browser as it is hidden.  You can catch it quickly while UNMAP is running by issuing the ls -l -a command against the datastore directory.  The file will be named .vmfsBalloon along with a generated suffix.  This file will quickly grow to the size of data being unmapped (this is actually noted when the UNMAP command is run and evident in the screenshot above).  Once the UNMAP is completed, the .vmfsBalloon file is removed.  For a more detailed explanation behind the .vmfsBalloon file, check out this blog article.

Snagit Capture

The bottom line is that the datastore needs as much free capacity as what is being unmapped.  VMware’s recommended value of 60% reclaim is actually a broad assumption that the datastore will have at least 60% capacity available at the time UNMAP is being run.  For obvious reasons, we don’t want to run the datastore out of capacity with the .vmfsBalloon file, especially if there are still VMs running on it.  My recommendation if you are unsure or simply bad at math: start with a smaller percentage of block reclaim initially and perform multiple iterations of UNMAP safely until all unused blocks are returned to the storage pool.

To wrap up this procedure, after the UNMAP step has been run with a value of 99%, I can now see from Storage Center that nearly all pages have been returned to the page pool and 500gbvol1 is only consuming a small amount of raw storage comparatively – basically the 1% I wasn’t able to UNMAP using the value of 99% earlier.  If I so chose, I could run the UNMAP process again with a value of 99% and that should return just about all of the 2.74GB still being consumed, minus the space consumed for VMFS-5 formatting.

Snagit Capture

The last thing I want to emphasize is that today, UNMAP works at the VMFS datastore layer and isn’t designed to work inside the encapsulated virtual machine.  In other words, if I delete a file inside a guest operating system running on top of the vSphere hypervisor with attached block storage, that space can’t be liberated with UNMAP.  As a vSphere and storage enthusiast, for me that would be next on the wish list and might be considered by others as the next logical step in storage virtualization.  And although UNMAP doesn’t show up in Windows platforms until 2012, Dell Compellent has developed an agent which accomplishes the free space recovery on earlier versions of Windows in combination with a physical raw device mapping (RDM).

Update 7/2/12: VMware Labs released its latest fling – Guest Reclaim.

From labs.vmware.com:

Guest Reclaim reclaims dead space from NTFS volumes hosted on a thin provisioned SCSI disk. The tool can also reclaim space from full disks and partitions, thereby wiping off the file systems on it. As the tool deals with active data, please take all precautionary measures understanding the SCSI UNMAP framework and backing up important data.

Features

  • Reclaim space from Simple FAT/NTFS volumes
  • Works on WindowsXP to Windows7
  • Can reclaim space from flat partitions and flat disks
  • Can work in virtual as well as physical machines

Whats a Thin provisioned (TP) SCSI disks? In a thin provisioned LUN/Disk, physical storage space is allocated on demand. That is, the storage system allocates space as and when a client (example a file system/database) writes data to the storage medium. One primary goal of thin provisioning is to allow for storage overcommit. A thin provisioned disk can be a virtual disk, or a physical LUN/disk exposed from a storage array that supports TP. Virtual disks created as thin disks are exposed as TP disks, starting with virtual Hardware Version 9. For more information on this please refer http://en.wikipedia.org/wiki/Thin_provisioning. What is Dead Space Reclamation?Deleting files frees up space on the file system volume. This freed space sticks with the LUN/Disk, until it is released and reclaimed by the underlying storage layer. Free space reclamation allows the lower level storage layer (for example a storage array, or any hypervisor) to repurpose the freed space from one client for some other storage allocation request. For example:

  • A storage array that supports thin provisioning can repurpose the reclaimed space to satisfy allocation requests for some other thin provisioned LUN within the same array.
  • A hypervisor file system can repurpose the reclaimed space from one virtual disk for satisfying allocation needs of some other virtual disk within the same data store.

GuestReclaim allows transparent reclamation of dead space from NTFS volumes. For more information and detailed instructions, view the Guest Reclaim ReadMe (pdf)

Update 5/14/13: Excerpt from Cormac Hogan’s vSphere storage blog: “We’ve recently been made aware of a limitation on our UNMAP mechanism in ESXi 5.0 & 5.1. It would appear that if you attempt to reclaim more than 2TB of dead space in a single operation, the UNMAP primitive is not handling this very well.” Read more about it here: Heads Up! UNMAP considerations when reclaiming more than 2TB s

Update 9/13/13: vSphere 5.5 UNMAP Deep Dive

Using vim-cmd To Power On Virtual Machines

June 21st, 2012

I’ve been pretty lucky in that since retiring the UPS equipment in the lab, the flow of electricity to the lab has been both clean and consistent.  We get some nasty weather and high winds in this area but I’ll bet there hasn’t been an electrical outage in close to 2+ years.  Well, early Tuesday morning we had a terrible storm with hail and winds blowing harder than ever.  I ended up losing some soffits, window screens, and a two stall garage door.  A lot of mature trees were also lost in the surrounding area.  I was pretty sure we’d be losing electricity and the lab would go down hard – and it did.

If you’re familiar with my lab, you might know that it’s 100% virtualized.  Every infrastructure service, including DHCP, DNS, and Active Directory, resides in a virtual machine.  This is great when the environment is stable, but recovering this type of environment from a complete outage can be a little hairy.  After bringing the network, storage, and ESXi hosts online, I still have no DHCP on the network from which I’d leverage to open the vSphere Client and connect to an ESXi host.  What this means is that I typically will bring up a few infrastructure VMs from the ESXi host TSM (Tech Support Mode) console.  No problem, I’ve done this many times in the past using vmware-cmd.

Snagit Capture

Well, on ESXi 5.0 Update 1, vmware-cmd no longer brings joy.  The command has apparently been deprecated and replaced by /usr/bin/vim-cmd.

Snagit Capture

Before I can start my infrastructure VMs using vim-cmd, I need to find their corresponding Vmid using vim-cmd vmsvc/getallvms (add |more at the end to pause at each page of a long list of registered virtual machines):

Snagit Capture

Now that I have the Vmid for the infrastructure VM I want to power on, I can power it on using vim-cmd vmsvc/power.on 77.  At this point I’ll have DHCP and I can use the vSphere Client on my workstation to power up the remaining virtual machines in order.  Or, I can continue using vim-cmd to power on virtual machines.

Snagit Capture

As you can see from the output below, there is much more that vim-cmd can accomplish within the virtual machine vmsvc context:

Snagit Capture

Take a quick look at this in your lab. Command line management is popular on the VCAP-DCA exams. Knowing this could prove useful in the exam room or the datacenter the next time you experience an outage.

Invitation to Dell/Sanity Virtualization Seminar

May 22nd, 2012

I know this is pretty short notice but I wanted to make local readers aware of a lunch event taking place tomorrow between 11:00am and 1:30pm.  Dell and Sanity Solutions will be discussing storage technologies for your vSphere virtualized datacenter, private, public, or  hybrid cloud.  I’ll be on hand as well talking about some of the key integration points between the vSphere and Storage Center.  You can find full details in the brochure below.  Click on it or this text to get yourself registered and we’ll hope to see you tomorrow.

Snagit Capture

AppAssure Webinar – The Top 3 Backup Technologies for 2012

April 24th, 2012

Webinar Announcement:

What: The Top 3 Backup Technologies for 2012

When: Wednesday, April 25, 12:00 PM CST

Where: http://go.appassure.com/webinar-top-3-backup-technologies-042512-gmt

Details: FEATURED SPEAKER: Joseph Hand, Sr. Director of Product Strategy, AppAssure Software.  Learn about the 3 Backup & Recovery technologies you need for 2012!  Instant Restore with near-zero RTO, Backup and Recovery auto-verification, and Cross-platform Data Recovery – P2V, V2V, V2P.

For more on this and other AppAssure webinars, please visit http://www.appassure.com/live-webinars/