vSphere 5.5 UNMAP Deep Dive

September 13th, 2013 by jason

One of the features that has been updated in vSphere 5.5 is UNMAP, which is one of two sub-components of what I’ll call the fourth block storage based thin provisioning VAAI primitive (the other sub-component is thin provisioning stun).  I’ve already written about UNMAP a few times in the past.  It was first introduced in vSphere 5.0 two years ago.  A few months later the feature was essentially recalled by VMware.  After it was re-released in vSphere 5.0 Update 1, I wrote about its use here and followed up with a short piece about the .vmfsBalloon file here.

For those unfamiliar, UNMAP is a space reclamation mechanism used to return blocks of storage back to the array after data which was once occupying those blocks has been moved or deleted.  The common use cases are deleting a VM from a datastore, Storage vMotion of a VM from a datastore, or consolidating/closing vSphere snapshots on a datastore.  All of these operations, in the end, involve deleting data from pinned blocks/pages on a volume.  Without UNMAP, these pages, albeit empty and available for future use by vSphere and its guests only, remain pinned to the volume/LUN backing the vSphere datastore.  The pages are never returned back to the array for use with another LUN or another storage host.  Notice I did not mention shrinking a virtual disk or a datastore – neither of those operations are supported by VMware.  I also did not mention the use case of deleting data from inside a virtual machine – while that is not supported, I believe there is a VMware fling for experimental use.  In summary, UNMAP extends the usefulness of thin provisioning at the array level by maintaining storage efficiency throughout the life cycle of the vSphere environment and the array which supports the UNMAP VAAI primitive.

On the Tuesday during VMworld, Cormac Hogan launched his blog post introducing new and updated storage related features in vSphere 5.5.  One of those features he summarized was UNMAP.  If you haven’t read his blog, I’d definitely recommend taking a look – particularly if you’re involved with vSphere storage.  I’m going to explore UNMAP in a little more detail.

The most obvious change to point out is the command line itself used to initiate the UNMAP process.  In previous versions of vSphere, the command issued on the vSphere host was:

vmkfstools -y x (where x represents the % of storage to unmap)

As Cormac points out, UNMAP has been moved to the esxcli namespace in vSphere 5.5 (think remote scripting opportunities, for example kicking off UNMAP after some other scheduled process completes) where the basic command syntax is now:

esxcli storage vmfs unmap

In addition to the above, there are three switches available for use; one of the first two listed below is required, and the third is optional.

-l|--volume-label=<str> The label of the VMFS volume to unmap the free blocks.

-u|--volume-uuid=<str> The uuid of the VMFS volume to unmap the free blocks.

-n|--reclaim-unit=<long> Number of VMFS blocks that should be unmapped per iteration.

Previously with vmkfstools, we’d change to the VMFS folder from which we were going to UNMAP blocks.  In vSphere 5.5, the esxcli command can be run from anywhere, so specifying the datastore name or the uuid is one of the required parameters for obvious reasons.  So using the datastore name, the new UNMAP command in vSphere 5.5 is going to look like this:

esxcli storage vmfs unmap -l 1tb_55ds
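
The same operation can alternatively be keyed off the volume uuid with the -u switch instead of the label; for example, using the uuid of the example datastore shown later in this article:

esxcli storage vmfs unmap -u 5232dd00-0882a1e4-e918-0025b3abd8e0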

As for the optional parameter, the UNMAP command is an iterative process which continues through numerous cycles until complete.  The reclaim unit parameter specifies the quantity of blocks to unmap per iteration of the UNMAP process.  In previous versions of vSphere, VMFS-3 datastores could have block sizes of 1, 2, 4, or 8MB.  While upgrading a VMFS-3 datastore to VMFS-5 will maintain these block sizes, a native net-new VMFS-5 datastore works with a 1MB block size only.  Therefore, if a reclaim unit value of 100 is specified on a VMFS-5 datastore with a 1MB block size, then 100MB of data will be returned to the available raw storage pool per iteration until all blocks marked available for UNMAP are returned.  Using a value of 100, the UNMAP command looks like this:

esxcli storage vmfs unmap -l 1tb_55ds -n 100

If the reclaim unit value is unspecified when issuing the UNMAP command, the default reclaim unit value is 200, resulting in 200MB of data returned to the available raw storage pool per iteration assuming a 1MB block size datastore.
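
If you’re not sure what block size a given datastore is using (an upgraded VMFS-3 volume, for example), it’s worth confirming before picking a reclaim unit value.  Something along these lines should report it, reusing the 1tb_55ds datastore name from the examples above:

vmkfstools -Ph /vmfs/volumes/1tb_55ds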

One additional piece to note on the CLI topic is that in a release candidate build I was working with, while the old vmkfstools -y command is deprecated, it appears to still exist but with newer vSphere 5.5 functionality published in the --help section:

vmkfstools vmfsPath -y --reclaimBlocks vmfsPath [--reclaimBlocksUnit #blocks]

The next change involves the hidden temporary balloon file (refer to my link at the top if you’d like more information about the balloon file; basically it’s a mechanism used to guarantee blocks targeted for UNMAP are not written to by an outside I/O request before the UNMAP process is complete).  It is no longer named .vmfsBalloon.  The new name is .asyncUnmapFile as shown below.

/vmfs/volumes/5232dd00-0882a1e4-e918-0025b3abd8e0 # ls -l -h -A
total 998408
-r--------    1 root     root      200.0M Sep 13 10:48 .asyncUnmapFile
-r--------    1 root     root        5.2M Sep 13 09:38 .fbb.sf
-r--------    1 root     root      254.7M Sep 13 09:38 .fdc.sf
-r--------    1 root     root        1.1M Sep 13 09:38 .pb2.sf
-r--------    1 root     root      256.0M Sep 13 09:38 .pbc.sf
-r--------    1 root     root      250.6M Sep 13 09:38 .sbc.sf
drwx------    1 root     root         280 Sep 13 09:38 .sdd.sf
drwx------    1 root     root         420 Sep 13 09:42 .vSphere-HA
-r--------    1 root     root        4.0M Sep 13 09:38 .vh.sf
/vmfs/volumes/5232dd00-0882a1e4-e918-0025b3abd8e0 #

As discussed in the previous section, use of the UNMAP command now specifies the actual size of the temporary file instead of the temporary file size being determined by a percentage of space to return to the raw storage pool.  This is an improvement in part because it helps avoid the problems that could occur when UNMAP attempted to reclaim 2TB+ in a single operation (discussed here).
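
To put some rough numbers to that, the temporary file grows in step with the reclaim unit: on a 1MB block size datastore, the default reclaim unit of 200 works out to 200 blocks x 1MB = 200MB per iteration, which appears to line up with the 200.0M .asyncUnmapFile shown in the directory listing above.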

VMware has also enhanced the functionality of the temporary file.  A new kernel interface in ESXi 5.5 allows the caller to ask for blocks beyond a specified block address in the VMFS file system.  This ensures that the blocks allocated to the temporary file in a given iteration were never allocated to it previously.  The end benefit is that a temporary file of any size can be created, and by issuing UNMAP to the blocks allocated to the temporary file iteration after iteration, we can rest assured that UNMAP is eventually issued on all free blocks on the datastore.

Going a bit deeper and adding to the efficiency, VMware has also enhanced UNMAP to support multiple block descriptors.  Compared to vSphere 5.1, which issued just one block descriptor per UNMAP command, vSphere 5.5 now issues up to 100 block descriptors depending on the storage array (these capabilities are reported by the array in the Block Limits VPD (B0) page).
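
If you want to confirm whether a particular device backing a datastore advertises UNMAP support before going further, the device’s VAAI status can be queried from esxcli as well.  A quick sketch (the naa identifier below is just a placeholder for your own device):

esxcli storage core device vaai status get -d naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

A Delete Status of supported in the output corresponds to the UNMAP primitive.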

A look at the asynchronous and iterative vSphere 5.5 UNMAP logical process:

  1. User or script issues esxcli UNMAP command
  2. Does the array support VAAI UNMAP?  yes=3, no=end
  3. Create .asyncUnmapFile on root of datastore
  4. .asyncUnmapFile created and locked? yes=5, no=end
  5. Issue IOCTL to allocate reclaim-unit blocks of storage on the volume past the previously allocated block offset
  6. Did the previous block allocation succeed? yes=7, no=remove lock file and retry step 6
  7. Issue UNMAP on all blocks allocated above in step 5
  8. Remove the lock file
  9. Did we reach the end of the datastore? yes=end, no=3

From a performance perspective, executing the UNMAP command in my vSphere 5.5 RC lab showed peak write I/O of around 1,200MB/s with an average of around 200 IOPS composed of a 50/50 mix of read/write.  The UNMAP I/O pattern is a bit hard to gauge because with the asynchronous iterative process, it seemed to do a bunch of work, rest, do more work, rest, and so on.  Sorry, no screenshots because flickr.com is currently down.  Perhaps the most notable takeaway from the performance section is that as of vSphere 5.5, VMware is lifting the recommendation of only running UNMAP during a maintenance window.  Keep in mind this is just a recommendation.

I encourage vSphere 5.5 customers to test UNMAP in their lab first using various reclaim unit sizes.  While doing this, examine performance impacts to the storage fabric, the storage array (look at both front end and back end), as well as other applications sharing the array.  Remember that fundamentally the UNMAP command is only going to provide a benefit AFTER its associated use cases have occurred (mentioned at the top of the article).  Running UNMAP on a volume which has no pages to be returned will be a waste of effort.  Once you’ve become comfortable with using UNMAP and understanding its impacts in your environment, consider running it on a recurring schedule, perhaps weekly; a minimal scripted sketch follows below.  It really depends on how much the use cases apply to your environment.  Many vSphere backup solutions leverage vSphere snapshots, which is one of the use cases.  Although it could be said there are large gains to be made with UNMAP in this case, keep in mind backups run regularly, and space that is returned to raw storage with UNMAP will likely be consumed again in the following backup cycle where vSphere snapshots are created once again.
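
For those who do want to put UNMAP on a schedule, here is a minimal sketch of what that could look like directly in the ESXi shell, assuming datastore names without spaces and the esxcli syntax covered above (the reclaim unit of 100 is just an example value):

# Loop through each mounted VMFS-5 volume and reclaim free blocks, 100 blocks per iteration
for ds in $(esxcli storage filesystem list | grep VMFS-5 | awk '{print $2}'); do
    echo "Issuing UNMAP against $ds"
    esxcli storage vmfs unmap -l "$ds" -n 100
done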

To wrap this up, customers who have block arrays supporting the thin provision VAAI primitive will be able to use UNMAP in vSphere 5.5 environments (for storage vendors, both sub-components are required to certify for the primitive as a whole on the HCL).  This includes Dell Compellent customers with a current version of Storage Center firmware.  Customers who use array-based snapshots with extended retention periods should keep in mind that while UNMAP will work against active blocks, it may not work with blocks maintained in a snapshot.  This is to honor the snapshot-based data protection retention.


Comments

  1. Tran Phan says:

    Jason,

    Thank you for the detailed write-up on this feature. Very helpful!

    Best,
    Tran

  2. Abhijit Chaudhuri says:

    Is there a switch available with “esxcli storage vmfs…” to get the size of the space that can be unmapped? That would help figuring out when to schedule unmapping the reclaimable space.

  3. wong boon hong says:

    I notice that [vmkfstools -y] no longer works with a percentage; you have to state the volume instead.

    I also notice that [esxcli storage vmfs unmap -l] failed to reclaim anything despite executing without showing any error. [esxcli storage vmfs unmap -u] works. So I have to use UUID.