Posts Tagged ‘3rd Party Apps’

The .vmfsBalloon File

July 1st, 2013

One year ago, I wrote a piece about thin provisioning and the role that the UNMAP VAAI primitive plays in thin provisioned storage environments.  Here’s an excerpt from that article:

When the manual UNMAP process is run, it balloons up a temporary hidden file at the root of the datastore which the UNMAP is being run against.  You won’t see this balloon file with the vSphere Client’s Datastore Browser as it is hidden.  You can catch it quickly while UNMAP is running by issuing the ls -l -a command against the datastore directory.  The file will be named .vmfsBalloon along with a generated suffix.  This file will quickly grow to the size of data being unmapped (this is actually noted when the UNMAP command is run and evident in the screenshot above).  Once the UNMAP is completed, the .vmfsBalloon file is removed.

Has your curiosity ever gotten you wondering about the technical purpose of the .vmfsBalloon file?  It boils down to data integrity and timing.  At the time the UNMAP command is run, the balloon file is immediately instantiated and grows to occupy (read: hog) all of the blocks that are about to be unmapped.  It does this so that none of those blocks can be allocated to new file creation elsewhere while the unmap is in flight.  If you think about it, it makes sense – we just told vSphere to give these blocks back to the array.  If, in the interim, one or more of those blocks were suddenly allocated for a new file or for file growth and we then purged the blocks anyway, we would have a data integrity issue.  More accurately, newly written data would be missing because its block or blocks were just flushed back to the storage pool on the array.
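
If you want to watch the balloon file appear and grow for yourself, a quick and dirty loop from a second SSH session does the trick.  A minimal sketch only – the datastore path below is just an example, not a real volume from my lab:

  # Poll the datastore root for the hidden balloon file while a manual
  # vmkfstools -y reclaim runs in another session:
  cd /vmfs/volumes/mydatastore
  while true; do
    ls -l -a | grep -i vmfsballoon
    sleep 2
  done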

Baremetalcloud Special Promo Through MikeLaverick.com

March 14th, 2013

He’s Laverick by name, Maverick by nature (and if I might add, a very cool chap and my friend) – Mike Laverick, formerly of RTFM Education, of which I was a LONG time reader going back to my Windows and Citrix days, now has a blog cleverly and conveniently situated at mikelaverick.com.  Since Mike joined forces with VMware, he’s been focused on vCloud evangelism and recently visited the Sydney/Melbourne VMUG where he was inspired with a new interest in home labs by AutoLab à la Alastair Cooke of Demitasse fame.  AutoLab has garnered some much deserved attention and adoption.  One organization that has taken an interest is baremetalcloud, which provides IaaS via AutoLab on top of physical hardware for its customers.

Long story short, baremetalcloud is offering a special promotion to the first 100 subscribers through Mike’s blog.  Visit the Maverick’s blog via the link in the previous sentence where you can grab the promo code and reserve your baremetalcloud IaaS while supplies last.  Mike also walks through an end to end deployment so you can get an idea of what that looks like beforehand or use it as a reference in case you get stuck.

Thank you Mike, Alastair, and baremetalcloud for lending your hand to the community.

VAAI and the Unlimited VMs per Datastore Urban Myth

February 28th, 2013

Speaking for myself, it’s hard to believe that just a little over 2 years ago in October 2010, many were celebrating the GA release of vSphere 4.1 and its awesome new features and added scalability.  It seems so long ago.  The following February 2011, Update 1 for vSphere 4.1 was launched and I celebrated my one year anniversary as a VCDX certificate holder.  Now two years later, 5.0 and 5.1 have both seen the light of day along with a flurry of other products and acquisitions rounding out and shaping what is now the vCloud Suite.  Today I’m as much involved with vSphere as I think I ever have been – not so much in the operational role I had in the past, but rather with a stronger focus on storage integration and meeting with Dell Compellent/VMware customers on a regular basis.

I began this article with vSphere 4.1 for a purpose.  vSphere 4.1 shipped with a new Enterprise Plus feature named vStorage APIs for Array Integration, or VAAI for short (pronounced ‘vee double-ehh eye’ to best avoid a twist of the tongue).  These APIs offered three different hardware offload mechanisms for block storage, enabling the vSphere hypervisor to push some of the storage related heavy lifting to a SAN which supported the APIs.  One of the primitives in particular lies at the root of this topic and a technical marketing urban myth that I have seen perpetuated off and on since the initial launch of VAAI.  I still see it pop up from time to time through present day.

One of the oldest debates in VMware lore is “How many virtual machines should I place on each datastore?”  For this discussion, the context is block storage (as opposed to NFS).  There were all sorts of opinions as well as technical constraints to be considered.  There was the tried and true rule of thumb answer of 10-15-20 which has more than stood the test of time.  The best qualified answer was usually: “Whatever fits best for your consolidated environment” which translates to “it depends” and an invoice in consulting language.

When VAAI was released, I began to notice a slight but alarming trend of credible sources citing claims that the Atomic Test and Set or Hardware Assisted Locking primitive once and for all solved the VMs per LUN conundrum, to the point that the number of VMs per LUN no longer mattered because LUN based SCSI reservations were now a thing of the past.  To that point, I’ve got marketing collateral saved on my home network that literally states “unlimited number of VMs per LUN with ATS!”  Basically, VAAI is the promised land – if you can get there with compatible storage and can afford E+ licensing, you no longer need to worry about VM placement and LUN sprawl to satisfy performance needs and generally reduce latency across the board.  I’ll get to why that doesn’t work in a moment, but for the time being I think the general public, especially veterans, remained cautious and less optimistic – and this was good.

Then vSphere 5.0 was released.  By this time, VAAI had become more widely available and affordable, reaching down to the Enterprise licensing tier, and additional primitives had been added for both block and NFS based storage.  In addition, VMware added support for 64TB block datastores without using extents (a true cause for celebration in its own right).  This new feature aligned perfectly with the ATS urban myth: where capacity may have been a limiting constraint in the past, that ceiling had now been lifted.  To complement that, steadily growing drive densities, falling cost/GB in arrays, and thin provisioning made larger datastores easily achievable.  Marketing decks were updated accordingly.  Everything else being equal, we should now have no problem nor hesitation placing hundreds, if not thousands, of virtual machines on a single block datastore, as if it were NFS and free from the constraints associated with the SCSI protocol.

The ATS VAAI primitive was developed to address infrastructure latency resulting from LUN based SCSI reservations, which were necessary for certain operations such as creating and deleting files on a LUN, growing a file in size, or creating and extending datastores.  We encounter these types of operations by doing things like powering on virtual machines individually or in large groups (such as in a VDI environment), creating vSphere snapshots (a very popular integration point for backup technologies), or provisioning virtual machines from a template.  All of these tasks have one thing in common: they result in a change of metadata on the LUN, which in turn necessitates a LUN level lock by the vSphere host making the change.  This lock, albeit very brief in duration, drives noticeable storage I/O latency in large iterations for the hosts and virtual machines “locked out” of the LUN.  The ATS primitive offloads the locking mechanism to the array, which locks only the data being updated instead of locking the entire LUN.  Any environment which has historically been encumbered by these types of tasks is going to benefit from the ATS primitive, and a reduction of storage latency (both reads and writes, sequential and random) will be the result.
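
As a side note, whether your hosts are even attempting to use ATS (and the other two original block primitives) surfaces as a trio of advanced settings.  A minimal check from the ESXi 5.x shell, under the assumption these option names match your build; a value of 1 means the primitive is enabled host-side, while actual offload still depends on array support:

  # Hardware Assisted Locking (ATS), Full Copy, and Block Zeroing respectively:
  esxcli system settings advanced list -o /VMFS3/HardwareAcceleratedLocking
  esxcli system settings advanced list -o /DataMover/HardwareAcceleratedMove
  esxcli system settings advanced list -o /DataMover/HardwareAcceleratedInit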

With that overview of ATS out of the way, let’s revisit the statement again and see if it makes sense: “unlimited number of VMs per LUN with ATS!”  If the VMs we’re talking about frequently exhibit the behavior patterns discussed above which cause SCSI reservations, then without a doubt, ATS is going to replace the LUN level locking mechanism as the previous bottleneck and reduce storage latency.  This in turn will allow more VMs to be placed on the LUN until the next bottleneck is introduced.  Unlimited?  Not even close to being correct.  And what about VMs which don’t fit the SCSI reservation use case?  Suppose I use array based snapshots for data protection?  Suppose I don’t use vSphere snapshots, or there is a corporate policy against them (trust me, such policies are out there, they exist)?  Maybe I don’t have a large scale VDI environment or boot storms are not a concern.  This claim I see from time to time makes no mention of use cases, so conceivably it applies to me as well – meaning that even in an environment not constrained by the classic SCSI reservation problem, I too can leverage VAAI ATS to double, triple, or place an unlimited number of VMs per block datastore.  I talk with customers on a fairly regular basis who are literally confused about VM to LUN placement because of the mixed messages they receive, especially when it comes to VAAI.

Allow me to perform some Eric Sloof style VMware myth busting and put the uber VMs per ATS-enabled LUN claim to the test.  Meet Mike – a DBA who has taken over his organization’s vSphere 5.1 environment.  Mike spends the majority of his time keeping up with four different types of database technologies deployed in his datacenter.  Unfortunately that doesn’t leave Mike much time to read vSphere Clustering Deepdives or Mastering VMware vSphere, but he knows well enough not to use vSphere snapshotting because he has an array based, data consistent solution which integrates with each of his databases.

Fortunately, Mike has a stable and well performing environment exhibited to the left which the previous vSphere architect left for him.  Demanding database VMs, 32 in all, are distributed across eight block datastores.  Performance characteristics for each VM in terms of IOPS and Throughput are displayed (these are real numbers generated by Iometer in my lab).  The previous vSphere architect was never able to get his organization to buy off on Enterprise licensing and thus the environment lacked VAAI even though their array supported it.

Unfortunately for Mike, he tends to trust random marketing advice without thorough validation or research on the impact to his environment.  When Mike took over, he heard from someone that he could simplify infrastructure management by implementing VAAI ATS and consolidating his existing 32 VMs onto just a single 64TB datastore on the same array, plus grow his environment by adding a basically unlimited number of VMs to the datastore, provided there is enough capacity.

This information was enough to convince Mike and his management that, risks aside, management and troubleshooting efficiency through a single datastore was definitely the way to go.  Mike installed his new licensing, ensured VAAI was enabled on each host of the cluster, and carved up his new 64TB datastore which is backed by the same pool of raw storage and spindles servicing the eight original datastores.  Over the weekend, Mike used Storage vMotion to migrate his 32 eager zero thick database VMs from their eight datastores to the new 64TB datastore.  He then destroyed his eight original LUNs and for the remainder of that Sunday afternoon, he put his feet up on the desk and basked in the presence of his vSphere Client exhibiting a cluster of hosts and 32 production database VMs running on a single 64TB datastore.

On Monday morning, his stores began to open up on the east coast and in the midwest.  At about 8:30AM central time, the helpdesk began receiving calls from various stores that the system seemed slow.  Par for the course for a Monday morning but with great pride and ethics, Mike began health checks on the database servers anyway.  While he was busy with that, stores on the west coast opened for business and then the calls to the helpdesk increased in frequency and urgency.  The system was crawling and in some rare cases the application was timing out producing transaction failure messages.

Finding no blocking or daytime re-indexing issues at the database layer, Mike turned to the statistical counters for storage and saw a significant decrease in IOPS and Throughput across the board – nearly 50% (again, real Iometer numbers to the right).  Conversely, latency (which is not shown) was through the roof, which explained the application timeout failures.  Mike was bewildered.  He had made an additional investment in hardware assisted offload technology and was hoping for a noticeable increase in performance.  The last thing he expected was a net reduction in performance, especially one this pronounced.  What happened?  How is it possible to change the VM:datastore ratio, backed by the exact same pool of storage, tier, and RAID type, and come up with a dramatic shift in performance?  Especially when one resides in the kingdom of VAAI?

Queue Depth.  There’s only so much active I/O to go around, per LUN, per host, at any given moment in time.  When multiple VMs on the same host reside on the same LUN, they must share the queue depth of that LUN.  Queue depth is defined in many places along the path of an I/O, and at each point it specifies how many I/Os per LUN per host can be “active” in terms of being handled and processed (decreases latency) as opposed to being queued or buffered (increases latency).  Outside of an environment utilizing SIOC, the queue depth that all virtual machines on a given LUN on a given host must share is 32, as defined by the default vSphere DSNRO value.  What this effectively means is that all virtual machines on a host sharing the same datastore must share a pool of 32 active I/Os for that datastore.
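
If you want to see what your own hosts are working with, the default can be checked from the ESXi shell.  A minimal sketch, assuming ESXi 5.0/5.1 where DSNRO is still a host-wide advanced setting:

  # View the host-wide DSNRO value (default 32):
  esxcli system settings advanced list -o /Disk/SchedNumReqOutstanding
  # The older esxcfg syntax should return the same value:
  esxcfg-advcfg -g /Disk/SchedNumReqOutstanding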

Applied to Mike’s two-host cluster, whereas he used to have four VMs per datastore evenly distributed across two hosts, each VM effectively had a sole share of 16 outstanding I/O slots to work with (a queue depth of 32 per datastore per host, divided by the 2 VMs per datastore running on each host).

After Mike’s consolidation to a single datastore, 16 VMs per host had to share a single LUN with a default queue depth of 32, which reduced each virtual machine’s share of active I/Os from 16 to 2.
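
The arithmetic is simple enough to sanity check in a shell.  A back-of-the-napkin sketch using the numbers from Mike’s environment, assuming even DRS distribution of the VMs across the two hosts:

  DSNRO=32            # default per-LUN, per-host queue depth
  HOSTS=2
  VMS=32
  DATASTORES_BEFORE=8
  DATASTORES_AFTER=1
  # Outstanding I/O slots per VM = DSNRO / (VMs per datastore per host)
  BEFORE=$(( DSNRO / ( VMS / DATASTORES_BEFORE / HOSTS ) ))   # 32 / 2  = 16
  AFTER=$((  DSNRO / ( VMS / DATASTORES_AFTER  / HOSTS ) ))   # 32 / 16 = 2
  echo "Per-VM share before consolidation: $BEFORE"
  echo "Per-VM share after consolidation:  $AFTER"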

Although the array had the raw storage spindle count and IOPS capability to provide fault tolerance, performance, and capacity, queue depth ultimately plays a role in performance per LUN, per host, per VM.  To circle back to the age old “How many virtual machines should I place on each datastore?” question, this is ultimately where the old 10-15-20 rule of thumb came in:

  • 10 high I/O VMs per datastore
  • 15 average I/O VMs per datastore
  • 20 low I/O VMs per datastore

Extrapolated across even the most modest sized cluster, each VM above is going to get a fairly sufficient share of the queue depth to work with.  Assuming even VM distribution across clustered hosts (you use DRS in automated mode, right?), each host added to the cluster and attached to the shared storage brings with it, by default, an additional 32 outstanding I/Os of queue depth per datastore for VMs to share in.  Note that this article is not intended to be an end to end queue depth discussion; the safe assumption is made that the DSNRO value of 32 represents the smallest queue depth in the entire path of the I/O, which is generally true with most installations and default HBA card/driver values.
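
Since DSNRO only matters if it really is the narrowest point in the path, it’s worth eyeballing the HBA LUN queue depth as well.  A hedged sketch for a QLogic adapter only – the module name (qla2xxx) and parameter (ql2xmaxqdepth) come from the QLogic/VMware KBs linked below, and Emulex and other drivers use different module and parameter names:

  # List the QLogic driver options currently set (blank output means driver defaults):
  esxcli system module parameters list -m qla2xxx | grep ql2xmaxqdepth
  # Raising the LUN queue depth (example value) requires a host reboot to take effect:
  esxcli system module parameters set -m qla2xxx -p ql2xmaxqdepth=64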

In summary, myth busted.  Each of the VAAI primitives was developed to address specific storage and fabric bottlenecks.  While the ATS primitive is ideal for drastically reducing SCSI reservation based latency and it can increase the VM:datastore ratio to a degree, it was never designed to imply large numbers of, let alone an unlimited number of, VMs per datastore, because this assumption simply does not factor in other block based storage performance inhibitors such as queue depth, RAID pools, controller/LUN ownership model, fabric balancing, risk, etc.  Every time I hear the claim, it sounds as foolish as ever.  Don’t be fooled.

Update 3/11/13: A few related links on queue depth:

QLogic Fibre Channel Adapter for VMware ESX User’s Guide

Execution Throttle and Queue Depth with VMware and Qlogic HBAs

Changing the queue depth for QLogic and Emulex HBAs (VMware KB 1267)

Setting the Maximum Outstanding Disk Requests for virtual machines (VMware KB 1268)

Controlling LUN queue depth throttling in VMware ESX/ESXi (VMware KB 1008113)

Disk.SchedNumReqOutstanding the story (covers Disk.SchedQuantum, Disk.SchedQControlSeqReqs, and Disk.SchedQControlVMSwitches)

Disk.SchedNumReqOutstanding and Queue Depth (an article I wrote back in June 2011)

Last but not least, a wonderful whitepaper from VMware I’ve held onto for years:  Scalable Storage Performance VMware ESX 3.5

StarWind and Cirrus Tech Partner to Deliver Cutting Edge Technologies to the Cloud Computing Market

August 12th, 2012

Press Release

StarWind Solutions Become Available Through a Leading Canadian Web Hosting Company

Burlington, MA – 6 August 2012 – StarWind Software Inc., an innovative provider of storage virtualization software and VM backup technology, announced today a new partnership agreement with Cirrus Tech Ltd., a Canadian web hosting company specializing in VPS, VM and cloud hosting services. The companies are collaborating to deliver best-of-breed cloud services that help customers accelerate their businesses.

According to the agreement, Cirrus Tech extends its portfolio with StarWind storage virtualization software and will offer it to its customers as a dedicated storage platform that delivers a highly available, high performance, scalable storage infrastructure capable of supporting heterogeneous server environments; as cloud storage for private clouds; and as a robust solution for building Disaster Recovery (DR) plans.

StarWind SAN solutions deliver a wide variety of enterprise-class features, such as High Availability (HA), Synchronous Data Mirroring, Remote Asynchronous Replication, CDP/Snapshots, Thin Provisioning, Global Deduplication, etc., that make the stored data highly available, simplify storage management, and ensure business continuity and disaster recovery.

“Companies are increasingly turning to cloud services to gain efficiencies and respond faster to today’s changing business requirements,” said Artem Berman, Chief Executive Officer of StarWind Software, Inc. “We are pleased to combine our forces with Cirrus Tech in order to deliver our customers a wide range of innovative cloud services that will help their transition to a flexible and efficient shared IT infrastructure.”

“Every business needs to consider what would happen in the event of a disaster,” shares Cirrus CEO Ehsan Mirdamadi. “By bringing StarWind’s SAN solution to our customers, we are helping them to ease the burden of disaster recovery planning by offering powerful and affordable storage options. You never want to think of the worst, but when it comes to your sensitive data and business critical web operations, it’s always better to be safe than sorry. Being safe just got that much easier for Cirrus customers.”

To find out more about Cirrus’ web hosting services visit http://www.cirrushosting.com or call 1.877.624.7787.
For more information about StarWind, visit www.starwindsoftware.com

About Cirrus Hosting
Cirrus Tech Ltd. has been a leader in providing affordable, dependable VHS and VPS hosting services in Canada since 1999. They have hosted and supported hundreds of thousands of websites and applications for Canadian businesses and clients around the world. As a BBB member with an A+ rating, Cirrus Tech is a top-notch Canadian web hosting company with professional support, rigorous reliability and easily upgradable VPS solutions that grow right alongside your business. Their Canadian data center is at 151 Front Street in Toronto.

About StarWind Software Inc.
StarWind Software is a global leader in storage management and SAN software for small and midsize companies. StarWind’s flagship product is SAN software that turns any industry-standard Windows Server into a fault-tolerant, fail-safe iSCSI SAN. StarWind iSCSI SAN is qualified for use with VMware, Hyper-V, XenServer and Linux and Unix environments. StarWind Software focuses on providing small and midsize companies with affordable, highly available storage technology which previously was only available in high-end storage hardware. Advanced enterprise-class features in StarWind include Automated HA Storage Node Failover and Failback (High Availability), Replication across a WAN, CDP and Snapshots, Thin Provisioning and Virtual Tape management.

Since 2003, StarWind has pioneered the iSCSI SAN software industry and is the solution of choice for over 30,000 customers worldwide in more than 100 countries, from small and midsize companies to governments and Fortune 1000 companies.

For more information on StarWind Software Inc., visit: www.starwindsoftware.com

Storage: Starting Thin and Staying Thin with VAAI UNMAP

June 28th, 2012

For me, it’s hard to believe nearly a year has elapsed since vSphere 5 was announced on July 12th.  Among the many new features that shipped was an added 4th VAAI primitive for block storage.  The primitive itself revolved around thin provisioning and was the sum of two components: UNMAP and STUN.  At this time I’m going to go through the UNMAP/Block Space Reclamation process in a lab environment and I’ll leave STUN for a later discussion.

Before I jump into the lab, I want to frame out a bit of a chronological timeline around the new primitive.  Although this 4th primitive was formally launched with vSphere 5 and built into the corresponding platform code that shipped, a few months down the road VMware issued a recall on the UNMAP portion of the primitive due to a discovery made either in the field or in their lab environment.  With the UNMAP component recalled, the Thin Provisioning primitive as a whole (including the STUN component) was not supported by VMware.  Furthermore, storage vendors could not be certified for the Thin Provisioning VAAI primitive, although the features may have been functional if their respective arrays supported them.  A short while later, VMware released a patch which, once installed on the ESXi hosts, disabled the UNMAP functionality globally.  In March of this year, VMware released vSphere 5.0 Update 1.  With this release, VMware implemented the necessary code to resolve the performance issues related to UNMAP.  However, VMware did not re-enable the automatic UNMAP mechanism.  Instead and in the interim, VMware implemented a manual process for block space reclamation on a per datastore basis, regardless of the global UNMAP setting on the host.  I believe it is VMware’s intent to bring back “automatic” UNMAP long term but that is purely speculation.  This article will walk through the manual process of returning unused blocks to a storage array which supports both thin provisioning and the UNMAP feature.
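
For anyone curious how that globally disabled “automatic” UNMAP behavior surfaces on a 5.0 Update 1 host, it shows up as a VMFS3 advanced setting.  A minimal check from the ESXi shell, assuming the /VMFS3/EnableBlockDelete option name; as noted above, the manual reclaim covered below works regardless of this value:

  # 0 = automatic UNMAP disabled (the post-patch default), 1 = enabled
  esxcli system settings advanced list -o /VMFS3/EnableBlockDelete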

I also want to point out some good information that already exists on UNMAP which introduces the feature and provides a good level of detail.

  • Duncan Epping wrote this piece about a year ago when the feature was launched.
  • Cormac Hogan wrote this article in March when vSphere 5.0 Update 1 was launched and the manual UNMAP process was re-introduced.
  • VMware KB 2014849 Using vmkfstools to reclaim VMFS deleted blocks on thin-provisioned LUNs

By this point, if you are unaware of the value of UNMAP, it is simply this: keeping thin provisioned LUNs thin.  By doing so, raw storage is consumed and utilized in the most efficient manner, yielding cost savings and better ROI for the business.  Arrays which support thin provisioning have been shipping for years.  What hasn’t matured is just as important as thin provisioning itself: the ability to stay thin where possible.  I’m going to highlight this below in a working example, but basically once pages are allocated from a storage pool, they remain pinned to the volume they were originally allocated for, even after the data written to those pages has been deleted or moved.  Once the data is gone, the free space remains available only to that particular LUN and the storage host which owns and continues to manage it – whether or not that free space will ever be needed again in the future by that storage host.  Without UNMAP, the pages are never released back to the global storage pool where they may be allocated to some other LUN or storage host, whether it be virtual or physical.  Ideal use cases for UNMAP: transient data, Storage vMotion, SDRS, and data migration.

UNMAP functionality requires the collaboration of both operating system and storage vendors.  As an example, Dell Compellent Storage Center has supported the T10 UNMAP command going back to early versions of the 5.x Storage Center code, however there has been very little adoption on the OS platform side, which is responsible for issuing the UNMAP command to the storage array when data is deleted from a volume.  RHEL 6 supports it, vSphere 5.0 Update 1 now supports it, and Windows Server 2012 is slated to be the first Windows platform to support UNMAP.

UNMAP in the Lab

So in the lab I have a vSphere ESXi 5.0 Update 1 host attached to a Dell Compellent Storage Center SAN.  To demonstrate UNMAP, I’ll Storage vMotion a 500GB virtual machine from one 500GB LUN to another 500GB LUN.  As you can see below from the Datastore view in the vSphere Client, the 500GB VM is already occupying lun1 and an alarm is thrown due to lack of available capacity on the datastore:

Snagit Capture

Looking at the volume in Dell Compellent Storage Center, I can see that approximately 500GB of storage is being consumed from the storage page pool. To keep the numbers simple, I’ll ignore actual capacity consumed due to RAID overhead.

Snagit Capture

After the Storage vMotion

I’ve now performed a Storage vMotion of the 500GB VM from lun1 to lun2.  Again looking at the datastores from a vSphere client perspective, I can see that lun2 is now completely consumed with data while lun1 is no longer occupied – it now has 500GB  capacity available.  This is where operating systems and storage arrays which do not support UNMAP fall short of keeping a volume thin provisioned.

Snagit Capture

Using the Dell Compellent vSphere Client plug-in, I can see that the 500GB of raw storage originally allocated for lun1 remains pinned to lun1 even though the LUN is empty!  I’m also occupying 500GB of additional storage for the virtual machine now residing on lun2.  The net here is that as a result of my Storage vMotion, I’m occupying nearly 1TB of storage capacity for a virtual machine that’s half the size.  If I continue to Storage vMotion this virtual machine to other LUNs, the problem is compounded and the available capacity in the storage pool continues to drain, effectively raising the high watermark of consumed storage.  To add insult to injury, this will more than likely be stranded Tier 1 storage – backed by the most expensive spindles in the array.

Snagit Capture

Performing the Manual UNMAP

Using a PuTTY connection to the ESXi host, I’ll start with identifying the naa ID of my datastore using esxcli storage core device list | more

Snagit Capture
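
For reference, the commands behind that screenshot look like the sketch below; the second command is simply an alternate way to map a datastore name straight to its backing device, assuming an ESXi 5.0 U1 shell:

  # Dump all devices and scroll for the volume backing lun1:
  esxcli storage core device list | more
  # Or map VMFS datastore names directly to their backing naa IDs:
  esxcli storage vmfs extent list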

Following the KB article above, I’ll make sure my datastore supports the UNMAP primitive using esxcli storage core device vaai status get -d <naa ID>.  The output shows UNMAP is supported by Dell Compellent Storage Center, in addition to the other three core VAAI primitives (Atomic Test and Set, Copy Offload, and Block Zeroing).

Snagit Capture
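
Substituting the naa ID identified in the previous step (the ID below is only a placeholder, not a real device), the per-device VAAI check looks like this:

  # "Delete Status: supported" indicates the UNMAP primitive is available;
  # the ATS, Clone, and Zero statuses cover the other three block primitives.
  esxcli storage core device vaai status get -d naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx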

I’ll now change to the directory of the datastore and perform the UNMAP using vmkfstools -y 100.  It’s worth pointing out here that using a value of 100, although apparently supported, ultimately fails.  I reran the command using a value of 99% which successfully unmapped 500GB in about 3 minutes.

Snagit Capture
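
Spelled out, the reclaim step amounts to the following, with lun1 being the now empty datastore from this walkthrough:

  cd /vmfs/volumes/lun1
  # A value of 100 is accepted but ultimately failed for me; 99 completed
  # and reclaimed the 500GB in roughly 3 minutes:
  vmkfstools -y 99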

Also important to note is that VMware recommends the reclaim be run after hours or during a maintenance window, with a maximum recommended reclaim percentage of 60%.  This value is pointed out by Duncan in the article I linked above and it’s also noted when providing a reclaim value outside of the acceptable parameters of 0-100.  Here’s the reasoning behind the value:  When the manual UNMAP process is run, it balloons up a temporary hidden file at the root of the datastore which the UNMAP is being run against.  You won’t see this balloon file with the vSphere Client’s Datastore Browser as it is hidden.  You can catch it quickly while UNMAP is running by issuing the ls -l -a command against the datastore directory.  The file will be named .vmfsBalloon along with a generated suffix.  This file will quickly grow to the size of data being unmapped (this is actually noted when the UNMAP command is run and evident in the screenshot above).  Once the UNMAP is completed, the .vmfsBalloon file is removed.  For a more detailed explanation behind the .vmfsBalloon file, check out this blog article.

Snagit Capture

The bottom line is that the datastore needs as much free capacity as what is being unmapped.  VMware’s recommended value of 60% reclaim is actually a broad assumption that the datastore will have at least 60% capacity available at the time UNMAP is being run.  For obvious reasons, we don’t want to run the datastore out of capacity with the .vmfsBalloon file, especially if there are still VMs running on it.  My recommendation if you are unsure or simply bad at math: start with a smaller percentage of block reclaim initially and perform multiple iterations of UNMAP safely until all unused blocks are returned to the storage pool.
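
If you’re unsure how much headroom the balloon file has to work with, a more conservative sketch is to check free capacity first and reclaim in several smaller passes; the 60% figure is simply VMware’s recommended ceiling mentioned above:

  cd /vmfs/volumes/lun1
  # Verify how much free space the hidden .vmfsBalloon file can safely occupy:
  df -h
  # Reclaim in smaller passes rather than one large one; each pass operates on
  # the free space remaining at that time, so repeat until the array reports
  # the pages returned to the pool:
  vmkfstools -y 60
  vmkfstools -y 60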

To wrap up this procedure, after the UNMAP step has been run with a value of 99%, I can now see from Storage Center that nearly all pages have been returned to the page pool and 500gbvol1 is only consuming a small amount of raw storage comparatively – basically the 1% I wasn’t able to UNMAP using the value of 99% earlier.  If I so chose, I could run the UNMAP process again with a value of 99% and that should return just about all of the 2.74GB still being consumed, minus the space consumed for VMFS-5 formatting.

Snagit Capture

The last thing I want to emphasize is that today, UNMAP works at the VMFS datastore layer and isn’t designed to work inside the encapsulated virtual machine.  In other words, if I delete a file inside a guest operating system running on top of the vSphere hypervisor with attached block storage, that space can’t be liberated with UNMAP.  As a vSphere and storage enthusiast, for me that would be next on the wish list and might be considered by others as the next logical step in storage virtualization.  And although UNMAP doesn’t show up in Windows platforms until Windows Server 2012, Dell Compellent has developed an agent which accomplishes the free space recovery on earlier versions of Windows in combination with a physical raw device mapping (RDM).

Update 7/2/12: VMware Labs released its latest fling – Guest Reclaim.

From labs.vmware.com:

Guest Reclaim reclaims dead space from NTFS volumes hosted on a thin provisioned SCSI disk. The tool can also reclaim space from full disks and partitions, thereby wiping off the file systems on it. As the tool deals with active data, please take all precautionary measures understanding the SCSI UNMAP framework and backing up important data.

Features

  • Reclaim space from Simple FAT/NTFS volumes
  • Works on WindowsXP to Windows7
  • Can reclaim space from flat partitions and flat disks
  • Can work in virtual as well as physical machines

What’s a Thin Provisioned (TP) SCSI disk? In a thin provisioned LUN/Disk, physical storage space is allocated on demand. That is, the storage system allocates space as and when a client (for example a file system/database) writes data to the storage medium. One primary goal of thin provisioning is to allow for storage overcommit. A thin provisioned disk can be a virtual disk, or a physical LUN/disk exposed from a storage array that supports TP. Virtual disks created as thin disks are exposed as TP disks, starting with virtual Hardware Version 9. For more information on this please refer to http://en.wikipedia.org/wiki/Thin_provisioning.

What is Dead Space Reclamation? Deleting files frees up space on the file system volume. This freed space sticks with the LUN/Disk until it is released and reclaimed by the underlying storage layer. Free space reclamation allows the lower level storage layer (for example a storage array, or any hypervisor) to repurpose the freed space from one client for some other storage allocation request. For example:

  • A storage array that supports thin provisioning can repurpose the reclaimed space to satisfy allocation requests for some other thin provisioned LUN within the same array.
  • A hypervisor file system can repurpose the reclaimed space from one virtual disk for satisfying allocation needs of some other virtual disk within the same data store.

GuestReclaim allows transparent reclamation of dead space from NTFS volumes. For more information and detailed instructions, view the Guest Reclaim ReadMe (pdf)

Update 5/14/13: Excerpt from Cormac Hogan’s vSphere storage blog: “We’ve recently been made aware of a limitation on our UNMAP mechanism in ESXi 5.0 & 5.1. It would appear that if you attempt to reclaim more than 2TB of dead space in a single operation, the UNMAP primitive is not handling this very well.” Read more about it here: Heads Up! UNMAP considerations when reclaiming more than 2TB

Update 9/13/13: vSphere 5.5 UNMAP Deep Dive

Spousetivities Is Packing For Boston

June 5th, 2012

Snagit Capture

Dell Storage Forum kicks off in Boston next week and Spousetivities will be there to ensure a good time is had by all.  If you’ve never been to Boston or if you haven’t had a chance to look around, you’re in for a treat.  Crystal has an array of activities queued up (see what I did there?) including  whale watching, a tour of MIT and/or Harvard via trolley or walking, a trolley tour of historic Boston (I highly recommend this one, lots of history in Boston), a wine tour, as well as a welcome breakfast to get things started and a private lunch cruise.

If you’d like to learn more or if you’d like to sign up for one or more of these events, follow this link – Spousetivities even has deals to save you money on your itinerary.

We hope to see you there!

Snagit Capture

Invitation to Dell/Sanity Virtualization Seminar

May 22nd, 2012

I know this is pretty short notice but I wanted to make local readers aware of a lunch event taking place tomorrow between 11:00am and 1:30pm.  Dell and Sanity Solutions will be discussing storage technologies for your vSphere virtualized datacenter and private, public, or hybrid cloud.  I’ll be on hand as well, talking about some of the key integration points between vSphere and Storage Center.  You can find full details in the brochure below.  Click on it or this text to get yourself registered, and we hope to see you tomorrow.

Snagit Capture