Posts Tagged ‘vSphere’

VMware DPM Issue

October 24th, 2010

I’ve been running into a DPM issue in the lab recently.  Allow me to briefly describe the environment:

  • 3 vCenter Servers 4.1 in Linked Mode
  • 1 cluster with 2 hosts
    • ESX 4.1, 32GB RAM, ~15% CPU utilization, ~65% Memory utilization, host DPM set for disabled meaning the host should never be placed in standby by DPM.
    • ESXi 4.1, 24GB RAM, ~15% CPU utilization, ~65% Memory utilization, host DPM set for automatic meaning the host is always a candidate to be placed in standby by DPM.
  • Shared storage
  • DRS and DPM enabled for full automation (both configured at Priority 4, almost the most aggressive setting)

Up until recently, the ESX and ESXi hosts weren’t as loaded and DPM was working reliably.  Each host had 16GB RAM installed.  When aggregate load was light enough, all VMs were moved to the ESX host and the ESXi host was placed in standby mode by DPM.  Life was good.

There has been much activity in the lab recently.  The ESX and ESXi host memory was upgraded to 32GB and 24GB respectively.  Many VMs were added to the cluster and powered on for various projects.  The DPM configuration remained as is.  Now what I’m noticing is that with a fairly heavy memory load on both hosts in the cluster, DPM moves all VMs to the ESX host and places the ESXi host in standby mode.  This places a tremendous amount of memory pressure and over commit on the solitary ESX host.  The cluster observes this extreme condition and, nearly as quickly, takes the ESXi host back out of standby mode to balance the load.  Then about an hour later, the process repeats.
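To put rough numbers on that memory pressure, here is a back-of-the-envelope sketch in Python.  It assumes the ~65% utilization figures from the environment list above represent memory that has to follow the VMs onto the surviving host, which is a simplification: DPM’s own calculation is not reproduced here.  The function name is mine, not anything from VMware.

```python
def consolidated_memory_ratio(hosts, survivor):
    """Return memory consumed across all hosts divided by the survivor's RAM.

    hosts: dict of name -> (installed_gb, utilization as a fraction)
    survivor: name of the host that stays powered on
    A result above 1.0 means the survivor would be over committed.
    """
    consumed_gb = sum(ram * util for ram, util in hosts.values())
    survivor_ram_gb = hosts[survivor][0]
    return consumed_gb / survivor_ram_gb


# Approximate lab figures from the environment description (assumed, not measured)
lab = {"ESX": (32, 0.65), "ESXi": (24, 0.65)}
ratio = consolidated_memory_ratio(lab, "ESX")
print(f"ESX host would carry {ratio:.0%} of its installed RAM")
```

With these assumed inputs the lone ESX host would have to hold about 36.4GB of consumed memory in 32GB of RAM, roughly 114%, before counting any savings from page sharing or ballooning.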

I then configured DPM to manual mode so that I could examine the recommendations being made by the calculator.  The VMs were being evacuated for the purposes of DPM via a Priority 3 recommendation which is half way between Conservative and Aggressive recommendations.

What is my conclusion?  I’m surprised at the perceived increase in aggressiveness of DPM.  In order to avoid the extreme memory over commit, I’ll need to configure DPM slide bar for Priority 2.  In addition, I’d like to get a better understanding of the calculation.  I have a difficult time believing the amount of memory over commit being deemed acceptable in a neutral configuration (Priority 3) which falls half way between conservative and aggressive.  In addition to that, I’m not a fan of a host continuously entering and exiting standby mode, along with the flurry of vMotion activity which results.  This tells me that the calculation isn’t accounting for the amount of memory pressure which is actually occurring once a host goes into standby mode, or coincidentally there are significant shifts in the workload patterns shortly after each DPM operation.

If you are a VMware DPM product manager, please see my next post Request for UI Consistency.

vCenter Server Linked Mode Configuration Error

October 23rd, 2010

As of vCenter Server 4.1, VMware supports Windows Server 2008 R2 as a vCenter platform (remember 2008 R2 is 64-bit only).  With this, I expect many environments will be configured with vCenter Server on Microsoft’s newest Server operating system.

While working in the lab with vCenter Server 4.1 on Windows Server 2008 R2, I ran into an issue configuring Linked Mode via the vCenter Server Linked Mode Configuration shortcut.  Error 28035. Setup failed to copy LDIFDE.EXE from System folder to ‘%windir%\ADAM’ folder.

After no success in relaxing Windows NTFS permissions, I remembered it’s a Windows Server 2008 R2 permissions issue.  The resolution is quite simple and is often the solution when running into similar errors on Windows 7 and Windows Server 2008 R2.  In addition, I found the workaround documented in VMware KB 1025637.  Rather than launching the vCenter Server Linked Mode Configuration as you normally would by clicking on the icon, instead, right click the shortcut and choose Run as administrator.

You should find that launching the shortcut in the administrator context grants the installer the permissions necessary to complete Linked Mode configuration.

Hardware Status and Maintenance Mode

October 20th, 2010

I’m unable to view hardware health status data while a host is in maintenance mode in my vSphere 4.0 Update 1 environment.

A failed memory module was replaced on a host but I’m skeptical about taking it out of maintenance mode until I am sure it is healthy.  There is enough load on this cluster such that removing the host from maintenance mode will result in DRS moving VM workloads onto it within five minutes.  For obvious reasons, I don’t want VMs running on an unhealthy host.

So… I need to disable DRS at the cluster level, take the host out of maintenance mode, verify the hardware health on the Hardware Status tab, then re-enable DRS.  It’s a roundabout process, particularly if it’s a production environment which requires a Change Request (CR) with associated approvals and lead time to toggle the DRS configuration.

Taking a look at KB 1011284, VMware acknowledges the steps above and considers the following a resolution to the problem:

Resolution

By design, the host monitoring agents (IPMI) are not supported while the ESX host is in maintenance mode. You must exit maintenance mode to view the information on the Hardware Status tab. To take the ESX host out of maintenance mode:

  1. Right click ESX host within vSphere Client.
  2. Click on Exit Maintenance Mode.

Fortunately, this design specification has been improved by VMware in vSphere 4.1 where I have the ability to view hardware health while a host is in maintenance mode.

vCenter Storage Monitoring Plug-in Disabled

October 18th, 2010

Those who have upgraded to vSphere (hopefully most of you by now) may become accustomed to the new tab in vCenter labeled Storage Views. From time to time, you may notice that this tab mysteriously disappears from a view where it should normally be displayed.  If you’re a subscriber to my vCalendar, you’ll find a tip on July 18th which speaks to this:

Is your vSphere Storage Views tab or host Hardware Status tab not functioning or missing? Make sure the VMware VirtualCenter Management Webservices service is running on the vCenter Server.

The solution above is an easy enough resolution, but what if that doesn’t fix the problem?  I ran into another instance of the Storage Views tab disappearing and it was not due to a stopped VMware VirtualCenter Management Webservices service.  After a short investigation, I found a failed or disabled vCenter Storage Monitoring (Storage Monitoring and Reporting) plug-in:
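For anyone who would rather script that first check, the built-in Windows `sc query` command reports a service’s state, and its output is easy to parse.  The sketch below is a minimal example only.  The short service name `vctomcat` is my assumption for the VMware VirtualCenter Management Webservices service, so confirm it in the Services console before relying on it.

```python
import re
import subprocess


def parse_sc_state(sc_output):
    """Extract the state name (e.g. RUNNING, STOPPED) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def service_state(service_name="vctomcat"):
    """Run `sc query` on Windows and return the service state, or None.

    The default service name is an assumption; adjust it for your install.
    """
    result = subprocess.run(
        ["sc", "query", service_name], capture_output=True, text=True
    )
    return parse_sc_state(result.stdout)
```

Run `service_state()` on the vCenter Server itself.  Anything other than `RUNNING` points back at the vCalendar tip above.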

SnagIt Capture

For those who cannot read the screen shot detail above, and for the purposes of Google search, I’ll paste the error code below:

The plug-in failed to load on server(s) <your vCenter Server> due to the following error: Could not load file or assembly ‘VpxClientCommon, Version=4.1.0.0, Culture=neutral, PublicKeyToken=7c80a434483c7c50’ or one of its dependencies. The system cannot find the file specified.

I performed some testing in the lab and here’s what I found.  Long story short, installation of the vSphere 4.1 Client on a system which already has the vSphere 4.0 Update 1 Client installed causes the issue.  The 4.1 Client installs a file called SMS.dll (dated 5/13/2010) into the directory C:\Program Files (x86)\VMware\Infrastructure\Virtual Infrastructure Client\Plugins\SMS\ overwriting the previous version (dated 11/7/2009).  While the newer version of the SMS.dll file causes no issues and works fine when connecting to vCenter 4.1 Servers, it’s not backward compatible with vCenter 4.0 Update 1.  The result is what you see in the image above: the plug-in is disabled and cannot be enabled.

Furthermore, if you investigate your vSphere Client log files at C:\Users\%username%\AppData\Local\VMware\vpx\ you’ll find another similar entry:

System.IO.FileNotFoundException: Could not load file or assembly ‘VpxClientCommon, Version=4.1.0.0, Culture=neutral, PublicKeyToken=7c80a434483c7c50’ or one of its dependencies. The system cannot find the file specified.
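If you want to check a workstation for this entry without opening each log by hand, a few lines of Python will do it.  This is only a sketch: the `*.log` file pattern is my assumption about how the client names its logs, and the marker string is simply the leading part of the error text quoted above.

```python
from pathlib import Path

MARKER = "Could not load file or assembly"


def find_assembly_errors(log_dir, marker=MARKER):
    """Return (file name, line number, text) for each log line containing marker."""
    hits = []
    for log_file in sorted(Path(log_dir).glob("*.log")):
        with open(log_file, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if marker in line:
                    hits.append((log_file.name, number, line.strip()))
    return hits
```

Point it at the log folder mentioned above, for example `find_assembly_errors(Path.home() / "AppData" / "Local" / "VMware" / "vpx")`.  An empty list means the error was not logged on that machine.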

Copying the old version of the SMS.dll file into its proper location resolves the plug-in issue when connecting to a vSphere 4.0 Update 1 vCenter Server.  This much I tested.  However, I’m sure it immediately breaks the plug-in when connecting to a vCenter 4.1 Server (I didn’t go so far as to test this).

Essentially what this boils down to is a VMware vSphere Client bug which is going to bite people who have both vCenter Server 4.0 and 4.1 in their environment, and the respective clients are installed on the same endpoint machine.  I expect to hear about this more as people start their upgrades from vSphere 4.0 to vSphere 4.1.  Some may not even realize they have the issue, after all, I didn’t notice it until I was looking for the Storage Views tab and it wasn’t there.  After lab testing, I did some looking around on the net to see if anyone had discovered or documented this issue and the only hit I came across was a recently started VMware Communities thread, however, there was no posted solution.  The thread does contain a few hints which would have pointed me in the right direction much quicker had I read it ahead of time.  Nonetheless, time spent in the lab is time well spent as far as I’m concerned.  Unfortunately, there’s no fix here I can offer.  This one is on VMware to fix with a new release of the vSphere 4.1 Client.

Update 12/1/10:  VMware has released KB 1024493 to identify this problem and temporarily address the issue with a workaround:

Installing each Client version in different folders does not work. When you install the first Client you are asked where you want to install it. When you install the second Client, you are not asked for a location. Instead, the installer sees that you have already installed a Client and automatically tries to install the second Client in the same directory.

To install vSphere Client 4.0 and 4.1 in separate directories:

  1. Install vSphere Client 4.0 in C:\Client4.0.
  2. Copy C:\Client4.0 to an external drive (such as a share or USB).
  3. Uninstall vSphere Client 4.0. Do not skip this step.
  4. Install vSphere Client 4.1 in C:\Client4.1.
  5. Copy the 4.0 Client folder from the external drive to the machine.
  6. Run vpxClient.exe from the 4.0 or 4.1 folder.
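
Once both folders are in place, the remaining chore is remembering which client to launch for which vCenter Server.  A small lookup like the following could sit behind a launcher script.  It is a sketch under assumptions: the folder names come from the KB steps above, but the function name and the exact location of vpxClient.exe inside each folder are mine.

```python
from pathlib import PureWindowsPath

# Install folders taken from the KB steps; the exact location of
# vpxClient.exe inside each folder is an assumption.
CLIENT_DIRS = {
    "4.0": PureWindowsPath(r"C:\Client4.0"),
    "4.1": PureWindowsPath(r"C:\Client4.1"),
}


def client_for(vcenter_version):
    """Return the vpxClient.exe path matching a vCenter version such as '4.0.0'."""
    major_minor = ".".join(vcenter_version.split(".")[:2])
    if major_minor not in CLIENT_DIRS:
        raise ValueError(f"no client installed for vCenter {vcenter_version}")
    return CLIENT_DIRS[major_minor] / "vpxClient.exe"
```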

I’m expecting a more permanent fix in the future which addresses the .DLL incompatibility in the 4.1 vSphere Client.

Update 2/15/11:  Through some lab testing, it looks as if VMware has resolved this issue with the release of vSphere 4.1 Update 1 although KB 1024493 has not been updated yet to reflect this.  I uninstalled all vSphere Clients, then installed vSphere Client 4.0 Update 1, then installed vSphere Client 4.1 Update 1.  The result is the vCenter Storage Monitoring plug-in is no longer malfunctioning.  The Storage Views tab is also available.  Both of those items are a positive reflection of a resolution.  The Search function is failing in a different way but I’m not convinced it has anything to do with two installed vSphere Clients because it is also failing on a different machine which has only one vSphere Client installed.

The Future of VMware Lab Manager

September 12th, 2010

With the release of VMware vCloud Director 1.0 at VMworld 2010 San Francisco, what’s in store for VMware Lab Manager?  The future isn’t entirely clear for me.  I visualize two potential scenarios:

  1. Lab Manager development and product releases continue in parallel with VMware vCloud Director.  Although the two overlap in functionality in certain areas, they will co-exist on into the future in perfect harmony.
  2. VMware vCloud Director gains the features, popularity, pricing, and momentum needed to obsolete and sunset Lab Manager.

I’ve got no formal bit of information from VMware regarding the destiny of Lab Manager. In lieu of that, I’ve been keeping my ear to the rail trying to pick up clues from VMware body language.  Here are some of the items I’ve got in my notebook thus far:

Development Efforts:  First and foremost, what seems obvious to me is that VMware has all but stopped development of Lab Manager well beyond the past year.  Major functionality hasn’t been introduced since the 3.x version.  Let’s take a look:

4.0 was released in July 2009 which provided compatibility with the recent launch of vSphere, that’s really it. I don’t count VMware’s half baked attempt at integrating with vDS which they market as DPM for Lab Manager (one problem, the service VMs prevent successful host maintenance mode and, in turn, prevent DPM from working; this bug has existed for over a year with no attempts at fixing).  To further add, the use of the Host Spanning network feature leverages the vDS and implies the requirement of Enterprise Plus licensing for the hosts.  This drives up the sticker price of an already costly development solution by some accounts.

4.0.1 was released in December 2009, again to provide compatibility with vSphere 4.0 Update 1. VMware markets this release as introducing compatibility with Windows 7 and 2008 R2 (which in and of itself is not a lie), but anyone who knows the products realizes the key enabler was vSphere 4.0.1 and not Lab Manager 4.0.1 itself.

4.0.2 was released in July 2010 to provide compatibility with vSphere 4.1.  No new features to speak of other than what vSphere 4.1 brings to the table.

Are you noticing the pattern?  Development efforts are being put forth merely to keep up compatibility with the vSphere releases.  Lab Manager documentation hasn’t been updated since the 4.0 release over a year ago, even though there have been two Lab Manager code releases since then.  The 4.0.1 and 4.0.2 versions both point back to the 4.0 documentation.  This is further evidence that there has been no recent feature development in the Lab Manager product itself.

This evidence seems to make it clear that VMware is positioning Lab Manager for retirement.  The logical replacement is vCloud Director.  I haven’t heard of large scale developer layoffs in Palo Alto so a conclusion could be drawn here that most developer effort was pulled from Lab Manager and put on vCloud Director 1.0 to get it out the door in Q3 2010.

Bug Fixes & Feature Requests:  This really ties into Development Efforts, but due to its weight, I thought it deserved a category of its own.  Lab Manager has acquired a significant following over the years by delivering on its promise of making software development more efficient and cost effective through automation.  Much like datacenter virtualization itself, a number of customers have become dependent on the technology.  As much as VMware has satisfied these customers by maintaining Lab Manager compatibility with vSphere, at the same time customers are getting the short end of the stick.  Customers continue to pay their SnS fees but the value add of SnS is diminishing as VMware development efforts have slowed to a crawl.  At one time, SnS would net you new features and bug fixes, in addition to new versions of the software which provide compatibility with the host platforms.  Instead, the long list of customer feature requests (and great ideas I might add) sits dead in a VMware Communities forum thread like this.  The number of bugs fixed in the last two releases of Lab Manager I can almost count on two hands.  And what about squashing these bugs: this, this, and this?  Almost nothing has changed since Steven Kishi (I believe) exited the role of Director of Product Management for VMware Lab Manager.

Again, this evidence seems to make it clear that VMware is sending Lab Manager off into the sunset.  Hello vCloud Director.

Marketing Efforts:  From my perspective, VMware hasn’t spent much time focusing on Lab Manager marketing.  By a show of customer or partner hands, who has seen a Lab Manager presentation from VMware in the last 6-12 months?  This ties strongly into the Development Efforts point made above.  Why market a product which seems to be well beyond its half life?  Consistent with the last thought above, marketing has noticeably shifted almost entirely from Lab Manager to vCloud Director.

Chalk up another point for the theory which states Lab Manager will be consumed by vCloud Director.

Lack of Clear Communication:  About the only voice in my head (of which there are many) which reasons Lab Manager might be sticking around (other than a VMware announcement of a Lab Manager video tutorial series which has now gone stale) is the fact that VMware has not made it formally and publicly clear that Lab Manager is being retired or replaced by vCloud Director.  Although I’m making a positive point here for the going concern of Lab Manager, I think there is ultimately an expiration date of Lab Manager in the not so distant future.  If you understand the basics of vCloud Director or if you have installed and configured it, you’ll notice similarities between it and Lab Manager.  But there is not 100% coverage of Lab Manager functionality and integration.  Until VMware can provide that seamless migration, they obviously aren’t going to pull the plug on Lab Manager.  Quite honestly, I think this is the most accurate depiction of where we’re sitting today.  VMware has a number of areas to address before vCloud Director can successfully replace Lab Manager.  Some are technical such as getting that 100% gap coverage between the two products from a features standpoint.  Some are going to be political/marketing based.  Which customers are ready to replace a tried and true solution with a version 1.0 product?  Some may be cost based.  Will VMware take a 1:1 trade in on Lab Manager for vCloud Director or will there be an uplift fee?  Will Enterprise Plus licensing be a requirement for future versions of vCloud Director?  vCloud Director 1.0 requires Enterprise Plus licensing according to the VMware product’s ‘buy’ page.  Some will be a hybrid.  For instance, existing Lab Manager customers rely on a MS SQL (Express) database.  vCloud Director 1.0 is back ended with Oracle, a costly platform Lab Manager customers won’t necessarily have already in terms of infrastructure and staff.

In summary, this point is an indicator that both Lab Manager and vCloud Director will exist in parallel, however, the signs can’t be ignored that Lab Manager is coasting on fumes.  Its ongoing presence and customer base will require support and future compatibility upgrades from VMware.  Maintaining support for two technologies is more expensive for VMware than maintaining just one.  A larger risk for VMware and customers may be that development efforts for vSphere have to slow down to allow Lab Manager to keep pace.  Even worse, new technology may never see the light of day in vSphere because it cannot be made backward compatible with Lab Manager.  Unless we see a burst in development or marketing for Lab Manager, we may be just a short while away from a formal announcement from VMware stating the retirement of Lab Manager along with the migration plan for Lab Manager customers to become vCloud Director customers.

What are your thoughts?  I’d like to hear some others weigh in.  Please be sure to not disclose any information which would violate an NDA.

Update 2/14/11: VMware has published a VMware vCenter Lab Manager Product Lifecycle FAQ for its current customers which fills in some blanks.  Particularly:

What is the future of the vCenter Lab Manager product line?

As customers continue to expand the use of virtualization both inside the datacenter and outside the firewall, we are focusing on delivering infrastructure solutions that can support these expanded scalability and security requirements. As a result, we have decided to discontinue additional major releases of vCenter Lab Manager. Lab Manager 4 will continue to be supported in line with our General Support Policy through May 1, 2013.

When is the current end-of-support date for vCenter Lab Manager 4?

For customers who are current on SnS, General Support has been extended to May 1, 2013.

Are vCenter Lab Manager customers eligible for offers to any new products?

To provide Lab Manager customers with the opportunity to leverage the scale and security of vCloud Director, customers who are active on SnS may exchange their existing licenses of Lab Manager to licenses of vCloud Director at no additional cost. This exchange program is entirely optional and may be exercised anytime during Lab Manager’s General Support period. This provides customers the freedom and flexibility to decide whether and when to implement a secure enterprise hybrid cloud.

The Primary License Administrator can file a Customer Service Request to request an exchange of licenses. For more information on the terms and conditions of the exchange, contact your VMware account manager.

vCenter Server JVM Memory

September 6th, 2010

For those of you who have installed VMware vCenter Server 4.1, have you noticed anything new during the installation process?  A new screen was introduced at the end of the installation wizard for specifying the anticipated size of the virtual infrastructure which the respective vCenter Server would be managing.  There are three choices here: Small, Medium, & Large.  Sorry, no Supersize available yet.  If you require this option, I’m sure VMware wants to talk to you.

The selection you make from the installation wizard not only defines the Maximum Memory Pool value for the Java Virtual Machine, but also the Initial Memory Pool value.  Following is a chart which takes a look at vCenter Server 4.0 & 4.1 JVM Memory Configuration comparisons:

vCenter/JVM                        Initial Memory Pool   Max Memory Pool   Thread Stack Size
4.0                                128MB                 1024MB            1024KB
4.1 Small (<100 hosts, default)    256MB                 1024MB            1024KB
4.1 Medium (100-400 hosts)         256MB                 2048MB            1024KB
4.1 Large (>400 hosts)             512MB                 4096MB            1024KB

As noted by the table above, in vCenter Server 4.0, the JVM Maximum Memory Pool was configured by default at 1024MB.  The vCenter Server 4.1 installation also defaults to 1024MB (Small <100 hosts) if left unchanged. One other comparison: pay attention to the difference in Initial Memory Pool. By default, vCenter 4.1 uses twice as much RAM out of the gate as previous versions.
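
For those who like to keep sizing decisions in a script, the 4.1 rows of the table can be captured as a small lookup.  This is a sketch, not a VMware tool.  The function names are mine, and writing the pools as `-Xms`/`-Xmx` is just the standard JVM way of expressing initial and maximum heap, which is what I assume these two settings correspond to.

```python
# Values transcribed from the 4.1 rows of the table above, in MB.
JVM_PROFILES = {
    "Small": {"initial_mb": 256, "max_mb": 1024},
    "Medium": {"initial_mb": 256, "max_mb": 2048},
    "Large": {"initial_mb": 512, "max_mb": 4096},
}


def inventory_size(host_count):
    """Map a host count to the installer's inventory size choice."""
    if host_count < 100:
        return "Small"
    if host_count <= 400:
        return "Medium"
    return "Large"


def jvm_flags(host_count):
    """Return the matching heap settings as standard JVM options."""
    profile = JVM_PROFILES[inventory_size(host_count)]
    return f"-Xms{profile['initial_mb']}m -Xmx{profile['max_mb']}m"
```

For example, `jvm_flags(250)` lands in the Medium bucket and returns `-Xms256m -Xmx2048m`.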

Although the installation wizard JVM tuning component is new in 4.1, the ability to tune the JVM for vCenter is not.  The Configure Tomcat application has been available in previous versions of vCenter.  Some organizations with growing infrastructures may have been instructed by VMware support to tune the JVM values to overcome a vCenter issue having to do with scaling or some other issue.

Judging from the table, one can assume that the 1024MB value was appropriate for managing fewer than 100 hosts in vCenter 4.0.  As a point of reference, the Configuration Maximums document states that 300 hosts can be managed by vCenter 4.0.  This would imply that managing 100 hosts or more with vCenter 4.0 requires an adjustment to the out of box setting for the JVM Maximum Memory Pool (change from 1024MB to 2048MB).

With vCenter 4.1, VMware has improved scaling in terms of the number of hosts a vCenter Server can manage.  The Configuration Maximums document specifies vCenter 4.1 can manage 400 hosts but the table above implies VMware may be preparing to support more than 400 hosts in the near future.  And that’s awesome because vCenter Server sprawl sucks. Period.

So have fun tuning the JVM but before you go, a few parting tips:

  • The Initial Memory Pool value defines the memory footprint (Commit Size) of the Tomcat process when the service is first started.  The Maximum Memory Pool defines the memory footprint which the Tomcat process is allowed to grow to.  Make sure you have sufficient RAM installed in your server to accommodate both of these values.
  • Setting the Initial Memory Pool to a value greater than the Maximum Memory Pool will prevent the Tomcat JVM from starting.  I thought I’d mention that before you spend too much time pulling your hair out.
  • If you would like to learn more about tuning Tomcat, vast resources exist on the internet.  This looks like a good place to start.
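
The first two tips are easy to turn into a sanity check to run before touching the Configure Tomcat dialog.  The sketch below is a hypothetical helper of my own.  The RAM comparison is a crude rule that ignores everything else running on the server, so treat a clean result as a minimum bar rather than a guarantee.

```python
def check_jvm_pools(initial_mb, max_mb, server_ram_mb):
    """Return a list of problems with a proposed Tomcat JVM memory setting."""
    problems = []
    if initial_mb > max_mb:
        problems.append(
            "Initial Memory Pool exceeds Maximum Memory Pool; the JVM will not start"
        )
    if max_mb > server_ram_mb:
        problems.append("Maximum Memory Pool exceeds installed RAM")
    return problems
```

An empty list means neither rule was violated.  For instance, `check_jvm_pools(256, 1024, 4096)` returns no problems, while swapping the first two arguments flags the startup failure described above.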

Unable To Retrieve Health Data

September 5th, 2010

A number of people, including myself, have noticed that after upgrading to VMware vCenter 4.1, the vCenter Service Status shows red and displays the error message:

Unable to retrieve health data from https://<VC servername or IP address>/converter/health.xml

VMware has provided a workaround to this issue in KB 1025010.  The workaround involves installing the ldp.exe application binary from Microsoft, however, since I’m running vCenter Server on Windows Server 2008 R2, the binary is already in place by default and no download and installation was required. I’ve applied the workaround and after a service restart and a brief wait, the Service Status health went completely green, which is desired.
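
If you want to confirm the result from outside the vSphere Client, you can request the same URL that appears in the error message.  Here is a minimal sketch using only the Python standard library.  The function names are mine, and certificate verification is switched off on the assumption that a lab vCenter Server is using a self-signed certificate.

```python
import ssl
import urllib.request


def health_url(vcenter_host):
    """Build the Converter health URL shown in the error message."""
    return f"https://{vcenter_host}/converter/health.xml"


def fetch_health(vcenter_host, timeout=10):
    """Fetch health.xml and return its text; raises on HTTP or network errors.

    Certificate checks are disabled because lab vCenter servers commonly use
    self-signed certificates; do not do this on untrusted networks.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(
        health_url(vcenter_host), timeout=timeout, context=context
    ) as response:
        return response.read().decode("utf-8", errors="replace")
```

An exception from `fetch_health` means the request itself failed, which is the same symptom the Service Status page is reporting.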

It’s worth noting for posterity that step 3a is missing a small piece which I have provided in red below:

Double-click DC=virtualcenter,DC=vmware,DC=int, then double-click
-OU=Health,DC=virtualcenter,DC=vmware,DC=int
-OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware,DC=int
-CN=<GUID>.vpxd,CN=<GUID>,OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware
-CN=com.vmware.converter,CN=<GUID>,OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC