Posts Tagged ‘Update Manager’

Windows 2008 R2 and Windows 7 on vSphere

March 28th, 2010

If you run Windows Server 2008 R2 or Windows 7 as a guest VM on vSphere, you may be aware that it was advised in VMware KB Article 1011709 that the SVGA driver should not be installed during VMware Tools installation.  If I recall correctly, this was due to a stability issue which was seen in specific, but not all, scenarios:

If you plan to use Windows 7 or Windows 2008 R2 as a guest operating system on ESX 4.0, do not use the SVGA drivers included with VMware Tools. Use the standard SVGA driver instead.

Since the SVGA driver is installed by default in a typical installation, it was necessary to perform a custom (or perhaps scripted) installation to exclude the SVGA driver for these guest OS types.  Alternatively, you could perform a typical VMware Tools installation and remove the SVGA driver from Device Manager afterwards.  What you ended up with, of course, was a VM using the Microsoft Windows supplied SVGA driver and not the VMware Tools version shown in the first screenshot.  The Microsoft Windows supplied SVGA driver worked and provided stability; however, one side effect was that mouse movement via VMware Remote Console felt a bit sluggish.

Beginning with ESX(i) 4.0 Update 1 (released 11/19/09), VMware changed the behavior and revised the above KB article in February, letting us know that they now package a new version of the SVGA driver in VMware Tools in which the bits are populated during a typical installation but not actually enabled:

The most effective solution is to update to ESX 4.0 Update 1, which provides a new WDDM driver that is installed with VMware Tools and is fully supported. After VMware Tools upgrade you can find it in C:\Program Files\Common Files\VMware\Drivers\wddm_video.

After a typical VMware Tools installation, you’ll still see a standard SVGA driver installed.  Following the KB article, head to Windows Device Manager and update the driver to the bits located in C:\Program Files\Common Files\VMware\Drivers\wddm_video.

The result is that the new WDDM driver, which ships with the newer version of VMware Tools, is installed.

After a reboot, the crisp and precise mouse movement I’ve become accustomed to over the years with VMware has returned.  The bummer here is that while the appropriate VMware SVGA drivers get installed in previous versions of Windows guest operating systems, Windows Server 2008 R2 and Windows 7 require manual installation steps, much like VMware Tools installation on Linux guest VMs.  Add to this the fact that the automated installation/upgrade of VMware Tools via VMware Update Manager (VUM) does not enable the wddm driver.  In short, getting the appropriate wddm driver installed for many VMs will require manual intervention or scripting.  One thing you can do is to get the wddm driver installed in your Windows Server 2008 R2 and Windows 7 VM templates.  This will ensure VMs deployed from the templates have the wddm driver installed and enabled.

The wddm driver install method from VMware is helpful for the short term; however, it’s not a scalable, robust long-term solution.  We need an automated solution from VMware to get the wddm driver installed.  It needs to be integrated with VUM.  I’m interested in finding out what happens with the next VMware Tools upgrade – will the wddm driver persist, or will the VMware Tools upgrade replace the wddm version with the standard version?  Stay tuned.

VMware Update Manager Becomes Self-Aware

March 4th, 2010

@Mikemohr on Twitter tonight said it best:

“Haven’t we learned from Hollywood what happens when the machines become self-aware?”

I got a good chuckle.  He took my comment of VMware becoming “self-aware” exactly where I wanted it to go.  A reference to The Terminator series of films in which a sophisticated computer defense system called Skynet becomes self-aware and things go downhill for mankind from there.

Metaphorically speaking in today’s case, Skynet is VMware vSphere and mankind is represented by VMware vSphere Administrators.

During an attempt to patch my ESX(i) 4 hosts, I received an error message.

At that point, the remediation task fails and the host is not patched.  The VUM log file reflects the same error in a little more detail:

[2010-03-04 14:58:04:690 'JobDispatcher' 3020 INFO] [JobDispatcher, 1616] Scheduling task VciHostRemediateTask{675}
[2010-03-04 14:58:04:690 'JobDispatcher' 3020 INFO] [JobDispatcher, 354] Starting task VciHostRemediateTask{675}
[2010-03-04 14:58:04:690 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciTaskBase, 534] Task started…
[2010-03-04 14:58:04:908 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciHostRemediateTask, 680] Host host-112 scheduled for patching.
[2010-03-04 14:58:05:127 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciHostRemediateTask, 691] Add remediate host: vim.HostSystem:host-112
[2010-03-04 14:58:13:987 'InventoryMonitor' 2180 INFO] [InventoryMonitor, 427] ProcessUpdate, Enter, Update version := 15936
[2010-03-04 14:58:13:987 'InventoryMonitor' 2180 INFO] [InventoryMonitor, 460] ProcessUpdate: object = vm-2642; type: vim.VirtualMachine; kind: 0
[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 WARN] [vciHostRemediateTask, 717] Skipping host solo.boche.mcse as it contains VM that is running VUM or VC inside it.
[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciHostRemediateTask, 786] Skipping host 0BC5A140, none of upgrade and patching is supported.
[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 ERROR] [vciHostRemediateTask, 230] No supported Hosts found for Remediate.
[2010-03-04 14:58:17:737 'VciRemediateTask.RemediateTask{674}' 2676 INFO] [vciTaskBase, 583] A subTask finished: VciHostRemediateTask{675}
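
When a remediation fails like this, it can help to pull only the WARN and ERROR lines out of the VUM log. What follows is a minimal Python sketch, assuming every line uses the bracketed layout seen in the excerpt above (timestamp, quoted task name, numeric ID, level, then module and line number). The regular expression and the `find_problems` helper are my own illustration, not something that ships with VUM.

```python
import re

# Layout assumed from the excerpt above:
# [<date> <time> '<task name>' <id> <LEVEL>] [<module>, <line>] <message>
LINE_RE = re.compile(
    r"^\[(?P<ts>\S+ \S+) '(?P<task>[^']*)' (?P<tid>\d+) (?P<level>[A-Z]+)\]"
    r" \[(?P<module>[^,\]]+), (?P<line>\d+)\] (?P<msg>.*)$"
)


def find_problems(log_text, levels=("WARN", "ERROR")):
    """Return (level, module, message) for each log line at one of the given levels."""
    hits = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if m and m.group("level") in levels:
            hits.append((m.group("level"), m.group("module"), m.group("msg")))
    return hits


# Three lines copied from the log excerpt above.
SAMPLE = """\
[2010-03-04 14:58:04:908 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciHostRemediateTask, 680] Host host-112 scheduled for patching.
[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 WARN] [vciHostRemediateTask, 717] Skipping host solo.boche.mcse as it contains VM that is running VUM or VC inside it.
[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 ERROR] [vciHostRemediateTask, 230] No supported Hosts found for Remediate.
"""

for level, module, msg in find_problems(SAMPLE):
    print(level, module, msg)
```

Lines that do not match the assumed layout are silently skipped, so a different VUM version with a different log format would simply return nothing rather than raise an error.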

Further testing in the lab revealed that this condition is triggered by a vCenter VM and/or a VMware Update Manager (VUM) VM running on the host to be remediated. I understand from colleagues on Twitter that they’ve seen the same symptoms occur with patch staging.

The workaround is to manually place the host in maintenance mode, at which time it has no problem whatsoever evacuating all VMs, including infrastructure VMs.  At that point, the host in maintenance mode can be remediated.

VMware Update Manager has apparently become self-aware in that it detects when its infrastructure VMs are running on the same host hardware which is to be remediated.  Self-awareness in and of itself isn’t bad, however, its feature integration is.  Unfortunately for the humans, this is a step backwards in functionality and a reduction in efficiency for a task which was once automated.  Previously, a remediation task had no problem evacuating all VMs from a host, infrastructure or not. What we have now is… well… consider the following pre and post “self-awareness” remediation steps:

Pre “self-awareness” remediation for a 6 host cluster containing infrastructure VMs:

  1. Right click the cluster object and choose Remediate
  2. Hosts are automatically and sequentially placed in maintenance mode, evacuated, patched, rebooted, and brought out of maintenance mode

Post “self-awareness” remediation for a 6 host cluster containing infrastructure VMs:

  1. Right click Host1 object and choose Enter Maintenance Mode
  2. Wait for evacuation to complete
  3. Right click Host1 object and choose Remediate
  4. Wait for remediation to complete
  5. Right click Host1 object and choose Exit Maintenance Mode
  6. Right click Host2 object and choose Enter Maintenance Mode
  7. Wait for evacuation to complete
  8. Right click Host2 object and choose Remediate
  9. Wait for remediation to complete
  10. Right click Host2 object and choose Exit Maintenance Mode
  11. Right click Host3 object and choose Enter Maintenance Mode
  12. Wait for evacuation to complete
  13. Right click Host3 object and choose Remediate
  14. Wait for remediation to complete
  15. Right click Host3 object and choose Exit Maintenance Mode
  16. Right click Host4 object and choose Enter Maintenance Mode
  17. Wait for evacuation to complete
  18. Right click Host4 object and choose Remediate
  19. Wait for remediation to complete
  20. Right click Host4 object and choose Exit Maintenance Mode
  21. Right click Host5 object and choose Enter Maintenance Mode
  22. Wait for evacuation to complete
  23. Right click Host5 object and choose Remediate
  24. Wait for remediation to complete
  25. Right click Host5 object and choose Exit Maintenance Mode
  26. Right click Host6 object and choose Enter Maintenance Mode
  27. Wait for evacuation to complete
  28. Right click Host6 object and choose Remediate
  29. Wait for remediation to complete
  30. Right click Host6 object and choose Exit Maintenance Mode
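
The arithmetic behind the two lists is easy to sketch. The Python below only generates the manual checklist for an arbitrary number of hosts. The host names are placeholders and nothing here talks to vCenter or VUM; it is an illustration of how the work scales, not an automation script.

```python
# The five manual actions required per host once cluster-level remediation is skipped.
PER_HOST_STEPS = (
    "Right click {host} object and choose Enter Maintenance Mode",
    "Wait for evacuation to complete",
    "Right click {host} object and choose Remediate",
    "Wait for remediation to complete",
    "Right click {host} object and choose Exit Maintenance Mode",
)


def manual_remediation_steps(host_count):
    """Build the manual checklist for hosts named Host1..HostN (placeholder names)."""
    steps = []
    for n in range(1, host_count + 1):
        for template in PER_HOST_STEPS:
            steps.append(template.format(host=f"Host{n}"))
    return steps


steps = manual_remediation_steps(6)
for number, text in enumerate(steps, start=1):
    print(f"{number}. {text}")
print(len(steps), "manual steps in total")
```

The old workflow was two steps regardless of cluster size. The new one is five steps per host, so the count grows linearly with the cluster.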

It’s Saturday and your kids want to go to the park. Do the math.

Update 5/5/10: I received this response back on 3/5/10 from VMware but failed to follow up with finding out if it was ok to share with the public.  I’ve received the blessing now so here it is:

[It] seems pretty tactical to me. We’re still trying to determine if this was documented publicly, and if not, correct the documentation and our processes.

We introduced this behavior in vSphere 4.0 U1 as a partial fix for a particular class of problem. The original problem is in the behavior of the remediation wizard if the user has chosen to power off or suspend virtual machines in the Failure response option.

If a stand-alone host is running a VM with VC or VUM in it and the user has selected those options, the consequences can be drastic – you usually don’t want to shut down your VC or VUM server when the remediation is in progress. The same applies to a DRS disabled cluster.

In DRS enabled cluster, it is also possible that VMs could not be migrated to other hosts for configuration or other reasons, such as a VM with Fault Tolerance enabled. In all these scenarios, it was possible that we could power off or suspend running VMs based on the user selected option in the remediation wizard.

To avoid this scenario, we decided to skip those hosts totally in first place in U1 time frame. In a future version of VUM, it will try to evacuate the VMs first, and only in cases where it can’t migrate them will the host enter a failed remediation state.

One work around would be to remove such a host from its cluster, patch the cluster, move the host back into the cluster, manually migrate the VMs to an already patched host, and then patch the original host.

It would appear VMware intends to grant us back some flexibility in future versions of vCenter/VUM.  Let’s hope so. This implementation leaves much to be desired.

Update 5/6/10: LucD created a blog post titled Counter the self-aware VUM. In this blog post you’ll find a script which finds the ESX host(s) that is/are running the VUM guest and/or the vCenter guest and will vMotion the guest(s) to another ESX host when needed.

11 New ESX(i) 4.0 Patch Definitions Released; 6 Critical

March 3rd, 2010

Eleven new patch definitions have been released for ESX(i) 4.0 (7 for ESX, 2 for ESXi, 2 for the Cisco Nexus 1000V).  Previous versions of ESX(i) are not impacted.

6 of the 11 patch definitions are rated critical and should be evaluated quickly for application in your virtual infrastructure.

ID: ESX400-201002401-BG Impact: Critical Release date: 2010-03-03 Products: esx 4.0.0 Updates vmkernel64,vmx,hostd etc

This patch provides support and fixes the following issues:

  • On some systems under heavy networking and processor load (large number of virtual machines), some NIC drivers might randomly attempt to reset the device and fail.
    The VMkernel logs generate the following messages every second:
    Oct 13 05:19:19 vmkernel: 0:09:22:33.216 cpu2:4390)WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic1: transmit timed out
    Oct 13 05:19:20 vmkernel: 0:09:22:34.218 cpu8:4395)WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic1: transmit timed out
  • ESX hosts do not display the proper status of the NFS datastore after recovering from a connectivity loss.
    Symptom: In vCenter Server, the NFS datastore is displayed as inactive.
  • When using NPIV, if the LUN on the physical HBA path is not same as the LUN on the virtual port (VPORT) path, though the LUNID:TARGETID pairs are same, then I/O might be directed to the wrong LUN causing a possible data corruption. Refer KB 1015290 for more information.
    Symptom: If NPIV is not configured properly, I/O might be directed to the wrong disk.
  • On Fujitsu systems, the OEM-IPMI-Command-Handler that lists the available OEM IPMI commands do not work as intended. No custom OEM IPMI commands are listed, though they were initialized correctly by the OEM. After applying this fix, running the VMware_IPMIOEMExtensionService and VMware_IPMIOEMExtensionServiceImpl objects displays the supported commands as listed in the command files.
  • Provides prebuilt kernel module drivers for Ubuntu 9.10 guest operating systems.
  • Adds support for upstreamed kernel PVSCSI and vmxnet3 modules.
  • Provides a change to the maintenance mode requirement during Cisco Nexus 1000V software upgrade. After installing this patch if you perform Cisco Nexus 1000V software upgrade, the ESX host goes into maintenance mode during the VEM upgrade.
  • In certain race conditions, freeing journal blocks from VMFS filesystems might fail. The WARNING: J3: 1625: Error freeing journal block (returned 0) <FB 428659> for 497dd872-042e6e6b-942e-00215a4f87bb: Lock was not free error is written to the VMware logs.
  • Changing the resolution of the guest operating system over a PCoIP connection (desktops managed by View 4.0) might cause the virtual machine to stop responding.
    Symptoms: The following symptoms might be visible:

    • When you try to connect to the virtual machine through a vCenter Server console, a black screen appears with the Unable to connect to MKS: vmx connection handshake failed for vmfs {VM Path} message.
    • Performance graphs for CPU and memory usage in vCenter Server drop to 0.
    • Virtual machines cannot be powered off or restarted.

ID: ESX400-201002402-BG Impact: Critical Release date: 2010-03-03 Products: esx 4.0.0 Updates initscripts

This patch fixes an issue where pressing Ctrl+Alt+Delete on service console causes ESX 4.0 hosts to reboot.

ID: ESX400-201002404-SG Impact: HostSecurity Release date: 2010-03-03 Products: esx 4.0.0 Updates glib2

The service console package for GLib2 is updated to version glib2-2.12.3-4.el5_3.1. This GLib update fixes an issue where the functions inside GLib incorrectly allows multiple integer overflows leading to heap-based buffer overflows in GLib’s Base64 encoding and decoding functions. This might allow an attacker to possibly execute arbitrary code while a user is running the application. The Common Vulnerabilities and Exposures Project (cve.mitre.org) has assigned the name CVE-2008-4316 to this issue.

ID: ESX400-201002405-BG Impact: Critical Release date: 2010-03-03 Products: esx 4.0.0 Updates megaraid-sas

This patch fixes an issue where some applications do not receive events even after registering for Asynchronous Event Notifications (AEN). This issue occurs when multiple applications register for AENs.

ID: ESX400-201002406-SG Impact: HostSecurity Release date: 2010-03-03 Products: esx 4.0.0 Updates newt

The service console package for Newt library is updated to version newt-0.52.2-12.el5_4.1. This security update of Newt library fixes an issue where an attacker might cause a denial of service or possibly execute arbitrary code with the privileges of a user who is running applications using the Newt library. The Common Vulnerabilities and Exposures Project (cve.mitre.org) has assigned the name CVE-2009-2905 to this issue.

ID: ESX400-201002407-SG Impact: HostSecurity Release date: 2010-03-03 Products: esx 4.0.0 Updates nfs-utils

The service console package for nfs-utils is updated to version nfs-utils-1.0.9-42.el5. This security update of nfs-utils fixes an issue that might permit a remote attacker to bypass an intended access restriction. The Common Vulnerabilities and Exposures Project (cve.mitre.org) has assigned the name CVE-2008-4552 to this issue.

ID: ESX400-201002408-BG Impact: Critical Release date: 2010-03-03 Products: esx 4.0.0 Updates Enic driver

In scenarios where Pass Thru Switching (PTS) is in effect, if virtual machines are powered on, the network interface might not come up. In PTS mode, when the network interface is brought up, PTS figures the MTU from the network. There is a race in this scenario, where the enic driver might incorrectly indicate that the driver fails. This issue might occur frequently on a CISCO UCS system. This patch fixes the issue.

ID: ESXi400-201002401-BG Impact: Critical Release date: 2010-03-03 Products: embeddedEsx 4.0.0 Updates Firmware

This patch provides support and fixes the following issues:

  • On some systems under heavy networking and processor load (large number of virtual machines), some NIC drivers might randomly attempt to reset the device and fail.
    The VMkernel logs generate the following messages every second:
    Oct 13 05:19:19 vmkernel: 0:09:22:33.216 cpu2:4390)WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic1: transmit timed out
    Oct 13 05:19:20 vmkernel: 0:09:22:34.218 cpu8:4395)WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic1: transmit timed out
  • ESX hosts do not display the proper status of the NFS datastore after recovering from a connectivity loss.
    Symptom: In vCenter Server, the NFS datastore is displayed as inactive.
  • When using NPIV, if the LUN on the physical HBA path is not same as the LUN on the virtual port (VPORT) path, though the LUNID:TARGETID pairs are same, then I/O might be directed to the wrong LUN causing a possible data corruption. Refer KB 1015290 for more information.
    Symptom: If NPIV is not configured properly, I/O might be directed to the wrong disk.
  • On Fujitsu systems, the OEM-IPMI-Command-Handler that lists the available OEM IPMI commands do not work as intended. No custom OEM IPMI commands are listed, though they were initialized correctly by the OEM. After applying this fix, running the VMware_IPMIOEMExtensionService and VMware_IPMIOEMExtensionServiceImpl objects displays the supported commands as listed in the command files.
  • Provides prebuilt kernel module drivers for Ubuntu 9.10 guest operating systems.
  • Adds support for upstreamed kernel PVSCSI and vmxnet3 modules.
  • Provides a change to the maintenance mode requirement during Cisco Nexus 1000V software upgrade. After installing this patch if you perform Cisco Nexus 1000V software upgrade, the ESX host goes into maintenance mode during the VEM upgrade.
  • In certain race conditions, freeing journal blocks from VMFS filesystems might fail. The WARNING: J3: 1625: Error freeing journal block (returned 0) <FB 428659> for 497dd872-042e6e6b-942e-00215a4f87bb: Lock was not free error is written to the VMware logs.
  • Changing the resolution of the guest operating system over a PCoIP connection (desktops managed by View 4.0) might cause the virtual machine to stop responding.
    Symptoms: The following symptoms might be visible:

    • When you try to connect to the virtual machine through a vCenter Server console, a black screen appears with the Unable to connect to MKS: vmx connection handshake failed for vmfs {VM Path} message.
    • Performance graphs for CPU and memory usage in vCenter Server drop to 0.
    • Virtual machines cannot be powered off or restarted.

ID: ESXi400-201002402-BG Impact: Critical Release date: 2010-03-03 Products: embeddedEsx 4.0.0 Updates VMware Tools

This patch fixes an issue where pressing Ctrl+Alt+Delete on service console causes ESX 4.0 hosts to reboot.

ID: VEM400-201002001-BG Impact: HostGeneral Release date: 2010-03-03 Products: embeddedEsx 4.0.0, esx 4.0.0 Cisco Nexus 1000V VEM

ID: VEM400-201002011-BG Impact: HostGeneral Release date: 2010-03-03 Products: embeddedEsx 4.0.0, esx 4.0.0 Cisco Nexus 1000V VEM
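
The counts quoted at the top of this post can be checked mechanically against the ID lines. This is a small Python sketch, assuming the "ID: ... Impact: ... Release date: ..." layout used in this post; the regular expression and the `summarize` helper are my own, and the sample text is just the eleven ID lines trimmed to those three fields.

```python
import re
from collections import Counter

# Field layout assumed from the "ID: ... Impact: ... Release date: ..." lines in this post.
HEADER_RE = re.compile(
    r"ID: (?P<id>\S+) Impact: (?P<impact>\S+) Release date: (?P<date>\d{4}-\d{2}-\d{2})"
)

# The eleven ID lines from this post, trimmed to the first three fields.
HEADERS = """\
ID: ESX400-201002401-BG Impact: Critical Release date: 2010-03-03
ID: ESX400-201002402-BG Impact: Critical Release date: 2010-03-03
ID: ESX400-201002404-SG Impact: HostSecurity Release date: 2010-03-03
ID: ESX400-201002405-BG Impact: Critical Release date: 2010-03-03
ID: ESX400-201002406-SG Impact: HostSecurity Release date: 2010-03-03
ID: ESX400-201002407-SG Impact: HostSecurity Release date: 2010-03-03
ID: ESX400-201002408-BG Impact: Critical Release date: 2010-03-03
ID: ESXi400-201002401-BG Impact: Critical Release date: 2010-03-03
ID: ESXi400-201002402-BG Impact: Critical Release date: 2010-03-03
ID: VEM400-201002001-BG Impact: HostGeneral Release date: 2010-03-03
ID: VEM400-201002011-BG Impact: HostGeneral Release date: 2010-03-03
"""


def summarize(text):
    """Count patch definitions by impact rating and by ID prefix (ESX400, ESXi400, VEM400)."""
    by_impact = Counter()
    by_family = Counter()
    for m in HEADER_RE.finditer(text):
        by_impact[m.group("impact")] += 1
        by_family[m.group("id").split("-")[0]] += 1
    return by_impact, by_family


by_impact, by_family = summarize(HEADERS)
print(dict(by_impact))
print(dict(by_family))
```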

VMware Releases ESX(i) 3.5 Update 5; Critical Updates

December 5th, 2009

VMware apparently released ESX(i) 3.5 Update 5, dated 12/3/09; however, it only became available in Update Manager late this afternoon.  VMware is extremely poor at communicating anything but major releases, so to get the fastest possible notification of security patches and updates, I configure my VMware Update Manager servers to check for updates every 6 hours and email me anything they find.  VMware doesn’t listen to me much when it comes to feature requests so I’ll shelve the ranting.

So what’s new in ESX 3.5 Update 5?  The major highlights are guest VM support for Windows 7 and Windows Server 2008 R2 (reminder, 64-bit only), as well as Ubuntu 9.04, and added hardware support for processors and NICs.  Before you get too excited about Windows 7, remember that it is not a supported guest operating system in VMware View.  Even in the new View 4 release, Windows 7 has “Technology Preview” support status only.

If you track the updates from VMware Update Manager, the 12/3 releases amount to 20 updates including Update 5, 16 of which are rated critical.  If you’re still a ways out on vSphere deployment, you’ll probably want to take a look at the critical updates for your 3.x environment.

Enablement of Intel Xeon Processor 3400 Series – Support for the Intel Xeon processor 3400 series has been added. Support includes Enhanced VMotion capabilities. For additional information on previous processor families supported by Enhanced VMotion, see Enhanced VMotion Compatibility (EVC) processor support (KB 1003212).

Driver Update for Broadcom bnx2 Network Controller – The driver for bnx2 controllers has been upgraded to version 1.6.9. This driver supports bootcode upgrade on bnx2 chipsets and requires bmapilnx and lnxfwnx2 tools upgrade from Broadcom. This driver also adds support for Network Controller – Sideband Interface (NC-SI) for SOL (serial over LAN) applicable to Broadcom NetXtreme 5709 and 5716 chipsets.

Driver Update for LSI SCSI and SAS Controllers – The driver for LSI SCSI and SAS controllers is updated to version 2.06.74. This version of the driver is required to provide a better support for shared SAS environments.

Newly Supported Guest Operating Systems – Support for the following guest operating systems has been added specifically for this release:

  • Windows 7 Enterprise (32-bit and 64-bit)
  • Windows 7 Ultimate (32-bit and 64-bit)
  • Windows 7 Professional (32-bit and 64-bit)
  • Windows 7 Home Premium (32-bit and 64-bit)
  • Windows 2008 R2 Standard Edition (64-bit)
  • Windows 2008 R2 Enterprise Edition (64-bit)
  • Windows 2008 R2 Datacenter Edition (64-bit)
  • Windows 2008 R2 Web Server (64-bit)
  • Ubuntu Desktop 9.04 (32-bit and 64-bit)
  • Ubuntu Server 9.04 (32-bit and 64-bit)

For more complete information about supported guests included in this release, see the VMware Compatibility Guide: http://www.vmware.com/resources/compatibility/search.php?deviceCategory=software.

Newly Supported Management Agents – See VMware ESX Server Supported Hardware Lifecycle Management Agents for current information on supported management agents.

Newly Supported Network Cards – This release of ESX Server supports HP NC375T (NetXen) PCI Express Quad Port Gigabit Server Adapter.

Newly Supported SATA Controllers – This release of ESX Server supports the Intel Ibex Peak SATA AHCI controller.

Note:

  • Some limitations apply in terms of support for SATA controllers. For more information, see SATA Controller Support in ESX 3.5 (KB 1008673).
  • Storing VMFS datastores on native SATA drives is not supported.

Create a 32-bit vCenter DSN on a 64-bit Operating System

November 21st, 2009

As I had pointed out in this blog post, VMware hints that 64-bit may be the future for vCenter Server. I decided that for my upgrade to vCenter 4.0 Update 1 this weekend, I would take the opportunity to rebuild my vCenter server from Windows Server 2003 32-bit to Windows Server 2008 64-bit.

Once the 64-bit base operating system build was complete, I installed the 64-bit Microsoft SQL Server Native Client drivers (downloadable here) since my back end database is Microsoft SQL Server 2005 on a remote server. A key thing to remember about this installation is that it installs both 64-bit and 32-bit DSN drivers.

The next step is to create the vCenter ODBC DSNs. Although vCenter Server runs on 64-bit operating systems, it currently requires a 32-bit ODBC DSN. This is important to remember because the Windows Start Menu launches the 64-bit ODBC DSN tool, not the 32-bit version I needed.  The vCenter Server (and Update Manager) installation will not complete without a 32-bit DSN.

To create a 32-bit DSN on a 64-bit operating system, run the following executable:

[WindowsDir]\SysWOW64\odbcad32.exe
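
If you want to locate the tool from a script, the path can be built from the Windows directory. This is a minimal Python sketch, assuming `[WindowsDir]` corresponds to the `windir` environment variable (typically C:\Windows); the `odbc_admin_path` helper is a name I made up for illustration.

```python
import ntpath
import os


def odbc_admin_path(windows_dir, want_32bit=True):
    """Return the ODBC Data Source Administrator path on 64-bit Windows.

    On 64-bit Windows the 32-bit binaries live under SysWOW64 and the
    64-bit ones under System32; both tools are named odbcad32.exe.
    """
    subdir = "SysWOW64" if want_32bit else "System32"
    # ntpath gives Windows-style joins even when this runs on another OS.
    return ntpath.join(windows_dir, subdir, "odbcad32.exe")


# Assumption: fall back to C:\Windows when the variable is not set (for example, off Windows).
windows_dir = os.environ.get("windir", r"C:\Windows")
print(odbc_admin_path(windows_dir))
```

The confusing part is that both the 32-bit and 64-bit tools share the same file name, which is why the directory is the only thing that distinguishes them.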

Once the utility opens, you’ll be greeted by all the legacy 32-bit ODBC DSNs you’ve likely seen for years working with tiered Windows platforms. If using Microsoft SQL Server 2005 like me, be sure to select the SQL Native Client driver towards the bottom of the list, and not Driver da Microsoft para arquivos texto highlighted below:

Proceed with the creation of the vCenter Server and Update Manager ODBC DSNs and complete the vCenter Server and Update Manager installations.

This information and much more can be found in the ESX and vCenter Server Installation Guide, page 73.

8 New ESX 3.5.0 Patches Released; 3 Critical

October 16th, 2009

Eight new patches have been released for ESX 3.5.0. Other versions of ESX, including vSphere and ESXi, are not impacted.

3 of the 8 patches are rated critical and should be evaluated quickly for application in your virtual infrastructure.

ID: ESX350-200910401-SG Impact: HostSecurity Release date: 2009-10-16 Products: esx 3.5.0 Updates VMkernel, Tools, hostd

This patch contains the following fixes and enhancements:

This patch updates the service console kernel version to kernel-2.4.21-58.EL. The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the names CVE-2008-4210, CVE-2008-3275, CVE-2008-0598, CVE-2008-2136, CVE-2008-2812, CVE-2007-6063, and CVE-2008-3525 to the security issues fixed in kernel-2.4.21-58.EL.

This patch reduces the boot time of ESX hosts and should be applied when multiple ESX hosts detect LUNs used for Microsoft Cluster Service (MSCS).

Symptom: Error messages similar to the following might be logged in the /var/log/vmkernel log file of the service console:

Jul 24 14:34:24 VMEX3EQCH1100003 vmkernel: 165:15:48:57.500 cpu0:1033)WARNING: SCSI: 5519: Failing I/O due to too many reservation conflicts

Jul 24 14:34:24 VMEX3EQCH1100003 vmkernel: 165:15:48:57.500 cpu0:1033)WARNING: SCSI: 5615: status SCSI reservation conflict, rstatus 0xc0de01 for vmhba1:0:9. residual R 919, CR 0, ER 3

Jul 24 14:34:24 VMEX3EQCH1100003 vmkernel: 165:15:48:57.500 cpu0:1033)SCSI: 6608: Partition table read from device vmhba1:0:9 failed: SCSI reservation conflict (0xbad0022)

Any additional lines or customizations added by a user in the /etc/fstab file are deleted when VMware Tools is reinstalled or reconfigured. This issue occurs because when uninstalling, VMware Tools restores the files which were backed up during installation.

After applying this patch, any request for connection with ESX 3.5 using cipher suite of 56-bit encryption will be dropped. As a result, browsers that exclusively use cipher suites with 40-bit and 56-bit encryption cannot connect to ESX 3.5. Microsoft has made the Internet Explorer High Encryption Pack available for Internet Explorer 5.01 and earlier. Internet Explorer 5.5 and higher versions already use 128-bit encryption.

This patch contains a fix for a security vulnerability in the ISC third-party DHCP client. This vulnerability allows for code execution in the client by a remote DHCP server through a specially crafted subnet-mask option. The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the name CVE-2009-0692 to this issue.

ID: ESX350-200910402-BG Impact: Critical Release date: 2009-10-16 Products: esx 3.5.0 Updates ESX Scripts

This patch is required to be installed with ESX350-200910401-SG (KB 1013124) to resolve a boot-time-related issue. The patch reduces the boot time of ESX hosts and should be applied when multiple ESX hosts detect LUNs used for Microsoft Cluster Service (MSCS).

ID: ESX350-200910403-SG Impact: HostSecurity Release date: 2009-10-16 Products: esx 3.5.0 Updates Web Access

This patch updates the following:

WebAccess component Tomcat server to 5.5.27. This update addresses multiple security issues that exist in the earlier releases of the Tomcat server.

The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the names CVE-2008-1232, CVE-2008-1947, and CVE-2008-2370 to the issues addressed by Tomcat 5.5.27. For more information on these security vulnerabilities, refer to the Apache Tomcat 5.x Vulnerabilities page at http://tomcat.apache.org/security-5.html.

WebAccess component JRE to 1.5.0_18. This update addresses multiple security issues that existed in the previous versions of JRE.

The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the following names to the security issues fixed in JRE 1.5.0_17:

CVE-2008-2086, CVE-2008-5347, CVE-2008-5348, CVE-2008-5349, CVE-2008-5350, CVE-2008-5351, CVE-2008-5352, CVE-2008-5353, CVE-2008-5354, CVE-2008-5356, CVE-2008-5357, CVE-2008-5358, CVE-2008-5359, CVE-2008-5360, CVE-2008-5339, CVE-2008-5342, CVE-2008-5344, CVE-2008-5345, CVE-2008-5346, CVE-2008-5340, CVE-2008-5341, CVE-2008-5343, and CVE-2008-5355.

The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the following names to the security issues fixed in JRE 1.5.0_18:

CVE-2009-1093, CVE-2009-1094, CVE-2009-1095, CVE-2009-1096, CVE-2009-1097, CVE-2009-1098, CVE-2009-1099, CVE-2009-1100, CVE-2009-1101, CVE-2009-1102, CVE-2009-1103, CVE-2009-1104, CVE-2009-1105, CVE-2009-1106, and CVE-2009-1107.

ID: ESX350-200910404-SG Impact: HostSecurity Release date: 2009-10-16 Products: esx 3.5.0 Updates cim

After applying this patch, any request for connection to CIM port 5989 on ESX 3.5 using cipher suite of 56-bit encryption will be dropped.

ID: ESX350-200910405-SG Impact: HostSecurity Release date: 2009-10-16 Products: esx 3.5.0 Updates mptscsi drivers

This patch updates the mptscsi driver to a version that is compatible with the service console version kernel-2.4.21-58.EL.

ID: ESX350-200910406-SG Impact: HostSecurity Release date: 2009-10-16 Products: esx 3.5.0 Updates Service Console DHCP Client

The service console package dhclient has been updated to version dhclient-3.0.1-10.2. This fixes a stack buffer overflow flaw in the ISC DHCP client and a flaw in the way the DHCP daemon init script handles temporary files. The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the names CVE-2009-0692 and CVE-2009-1893 to these issues.

ID: ESX350-200910408-BG Impact: Critical Release date: 2009-10-16 Products: esx 3.5.0 Updates VMkernel iSCSI driver

When ESX 3.5 hosts are connected to Adaptec Snap Server series or Dell NX series of NAS appliances through the ESX software iSCSI initiator, sometimes the iSCSI LUNs are not detected by the ESX 3.5 hosts. The issue is caused due to the way the software iSCSI driver detects an overflow condition. This patch fixes the issue.

ID: ESX350-200910409-BG Impact: Critical Release date: 2009-10-16 Products: esx 3.5.0 Updates Emulex FC driver

ESX 3.5 Update 4 hosts with Emulex HBAs might stop responding when accessed through vCenter Server. This Emulex driver patch fixes the issue.

Symptom: On ESX hosts, any application making an ioctl call in to the Emulex driver might fail.

8 New ESX(i) 4.0 Patches Released; 7 Critical

September 25th, 2009

Eight new patches have been released for ESX(i) 4.0 (6 for ESX, 2 for ESXi).  Previous versions of ESX(i) are not impacted.

7 of the 8 patches are rated critical and should be evaluated quickly for application in your virtual infrastructure.

ID: ESX400-200909401-BG Impact: Critical Release date: 2009-09-24 Products: esx 4.0.0
Updates vmx and vmkernel64
This patch fixes some key issues such as:
* Guest operating system shows high memory usage on Nehalem based systems, which might trigger memory alarms in vCenter.
* Monitor or vmkernel fails when running certain guest operating systems with a 32-bit monitor running in binary translation mode.

See http://kb.vmware.com/kb/1014019 for details

NOTE: Cisco Nexus 1000v customers using VMware Update Manager to patch ESX 4.0 should add an additional patch download URL as described in KB 1013134

ID: ESX400-200909402-BG Impact: Critical Release date: 2009-09-24 Products: esx 4.0.0 Updates VMware Tools
This patch includes the following fixes:
* Updated VMware SVGA and mouse device drivers for supported Linux guest operating systems that use Xorg 7.5.
* PBMs for Debian 5.0.1.
* PBMs for SUSE Linux Enterprise 11 VMI kernel.

See http://kb.vmware.com/kb/1014020 for details

NOTE: Cisco Nexus 1000v customers using VMware Update Manager to patch ESX 4.0 should add an additional patch download URL as described in KB 1013134

ID: ESX400-200909403-BG Impact: Critical Release date: 2009-09-24 Products: esx 4.0.0 Updates bnx2x
This patch fixes the following issues:
* Virtual machines experience a network outage when they run with older versions of VMware Tools (ESX 3.0.x).
* A network outage is experienced if the MTU value is changed on a Broadcom NetXtreme II 10 Gigabit NIC.
* Unloading the driver causes a host reboot.

See http://kb.vmware.com/kb/1014021 for details

NOTE: Cisco Nexus 1000v customers using VMware Update Manager to patch ESX 4.0 should add an additional patch download URL as described in KB 1013134

ID: ESX400-200909404-BG Impact: Critical Release date: 2009-09-24 Products: esx 4.0.0 Updates ixgbe
This patch fixes the following issue:
* A vSphere ESX Host that has NIC teaming configured with the ixgbe driver for the physical NICs might fail if one of the physical NICs goes down.

See http://kb.vmware.com/kb/1014022 for more details

NOTE: Cisco Nexus 1000v customers using VMware Update Manager to patch ESX 4.0 should add an additional patch download URL as described in KB 1013134

ID: ESX400-200909405-BG Impact: HostGeneral Release date: 2009-09-24 Products: esx 4.0.0 Updates perftools
This patch fixes the following issue:
* esxtop utility might quit with the error message “VMEsxtop_GrpStatsInit() failed” when attempting to monitor network status on ESX.

See http://kb.vmware.com/kb/1014023 for more details

NOTE: Cisco Nexus 1000v customers using VMware Update Manager to patch ESX 4.0 should add an additional patch download URL as described in KB 1013134

ID: ESX400-200909406-BG Impact: Critical Release date: 2009-09-24 Products: esx 4.0.0 Updates hpsa
This patch fixes the following issues:
* A virtual machine might fail after the Storage Port controller is reset on ESX hosts that have the HPSA driver connected to an SAS array.
* Hosts cannot detect more than 2 HPSA controllers due to the limited driver heap size.

See http://kb.vmware.com/kb/1014024 for more details

NOTE: Cisco Nexus 1000v customers using VMware Update Manager to patch ESX 4.0 should add an additional patch download URL as described in KB 1013134

ID: ESXi400-200909401-BG Impact: Critical Release date: 2009-09-24 Products: embeddedEsx 4.0.0 Updates Firmware
This patch fixes some key issues such as:
* Guest operating system shows high memory usage on Nehalem based systems, which might trigger memory alarms in vCenter.
* Monitor or vmkernel fails when running certain guest operating systems with a 32-bit monitor running in binary translation mode.
See http://kb.vmware.com/kb/1014026 for details

NOTE: Cisco Nexus 1000v customers using VMware Update Manager to patch ESXi 4.0 should add an additional patch download URL as described in KB 1013134

ID: ESXi400-200909402-BG Impact: Critical Release date: 2009-09-24 Products: embeddedEsx 4.0.0 Updates Tools
This patch includes the following fixes:
* Updated VMware SVGA and mouse device drivers for supported Linux guest operating systems that use Xorg 7.5.
* PBMs for Debian 5.0.1.
* PBMs for SUSE Linux Enterprise 11 VMI kernel.

See http://kb.vmware.com/kb/1014027 for details

NOTE: Cisco Nexus 1000v customers using VMware Update Manager to patch ESXi 4.0 should add an additional patch download URL as described in KB 1013134

Saturday Grab Bag

September 12th, 2009

Here’s a collection of quick hits I’ve been meaning to get to. Individually, their content is a bit on the short side for the length I normally like to write so I thought I’d throw them together in a single post and see how it comes out.

Tasks and Events List Lengths

First up is the listing of Tasks and Events in the vSphere Client. Have you ever started troubleshooting an issue in the vSphere Client by looking at the Tasks or Events, only to find that the chronological listing doesn’t go back far enough to the date or time you’re looking for? Not finding the logs you’re looking for in the vSphere Client usually means you need to open a PuTTY session and start sifting through logs in /var/log/ or /var/log/vmware/ in the Service Console. The reason for this is that the vSphere Client, by default, is configured to tail the last 100 entries in the Tasks or Events list. You can find this setting in your vSphere Client by choosing “Edit|Client Settings” and then the “Lists” tab:

Simply increase the value from 100 to whatever you’d like, with 1,000 being the highest allowable value. Notice that when this number is increased, you will immediately see more history. In other words, you don’t have to necessarily wait for time to pass and more historical events to accumulate to see the additional rows of information. Also note that this is a vSphere Client setting which is retained client side and applies to both vCenter Server and ESX(i) host connections.

Collecting diagnostic information for VMware products

Like any offering from a software or hardware vendor, VMware products aren’t perfect. During your VMware experience, you may run into a problem which requires the intervention of VMware support. More often than not, VMware is going to ask you to generate a support bundle which consists of a collection of diagnostic and configuration files and logs. Following this paragraph is a link to VMware KB1008524 which contains links to creating support bundles for various VMware products. Note that in some cases there are different methods for different versions of the same product. If you choose to create a VMware SR online, it is helpful to have created these log bundles in advance so you can attach them to the SR. If you’ve done VMware support long enough, you already know how to FTP log bundles to VMware after an SR number has been generated.

Collecting diagnostic information for VMware products

New VMware Update Manager won’t download ESX(i) patches

Scenario: You’ve built a new VMware vCenter Server in addition to a new VMware Update Manager Server (VUM). After properly configuring Update Manager as well as the necessary internet, proxy, baseline, and scheduled task settings, VUM proceeds to download Windows, Linux, and application patches, but it won’t download ESX(i) host patches. As I found out by trench experience, the cause is that no ESX(i) hosts have been added to the vCenter Server and thus no hosts are being managed by VUM. You need to add at least one ESX(i) host to vCenter Server before VUM will be triggered to suck down all the host updates. One might then ask why guest patches are being downloaded. The only answer I have for the inconsistent behavior is that ESX(i) host patches are downloaded from VMware, while guest OS and application patches are downloaded from a completely different source, Shavlik. The mechanics behind the download processes obviously differ between the two.

What vCenter Server is this ESX(i) host managed by?

Scenario: You administer a large VMware virtual infrastructure with many vCenter Servers. You need to manage or configure a host or cluster but haven’t the slightest idea what vCenter Server to connect to. You can easily find out by attempting a Virtual Infrastructure Client connection to the host in question. Shortly after providing the necessary host credentials, the IP address of the vCenter Server managing this host will be revealed:

Now in theory, you could establish a Virtual Infrastructure Client connection to the IP address, however, I don’t like this because it dirties up the cached connection list with IP addresses which are meaningless short of having them all memorized. I prefer to take it a step further by opening a Command Prompt and using the command ping -a <IP_address> to reveal the name of the vCenter Server managing the host:

The command above reveals jarjar.boche.mcse as the vCenter Server which is managing the ESX(i) host I wanted to manage.

I’m sure a PowerShell expert will follow up with a script which makes this process easier, but this is a good example to follow if you don’t have PowerShell or the VI Toolkit (PowerCLI) installed.
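The reverse lookup that ping -a performs can also be scripted. Below is a minimal Python sketch, assuming the vCenter Server has a reverse (PTR) record that the system resolver can see. The function name vcenter_name and its resolver parameter are my own inventions for illustration, and the lookup goes through the operating system's resolver, so results may differ from ping -a on networks that rely on NetBIOS name resolution.

```python
import socket


def vcenter_name(ip_address, resolver=socket.gethostbyaddr):
    """Return the host name registered for ip_address, or None if none is found.

    resolver defaults to the system reverse lookup; it is a parameter only so
    the function can be exercised without touching the network.
    """
    try:
        # gethostbyaddr returns (hostname, alias_list, ip_address_list)
        hostname, _aliases, _addresses = resolver(ip_address)
    except (socket.herror, socket.gaierror):
        # No reverse record, or the address could not be resolved
        return None
    return hostname
```

Call vcenter_name() with the IP address revealed by the client connection attempt; a return value of None means no name could be resolved for that address.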

8 New ESX 3.5 Patches Released; 4 Critical

September 1st, 2009

Eight new patches have been released for ESX 3.5.0. It appears ESXi, ESX4, and other versions of ESX are not impacted.

4 of the 8 patches are rated critical and should be evaluated quickly for application in your virtual infrastructure.

ID: ESX350-200908401-BG Impact: HostGeneral Release date: 2009-08-31 Products: esx 3.5.0 Updates forcedeth driver
The forcedeth driver installed on the ESX hosts causes the NVIDIA nForce Network Controller NICs to lose network connectivity until the forcedeth driver is reloaded. This patch addresses the issue.

The affected NICs are:

  • NVIDIA nForce Professional 2200 MCP 1Gbe NIC
  • NVIDIA nForce Professional 2050 I/O companion chip 1Gbe NIC
  • NVIDIA nForce Professional 3600 1Gbe NIC

ID: ESX350-200908402-BG Impact: Critical Release date: 2009-08-31 Products: esx 3.5.0 Updates VMware Tools
After performing VMotion between ESX 3.0.x and ESX 3.5 hosts, virtual machines running on ESX 3.5 hosts are restarted in order to upgrade to the latest version of VMware Tools. After applying this fix, VMware Tools function as expected.

ID: ESX350-200908403-BG Impact: HostGeneral Release date: 2009-08-31 Products: esx 3.5.0 Updates megaraid and mptscsi drivers
This patch fixes the following issues:

  • When the ESX host boots, the megaraid_sas driver heap gets depleted when claiming 4 LSI SAS RAID controllers on IBM System x3950 M2 Athena servers. This issue might cause the ESX host to stop booting. The fix increases the heap size for the megaraid_sas driver from 8 MB to 16 MB.
  • The mptscsi_2xx driver limits the discovery of targets to 63 SAS devices per LSI Serial Attached SCSI (SAS) host bus adapter (HBA). This fix increases the number of targets to the value returned by the HBA firmware.

ID: ESX350-200908404-BG Impact: HostGeneral Release date: 2009-08-31 Products: esx 3.5.0 Updates vmkctl
When N-Port ID Virtualization (NPIV) enabled virtual machines are powered on on ESX hosts, a rescan issued from the VI Client results in an error message stating that the rescan failed, even if the rescan is successful.

ID: ESX350-200908405-BG Impact: Critical Release date: 2009-08-31 Products: esx 3.5.0 Updates vmkernel
Running the esxtop command on the service console of the ESX hosts lists high values for the max limited (%MLMTD) parameter for virtual machines when no max limited parameter is set. When the high values are listed, the performance of the virtual machines might be affected. In the VI Client, the max limited parameter is set in the Resources tab for CPU in Virtual Machine properties.

ID: ESX350-200908406-BG Impact: Critical Release date: 2009-08-31 Products: esx 3.5.0 Updates vmx
This patch provides the following:

  • Adds support for new SCSI-3 status values in the SCSI emulation for virtual machines.
  • Fixes an issue where powering on customized versions of Ubuntu virtual machines from the ESX hosts might cause the ESX hosts to stop responding.

ID: ESX350-200908407-BG Impact: HostGeneral Release date: 2009-08-31 Products: esx 3.5.0 Updates kernel source and vmnix
This patch updates the service console kernel for the following fixes:

  • The forcedeth driver installed on the ESX hosts causes the NVIDIA nForce Network Controller NICs to lose network connectivity under certain circumstances. The affected NICs are:
    • NVIDIA nForce Professional 2200 MCP 1Gbe NIC
    • NVIDIA nForce Professional 2050 I/O companion chip 1Gbe NIC
    • NVIDIA nForce Professional 3600 1Gbe NIC
  • A bnx2x firmware dump issue.
  • The mptscsi_2xx driver limits the discovery of targets to 63 SAS devices per LSI Serial Attached SCSI (SAS) host bus adapter (HBA). This fix increases the number of targets to the value returned by the HBA firmware.

ID: ESX350-200908408-BG Impact: Critical Release date: 2009-08-31 Products: esx 3.5.0 Updates bnx2x driver
This patch fixes a bnx2x firmware dump issue.

4 New ESX Patches Released

July 30th, 2009

Four new patches have been released for ESX 3.5.0. It appears ESXi, ESX4, and other versions of ESX are not impacted.

3 of the 4 patches are rated critical.

ESX350-200907403-BG – VMware Tools Update (General)

Adds support for Windows XP Embedded with Service Pack 2 guest operating system.

Installing VMware Tools on Ubuntu 9.04 virtual machines displays a message stating that no drivers are available for Xorg 7.5. This patch provides the VMware SVGA and mouse drivers for Xorg 7.5.

ESX350-200907404-BG – critical

Applications in a virtual machine using SSSE3 instructions might fail under certain conditions. The vmware.log file might display an entry or entries similar to:
May 20 17:14:44.398: vcpu-0| vmcore/decoder/decoder.c:655 0xd1e357 #UD e41d380f sz=4 ct=0.

When SUSE Linux Enterprise 11 virtual machines installed with Virtual Machine Interface (VMI) kernel or virtual machines supporting VMI are booted into VMI mode, the virtual machines might stop responding or become extremely slow.

ESX350-200907405-BG – critical

On IBM systems having iBMC/IMM devices, during boot time, the CDCEther driver could not complete its device discovery due to a timing issue in the device firmware. This patch fixes the issue.

ESX350-200907407-BG – critical

The maximum username length of UserAccount in the VMware VI Toolkit is increased from 16 to 32 characters.

Fixes a hostd memory leak issue with HTTP connection recycling when communicating with UI, SDK, etc.

When a mounted NFS volume goes offline in an ESX Server cluster, it might cause the heap size to grow and might cause the ESX Server to stop responding.

VMFS locks on ESX Server hosts might be incorrectly broken, when a previous unlock operation from the same host fails.

Some virtual machines including Red Hat, Windows, and SUSE Linux Enterprise boot very slowly or might not boot at all when an EMC Symmetrix LUN in Not Ready state is attached to the virtual machines as an RDM device. After applying this fix, the virtual machines boot normally.

First vSphere Patches Released by VMware

July 10th, 2009

Approximately six weeks after the vSphere launch, the first batch of ESX/ESXi 4.0 patches has been downloaded by vSphere Update Manager. I was notified this morning at 3am via an email from vSphere Update Manager. Here is the patch list:

The number of patch definitions downloaded:

10 critical

16 total

ESX:

ID: ESX400-200906401-BG Impact: Critical Release date: 2009-07-09 Products: esx 4.0.0
Updates VMX

ID: ESX400-200906402-BG Impact: Critical Release date: 2009-07-09 Products: esx 4.0.0
Updates ESX Scripts

ID: ESX400-200906403-BG Impact: HostGeneral Release date: 2009-07-09 Products: esx 4.0.0
Updates VMware Tools

ID: ESX400-200906404-BG Impact: Critical Release date: 2009-07-09 Products: esx 4.0.0
Updates CIM

ID: ESX400-200906405-SG Impact: HostSecurity Release date: 2009-07-09 Products: esx 4.0.0
Updates krb5 and pam_krb5

ID: ESX400-200906406-SG Impact: HostSecurity Release date: 2009-07-09 Products: esx 4.0.0
Updates sudo

ID: ESX400-200906407-SG Impact: HostSecurity Release date: 2009-07-09 Products: esx 4.0.0
Updates curl

ID: ESX400-200906408-BG Impact: Critical Release date: 2009-07-09 Products: esx 4.0.0
Updates SCSI Driver for QLogic FC

ID: ESX400-200906409-BG Impact: Critical Release date: 2009-07-09 Products: esx 4.0.0
Updates LSI storelib Library

ID: ESX400-200906410-BG Impact: Critical Release date: 2009-07-09 Products: esx 4.0.0
Updates hostd

ID: ESX400-200906411-SG Impact: HostSecurity Release date: 2009-07-09 Products: esx 4.0.0
Updates udev

ID: ESX400-200906412-BG Impact: Critical Release date: 2009-07-09 Products: esx 4.0.0
Updates esxupdate

ID: ESX400-200906413-BG Impact: Critical Release date: 2009-07-09 Products: esx 4.0.0
Updates vmkernel iSCSI Driver

ESXi:

ID: ESXi400-200906401-BG Impact: Critical Release date: 2009-07-09 Products: embeddedEsx 4.0.0
Updates Firmware

ID: ESXi400-200906402-BG Impact: Critical Release date: 2009-07-09 Products: embeddedEsx 4.0.0
Updates Tools

ID: VEM400-200906002-BG Impact: HostGeneral Release date: 2009-07-09 Products: embeddedEsx 4.0.0
Cisco Nexus 1000V VEM
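As a sanity check on the “10 critical / 16 total” summary, the list above can be tallied by impact rating. This is only a sketch: the patch IDs and ratings are copied from the list, while the PATCH_LIST name, the two-column layout, and the tally_by_impact helper are assumptions of mine and not anything Update Manager produces.

```python
from collections import Counter

# Patch ID and impact rating, one pair per line, copied from the list above
PATCH_LIST = """
ESX400-200906401-BG Critical
ESX400-200906402-BG Critical
ESX400-200906403-BG HostGeneral
ESX400-200906404-BG Critical
ESX400-200906405-SG HostSecurity
ESX400-200906406-SG HostSecurity
ESX400-200906407-SG HostSecurity
ESX400-200906408-BG Critical
ESX400-200906409-BG Critical
ESX400-200906410-BG Critical
ESX400-200906411-SG HostSecurity
ESX400-200906412-BG Critical
ESX400-200906413-BG Critical
ESXi400-200906401-BG Critical
ESXi400-200906402-BG Critical
VEM400-200906002-BG HostGeneral
"""


def tally_by_impact(text):
    """Count patches per impact rating from 'ID IMPACT' lines."""
    counts = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        _patch_id, impact = line.split()
        counts[impact] += 1
    return counts


counts = tally_by_impact(PATCH_LIST)
print(sum(counts.values()), "total;", counts["Critical"], "critical")
```

The tally agrees with the email summary: 16 patch definitions, 10 of them rated critical, with the remainder split between HostSecurity and HostGeneral.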

Update Manager does not download host updates

July 1st, 2009

Scenario: You build a brand new vCenter and Update Manager server. After the installation is complete, you decide to get a jump on things by starting the download of all the ESX/ESXi host updates. You force Update Manager to download updates and the task completes surprisingly fast for the amount of ESX/ESXi content expected to be downloaded:

A problem is discovered in that Update Manager has downloaded metadata for guest OS updates (Windows, Linux, applications, etc.), but no ESX/ESXi update information is downloaded. The baselines are verified as OK, and internet connectivity and proxy configuration check out OK. What is the problem?

Cause: There are no ESX/ESXi hosts in vCenter Server. Per VMware KB 1008308, ESX/ESXi hosts must be present in vCenter Server before Update Manager will download the update metadata and the updates themselves.

This is one of those embarrassing forehead slapper type problems, however, Windows administrators who are used to working with and relying on the predictable behavior of WSUS are likely to encounter this at some point in time and are exempt from chastising. Swallow your pride and don’t tell anyone. :)

VMware is entitled to their opinion on how their software should function but to me this is a UI/usability issue that doesn’t make a lot of sense. What adds to the confusion is the inconsistent behavior in that in the absence of both hosts and guests in vCenter Server, guest OS update information appears in Update Manager but host update information does not. Yes, I’m aware that host updates come from VMware and guest updates come from Shavlik. No, that’s not an acceptable excuse.

While we’re on the subject, there are a handful of other reasons why Update Manager may malfunction. Take a look at VMware’s KB index and use your browser search to find all instances of “Update Manager”. There you’ll find all known solutions to Update Manager issues as well as some best practices and port requirements.

VMware Update Manager, Updates, and New Builds

June 7th, 2009

This was somewhat of a strange post to get off the ground. I had a definite purpose at the beginning and I knew what I was going to write about, however, through some lab scenarios I unexpectedly took the scenic route in getting to the end.

In my mind, the topic started out as “Effective/Efficient Use of Update Manager For New Builds”.

Then, while working in the lab, the title changed to “Gosh, Update Manager Is Slow”.

A while later it morphed into “Cripes, What In The Heck Is Update Manager Doing?!”

Finally I had a revelation and the topic came full circle back to an appropriate title of “VMware Update Manager, Updates, and New Builds” which is what I more or less had in mind to begin with but as I said I picked up some information which I hadn’t recognized at the beginning.

“Effective/Efficient Use of Update Manager For New Builds”

So as I said, the idea of the post started out with a predefined purpose – discussion about the use of Update Manager in host deployments. It really has more to do with host deployment methodology as a basis of discussion than it has to do with patch management. What I was going to highlight was that the deployment of an ESX host goes much quicker if you start out with the most current ESX .ISO allowed in your environment and then use VMware Update Manager to install the remaining patches to bring it to current.

As an example, let’s say our current ESX platform standard is ESX 3.5.0 Update 4 with all patches up to today’s date of 6/6/09.

  • The most efficient deployment method would be to perform the initial installation of ESX using the ESX 3.5.0 Update 4 .ISO and then afterwards, use VMware Update Manager to install the remaining 15 patches through today’s date. Using Ultimate Deployment Appliance version 1.4, I can deploy ESX 3.5.0 with Update 4 in five minutes. Installing the subsequent 15 patches using VMware Update Manager takes an additional 16 minutes, end to end including the reboot. That’s a total of less than 25 minutes to deploy a host with all patches.
  • Now let’s look at an alternative and much more time consuming method. Install ESX 3.5.0 using the original or even the Update 1 .ISO. Again, using UDA 1.4, this takes 5 minutes. Now we use Update Manager to remediate the ESX host to Update 4 plus the remaining 15 patches. If you used the original ESX .ISO, you’re looking at 149 updates. If you installed from the ESX 3.5.0 Update 1 .ISO, you’ve got 125 patches to install. This patching process takes nearly 90 minutes! Even on an IBM x3850M2 (one of the fastest hardware platforms available on the market today), the patch process is 75 minutes.

The numbers in the second bullet above speak to the deployment of one host. We always have more than one host in a high availability cluster and a typical environment might have 6, 12, or even 32 hosts in a cluster. Ideally we don’t want to be running hosts in a cluster on different patch levels for an extended duration. Suddenly we’re looking at a long day of work for a 6 node cluster (9.5 hours) and an entire weekend gone for a cluster of 12 hosts or more (18 hours +). The kicker is that this is still an automated deployment. Automation usually means efficiency right? Not in this case. Granted, there’s not a lot of manual labor involved here, but there is a lot of “hurry up and wait”.

Now before anyone jumps in and recommends rebuilding all of the hosts concurrently, let’s just count that out as an option because in this scenario, we’re rebuilding an active cluster that can only afford 1 host outage at a time (N+1). I’m actually being generous with the time durations because I’m not even accounting for host evacuations, which at the vCenter default of 2 at a time, can take a long time on densely populated clusters. It’s a real world scenario and if you don’t plan ahead for it, you may find out there is not enough time in a weekend to complete your upgrade.
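The cluster figures above are simple arithmetic, sketched here in Python so you can plug in your own numbers. It assumes hosts are rebuilt strictly one at a time, rounds the old-.ISO patching time to 90 minutes, and ignores host evacuation time, just as the estimates above do. The rebuild_hours function name is mine.

```python
def rebuild_hours(hosts, deploy_minutes, patch_minutes):
    """Hours needed to rebuild a cluster one host at a time (N+1)."""
    return hosts * (deploy_minutes + patch_minutes) / 60


for hosts in (6, 12):
    # 5 minute deployment plus roughly 90 minutes of patching from an older .ISO
    old_iso = rebuild_hours(hosts, 5, 90)
    # 5 minute deployment plus 16 minutes for the 15 post Update 4 patches
    new_iso = rebuild_hours(hosts, 5, 16)
    print(f"{hosts} hosts: {old_iso:.1f} h from an old .ISO, "
          f"{new_iso:.1f} h from the Update 4 .ISO")
```

With these inputs the 6 host cluster works out to 9.5 hours versus about 2.1 hours, and the 12 host cluster to 19 hours versus about 4.2 hours.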

Moral of this section: When deploying hosts, use the most recent .ISO possible which has all of the updates injected into it up to the release date of the .ISO.

“Gosh, Update Manager Is Slow”

I’ve heard some comments via word of mouth about how slow Update Manager is. Myself, I thought the comments were unfounded. I’ve never had major issues with Update Manager aside from a few nuisances I’ve learned to work around. Having managed ESX environments before the advent of Update Manager, I’m grateful for what Update Manager has brought to the table in lieu of manually populated and managed intranet update repositories. I never really noticed the Update Manager slowness because I was always deploying new host builds from the latest ESX .ISO as I described in the first bullet in the section above, and then applying the few incremental post deployment patches. Deploying the full boat of ESX patches using Update Manager has opened up my eyes as to how painfully slow it can be.

One interesting thing that I discovered in the lab was that not only is the patch deployment process longer, the preceding scan process is as well. The interesting component is that both the scan and the remediate steps seem to scale in a linear fashion; whether that is actually true or just a coincidence, who knows. What I mean is that:

  • An ESX 3.5.0 Update 4 host took 1 minute to scan and 16 minutes to remediate
  • An ESX 3.5.0 Update 1 host took 5 minutes to scan and 84 minutes to remediate

So we’re wasting extra time in both steps of the remediation process: the scan and the remediate.
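To put a number on “linear”, here is a quick Python sketch comparing the two lab measurements. The figures come straight from the bullets above. Two data points are hardly proof of anything, so treat the ratios as an observation rather than a rule.

```python
# (scan minutes, remediate minutes) measured in the lab, from the bullets above
update4 = (1, 16)
update1 = (5, 84)

scan_ratio = update1[0] / update4[0]
remediate_ratio = update1[1] / update4[1]

print(f"scan: {scan_ratio:.2f}x longer, remediate: {remediate_ratio:.2f}x longer")
```

Both steps grow by roughly the same factor (5x for the scan, 5.25x for the remediate) when starting from Update 1 instead of Update 4.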

Moral of this section: Update Manager or ESX patch installation or both is slow, but it doesn’t have to be. Same as the moral of the first section: Avoid this pitfall by using the most recent .ISO possible which has all of the updates injected into it up to the release date of the .ISO.

“Cripes, What In The Heck Is Update Manager Doing?!”

So then curiosity got the best of me and I took the lab experiment a little further. Of the 84 minutes spent remediating the ESX 3.5.0 Update 1 host above, how much of that time was spent installing Update 4, and how much of the time was spent installing the 15 subsequent post Update 4 patches? After all, I already know that remediating the 15 post Update 4 patches by themselves takes only 16 minutes. Will the numbers jibe?

To find out, I deployed an ESX 3.5.0 Update 1 host and created a remediation baseline containing ONLY ESX 3.5.0 Update 4. Big sucker – 723MB, but because it’s just one giant service pack, perhaps it will install quicker than the sum of all its updates. Here’s where I was really wrong.

I remediated the host and expected to see 1 task in vCenter describing an installation process, and then a reboot. Instead, I saw a boatload of patches being installed:

Which brings me to the title of this section “Cripes, What In The Heck Is Update Manager Doing?!” Did I apply the wrong baseline? Did Update Manager become self aware like Skynet and decide to engineer its own creative solutions to datacenter problems? Turns out Update 4 is not a patch or a service pack at all. In and of itself, it doesn’t even include binary RPM data. It’s metadata that points to all ESX 3.5.0 patches dated up to and including 3/30/09. Sure, you can download Update 4 as a 724MB offline installation package from the VMware download section, but mosey on over to their patch repository portal and you’ll see that the giant list of superseded and included updates in Update 4 is merely an 8.0K download. At first I thought that had to be a typo and I was about to drop John Troyer an email but opening up that 8K file just for kicks was the eye opener for me. Take a look at the 8K file and you’ll see the metadata that tells Update Manager to go download many of the incremental patches leading up to 3/30/09. Same concept with the 724MB offline installation package. It’s a .ZIP file. Open it up and you won’t find a large 724MB .RPM. Instead you’ll find a directory structure containing many of the incremental updates leading up to 3/30/09.
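If you would rather not click through the offline package by hand, a few lines of Python can list what sits at the top level of the .ZIP. This is a rough sketch using the standard zipfile module. The top_level_entries helper is my own name, and the file name in the usage note below is a placeholder rather than the real download name.

```python
import zipfile


def top_level_entries(archive):
    """Return the sorted top-level names (files or directories) in a .ZIP.

    archive may be a path or an open binary file object.
    """
    with zipfile.ZipFile(archive) as bundle:
        # Member names use "/" as the separator; keep only the first component
        return sorted({name.split("/", 1)[0] for name in bundle.namelist()})
```

Calling top_level_entries("offline-bundle.zip") against the downloaded package should show many entries, one per incremental update, rather than a single large .RPM.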

Moral of this section: Same as the moral of the first and second sections: Avoid wasting your valuable maintenance window time by avoiding as many incremental ESX patches as possible. Use the most recent .ISO possible which has all of the updates injected into it up to the release date of the .ISO when you deploy a host.

“VMware Update Manager, Updates, and New Builds”

Connect the dots and I think we’ve got a best practice in the making for host deployments using Update Manager. Existing and new host deployments aside, look at the implications of using Update Manager to deploy a major Update (in this discussion, Update 4). It’s actually 5 times faster to rebuild the host with the integrated Update 4 .ISO than it is to patch it with Update Manager. To me that’s bizarre but it is reality if you have automated host deployment methods. For medium to large environments, automated builds are absolutely required. There’s not enough time in the weekend to patch an 18 host cluster, let alone a 32 node cluster using Update Manager. Rebuild from an updated .ISO or span your host updates over several maintenance windows. The latter could get hairy and I definitely would not recommend it.

Great day today and I got a lot accomplished in the lab. Unfortunately towards the end, this happened:

Replacement unit is already on the way from NewEgg. Thank you vWire for funding the replacement!

VMware Update Manager plugin failures

December 8th, 2008

Roger Lund posted several links on his blog which I was personally interested in because I have dealt with the underlying issues in one way, shape, or form. One of them was a potential resolution to the issue where the VIC loses connectivity to VMware Update Manager and the VUM plugin unloads. The error message is “Your session with the VMware Update Manager Server is no longer valid. The VMware Update Manager Client plugin will be unloaded from the VI Client”.

This is an issue that I wouldn’t say I’m plagued with, however, it does pop up every few days and the easy fix is to simply re-enable the VUM plugin. It’s an inconvenience that I wanted to get to the bottom of some day when I had time, but thus far it hasn’t been a high priority. I had checked the VUM logs but was not able to determine anything conclusive.

At any rate, I was excited to see the link on Roger Lund’s blog pointing to VMware KB article 1007099 “Update Manager Client is randomly disabled”. The link discusses a potential solution of disabling anti-virus scanning of the VUM repository (where all the code and metadata is downloaded to). I performed this over the weekend by neutering Symantec Antivirus Corporate Edition and kept my fingers crossed.

Things were looking good until Sunday night when the VUM error popped up again. Oh well, back to the drawing board. If anyone has any other ideas, I’m all ears.