Posts Tagged ‘vCenter Server’

vCloud Director and vCenter Proxy Service Failure

December 16th, 2011

In the past couple of weeks I have spent some time with VMware vCloud Director 1.5.  The experience yielded three blog articles: Collecting diagnostic information for VMware vCloud Director, Expanding vCloud Director Transfer Server Storage, and now this one.

In this round, the vCD cell stopped working properly (single cell server environment).  I could log into the vCD provider and organization portals but the deployment of vApps would run for an abnormally long time and then fail after 20 minutes with one of the resulting failure messages being Failed to receive status for task.

Doing some digging in the environment I found a few problems.

Problem #1:  None of the cells have a vCenter proxy service running on the cell server.

Problem #2:  Performing a Reconnect on the vCenter Server object resulted in Error performing operation and Unable to find the cell running this listener.

I searched the Community Forums, talked with Chris Colotti (read his blog) for a bit, and then opened an SR with VMware support.  VMware sent me a procedure along with a script to run on the Microsoft SQL Server:

  1. BACKUP the entire SQL Database.
  2. Stop all cells. (service vmware-vcd stop)
  3. Run the attached reset_qrtz_tables_sql_database.sql
    -- shutdown all cells before executing
    delete from qrtz_scheduler_state
    delete from qrtz_fired_triggers
    delete from qrtz_paused_trigger_grps
    delete from qrtz_calendars
    delete from qrtz_trigger_listeners
    delete from qrtz_blob_triggers
    delete from qrtz_cron_triggers
    delete from qrtz_simple_triggers
    delete from qrtz_triggers
    delete from qrtz_job_listeners
    delete from qrtz_job_details
    go
  4. Start one cell and verify if issue is resolved. (service vmware-vcd start)
  5. Start the remaining cells.

Before running the script I knew I had to make a few modifications to select the vCloud database first.

When I ran the script, it failed due to case sensitivity with respect to the table names.  Upon installation, vCD creates all tables with upper case names.  When the MS SQL Server database was first created by yours truly, case sensitivity, along with accent sensitivity, was enabled with COLLATE Latin1_General_CS_AS, which comes straight from page 17 of the vCloud Director Installation and Configuration Guide.

After fixing the script, it looked like this:

USE [vcloud]
GO

-- shutdown all cells before executing
delete from QRTZ_SCHEDULER_STATE
delete from QRTZ_FIRED_TRIGGERS
delete from QRTZ_PAUSED_TRIGGER_GRPS
delete from QRTZ_CALENDARS
delete from QRTZ_TRIGGER_LISTENERS
delete from QRTZ_BLOB_TRIGGERS
delete from QRTZ_CRON_TRIGGERS
delete from QRTZ_SIMPLE_TRIGGERS
delete from QRTZ_TRIGGERS
delete from QRTZ_JOB_LISTENERS
delete from QRTZ_JOB_DETAILS
go
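The case fix above is mechanical, so for a longer script it could be automated rather than retyped. The following is a minimal sketch, assuming every table name in the script starts with qrtz_ and nothing else does; the helper name uppercase_qrtz_tables is hypothetical and not part of anything VMware supplied.

```python
import re

# Hypothetical helper: upper-cases every identifier that starts with "qrtz_".
# Assumes the Quartz tables are the only names in the script with that prefix.
QRTZ_NAME = re.compile(r"\bqrtz_\w+", re.IGNORECASE)


def uppercase_qrtz_tables(script: str) -> str:
    """Return the script with all qrtz_* identifiers converted to upper case."""
    return QRTZ_NAME.sub(lambda m: m.group(0).upper(), script)


# A shortened copy of the script as received, used only for illustration.
vendor_script = """delete from qrtz_scheduler_state
delete from qrtz_fired_triggers
go"""

print(uppercase_qrtz_tables(vendor_script))
```

Keywords such as delete, from, and go are left alone; only the table names change, which is all a case sensitive collation cares about.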

The script ran successfully, wiping out all rows in each of the named tables.  A little sidebar discussion here: I talked with @sqlchicken (Jorge Segarra, read his blog here) about the delete from statements in the script. It is sometimes a best practice to use the truncate table statement instead, because it deallocates whole data pages and logs only those deallocations, whereas the delete from statement is more resource intensive due to its row by row deletion method and each deleted row being recorded in the transaction log. Thank you for that insight Jorge! More on MS SQL Delete vs Truncate here. Jorge was also kind enough to provide a link on the subject matter but credentials will be required to view the content.

I was now able to restart the vCD cell and my problems were gone. Everything was working again and all errors had vanished.  I thanked the VMware support staff and then tried to learn a little more about what exactly the qrtz tables are and how deleting their rows resolved the problem.  I had looked at the table rows myself before they were deleted and the information in there didn’t make a lot of sense to me (but that doesn’t necessarily classify it as transient data).  This is what the support engineer had to say:

These [vCenter Proxy Service] issues are usually caused by a disconnect from the database, causing the tables to become stale. vCD constantly needs the ability to write to the database and when it cannot, the cell ends up in a state that is similar to the one that you have seen.

The qrtz tables contain information that controls the coordinator service, and lets it know when the coordinator is to be dropped and restarted, for cell to cell failover to another cell in a multi cell environment.

When the tables are purged it forces the cell on start up to recheck its status and start the coordinator service. In your situation the cell, due to corrupt records in the table, was not allowing this to happen.

So clearing them forced the cell to recheck and to restart the coordinator.

Good information to know going forward. I’m going to keep this in my back pocket. Or on my blog as it were.  Have a great weekend!

VMware vSphere 4.1 Update 2 Released

October 27th, 2011

As I sit here working on an SRM lab, VUM just sent an email to me reporting 28 new patches for ESX(i) 4.1 including the release of 4.1 Update 2.

What’s New

The VMware vCenter Server 4.1 Update 2 release offers the following improvements:

  • Support for new processors: vCenter Server 4.1 Update 2 supports hosts with processors on AMD Opteron 6200 series (Interlagos) and AMD Opteron 4200 series (Valencia).
    Note: For the AMD Opteron 6200 and 4200 series (Family 15h) processors, vCenter Server 4.1 Update 2 treats each core within a compute unit as an independent core, except while applying licenses. For the purpose of licensing, vCenter Server treats each compute unit as a core. For example, although a processor with 8 compute units can provide the processor equivalent of 16 cores on vCenter Server 4.1 Update 2, only 8 cores will be counted towards license usage calculation.
  • Additional vCenter Server Database Support: vCenter Server now supports the following databases.
    • Microsoft SQL Server 2008 Express (x32 and x64)
    • Microsoft SQL Server 2008 R2 Express (x32 and x64)
  • Resolved Issues: This release delivers a number of bug fixes that have been documented in the Resolved Issues section.
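The licensing note for the Opteron 6200 and 4200 series above comes down to simple arithmetic. Here is a minimal sketch, assuming two cores per compute unit as in the release-note example (8 compute units providing 16 cores); the function names are hypothetical.

```python
# Assumption: each Family 15h compute unit contains two cores, matching the
# release-note example of 8 compute units providing 16 cores.
CORES_PER_COMPUTE_UNIT = 2


def schedulable_cores(compute_units: int) -> int:
    """Cores treated as independent for everything except licensing."""
    return compute_units * CORES_PER_COMPUTE_UNIT


def licensed_cores(compute_units: int) -> int:
    """Cores counted towards license usage: one per compute unit."""
    return compute_units


print(schedulable_cores(8), licensed_cores(8))
```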

What’s New

The following information describes some of the enhancements available in this release of VMware ESXi:

  • Support for new processors – ESXi 4.1 Update 2 supports AMD Opteron 6200 series (Interlagos) and AMD Opteron 4200 series (Valencia).
    Note: For the AMD Opteron 6200 and 4200 series (Family 15h) processors, ESX/ESXi 4.1 Update 2 treats each core within a compute unit as an independent core, except while applying licenses. For the purpose of licensing, ESX/ESXi treats each compute unit as a core. For example, although a processor with 8 compute units can provide the processor equivalent of 16 cores on ESX/ESXi 4.1 Update 2, it only uses 8 licenses.
  • Support for additional guest operating system – ESX 4.1 Update 2 adds support for Ubuntu 11.10 guest operating system. For a complete list of guest operating systems supported with this release, see the VMware Compatibility Guide.

Resolved Issues – In addition, this release delivers a number of bug fixes that are documented in the Resolved Issues section.

ESXi 4.0 Update 2 hosts may PSOD after vCenter Server is upgraded to 5.0

October 11th, 2011

This important notice just came across my radar:

VMware has become aware of some ESXi 4.0 hosts experiencing a purple screen after vCenter Server is upgraded to 5.0

The Knowledgebase Team has prepared KB article: ESXi 4.0 Update 2 hosts may experience a purple screen after vCenter Server is upgraded to 5.0 (2007269) and an alert has been placed on the Support page to alert customers of this issue.

This Knowledge Base article will be updated if new information becomes available (you can subscribe to rss feeds on individual KB articles for this purpose). If you have been affected by this, please read the KB.

We apologize for any inconvenience this may have caused you. If you know how to spread the word to your friends and colleagues, please do so.

Symptoms

You may encounter an issue where:

  • You have recently upgraded your vCenter Server to version 5.0
  • You have ESXi 4.0 Update 2 hosts in the inventory of this vCenter Server
  • After the vpxa agents are upgraded, the ESXi 4.0 Update 2 hosts experience a purple screen that includes this error: NOT_IMPLEMENTED bora/vmkernel/filesystems/visorfs/visorfsObj.c:3391

Cause

This is caused by an issue in the handling of the vpxa agent upgrade.

Resolution

This issue has been resolved in 4.0 Update 3. To avoid this issue, upgrade all ESXi 4.0 Update 2 hosts to at least version ESXi 4.0 Update 3 before upgrading vCenter Server to 5.0.
ESXi 4.0 Update 3 and later versions can be downloaded from the VMware Download Center.

Enabling vCenter Server 5.0 Database Monitoring

September 27th, 2011

I stumbled across this while rummaging through the vSphere 5.0 Installation and Setup document.  Page 183 contains a small section (new in vSphere 5.0) which describes a process to enable database monitoring for Microsoft SQL Server (surrounding pages discuss enabling the same for other supported database platforms).  The SQL script provided in the documentation contains an error on the first line but I was able to adjust that and run it on the SQL 2008 R2 server in the lab.  Following is the script I ran:

use master
go
grant VIEW SERVER STATE to vcenter
go

Once access has been granted, vCenter will collect certain SQL Server health statistics and store them in the rotating vCenter profile log located by default at C:\ProgramData\VMware\VMware VirtualCenter\Logs\vpxd-profiler-xx.log.  These metrics were taken from my vCenter Server log file and serve as an example of what is being collected from the SQL Server by the vCenter Server:

--> <dbMonitoring>
--> DbMonitoring/Counter/Storage: Manually extensible data files/Unit/count/Range Type/range/RangeMin/0/RangeMax/0/Timestamp/2011-09-27T18:00:01.79Z/Value/0
--> DbMonitoring/Counter/Memory:Database pages/Unit/timesIncrease/Range Type/range/RangeMin/0/RangeMax/3/Timestamp/1970-01-01T00:00:00Z/Value/N/A
--> DbMonitoring/Counter/Storage: Peak data file storage utilization/Unit/percent/Range Type/range/RangeMin/60559224/RangeMax/90/Timestamp/2011-09-27T18:00:01.802999Z/Value/0
--> DbMonitoring/Counter/Memory:Availaable/Unit/kiloBytes/Range Type/range/RangeMin/5120/RangeMax/60559416/Timestamp/1970-01-01T00:00:00Z/Value/N/A
--> DbMonitoring/Counter/Memory:Page Life Expectancy/Unit/seconds/Range Type/range/RangeMin/300/RangeMax/60559416/Timestamp/1970-01-01T00:00:00Z/Value/N/A
--> DbMonitoring/Counter/IO:Log growths/Unit/timesIncrease/Range Type/range/RangeMin/0/RangeMax/3/Timestamp/1970-01-01T00:00:00Z/Value/N/A
--> DbMonitoring/Counter/CPU:Usage/Unit/percent/Range Type/range/RangeMin/0/RangeMax/80/Timestamp/2011-09-27T18:00:01.75Z/Value/44
--> DbMonitoring/Counter/Memory:Buffer cache hit ratio/Unit/percent/Range Type/range/RangeMin/90/RangeMax/100/Timestamp/1970-01-01T00:00:00Z/Value/N/A
--> DbMonitoring/Counter/General:User Connections/Unit/count/Range Type/range/RangeMin/255/RangeMax/60559416/Timestamp/1970-01-01T00:00:00Z/Value/N/A
--> </dbMonitoring>
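Since the documentation doesn't say how to harvest these counters, one option is to parse the lines directly. The sketch below is a rough starting point only: the field layout is inferred from the sample lines above rather than from any documented format, and the function name parse_counter is hypothetical.

```python
import re

# Field layout inferred from the sample log lines, not from a documented format.
COUNTER = re.compile(
    r"DbMonitoring/Counter/(?P<name>.+?)"
    r"/Unit/(?P<unit>[^/]+)"
    r"/Range Type/(?P<range_type>[^/]+)"
    r"/RangeMin/(?P<range_min>[^/]+)"
    r"/RangeMax/(?P<range_max>[^/]+)"
    r"/Timestamp/(?P<timestamp>[^/]+)"
    r"/Value/(?P<value>.+)$"
)


def parse_counter(line: str):
    """Return the counter fields as a dict, or None if the line is not a counter."""
    match = COUNTER.search(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # "N/A" means no sample was recorded; keep it as None instead of a number.
    fields["value"] = None if fields["value"] == "N/A" else float(fields["value"])
    fields["range_min"] = float(fields["range_min"])
    fields["range_max"] = float(fields["range_max"])
    return fields


sample = (
    "--> DbMonitoring/Counter/CPU:Usage/Unit/percent/Range Type/range"
    "/RangeMin/0/RangeMax/80/Timestamp/2011-09-27T18:00:01.75Z/Value/44"
)
cpu = parse_counter(sample)
print(cpu["name"], cpu["value"], cpu["unit"])
```

The value is matched last and greedily because N/A itself contains a slash, which would break a naive split on the / character. From here, comparing value against range_min and range_max would be one way to build the threshold alerts discussed in this post.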

Per VMware’s documentation:

vCenter Server Database Monitoring captures metrics that enable the administrator to assess the status and health of the database server. Enabling Database Monitoring helps the administrator prevent vCenter downtime because of a lack of resources for the database server. Database Monitoring for vCenter Server enables administrators to monitor the database server CPU, memory, I/O, data storage, and other environment factors for stress conditions. Statistics are stored in the vCenter Server Profile Logs. You can enable Database Monitoring for a user before or after you install vCenter Server. You can also perform this procedure while vCenter Server is running.

One thing that I noticed is that these metrics were being collected in the vCenter log files prior to running the enabling script.  I’m not sure if this is because vCenter already had the required permissions to the master database (I use SQL authentication and I didn’t explicitly grant this), or perhaps this is enabled by default in the vCenter installation routine when the database prepare script runs.

The instructions provide plenty of context but are fairly brief and don’t identify next steps or how to harvest the collected metrics.  Perhaps the vCenter Service Health agent monitors the profile log and will alarm through vCenter.  If not, then I view this as a monitoring framework VMware provides which can be tailored for specific environments.  Thresholds could be defined which trigger alerts proactively before danger or an outage occurs.  Admittedly I’m not a DBA.  With what’s provided, I’m not sure if this provides much value above and beyond native monitoring and alerting provided by SQL Server and Perfmon.

vCenter Server 5.0 and MS SQL Database Permissions

August 20th, 2011

It’s that time again (to bring up the age old topic of Microsoft SQL database permission requirements in order to install VMware vCenter Server).  This brief article focuses on vCenter 5.0.  Permissions on the SQL side haven’t changed at all from what was required in vSphere 4.  However, the error displayed for lacking required permissions to the MSDB System database has.  In fact, in my opinion it’s a tad misleading.

To review, the vCenter database account being used to make the ODBC connection requires the db_owner role on the MSDB System database during the installation of vCenter Server.  This facilitates the installation of SQL Agent jobs for vCenter statistic rollups.

In the example below, I’m using SQL authentication with an account named vcenter.  I purposely left out its required role on MSDB and you can see below the resulting error:

The DB user entered does not have the required permissions needed to install and configure vCenter Server with the selected DB.  Please correct the following error(s):  The database user ‘vcenter’ does not have the following privileges on the ‘vc50’ database:

EXECUTE sp_add_category

EXECUTE sp_add_job

EXECUTE sp_add_jobschedule

EXECUTE sp_add_jobserver

EXECUTE sp_add_jobstep

EXECUTE sp_delete_job

EXECUTE sp_update_job

SELECT syscategories

SELECT sysjobs

SELECT sysjobsteps

Now what I think is misleading about the error thrown is that it’s pointing the finger at missing permissions on the vc50 database.  This is incorrect.  My vcenter SQL account has db_owner permissions on the vc50 vCenter database.  The problem is actually lacking the temporary db_owner permissions on the MSDB System database at vCenter installation time as described earlier.

The steps to rectify this situation are the same as before.  Grant the vcenter account the db_owner role for the MSDB System database, install vCenter, then revoke that role when vCenter installation is complete. While we’re on the subject, the installation of vCenter Update Manager 5.0 with a Microsoft SQL back end database also requires the ODBC connection account to temporarily have db_owner permissions on the MSDB System database.  I do believe this is a new requirement in vSphere 5.0.  If you’re going to install VUM, you might as well do that first before going through the process of revoking the db_owner role.

An example of where that role is added in SQL Server 2008 R2 Management Studio is shown below:

Snagit Capture

Configure a vCenter 5.0 integrated Syslog server

July 23rd, 2011

Now that VMware offers an ESXi only platform in vSphere 5.0, there are logging decisions to be considered which were a non-issue on the ESX platform.  This is particularly true with boot from SAN, boot from flash, or stateless hosts, where the lack of local storage means there is no scratch partition on which to store logs locally.  Some shops use Splunk as a Syslog server.  Other bloggers such as Simon Long have identified in the past how to send logs to the vMA appliance.  Centralized management of anything is almost always a good thing and the same holds true for logging.

New in the vCenter 5.0 bundle is a Syslog server which can be integrated with vCenter 5.0.  I’m going to go through the installation, configuration, and then I’ll have a look at the logs.

Installation couldn’t be much easier.  I’ll highlight the main steps.  First launch the VMware Syslog Collector installation:

Snagit Capture

The setup routine will open Windows Firewall ports as necessary.  Choose the appropriate drive letter and path installation locations.  Note the second drive letter and path specifies the location of the aggregated syslog files from the hosts.  Be sure there is enough space on the drive for the log files, particularly in medium to large environments:

Snagit Capture

Choose the VMware vCenter Server installation (this is not the default type of installation):

Snagit Capture

Provide the location of the vCenter Server as well as credentials to establish the connection.  In this case I’m installing the Syslog server on the vCenter Server itself:

7-23-2011 4-14-41 PM

 

The Syslog server has the ability to accept connections on three different ports:

  1. UDP 514
  2. TCP 514
  3. Encrypted SSL 1514

There’s an opportunity to change the default listening ports but I’ll leave them as is, especially UDP 514 which is an industry standard port for Syslog communications:

Snagit Capture

Once the installation is finished, it’s ready to accept incoming Syslog connections from hosts.  You’ll notice a few new items in the vSphere Client.  First is the VMware Syslog Collector Configuration plug-in:

Snagit Capture

Next is the Network Syslog Collector applet:

Snagit Capture

It’s waiting for incoming Syslog connections:

Snagit Capture

Now I’ll configure a host to send its logs to the vCenter integrated Syslog server.  This is fairly straightforward as well and there are a few ways to do it.  I’ll identify two.

In the vCenter inventory, select the ESXi 5.0 host, navigate to the Configuration tab, then Advanced Settings under Software.  Enter the Syslog server address in the field for Syslog.global.logHost.  The format is <protocol>://<f.q.d.n>:<port>.  So for my example:  udp://vcenter50.boche.mcse:514.  This field allows multiple Syslog protocols and endpoints separated by commas.  I could split the logs across additional Syslog servers with this entry:  udp://vcenter50.boche.mcse:514, splunk.boche.mcse, ssl://securesyslogs.boche.mcse:1514.  In that example, logs are shipped to vcenter50.boche.mcse and splunk.boche.mcse over UDP 514, as well as to securesyslogs.boche.mcse over SSL 1514.  Another thing to point out on multiple entries: there is a space after each comma which appears to be required for the host to interpret multiple entries properly:

Snagit Capture
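To make the Syslog.global.logHost format described above concrete, here is a minimal sketch that breaks such a value into its targets. It is illustrative only and is not how the host itself parses the field. The function name parse_loghost is hypothetical, and falling back to UDP 514 for a bare host name (and 1514 for ssl) is an assumption based on the behavior and default ports described in this post.

```python
# Default port per protocol. These mirror the collector's listening ports
# listed earlier; applying them to entries without a port is an assumption.
DEFAULT_PORTS = {"udp": 514, "tcp": 514, "ssl": 1514}


def parse_loghost(value: str):
    """Split a Syslog.global.logHost value into (protocol, host, port) tuples."""
    targets = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # An entry without "protocol://" is treated as UDP.
        protocol, sep, rest = entry.partition("://")
        if not sep:
            protocol, rest = "udp", entry
        # Note: a plain split on ":" does not handle IPv6 literals.
        host, sep, port = rest.partition(":")
        targets.append((protocol, host, int(port) if sep else DEFAULT_PORTS[protocol]))
    return targets


example = "udp://vcenter50.boche.mcse:514, splunk.boche.mcse, ssl://securesyslogs.boche.mcse:1514"
for target in parse_loghost(example):
    print(target)
```

Run against the three-endpoint entry from this post, it yields two UDP 514 targets and one SSL 1514 target, which matches where the logs ended up.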

There are many other Syslog logger options which can be tuned.  Have a look at them and configure your preferred logging appropriately.

Another method to configure and enable syslog on an ESXi 5 host would be to use esxcli.  The commands for each host look something like this:

~ # esxcli system syslog config set --loghost=192.168.110.16
~ # esxcli system syslog reload

Now I’ll ensure outbound UDP 514 is opened on the ESXi 5.0 firewall.  If the Syslog ports are closed, logs won’t make it to the Syslog server:

Snagit Capture

Back to the vCenter (Syslog) Server, you’ll see a folder for each host sending logs to the Syslog server:

Snagit Capture

And here come the logs:

Snagit Capture

The same logs are going to the Splunk server too:

7-23-2011 4-00-48 PM

This is what the logs look like in Splunk.  It’s a very powerful tool for centrally storing logs and then querying those logs using a powerful engine:

7-23-2011 4-07-53 PM

And since this host actually has local disk, and as a result a scratch partition, the logs natively go to the scratch partition:

7-23-2011 4-04-33 PM

Notice the host I configured is also displayed in the Network Syslog Collector along with the general path to the logs as well as the size of each host’s respective log file (I’ve noticed that it sometimes requires exiting the vSphere Client and logging back in before the hosts show up below):

Snagit Capture

Earlier I mentioned that there are a few ways to configure Syslog on the ESXi host.  A much easier method comes by way of leveraging host profiles.  Simply create a host profile and add the Syslog configuration to the profile.  Of course this profile can be used to deploy the configuration to countless other hosts which makes it a very easy and powerful method to deploy a centralized logging configuration:

Snagit Capture

For more information, see VMware KB 2003322 Configuring syslog on ESXi 5.0.

Virtualization Wars: Episode V – VMware Strikes Back

July 12th, 2011

At 9am PDT this morning, Paul Maritz and Steve Herrod take the stage to announce the next generation of the VMware virtualized datacenter.  Each new product and set of features is impressive in its own right.  Combine them and what you have is a major upgrade of VMware’s entire cloud infrastructure stack.  I’ll highlight the major announcements and some of the detail behind them.  In addition, the embargo and NDA surrounding the vSphere 5 private beta expires.  If you’re a frequent reader of blogs or the Twitter stream, you’re going to be bombarded with information at fire-hose-to-the-face pace, starting now.

vSphere 5.0 (ESXi 5.0 and vCenter 5.0)

At the heart of it all is a major new release of VMware’s type 1 hypervisor and management platform.  Increased scalability and new features make virtualizing those last remaining tier 1 applications quantifiable.

ESX and the Service Console are formally retired as of this release.  Going forward, we have just a single hypervisor to maintain and that is ESXi.  Non-Windows shops should find some happiness in a Linux based vCenter appliance and sophisticated web client front end.  While these components are not 100% fully featured yet in their debut, they come close.

Storage DRS is the long awaited complement to CPU and memory based DRS introduced in VMware Virtual Infrastructure 3.  SDRS will coordinate initial placement of VM storage in addition to keeping datastore clusters balanced (space usage and latency thresholds including SIOC integration) with or without the use of SDRS affinity rules.  Similar to DRS clusters, SDRS enabled datastore clusters offer maintenance mode functionality which evacuates (Storage vMotion or cold migration) registered VMs and VMDKs (still no template migration support, c’mon VMware) off of a datastore which has been placed into maintenance mode.  VMware engineers recognize the value of flexibility, particularly when it comes to SDRS operations where thresholds can be altered and tuned on a scheduled basis. For instance, IO patterns during the day when normal or peak production occurs may differ from night time IO patterns when guest based backups and virus scans occur.  When it comes to SDRS, separate thresholds would be preferred so that SDRS doesn’t trigger based on inappropriate thresholds.

Profile-Driven Storage couples storage capabilities (VASA automated or manually user-defined) to VM storage profile requirements in an effort to meet guest and application SLAs.  The result is the classification of a datastore, from a guest VM viewpoint, of Compatible or Incompatible at the time of evaluating VM placement on storage.  Subsequently, the location of a VM can be automatically monitored to ensure profile compliance.

I mentioned VASA previously which is a new acronym for vSphere Storage APIs for Storage Awareness.  This new API allows storage vendors to expose topology, capabilities, and state of the physical device to vCenter Server management.  As mentioned earlier, this information can be used to automatically populate the capabilities attribute in Profile-Driven Storage.  It can also be leveraged by SDRS for optimized operations.

The optimal solution is to stack the functionality of SDRS and Profile-Driven Storage to reduce administrative burden while meeting application SLAs through automated efficiency and optimization.

If you look closely at all of the announcements being made, you’ll notice there is only one net-new release and that is the vSphere Storage Appliance (VSA).  Small to medium business (SMB) customers are the target market for the VSA.  These are customers who seek some of the enterprise features that vSphere offers like HA, vMotion, or DRS but lack the fibre channel SAN, iSCSI, or NFS shared storage requirement.  A VSA is deployed to each ESXi host which presents local RAID 1+0 host storage as NFS (no iSCSI or VAAI/SAAI support at GA release time).  Each VSA is managed by one and only one vCenter Server. In addition, each VSA must reside on the same VLAN as the vCenter Server.  VSAs are managed by the VSA Manager which is a vCenter plugin available after the first VSA is installed.  Its function is to assist in deploying VSAs, automatically mounting NFS exports to each host in the cluster, and to provide monitoring and troubleshooting of the VSA cluster.

You’re probably familiar with the concept of a VSA but at this point you should start to notice the differences in VMware’s VSA: integration.  In addition, it’s a VMware supported configuration with “one throat to choke” as they say.  Another feature is resiliency.  The VSAs on each cluster node replicate with each other and if required will provide seamless fault tolerance in the event of a host node failure.  In such a case, a remaining node in the cluster will take over the role of presenting a replica of the datastore which went down.  Again, this process is seamless and is accomplished without any change in the IP configuration of VMkernel ports or NFS exports.  With this integration in place, it was a no-brainer for VMware to also implement maintenance mode for VSAs.  MM comes in two flavors: Whole VSA cluster MM or Single VSA node MM.

VMware’s VSA isn’t a freebie.  It will be licensed.  The figure below sums up the VSA value proposition:

7-10-2011 8-20-38 PM

High Availability (HA) has been enhanced dramatically.  Some may say the version shipping in vSphere 5 is a complete rewrite.  What was once foundational Legato AAM (Automated Availability Manager) is now finally evolving to scale further with vSphere 5.  New features include the elimination of common issues such as DNS resolution, node communication over both the management network and storage, and enhanced failure detection.  Also new are IPv6 support, consolidated logging into one file per host, an enhanced UI, and an enhanced deployment mechanism (as if deployment wasn’t already easy enough, albeit sometimes error prone).

From an architecture standpoint, HA has changed dramatically.  HA has effectively gone from five (5) fail over coordinator hosts to just one (1) in a Master/Slave model.  No more is there a concept of Primary/Secondary HA hosts, however if you still want to think of it that way, it’s now one (1) primary host (the master) and all remaining hosts would be secondary (the slaves).  That said, I would consider it a personal favor if everyone would use the correct version specific terminology – less confusion when assumptions have to be made (not that I like assumptions either, but I digress).

The FDM (fault domain manager) Master does what you traditionally might expect: monitors and reacts to slave host & VM availability.  It also updates its inventory of the hosts in the cluster, and the protected VMs each time a VM power operation occurs.

Slave hosts have responsibilities as well.  They maintain a list of powered on VMs.  They monitor local VMs and forward significant state changes to the Master. They provide VM health monitoring and any other HA features which do not require central coordination.  They monitor the health of the Master and participate in the election process should the Master fail (the host with the most datastores and then the lexically highest moid [99>100] wins the election).
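The election rule in the parentheses above is easy to misread, so here is a minimal sketch of it. It is not FDM's actual implementation. The Host class, the elect_master function, and the sample moid values are hypothetical; the only thing taken from the text is the ordering: most datastores first, then the lexically highest moid, compared as text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    moid: str        # managed object ID, compared as text rather than as a number
    datastores: int  # number of datastores the host is connected to


def elect_master(hosts):
    """Pick the host with the most datastores; break ties on the lexically highest moid."""
    return max(hosts, key=lambda h: (h.datastores, h.moid))


# Hypothetical cluster: two hosts tie on datastore count.
candidates = [Host("host-100", 4), Host("host-99", 4), Host("host-7", 3)]
print(elect_master(candidates).moid)  # prints host-99
```

Because the comparison is textual, host-99 beats host-100: the strings first differ at "9" versus "1". That is the [99>100] quirk noted above.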

Another new feature in HA is the ability to leverage storage to facilitate the sharing of stateful heartbeat information (known as Heartbeat Datastores) if and when management network connectivity is lost.  By default, vCenter picks two datastores for backup HA communication.  The choices are made by how many hosts have connectivity and if the storage is on different arrays.  Of course, a vSphere administrator may manually choose the datastores to be used.  Hosts manipulate HA information on the datastore based on the datastore type. On VMFS datastores, the Master reads the VMFS heartbeat region. On NFS datastores, the Master monitors a heartbeat file that is periodically touched by the Slaves. VM availability is reported by a file created by each Slave which lists the powered on VMs. Multiple Master coordination is performed by using file locks on the datastore.

As discussed earlier, there are a number of GUI enhancements which were put in place to monitor and configure HA in vSphere 5.  I’m not going to go into each of those here as there are a number of them.  Surely there will be HA deep dives in the coming months.  Suffice it to say, they are all enhancements which stack to provide ease of HA management, troubleshooting, and resiliency.

Another significant advance in vSphere 5 is Auto Deploy which integrates with Image Builder, vCenter, and Host Profiles.  The idea here is centrally managed stateless hardware infrastructure.  ESXi host hardware PXE boots an image profile from the Auto Deploy server.  Unique host configuration is provided by an answer file or VMware Host Profiles (previously an Enterprise Plus feature).  Once booted, the host is added to vCenter host inventory.  Statelessness is not necessarily a newly introduced concept, therefore, the benefits are strikingly familiar to say ESXi boot from SAN: No local boot disk (right sized storage, increased storage performance across many spindles), scales to support of many hosts, decoupling of host image from host hardware – statelessness defined.  It may take some time before I warm up to this feature. Honestly, it’s another vCenter dependency, this one quite critical with the platform services it provides.

For a more thorough list of anticipated vSphere 5 “what’s new” features, take a look at this release from virtualization.info.

 

vCloud Director 1.5

Up next is a new release of vCloud Director version 1.5 which marks the first vCD update since the product became generally available on August 30th, 2010.  This release is packed with several new features.

Fast Provisioning is the space saving linked clone support missing in the GA release.  Linked clones can span multiple datastores and multiple vCenter Servers. This feature will go a long way in bridging the parity gap between vCD and VMware’s sunsetting Lab Manager product.

3rd party distributed switch support means vCD can leverage virtualized edge switches such as the Cisco Nexus 1000V.

The new vCloud Messages feature connects vCD with existing AMQP based IT management tools such as CMDB, IPAM, and ticketing systems to provide updates on vCD workflow tasks.

vCD originally supported Oracle 10g std/ent Release 2 and 11g std/ent.  vCD now supports Microsoft SQL Server 2005 std/ent SP4 and SQL Server 2008 exp/std/ent 64-bit.  Oracle 11g R2 is now also supported.  Flexibility. Choice.

vCD 1.5 adds support for vSphere 5 including Auto Deploy and virtual hardware version 8 (32 vCPU and 1TB vRAM).  In this regard, VMware extends new vSphere 5 scalability limits to vCD workloads.  Boiled down: Any tier 1 app in the private/public cloud.

Last but not least, vCD integration with vShield IPSec VPN and 5-tuple firewall capability.

vShield 5.0

VMware’s message about vShield is that it has become a fundamental component in consolidated private cloud and multi-tenant VMware virtualized datacenters.  While traditional security infrastructure can take significant time and resources to implement, there’s an inherent efficiency in leveraging security features baked into and native to the underlying hypervisor.

There are no changes in vShield Endpoint, however, VMware has introduced static routing in vShield Edge (instead of NAT) for external connections and certificate-based VPN connectivity.

 

Site Recovery Manager 5.0

Another major announcement from VMware is the introduction of SRM 5.0.  SRM has already been quite successful in providing simple and reliable DR protection for the VMware virtualized datacenter.  Version 5 boasts several new features which enhance functionality.

Replication between sites can be achieved in a more granular per-VM (or even sub-VM) fashion, between different storage types, and it’s handled natively by vSphere Replication (vSR).  More choice in seeding of the initial full replica. The result is a simplified RPO.

Another new feature in SRM is Planned Migration which facilitates the migration of protected VMs from site to site before a disaster actually occurs.  This could also be used in advance of datacenter maintenance.  Perhaps your policy is to run your business 50% of the time from the DR site.  The workflow assistance makes such migrations easier.  It’s a downtime avoidance mechanism which makes it useful in several cases.

Failback can be achieved once the VMs are reprotected at the recovery site and the replication flow is reversed.  It’s simply another push of the big button to go the opposite direction.

Feedback from customers has influenced UI enhancements. Unification of sites into one GUI is achieved without Linked Mode or multiple vSphere Client instances. Shadow VMs take on a new look at the recovery site. Improved reporting for audits.

Other miscellaneous notables are IPv6 support, performance increase in guest VM IP customization, ability to execute scripts inside the guest VM (In guest callouts), new SOAP based APIs on the protected and recovery sides, and a dependency hierarchy for protected multi tiered applications.

 

In summary, this is a magnificent day for all of VMware as they have indeed raised the bar with their market leading innovation.  Well done!

 

VMware product diagrams courtesy of VMware

Star Wars diagrams courtesy of Wookieepedia, the Star Wars Wiki