Posts Tagged ‘DRS’

VMware Update Manager Becomes Self-Aware

March 4th, 2010

@Mikemohr on Twitter tonight said it best:

“Haven’t we learned from Hollywood what happens when the machines become self-aware?”

I got a good chuckle.  He took my comment of VMware becoming “self-aware” exactly where I wanted it to go.  A reference to The Terminator series of films in which a sophisticated computer defense system called Skynet becomes self-aware and things go downhill for mankind from there.

Metaphorically speaking in today’s case, Skynet is VMware vSphere and mankind is represented by VMware vSphere Administrators.

During an attempt to patch my ESX(i) 4 hosts, I received an error message.

At that point, the remediation task fails and the host is not patched.  The VUM log file reflects the same error in a little more detail:

[2010-03-04 14:58:04:690 'JobDispatcher' 3020 INFO] [JobDispatcher, 1616] Scheduling task VciHostRemediateTask{675}
[2010-03-04 14:58:04:690 'JobDispatcher' 3020 INFO] [JobDispatcher, 354] Starting task VciHostRemediateTask{675}
[2010-03-04 14:58:04:690 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciTaskBase, 534] Task started…
[2010-03-04 14:58:04:908 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciHostRemediateTask, 680] Host host-112 scheduled for patching.
[2010-03-04 14:58:05:127 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciHostRemediateTask, 691] Add remediate host: vim.HostSystem:host-112
[2010-03-04 14:58:13:987 'InventoryMonitor' 2180 INFO] [InventoryMonitor, 427] ProcessUpdate, Enter, Update version := 15936
[2010-03-04 14:58:13:987 'InventoryMonitor' 2180 INFO] [InventoryMonitor, 460] ProcessUpdate: object = vm-2642; type: vim.VirtualMachine; kind: 0
[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 WARN] [vciHostRemediateTask, 717] Skipping host solo.boche.mcse as it contains VM that is running VUM or VC inside it.
[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] [vciHostRemediateTask, 786] Skipping host 0BC5A140, none of upgrade and patching is supported.
[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 ERROR] [vciHostRemediateTask, 230] No supported Hosts found for Remediate.
[2010-03-04 14:58:17:737 'VciRemediateTask.RemediateTask{674}' 2676 INFO] [vciTaskBase, 583] A subTask finished: VciHostRemediateTask{675}
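If you want to fish the interesting lines out of a longer VUM log, a small parser helps. What follows is only a sketch: the line layout (timestamp, thread name, thread ID, level, then component and source line) is inferred from the excerpt above and is not a documented format, and the function names are my own.

```python
import re

# Layout inferred from the excerpt above (an assumption, not a documented format):
# [timestamp 'thread name' thread-id LEVEL] [component, source-line] message
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3}) "
    r"'(?P<thread>[^']*)' (?P<tid>\d+) (?P<level>[A-Z]+)\] "
    r"\[(?P<component>[^,\]]+), (?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def problems(lines):
    """Collect (level, message) pairs for every WARN or ERROR line."""
    found = []
    for line in lines:
        record = parse_line(line)
        if record and record["level"] in ("WARN", "ERROR"):
            found.append((record["level"], record["msg"]))
    return found


# Three lines copied from the excerpt above.
SAMPLE = [
    "[2010-03-04 14:58:04:908 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 INFO] "
    "[vciHostRemediateTask, 680] Host host-112 scheduled for patching.",
    "[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 WARN] "
    "[vciHostRemediateTask, 717] Skipping host solo.boche.mcse as it contains VM that is "
    "running VUM or VC inside it.",
    "[2010-03-04 14:58:17:533 'VciHostRemediateTask.VciHostRemediateTask{675}' 2676 ERROR] "
    "[vciHostRemediateTask, 230] No supported Hosts found for Remediate.",
]

if __name__ == "__main__":
    for level, msg in problems(SAMPLE):
        print(level, msg)
```

Run against the sample, this skips the INFO line and reports the WARN about the skipped host followed by the ERROR that fails the task.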

Further testing in the lab revealed that this condition is triggered when the host is running a vCenter VM and/or a VMware Update Manager (VUM) VM. I understand from other colleagues on the Twitterverse that they’ve seen the same symptoms occur with patch staging.

The workaround is to manually place the host in maintenance mode, at which time it has no problem whatsoever evacuating all VMs, including infrastructure VMs.  At that point, the host in maintenance mode can be remediated.

VMware Update Manager has apparently become self-aware in that it detects when its infrastructure VMs are running on the same host hardware which is to be remediated.  Self-awareness in and of itself isn’t bad, however, its feature integration is.  Unfortunately for the humans, this is a step backwards in functionality and a reduction in efficiency for a task which was once automated.  Previously, a remediation task had no problem evacuating all VMs from a host, infrastructure or not. What we have now is… well… consider the following pre and post “self-awareness” remediation steps:

Pre “self-awareness” remediation for a 6 host cluster containing infrastructure VMs:

  1. Right click the cluster object and choose Remediate
  2. Hosts are automatically and sequentially placed in maintenance mode, evacuated, patched, rebooted, and brought out of maintenance mode

Post “self-awareness” remediation for a 6 host cluster containing infrastructure VMs:

  1. Right click Host1 object and choose Enter Maintenance Mode
  2. Wait for evacuation to complete
  3. Right click Host1 object and choose Remediate
  4. Wait for remediation to complete
  5. Right click Host1 object and choose Exit Maintenance Mode
  6. Right click Host2 object and choose Enter Maintenance Mode
  7. Wait for evacuation to complete
  8. Right click Host2 object and choose Remediate
  9. Wait for remediation to complete
  10. Right click Host2 object and choose Exit Maintenance Mode
  11. Right click Host3 object and choose Enter Maintenance Mode
  12. Wait for evacuation to complete
  13. Right click Host3 object and choose Remediate
  14. Wait for remediation to complete
  15. Right click Host3 object and choose Exit Maintenance Mode
  16. Right click Host4 object and choose Enter Maintenance Mode
  17. Wait for evacuation to complete
  18. Right click Host4 object and choose Remediate
  19. Wait for remediation to complete
  20. Right click Host4 object and choose Exit Maintenance Mode
  21. Right click Host5 object and choose Enter Maintenance Mode
  22. Wait for evacuation to complete
  23. Right click Host5 object and choose Remediate
  24. Wait for remediation to complete
  25. Right click Host5 object and choose Exit Maintenance Mode
  26. Right click Host6 object and choose Enter Maintenance Mode
  27. Wait for evacuation to complete
  28. Right click Host6 object and choose Remediate
  29. Wait for remediation to complete
  30. Right click Host6 object and choose Exit Maintenance Mode

It’s Saturday and your kids want to go to the park. Do the math.
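For anyone who would rather let a script do the math, here is a minimal sketch that builds the manual checklist for any number of hosts and compares it with the two-step cluster remediation. It only generates text. It does not talk to vCenter, and the function name is made up for illustration.

```python
def manual_steps(hosts):
    """Build the per-host checklist needed when VUM skips infrastructure hosts."""
    steps = []
    for host in hosts:
        steps += [
            f"Right click {host} object and choose Enter Maintenance Mode",
            "Wait for evacuation to complete",
            f"Right click {host} object and choose Remediate",
            "Wait for remediation to complete",
            f"Right click {host} object and choose Exit Maintenance Mode",
        ]
    return steps


# The old workflow: one click on the cluster, then everything is automatic.
CLUSTER_STEPS = 2

if __name__ == "__main__":
    hosts = [f"Host{i}" for i in range(1, 7)]
    steps = manual_steps(hosts)
    for number, step in enumerate(steps, start=1):
        print(f"{number}. {step}")
    print(f"{len(steps)} manual steps versus {CLUSTER_STEPS} before")
```

Five steps per host means the checklist grows linearly with cluster size, while the old cluster-level remediation stayed at two steps no matter how many hosts were involved.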

Update 5/5/10: I received this response back on 3/5/10 from VMware but failed to follow up with finding out if it was ok to share with the public.  I’ve received the blessing now so here it is:

[It] seems pretty tactical to me. We’re still trying to determine if this was documented publicly, and if not, correct the documentation and our processes.

We introduced this behavior in vSphere 4.0 U1 as a partial fix for a particular class of problem. The original problem is in the behavior of the remediation wizard if the user has chosen to power off or suspend virtual machines in the Failure response option.

If a stand-alone host is running a VM with VC or VUM in it and the user has selected those options, the consequences can be drastic – you usually don’t want to shut down your VC or VUM server when the remediation is in progress. The same applies to a DRS disabled cluster.

In DRS enabled cluster, it is also possible that VMs could not be migrated to other hosts for configuration or other reasons, such as a VM with Fault Tolerance enabled. In all these scenarios, it was possible that we could power off or suspend running VMs based on the user selected option in the remediation wizard.

To avoid this scenario, we decided to skip those hosts totally in first place in U1 time frame. In a future version of VUM, it will try to evacuate the VMs first, and only in cases where it can’t migrate them will the host enter a failed remediation state.

One workaround would be to remove such a host from its cluster, patch the cluster, move the host back into the cluster, manually migrate the VMs to an already patched host, and then patch the original host.

It would appear VMware intends to grant us back some flexibility in future versions of vCenter/VUM.  Let’s hope so. This implementation leaves much to be desired.

Update 5/6/10: LucD created a blog post titled Counter the self-aware VUM. In this blog post you’ll find a script which finds the ESX host(s) that is/are running the VUM guest and/or the vCenter guest and will vMotion the guest(s) to another ESX host when needed.

Anti-affinity rules are not honored in cluster with more than 2 virtual machines

March 27th, 2009

We can put a man on the moon and we can hot migrate virtual machines with SMP and gigs of RAM, but we can’t create anti-affinity rules with three or more VMs. This has been a thorn in my side since 2006, long before I requested it fixed in February 2007 on the VMTN Product and Feature Suggestions forum.

VMware updated KB article 1006473 on 3/26 outlining anti-affinity rule behavior when using three or more VMs:

“This is expected behavior, as anti affinity rules can be set only for 2 virtual machines.

When a third virtual machine is added any rule becomes disabled (with 2.0.2 or earlier).

There has been a slight change in behavior with VirtualCenter 2.5, wherein input validation occurs, where a third virtual machine added produces a warning message indicating a maximum of two virtual machines only can be added to this rule.

To workaround this, create more rules to cover all of the combinations of virtual machines.

For example, create rules for (VM1 & VM2), then (VM2 & VM3), and (VM1 & VM3).”

That last sentence is what has been burning my cookies for the longest time. In my last environment, I had several NLB VMs which could not be on the same host for load balancing and redundancy purposes. Rather than create a minimal number of rules to intelligently handle all of the VMs, I was left with no choice but to create several rules for each potentially deadly combination.
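To see how quickly the workaround gets out of hand, here is a minimal sketch that enumerates the two-VM rules needed to keep every VM in a group apart. The VM names are placeholders and nothing here calls VirtualCenter. It only lists the pairs you would have to enter by hand.

```python
from itertools import combinations


def pairwise_rules(vms):
    """Return every two-VM pairing needed to keep all VMs in the group apart."""
    return list(combinations(vms, 2))


def rule_count(n):
    """Number of two-VM rules for n VMs: n choose 2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    for pair in pairwise_rules(["VM1", "VM2", "VM3"]):
        print(" & ".join(pair))
    for n in (3, 4, 6, 8):
        print(f"{n} VMs need {rule_count(n)} rules")
```

Three VMs need the three rules from the KB example, but the count grows quadratically: six NLB nodes already need 15 separate rules.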

Work harder, not smarter. Come on VMware.

Putting some money where my VMware mouth is

February 15th, 2009

I came home this afternoon from a Valentine’s Day wedding in North Dakota to find that my one and only workstation in the house (other than the work laptop) had a belated Valentine’s Day present for me:  It would no longer boot up.  No Windows.  No POST.  No video signal.  No beep codes.

I was feeling adventurous and I needed a relatively quick and inexpensive fix.  I decided to take one of the thin clients I received from Chip PC via VMworld 2008 plus a freshly deployed Windows XP template on the Virtual Infrastructure and promote this VDI solution to main household workstation status for the next few weeks.  The timing on this could not have been better.  The upcoming Minnesota VMUG on Wednesday March 11th is going to be VDI focused.  I guess I’ll have more to contribute at that meeting than I had originally planned on.  With any luck, Chip PC will be in attendance and we can discuss some things.

The thin client:  Chip PC Xtreme PC NG-6600 (model: EX6600N, part number: CPN04209).

Specs:

  • RMI – Alchemy Au 1550, 500MHz RISC processor (equivalent to 1.2GHz x86 TC processors)
  • 128MB DDR RAM
  • 64MB Disk-On-Chip with TFS
  • 128-bit 3D graphics acceleration engine with separate 2x8MB display memory SDRAM
  • Dual DVI ports each supporting 1920×1200 16-bit color.  Supports quad displays up to 1024×768
  • Audio I/O
  • 4 USB 2.0 ports
  • 10/100 Ethernet NIC
  • Power draw:  3.5W work mode, .35W sleep mode
  • OS:  Enhanced Microsoft Windows CE (6.00 R2 Professional)
  • Integrated applications (Plugins – note plugins are downloaded at no charge from the Chip PC website and are not, by default, embedded or included with the thin client – just enough OS concept)
    • Citrix ICA
    • RDP 5.2 and 6
    • Internet Explorer 6.0
    • VDM Client
    • VDI Client
    • Media Player
    • VPN Client
    • Ultra VNC
    • Pericom (Team Talk) Terminal Emulation
    • LPD Printer
    • ELO Touch Screen
  • Compatibility
    • Citrix WinFrame, MetaFrame, and Presentation Server 4.5
    • MS Windows Server 2000/2003
    • MS Windows NT 4.0 – TS Edition
    • VMware Virtual Desktop Interface using RDP
  • Full support of both local and network printers:  LPD, LPR, SMB, LPT, USB, COM
  • Support for USB mass storage (thumb drives – deal breaker for me)
  • Support for wireless USB NIC (not included)
  • etc. etc. etc.

Truth be told, this isn’t really a promotion in the sense that I had already performed extensive testing on it.  I hadn’t even taken the thing out of the box yet other than to register it for the extended warranty.  I’ve had only a little experience on these devices as I have an identical unit in the lab at work which I’ve spent a total of 30 minutes on.  To the best of my knowledge, this is the Cadillac unit from Chip PC.

I don’t have any fancy VDI brokering solutions here in the home lab and I’m not up to speed on VMware View so the plan is to leverage Thin Client -> RDP -> Windows XP desktop on VMware Virtual Infrastructure 3.5.

I think this is going to be a good test.  A trial by fire of VDI (granted, a fairly simple variation).  I spout a lot about the goodness that is VMware and now I’ll be eating some of my own dog food from the desktop workspace.  I’m a power user.  I’ve got my standard set of applications that I use on a regular basis and I’ve got a few hardware devices such as a flatbed scanner, iPod Shuffle, USB thumb drives, digital cameras, etc.  I should know within a short period of time whether or not this will be a viable solution for the short term.  Also add to the mix my wife’s career.  She uses our home computer to access her servers at work on a fairly regular basis.  Lastly, my wife sometimes works from home while I’m away at the office or traveling.  It’s going to be critical that this solution stays up and running and continues to be viable for my wife while I’m remote and not able to provide computer support.

So where am I at now?  I’ve got the VDI session patched along with my most critical applications installed to get me by in the short term:  Quicken, SnagIt, network printer, and Citrix clients.  I’ll install MS Office later but for now I can use the published application version of Office on my virtualized Citrix servers.  I’ve been listening to some Electro House on www.di.fm on the VDI and music quality is as good as it was on my PC before it died, although it doesn’t completely drive my 5.1 surround in the den.  Pretty sure I’m getting 2.1 right now.  Oh well, at least the sub is thumpin.  Shhhh… the thin client is sleeping:

DSC00478

So what else?  As long as I’m throwing caution to the wind, I think it’s time to take the training wheels off VMware DPM (Distributed Power Management) and see what happens in a two node cluster.

2-15-2009 10-53-10 PM

Based on the environment below, what do you think will happen?  CPU load is very low; however, memory utilization is close to being overcommitted in a one host scenario. Will DPM kick in?

2-15-2009 10-53-59 PM

Most of my infrastructure at home is virtual including all components involving internet access both incoming and outgoing.  If the blog becomes unavailable for a while in the near future, I’ll give you one guess as to what happened.  :)

No matter what the outcome, vmwarenews.de aka Roman Haug – you are no longer welcome to republish my blog articles.  Albeit flattering, the fact that you have not even so much as asked in the first place has officially pissed me off.  You publish my content as if it were your own, written by you as indicated by the “by Roman” header preceding each duplicated post.  Please remove my content from your site and refrain from syndicating my content going forward.  Thank you in advance.

Update: Roman Haug has offered an apology and I believe we have reached an understanding.  Thank you Roman!

Great iSCSI info!

January 27th, 2009

I’ve been using Openfiler 2.2 iSCSI in the lab for a few years with great success as a means for shared storage. Shared storage with VMware ESX/ESXi (along with the necessary licensing) allows us great things like VMotion, DRS, HA, etc. I’ve recently been kicking the tires of Openfiler 2.3 and have been anxious to implement it, partly because of its easy, menu-driven NIC bonding feature, which I wanted to leverage for maximum disk I/O throughput.

Coincidentally, just yesterday a few of the big brains in the storage industry got together and published what I consider one of the best blog entries in the known universe. Chad Sakac and David Black (EMC), Andy Banta (VMware), Vaughn Stewart (NetApp), Eric Schott (Dell/EqualLogic), Adam Carter (HP/Lefthand) all conspired.

One of the iSCSI topics they cover is link aggregation over Ethernet. I read and re-read this section with great interest. My current swiSCSI configuration in the lab consists of a single 1Gb VMKernel NIC (along with a redundant failover NIC) connected to a single 1Gb NIC in the Openfiler storage box having a single iSCSI target with two LUNs. I’ve got more 1Gb NICs that I can add to the Openfiler storage box, so my million dollar question was “will this increase performance?” The short answer is NO with my current configuration. Although the additional NIC in the Openfiler box will provide a level of hardware redundancy, due to the way ESX 3.x iSCSI communicates with the iSCSI target, only a single Ethernet path will be used by ESX to communicate with the single target backed by both LUNs.

However, what I can do to add more iSCSI bandwidth is to add the 2nd Gb NIC in the Openfiler box along with an additional IP address, and then configure an additional iSCSI target so that each LUN is mapped to a separate iSCSI target.  Adding the additional NIC in the Openfiler box for hardware redundancy is a no brainer and I probably could have done that long ago, but as far as squeezing more performance out of my modest iSCSI hardware, I’m going to perform some disk I/O testing to see if the single Gb NIC is a disk I/O bottleneck.  I may not have enough horsepower under the hood of the Openfiler box to warrant going through the steps of adding additional iSCSI targets and IP addressing.

A few of the keys I extracted from the blog post are as follows:

“The core thing to understand (and the bulk of our conversation – thank you Eric and David) is that 802.3ad/LACP surely aggregates physical links, but the mechanisms used to determine whether a given flow of information follows one link or another are critical.

Personally, I found this doc very clarifying.: http://www.ieee802.org/3/hssg/public/apr07/frazier_01_0407.pdf

You’ll note several key things in this doc:

* All frames associated with a given “conversation” are transmitted on the same link to prevent mis-ordering of frames. So what is a “conversation”? A “conversation” is the TCP connection.
* The link selection for a conversation is usually done by doing a hash on the MAC addresses or IP address.
* There is a mechanism to “move a conversation” from one link to another (for loadbalancing), but the conversation stops on the first link before moving to the second.
* Link Aggregation achieves high utilization across multiple links when carrying multiple conversations, and is less efficient with a small number of conversations (and has no improved bandwidth with just one). While Link Aggregation is good, it’s not as efficient as a single faster link.”

“The ESX 3.x software initiator really only works on a single TCP connection for each target – so all traffic to a single iSCSI Target will use a single logical interface. Without extra design measures, it does limit the amount of IO available to each iSCSI target to roughly 120 – 160 MBs of read and write access.”

“This design does not limit the total amount of I/O bandwidth available to an ESX host configured with multiple GbE links for iSCSI traffic (or more generally VMKernel traffic) connecting to multiple datastores across multiple iSCSI targets, but does for a single iSCSI target without taking extra steps.

Question 1: How do I configure MPIO (in this case, VMware NMP) and my iSCSI targets and LUNs to get the most optimal use of my network infrastructure? How do I scale that up?

Answer 1: Keep it simple. Use the ESX iSCSI software initiator. Use multiple iSCSI targets. Use MPIO at the ESX layer. Add Ethernet links and iSCSI targets to increase overall throughput. Set your expectation for no more than ~160MBps for a single iSCSI target.

Remember an iSCSI session is from initiator to target. If you use multiple iSCSI targets, with multiple IP addresses, you will use all the available links in aggregate, and the storage traffic in total will load balance relatively well. But any individual target will be limited to a maximum of a single GbE connection’s worth of bandwidth.

Remember that this also applies to all the LUNs behind that target. So, consider that as you distribute the LUNs appropriately among those targets.

The ESX initiator uses the same core method to get a list of targets from any iSCSI array (static configuration or dynamic discovery using the iSCSI SendTargets request) and then a list of LUNs behind that target (SCSI REPORT LUNS command).”
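To put rough numbers on that advice for a lab like mine, here is a back-of-the-envelope sketch. It assumes the ~160 MBps per-target figure quoted above, assumes each target ends up on its own GbE link, and ignores whether the disks behind the targets can keep up. The function name is invented for illustration, and the result is a best-case ceiling, not a prediction.

```python
def aggregate_ceiling_mbps(num_targets, num_links, per_target_mbps=160):
    """Best-case total throughput for an ESX 3.x host using the software initiator.

    Each target rides a single TCP connection, so it can use at most one link.
    The ceiling is therefore bounded by whichever is scarcer: targets or links.
    """
    if num_targets < 1 or num_links < 1:
        raise ValueError("need at least one target and one link")
    return min(num_targets, num_links) * per_target_mbps


if __name__ == "__main__":
    # (targets, links): current lab, extra NIC only, extra NIC plus second target
    for targets, links in [(1, 1), (1, 2), (2, 2)]:
        ceiling = aggregate_ceiling_mbps(targets, links)
        print(f"{targets} target(s), {links} link(s): ~{ceiling} MBps ceiling")
```

The middle case is the one that matters: adding a second NIC without a second target leaves the ceiling where it was.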

Question 4: Do I use Link Aggregation and if so, how?

Answer 4: There are some reasons to use Link Aggregation, but increasing throughput to a single iSCSI target isn’t one of them in ESX 3.x.

What about Link Aggregation – shouldn’t that resolve the issue of not being able to drive more than a single GbE for each iSCSI target? In a word – NO. A TCP connection will have the same IP addresses and MAC addresses for the duration of the connection, and therefore the same hash result. This means that regardless of your link aggregation setup, in ESX 3.x, the network traffic from an ESX host for a single iSCSI target will always follow a single link.
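The hashing argument is easy to see with a toy model. The sketch below is not how any particular switch or ESX NIC teaming policy computes its hash. It simply XORs the source and destination IP addresses and takes the result modulo the number of links, which is enough to show why a single connection never leaves its link. All addresses are made up.

```python
import ipaddress


def pick_link(src_ip, dst_ip, num_links):
    """Toy link selection: XOR the two addresses, then modulo the link count.

    Real implementations use vendor-specific hashes (MAC, IP, sometimes ports);
    this stand-in only illustrates that the choice is a pure function of the
    connection's endpoints.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % num_links


if __name__ == "__main__":
    esx_host = "10.0.0.10"   # hypothetical VMkernel address
    target_a = "10.0.0.20"   # hypothetical iSCSI target
    target_b = "10.0.0.21"   # hypothetical second target on a second IP

    # Every frame of the one TCP connection to target A hashes identically.
    links_used = {pick_link(esx_host, target_a, 2) for _ in range(1000)}
    print("links used for target A:", links_used)

    # A second target on a different IP can land on a different link.
    print("target A link:", pick_link(esx_host, target_a, 2))
    print("target B link:", pick_link(esx_host, target_b, 2))
```

With these particular addresses the two targets land on different links, but that is luck of the hash rather than a guarantee. Two targets whose addresses hash alike would still share a link.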

For swiSCSI users, they also mention some cool details about what’s coming in the next release of ESX/ESXi. Those looking for more iSCSI performance will want to pay attention. 10Gb Ethernet is also going to be a game changer, further threatening fibre channel SAN technologies.

I can’t stress enough how neat and informative this article is. To boot, technology experts from competing storage vendors pooled their knowledge for the greater good. That’s just awesome!

Make VirtualCenter highly available with VMware Virtual Infrastructure

November 17th, 2008

A few days ago I posted some information on how to make VirtualCenter highly available with Microsoft Cluster Services.

Monday Night Football kickoff is coming up but I wanted to follow up quickly with another option (as suggested by Lane Leverett): Deploy the VirtualCenter Management Server (VCMS) on a Windows VM hosted on a VMware Virtual Infrastructure cluster. Why is this a good option? Here are a few reasons:

  1. It’s fully supported by VMware.
  2. You probably already have a VI cluster in your environment you can leverage. Hit the ground running without spending the time to set up MSCS.
  3. Removing MSCS removes a 3rd party infrastructure complexity and dependency which requires an advanced skill set to support.
  4. Removing MSCS removes at least one Windows Server license cost and also removes the need for the more expensive Windows Enterprise Server licensing and the special hardware needs required by MSCS.
  5. Green factor: Let VCMS leverage the use of VMware Distributed Power Management (DPM).

How does it work? It’s pretty simple. A virtualized VCMS shares the same advantages any other VM inherently has when running on a VMware cluster:

  1. Resource balancing of the four food groups (vProcessor, vRAM, vDisk, and vNIC) through VMware Distributed Resource Scheduler (DRS) technology
  2. Maximum uptime and quick recovery via VMware High Availability (HA) in the event of a VI host failure or isolation condition (yes, HA will still work if the VCMS is down. HA is a VI host agent)
  3. Maximum uptime and quick recovery via VMware High Availability (HA) in the event of a VMware Tools heartbeat failure (i.e. the guest OS croaks)
  4. Ability to perform host maintenance without downtime of the VCMS

A few things to watch out for (I’ve been there and done that, more than once):

  1. If you’re going to virtualize the VCMS, be sure you do so on a cluster with the necessary licensed options to support the benefits I outlined above (DRS, HA, etc.) This means VI Enterprise licensing is required (see the licensing/pricing chart on page 4 of this document). I don’t want to hide the fact that a premium is paid for VI Enterprise licensing, but as I pointed out above, if you’ve already paid for it, the bolt ons are unlimited use so get more use out of them.
  2. If your VCMS (and Update Manager) database is located on the VCMS, be sure to size your virtual hardware appropriately. Don’t go overboard though. From a guest OS perspective, it’s easier to grant additional virtual resources from the four food groups than it is to retract them.
  3. If you have a power outage and your entire cluster goes down (and your VCMS along with it), it can be difficult to get things back on their feet while you don’t have the use of the VCMS, particularly if you’ve lost the use of other virtualized infrastructure components such as Microsoft Active Directory. Initially it’s going to be command line city so brush up on your CLI. It really all depends on how bad the situation is once you get the VI hosts back up. One example I ran into: host A wouldn’t come back up, and host B wasn’t the registered owner of the VM I needed to bring up. This requires running the vmware-cmd command to register the VM and bring it up on host B.
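For that last scenario, here is a small sketch that prints the two service console commands I would expect to run on host B. The .vmx path is a made-up example, and the script only builds the command lines and runs nothing. Check the syntax against `vmware-cmd` help on your own ESX 3.x host before relying on it.

```python
import shlex


def recovery_commands(vmx_path):
    """Return the command lines to register a VM on this host and power it on.

    Assumes ESX 3.x service console syntax:
      vmware-cmd -s register <config file>
      vmware-cmd <config file> start
    """
    quoted = shlex.quote(vmx_path)  # protect paths that contain spaces
    return [
        f"vmware-cmd -s register {quoted}",
        f"vmware-cmd {quoted} start",
    ]


if __name__ == "__main__":
    # Hypothetical datastore and VM folder names.
    for command in recovery_commands("/vmfs/volumes/datastore1/vcms/vcms.vmx"):
        print(command)
```

Keeping a note of where the VCMS .vmx file lives before disaster strikes saves hunting through datastores from the command line.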

Well, I missed the first few minutes of Monday Night Football, but everyone who reads (tolerates) my ramblings is totally worth it.

Go forth and virtualize!