IBM x3850 M2 shows 8 sockets on ESXi 4.1

December 9th, 2010 by jason

Working with an IBM x3850 M2, I noticed VMware ESXi 4.1 was reporting 8 processor sockets when I know this model has only 4.  It was easily noticeable because I ran out of ESX host licensing prematurely.  The problem is also reported with the IBM x3950 M2 in this thread.

Here’s the fix:  Reboot the host and flip a setting in the BIOS.

POST -> F1 -> Advanced Setup -> CPU Options -> Clustering Technology. Toggle the Clustering Technology configuration from Logical Mode to Physical Mode.

After the above change is made, sanity is restored in that ESXi 4.1 will properly see 4 sockets and licenses will be consumed appropriately.
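
For a quick sanity check from the console after the reboot, something along these lines should confirm the corrected count.  This is just a sketch; the exact field names in the esxcfg-info output vary between releases:

    # dump hardware info and look for the physical CPU package count
    esxcfg-info | grep -i package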

Memory Compression Video

December 9th, 2010 by jason

Vladan SEGET created a blog post on VMware ESX(i) 4.1 Memory Compression.  In his post, he linked to a fantastically simple vmwaretv video demonstrating memory compression in action compared to a hypervisor with no memory compression enabled.  If you're looking for the tool used in the video to perform your own memory compression testing but can't find it, it's "around".  Let me know and I might be able to help you find it.

I was going to update my memory compression blog post crediting Vladan and embedding the video, but sadly, I have no memory compression blog post yet!  So instead, I send you to Vladan’s ESX Virtualization blog using the link above.

Note to self: create a memory compression blog post.

VMware vSphere 4.1 HA and DRS technical deepdive arrival

December 7th, 2010 by jason

I think Eric “Scoop” Sloof was the first to announce this yesterday, complete with a video and everything! Come on Eric, let some of the other bloggers have your scraps. 😎

I received a copy of a brand new book hot off the presses titled VMware vSphere 4.1 HA and DRS technical deepdive by Duncan Epping and Frank Denneman.  Having just received it tonight, of course I haven’t had time to finish reading it yet.  This is the pre-game party blog post.  Just by thumbing through the pages, I’m going to draw a few conclusions.  I’ll see if I’m right by the time I actually finish reading the book.

  1. 224 pages and 18 chapters in length.  I’ve seen entire virtual infrastructure books written in as many pages or fewer.  And this book covers just HA and DRS.
    Conclusion: Even factoring in a fair amount of diagrams, this will be the most comprehensive HA and DRS handbook in existence.
  2. HA and DRS are perhaps two of the most misunderstood and misinterpreted technologies in VMware’s suite of virtual infrastructure offerings.  What exactly is confusing about these tools?  First, they are both set-it-and-forget-it automation.  The technologies will more or less “just work” out of the box.  This simplicity bestows an overwhelming amount of confidence in cluster configuration because the complexity is masked by an easy-to-use interface.
    Conclusion: There’s a lot going on under the hood in both HA and DRS that administrators should know about to properly configure and tune their environment.  The detail this book goes into should rock your world.
  3. This book covers DPM.
    Conclusion: That is good.
  4. There are many great looking diagrams and flowcharts.
    Conclusion: Very helpful in reinforcing what’s written in detail.

I look forward to relaxing with this book while on vacation the rest of this week.  Nice job from what I’ve seen so far guys!

You can read a review, write a review, or purchase this book on Amazon’s web site here.

Old Games Revisited

December 1st, 2010 by jason

I got the bug tonight to try one of my old PC games.  I still have several of them on my hard drive dating back to the early-to-mid 1990s.  Each time I re-image my PC, I make sure I preserve these games by backing up and restoring their directory structures.

I wasn’t sure if they would work under Windows 7 but I decided to give them a try.  I made a few attempts to launch Doom II using various compatibility mode settings, but none worked.

When that failed, I quickly stumbled upon skulltag.com.  It’s a free Windows download which lets you play Doom and Doom II on modern Windows platforms.  Not only that, you can play online with other players over the internet.  I downloaded and installed the software and was literally playing online with another player within a minute.

The following videos bring back a lot of great memories of modem and LAN gaming with old friends in my 20s, and they are nothing short of amazing!

Doom II finished in 14:41

 

Quake finished in 17:38

Quake 2 finished in 21:06

Flow Control

November 29th, 2010 by jason

Thanks to the help from blog sponsorship, I’m able to maintain a higher-performing lab environment than I ever have up to this point.  One area I hadn’t invested much in, at least from a lab standpoint, is networking.  In the past, I’ve always had some sort of small-to-mid-density unmanaged Ethernet switch.  And this was fine.  Household name brand switches like Netgear and SMC from Best Buy and NewEgg performed well enough and survived for years in the higher-temperature lab environment.  Add to that, by virtue of being unmanaged, they were plug and play.  No time wasted fighting a misconfigured network.

I recently picked up a 3Com SuperStack 3 Switch 3870 (48 1GbE ports).  It’s not 10GbE, but it does fit my budget along with a few other networking nice-to-haves like VLANs and Layer 3 routing.  Because this switch is managed, I can now apply some best practices from the IP-based storage realm.  One of those best practices is configuring Flow Control for VMware vSphere with network storage.  This blog post is mainly to record some pieces of information I’ve picked up along the way and to open a dialog with network-minded readers who may have some input.

So what is network Flow Control? 

NetApp defines Flow Control in TR-3749 as “the process of managing the rate of data transmission between two nodes to prevent a fast sender from overrunning a slow receiver.”  NetApp goes on to advise that Flow Control can be set at the two endpoints (ESX(i) host level and the storage array level) and at the Ethernet switch(es) in between.

Wikipedia is in agreement with the above and adds more meat to the discussion, including the following: “The overwhelmed network element will send a PAUSE frame, which halts the transmission of the sender for a specified period of time. PAUSE is a flow control mechanism on full duplex Ethernet link segments defined by IEEE 802.3x and uses MAC Control frames to carry the PAUSE commands. The MAC Control opcode for PAUSE is 0x0001 (hexadecimal). Only stations configured for full-duplex operation may send PAUSE frames.”
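
To put a rough number on “a specified period of time”: the PAUSE frame carries a 16-bit pause value measured in quanta of 512 bit times, so a little back-of-the-napkin math (mine, not something from the vendor docs) shows the ceiling on a single PAUSE frame at 1GbE:

    1 pause quantum = 512 bit times = 512 ns at 1Gbps
    maximum pause value = 65,535 quanta
    65,535 x 512 ns = roughly 33.6 ms of suspended transmission per PAUSE frame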

What are network Flow Control best practices as they apply to VMware virtual infrastructure with NFS or iSCSI network storage?

Both NetApp and EMC agree that Flow Control should be enabled in a specific way at the endpoints as well as at the Ethernet switches which support the flow of traffic:

  • Endpoints (that’s the ESX(i) hosts and the storage arrays) should be configured with Flow Control send/tx on, and receive/rx off.
  • Supporting Ethernet switches should be configured with Flow Control “Desired” or send/tx off and receive/rx on.

One item to point out here is that although both mainstream storage vendors recommend these settings for VMware infrastructures as a best practice, neither of their multiprotocol arrays ships configured this way.  At least not the units I’ve had my hands on, which include the EMC Celerra NS-120 and the NetApp FAS3050c.  The Celerra is configured out of the box with Flow Control fully disabled, and I found the NetApp configured for Flow Control set to full (duplex?).
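
As an aside, checking and changing the setting on the NetApp controller is a one-liner from the ONTAP console.  Consider this a sketch rather than gospel; e0a is just a placeholder interface name and the syntax may vary by Data ONTAP release:

    ifconfig e0a                      # current settings, including the flowcontrol value
    ifconfig e0a flowcontrol send     # options are none, receive, send, or full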

Here’s another item of interest.  VMware vSphere hosts are configured out of the box to auto-negotiate Flow Control settings.  What does this mean?  Network interfaces are able to advertise certain features and protocols which they were purpose-built to understand, following the OSI model and RFCs of course.  One of these features is Flow Control.  VMware ESX ships with a Flow Control setting which adapts to its environment.  If you plug an ESX host into an unmanaged switch which doesn’t advertise Flow Control capabilities, ESX sets its tx and rx flags to off.  These flags tie specifically to the PAUSE frames mentioned above.  When I plugged my ESX host into the new 3Com managed switch and configured the ports for Flow Control to be enabled, I subsequently found out using the ethtool -a vmnic0 command that both tx and rx were enabled on the host (the 3Com switch has just one Flow Control toggle: enabled or disabled).  NetApp provides a hint to this behavior in their best practice statement which says “Once these [Flow Control] settings have been configured on the storage controller and network switch ports, it will result in the desired configuration without modifying the flow control settings in ESX/ESXi.”

Jase McCarty pointed out back in January a “feature” of ethtool in ESX.  Basically, ethtool can be used to display current Ethernet adapter settings (including Flow Control as mentioned above) and it can also be used to configure settings.  Unfortunately, when ethtool is used to hard-code a vmnic for a specific Flow Control configuration, that config lasts only until the next time ESX is rebooted.  After a reboot, the modified configuration does not persist and it reverts back to auto/auto/auto.  I tested with ESX 4.1 and the latest patches and the same holds true.  Jase offers a workaround in his blog post which allows the change to persist by embedding it in /etc/rc.local.
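
For reference, here’s roughly what that looks like from the ESX console.  This is a minimal sketch assuming vmnic0 is the uplink in question, and it borrows the spirit of Jase’s /etc/rc.local workaround rather than quoting it verbatim:

    # display the current Flow Control (pause) settings for vmnic0
    ethtool -a vmnic0

    # hard-code the endpoint best practice (tx on, rx off); lost at the next reboot
    ethtool -A vmnic0 autoneg off tx on rx off

    # persist the change by appending the same line to /etc/rc.local
    echo "ethtool -A vmnic0 autoneg off tx on rx off" >> /etc/rc.local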

Third item of interest.  VMware KB 1013413 talks about disabling Flow Control using esxcfg-module for Intel NICs and ethtool for Broadcom NICs.  The article specifically addresses disabling Flow Control when PAUSE frames are identified on the network.  If PAUSE frames are indicative of a large amount of traffic which a receiver isn’t able to handle, it would seem to me we’d want to leave Flow Control enabled (by design, to mediate the congestion) and perform root cause analysis on exactly why we’ve hit a sustained scaling limit (and what we do about it long term).

Fourth.  Flow Control seems to be a simple mechanism which hinges on PAUSE frames to work properly.  If the Wikipedia article is correct in that only stations configured for full-duplex operation may send PAUSE frames, then it would seem to me that both network endpoints (in this case ESX(i) and the IP-based storage array) should be configured with Flow Control set to full duplex, meaning both tx and rx ON.  This conflicts with the best practice messages from EMC and NetApp, although it does align with the FAS3050 out-of-the-box configuration.  The only reasonable explanation is that I’m misinterpreting the meaning of full-duplex here.

Lastly, I’ve got myself all worked up into a frenzy over the proper configuration of Flow Control because I want to be sure I’m doing the right thing from both a lab and infrastructure design standpoint, but in the end Flow Control is like the Shares mechanism in VMware ESX(i):  The values or configurations invoked apply only during periods of contention.  In the case of Flow Control, this means that although it may be enabled, it serves no useful purpose until a receiver on the network says “I can’t take it any more” and sends the PAUSE frames to temporarily suspend traffic.  I may never reach this tipping point in the lab but I know I’ll sleep better at night knowing the lab is configured according to VMware storage vendor best practices.

More VCDX Insight and a New Blog

November 17th, 2010 by jason

Yuri Semenikhin, a Systems Engineer from Tbilisi, Georgia, has recently launched a virtualization blog by the name of vEra of the Virtual Revolution.  Yuri published his VCDX certification attempt experience in his blog post VCDX “be or not to be”. not YET !  His writing is not in English; however, he offers an English translator on the right-hand edge of his blog.

While some compare the VCDX to the Cisco CCIE certification, Yuri contrasts the two by saying the CCIE is a technical certification mapping closer to the VCAP4-DCA, while the VCDX is an architect certification.  I would agree the VCAP-DCA exam compares to the CCIE from a hands-on lab approach, and the VCAP-DCA was plenty difficult, but I don’t think the VCAP-DCA requires nearly the level of training, preparation, and expense (or investment, depending on your view) that the CCIE lab exam does.  This is merely a difference in opinion and I’m not saying either is right or wrong.

The purpose of my blog post is to provide some exposure to Yuri and his blog.  Yuri tells his story in great length and detail.  I wish him the best of luck with his blog and his next VCDX attempt!

Submit a VMware Feature Request

November 13th, 2010 by jason

If you have a suggestion for how to improve or enhance VMware software, VMware always welcomes your input. Please submit your suggestions through the Feature Request form on VMware’s website. Unless additional information is needed, you will not receive a personal response. Any suggestions for enhancements to VMware software that you submit will become the property of VMware. VMware may use this information for any VMware business purposes, without restriction, including for product support and development. VMware will not use the information in a form that personally identifies you.

http://www.vmware.com/support/policies/feature.html

Provide your input to VMware and help them maintain their status as maker of the most innovative, flexible, and scalable hypervisor on the planet.