The time for ESXi is now. The Susan Gudenkauf interview

July 5th, 2011 by jason

I suspect VMware is going to time the release of the next generation of vSphere to coincide with VMworld 2011, or perhaps even sooner. There is a big media event coming up on Tuesday, July 12th called Raising the Bar, Part V, and I’m guessing announcements and details will be showcased there. At any rate, most people have known for quite some time that VMware ESX is being retired in favor of ESXi as VMware’s flagship type 1, enterprise, scalable datacenter hypervisor. The next version of vSphere, which VMware is about to release, officially marks the end of ESX. Only ESXi will be available going forward.

For most people, this doesn’t mean a lot, since many have already made the formal transition from ESX to ESXi. However, others have yet to commit to ESXi for various reasons. Those who have already embraced ESXi are prepared for this next release of vSphere and all of the new features it brings. Those still on ESX are behind the eight ball. With the upcoming version of vSphere, we will no longer have the choice to stay the course with ESX. The transition to ESXi is becoming critical, and the time to make it is now.

I had a chance to talk with Susan Gudenkauf, who oversees the ESXi program and is helping customers make this transition. The interview follows. If you’re still hesitant about ESXi, I hope this Q&A session helps.

Q. Can you please introduce yourself and tell us a little about your history at VMware?

A. My name is Susan Gudenkauf and I have been a VMware employee for 8 years. I started at VMware in June of 2003 when there were about 240 employees. My first role here was Senior Systems Engineer and there were only about 10 of us in the world. We are still a tight-knit group and those guys are some of my best friends now. I stayed in that role for about 18 months and left when there were around 100 SEs (me still being the only female in the group). I went on to become a Technical Account Manager (TAM) so I could concentrate on the relationships with my customers instead of the more ‘hit-and-run’ work I was doing as an SE. At the time the territories were a lot larger and I covered 6 states and 3 Canadian provinces by myself. After I had been a TAM for a couple of years I was promoted to TAM Manager and then Senior Manager. In January of 2011 I left Professional Services (PSO) to oversee the ESXi program as a Senior Program Manager focusing on customer migration. It was a huge step outside of my comfort zone but it’s been a wonderful experience so far. I’ve really been enjoying my career at VMware and it’s been fun getting to do different roles and having varying responsibilities.

Q. You are a legend in VMware certification history. Would you mind sharing that story?

A. I think the word ‘legend’ is a bit much (although it is certainly flattering) but I agree there is definitely some vibe around the VCP #1 thing (VMware Certified Professional). It’s one of the strangest and sweetest things that have ever happened to me. I didn’t know at the time it would become such a big deal, but I regularly get people asking for my autograph and for me to take photos with them. It’s really funny and I’m finally starting to enjoy that people care enough about it to even mention it. The first time I took the VMware course (on ESX 1.5) there wasn’t a VCP exam yet. I was working as a consultant for a partner at the time, but ended up going to VMware about 8 months after the course. The first week I started at VMware I took the VMware ESX 1.52 course and this time they had a certification exam – which we had to write on PAPER. The worst part was that we didn’t find out if we passed or failed for 6 weeks or something. It was pretty nerve wracking. Later I found out that the two guys I was friends with in the class (Ferhan Khan and Michael Cambian) were VCP #2 and VCP #3. It’s funny how we all ended up the first three VCPs in the world.

Q. What is your primary responsibility in your current role?

A. One of the most important responsibilities of my role is to bring awareness to the fact that ESXi is the only VMware hypervisor going forward. In July 2010 we announced that VMware ESX was going away and the next major release would have ESXi as the only platform. It is amazing how many people didn’t realize that when I started with this program. It’s been a major focus for me for all of 2011. Most of what the team has been working on is at the ESXi Info Center here: http://www.vmware.com/products/vsphere/esxi-and-esx/

Q. VMware ESXi may be new to some readers. Can you talk about what ESXi is and provide some historical background on its development?

A. Introduced in 2007, ESXi is the most advanced hypervisor in the market today. It is a “bare-metal” hypervisor and is thinner, lighter, more secure and easier to manage than ESX. ESXi also has a great advantage over other hypervisors due to the fact that it has complete independence from an Operating System. This is key because the hypervisor is the foundation to your private or hybrid cloud and you need it to be solid.

Q. Is ESXi experimental or for non-production use? What is VMware’s support stance on ESXi?

A. ESXi is absolutely designed for production use. It is a fully featured hypervisor that delivers greater performance, reliability, security and scalability than ESX. It can be used to run any of the advanced features of vSphere in a multitude of use cases. In fact, ESXi is already used in Production by a large percentage of VMware customers. We do have an entry level product called VMware vSphere Hypervisor which is based on ESXi but has limited management capabilities and doesn’t give users the advanced features such as high availability, live migration, power management, automatic load balancing, etc. Our support stance on ESXi is the same as our other solutions. We have Production Support (24×7), Basic Support (normal business hours) and Per-Incident Support.

Q. What can you tell me about the adoption rate of ESXi since its release?

A. I don’t have specific numbers in front of me, but after ESXi was released in 2007 we saw a fairly gradual uptick in adoption…until vSphere 4 was released. I think that was really the tipping point for mainstream adoption. An interesting thing to note is that a leading indicator of adoption is the number of downloads each product gets. We’ve seen a reversal in downloads, to where ESX now accounts for only 20% of overall downloads whereas ESXi accounts for 80%. That’s really great validation for the strategic direction we have chosen.

Q. Is there a features, support, hardware compatibility, scalability, or stability gap between ESX and ESXi platforms?

A. Interestingly, I hear the statement from some people that ESXi “doesn’t have the same functionality” that ESX does. This may have been true at one time, but since ESXi 4.1 came out we’ve really had feature parity. ESXi 4.1 supports boot from SAN, scripted installations, and Active Directory integration, among other features. You can also expect the scalability that you’ve grown accustomed to with ESX.

Q. ESXi has a significantly smaller code base than ESX. How does this impact the effort and time required to deploy and patch ESXi vs. ESX, and what does the reduced footprint mean from a security standpoint?

A. You are right about the smaller code base, Jason. ESXi is built on less than 100MB of code, whereas ESX is built on over 2GB. That’s a significant savings in space and it brings with it greater reliability and stability. An additional benefit of less code and independence from an Operating System is a lower risk of bugs and security vulnerabilities.

Q. Has ESXi boot from firmware really taken off? Are there any caveats there?

A. Our major OEM partners offer ESXi pre-installed on their servers due to continuing customer demand. These customers are very enthusiastic about the super-convenient delivery model – just rack the server, power it on and ESXi is up and running. These customers also love the fact that they can run ESXi without local storage which increases the reliability of the server. There really aren’t any caveats other than making sure to use a flash device that’s certified for use for ESXi. These can be obtained from the OEMs. In the future, we intend to provide even more ways to deploy ESXi so that the customer can choose what’s best for their environment.

Q. Some customers have raised software compatibility concerns. How is ESXi impacting the partner ecosystem and what efforts are in place to ensure a seamless migration for ESX shops?

A. I work pretty closely with the Eco-Engineering team regarding our partners and software compatibility. Our partners have known about this transition for years now and those that have not already transitioned their tools to be compatible with ESXi are working diligently to complete this in the near future.

Q. Is VMware offering any special incentives for ESXi purchases, upgrades, or migrations?

A. One of the things I felt adamant about when I took this role was that we needed a way to help our customers migrate to ESXi with the least amount of disruption. Awareness and education are critical to any successful plan. I worked with our VMware Education team to bring an online course to our customers and most importantly to make that course available at no cost to them. It was a little bit of an uphill battle in the beginning since not only did I want VMware to pick up the cost of creating the course, but also the cost of purchasing a number of ESXi eBooks. We are running a promotion now where those that take the course and fill out the survey at the end will get a free eBook while our supplies last. We do still have some eBooks left right now but supplies are running short since it has been a pretty popular promotion. Here is a link to the course so folks can take the course while we still have the ebooks: http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&a=one&id_subject=23970

Q. Once vSphere.next is released as an ESXi only platform, how long will customers be supported on ESX?

A. VMware will offer 7 years of support from the general availability of a new Major Release. That means ESX 4.x will be supported 7 years from the date of general availability on our next major release. The 7 years of support is broken down into 5 years of General Support and 2 years of Technical Guidance following the end of the General Support.

Q. Futures: Is ESXi the final frontier for the VMware virtualized datacenter/vCloud platform or are there more platform changes coming?

A. I don’t know that I would say anything in technology is the “final frontier”. I know that everything evolves and ESXi is the platform that brings greater efficiency, reliability and stability to virtualization. My whole philosophy on life is if we aren’t evolving, we are dying. The best part of that is VMware will continue to evolve and be a leader for virtualization and cloud infrastructure. Our company meetings are so cool since we get to see and hear ideas from some of the smartest people on the planet. We hear people say that they have no idea how some of these things are even possible, but when Steve Herrod (our CTO and Senior Vice President of R&D) says it can happen…it just does. It’s fun to be a part of that and watch ideas come to life. Now, as far as a discussion on futures goes I’ll go into detail on some really cool things coming up since I’m certain all of your readers are under NDA, right? OK, I’m kidding. I think we should get Steve to have that discussion.

Thank you for your time, Susan!

Thanks so much for the interview, Jason. It was an honor to be asked and I had a lot of fun with this. I hope your readers enjoy it!

You can contact Susan on Twitter at: @susangude

New Diskeeper White Paper: Optimization of VMware Systems

June 28th, 2011 by jason

Diskeeper Corporation reached out to me via email last week to let me know that they’ve released a new white paper on optimizing VMs. I’m making the three-page document available for download via the following link:

Best Practice Protocols: Optimization of VMware Systems (416KB)

Force a Simple VMware vMA Password

June 27th, 2011 by jason

VMware ESXi is mainstream. If you’ve ever deployed a VMware vMA appliance to manage ESXi (or heck, even ESX for that matter), you may have noticed the enforcement of a complex password policy for the vi-admin account. For example, setting the password to “password” is denied because it is based on a dictionary word (in addition to other morally obvious reasons).

However, you can bend the complexity rules and force a simple password after the initial deployment by using sudo. You’ll still be warned about the violation of the complexity policy, but because sudo runs the command with elevated privileges, the higher authority is allowed to bypass the policy:

sudo passwd vi-admin

This tip isn’t specific to VMware or the vMA appliance. It is general *nix knowledge. There is ample documentation available which discusses the password complexity mechanism in various versions of *nix. Another approach to bypassing the complexity requirement would be to relax the requirement itself, but this would impact other local accounts potentially in use on the vMA appliance which may still require complex passwords. Using sudo is faster and leaves the default complexity mechanism in place.

Tech Support Mode Warnings

June 23rd, 2011 by jason

After enabling Local Tech Support Mode on an ESXi host via the DCUI (Direct Console User Interface), a yellow balloon styled warning will be displayed in the vSphere Client:

The Local Tech Support Mode for the host has been enabled

Likewise, if you’ve enabled Remote Tech Support Mode via SSH, you’ll see:

Remote Tech Support Mode(SSH) for the host has been enabled

KB Article 1016205 describes this condition as a security measure. Heeding the warnings is a best practice for a production or high risk environment. However, for lab, development, or other environments with adequate perimeter security, it may be desirable to have either or both modes enabled without the warnings appearing throughout the vSphere Client.

The VMware KB article goes on to say that there is no way to eliminate the warnings while leaving Local or Remote Tech Support Mode enabled.

Disabling Remote Tech Support Mode (SSH) and Local Tech Support Mode is the only way to prevent this warning.

While there may not be an advanced configuration option exposed for this, rebooting the host clears the warnings. It has also been reported in the VMware community forums that restarting the hostd service works as well, although as a side effect it will likely disconnect the host from vCenter Server temporarily:

/etc/init.d/hostd restart

Xangati Packs More Power in Free VMware Management Tool

June 22nd, 2011 by jason

Press Release:

Expands Functionality of Xangati for ESX with Performance Health Engine for Any Given Host

Cupertino, CA – June 22, 2011 – Xangati, the recognized leader in infrastructure performance management, today announced that it has expanded the capabilities and power offered in its free VMware management tool, Xangati for ESX. Xangati for ESX now includes several features from its recently announced and patent-pending Performance Health Engine – a real-time health index that monitors the health of every object within the virtualized infrastructure and a key component of Xangati’s multi-host Xangati Virtual Infrastructure (VI) and Virtual Desktop Infrastructure (VDI) Dashboards. With the updated Xangati for ESX, virtualization managers now have an even clearer picture of their VM activity, as well as the ability to fully monitor a single ESX host – all at no cost.

“Xangati is continuously looking for ways to improve our infrastructure performance management solutions in order to provide the highest value to virtualization managers – and that objective is absolutely no different for our free Xangati for ESX tool,” said Alan Robin, CEO of Xangati. “The response to our Performance Health Engine – for both our VI and VDI dashboards – inspired us to incorporate some of its capabilities into our free tool, so that everyone can experience and benefit from real-time health analysis – in any stage of their virtualization initiative.”

“By incorporating its health index into the free Xangati for ESX, Xangati allows virtualization managers to create a baseline for the infrastructure,” said David Davis, vExpert and blogger. “When any unusual activity occurs on the infrastructure, the tool alerts you and identifies the problem area. This ability – plus Xangati’s trademark DVR recordings – provide for the most comprehensive troubleshooting available, differentiating Xangati from other virtual performance monitoring tools – all for free.”

New Capabilities Streamline Management and Ensure User Satisfaction

With its new enhancements, Xangati for ESX gives managers deeper insights into any potential problems within virtualized environments by immediately and visually alerting them to any anomalies. Xangati achieves this unique health alert system by comparing real-time data feeds with established performance profiles for up to 10 VMs running on an ESX host supporting virtualized servers or virtual desktops. Its memory-based architecture allows Xangati to compare this data and identify any performance shifts live and continuously – not through intermittent polling intervals – giving managers unparalleled insights for faster troubleshooting. These insights, in turn, provide confidence for the migration of mission-critical applications in the VI and ensure end user satisfaction – the biggest factor in determining the success of VDI initiatives.

Xangati for ESX still includes all of Xangati’s trademark features, including: continuous scroll-bar and drill-down user interface (UI) capabilities for dynamic and real-time navigation; visibility into more than 100 metrics on an ESX/ESXi host and its VMs’ activity; and automated DVR recordings (triggered by VMware alerts) to capture critical events for replay analysis for precision troubleshooting as opposed to sifting through unstructured log files. Xangati for ESX is also deployed in Open Virtualization Format (OVF) to facilitate a faster and easier installation process. Xangati is committed to continue to incorporate capabilities that add value and help accelerate virtualization initiatives.

Available immediately, the updated Xangati for ESX works with VMware 3.5, 4.0 and 4.1 for ESX and ESXi. Xangati has also created an updated installation video and documentation for additional background about the new features in order to enable virtualization managers to begin using and benefiting from the free tool as quickly as possible. To access the installation video and download a copy of the free Xangati for ESX, go to http://xangati.com/xangati-for-esx-new-features/.

About Xangati

Xangati, the recognized leader in Infrastructure Performance Management (IPM), provides unparalleled performance management for the emerging and transformational data center architectures impacting IT today, including server virtualization, cloud computing and VDI. Its award-winning suite of IPM solutions accelerates cloud computing and virtualization initiatives by providing unprecedented visibility and real-time continuous insights into the entire infrastructure. Leveraging its powerful precision analytics, Xangati’s health performance index provides a new way to view and manage performance – in real-time – at a scale previously not possible.

Founded in 2006, Xangati, Inc. is a privately held company with corporate headquarters based in Cupertino, California. Xangati has been granted numerous technology patents for its unique and comprehensive approach to Infrastructure Performance Management. Xangati is a VMware Technology Alliance Partner and certified Citrix Ready Partner and supports both VMware View and Citrix XenDesktop, as well as other virtualization environments. For more information, visit the company website at http://www.xangati.com.

Father’s Day Fun With ESXi Annotations

June 19th, 2011 by jason

Tired of the same old DCUI look of your ESXi host?

Change it with your own custom Annotation (found under Host|Configuration|Software|Advanced Settings):

Voilà!

Clearly this is more fun than any one person should be allowed to have.

Clear the Annotation field to restore the original ESXi look.

Disk.SchedNumReqOutstanding and Queue Depth

June 16th, 2011 by jason

There is a VMware storage whitepaper available titled Scalable Storage Performance. It is an oldie but a goodie. In fact, next to VMware’s Configuration Maximums document, it is one of my favorites and I’ve referenced it often. I like it because it efficiently covers block storage LUN queue depth and SCSI reservations specifically. It was written pre-VAAI but I feel the concepts are still quite relevant in the block storage world.

One of the interrelated components of queue depth on the VMware side is the advanced VMkernel parameter Disk.SchedNumReqOutstanding. This setting determines the maximum number of active storage commands (IO) allowed at any given time at the VMkernel. In essence, this is queue depth at the hypervisor layer. Queue depth can be configured at several points in the path of an IO: at the VMkernel, as already mentioned, as well as at the HBA hardware layer, the kernel module (driver) layer, and the guest OS layer.

Getting back to Disk.SchedNumReqOutstanding, I’ve always lived by the definition I felt was most clear in the Scalable Storage Performance whitepaper.  Disk.SchedNumReqOutstanding is the maximum number of active commands (IO) per LUN.  Clustered hosts don’t collaborate on this value which implies this queue depth is per host.  In other words, each host has its own independent queue depth, again, per LUN.  How does Disk.SchedNumReqOutstanding impact multiple VMs living on the same LUN (again, same host)?  The whitepaper states each VM will evenly share the queue depth (assuming each VM has identical shares from a storage standpoint).

When virtual machines share a LUN, the total number of outstanding commands permitted from all virtual machines to that LUN is governed by the Disk.SchedNumReqOutstanding configuration parameter that can be set using VirtualCenter. If the total number of outstanding commands from all virtual machines exceeds this parameter, the excess commands are queued in the ESX kernel.

I was recently challenged by a statement agreeing to all of the above but with one critical exception:  Disk.SchedNumReqOutstanding provides an independent queue depth for each VM on the LUN.  In other words, if Disk.SchedNumReqOutstanding is left at its default value of 32, then VM1 has a queue depth of 32, VM2 has a queue depth of 32, and VM3 has its own independent queue depth of 32.  Stack those three VMs and we arrive at a sum total of 96 outstanding IOs on the LUN.  A few sources were provided to me to support this:

Fibre Channel SAN Configuration Guide:

You can adjust the maximum number of outstanding disk requests with the Disk.SchedNumReqOutstanding parameter in the vSphere Client. When two or more virtual machines are accessing the same LUN, this parameter controls the number of outstanding requests that each virtual machine can issue to the LUN.

VMware KB Article 1268 (Setting the Maximum Outstanding Disk Requests per Virtual Machine):

You can adjust the maximum number of outstanding disk requests with the Disk.SchedNumReqOutstanding parameter. When two or more virtual machines are accessing the same LUN (logical unit number), this parameter controls the number of outstanding requests each virtual machine can issue to the LUN.

The problem with the two statements above is that I feel they are poorly worded and, as a result, misinterpreted. I understand what each statement is trying to say, but it implies something quite different depending on how a person reads it. Each statement is correct in that Disk.SchedNumReqOutstanding will gate the amount of active IO possible per LUN and ultimately per VM. However, the wording implies that the value assigned to Disk.SchedNumReqOutstanding applies individually to each VM, which is not the case. The reason I’m pointing this out is the number of misinterpretations I’ve subsequently discovered via Google, which I gather are the result of reading one of the two sources above.

The scenario can be quickly proven in the lab.  Disk.SchedNumReqOutstanding is configured for the default value of 32 active IOs.  Using resxtop, I see my three VMs cranking out IO with IOMETER.  Each VM is configured with IOMETER to create 32 active IOs.  If what I’m being told by the challenge is true, I should be seeing 96 active IO being generated to the LUN from the combined activity of the three VMs.

But that’s not what’s happening.  Instead what I see is approximately 32 ACTV (active) IOs on the LUN, with another 67 IOs waiting in queue (by the way, ESXTOP statistic definitions can be found here).  In my opinion, the Scalable Storage Performance whitepaper most accurately and best defines the behavior of the Disk.SchedNumReqOutstanding value.
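
The two readings can also be compared with a toy model. This is only a minimal sketch under stated assumptions, not a description of how the VMkernel scheduler actually works: the function names are hypothetical, and it assumes every VM keeps exactly 32 IOs outstanding at all times.

```python
def shared_limit(per_vm_outstanding, limit=32):
    """Whitepaper reading: one limit per LUN (per host), shared by all VMs."""
    total = sum(per_vm_outstanding)
    active = min(total, limit)
    return active, total - active  # (active IOs, queued IOs)


def per_vm_limit(per_vm_outstanding, limit=32):
    """Challenged reading: every VM gets its own independent limit."""
    active = sum(min(n, limit) for n in per_vm_outstanding)
    return active, sum(per_vm_outstanding) - active  # (active IOs, queued IOs)


# Three IOMETER VMs, each assumed to keep 32 IOs outstanding.
vms = [32, 32, 32]
print("shared limit :", shared_limit(vms))
print("per-VM limit :", per_vm_limit(vms))
```

Under these assumptions the shared model predicts about 32 active and 64 queued IOs, which is close to the roughly 32 active and 67 queued observed in resxtop. The per-VM model predicts 96 active and nothing queued, which is not what the lab shows.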

Now going back to the possibility of Disk.SchedNumReqOutstanding stacking, LUN utilization could get out of hand rapidly with 10, 15, 20, or 25 VMs per LUN. We’d quickly exceed the maximum supported value of Disk.SchedNumReqOutstanding (and the maximum LUN queue depth of all HBAs I’m aware of), which is 256. HBA ports themselves typically support a few thousand outstanding IOs. Stacking the queue depths for each VM could quickly saturate an HBA, meaning we’d get a lot less mileage out of those ports as well.
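
The arithmetic behind that concern is simple enough to sketch. This is purely illustrative: `stacked_depth` is a hypothetical helper, and it assumes the default value of 32 applied independently to every VM, which is the interpretation being argued against.

```python
LIMIT = 32           # default Disk.SchedNumReqOutstanding
MAX_SUPPORTED = 256  # maximum supported value cited above


def stacked_depth(vm_count, limit=LIMIT):
    """Total outstanding IOs on one LUN if each VM had its own limit."""
    return vm_count * limit


for vm_count in (10, 15, 20, 25):
    depth = stacked_depth(vm_count)
    print(f"{vm_count} VMs -> {depth} outstanding IOs, "
          f"exceeds {MAX_SUPPORTED}: {depth > MAX_SUPPORTED}")
```

If the limit stacked this way, any LUN with more than 8 VMs would already be past 256 outstanding IOs at the default setting.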

While having a queue depth discussion, it’s also worth noting the %USD value is at 100% and LOAD is approximately 3.  The LOAD statistic corroborates the 3:1 ratio of total IO:queue depth and both figures paint the picture of an oversubscribed LUN from an IO standpoint.
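
Both figures can be reproduced from the counters in the screenshot. The sketch below assumes the usual esxtop definitions: %USD is the share of the queue depth consumed by active commands, and LOAD is the ratio of active plus queued commands to the queue depth. The function and parameter names are made up for the example.

```python
def lun_stats(actv, qued, dqlen):
    """Return (%USD, LOAD) from active commands, queued commands, and queue depth."""
    pct_usd = 100.0 * actv / dqlen
    load = (actv + qued) / dqlen
    return pct_usd, load


# Approximate values from the lab: 32 active, 67 queued, queue depth 32.
pct_usd, load = lun_stats(actv=32, qued=67, dqlen=32)
print(f"%USD = {pct_usd:.0f}%, LOAD = {load:.2f}")
```

A LOAD above 1 means more IO is being offered than the queue can hold, so a value near 3 is what three VMs each pushing a full queue's worth of IO at one LUN would be expected to produce.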

In conclusion, I’d like to see VMware modify the wording in their documentation to provide better understanding leaving nothing open to interpretation.

Update 6/23/11: Duncan Epping at Yellow Bricks responded with a great follow-up, “Disk.SchedNumReqOutstanding the story”.