Posts Tagged ‘VMotion’

VMworld 2012 Announcements – Part I

August 27th, 2012

VMworld 2012 is underway in San Francisco.  Once again, a record number of attendees is expected to gather at the Moscone Center to see what VMware and their partners are announcing.  From a VMware perspective, there is plenty.

Given the sheer quantity of announcements, I’m actually going to break them up into a few parts, this post being Part I.  Let’s start with the release of vSphere 5.1 and some of its notable features.

Enhanced vMotion – the ability to perform a vMotion and a Storage vMotion simultaneously. In addition, this becomes an enabler to perform vMotion without the shared storage requirement.  Enhanced vMotion means we are able to migrate a virtual machine stored on local host storage to shared storage, and then back to local storage again.  Or perhaps migrate virtual machines from one host to another where each host has only its own locally attached storage.  Updated 9/5/12: The phrase “Enhanced vMotion” should be correctly read as “vMotion that has been enhanced”.  “Enhanced vMotion” is not an actual feature, product, or separate license.  It is an improvement over the previous vMotion technology and is included wherever vMotion is bundled.
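
For those who script against vCenter, the combined compute-and-storage move is expressed as a single relocate call that names both a destination host and a destination datastore. Below is a minimal pyVmomi sketch of that idea, assuming vSphere 5.1 or later; the vCenter address, credentials, VM, host, and datastore names are placeholders, not anything from this post.

```python
# Hedged sketch: shared-nothing vMotion via a single RelocateSpec (pyVmomi).
# All names below are placeholders for your own environment.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host='vcenter.example.com', user='administrator', pwd='password')
content = si.RetrieveContent()

def find_by_name(vimtype, name):
    """Walk the inventory for the first managed object of vimtype with this name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next((obj for obj in view.view if obj.name == name), None)
    finally:
        view.DestroyView()

vm = find_by_name(vim.VirtualMachine, 'test-vm-01')
dest_host = find_by_name(vim.HostSystem, 'esxi02.example.com')
dest_ds = find_by_name(vim.Datastore, 'esxi02-local-datastore')

# One spec carries both the compute destination (host) and the storage
# destination (datastore), which is what lets a VM leave local storage on
# one host and land on local storage of another.
spec = vim.vm.RelocateSpec()
spec.host = dest_host
spec.pool = dest_host.parent.resourcePool   # resource pool of the destination host/cluster
spec.datastore = dest_ds

task = vm.RelocateVM_Task(spec)
# ... wait on the task, then Disconnect(si)
```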


Enhanced vMotion Requirements:

  • Hosts must be managed by same vCenter Server
  • Hosts must be part of same Datacenter
  • Hosts must be on the same layer-2 network (and same switch if VDS is used)

Operational Considerations:

  • Enhanced vMotion is a manual process
  • DRS and SDRS automation do not leverage enhanced vMotion
  • Max of two (2) concurrent Enhanced vMotions per host
  • Enhanced vMotions count against concurrent limitations for both vMotion and Storage vMotion
  • Enhanced vMotion will leverage multi-NIC when available

Next Generation vSphere Client a.k.a. vSphere Web Client – An enhanced version of the vSphere Web Client which has already been available in vSphere 5.0.  As of vSphere 5.1, the vSphere Web Client becomes the de facto standard client for managing the vSphere virtualized datacenter.  Going forward, infrastructure management converges into a unified, single sign-on interface which any administrator can appreciate.  vSphere 5.1 will be the last platform to include the legacy vSphere Client. Although you may use the legacy client day to day while gradually easing into the Web Client, understand that all future development from VMware and its partners now goes into the Web Client. Plug-ins currently used today will generally still function with the legacy client (with support from their respective vendors), but they’ll need to be completely re-written on the vCenter Server side for the Web Client.  Aside from the unified interface, the architecture of the Web Client has scaling advantages as well.  As VMware adds bolt-on application functionality to the client, VMware partners will also have the ability to bring their own custom objects into the Web Client, thereby extending that single pane of glass management to other integrations in the ecosystem.

Here is a look at that vSphere Web Client architecture:

[screenshot: vSphere Web Client architecture]

Requirements:

  • Internet Explorer / Firefox / Chrome
  • others (Safari, etc.) are possible, but will lack VM console access

A look at the vSphere Web Client interface and its key management areas:

[screenshot: vSphere Web Client interface and its key management areas]

Where the legacy vSphere Client falls short and how the vSphere Web Client solves these issues:

  • Single Platform Support (Windows)
    • vSphere Web Client is Platform Agnostic
  • Scalability Limits
    • Built to handle thousands of objects
  • White Screen of Death
    • Performance
  • Inconsistent look and feel across VMware solutions
    • Extensibility
  • Workflow Lock
    • Pause current task and continue later right where you left off (this one is cool!)
    • Browser Behavior
  • Upgrades
    • Upgrade a single server-side component

vCloud Director 5.1

In the recent past, VMware aligned common application and platform releases to ease issues that commonly occurred with compatibility.  vCloud Director, the cornerstone of the vCloud Suite, is obviously central to how VMware will deliver infrastructure, applications, and *aaS now and into the future. So what’s new in vCloud Director 5.1?  First, an overview of the vCloud Suite:

[diagram: vCloud Suite overview]

And a detailed list of new features:

  • Elastic Virtual Datacenters – Provider vDCs can span clusters leveraging VXLAN, allowing the distribution and mobility of vApps across infrastructure and growing the vCloud Virtual Datacenter
  • vCloud Networking & Security VXLAN
  • Profile-Driven Storage integration with user and storage provided capabilities
  • Storage DRS (SDRS) integration
    • Exposes a storage pod as a first-class storage container (just like a datastore), making it visible in all workflows where a datastore is visible
    • Creation, modification, and deletion of storage pods is not possible in vCD
    • Member datastore operations are not permissible in vCD
  • Single level Snapshot & Revert support for vApps (create/revert/remove); integration with Chargeback
  • Integrated vShield Edge Gateway
  • Integrated vShield Edge Configuration
  • vCenter Single Sign-On (SSO)
  • New Features in Networking
    • Integrated Organization vDC Creation Workflow
    • Creates compute, storage, and networking objects in a single workflow
    • Edge Gateways are exposed at the Organization vDC level
    • Organization vDC networks replace Organization networks
    • Edge Gateways now support:
      • Multiple interfaces on an Edge Gateway
      • The ability to sub-allocate IP pools to an Edge Gateway
      • Load balancing
      • HA (not the same as vSphere HA)
        • Two edge VMs deployed in Active-Passive mode
        • Enabled at time of gateway creation
        • Can also be changed after the gateway has been completed
        • Gets deployed with first Organizational network created that uses this gateway
      • DNS Relay
        • Provides a user selectable checkbox to enable
        • If DNS servers are defined for the selected external network, DNS requests will be sent to the specified server. If not, then DNS requests will be sent to the default gateway of the external network.
      • Rate limiting on external interface
    • Organization networks replaced by Organization vDC Networks
      • Organization vDC Networks are associated with an Organization vDC
      • The network pool associated with Organization vDC is used to create routed and isolated Organization vDC networks
      • Can be shared across Organization vDCs in an Organization
    • Edge Gateways
      • Are associated with an Organization vDC and cannot be shared across Organization vDCs
      • Can be connected to multiple external networks
        • Multiple routed Organization vDC networks will be connected to the same Edge Gateway
      • External network connectivity for the Organization vDC Network can be changed after creation by changing the external networks to which the Edge Gateway is connected.
      • Allows IP pool of external networks to be sub-allocated to the Edge Gateway
        • Needs to be specified in case of NAT and Load Balancer
    • New Features in Gateway Services
      • Load balancer service on Edge Gateways
      • Ability to add multiple subnets to VPN tunnels
      • Ability to add multiple DHCP IP pools
      • Ability to add explicit SNAT and DNAT rules providing user with full control over address translation
      • IP range support in Firewall and NAT services
      • Service Configuration Changes
        • Services are configured on Edge Gateway instead of at the network level
        • DHCP can be configured on Isolated Organization vDC networks.
  • Usability Features
    • New default branding style
      • Cannot revert back to the Charcoal color scheme
      • Custom CSS files will require modification
    • Improved “Add vApp from Catalog” wizard workflow
    • Easy access to VM Quota and Lease Expirations
    • New dropdown menu that includes details and search
    • Redesigned catalog navigation and sub-entity hierarchy
    • Enhanced help and documentation links
  • Virtual Hardware Version 9
    • Supports features introduced with hardware version 9 (such as 64 vCPU support)
    • Supports Hardware Virtualization Calls
    • VT-x/EPT or AMD-V/RVI
    • Memory overhead increased, vMotion limited to like hardware
    • Enable/Disable exposed to users who have rights to create a vApp Template
  • Additional Guest OS Support
    • Windows 8
    • Mac OS 10.5, 10.6 and 10.7
  • Storage Independent of VM Feature
    • Added support for Independent Disks
    • Provides REST API support for actions on Independent Disks
      • As these consume disk space, the vCD UI was updated to show users where they are in use:
        • Organizations List Page – a new Independent Disks count column is added
        • Organization Properties Page – an Independent Disks tab is added showing all independent disks belonging to the vDC (the tab is not shown if no independent disks exist in the vDC)
        • Virtual Machine Properties Page – in the Hardware tab’s Hard Disks section, attached independent disks are shown by name, and all fields for the disk are disabled as they are not editable

That’s all I have time for right now.  As I said, there is more to come later on topics such as vDS enhancements, VXLAN, SRM, vCD Load Balancing, and vSphere Replication.  Stay tuned!

Jumbo Frames Comparison Testing with IP Storage and vMotion

January 24th, 2011

Are you thinking about implementing jumbo frames with your IP storage based vSphere infrastructure?  Have you asked yourself why, or thought about the guaranteed benefits? Various credible sources discuss it (here’s a primer).  Some will highlight jumbo frames as a best practice, but the majority of what I’ve seen and heard talks about the potential advantages of jumbo frames and what the technology might do to make your infrastructure more efficient.  But be careful not to interpret that as an order of magnitude increase in performance for IP based storage.  In almost all cases, that’s not what is being conveyed, or at least, that shouldn’t be the intent.  Think beyond SPEED NOM NOM NOM.  Think efficiency and reduced resource utilization, which lends itself to driving down overall latency.  There are a few stakeholders when considering jumbo frames.  In no particular order:

  1. The network infrastructure team: They like network standards, best practices, a highly performing and efficient network, and zero down time.  They will likely have the most background knowledge and influence when it comes to jumbo frames.  Switches and routers have CPUs which benefit from jumbo frames because processing fewer frames with more payload overall makes the network device inherently more efficient, using less CPU power and consequently producing less heat.  This becomes increasingly important on 10Gb networks.
  2. The server and desktop teams: They like performance and unlimited network bandwidth provided by magic stuff, dark spirits, and friendly gnomes.  These teams also like a positive end user experience.  Their platforms, which include hardware, OS, and drivers, must support jumbo frames.  The effort required to configure for jumbo frames increases with a rising number of different hardware, OS, and driver combinations.  Any systems which don’t support network infrastructure requirements will be a showstopper.  Server and desktop network endpoints benefit from jumbo frames in much the same way network infrastructure does: efficiency and less overhead, which can lead to slightly measurable amounts of performance improvement.  The performance gains more often than not won’t be noticed by end users except for processes that historically take a long time to complete.  These teams will generally follow infrastructure best practices as instructed by the network team.  In some cases, these teams will embark on an initiative which recommends or requires a change in network design (NIC teaming, jumbo frames, etc.).
  3. The budget owner:  This can be a project sponsor, departmental manager, CIO, or CEO.  They control the budget and thus spending.  Considerable spend thresholds require business justification.  This is where the benefit needs to justify the cost.  They are removed from most of the technical persuasions.  Financial impact is what matters.  Decisions should align with current and future architectural strategies to minimize costly rip and replace.
  4. The end users:  Not surprisingly, they are interested in application uptime, stability, and performance.  They couldn’t care less about the underlying technology except for how it impacts them.  Reduction in performance or slowness is highly visible.  Subtle increases in performance are rarely noticed.  End user perception is reality.

The decision to introduce jumbo frames should be carefully thought out, and there should be a compelling reason, use case, or business justification which drives the decision.  Because of the end-to-end requirements, implementing jumbo frames can bring additional complexity and cost to an existing network infrastructure.  Possibly the single best one-size-fits-all reason for a jumbo frames design is a situation where jumbo frames is already a standard in the existing network infrastructure.  In that situation, jumbo frames becomes a design constraint or requirement.  The evangelistic point to be made is that VMware vSphere supports jumbo frames across the board.  Short of the previous use case, jumbo frames is a design decision where I think it’s important to weigh cost and benefit.  I can’t give you the cost component as it is going to vary quite a bit from environment to environment depending on the existing network design.  This writing speaks more to the benefit component.  Liberal estimates claim up to a 30% performance increase when integrating jumbo frames with IP storage.  The numbers I came up with in lab testing are nowhere close to that.  In fact, you’ll see a few results where IO performance with jumbo frames actually decreased slightly.  Not only do I compare IO with and without jumbo frames, I’m also able to compare two storage protocols with and without jumbo frames, which could prove to be an interesting sidebar discussion.
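
To put the efficiency argument in numbers, here is a quick back-of-the-napkin calculation of how much of the wire actually carries TCP payload at the two MTU sizes. It assumes plain IPv4/TCP headers with no options and no VLAN tag; the takeaway is that framing efficiency alone caps the theoretical gain at a few percent, which lines up with the lab results below.

```python
# Rough wire-efficiency comparison for standard vs jumbo frames.
# Per-frame wire overhead: preamble+SFD (8) + Ethernet header (14) + FCS (4) + inter-frame gap (12),
# plus 20 bytes IPv4 and 20 bytes TCP headers inside the frame (no options, no VLAN tag).
ETH_WIRE_OVERHEAD = 8 + 14 + 4 + 12
IP_TCP_HEADERS = 20 + 20

def payload_efficiency(mtu):
    payload = mtu - IP_TCP_HEADERS          # TCP payload carried per frame
    on_wire = mtu + ETH_WIRE_OVERHEAD       # bytes consumed on the wire per frame
    return payload / on_wire

for mtu in (1500, 9000):
    print(f"MTU {mtu}: {payload_efficiency(mtu):.1%} of wire bytes are payload")

# MTU 1500: ~94.9%; MTU 9000: ~99.1%.  Best case from framing alone is roughly a
# 4-5% gain, before counting per-frame CPU/interrupt savings on hosts and switches.
```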

I’ve come across many opinions regarding jumbo frames.  Now that I’ve got a managed switch in the lab which supports jumbo frames and VLANs, I wanted to see some real numbers.  Although this writing is primarily regarding jumbo frames, by way of the testing regimen, it is in some ways a second edition to a post I created one year ago where I compared IO performance of the EMC Celerra NS-120 among its various protocols. So without further ado, let’s get onto the testing.

 

Lab test script:

To maintain as much consistency and integrity as possible, the following test criteria was followed:

  1. One Windows Server 2003 VM with IOMETER was used to drive IO tests.
  2. A standardized IOMETER script was leveraged from the VMTN Storage Performance Thread which is a collaboration of storage performance results on VMware virtual infrastructure provided by VMTN Community members around the world.  The thread starts here, was locked due to length, and continues on in a new thread here.  For those unfamiliar with the IOMETER script, it basically goes like this: each run consists of a two minute ramp up followed by five minutes of disk IO pounding.  Four different IO patterns are tested independently.
  3. Two runs of each test were performed to validate consistent results.  A third run was performed if the first two were not consistent.
  4. One ESXi 4.1 host with a single IOMETER VM was used to drive IO tests.
  5. For the mtu1500 tests, IO tests were isolated to one vSwitch, one vmkernel portgroup, one vmnic, one pNIC (Intel NC360T PCI Express), one Ethernet cable, and one switch port on the host side.
  6. For the mtu1500 tests, IO tests were isolated to one cge port, one datamover, one Ethernet cable, and one switch port on the Celerra side.
  7. For the mtu9000 tests, IO tests were isolated to the same vSwitch, a second vmkernel portgroup configured for mtu9000, the same vmnic, the same pNIC (Intel NC360T PCI Express), the same Ethernet cable, and the same switch port on the host side.
  8. For the mtu9000 tests, IO tests were isolated to a second cge port configured for mtu9000, the same datamover, a second Ethernet cable, and a second switch port on the Celerra side.
  9. Layer 3 routes between host and storage were removed to lessen network burden and to isolate storage traffic to the correct interfaces.
  10. 802.1Q VLANs were used to isolate traffic and categorize standard traffic versus jumbo frame traffic.
  11. RESXTOP was used to validate storage traffic was going through the correct vmknic.
  12. Microsoft Network Monitor and Wireshark were used to validate frame lengths during testing (a quick scripted spot check is sketched after this list).
  13. Activities known to introduce large volumes of network or disk activity were suspended such as backup jobs.
  14. Dedupe was suspended on all Celerra file systems to eliminate datamover contention.
  15. All storage tests were performed on thin provisioned virtual disks and datastores.
  16. The same group of 15 spindles were used for all NFS and iSCSI tests.
  17. The uncached write mechanism was enabled on the NFS file system for all NFS tests.  You can read more about that in the EMC best practices document, VMware ESX Using EMC Celerra Storage Systems.
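
Wireshark and Network Monitor work fine for the frame-length validation in item 12; for something scriptable, a tiny sniffer can do the same spot check. This is a sketch only, assuming the scapy package is installed and the script runs with capture privileges; the interface name, storage IP, and the NFS/iSCSI ports shown are placeholders for your own setup.

```python
# Quick frame-length spot check during a test run (scapy required).
from scapy.all import sniff

def show_len(pkt):
    # Print the captured frame size; jumbo-frame storage traffic should
    # report lengths approaching ~9000 bytes, standard traffic ~1500.
    print(len(pkt), "bytes on the wire")

# Placeholder interface, storage IP, and ports (2049 = NFS, 3260 = iSCSI).
sniff(iface="eth0",
      filter="host 192.168.1.50 and (port 2049 or port 3260)",
      prn=show_len, count=20)
```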

Lab test hardware:

SERVER TYPE: Windows Server 2003 R2 VM on ESXi 4.1
CPU TYPE / NUMBER: 1 vCPU / 512MB RAM (thin provisioned)
HOST TYPE: HP DL385 G2, 24GB RAM; 2x QC AMD Opteron 2356 Barcelona
STORAGE TYPE / DISK NUMBER / RAID LEVEL: EMC Celerra NS-120 / 15x 146GB 15K / 3x RAID5 5×146
SAN TYPE: / HBAs: NFS / swiSCSI / 1Gb datamover ports (sorry, no FCoE)
OTHER: 3Com SuperStack 3 3870 48x1Gb Ethernet switch

 

Lab test results:

NFS test results.  How much better is NFS performance with jumbo frames by IO workload type?  The best result seen here is about a 7% performance increase by using jumbo frames; however, 100% read is a rather unrealistic representation of a virtual machine workload.  For NFS, I’ll sum it up as a 0-3% IOPS performance improvement by using jumbo frames.

[charts: NFS test results, MTU 1500 vs. MTU 9000]

iSCSI test results.  How much better is iSCSI performance with jumbo frames by IO workload type?  Here we see that iSCSI doesn’t benefit from the move to jumbo frames as much as NFS.  In two workload pattern types, performance actually decreased slightly.  Discounting the unrealistic 100% read workload as I did above, we’re left with a 1% IOPS performance gain at best by using jumbo frames with iSCSI.

[charts: iSCSI test results, MTU 1500 vs. MTU 9000]

NFS vs iSCSI test results.  Taking the best results from each protocol type, how do the protocol types compare by IO workload type?  75% of the best results came from using jumbo frames.  The better performing protocol is a 50/50 split depending on the workload pattern.  One interesting observation to be made in this comparison is how much better one protocol performs over the other.  I’ve heard storage vendors state that the IP protocol debate is a snoozer, that they perform roughly the same.  I’ll grant that in two of the workload types below, but in the other two, iSCSI pulls a significant performance lead over NFS. Particularly in the Max Throughput-50%Read workload where iSCSI blows NFS away.  That said, I’m not outright recommending iSCSI over NFS.  If you’re going to take anything away from these comparisons, it should be “it depends”.  In this case, it depends on the workload pattern, among a handful of other intrinsic variables.  I really like the flexibility in IP based storage and I think it’s hard to go wrong with either NFS or iSCSI.

[charts: NFS vs. iSCSI comparison results]

vMotion test results.  Up until this point, I’ve looked at the impact of jumbo frames on IP based storage with VMware vSphere.  For curiosity’s sake, I wanted to address the question “How much better is vMotion performance with jumbo frames enabled?”  vMotion utilizes a VMkernel port on ESXi just as IP storage does, so the groundwork has already been established, making this a quick test.  I followed roughly the same lab test script outlined above so that the most consistent and reliable results could be produced.  This test wasn’t rocket science.  I simply grabbed a few different VM workload types (Windows, Linux) with varying sizes of RAM allocated to them (2GB, 3GB, 4GB).  I then performed three batches of vMotions of two runs each on non jumbo frames (mtu1500) and jumbo frames (mtu9000).  Results varied.  The first two batches showed that jumbo frames provided a 7-15% reduction in elapsed vMotion time.  But then the third and final batch contrasted previous results with data revealing a slight decrease in vMotion efficiency with jumbo frames.  I think there are more variables at play here and this may be a case where more data sampling is needed to form any kind of reliable conclusion.  But if you want to go by these numbers, vMotion is quicker on jumbo frames more often than not.
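
For anyone repeating this, the comparison itself is trivial to script. The sketch below shows the sort of per-batch aggregation I mean; the elapsed-time values in it are placeholders for illustration, not my actual measurements.

```python
# Small helper for comparing vMotion elapsed times between MTU settings.
# The sample values below are placeholders, not the lab's actual numbers.
def pct_change(baseline_s, test_s):
    """Negative result means the test (mtu9000) run finished faster than baseline (mtu1500)."""
    return (test_s - baseline_s) / baseline_s * 100

# elapsed seconds per run as (mtu1500, mtu9000) pairs -- placeholder data
batches = {
    "win2003-2GB": [(95, 84), (97, 86)],
    "linux-3GB":   [(120, 108), (118, 110)],
    "win2008-4GB": [(150, 153), (149, 151)],
}

for name, runs in batches.items():
    changes = [pct_change(b, t) for b, t in runs]
    avg = sum(changes) / len(changes)
    print(f"{name}: avg elapsed-time change {avg:+.1f}%")
```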

[charts: vMotion elapsed time results, MTU 1500 vs. MTU 9000]

The bottom line:

So what is the bottom line on jumbo frames, at least today?  First of all, my disclaimer:  My tests were performed on an older 3Com network switch.  Mileage may vary on newer or different network infrastructure.  Unfortunately I did not have access to a 10Gb lab network to perform this same testing.  However, I believe my findings are consistent with the majority of what I’ve gathered from the various credible sources.  I’m not sold on jumbo frames as a provider of significant performance gains.  I wouldn’t break my back implementing the technology without an indisputable business justification.  If you want to please the network team and abide by the strategy of an existing jumbo frames enabled network infrastructure, then use jumbo frames with confidence.  If you want to be doing everything you possibly can to boost performance from your IP based storage network, use jumbo frames.  If you’re betting the business on IP based storage, use jumbo frames.  If you need a piece of plausible deniability when IP storage performance hits the fan, use jumbo frames. If you’re looking for the IP based storage performance promised land, jumbo frames doesn’t get you there by itself.  If you come across a source telling you otherwise, that jumbo frames is the key or sole ingredient to the Utopia of incomprehensible speeds, challenge the source.  Ask to see some real data.  If you’re in need of a considerable performance boost for your IP based storage, look beyond jumbo frames.  Look at optimizing, balancing, or upgrading your back end disk array.  Look at 10Gb.  Look at fibre channel.  Each of these alternatives is likely to get you better overall performance gains than jumbo frames alone.  And of course, consult with your vendor.

Meet the Engineer: VMware vMotion

September 14th, 2010

I caught this VMware video announcement on Twitter but didn’t see a formal blog post or landing page to provide the proper introduction which it deserves, so I’ll go ahead here and do the cheeseful.  I have no shame in this.

vMotion is a historically significant technology in VMware’s portfolio of datacenter products and has become a staple of virtualized datacenter operations.  It paves a foundation which many other key VMware technologies leverage.  Dilpreet Bindra is the Senior Engineering Manager of the VM Mobility Team at VMware (which encompasses both vMotion and Storage vMotion).

Dilpreet is the star of this video and he explains some of the barriers his group has conquered in vSphere 4.1 – these are awesome improvements!  Watch the video. You’re being treated to a sizable slice of VMware history.

New ESX(i) 3.5 security patch released; scenarios and installation notes

April 11th, 2009

On Friday April 10th, VMware released two patches:

Both address the same issue:

A critical vulnerability in the virtual machine display function might allow a guest operating system to run code on the host. The Common Vulnerabilities and Exposures Project (cve.mitre.org) has assigned the name CVE-2009-1244 to this issue.

Hackers must love vulnerabilities like this because they can get a lot of mileage out of essentially a single attack. The ability to execute code on an ESX host can impact all running VMs on that host.

Although proper virtualization promises isolation, the reality is that no hardware or software vendor is perfect and from time to time we’re going to see issues like this. Products are under constant attack from hackers (both good and bad) to find exploits. In virtualized environments, it’s important to remember that guest VMs and guest operating systems are no different than their physical counterparts in that they need to be properly protected from the network. That means adequate virus protection, spyware protection, firewalls, encryption, packet filtering, etc.

This vulnerability in VMware ESX and ESXi is really a two-stage attack. In order to compromise the ESX or ESXi host, a guest VM must first be vulnerable to compromise on the network to provide the entry point to the host. Once the guest VM is compromised, the next step is to get from the guest VM to the ESX(i) host. Hosts without the patch will be vulnerable to this next attack which, as we know from reading above, will allow who knows what code to be executed on the host. If the host is patched, we maintain our guest isolation and the attack stops at the VM level. Unfortunately, the OS running in the guest VM is still compromised, again highlighting the need for adequate protection of the operating system and applications running in each VM.

The bottom line is this is an important update for your infrastructure. If your ESX or ESXi hosts are vulnerable, you’ll want to get this one tested and implemented as soon as possible.

I installed the updates today in the lab and discovered something interesting that is actually outlined in both of the KB articles above:

  • The ESXi version of the update requires a reboot. Using Update Manager, the patch process goes like this: Remediate -> Maintenance Mode -> VMotion VMs off -> Patch -> Reboot -> Exit Maintenance Mode. The duration of installation of the patch until exiting maintenance mode (including the reboot in between) took 12 minutes.
  • The ESX version of the update does not require a reboot. Using Update Manager, the patch process goes like this: Remediate -> Maintenance Mode -> VMotion VMs off -> Patch -> Exit Maintenance Mode. The duration of installation of the patch until exiting maintenance mode (with no reboot in between) took 1.5 minutes.

Given reboot times of the host, patching ESX hosts goes much quicker than patching ESXi hosts. Reboot times on HP Proliant servers aren’t too bad but I’ve been working with some powerful IBM servers lately and the reboot times on those are significantly longer than HP. Hopefully we’re not rebooting ESX hosts on a regular basis so with that in mind, reboot times aren’t a huge concern, but if you’ve got a large environment with a lot of hosts requiring reboots, the reboot times are going to be cumulative in most cases. Consider my environment above. A 6 node ESXi cluster is going to take 72 minutes to patch, not including VMotions. A 6 node ESX cluster is going to take 9 minutes to patch, not including VMotions. This may be something to really think about when weighing the decision of ESX versus ESXi for your environment.
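
The math behind those cluster-level numbers is simple enough to generalize. A throwaway sketch, assuming hosts are remediated one at a time and ignoring the time spent evacuating VMs with VMotion (as the figures above do):

```python
# Back-of-the-napkin serial patch window for a cluster.
# Per-host times are the lab figures from this post; VMotion evacuation time is excluded.
def cluster_patch_minutes(hosts, per_host_minutes):
    return hosts * per_host_minutes

for label, per_host in (("ESXi (patch + reboot)", 12), ("ESX (no reboot)", 1.5)):
    print(f"{label}: 6-node cluster ~ {cluster_patch_minutes(6, per_host):g} minutes")
# -> 72 minutes vs 9 minutes, matching the numbers above
```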

Update: One more item critical to note is that although the ESX version of the patch requires no reboot, the patch does require three other patches to be installed, at least one of which requires a reboot.  If you already meet the requirements, no reboot will be required for ESX to install the new patch.

In closing, while we are on the subject of performing a lot of VMotions, take a look at a guest blog post from Simon Long called VMotion Performance. Simon shows us how to modify VirtualCenter (vCenter Server) to allow more simultaneous VMotions which will significantly cut down the amount of time spent patching ESX hosts in a cluster.

Andrew Kutz joins Hyper9

February 28th, 2009

This news is a little over a week old but I just found out two nights ago while reading vExpert profiles and it’s definitely worth repeating.

Andrew Kutz was recently named a vExpert by VMware, Inc. and is a well known developer in the VMware community. Andrew has authored a number of VirtualCenter plugins, of which the most famous might be his free Storage VMotion (sVMotion) plugin which provides VMware administrators a GUI interface to hot migrate VM storage from one LUN to another. Andrew has received well deserved praise for his work because he makes the lives of VI administrators easier.

Hyper9 is a startup company in Austin, TX that works in the virtualization infrastructure management space, developing tools that automate the management of virtualization in the datacenter. Hyper9 recently secured an additional round of investment funding, and with the hiring of Andrew Kutz it would seem they are totally serious about delivering quality products to the virtualization community. What can we expect out of this? Given what I’ve seen from Andrew in the past, I’ll guess the future will be a plugin-based architecture, which I think makes a lot of sense and is probably what the majority of the community wants.

Congratulations to both Andrew Kutz and Hyper9. I look forward to your accomplishments with great anticipation!

Read the official announcement from Hyper9 here.

Tripwire Announcement

February 19th, 2009

Press release from Tripwire.  I haven’t had time to take a look at the product yet but the announcement comes from a trustworthy and reputable source whom I respect.  I look forward to seeing some commentary either on the blog here or over at vwire.com.

TRIPWIRE ANNOUNCES FREE UTILITY TO HELP MANAGE VMWARE VMOTION, LAUNCHES NEW VIRTUALIZATION COMMUNITY
Tripwire OpsCheck addresses key virtual infrastructure operational issues; vWire.com offers an opportunity for virtual infrastructure professionals to share ideas and best practices

Portland, OR – Feb. 17, 2009 – Tripwire, Inc. today announced a major new initiative for virtual infrastructure (VI) professionals, which includes Tripwire OpsCheck™, a free tool to manage VMware VMotion, and an online community for VI administrators. Tripwire OpsCheck assesses common configuration problems that may prevent VMotion from operating properly, and provides troubleshooting tips for configuring VMotion based on Tripwire OpsCheck test results. To download Tripwire OpsCheck, go to www.vwire.com.

To further support the needs of VI professionals, Tripwire has unveiled vWire.com, an online community built around the concerns of VI professionals. Virtualization administrators, engineers and architects are invited to join the community and conversation to share best practices, network, and gain new resources and tools. For more information about the forum, visit www.vwire.com.

“Virtualization professionals are faced with unknown territory, requiring new tools to manage the complexities and risks of virtual environments,” said Dan Schoenbaum, chief operating officer of products, Tripwire. “That’s why Tripwire is committed to developing utilities specifically for virtualization, such as OpsCheck and ConfigCheck, and to creating a forum where VI professionals can share their experiences and knowledge.”

Tripwire ConfigCheck, released in 2008, provides an immediate assessment of the configurations of a VMware ESX hypervisor, comparing them against VMware hardening security guidelines, and then providing remediation instructions if any are needed. ConfigCheck is also available for free and can be downloaded at www.vwire.com.

About Tripwire, Inc.
Tripwire helps over 6,500 enterprises worldwide reduce security risk, attain compliance and increase operational efficiency across virtual and physical environments. With its industry leading configuration assessment and change auditing software solutions, IT organizations achieve and maintain configuration control. Tripwire is headquartered in Portland, Ore. with offices worldwide. http://www.tripwire.com/.

Great iSCSI info!

January 27th, 2009

I’ve been using Openfiler 2.2 iSCSI in the lab for a few years with great success as a means for shared storage. Shared storage with VMware ESX/ESXi (along with the necessary licensing) allows us great things like VMotion, DRS, HA, etc. I’ve recently been kicking the tires of Openfiler 2.3 and have been anxious to implement partly due to the ease in its menu driven NIC bonding feature which I wanted to leverage for maximum disk I/O throughput.

Coincidentally, just yesterday a few of the big brains in the storage industry got together and published what I consider one of the best blog entries in the known universe. Chad Sakac and David Black (EMC), Andy Banta (VMware), Vaughn Stewart (NetApp), Eric Schott (Dell/EqualLogic), Adam Carter (HP/Lefthand) all conspired.

One of the iSCSI topics they cover is link aggregation over Ethernet. I read and re-read this section with great interest. My current swiSCSI configuration in the lab consists of a single 1Gb VMkernel NIC (along with a redundant failover NIC) connected to a single 1Gb NIC in the Openfiler storage box having a single iSCSI target with two LUNs. I’ve got more 1Gb NICs that I can add to the Openfiler storage box, so my million dollar question was “will this increase performance?” The short answer is NO with my current configuration. Although the additional NIC in the Openfiler box will provide a level of hardware redundancy, due to the way ESX 3.x iSCSI communicates with the iSCSI target, only a single Ethernet path will be used by ESX to communicate with the single target backed by both LUNs.

However, what I can do to add more iSCSI bandwidth is to add the 2nd Gb NIC in the Openfiler box along with an additional IP address, and then configure an additional iSCSI target so that each LUN is mapped to a separate iSCSI target.  Adding the additional NIC in the Openfiler box for hardware redundancy is a no brainer and I probably could have done that long ago, but as far as squeezing more performance out of my modest iSCSI hardware, I’m going to perform some disk I/O testing to see if the single Gb NIC is a disk I/O bottleneck.  I may not have enough horsepower under the hood of the Openfiler box to warrant going through the steps of adding additional iSCSI targets and IP addressing.

A few of the keys I extracted from the blog post are as follows:

“The core thing to understand (and the bulk of our conversation – thank you Eric and David) is that 802.3ad/LACP surely aggregates physical links, but the mechanisms used to determine whether a given flow of information follows one link or another are critical.

Personally, I found this doc very clarifying: http://www.ieee802.org/3/hssg/public/apr07/frazier_01_0407.pdf

You’ll note several key things in this doc:

* All frames associated with a given “conversation” are transmitted on the same link to prevent mis-ordering of frames. So what is a “conversation”? A “conversation” is the TCP connection.
* The link selection for a conversation is usually done by doing a hash on the MAC addresses or IP address.
* There is a mechanism to “move a conversation” from one link to another (for loadbalancing), but the conversation stops on the first link before moving to the second.
* Link Aggregation achieves high utilization across multiple links when carrying multiple conversations, and is less efficient with a small number of conversations (and has no improved bandwidth with just one). While Link Aggregation is good, it’s not as efficient as a single faster link.”

The ESX 3.x software initiator really only works on a single TCP connection for each target – so all traffic to a single iSCSI target will use a single logical interface. Without extra design measures, it does limit the amount of IO available to each iSCSI target to roughly 120 – 160 MBps of read and write access.

“This design does not limit the total amount of I/O bandwidth available to an ESX host configured with multiple GbE links for iSCSI traffic (or more generally VMKernel traffic) connecting to multiple datastores across multiple iSCSI targets, but does for a single iSCSI target without taking extra steps.

Question 1: How do I configure MPIO (in this case, VMware NMP) and my iSCSI targets and LUNs to get the most optimal use of my network infrastructure? How do I scale that up?

Answer 1: Keep it simple. Use the ESX iSCSI software initiator. Use multiple iSCSI targets. Use MPIO at the ESX layer. Add Ethernet links and iSCSI targets to increase overall throughput. Set your expectation for no more than ~160MBps for a single iSCSI target.

Remember an iSCSI session is from initiator to target. If you use multiple iSCSI targets, with multiple IP addresses, you will use all the available links in aggregate, and the storage traffic in total will load balance relatively well. But any individual target will be limited to a maximum of a single GbE connection’s worth of bandwidth.

Remember that this also applies to all the LUNs behind that target. So, consider that as you distribute the LUNs appropriately among those targets.

The ESX initiator uses the same core method to get a list of targets from any iSCSI array (static configuration or dynamic discovery using the iSCSI SendTargets request) and then a list of LUNs behind that target (SCSI REPORT LUNS command).”

Question 4: Do I use Link Aggregation and if so, how?

Answer 4: There are some reasons to use Link Aggregation, but increasing a throughput to a single iSCSI target isn’t one of them in ESX 3.x.

What about Link Aggregation – shouldn’t that resolve the issue of not being able to drive more than a single GbE for each iSCSI target? In a word – NO. A TCP connection will have the same IP addresses and MAC addresses for the duration of the connection, and therefore the same hash result. This means that regardless of your link aggregation setup, in ESX 3.x, the network traffic from an ESX host for a single iSCSI target will always follow a single link.
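
The “one conversation, one link” behavior is easy to see if you model the hash-based link selection described above. The following sketch is a simplified illustration of source/destination IP hashing, not the exact algorithm of any particular switch or of ESX; the addresses are placeholders. The point is that a single initiator-to-target TCP connection always produces the same hash and therefore always rides the same physical link, while a second target with its own IP gives the aggregate a chance to use the other link.

```python
# Simplified illustration of hash-based link selection in a 2-link aggregate.
# Not the exact algorithm of any switch or of ESX -- just the general idea that
# one TCP "conversation" always hashes to the same physical link.
import ipaddress

def select_link(src_ip, dst_ip, num_links):
    a = int(ipaddress.ip_address(src_ip))
    b = int(ipaddress.ip_address(dst_ip))
    return (a ^ b) % num_links

vmkernel_ip = "10.0.10.21"
targets = ["10.0.10.50", "10.0.10.51"]   # one IP per iSCSI target (placeholder addresses)

for t in targets:
    print(f"{vmkernel_ip} -> {t}: link {select_link(vmkernel_ip, t, 2)}")

# A single target only ever uses one link; adding a second target with its own
# IP address is what lets the aggregate spread traffic across both links.
```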

For swiSCSI users, they also mention some cool details about what’s coming in the next release of ESX/ESXi. Those looking for more iSCSI performance will want to pay attention. 10Gb Ethernet is also going to be a game changer, further threatening fibre channel SAN technologies.

I can’t stress enough how neat and informative this article is. To boot, technology experts from competing storage vendors pooled their knowledge for the greater good. That’s just awesome!