SRM 5.0 Replication Bits and Bytes

October 3rd, 2011 by jason

VMware has pushed out several releases and features in the past several weeks.  It can be a lot to digest, particularly if you’ve been involved in the beta programs for these new products because there were some changes made when the bits made their GA debut. One of those new products is SRM 5.0.  I’ve been working a lot with this product lately and I thought it would be helpful to share some of the information I’ve collected along the way.

One of the new features in SRM 5.0 is vSphere Replication.  I’ve heard some people refer to it as Host Based Replication or HBR for short.  In terms of how it works, this is an accurate description and it was the feature name during the beta phase.  However, by the time SRM 5.0 went to GA, each of the replication components went through a name change as you’ll see below. If you know me, you’re aware that I’m somewhat of a stickler on branding.  As such, I try to get it right as much as possible myself, and I’ll sometimes point out corrections to others in an effort to lessen, rather than perpetuate, confusion.

Another product feature launched around the same time is the vSphere Storage Appliance or VSA for short.  In my brief experience with both products I’ve mentioned so far, I find it’s not uncommon for people to confuse the two and assume SRM replication has a dependency on the VSA.  This is not the case: they are quite independent.  In fact, one of the biggest selling points of SRM based replication is that it works with any VMware vSphere certified storage and protocol.  If you think about it for a minute, this becomes a pretty powerful option for getting a DR site set up with the storage you have today.  It also allows you to get SRM in the door on the same basis, with the ability to grow into scalable array based replication in an upcoming budget cycle.

With that out of the way, here’s a glimpse at the SRM 5.0 native replication components and terminology (both beta and GA).

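| Beta Name      | GA Name                               | GA Acronym |
|----------------|---------------------------------------|------------|
| HBR            | vSphere Replication                   | VR         |
| HMS            | vSphere Replication Management Server | vRMS       |
| HBR server     | vSphere Replication Server            | vRS        |
| ESXi HBR agent | vSphere Replication Agent             | vR agent   |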


Here is a look at how the SRM based replication pieces fit in the SRM 5.0 architecture.  Note the storage objects shown are VMFS datastores, but they could be VMFS or NFS datastores on either side:

[Diagram: SRM 5.0 architecture showing the vSphere Replication components]

Diagram courtesy VMware, Inc.

To review, the benefits of vSphere Replication are:

  1. No requirement for enterprise array based replication at both sites.
  2. Replication between heterogeneous storage, whatever that storage vendor or protocol might be at each site (so long as it’s supported on the HCL).
  3. Per VM replication. I didn’t mention this earlier but it’s another distinct advantage of VR over per datastore replication.
  4. It’s included in the cost of SRM licensing. No extra VMware or array based replication licenses are needed.

Do note that access to the VR feature is by way of a separate installable component of SRM 5.0.  If you haven’t already installed the component during the initial SRM installation, you can do so afterwards by running the SRM 5.0 setup routine again at each site.

I’ve talked about the advantages of VR.  Again, I think they are a big enabler for small to medium sized businesses and I applaud VMware for offering this component which is critical to the best possible RPO and RTO.  But what about the disadvantages compared to array based replication?  In no particular order:

  1. Cannot replicate templates.  The ‘why’ comes next.
  2. Cannot replicate powered off virtual machines.  The ‘why’ for this follows.
  3. Cannot replicate files which don’t change (powered off VMs, ISOs, etc.)  This is because replications are handled by the vRA component – a shim in vSphere’s storage stack deployed on each ESX(i) host.  By the way, Changed Block Tracking (CBT) and VMware snapshots are not used by the vRA.  The mechanism uses a bandwidth efficient technology similar to CBT but it’s worth pointing out it is not CBT.  Another item to note here is that VMs which are shut down won’t replicate writes during the shutdown process.  This is fundamentally because only VMs which are powered on and stay powered on are replicated by VR.  Current state of the VM would, however, be replicated once the VM is powered back on.
  4. Cannot replicate FT VMs. Note that array based replication can be used to protect FT VMs but once recovered they are no longer FT enabled.
  5. Cannot replicate linked clone trees (Lab Manager, vCD, View, etc.)
  6. Array based replication will replicate a VMware based snapshot hierarchy to the destination site and leave it intact. VR can replicate VMs with snapshots but they will be consolidated at the destination site.  This is again based on the principle that only changes are replicated to the destination site.
  7. Cannot replicate vApp consistency groups.
  8. VR does not work with virtual disks opened in “multi-writer mode” which is how MSCS VMs are configured.
  9. VR can only be used with SRM.  It can’t be used as a data replication solution for your vSphere environment outside of SRM.
  10. Losing a vSphere host means that the vRA and the current replication state of a VM or VMs is also lost.  In the event of HA failover, a full-sync must be performed for these VMs once they are powered on at the new host (and vRA).
  11. The number of VMs which can be replicated with VR will likely be less than array based replication depending on the storage array you’re comparing to.  In the beta, VR supported 100 VMs.  At GA, SRM 5.0 supports up to 500 VMs with vSphere Replication. (Thanks Greg)
  12. In band VR requires additional open TCP ports:
    1. 31031 for initial replication
    2. 44046 for ongoing replication
  13. VR requires vSphere 5 hosts at both the protected and recovery sites while array based replication follows only general SRM 5.0 minimum requirements of vCenter 5.0 and hosts which can be 3.5, 4.x, and/or 5.0.
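
Since item 12 comes down to firewall rules, here is a minimal Python sketch for checking whether those two TCP ports are reachable from a given machine. The port numbers come from the list above; everything else (the function names, the timeout, and the example host name in the trailing comment) is my own assumption for illustration, and which machine actually listens on these ports depends on your deployment.

```python
import socket

# Port numbers as listed in the post; the descriptions are just labels.
VR_PORTS = {31031: "initial replication", 44046: "ongoing replication"}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_vr_ports(host, ports=VR_PORTS):
    """Map each port to True/False depending on whether it is reachable."""
    return {port: port_is_open(host, port) for port in ports}


# Example usage (the host name is a hypothetical placeholder):
# for port, ok in check_vr_ports("vrs.recovery-site.example.com").items():
#     print(port, VR_PORTS[port], "open" if ok else "blocked or closed")
```

A successful connection only shows that something is listening and that no firewall in the path dropped the attempt. It says nothing about whether replication itself is healthy.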

The list of disadvantages appears long but don’t let that stop you from taking a serious look at SRM 5.0 and vSphere Replication.  I don’t think there are many, if any, showstoppers in that list for small to medium businesses.

I hope you find this useful.  I gathered the information from various sources, much of it from an SRM Beta FAQ which, to the best of my knowledge, still holds true in the GA release.  If you find any errors or would like to offer corrections or additions, as always please feel free to use the Comments section below.



  1. The list of disadvantages of vSphere Replication versus array replication is even longer:
    vSphere Replication does not support automated failback. It also has file level consistency only (no application consistency). vSphere Replication needs ESXi 5. vSphere 4 or lower hosts are not supported.

  2. jason says:

    Thanks Marcel. The ESXi 5 requirement was covered in bullet 13.


  3. latoga says:

    Been dealing with SRM 5.0 recently with a client of mine…

    RE VR limits: SRM 5.0 supports up to 500 VMs with vSphere Replication (see Admin Guide page 25).

  4. jason says:

    Thanks for the update Greg!

  5. Tomas Fojta says:

    I would also mention that synchronous replication is not supported by VR. With vSphere Replication RPO is 15 min or higher.

  6. LP says:

    Referring to point 10 under the disadvantages section, the full re-sync after HA: is this also the case if the VM vMotions to another host, i.e. will this invoke a full re-sync?

  7. jason says:

    From the SRM Beta FAQ:
    “If no changes to the disk have been made ‘behind the back’ of the VR infrastructure, then only deltas (sets of changed blocks) are sent to complete a new copy. If the disks are changed outside of the VR infrastructure (or if the VM is new to VR), the VR system will compare the disks (without sending the content, just checksums between sites) to re-establish its state, and then send deltas going forward. This complete disk sync step involves reading the entire disk at each site, but matching disk blocks will not be re-sent.”

  8. Ian Masters says:

    What do you do if you have physical servers that also need protecting and do not wish to have multiple products running for HA and DR?

  9. Bert says:

    Can anyone tell me how the replication works?
    – Are only the changes replicated
    – What line capacity is needed (are there any numbers)
    – What if the line is interrupted? does SRM start a new full replication?
    – Is the data encrypted? Or how is security provided?

    I cannot find these answers with VMware.

  10. David says:

    I’m with Bert. These answers seem hard to find.
    I would like to set this up between sites connected via VPN and some of the sites only have 1.5 mb/s of bandwidth. Also what doesn’t seem to be covered is the IP addressing for each site. If running SRM replication via VPN, I would think you would create the internal IP addressing of your DR site with a different subnet. If you then need to fail over to the replicated VMs, you would need to change your public DNS since each site would have different public IPs, as well as the internal DNS A records for the replicated VMs. Then switch back when failing back to the protected site. Does anyone have any info on these questions as well as Bert’s? Can you stop the replication and restart it until the replication completes if you have limited bandwidth?
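
To make the SRM Beta FAQ passage quoted in the comments a little more concrete, here is a rough Python sketch of a checksum-based sync: both sides hash their disk blocks, only the checksums are compared, and only blocks that differ are sent. This is not VR's actual implementation. The block size, the choice of SHA-256 and all function names are assumptions for illustration. The last function is plain arithmetic for estimating transfer time on a slow link, and it ignores protocol overhead, latency and compression.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, not VR's actual value


def block_checksums(disk, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of a disk image (bytes)."""
    return [
        hashlib.sha256(disk[i:i + block_size]).digest()
        for i in range(0, len(disk), block_size)
    ]


def blocks_to_send(source, target, block_size=BLOCK_SIZE):
    """Indexes of source blocks whose checksum differs from the target's.

    Only checksums are compared; blocks that already match are skipped.
    """
    src = block_checksums(source, block_size)
    dst = block_checksums(target, block_size)
    return [i for i, digest in enumerate(src) if i >= len(dst) or digest != dst[i]]


def transfer_seconds(num_bytes, link_bits_per_second):
    """Idealized time to move num_bytes over a link of the given bit rate."""
    return num_bytes * 8 / link_bits_per_second


# Example: a 16-block replica, with two blocks changed on the protected side.
replica = bytes(16 * BLOCK_SIZE)
protected = bytearray(replica)
protected[5 * BLOCK_SIZE] = 0xFF
protected[9 * BLOCK_SIZE + 10] = 0xFF
changed = blocks_to_send(bytes(protected), replica)  # blocks 5 and 9
```

As a feel for the numbers, and assuming "1.5 mb/s" means 1.5 megabits per second, `transfer_seconds(10 * 10**9, 1_500_000)` comes to roughly 14.8 hours for a full 10 GB copy. That is why skipping matching blocks matters so much on a small link.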