Reducing FT logging traffic for disk read intensive workloads

October 28th, 2010 by jason

I was researching FT documentation to learn more about asymmetric logging traffic between primary and secondary FT VMs when I stumbled onto a KB article the document mentioned. VMware KB 1011965 describes changing the traffic pattern on the FT logging network, which is particularly helpful for an FT protected VM with heavy disk reads. Normally, all disk I/O traverses the FT logging network from the primary to the secondary VM. For FT protected VMs with read-heavy disk I/O patterns, the FT logging network may become saturated, depending on its bandwidth (1Gb vs. 10Gb) and on the number of protected VMs sharing that network; this applies not only between two hosts, but between all the hosts in the cluster, or perhaps across clusters, depending on how far the FT network is stretched. The workaround makes the secondary VM issue disk reads directly to the shared disk (out of band) instead of receiving that data over the FT logging network, while staying within vLockstep tolerances.

Given the many restrictions on FT, particularly the single-vCPU requirement, you may not have run into FT logging network saturation. However, once some of these restrictions are lifted, I expect disk I/O on FT protected VMs to scale up. FT will become more popular, and I can see this tweak coming in handy, particularly for those looking to get more mileage out of the 1Gb network infrastructure their FT networks are tied to.

The workaround in the KB article is applied at the VM level by adding the following line to the .vmx configuration:

replay.logReadData = checksum

In addition, the VM must be powered off before making the .vmx change, and then unregistered and re-registered on the host after the change is made.
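If you have more than a couple of FT protected VMs to touch, the .vmx edit itself is easy to script. Below is a minimal sketch in Python, assuming the .vmx files are reachable from wherever the script runs (the helper name and paths are mine, not from the KB article); the power-off and the unregister/re-register steps still have to be performed on the host.

```python
# Hypothetical helper: append the KB 1011965 workaround to a .vmx file
# if the key is not already set. Assumes the VM is powered off and the
# file is writable from wherever this runs.
from pathlib import Path

FT_READ_TWEAK = "replay.logReadData = checksum"

def add_ft_read_tweak(vmx_path: str) -> bool:
    """Add the replay.logReadData setting; return True if the file changed."""
    path = Path(vmx_path)
    lines = path.read_text().splitlines()
    # Skip the file if replay.logReadData is already configured in any form.
    if any(line.split("=")[0].strip() == "replay.logReadData" for line in lines):
        return False
    lines.append(FT_READ_TWEAK)
    path.write_text("\n".join(lines) + "\n")
    return True
```

Run it against each VM's .vmx on the datastore after powering the VM off; the function is idempotent, so re-running it on an already-tweaked file is harmless.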

I wouldn’t call the configuration itself very scalable, as it’s hidden and could become an administrative burden to document and track. Perhaps we’ll see this tweak move to a spot somewhere in the GUI, and maybe gain the option of becoming a host- or cluster-level configuration.
