Posts Tagged ‘Security’

Site Recovery Manager Firewall Rules for Windows Server

April 29th, 2020

I have a hunch Google sent you here. Before we get to what you’re looking for, I’m going to digress a little. tl;dr folks, feel free to jump straight to the firewall rules further down.

Since the Industrial Revolution, VMware has supported Microsoft Windows and SQL Server platforms to back datacenter and cloud infrastructure products such as vCenter Server, Site Recovery Manager, vCloud Director (recently rebranded to VMware Cloud Director), and so on. However, if you’ve been paying attention to product documentation and compatibility guides, you will have noticed support for Microsoft platforms diminishing in favor of easy-to-deploy appliances based on Photon OS and VMware Postgres (vPostgres). This is a good thing – spoken by a salty IT veteran with a strong Windows background.

2019 is where we really hit a brick wall. vCenter Server 6.7 is the last version that supports installation on Windows, and that support ended with Windows Server 2016 – there was never support for Windows Server 2019 (reference VMware KB 2091273 – Supported host operating systems for VMware vCenter Server installation). In vSphere 7.0, vCenter Server for Windows has been removed and support is not available. For more information, see Farewell, vCenter Server for Windows. Likewise, Microsoft SQL Server 2016 was the last SQL Server version supported as a vCenter Server database (matrix reference).

Site Recovery Manager (SRM) is in the same boat. It was born and bred on Windows and SQL Server back ends. But once again we find a Photon OS-based appliance with embedded vPostgres available, along with product documentation which highlights diminishing support for Microsoft Windows and SQL.

Taking a closer look at the documentation…

Compatibility Matrices for VMware Site Recovery Manager 8.2

Compatibility Matrices for VMware Site Recovery Manager 8.3

  • “Site Recovery Manager Server 8.3 supports the same Windows host operating systems that vCenter Server 7.0 supports.” SRM 8.3 supports vCenter Server 6.7 as well, so that should have been included here too; leaving it out was probably an oversight.
  • Supported host operating systems for VMware vCenter Server installation (2091273)
  • Takeaway: vCenter Server 7 cannot be installed on Windows. This would seem to imply SRM 8.3 supports no version of Windows Server for installation. That implication is not at all correct: SRM 8.3 still ships as a Windows executable installation for vSphere 6.x environments. Not a great spot to be in, since the Photon OS-based SRM appliance employs a completely different Storage Replication Adapter (SRA) than the Windows installation and not all storage vendors support both (yet).

Ignoring the labyrinth of supported product and platform compatibility matrices above, one may choose to forge ahead and install SRM on Windows Server 2019 anyway. I’ve done it several times in the lab, but there was a notable takeaway.

When I logged into the vSphere Client, the SRM plug-in was not visible. In my travels, there are a few reasons why this symptom can occur.

  • The SRM services are not started.
  • The logged on user account is not a member of the SRM Administrators group (yes, even super users like administrator@vsphere.local need to be added to this group for SRM management).
  • The Windows Firewall is blocking ports used to present the plug-in.

Wait, what? The Windows Firewall wasn’t typically a problem in the past. That is correct. The SRM installation does create four inbound Windows Firewall rules (none outbound) on Windows Server up through 2016. However, for whatever reason, the SRM installation does not create these needed firewall rules on Windows Server 2019. The lack of firewall rules allowing SRM-related traffic is what blocks the plug-in. Reference Network Ports for Site Recovery Manager.

One obvious workaround here would be to disable the Windows Firewall, but what fun would that be? It may also violate IT security plans, trigger an audit, or require exception filings. Been there, done that, ish. Let’s dig a little deeper.

The four inbound Windows Firewall rules ultimately wind up in the Windows registry. A registry export of the four rules actually created by an SRM installation is shown below. Through trial and error I’ve found that importing the rules into the Windows registry with a .reg file results in broken rules, so for now I would not recommend that method.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\FirewallRules]
"{B37EDE84-6AC1-4F7D-9E42-FA44B8D928D0}"="v2.26|Action=Allow|Active=TRUE|Dir=In|App=C:\\Program Files\\VMware\\VMware vCenter Site Recovery Manager\\bin\\vmware-dr.exe|Name=VMware vCenter Site Recovery Manager|Desc=This feature allows connections using VMware vCenter Site Recovery Manager.|"
"{F6EAE3B7-C58F-4582-816B-C3610411215B}"="v2.26|Action=Allow|Active=TRUE|Dir=In|Protocol=6|LPort=9086|Name=VMware vCenter Site Recovery Manager - HTTPS Listener Port|"
"{F6E7BE93-C469-4CE6-80C4-7069626636B0}"="v2.26|Action=Allow|Active=TRUE|Dir=In|App=C:\\Program Files\\VMware\\VMware vCenter Site Recovery Manager\\external\\commons-daemon\\prunsrv.exe|Name=VMware vCenter Site Recovery Manager Client|Desc=This feature allows connections using VMware vCenter Site Recovery Manager Client.|"
"{66BF278D-5EF4-4E5F-BD9E-58E88719FA8E}"="v2.26|Action=Allow|Active=TRUE|Dir=In|Protocol=6|LPort=443|Name=VMware vCenter Site Recovery Manager Client - HTTPS Listener Port|"

The four rules needed can be created by hand in the Windows Firewall UI, configured centrally via Group Policy Object (GPO), or scripted with netsh or PowerShell. I’ve chosen PowerShell and created the script below for the purpose of adding the rules. Pay close attention: two of these rules are application-path specific. Change the drive letter and path to the applications as necessary or the two rules won’t work properly.

# Run this PowerShell script directly on a Windows based VMware Site
# Recovery Manager server to add four inbound Windows firewall rules
# needed for SRM functionality.
# Jason Boche
# http://boche.net
# 4/29/20

New-NetFirewallRule -DisplayName "VMware vCenter Site Recovery Manager" -Description "This feature allows connections using VMware vCenter Site Recovery Manager." -Direction Inbound -Program "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\bin\vmware-dr.exe" -Action Allow

New-NetFirewallRule -DisplayName "VMware vCenter Site Recovery Manager - HTTPS Listener Port" -Direction Inbound -LocalPort 9086 -Protocol TCP -Action Allow

New-NetFirewallRule -DisplayName "VMware vCenter Site Recovery Manager Client" -Description "This feature allows connections using VMware vCenter Site Recovery Manager Client." -Direction Inbound -Program "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\external\commons-daemon\prunsrv.exe" -Action Allow

New-NetFirewallRule -DisplayName "VMware vCenter Site Recovery Manager Client - HTTPS Listener Port" -Direction Inbound -LocalPort 443 -Protocol TCP -Action Allow

Test execution was a success.

PS S:\PowerShell scripts> .\srmaddwindowsfirewallrules.ps1


Name                  : {10ba5bb3-6503-44f8-aad3-2f0253c980a6}
DisplayName           : VMware vCenter Site Recovery Manager
Description           : This feature allows connections using VMware vCenter Site Recovery Manager.
DisplayGroup          :
Group                 :
Enabled               : True
Profile               : Any
Platform              : {}
Direction             : Inbound
Action                : Allow
EdgeTraversalPolicy   : Block
LooseSourceMapping    : False
LocalOnlyMapping      : False
Owner                 :
PrimaryStatus         : OK
Status                : The rule was parsed successfully from the store. (65536)
EnforcementStatus     : NotApplicable
PolicyStoreSource     : PersistentStore
PolicyStoreSourceType : Local

Name                  : {ea88dee8-8c96-4218-a23d-8523e114d2a9}
DisplayName           : VMware vCenter Site Recovery Manager - HTTPS Listener Port
Description           :
DisplayGroup          :
Group                 :
Enabled               : True
Profile               : Any
Platform              : {}
Direction             : Inbound
Action                : Allow
EdgeTraversalPolicy   : Block
LooseSourceMapping    : False
LocalOnlyMapping      : False
Owner                 :
PrimaryStatus         : OK
Status                : The rule was parsed successfully from the store. (65536)
EnforcementStatus     : NotApplicable
PolicyStoreSource     : PersistentStore
PolicyStoreSourceType : Local

Name                  : {a707a4b8-b0fd-4138-9ffa-2117c51e8ed4}
DisplayName           : VMware vCenter Site Recovery Manager Client
Description           : This feature allows connections using VMware vCenter Site Recovery Manager Client.
DisplayGroup          :
Group                 :
Enabled               : True
Profile               : Any
Platform              : {}
Direction             : Inbound
Action                : Allow
EdgeTraversalPolicy   : Block
LooseSourceMapping    : False
LocalOnlyMapping      : False
Owner                 :
PrimaryStatus         : OK
Status                : The rule was parsed successfully from the store. (65536)
EnforcementStatus     : NotApplicable
PolicyStoreSource     : PersistentStore
PolicyStoreSourceType : Local

Name                  : {346ece5b-01a9-4a82-9598-9dfab8cbfcda}
DisplayName           : VMware vCenter Site Recovery Manager Client - HTTPS Listener Port
Description           :
DisplayGroup          :
Group                 :
Enabled               : True
Profile               : Any
Platform              : {}
Direction             : Inbound
Action                : Allow
EdgeTraversalPolicy   : Block
LooseSourceMapping    : False
LocalOnlyMapping      : False
Owner                 :
PrimaryStatus         : OK
Status                : The rule was parsed successfully from the store. (65536)
EnforcementStatus     : NotApplicable
PolicyStoreSource     : PersistentStore
PolicyStoreSourceType : Local



PS S:\PowerShell scripts>

After logging out of the vSphere Client and logging back in, the Site Recovery plug-in loads and is available.

Feel free to use this script but be advised, as with anything from this site, it comes without warranty. Practice due diligence. Test in a lab first. Etc.

With the virtual appliance being fairly mainstream at this point, this article probably won’t age well, but someone may end up here. Maybe me. It has happened before.

vCloud Director 5.6.4 Remote consoleproxy issues

June 12th, 2015

vCloud Director is a wonderful IaaS addition to any lab, development, or production environment. When it’s working properly, it is a very satisfying experience wielding the power of agility, consistency, and efficiency vCD provides. However, like many things tech with upstream and human dependencies, it can and does break, particularly in lab or lesser-maintained environments that don’t get all the care and feeding production environments benefit from. When it breaks, it’s not nearly as much fun.

This week I ran into what seemed like a convergence of issues with vCD 5.6.4 relating to the Remote Console functionality in conjunction with SSL certificates, various browser types, networking, and 32-bit Java. As is often the case, what I’m documenting here is really more for my own future benefit, since I covered a number of sparsely documented areas I won’t necessarily retain in memory for long. But as it goes with blogs and information sharing, sharing is caring.

The starting point was a functional vCD 5.6.4-2496071 environment on vSphere 5.5. Everything had been working normally, historically and to date, with the exception of the vCD console, which had recently stopped working in the Firefox and Google Chrome browsers. Opening the console in either browser from seemingly any client workstation yielded the pop-out console window with toolbar buttons along the top, but there was no guest OS console painted in the main window area. It was blank. The status of the console would almost immediately change to Disconnected. I’ve dealt with permutations of this in the past and I verified all of the usual suspects: NTP, DNS, LDAP, storage capacity, 32-bit Java version, blocked browser plug-ins, etc. No dice here.

In Firefox, the vCD console status shows Disconnected while the Inspect Element console shows repeated failed attempts to connect to the consoleproxy address.

10:11:30.195 "10:11:30 AM [TRACE] mks-connection: Connecting to wss://172.16.21.151/902;cst-t3A6SwOSPRuUqIz18QAM1Wrz6jDGlWrrTlaxH8k6aYuBKilv/1mc7ap50x3sPiHiSJYoVhyjlaVuf6vKfvDPAlq2yukO7qzHdfUTsWvgiZISK56Q4r/4ZkD7xWBltn15s5AvTSSHKsVbByMshNd9ABjBBzJMcqrVa8M02psr2muBmfro4ZySvRqn/kKRgBZhhQEjg6uAHaqwvz7VSX3MhnR6MCWbfO4KhxhImpQVFYVkGJ7panbjxSlXrAjEUif7roGPRfhESBGLpiiGe8cjfjb7TzqtMGCcKPO7NBxhgqU=-R5RVy5hiyYhV3Y4j4GZWSL+AiRyf/GoW7TkaQg==--tp-B5:85:69:FF:C3:0A:39:36:77:F0:4F:7C:CA:5F:FE:B1:67:21:61:53--"1 debug.js:18:12

10:11:30.263 Firefox can't establish a connection to the server at wss://172.16.21.151/902;cst-t3A6SwOSPRuUqIz18QAM1Wrz6jDGlWrrTlaxH8k6aYuBKilv/1mc7ap50x3sPiHiSJYoVhyjlaVuf6vKfvDPAlq2yukO7qzHdfUTsWvgiZISK56Q4r/4ZkD7xWBltn15s5AvTSSHKsVbByMshNd9ABjBBzJMcqrVa8M02psr2muBmfro4ZySvRqn/kKRgBZhhQEjg6uAHaqwvz7VSX3MhnR6MCWbfO4KhxhImpQVFYVkGJ7panbjxSlXrAjEUif7roGPRfhESBGLpiiGe8cjfjb7TzqtMGCcKPO7NBxhgqU=-R5RVy5hiyYhV3Y4j4GZWSL+AiRyf/GoW7TkaQg==--tp-B5:85:69:FF:C3:0A:39:36:77:F0:4F:7C:CA:5F:FE:B1:67:21:61:53--.1 wmks.js:321:0

tail -f /opt/vmware/vcloud-director/logs/vcloud-container-debug.log |grep consoleproxy revealed:
2015-06-12 10:50:54,808 | DEBUG    | consoleproxy              | SimpleProxyConnectionHandler   | Initiated handling for channel 0x22c9c990 [java.nio.channels.SocketChannel[connected local=/172.16.21.151:443 remote=/172.31.101.6:61719]] |
2015-06-12 10:50:54,854 | DEBUG    | consoleproxy              | ReadOperation                  | IOException while reading data: java.io.IOException: Broken pipe |
2015-06-12 10:50:54,855 | DEBUG    | consoleproxy              | ChannelContext                 | Closing channel java.nio.channels.SocketChannel[connected local=/172.16.21.151:443 remote=/172.31.101.6:61719] |
2015-06-12 10:50:55,595 | DEBUG    | consoleproxy              | SimpleProxyConnectionHandler   | Initiated handling for channel 0xd191a58 [java.nio.channels.SocketChannel[connected local=/172.16.21.151:443 remote=/172.31.101.6:61720]] |
2015-06-12 10:50:55,648 | DEBUG    | pool-consoleproxy-4-thread-289 | SSLHandshakeTask               | Exception during handshake: java.io.IOException: Broken pipe |
2015-06-12 10:50:56,949 | DEBUG    | consoleproxy              | SimpleProxyConnectionHandler   | Initiated handling for channel 0x3f0c025b [java.nio.channels.SocketChannel[connected local=/172.16.21.151:443 remote=/172.31.101.6:61721]] |
2015-06-12 10:50:57,003 | DEBUG    | pool-consoleproxy-4-thread-301 | SSLHandshakeTask               | Exception during handshake: java.io.IOException: Broken pipe |
2015-06-12 10:50:59,902 | DEBUG    | consoleproxy              | SimpleProxyConnectionHandler   | Initiated handling for channel 0x1bda3590 [java.nio.channels.SocketChannel[connected local=/172.16.21.151:443 remote=/172.31.101.6:61723]] |
2015-06-12 10:50:59,959 | DEBUG    | pool-consoleproxy-4-thread-295 | SSLHandshakeTask               | Exception during handshake: java.io.IOException: Broken pipe |

In Google Chrome, the vCD console status shows Disconnected while the Inspect element console (F12) shows repeated failed attempts to connect to the consoleproxy address.

10:26:43 AM [TRACE] init: attempting ticket acquisition for vm vcdclient
10:26:44 AM [TRACE] plugin: Connecting vm
10:26:44 AM [TRACE] mks-connection: Connecting to wss://172.16.21.151/902;cst-f2eeAr8lNU6BTmeVelt9L8VKoe92kJJMxZCC2watauBV6/x…fmI8Xg==--tp-B5:85:69:FF:C3:0A:39:36:77:F0:4F:7C:CA:5F:FE:B1:67:21:61:53--
WebSocket connection to 'wss://172.16.21.151/902;cst-f2eeAr8lNU6BTmeVelt9L8VKoe92kJJMxZCC2watauBV6/x…fmI8Xg==--tp-B5:85:69:FF:C3:0A:39:36:77:F0:4F:7C:CA:5F:FE:B1:67:21:61:53--' failed: WebSocket opening handshake was canceled
10:26:46 AM [ERROR] mks-console: Error occurred: [object Event]
10:26:46 AM [TRACE] mks-connection: Disconnected [object Object]

tail -f /opt/vmware/vcloud-director/logs/vcloud-container-debug.log |grep consoleproxy revealed:
2015-06-12 10:48:35,760 | DEBUG    | consoleproxy              | SimpleProxyConnectionHandler   | Initiated handling for channel 0x55efffb3 [java.nio.channels.SocketChannel[connected local=/172.16.21.151:443 remote=/172.31.101.6:61675]] |
2015-06-12 10:48:39,754 | DEBUG    | consoleproxy              | SimpleProxyConnectionHandler   | Initiated handling for channel 0x3f123a13 [java.nio.channels.SocketChannel[connected local=/172.16.21.151:443 remote=/172.31.101.6:61677]] |
2015-06-12 10:48:42,658 | DEBUG    | consoleproxy              | SimpleProxyConnectionHandler   | Initiated handling for channel 0x7793f0a [java.nio.channels.SocketChannel[connected local=/172.16.21.151:443 remote=/172.31.101.6:61679]] |

If you have acute attention to detail, you’ll notice the time stamps from the cell logs don’t correlate closely with the time stamps from the browser Inspect Element console. Normally this would indicate time skew or an NTP issue, which does cause major headaches with functionality, but that’s by design here: my various screen captures and log examples simply weren’t taken at the exact same point in time. So it’s safe to move on.

Looking at the most recent vCloud Director For Service Providers installation documentation, I noticed a few things.

  1. Although I did upgrade vCD a few months ago to the most current build at the time, there’s a newer build available: 5.6.4-2619597
  2. Through repetition, I’ve gotten quite comfortable with the use of Java keytool and its parameters. However, additional parameters have been added to the recommended use of the tool. Noted going forward.
  3. VMware self signed certificates expire within three (3) months. Self signed certificates were in use in this environment. I haven’t noticed this behavior in the past nor has it presented itself as an issue but after a quick review, the self signed certificates generated a few months ago with the vCD upgrade had indeed expired recently.

At this point I was quite sure the expired certificates were the problem, although it seemed strange that the vCD portal was still usable while only the consoleproxy was giving me fits. So I went through the two-minute process of regenerating and installing new self-signed certificates for both http and the consoleproxy. The vCD installation guide more or less outlines this process as it is the same for a new cell installation as it is for replacing certificates. VMware also has a few KB articles which address it as well (1026309, 2014237). For those going through this process, you should really note the keytool parameter changes/additions in the vCD installation guide.
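
For reference, the regeneration boils down to a couple of keytool commands run against the cell’s keystore. The commands below are only a rough sketch with placeholder keystore path, passwords, and -dname values (and a three year -validity rather than the short default); defer to the vCD installation guide for the exact parameters VMware recommends.

# Rough sketch only: regenerate self-signed certificates for both vCD endpoints
# with a three-year validity. Keystore path, passwords, and -dname values below
# are placeholders, not values taken from the installation guide.
keytool -keystore /opt/keystore/certificates.ks -storetype JCEKS \
  -storepass 'changeme' -keypass 'changeme' \
  -genkey -keyalg RSA -keysize 2048 -validity 1095 -alias http \
  -dname "CN=vcd-http.example.com, OU=Lab, O=Example, C=US"

keytool -keystore /opt/keystore/certificates.ks -storetype JCEKS \
  -storepass 'changeme' -keypass 'changeme' \
  -genkey -keyalg RSA -keysize 2048 -validity 1095 -alias consoleproxy \
  -dname "CN=vcd-proxy.example.com, OU=Lab, O=Example, C=US"

The resulting keystore is what gets referenced when the cell configuration is re-run so that both the http and consoleproxy services present the new certificates.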

While I was at it, I also built a new replacement cell on a newer RHEL 6.5 release, performed the database upgrades, extended the self-signed certificate default expiration from three months to three years, and retired the older RHEL 6.4 cell. Fresh new cell. New certs. Ready to rock and roll.

Not so much. I still had the same problem with the console showing Disconnected. However, the Inspect Element console in each browser was now indicating a new error message, which I don’t have handy at the moment, but basically it couldn’t talk to the consoleproxy address at all. I tried to ping the address and it was dead from a remote station’s point of view, although it was quite alive at the RHEL 6.5 command prompt. Peters Virtual Notes had this one covered thankfully. According to https://access.redhat.com/site/solutions/53031, a small change is needed in the file /etc/sysctl.conf.

net.ipv4.conf.default.rp_filter = 1

must be changed to

net.ipv4.conf.default.rp_filter = 2
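
To make and apply that change without a reboot, something like the following works on the cell (assuming the stock /etc/sysctl.conf entry shown above):

# Flip the default rp_filter value from strict (1) to loose (2) and reload.
sed -i 's/^net.ipv4.conf.default.rp_filter = 1$/net.ipv4.conf.default.rp_filter = 2/' /etc/sysctl.conf
sysctl -p                                  # re-read /etc/sysctl.conf
sysctl net.ipv4.conf.default.rp_filter     # confirm the running value is now 2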

Success. Surely consoleproxy will work now. Unfortunately, it still did not want to work. Back to the java.io.IOException: Broken pipe SSL handshake issues, although the new certificate for vCD’s http address was registered and working fine (remembering again that each vCD cell has two IP addresses, one for http access and one for consoleproxy functionality – each requires a trusted SSL certificate or an exception).
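
A quick way to confirm exactly which certificate the consoleproxy address is handing out is an OpenSSL check from any client that can reach it (the IP below is the consoleproxy address from the logs above; adjust to suit your cell):

# Show the subject, issuer, and validity dates of the certificate presented
# by the consoleproxy listener.
echo | openssl s_client -connect 172.16.21.151:443 -showcerts 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates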

The last piece of the puzzle was something I have never had to do in the past and that is to manually add an exception (Firefox) for the consoleproxy self signed certificate and install it (Google Chrome). For each browser, this is a slightly different process.

For Firefox, browse to the https:// address of the consoleproxy. Don’t worry that nothing visible is displayed; the consoleproxy shows nothing when it does not receive a properly formatted request. The key here is to add an exception for the certificate associated specifically with the consoleproxy address.

Once this certificate exception is added, the consoleproxy certificate is essentially trusted and so is the IP address for the host and the console service it is supposed to provide.

To resolve the consoleproxy issue for Google Chrome, the process is slightly different. Ironically, I found it easiest to use Internet Explorer for this. Open Internet Explorer, and when you do so, be sure to right click the IE shortcut and choose Run as administrator (this is key in a moment). Browse to the https:// address of the consoleproxy; again, nothing visible should be displayed when it does not receive a properly formatted request. Continue to this website and then use the Certificate Error status message in the address bar to view the certificate being presented. The self-signed consoleproxy certificate needs to be installed. Start this task using the Install Certificate button. This button is typically missing when launching IE normally, but it is revealed when launching IE with Run as administrator rights.

Browse for the location to install the self signed certificate. Tick the box Show physical stores. Drill down under Third-Party Root Certification Authorities. Install the certificate in the Local Computer folder. This folder is typically missing when launching IE normally but it is revealed when launching IE with Run as administrator rights.

Once this certificate is installed, the consoleproxy certificate is essentially trusted in Google Chrome. Just as with the Firefox remedy, the Java SSL handshake with the consoleproxy succeeds and the vCD remote console is rendered.

Note that for Google Chrome, there is another quick method to temporarily gain functional console access without installing the consoleproxy certificate via Internet Explorer.

  1. Open a Google Chrome browser and browse to the https:// address of the consoleproxy.
  2. When prompted with Your connection is not private, click the Advanced link.
  3. Click the Proceed to <console proxy IP address> (unsafe) link.
  4. Nothing will visibly happen except Google Chrome will now temporarily trust the consoleproxy certificate and the vCD remote console will function for as long as a Google Chrome tab remains open.
  5. Without closing Google Chrome, now continue into the vCD organization portal and resume business as usual with functional remote consoles.

On the topic of Google Chrome, internet searches will quickly reveal vCloud Director console issues with Google Chrome and NPAPI. VMware discusses this in the vCloud Director 5.5.2.1 Release Notes:

Attempts to open a virtual machine console on Google Chrome fail
When you attempt to open a virtual machine console on a Google Chrome browser, the operation fails. This occurs due to the deprecation of NPAPI in Google Chrome. vCloud Director 5.5.2.1 uses WebMKS instead of the VMware Remote Console to open virtual machine consoles in Google Chrome, which resolves this issue.

I was working with vCD 5.6.x, which leverages WebMKS in lieu of NPAPI, so the NPAPI issue was not relevant in this case. But if you are running into an NPAPI issue, Google offers How to temporarily enable NPAPI plugins here.

Update 8/8/15: Josiah points out a useful VMware forum thread which may help resolve this issue further when FQDNs are defined in DNS for remote console proxies or where multiple vCloud cells are installed in a cluster behind a front end load balancer, NAT/reverse proxy, or firewall.

Update 7/17/20: The VMware Cloud Director virtual appliance with embedded PostgreSQL database by default uses eth0 for the console proxy address along with port 8443, i.e. https://100.88.144.13:8443. This is the URL that must be trusted in order to open a VMware Cloud Director remote console without the dreaded Disconnected message. Find the address and port combination to trust in a Disconnected console browser window by pressing SHIFT + CTRL + J or F12, which opens the browser’s developer tools. This information was previously published in VMware KB 2058496 Cannot connect to vCloud Director WebMKS console with Mozilla Firefox or Google Chrome, which has been taken down, but the cached version of the page still remains.

VMware vSphere Hardening Guides

June 7th, 2014

Quick security-related resource pointer on a Saturday morning. Over the years I’ve been collecting the various vSphere hardening guide documents as they are released. These guides can be used to lock down your own (or your customer’s) environment to prevent or isolate security-related breaches and to satisfy internal or external IT audits. Thanks to Mike Foley, I noticed the vSphere Hardening Guide 5.5 Update 1 was released yesterday. You’ll find adds/moves/changes in the following categories:

  • General (VCM, etc.)
  • SSO
  • ESXi
  • Virtual Machines
  • vCenter Server and VCSA
  • VUM (Update Manager)
  • vSphere Web Client

If you haven’t yet, grab the guide, take a look at it, and upgrade to vSphere 5.5 Update 1, hopefully in that order.

In the past I recall these guides were spread out somewhat sparsely on VMware’s site. What I hadn’t noticed until this morning is that VMware has now compiled all available vSphere hardening guide links onto a single landing page, in addition to providing change tracking between each of the vSphere 5.x guides, which I think is quite helpful.

Failed to connect to VMware Lookup Service

March 14th, 2014

Judging by the search results returned by Google, it looks like my blog is among the few remaining virtualization blogs that do not have a writeup on this topic.  It’s Friday so… why not.

Scenario:  vSphere 5.5 Update 1 VMware vSphere Web Client fails to log into vCenter Server (appliance version) with the following error returned:

Failed to connect to VMware Lookup Service

https://fqdn:7444/lookupservice/sdk –

SSL certificate verification failed.

Snagit Capture

Contributing factors in my case which may have played a role in this once working environment:

  1. Recently upgraded vCenter 5.5.0 Server appliance to Update 1 (unlikely as other similar environments were not impacted after upgrade)
  2. This particular vCenter appliance was deployed as a vApp from a vCloud Director catalog (likely, but unknown at this time whether a customization was possible or attempted during deployment)
  3. The hostname of the appliance may have been changed recently (very likely)

The solution is quite simple.

  1. Log into the vCenter Server appliance management interface (https://fqdn:5480/)
  2. Navigate to the Admin tab
  3. Certificate regeneration enabled: choose Yes
  4. Click the Submit button
  5. Navigate to the System tab
  6. Reboot the appliance

After the appliance reboots

  1. Log into the vCenter Server appliance management interface (https://fqdn:5480/)
  2. Navigate to the Admin tab
  3. Certificate regeneration enabled: choose No
  4. Click the Submit button
  5. Log out of the vCenter Server appliance management interface
  6. Log into the VMware vSphere Web Client normally

Admittedly, I recalled the Certificate regeneration feature first by logging into the vCenter Server appliance management interface, but then verified its purpose with a search. The search results turned up Failed to connect to VMware Lookup Service – SSL Certificate Verification Failed (among many other blog posts as mentioned earlier) in addition to VMware KB 2033338 Troubleshooting the vCenter Server Appliance with Single Sign-On login. Both more or less highlight a discrepancy between the appliance hostname and the SSL certificate, resulting in the need to regenerate the certificate to match the currently assigned hostname.

I ran across another issue this week during the Update 1 upgrade to the vCenter appliance which I may or may not get to writing about today.

At any rate, have a wonderful and Software Defined weekend!

vCenter Server Appliance 5.5 root account locked out after password expiration

January 10th, 2014

Thanks to Chris Colotti, I learned of a new VMware KB article today which could potentially have widespread impact, particularly in lab, development, or proof of concept environments.  The VMware KB article number is 2069041 and it is titled The vCenter Server Appliance 5.5 root account locked out after password expiration.

In summary, the root account of the vCenter Server Appliance version 5.5 becomes locked out 90 days after deployment or after a root account password change.  This behavior is by design and follows the security best practice of password rotation.  In this case, the required password rotation interval is 90 days, after which the account will be forcefully locked out if the password has not been changed.

The KB article describes processes to prevent a forced lockout as well as unlocking a locked out root account.
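
The KB article is the authoritative reference here, but as a rough sketch of the kind of commands involved from the appliance console (illustrative only, not the KB’s verbatim procedure):

# Illustrative sketch, not a substitute for KB 2069041.
chage -l root                 # show current password aging and expiration for root
passwd root                   # changing the password before the 90-day mark restarts the aging clock
pam_tally2 --user=root        # if already locked out, show the failed-login count
pam_tally2 --user=root --reset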

Approximately 90 days have elapsed since the release of vSphere 5.5 and I imagine this issue will quickly begin surfacing in large numbers where the vCenter Server Appliance 5.5 has been deployed using system defaults.

Update 6/16/16: For more information on vCenter Server Appliance password policies, including the local root account, check out vCSA 6.0 tricks: shell access, password expiration and certificate warnings.

Single Sign-On Warning 25000

November 12th, 2013

Up to this point, I’ve deployed several net new instances of vCenter Server 5.5 and of course its essential components, including Single Sign-On, Inventory Service, the next generation Web Client, and the legacy vSphere Client.  Most of these deployments leveraged the vCenter appliance.  Using the appliance is a very easy way to deploy vCenter because all of the essential components are pre-installed in the appliance and only need to be configured.

One area I hadn’t tackled much yet is upgrades of existing Windows-based vCenter environments to vSphere 5.5.  Having recently completed an inline upgrade of vCloud Director 5.1.2 to 5.5, it was now time to upgrade said vCloud’s underlying vSphere 5.1 (Update 1a, I believe) virtual infrastructure.  Prior to starting the upgrade, I took the necessary precaution of getting a point-in-time snapshot of the vCenter Server, the vCloud Director Cells, and the Microsoft SQL Server databases for each (three total: SSO, vCenter, and vCD).  I accomplished this using array-based snapshots – in this case Dell Compellent Storage Center Replays.

I launched autorun from the vCenter 5.5 installation media.  I opted for the custom installation and started with the Single Sign-On (SSO) upgrade from 5.1 to 5.5.  During the installation, I was met with

Warning 25000.  Please verify that the SSL certificate for your vCenter Single Sign-On 5.1 SSL is not expired.  If it did expire, please replace it with a valid certificate before upgrading to vCenter Single Sign-On 5.5.

Snagit Capture

In this particular environment, self-signed certificates from VMware were in use.  I know that this environment was deployed new less than two years ago and a verification of the SSL certificates in use proved that none were expired.  But because SSO and vCenter are such integral components to vCloud Director, I didn’t want to proceed without further vetting this out.
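
For the record, one quick way to perform that kind of verification is to look at the certificate SSO presents on its lookup service port (7444 by default); the hostname below is a placeholder for the SSO server’s FQDN:

# Print the expiration date of the certificate presented by SSO on port 7444.
echo | openssl s_client -connect sso-host.example.com:7444 2>/dev/null \
  | openssl x509 -noout -enddate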

Google.

Upgrade from vSphere 5.1 to vSphere 5.5 rolls back after importing Lookup Service data (2060511) – This KB article describes a situation in which Warning 25000 results when a registry value on the existing Windows-based SSO 5.1 server does not match a field on the SSL certificate.  The resolution involves simply changing the registry value to match that which is on the SSL certificate.  I won’t repeat the details because you can read the KB article yourself.  Furthermore it didn’t resolve the problem in this case because the field on my SSL certificate and the registry key were an identical match.

Upgrading to VMware vCenter Single Sign-On 5.5 displays the error: Warning 25000 (2061478) – This KB article describes a problem for which there is no resolution. However, there is a workaround and that involves changing service_id and service.properties files.  More detail is available in the KB article and again the symptoms in the log files weren’t a close match.

The Trouble With SSL Certificates and Upgrading to VMware SSO 5.5 – Then I took a look at Michael Webster’s blog article on precisely the same error message.  Michael briefly discusses the two SSL certificate deployment models and then digs into VMware KB 2060511 mentioned above.  While the information in Michael’s blog article reassured me I was not alone in my journey, KB 2060511 didn’t solve my problem either.  But sometimes the value of blog articles is not only in the original author’s content, but also in the follow-up comments from the readers.  Such was the case here.  A number of Michael’s readers responded by saying they were essentially in the same boat I’m in – it sounds like KB 2060511, but in the end that article doesn’t have the solution because there was nothing wrong with their SSO registry values.  The readers had no choice but to push onward beyond Warning 25000 with fingers crossed.  As it turned out in my case, as well as for some others, Warning 25000 was benign in nature and the installation completed successfully with no rollback.

In summary, this blog post does not represent global authority to ignore Warning 25000.  Rather it is meant to highlight one particular scenario where Warning 25000 may present itself and the actions that were taken to work through the problem.  I can’t stress enough the importance of the SSO component of vCenter going forward.  If any conclusion can be drawn here, it is that a backup of the infrastructure components should be secured before committing to the upgrade steps.  In this case, snapshots are the quickest and easiest method to provide data protection and recovery.  Although vSphere snapshots would work in some deployment architectures, recovering an environment when the environment being upgraded is managing the snapshots could be a challenge.  That is why I chose an out of band array based snapshot in this instance.

I would also like to point out in closing that vSphere 5.5 is still relatively new and VMware appears to still be chasing down all possible causes, resolutions, and workarounds to Warning 25000.  New information as well as VMware KB articles may develop subsequent to this writing so it may be worth continuing your own Google searching beyond this point.

Have a great week!

A Look At vCenter 5.5 SSO RC Installation

August 30th, 2013

This week at VMworld 2013, I attended a few sessions directly related to vCenter 5.5 as well as its components, one of which is vCenter Single Sign On (SSO):

  • VSVC5234 – Extreme Performance Series: vCenter of the Universe
  • VSVC4830 – vCenter Deep Dive

First of all, both sessions were excellent and I highly recommend viewing them if you have access to the post conference recordings. 

If you followed my session tweets or if perhaps you’ve read half a dozen or more already available blog posts on the subject, you know that several improvements have been made to vCenter SSO for the vSphere 5.5 release.  For instance:

  • Completely re-written from the ground up
  • Multi-master architecture
  • Native replication mechanism
  • SSO now has site awareness (think of the possibilities for HA stretched clusters)
  • MMC based diagnostic suite available as a separately maintained download
  • The external database and its preparation dependency has been removed
  • Database partitioning to improve both scalability and performance (this was actually added in 5.1 but I wanted to call it out)
  • Revamped multi-site deployment architecture
  • Full Mac OS X web client support including remote console
  • Improved certificate management
  • Multi-tenant capabilities
  • Drag ‘n’ Drop in the 5.5 web client

With some of the new features now identified and VMware’s blessing, have a look at the installation screens and see if you can spot the differences as compared to a vCenter 5.1 SSO installation.  These stem from a manual installation of SSO, not an automated installation of all vCenter components (by the way, the next gen web client is now installed as part of an automated vCenter 5.5 installation whereas it was not in 5.1).  Keep in mind these were pulled from a release candidate version and may change when vCenter 5.5 GAs at a future date.

I noticed one subtle change here – clicking on the Microsoft .NET 3.5 SP1 link in Windows 2008R2 actually installs the feature rather than just throwing up a dialogue box asking you to install the feature yourself.

Snagit Capture

As this is a manual installation, we have the option to use the default or specify the installation location.  Best practice is to install all vCenter components together so that they can communicate at server bus speed and won’t be impacted by network latency.  However, for larger scale environments with five or more vCenter Servers, SSO should be isolated on a separate server.  On a somewhat related note, the Inventory Service may benefit from an installation on SSD, again in large infrastructures.

Snagit Capture

We won’t likely see this in the GA version.

Snagit Capture

We’re going through the process of installing vCenter version 5.5, but in terms of the SSO component, again this is a complete re-write and it carries its own version number of 2.0.

Snagit Capture

We always read the EULA in full and agree to the license terms and conditions.

Snagit Capture

 

Snagit Capture

Big changes here.  Note the differences in the deployment models compared to the previous 5.1 version – previous deployment models are honored through an upgrade to 5.5.  Again, this is where the VMworld sessions noted above really go into detail. 

Snagit Capture

The System-Domain namespace has been replaced with vsphere.local.

Snagit Capture

The new site awareness begins here.

Snagit Capture

Snagit Capture

Snagit Capture

Snagit Capture

I hope you agree that SSO installation in vCenter 5.5 has been simplified while many new features have been added at the same time.

As always, thank you for reading and it was a pleasure to meet and see everyone again this year at VMworld.