VMware released a new update for vCenter Server 7.0 u3f (read more about the new patch) however in the last few days I observed repeatable issues during patching to this version in many of vCenter environments. The patching process is going very well and suddenly stop with failure at 93%
Hmmm …. What happened? Let’s get some more information from the logs. In 1st step i checked Patchrunner.log at /var/log/Vmware/applmgmt/Patchrunner.log and see the below entries:
<Date_time> ERROR vmware_b2b.patching.phases.patcher Patch hook Patch got unhandled exception. Traceback (most recent call last): File "/storage/updatemgr/software-update5797hjqm/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 203, in patch _patchComponents(ctx, userData, statusAggregator.reportingQueue) File "/storage/updatemgr/software-update5797hjqm/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 84, in _patchComponents _startDependentServices(c) File "/storage/updatemgr/software-update5797hjqm/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 53, in _startDependentServices serviceManager.start(depService) File "/storage/updatemgr/software-update5797hjqm/stage/scripts/patches/libs/sdk/service_manager.py", line 901, in wrapper return getattr(controller, attr)(*args, **kwargs) File "/storage/updatemgr/software-update5797hjqm/stage/scripts/patches/libs/sdk/service_manager.py", line 794, in start super(VMwareServiceController, self).start(serviceName) File "/storage/updatemgr/software-update5797hjqm/stage/scripts/patches/libs/sdk/service_manager.py", line 665, in start raise IllegalServiceOperation(errorText) service_manager.IllegalServiceOperation: Service cannot be started. Error: Error executing start on service sts. Details { "detail": [ { "id": "install.ciscommon.service.failstart", "translatable": "An error occurred while starting service '%(0)s'", "args": [ "sts" "localized": "An error occurred while starting service 'sts'" "componentKey": null, "problemId": null, "resolution": null } Service-control failed. Error: { "detail": [ "id": "install.ciscommon.service.failstart", "translatable": "An error occurred while starting service '%(0)s'", "args": [ "sts" ], "localized": "An error occurred while starting service 'sts'" "componentKey": null, "problemId": null, "resolution": null <Date_time> WARNING root stopping status aggregation... <Date_time> ERROR __main__ Patch vCSA failed
Due to the above, I’m sure an error occurred during vmware-sts services tried to start. Let’s check STS logs located in sso directory /var/log/vmware/sso/vmware-identity-sts-default.log
<Date_time> WARN sts-default[25:Thread-10] [CorId= OpId=] [com.vmware.identity.util.VapiClientConnection] Caught exception invoking stub type interface com.vmware.vcenter.trust.VcTrusts. Marking connection invalid so that it can be re-established. Exception was: HTTP response with status code 503 (enable debug logging for details): no healthy upstream … <Date_time> ERROR sts-default[25:Thread-10] [CorId= OpId=] [com.vmware.identity.util.VcTrustCache] Refresh thread failed to retreive Vctrusts. com.vmware.vapi.client.exception.TransportProtocolException: HTTP response with status code 503 (enable debug logging for details): no healthy upstream at com.vmware.vapi.internal.protocol.client.rpc.http.ApacheHttpUtil.validateHttpResponse(ApacheHttpUtil.java:100) ~[vapi-runtime-7.0.0.jar:?] at com.vmware.vapi.internal.protocol.client.rpc.http.HttpClient.invoke(HttpClient.java:160) ~[vapi-runtime-7.0.0.jar:?] at com.vmware.vapi.internal.protocol.client.rpc.http.HttpClient.send(HttpClient.java:172) ~[vapi-runtime-7.0.0.jar:?] at com.vmware.vapi.internal.protocol.client.msg.json.JsonApiProvider.sendRequest(JsonApiProvider.java:186) ~[vapi-runtime-7.0.0.jar:?] at com.vmware.vapi.internal.protocol.client.msg.json.JsonApiProvider.invoke(JsonApiProvider.java:539) ~[vapi-runtime-7.0.0.jar:?] at com.vmware.vapi.internal.bindings.Stub.invoke(Stub.java:241) ~[vapi-runtime-7.0.0.jar:?] at com.vmware.vapi.internal.bindings.Stub.invokeMethodAsync(Stub.java:191) ~[vapi-runtime-7.0.0.jar:?] at com.vmware.vapi.internal.bindings.Stub.invokeMethod(Stub.java:137) ~[vapi-runtime-7.0.0.jar:?] at com.vmware.vcenter.trust.VcTrustsStub.list(VcTrustsStub.java:46) ~[vmodl2-trustmanagement-7.0.0.jar:?] at com.vmware.vcenter.trust.VcTrustsStub.list(VcTrustsStub.java:37) ~[vmodl2-trustmanagement-7.0.0.jar:?] at com.vmware.identity.util.VapiClient$1.invoke(VapiClient.java:51) ~[samlauthority-7.0.0.jar:?] at com.vmware.identity.util.VapiClient$1.invoke(VapiClient.java:47) ~[samlauthority-7.0.0.jar:?] at com.vmware.identity.util.VapiClientConnection.revalidateConnection(VapiClientConnection.java:215) ~[samlauthority-7.0.0.jar:?] at com.vmware.identity.util.VapiClientConnection.invokeStub(VapiClientConnection.java:295) ~[samlauthority-7.0.0.jar:?] at com.vmware.identity.util.VapiClient.doVcTrustsList(VapiClient.java:45) ~[samlauthority-7.0.0.jar:?] at com.vmware.identity.util.VcTrustCache.refreshTrustCache(VcTrustCache.java:419) [samlauthority-7.0.0.jar:?] at com.vmware.identity.util.VcTrustCache$TrustCacheThread.run(VcTrustCache.java:464) [samlauthority-7.0.0.jar:?] Additionally i checked vmon logs in /var/log/vmware/vmon/vmon.log <Date_time> In(05) host-45791 Received start request for sts <Date_time> In(05) host-45791 <sts-prestart> Constructed command: /usr/bin/python /usr/lib/vmidentity/install/sts-prestart-script.py /var/log/vmware/sso/sts-prestart.log <Date_time> In(05) host-45791 <sts> Service pre-start command completed successfully. <Date_time> In(05) host-45791 <sts> Constructed command: /usr/lib/vmidentity/install/sts-start-script.sh <Date_time> In(05) host-45791 <sts> Running the API Health command as user <Date_time> In(05) host-45791 <sts-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmidentity/install/sts-vmon-health-checker.py <Date_time> In(05) host-45791 Client info Uid=0,Gid=0,Pid=47658,Comm=(vmon-coredumper),PPid=2,Comm=(kthreadd),PPid=0 <Date_time> In(05) host-45791 <sts> Service is dumping core. Coredump count 0. CurrReq: 1 <Date_time> Wa(03) host-45791 [ReadSvcSubStartupData] No startup information from sts. <Date_time> Wa(03) host-45791 <sts> Service exited. Exit code 1
Due to above STS is crashing and its reason why services is not starting
Troubleshooting steps related to the vCenter certificate:
– used checksts.py script – no expired certificate
– checked solution users certificate – there are valid
– run lsdoctor.py -l script – no issue found
As no issue/errors have been found in the above steps, it’s time to search at VMware Knowledge Base article and Vmware community, however, there aren’t any relevant articles related to patching to any u3 version and vmware-sts services, so everything points that its fresh issue with not many information about it yet, but below workaround should allow You to patch successfully
Workaround (partially checked with the condition – VCSA must be AD joined):
- Recommended – take offline snapshot vCenter
- Change authentication from IWA to LDAPs and leave vCenter from AD
- service-control –stop –all & service-control –start –all
- Re-try patching again
- Do not join vCenter to Active Directory till patch are not available yet
The error is still being checked by VMware developers – follow the post for further updates soon!
Update! (25.07.2022)
VMware already published kb article with fully description of this issue:
https://kb.vmware.com/s/article/89027
Resolution
This issue is resolved in VMware vCenter Server 7.0 U3g build 20150588, available at VMware Downloads.
Please like and share to spread the knowledge in the community.
If you want to chat with me please use Twitter: @AngrySysOps
Join my VMware Knowledge Base Group: https://bit.ly/3w54tbc
Visit my FB page: https://www.facebook.com/AngrySysOps
Read my blog: https://angrysysops.com
Subscribe to my channel : https://bit.ly/3vY16CT