SCCM SUP cannot connect with WSUS Server - WSUS Server version 3.0 SP2 or above is not installed
On 6/1, one of our Software Update Points lost the ability to connect to its WSUS server:
WSUS Control Manager failed to monitor WSUS Server "SCCM.ad.contoso.gov". Possible cause: WSUS Server version 3.0 SP2 or above is not installed or cannot be contacted.
The SMS_WSUS_CONFIGURATION_MANAGER log file suggests that either WSUS 3.0 SP2 is not installed or it cannot be contacted by the SMS SUP services (SMS_WSUS_CONFIGURATION_MANAGER and SMS_WSUS_CONTROL_MANAGER):
Error Milestone 004 6/8/2015 5:01:30 AM SCCM.ad.contoso.gov SMS_WSUS_CONTROL_MANAGER 7003 WSUS Control Manager failed to monitor WSUS Server "SCCM.ad.contoso.gov". Possible cause: WSUS Server version 3.0 SP2 or above is not installed or cannot be contacted. Solution: Verify that the WSUS Server version 3.0 SP2 or greater is installed. Verify that the IIS ports configured in the site are same as those configured on the WSUS IIS website.
Error Milestone 004 6/8/2015 5:01:30 AM SCCM.ad.contoso.gov SMS_WSUS_CONTROL_MANAGER 7000 WSUS Control Manager failed to configure proxy settings on WSUS Server "SCCM.ad.contoso.gov". Possible cause: WSUS Server version 3.0 SP2 or above is not installed or cannot be contacted. Solution: Verify that the WSUS Server version 3.0 SP2 or greater is installed. Verify that the IIS ports configured in the site are same as those configured on the WSUS IIS website.You can receive failure because proxy is set but proxy name is not specified or proxy server port is invalid.
Information Milestone 004 6/8/2015 4:01:39 AM SCCM.ad.contoso.gov SMS_WSUS_CONTROL_MANAGER 4609 Component Status Summarizer set the status of component "SMS_WSUS_CONTROL_MANAGER", running on computer "SCCM.ad.contoso.gov", to Critical. Possible cause: The component is experiencing a problem. Solution: Diagnose and fix the problem by: 1. Examining the status messages that the component reports. 2. Correcting the problem. 3. Instructing Component Status Summarizer to reset the counts of Error, Warning, and/or Informational status messages reported by the component. To reset the counts, right-click Reset Counts on the component in the Component Status summary in the Configuration Manager Console. When the counts are reset, Component Status Summarizer will change the status of the component to OK. This might take some time if site "004" is a child site. 4. Delete any unwanted status messages from the site database, if necessary. 5. Monitor the component occasionally to verify that the problem does not reoccur. Possible cause: The component is OK and you were unnecessarily alerted because the Component Status Thresholds are set too low for the component. Solution: Increase the Component Status Thresholds for the component using the Thresholds tab of the Component Status Summarizer Properties dialog box in the Configuration Manager Console. Possible cause: The component is flooding the status system by rapidly reporting the same message repeatedly. Solution: Diagnose and control the flood of status messages by: 1. Verifying that the component is actually flooding the status system. View the status messages reported by the component and verify that the same message is continually reported every several minutes or seconds. 2. Noting the Message ID of the flooded status message. 3. 
Creating a Status Filter Rule for site "004" that instructs Status Manager to discard the flooded status message when component "SMS_WSUS_CONTROL_MANAGER" on computer "SCCM.ad.contoso.gov" reports it. 4. Verifying that your sites' databases were not filled up by the flooded status message. Del
Information Milestone 004 6/8/2015 4:01:39 AM SCCM.ad.contoso.gov SMS_WSUS_CONTROL_MANAGER 4605 Component Status Summarizer detected that component "SMS_WSUS_CONTROL_MANAGER", running on computer "SCCM.ad.contoso.gov", has reported 5 or more Error status messages during the Component Status Threshold Period. Possible cause: The count equals or exceeds the Component Status Critical Threshold (5 status messages) for Error status messages for the component. Solution: Component Status Summarizer will set the component's status to Critical in the Component Status summary in the Configuration Manager Console.
I verified that the WSUS role was indeed still installed on SCCM.ad.contoso.gov; however, it does not appear to be healthy. I cannot connect to it using the Windows Server Update Services MMC snap-in, and the Event Log is filled with the following errors going back to 6/1:
PS C:\Windows\system32> Get-EventLog -LogName Application -Source "Windows Server Update Services" -After $(Date -Month 06 -Day 07)
Index Time EntryType Source InstanceID Message
----- ---- --------- ------ ---------- -------
267564 Jun 08 04:14 Error Windows Server Up... 12052 The DSS Authentication Web Service is not working.
267563 Jun 08 04:14 Error Windows Server Up... 12042 The SimpleAuth Web Service is not working.
267562 Jun 08 04:14 Error Windows Server Up... 12022 The Client Web Service is not working.
267561 Jun 08 04:14 Error Windows Server Up... 12032 The Server Synchronization Web Service is not w...
267560 Jun 08 04:14 Error Windows Server Up... 12012 The API Remoting Web Service is not working.
267559 Jun 08 04:14 Error Windows Server Up... 12002 The Reporting Web Service is not working.
267558 Jun 08 04:14 Warning Windows Server Up... 10021 The catalog was last synchronized successfully ...
I verified that the WsusService was actually running and then checked IIS:
Huh. That's probably not good. The WsusPool Application Pool should probably be running... If I manually start the WsusPool I can then connect with the WSUS WebServices by browsing to http://SCCM.ad.contoso.gov:8530/Selfupdate
... and then after about 15 minutes the App Pool stops.
Also, it's running on the wrong ports (8530/8531)! About a month ago, with the assistance of PFE, we configured this SUP to be available to Internet-based clients. Part of that reconfiguration meant the WSUS web services needed to be relocated to 80/443 so they are available through our perimeter firewall.
I don't have documentation on the exact commands we used, but I am reasonably sure it was WSUSUtil.exe usecustomwebsite false, which should move WSUS from its "WSUS Administration" IIS site back to the Default Web Site, which is bound to *:80 and *:443.
Again. This is not the case:
Well, that's not good. It looks like the WSUS site has magically migrated back to its standalone site because... FUN! If the SCCM SUP is looking for WSUS on 80/443 and it's no longer there, no wonder it doesn't work.
If I look at the registry value (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Update Services\Server\Setup\PortNumber) that WSUSUtil.exe is manipulating, I see that it still thinks WSUS should be running on 80.
Maybe I just need to run WsusUtil.exe more than once for extra... FUN?
C:\Program Files\Update Services\Tools>WsusUtil.exe usecustomwebsite false
Using port number: 80
Except... nothing in IIS changes. I'm either not remembering a step we did previously to move the WSUS IIS Site or something is broken.
I really have two problems:
- The WSUS website 'WSUS Administration' has returned to its as-installed configuration as a standalone IIS website bound to *:8530 and *:8531 but underlying pieces of the system think it should be running under the 'Default Web Site' bound to *:80 and *:443.
- The WsusPool Application Pool keeps crashing or stopping, which prevents me from simply reconfiguring the SUP to use WSUS on its original *:8530 and *:8531 ports.
At this point I'm kind of at a loss on how to continue to troubleshoot this issue. I really want to avoid re-installing the WSUS role and/or SUP if possible due to the imminent release of Microsoft Updates tomorrow.
Any advice on further troubleshooting?
With some assistance from jscott, I compared the registry values in HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Update Services\Server\Setup to the ones in his infrastructure and found them to be inconsistent. IISTargetWebSiteIndex was set to the 'WSUS Administration' IIS site's ID, but the PortNumber value was set to 80, which is bound via *:80 to the Default Web Site.
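The mismatch between those two registry values and the IIS bindings can be checked in one pass. The following is a minimal sketch, assuming the WebAdministration module is available on the WSUS server. The function names (Get-PortFromBinding, Test-WsusPortBound, Get-WsusSiteReport) are hypothetical helpers made up for illustration, not part of WSUS or ConfigMgr, and the sketch reads only the two registry values named above.

```powershell
# Sketch only: function names are hypothetical, not part of WSUS or ConfigMgr.

function Get-PortFromBinding {
    param([Parameter(Mandatory)] [string] $BindingInformation)
    # IIS http/https bindingInformation looks like "IP:port:hostname"; the port is
    # the second-to-last colon-separated field, even for bracketed IPv6 addresses.
    $fields = $BindingInformation -split ':'
    return [int]$fields[-2]
}

function Test-WsusPortBound {
    param(
        [Parameter(Mandatory)] [int] $PortNumber,   # registry value PortNumber
        [int[]] $BoundPorts = @()                   # ports bound to the target site
    )
    return ($BoundPorts -contains $PortNumber)
}

function Get-WsusSiteReport {
    # Reads the live configuration; run elevated on the WSUS server.
    Import-Module WebAdministration
    $setup = Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Update Services\Server\Setup'
    $site  = Get-Website | Where-Object { $_.id -eq $setup.IISTargetWebSiteIndex }
    if (-not $site) { throw "No IIS site with id $($setup.IISTargetWebSiteIndex)" }

    # Collect the http/https ports of the site that the registry points at.
    $ports = @(Get-WebBinding -Name $site.name |
        Where-Object { $_.protocol -like 'http*' } |
        ForEach-Object { Get-PortFromBinding $_.bindingInformation })

    [pscustomobject]@{
        TargetSite   = $site.name
        RegistryPort = $setup.PortNumber
        BoundPorts   = $ports
        Consistent   = Test-WsusPortBound -PortNumber $setup.PortNumber -BoundPorts $ports
    }
}
```

Calling Get-WsusSiteReport on a server in the state described here should, if the sketch's assumptions hold, report Consistent as False: the registry points at the 'WSUS Administration' site (bound to 8530/8531) while expecting port 80.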
Since we had been through at least three iterations of reconfiguring WSUS on this server it seemed best to just reinstall the Role to make sure things were consistent, albeit still broken.
I finally went to Microsoft Support, where it was kindly pointed out that the WsusPool Application Pool had its Private Memory Usage limited to 18530 KB. We removed the limit yesterday morning and things have been running fine since then. I'm not sure how that limitation got set or whether it's the default, but it seems pretty small to me.
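For anyone wanting to check the same setting from PowerShell rather than the IIS Manager UI, here is a minimal sketch, assuming the WebAdministration module is installed and the pool is named WsusPool as above. The function names are hypothetical helpers for illustration. In IIS the private memory recycling limit is expressed in KB, and a value of 0 means no limit.

```powershell
# Sketch only: function names are hypothetical helpers, not built-in cmdlets.
$script:WsusPoolPath = 'IIS:\AppPools\WsusPool'
$script:LimitProp    = 'recycling.periodicRestart.privateMemory'

function Format-PrivateMemoryLimit {
    param([Parameter(Mandatory)] [long] $LimitKB)
    # IIS stores the limit in KB; 0 disables memory-based recycling.
    if ($LimitKB -eq 0) { return 'no limit' }
    return "$LimitKB KB"
}

function Get-WsusPoolMemoryLimit {
    # Reads the current Private Memory Limit of the WsusPool application pool.
    Import-Module WebAdministration
    $limit = (Get-ItemProperty $script:WsusPoolPath -Name $script:LimitProp).Value
    return Format-PrivateMemoryLimit -LimitKB $limit
}

function Remove-WsusPoolMemoryLimit {
    # Sets the limit to 0 (no limit), mirroring what we did with Support.
    Import-Module WebAdministration
    Set-ItemProperty $script:WsusPoolPath -Name $script:LimitProp -Value 0
}
```

Run Get-WsusPoolMemoryLimit first to see the current value; Remove-WsusPoolMemoryLimit changes the live pool configuration, so treat it as something to run deliberately in an elevated session.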