Saturday, July 07, 2012

Installing Exchange rollup fixes

While I certainly see the value in Windows 2008 UAC (User Account Control), it can be a big pain, too.  Too often, you try to run a program only to have it fail for no apparent reason.  I have seen this recently with Exchange 2010 update rollups.  Today while installing E2K10 SP2 RU3, I had an issue where the installer appeared to run, but failed within a minute or two and gave no obvious reason.

The Windows Application Event log had 2 fairly generic errors:

Event 1036
Windows Installer installed an update. Product Name: Microsoft Exchange Server. Product Version: Product Language: 1033. Manufacturer: Microsoft Corporation. Update Name: Update Rollup 3 for Exchange Server 2010 Service Pack 2 (KB2685289) 14.2.309.2. Installation success or error status: 1603.

Event 1024
Product: Microsoft Exchange Server - Update 'Update Rollup 3 for Exchange Server 2010 Service Pack 2 (KB2685289) 14.2.309.2' could not be installed. Error code 1603. Windows Installer can create logs to help troubleshoot issues with installing software packages. Use the following link for instructions on turning on logging support:

The fix for this, though, is pretty simple.  Open up a command prompt as an administrator and simply run the patch from the command line.  In my case, I just typed:


After that, the rollup installs correctly (and it does take 30-45 minutes to complete).
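
If you need to spot this failure across several servers, the two event messages quoted above are regular enough to parse. Here is a rough Python sketch, not anything Exchange or Windows ships: the function name parse_msi_event and the small lookup table are my own, and the table only lists the Windows Installer result codes I am sure of.

```python
import re

# Windows Installer result codes I am sure of; this is not a complete list.
MSI_STATUS = {
    0: "success",
    1603: "fatal error during installation",
    3010: "success, reboot required",
}

STATUS_RE = re.compile(r"Installation success or error status:\s*(\d+)")
KB_RE = re.compile(r"\((KB\d+)\)")


def parse_msi_event(message):
    """Pull the KB number and result code out of an Event 1036 style message.

    Hypothetical helper: it only understands the wording quoted in this post.
    """
    status_match = STATUS_RE.search(message)
    kb_match = KB_RE.search(message)
    status = int(status_match.group(1)) if status_match else None
    return {
        "kb": kb_match.group(1) if kb_match else None,
        "status": status,
        "meaning": MSI_STATUS.get(status, "unknown"),
    }


if __name__ == "__main__":
    sample = (
        "Windows Installer installed an update. Product Name: Microsoft "
        "Exchange Server. Update Name: Update Rollup 3 for Exchange Server "
        "2010 Service Pack 2 (KB2685289) 14.2.309.2. "
        "Installation success or error status: 1603."
    )
    print(parse_msi_event(sample))
```

Status 1603 is the generic Windows Installer failure code, which is why these events say so little about the real cause.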

Wednesday, July 04, 2012

W2K8 / E2K7 Cluster Comms issue

A few weeks ago, someone called me and asked me to help out with an E2K7 CCR cluster running on W2K8.  Regardless of what they tried, the cluster would not achieve quorum.  It had been working until a few weeks earlier, and no one had noticed when it stopped.  The Failover Cluster Administrator suggested that the file share witness could not be contacted.

However, the CIFS shares on the FSW were accessible and it was ping-able.  The Cluster.log had some interesting errors in it, but the cause was not immediately obvious.  Here are some of the errors that were occurring when the cluster was trying to achieve quorum:

Network Name <Cluster Name>: Unable to Logon. winError 1326
Error 1326 from ResourceControl for resource Cluster Name.
ResourceControl(NETNAME_GET_VIRTUAL_SERVER_TOKEN) to Cluster Name returned 1326.
File Share Witness <File Share Witness (\\FSWSERVER\CLU-01-MNS)>: Failed to get virtual server token from core NetName resource, error 1326.
File Share Witness <File Share Witness (\\FSWSERVER\CLU-01-MNS)>: Failed to retrieve the virtual server token from the core netname resource with 1326. 
RhsCall::Perform_NativeEH: ERROR_LOGON_FAILURE(1326)' because of 'Resource File Share Witness (\\FSWSERVER\CLU-01-MNS): Open call failed.
rcm::RcmAgent::Online: ERROR_LOGON_FAILURE(1326)' because of 'There is a problem with the resource DLL.'
ERROR_LOGON_FAILURE(1326)' because of 'Failed to bring quorum resource e86bd5ca-7bab-4d1c-b9ac-94ef54acdb03 online, status 1326
Signaled NetftRemoteUnreachable  event, local address remote address  
Signaled NetftRemoteUnreachable  event, local address remote address
Signaled NetftRemoteUnreachable  event, local address remote address
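
With a log this noisy, it helps to tally what is actually repeating. Below is a minimal Python sketch that counts the numeric error codes and the NetftRemoteUnreachable events in lines like the ones above. The function name summarize and the regular expression are my own assumptions, based only on the wording in this excerpt, so other Cluster.log formats may need a different pattern.

```python
import re
from collections import Counter

# Matches "winError 1326", "error 1326", "returned 1326", "status 1326",
# "with 1326" and the symbolic form "ERROR_LOGON_FAILURE(1326)".
CODE_RE = re.compile(
    r"(?:winError|error|returned|status|with)\s+(\d+)|[A-Z_]+\((\d+)\)",
    re.IGNORECASE,
)


def summarize(lines):
    """Return (Counter of error codes, number of NetftRemoteUnreachable lines)."""
    codes = Counter()
    unreachable = 0
    for line in lines:
        if "NetftRemoteUnreachable" in line:
            unreachable += 1
        # Count each code once per line, even if the line mentions it twice.
        found = {a or b for a, b in CODE_RE.findall(line)}
        codes.update(int(code) for code in found)
    return codes, unreachable


if __name__ == "__main__":
    sample = [
        "Network Name <Cluster Name>: Unable to Logon. winError 1326",
        "Error 1326 from ResourceControl for resource Cluster Name.",
        "Signaled NetftRemoteUnreachable  event, local address remote address",
    ]
    codes, unreachable = summarize(sample)
    print(codes.most_common(), unreachable)
```

In this excerpt everything except the last three lines is error 1326 (ERROR_LOGON_FAILURE), so the NetftRemoteUnreachable events are the ones that point somewhere different.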

  Frankly, it looked like the problem was with the FSW until the logs suggested that it was the *other* node of the cluster that was not reachable (the IP via port 3853).
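
A quick way to test that kind of suspicion from one node is to try a TCP connection to the other. The Python sketch below is just that, with the caveats stated up front: tcp_port_open is a hypothetical helper of mine, the host name in the usage line is a placeholder, and a TCP connect says nothing about UDP traffic between the nodes.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # "NODE2" is a placeholder for the other cluster node; 3853 is the port
    # mentioned in the log above.
    print(tcp_port_open("NODE2", 3853))
```

A False result only tells you that something between the two nodes is refusing or dropping the connection. It does not say whether that is a host firewall, a network device, or a service that is not listening.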

  Upon further investigation (which should have been the first thing I looked for), I found my old enemy lurking in the shadows: Symantec Endpoint Protection with the Network Access protection features enabled.  I checked the SEP firewall logs on the clustered nodes, but they were not showing any errors.  However, once I disabled the Network Access protection component of SEP, the cluster immediately established quorum.

The Exchange support team was unaware the servers were even running SEP.  Their IT security department had deployed SEP to upgrade from an older version of Symantec Antivirus and had not told anyone.