Search stops working and Central Admin Search screens show errors

Problem:

In my redundant Search farm, search falls over. It was working and nothing appears to have changed, but it suddenly stopped. The problem surfaces in several places:

1. In Central Admin, going to "Search Administration" displays the following error message:

Search Application Topology - Unable to retrieve topology component health states. This may be because the admin component is not up and running.


2. Query and crawl stopped working.
3. Using PowerShell, I can't get the status of the Search Service Application:

PS> $srchSSA = Get-SPEnterpriseSearchServiceApplication
PS> Get-SPEnterpriseSearchStatus -SearchApplication $srchSSA


Error:
Get-SPEnterpriseSearchStatus : Failed to connect to system manager. SystemManagerLocations: net.tcp://sp2013-srch2/CD8E71/AdminComponent2/Management

4. On the "Search Administration" page within Central Admin, clicking "Content Sources" gives the error message:
Sorry, something went wrong
The search application 'ef5552-7c93-4555-89ed-cd8f1555a96b' on server SP2013-SRCH2 did not finish loading.

View the event logs on the affected server for more information.
I used PowerShell to pull the ULS log entries for the correlation ID shown in the Central Admin error message:

PS> Merge-SPLogFile -Path "d:\error.log" -Correlation "ef109872-7c93-4e6c-89ed-cd8f14bda96b"

The output shows:
Logging Correlation Data       Medium Name=Request (GET:http://sp2013-app1:2013/_admin/search/listcontentsources.aspx?appid=b555d269%255577%2D430d%2D80aa%2D30d55556dc57)
Authentication Authorization   Medium Non-OAuth request. IsAuthenticated=True, UserIdentityName=, ClaimsCount=0
Logging Correlation Data       Medium Site=/ 05555e9c-5555-555a-69e2-3f65555be9f4
Topology  Medium WcfSendRequest: RemoteAddress: 'https://sp2013-srch2:32844/8c2468a555594301abf555ac41a555b0/SearchAdmin.svc' Channel: 'Microsoft.Office.Server.Search.Administration.ISearchApplicationAdminWebService' Action: 'http://tempuri.org/ISearchApplicationAdminWebService/GetVersion' MessageId:
General    Medium Application error when access /_admin/search/listcontentsources.aspx, Error=The search application 'ef155572-7555-4e6c-89ed-cd8f14bda96b' on server SP2013-SRCH2 did not finish loading. View the event logs on the affected server for more information.  Server stack trace: at System.ServiceModel.Channels.ServiceChannel.ThrowIfFaultUnderstood(Message reply, MessageFault fault, String action, MessageVersion version, FaultConverter faultConverter) at System.ServiceModel.Channels.ServiceChannel.HandleReply(ProxyOperationRuntime operation, ProxyRpc& rpc) at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Ob General  Medium ...(IMethodCallMessage methodCall, ProxyOperationRuntime operation)  at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)  Exception rethrown at [0]: at Microsoft.Office.Server.Search.Internal.UI.SearchCentralAdminPageBase.ErrorHandler(Object sender, EventArgs e) at Microsoft.Office.Server.Search.Internal.UI.SearchCentralAdminPageBase.OnError(EventArgs e) at System.Web.UI.Page.HandleError(Exception e) at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) at System.Web.UI.Page.ProcessRequest(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) 
General      Medium ...em.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) 05b5559c-4215-0555-69e2-3f656555e9f4 Runtime   tkau Unexpected System.ServiceModel.FaultException`1[[System.ServiceModel.ExceptionDetail, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]: The search application 'ef109555-7c93-444c-85555-cd8f14b5556b' on server SP2013-SRCH2 did not finish loading. View the event logs on the affected server for more information.   Server stack trace:      at System.ServiceModel.Channels.ServiceChannel.ThrowIfFaultUnderstood(Message reply, MessageFault fault, String action, MessageVersion version, FaultConverter faultConverter)     at System.ServiceModel.Channels.ServiceChannel.HandleReply(ProxyOperationRuntime operation, ProxyRpc& rpc) 
Runtime     tkau Unexpected ...s, TimeSpan timeout) at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation) at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)  Exception rethrown at [0]: at Microsoft.Office.Server.Search.Internal.UI.SearchCentralAdminPageBase.ErrorHandler(Object sender, EventArgs e) at Microsoft.Office.Server.Search.Internal.UI.SearchCentralAdminPageBase.OnError(EventArgs e) at System.Web.UI.Page.HandleError(Exception e) at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
Runtime       tkau Unexpected ...ProcessRequest() at System.Web.UI.Page.ProcessRequest(HttpContext context) at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously
General    ajlz0 High Getting Error Message for Exception System.ServiceModel.FaultException`1[System.ServiceModel.ExceptionDetail]: The search application 'ef10-555-a96b' on server SP2013-SRCH2 did not finish loading. View the event logs on the affected server for more information. (Fault Detail is equal to An ExceptionDetail, likely created by IncludeExceptionDetailInFaults=true, whose value is: System.Runtime.InteropServices.COMException: The search application '7c93-555c-89ed-cd8f555da6b' on server SP2013-SRCH2 did not finish loading. View the event logs on the affected server for more information. at icrosoft.Office.Server.Search.Administration.SearchApi.RunOnServer[T]General        ajlz0 High ...   at Microsoft.Office.Server.Search.Administration.SearchApi..ctor(String applicationName)     at Microsoft.Office.Server.Search.Administration.SearchAdminWebServiceApplication.GetVersion()     at SyncInvokeGetVersion(Object , Object[] , Object[] )     at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)     at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc) at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMess...) Micro Trace uls4 Medium Micro Trace Tags: 0 nasq,5 agb9s,52 e5mc,21 8nca,0 tkau,0 ajlz0,0 aat87 05b16e9c-4215-00ba-69e2-3f656eabe9f4
Monitoring   b4ly Medium Leaving Monitored Scope (Request (GET:http://sp2013-app1:2013/_admin/search/listcontentsources.aspx?appid=be28d269%255577%2D430d%2D555a%2D30d55556dc57)). Execution Time=99.45349199409
Topology    e5mb Medium WcfReceiveRequest: LocalAddress: 'https://sp2013-srch2.demo.local:32844/8c2468a555943555bf42cac555f05b0/SearchAdmin.svc' Channel: 'System.ServiceModel.Channels.ServiceChannel' Action: 'http://tempuri.org/ISearchApplicationAdminWebService/GetVersion' MessageId: 'urn:uuid:c2d55565-bb59-4555-b34b-6555ef1d79a5'
 
 

I can see that the Admin component on SP2013-SRCH2 is not working, so I turned off the machine, hoping search would fail over to the other admin component. It did not, and the same errors kept appearing in the logs. I turned the server back on and reviewed the event log on the admin component server (SP2013-SRCH2). The following errors occurred:

Application Server Administration job failed for service instance Microsoft.Office.Server.Search.Administration.SearchServiceInstance + Reason: The device is not ready. 
The Execute method of job definition Microsoft.Office.Server.Search.Administration.IndexingScheduleJobDefinition + The search application + on server SP2013-SRCH2 did not finish loading.


Examining the Windows Application event log on SP2013-SRCH2, I noticed errors relating to permissions:

A database error occurred. Source: .Net SqlClient Data Provider Code: 229 occurred 0 time(s) Description:  Error ordinal: 1 Message: The EXECUTE permission was denied on the object 'proc_MSS_GetConfigurationProperty', database 'SP_Search'
Unable to read lease from database - SystemManager, System.Data.SqlClient.SqlException (0x80131904): The EXECUTE permission was denied on the object 'proc_MSS_GetLease', database 'SP_Search', schema 'dbo'.


The event log showed me the account trying to execute these stored procedures. It was my Search service account (demo\sp_searchservice).
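
The same event log entries can be pulled from PowerShell rather than Event Viewer. This is a minimal sketch, assuming remote event log access to SP2013-SRCH2 is allowed from where you run it; the one-day window and the message filter are my own choices for illustration.

```powershell
# Read Error-level entries (Level 2) from the Application log on the
# admin component server for the last day (the time window is an assumption).
$events = Get-WinEvent -ComputerName 'SP2013-SRCH2' -FilterHashtable @{
    LogName   = 'Application'
    Level     = 2
    StartTime = (Get-Date).AddDays(-1)
}

# Keep only entries that mention a denied permission, newest first.
$events |
    Where-Object { $_.Message -match 'permission was denied' } |
    Sort-Object TimeCreated -Descending |
    Select-Object TimeCreated, Id, ProviderName, Message |
    Format-List
```

If nothing comes back, widen the time window or drop the message filter before concluding that permissions are not involved.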

Initial Hypothesis:

It looks like the Admin search component is not starting or failing over. By tracing the permissions I can see that the demo\SP_SearchService account no longer has EXECUTE permission on the stored procedures in the SP_Search database.

By opening the SP_Search database and looking at effective permissions, I can see that the "SPSearchDBAdmin" role has EXECUTE permission on the failing stored procedures (for example proc_MSS_GetConfigurationProperty).

If I look at the account calling the stored procedure (demo\SP_SearchService), I can see it is not assigned to that role. This leads me to conjecture that a management tool, or someone or something else on the farm, changed the permissions.
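
The role membership check can also be scripted instead of clicked through in Management Studio. The sketch below assumes a SQL Server instance named SQL01, which is a placeholder and not a name from my farm, and that you run it as an account allowed to read the SP_Search catalog views.

```powershell
# Assumption: 'SQL01' is a placeholder for the instance hosting SP_Search.
$sqlInstance = 'SQL01'
$database    = 'SP_Search'

# List every member of the SPSearchDBAdmin database role.
$query = @'
SELECT m.name AS MemberName
FROM sys.database_role_members AS drm
JOIN sys.database_principals AS r ON drm.role_principal_id = r.principal_id
JOIN sys.database_principals AS m ON drm.member_principal_id = m.principal_id
WHERE r.name = N'SPSearchDBAdmin';
'@

$conn = New-Object System.Data.SqlClient.SqlConnection -ArgumentList "Server=$sqlInstance;Database=$database;Integrated Security=True"
$cmd = $conn.CreateCommand()
$cmd.CommandText = $query
$members = @()

$conn.Open()
try {
    $reader = $cmd.ExecuteReader()
    while ($reader.Read()) { $members += $reader['MemberName'] }
    $reader.Close()
}
finally {
    $conn.Close()
}

# Show the members and flag the service account if it is missing.
$members
if ($members -notcontains 'demo\sp_searchservice') {
    Write-Warning 'demo\sp_searchservice is not a member of SPSearchDBAdmin'
}
```

The comparison assumes the database user has the same name as the Windows login; if the user was created under a different name, check the listed members by eye.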

Resolution:

Change the permissions on the database. To keep the permissions minimal, add the service account (demo\sp_SearchService) to the SPSearchDBAdmin role on the SP_Search database.
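
I made the change in the database directly, but it could be scripted roughly as follows. This is a sketch under assumptions: SQL01 is a placeholder instance name, the service account already exists as a user in SP_Search under its login name, and the last two lines run in the SharePoint Management Shell on a farm server.

```powershell
# Assumption: 'SQL01' is a placeholder for the instance hosting SP_Search.
$sqlInstance = 'SQL01'
$database    = 'SP_Search'

# Add the search service account to the SPSearchDBAdmin role.
# sp_addrolemember expects the account to already be a user in this database.
$sql = @'
EXEC sp_addrolemember @rolename = N'SPSearchDBAdmin', @membername = N'demo\sp_searchservice';
'@

$conn = New-Object System.Data.SqlClient.SqlConnection -ArgumentList "Server=$sqlInstance;Database=$database;Integrated Security=True"
$cmd = $conn.CreateCommand()
$cmd.CommandText = $sql

$conn.Open()
try {
    [void]$cmd.ExecuteNonQuery()
}
finally {
    $conn.Close()
}

# Afterwards, repeat the status check that failed in the problem statement.
$srchSSA = Get-SPEnterpriseSearchServiceApplication
Get-SPEnterpriseSearchStatus -SearchApplication $srchSSA
```

The status call may keep failing for a few minutes after the change while the search components reload.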

All the issues recorded in the problem statement were resolved within 5 minutes on the affected farm.

Note: I have a UAT environment in exact sync with my PR environment; UAT has the correct permissions in place already.
