Today I had an issue where a full crawl would take forever. My first thought was to add an extra search server and to split the query and crawl components. Once the new server had joined the farm, I added a crawl component and an index component to it.

The only way to do this is with PowerShell:

 

# Assumes the farm has a single Search Service Application and that the search
# service instance on the new host is already started.
$ssa = Get-SPEnterpriseSearchServiceApplication

$ahost = Get-SPEnterpriseSearchServiceInstance -Identity "MyNewSearchHostname"

# The active topology cannot be changed directly, so work on a clone of it.
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active

$clone = New-SPEnterpriseSearchTopology -SearchApplication $ssa -Clone -SearchTopology $active

New-SPEnterpriseSearchCrawlComponent -SearchTopology $clone -SearchServiceInstance $ahost

New-SPEnterpriseSearchIndexComponent -SearchTopology $clone -SearchServiceInstance $ahost -IndexPartition 0

# Activate the clone so it becomes the new active topology.
Set-SPEnterpriseSearchTopology -Identity $clone
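Once a clone has been activated, it is worth confirming from the shell that the new components really are part of the active topology. The following is only a sketch of how that check might look; it assumes a single Search Service Application in the farm and has to be run in the SharePoint Management Shell on a farm server.

```powershell
# Sketch only: assumes exactly one Search Service Application in the farm.
$ssa = Get-SPEnterpriseSearchServiceApplication

# List all topologies; the one that was just activated should be reported as active.
Get-SPEnterpriseSearchTopology -SearchApplication $ssa

# Show the components of the active topology and the servers that host them.
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
Get-SPEnterpriseSearchComponent -SearchTopology $active |
    Select-Object Name, ServerName

# Report the health of each search component as plain text.
Get-SPEnterpriseSearchStatus -SearchApplication $ssa -Text
```

If the new host does not appear in the component list, the clone was probably never activated.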

OK, so now Search Administration shows two servers and things are processing. I ran a full crawl and after 4 hours everything was crawled. It looked like the system was running fine. Wrong!

I enabled continuous crawl.

Seventeen hours later I had another look. The continuous crawl was still actively running and seemed to have got itself into a never-ending crawl. A continuous crawl should go idle every now and then.
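The crawl state can also be checked from PowerShell instead of the Search Administration pages. This is a rough sketch, again assuming a single Search Service Application; the selected properties are the ones I would expect on the content source objects.

```powershell
# Sketch only: assumes exactly one Search Service Application in the farm.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Show each content source with its current crawl status and the
# start and completion times of its most recent crawl.
Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa |
    Select-Object Name, CrawlStatus, CrawlStarted, CrawlCompleted
```

A content source that never returns to Idle over many hours is the symptom described above.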

Then I decided to add a content processing component to the new search server as well. The cloning process above had to be repeated, because components cannot be added to an active topology; the change has to be made on a clone, which is then activated.

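# Assumes the farm has a single Search Service Application and that the search
# service instance on the new host is already started.
$ssa = Get-SPEnterpriseSearchServiceApplication

$ahost = Get-SPEnterpriseSearchServiceInstance -Identity "MyNewSearchHostname"

# The active topology cannot be changed directly, so work on a clone of it.
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active

$clone = New-SPEnterpriseSearchTopology -SearchApplication $ssa -Clone -SearchTopology $active

New-SPEnterpriseSearchCrawlComponent -SearchTopology $clone -SearchServiceInstance $ahost

New-SPEnterpriseSearchIndexComponent -SearchTopology $clone -SearchServiceInstance $ahost -IndexPartition 0

# New in this run: a content processing component on the new host.
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $ahost

# Activate the clone so it becomes the new active topology.
Set-SPEnterpriseSearchTopology -Identity $clone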

 

The only change is one additional line, which adds the content processing component.

Within 10 minutes my continuous crawl reached an idle state. Problem solved. But what was the problem?

I had a look at the crawl health reports in Search Administration. The crawl rate looked better: documents had been processed since I made the latest topology change.

Then I looked at the Crawl Report – Crawl Queue. This report answered my question. The queue cleared quite quickly after my latest configuration change. So what happened?

I had two servers crawling and only one server processing content. The crawl components handed the single content processing component more work than it could handle, so the crawl queue flooded. Because the continuous crawl submits more requests for processing every 15 minutes, the queue would never be emptied.
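To make the arithmetic concrete, here is a toy model of that queue in plain PowerShell. It is only a sketch: the function name Get-QueueDepth and all of the item counts are made up for illustration and are not measurements from my farm. The point is simply that when each cycle adds more than the processing side can drain, the backlog grows every cycle.

```powershell
# Toy model of the crawl queue. Every number here is an illustrative
# assumption, not a measurement from a real farm.
function Get-QueueDepth {
    param(
        [int]$CrawlComponents,           # servers feeding the queue
        [int]$ProcessingComponents,      # servers draining the queue
        [int]$ItemsPerCrawler = 1000,    # items each crawler submits per cycle (assumed)
        [int]$ItemsPerProcessor = 1500,  # items each processor handles per cycle (assumed)
        [int]$Cycles = 8                 # number of 15-minute continuous crawl cycles
    )

    $queue = 0
    foreach ($cycle in 1..$Cycles) {
        # Each cycle the crawlers add work to the queue...
        $queue += $CrawlComponents * $ItemsPerCrawler
        # ...and the processors remove as much as they can, but never more than is queued.
        $queue -= [Math]::Min($queue, $ProcessingComponents * $ItemsPerProcessor)
    }
    return $queue
}

$before = Get-QueueDepth -CrawlComponents 2 -ProcessingComponents 1
$after  = Get-QueueDepth -CrawlComponents 2 -ProcessingComponents 2

"Two crawlers, one processor:  $before items still queued"
"Two crawlers, two processors: $after items still queued"
```

With these assumed rates the single processor falls 500 items further behind every cycle, while two processors clear the queue each time.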

Maybe some flood protection should have been built in here.

 
