SharePoint Search Topology and Optimization

Search Topology and Optimization. Mike Maadarani, SharePoint Architect, [email protected]. November 23rd, 2013

Upload: mike-maadarani

Post on 01-Jun-2015


DESCRIPTION

This presentation covers the architecture of the SharePoint Search topology, how to extend search, and how to optimize your search farm for better results. It describes how to build your Search topology with PowerShell commands and explains how to use Query Rules and the Query Builder to get great search results.

TRANSCRIPT

Page 1: SharePoint Search Topology and Optimization

Search Topology and Optimization

Mike Maadarani, SharePoint Architect, [email protected]

November 23rd, 2013

Page 2: SharePoint Search Topology and Optimization

Thank you to all of our Sponsors!!

Page 3: SharePoint Search Topology and Optimization

Bio..

Mike Maadarani
App Dev and Architecture for over 18 years (15 years Microsoft, 3 years with the "Other Guys")
Business focused on Enterprise Content Management & Publishing Sites
Technology focused on SharePoint, SQL Server and SharePoint Integration
Architect, trainer, and presenter
Blog: www.maadarani.com
[email protected]; @mikemaadarani

Page 4: SharePoint Search Topology and Optimization

Agenda

SharePoint 2013 Search Overview

Architecture and Resource Utilization

Configuring SSA and PS

Topology Scenarios

Relevancy, Query Builder, & Optimization

Closing and Q&A

Page 5: SharePoint Search Topology and Optimization

Search in 2010

SharePoint 2010 Search Service Application

[Diagram labels: Content, Crawl Component, Crawl Indexing Engine, Query Component, Query Engine, Search Admin, Property Store (SQL), WFE, User]

Page 6: SharePoint Search Topology and Optimization

FAST Search for SharePoint 2010

[Diagram labels: FAST Content SSA, FAST Query SSA, FAST back-end components (managed separately), Content, Crawl Indexing Engine, Content Pipeline, Analysis Engine, Query Pipeline, Query Engine, Search Admin, WFE, User]

Extensibility:
• Sandbox
• Entity Extraction

Page 7: SharePoint Search Topology and Optimization

… In SharePoint 2013

SharePoint 2013 Search Service Application

[Diagram labels: Content, Crawl Component, Content Processing Component, Index Component, Query Processing Component, Analytics Processing Component, Admin Component, Content Pipeline, Query Pipeline, Query Engine, Crawl Indexing Engine, Analysis Engine, Search Admin, Property Store (SQL), WFE, User]

Entire index on local disk

Link/query analysis & recommendations

Separate crawl and indexing

Extensibility:
• Web callout
• Entity Extraction

Page 8: SharePoint Search Topology and Optimization

SharePoint 2013 Search Architecture

[Diagram labels: Content, Crawl, Content Processing, Index (FAST Search Index), Query Processing, WFE, User, Analytics Processing, Search Admin, Link, Analytics Reporting, API, Public API; regions: Content, Query, Search topology components]

Content sources: HTTP, file shares, SharePoint, user profiles, Lotus Notes, Documentum, Exchange folders, custom (BCS)

Consumers: SharePoint, SP Apps, devices, non-SP UX

Page 9: SharePoint Search Topology and Optimization

Why is Search so important?

I just uploaded a document.

Make it searchable, quick!

FAST

Page 10: SharePoint Search Topology and Optimization

Why is Search so important?

EASY

Page 11: SharePoint Search Topology and Optimization

Why is Search so important?

EASY

Page 12: SharePoint Search Topology and Optimization

Why is Search so important?

Search Driven Applications

Page 13: SharePoint Search Topology and Optimization

Why is Search so important?

Search Everything

I can find ALL of Rob Ford’s hidden videos!

Page 14: SharePoint Search Topology and Optimization

Where does Search live in the farm?

Windows services
• SharePoint Search Host Controller service (hostcontrollerservice.exe): runtime/lifecycle control of search components (except crawler)
• SharePoint Server Search service (mssearch.exe, mssdmn.exe): still there, but only for the Crawl Component

Processes
• noderunner.exe: runtime environment for search components (except crawler)
• mssearch.exe, mssdmn.exe: Crawl Component

Admin entities
• Search Service Instance: provisioning of the search service on each box
• Search Service Application: SharePoint configuration entity

[Diagram: SharePoint App Server with the Host Controller (hostcontrollerservice.exe), the Crawl Component (mssearch.exe, mssdmn.exe), and the Search Runtime Environment, where noderunner.exe processes host the Admin, Query Processing, Content Processing, Index, and Analytics Processing components]

Page 15: SharePoint Search Topology and Optimization

Where do I host my components?

Page 16: SharePoint Search Topology and Optimization

Query processing component (QPC)

CPU load
Driving factors:
• QPS (queries per second)
• Query transformations

Network load
Driving factors:
• Number of index partitions
• Size of queries and results
Example: 20 index partitions @ 20 QPS => 200/100 Mbit/s in/outbound

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]

Page 17: SharePoint Search Topology and Optimization

Index component

CPU load
Driving factors:
• QPS and item count
Guidelines per index component @ 2 GHz CPU:
• 1M items: 5 QPS per CPU core
• 5M items: 2 QPS per CPU core
• 10M items: 1 QPS per CPU core

Disk load
Driving factors:
• QPS and item count
• New content invalidates caches
Disk size: 500 GB @ 10M items per index component

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]
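The CPU guidelines for the index component can be turned into a rough core estimate. The sketch below is only an illustration: the function name `Get-IndexCoreEstimate` is hypothetical, and rounding an item count up to the next guideline row is an assumption, since the slide gives three data points and no formula.

```powershell
# Guideline rows from the slide: items per index component -> QPS per 2 GHz CPU core.
$IndexQpsGuidelines = @(
    @{ MaxItems = 1000000;  QpsPerCore = 5 },
    @{ MaxItems = 5000000;  QpsPerCore = 2 },
    @{ MaxItems = 10000000; QpsPerCore = 1 }
)

# Hypothetical helper: estimates CPU cores needed by one index component.
function Get-IndexCoreEstimate {
    param(
        [Parameter(Mandatory = $true)] [long]   $ItemCount,
        [Parameter(Mandatory = $true)] [double] $TargetQps
    )
    # Assumption: use the first guideline row that covers the item count
    # (a conservative step function, not an interpolation).
    foreach ($row in $IndexQpsGuidelines) {
        if ($ItemCount -le $row.MaxItems) {
            return [int][math]::Ceiling($TargetQps / $row.QpsPerCore)
        }
    }
    throw "More than 10M items per index component is outside the slide's guidelines."
}

# Example: 10M items at 8 QPS, then 3M items at 5 QPS.
Get-IndexCoreEstimate -ItemCount 10000000 -TargetQps 8
Get-IndexCoreEstimate -ItemCount 3000000  -TargetQps 5
```

Treat the result as a starting point for capacity planning and validate it against measured query latency on your own farm.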

Page 18: SharePoint Search Topology and Optimization

Crawl component

CPU load
Driving factors:
• Documents per second (DPS)
• Link discovery
• Crawl management

Network load
Driving factors:
• Downloading items from content sources
• Passing items on to CPC

Disk load
All documents are temporarily stored in the data folder

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]

Page 19: SharePoint Search Topology and Optimization

Content processing component (CPC)

CPU load
Driving factors:
• Documents per second
• Document size and complexity
• Feature extraction
Estimate: 5-10 DPS per CPU core

Network load
Driving factors:
• Documents per second
• Document size

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]
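The 5-10 DPS per core estimate gives a quick way to bound full-crawl time. The sketch below assumes content processing is the only bottleneck, which is a simplification; the function name `Get-FullCrawlHours` and its parameters are hypothetical.

```powershell
# Hypothetical helper: hours to process every item once, assuming the CPC
# is the bottleneck and sustains a fixed DPS per CPU core.
function Get-FullCrawlHours {
    param(
        [Parameter(Mandatory = $true)] [long] $ItemCount,
        [Parameter(Mandatory = $true)] [int]  $CpcCores,
        [double] $DpsPerCore = 5   # low end of the slide's 5-10 DPS estimate
    )
    $documentsPerSecond = $CpcCores * $DpsPerCore
    $seconds = $ItemCount / $documentsPerSecond
    return [math]::Round($seconds / 3600, 1)
}

# Pessimistic and optimistic bounds for 10M items on 4 CPC cores.
Get-FullCrawlHours -ItemCount 10000000 -CpcCores 4 -DpsPerCore 5
Get-FullCrawlHours -ItemCount 10000000 -CpcCores 4 -DpsPerCore 10
```

Running both ends of the range gives a window rather than a single number, which is usually the more honest answer for large or complex documents.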

Page 20: SharePoint Search Topology and Optimization

Analytics processing component (APC)

CPU load
Driving factors:
• Number of items
• Site activity

Disk load
Local disk used for temporary storage
Bulk load; primary concern is load isolation

Network load
Same driving factors as for CPU load
PLUS: network traffic increases when distributing APC across multiple machines

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]

Page 21: SharePoint Search Topology and Optimization

Search administration component

Low CPU and network load

Load increases with more components in the search topology

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]

Page 22: SharePoint Search Topology and Optimization

Create your SSA

$SSADB = "SharePoint_Demo_Search"
$SSAName = "Search Service Application Ottawa"
$SVCAcct = "mcm\sp_search"
$SSI = Get-SPEnterpriseSearchServiceInstance -Local

#1. Start the search services for SSI
Start-SPEnterpriseSearchServiceInstance -Identity $SSI

#2. Create the Application Pool
$AppPool = New-SPServiceApplicationPool -Name "$SSAName-AppPool" -Account $SVCAcct

#3. Create the search application and set it to a variable
$SearchApp = New-SPEnterpriseSearchServiceApplication -Name $SSAName -ApplicationPool $AppPool -DatabaseServer SQL2012 -DatabaseName $SSADB

#4. Create search service application proxy
$SSAProxy = New-SPEnterpriseSearchServiceApplicationProxy -Name "$SSAName Application Proxy" -Uri $SearchApp.Uri.AbsoluteURI

#5. Provision Search Admin Component
Set-SPEnterpriseSearchAdministrationComponent -SearchApplication $SearchApp -SearchServiceInstance $SSI

#6. Create the topology
$Topology = New-SPEnterpriseSearchTopology -SearchApplication $SearchApp

#7. Assign server(s) to the topology
$hostApp1 = Get-SPEnterpriseSearchServiceInstance -Identity "SPWFE"
New-SPEnterpriseSearchAdminComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchCrawlComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchIndexComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1 -IndexPartition 0

#8. Activate the topology
$Topology | Set-SPEnterpriseSearchTopology
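After the topology is activated, it is worth confirming that every component came up. The lines below are a minimal check, assuming they run in the same SharePoint 2013 Management Shell session so that `$SearchApp` from the script above is still defined; they are not part of the original deck.

```powershell
# Assumes $SearchApp was set by the script above (SharePoint 2013 Management Shell).

# List all topologies for the service application; the activated one reports State = Active.
Get-SPEnterpriseSearchTopology -SearchApplication $SearchApp

# Show the health of each search component in the active topology.
Get-SPEnterpriseSearchStatus -SearchApplication $SearchApp -Text
```

Components can take a few minutes to report as healthy after activation, so an early non-active state is not necessarily a failure.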

Page 23: SharePoint Search Topology and Optimization

Small Search Topology

Page 24: SharePoint Search Topology and Optimization

Fault tolerant small search topology

[Diagram: two hosts, each running two VMs. On each host, one VM holds Index and QPC, and the other VM holds Admin, Crawl, CPC, and APC.]

Page 25: SharePoint Search Topology and Optimization

Small search farm (up to 10M items)

[Diagram labels: Web front end, Other SharePoint applications, Admin, Crawl, CPC, APC, Index, QPC]

Resources @ 10M items:
• 8x CPU cores
• 24 GB RAM
• 800 GB disk

Sized independently

Separate disk for index

Page 26: SharePoint Search Topology and Optimization

Scaling from small to medium search topology

[Diagram: two rows of components, each showing Adm, Crawl, four Index, QPC, two CPC, and APC]

Page 27: SharePoint Search Topology and Optimization

Extend your SSA

#2. Extend the Search Topology:
$hostApp1 = Get-SPEnterpriseSearchServiceInstance -Identity "SPWFE"
$hostApp2 = Get-SPEnterpriseSearchServiceInstance -Identity "SPSearch"
Start-SPEnterpriseSearchServiceInstance -Identity $hostApp1
Start-SPEnterpriseSearchServiceInstance -Identity $hostApp2

#3. Keep running these commands until the Status is Online:
Get-SPEnterpriseSearchServiceInstance -Identity $hostApp1
Get-SPEnterpriseSearchServiceInstance -Identity $hostApp2

#4. Once the status is Online, you can proceed with the following commands:
$ssa = Get-SPEnterpriseSearchServiceApplication
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
$newTopology = New-SPEnterpriseSearchTopology -SearchApplication $ssa

# Components on the first server (index partition 0)
New-SPEnterpriseSearchAdminComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchCrawlComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchIndexComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1 -IndexPartition 0

# Components on the second server (index partition 1)
New-SPEnterpriseSearchAdminComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchCrawlComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchIndexComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2 -IndexPartition 1

#5. Activate the topology:
Set-SPEnterpriseSearchTopology -Identity $newTopology
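Step 3 of the script above asks you to re-run a status command by hand until it reports Online. One way to automate that wait is a small polling helper. This is a sketch, not part of the original deck: the function name `Wait-SearchInstanceOnline`, the 10-second interval, and the attempt limit are all assumptions.

```powershell
# Hypothetical helper: polls a status-returning script block until it yields "Online".
function Wait-SearchInstanceOnline {
    param(
        [Parameter(Mandatory = $true)] [scriptblock] $GetStatus,
        [int] $PollSeconds = 10,
        [int] $MaxAttempts = 60
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        $status = & $GetStatus
        # Compare as a string so enum values and plain strings both work.
        if ("$status" -eq "Online") { return $attempt }
        Start-Sleep -Seconds $PollSeconds
    }
    throw "Status was still '$status' after $MaxAttempts attempts."
}

# Usage on a farm server, with $hostApp2 defined as in the script above:
# Wait-SearchInstanceOnline -GetStatus { (Get-SPEnterpriseSearchServiceInstance -Identity $hostApp2).Status }
```

The helper returns the number of polls it took, and throws instead of looping forever if the instance never comes online.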

Page 28: SharePoint Search Topology and Optimization

Medium Search Topology

Page 29: SharePoint Search Topology and Optimization

Tweaking Your results

Page 30: SharePoint Search Topology and Optimization

Challenges: Intent

Where is my talk Project Plan?

Are Documents held at the same place?

I wonder if there are references from previous projects?

Different people have different intents

Query Rules help you handle intents

There is rarely a single right answer

Infrastructure Project

Page 31: SharePoint Search Topology and Optimization

Configuration in the Conceptual Relevance Flow

For all queries:

Authorities: Level 1: http://employment

Ranking model: {incorporate user ratings}

Query: HR Employment quarterly report

Search Web Part

Query Processing Engine

Document Collection

Thesaurus: HR Human Resources

Best bets: HR Employment /HR/employment

(WORDS HR, Human Resources) AND (WORDS employees, employed) AND (WORDS quarterly, quarterlies) AND (WORDS report, reports, reported)

Mixed Results for:
• HR Employment best bet
• HR Employment quarterly report
• HR Employment ContentType=reports

Dynamic Reordering Rules: Quarterly Report {prefer docs from http://reports}

Query Rule: {Terms} Quarterly Report {Terms} ContentType=“reports”

Page 32: SharePoint Search Topology and Optimization

Authorities: SSA-level configuration

Sites that are important

Sites with low intrinsic relevance

Takes ~24hrs to propagate

Page 33: SharePoint Search Topology and Optimization

Authorities: Connected

Page 34: SharePoint Search Topology and Optimization

Authorities: Connected

[Diagram: linked sites labelled with their distance from the authority, from 0 (the authority itself) to 4]

Setting an authority affects all sites connected through hyperlinks

Sites are weighted by distance to the authority
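The idea of distance to an authority can be illustrated with a breadth-first walk over a link graph. The sketch below only computes hop counts; it does not reproduce SharePoint's actual ranking math, which the slides do not describe. The function name `Get-ClickDistance`, the sample page names, and the choice to follow links in one direction only are assumptions.

```powershell
# Hypothetical helper: hop count from an authority page to every reachable page.
function Get-ClickDistance {
    param(
        [Parameter(Mandatory = $true)] [hashtable] $Links,     # page -> pages it links to
        [Parameter(Mandatory = $true)] [string]    $Authority
    )
    $distance = @{ $Authority = 0 }
    $queue = New-Object System.Collections.Queue
    $queue.Enqueue($Authority)
    while ($queue.Count -gt 0) {
        $page = $queue.Dequeue()
        foreach ($target in @($Links[$page])) {
            # Skip pages with no outgoing links and pages already reached by a shorter path.
            if ($null -ne $target -and -not $distance.ContainsKey($target)) {
                $distance[$target] = $distance[$page] + 1
                $queue.Enqueue($target)
            }
        }
    }
    return $distance
}

# Made-up link graph: the authority links to two sites, which lead deeper.
$links = @{
    'authority' = @('hr', 'reports')
    'hr'        = @('policies')
    'reports'   = @('policies')
    'policies'  = @('archive')
}
Get-ClickDistance -Links $links -Authority 'authority'
```

Pages closer to the authority get smaller numbers, which matches the labels on the slide's diagram, where the authority is 0 and the farthest sites are 4.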

Page 35: SharePoint Search Topology and Optimization

Query Rules

Tune Search Results

Created at the SSA, Tenant, Site Collection or Site

[Diagram: SSA, Site Collection, Site]

Page 36: SharePoint Search Topology and Optimization

Query Rules

Condition: When do I apply the rule?

Action: What to do when the rule is matched?

Publishing: When should the rule be active?

Page 37: SharePoint Search Topology and Optimization

Query Rules

Conditions
• Exact match, beginning or end
• Ad-hoc or term store dictionary
• Match a regex (advanced)
• Is this query more likely aimed at the following source…?
• Do people mostly click on results of the following type…?

Actions
• Show a promoted result
• Show a block of results
• Replace the core results with a different query

Page 38: SharePoint Search Topology and Optimization

Query Builder

Dynamically Ranking Change

Part of the query

Results Ranking

Page 39: SharePoint Search Topology and Optimization

Query Builder

Page 40: SharePoint Search Topology and Optimization

Session Objective and Takeaways

High Availability and Performance

Better Search Quality

Better management

Friendly results and tools

Page 41: SharePoint Search Topology and Optimization

Thank You / Merci

www.maadarani.com, [email protected] , @mikemaadarani

Q & A

Page 42: SharePoint Search Topology and Optimization

Join us for SharePint today!

Date & Time: Nov 23rd, 2013 @ 6:00 pm

Location: The Observatory Pub, Algonquin Student’s Association

Address: A-170 on Algonquin Campus

Parking: No need to move your car!*

Site: http://www.algonquinsa.com/ob.aspx

*Please drive responsibly! We are happy to call you a cab

Remember to fill out your evaluation forms to win some great prizes!
