“Leveraging SharePoint 2010 Search Technologies” with Ivan Neganov

Post on 17-Dec-2015

“Leveraging SharePoint 2010 Search Technologies” With: Ivan Neganov

Sponsors

Agenda
• Open Discussion
• Topic of the day
• Q&A

Leveraging SharePoint 2010 Search Technologies

Mississauga SharePoint User Group, October 19, 2010

About the Speaker
Ivan Neganov
• Founder of SoftForte, Inc.
• 11 years of experience developing WCM solutions on the ASP.NET and SharePoint platforms
• Focusing on SharePoint since 2007
• Blog: neganov.blogspot.com

the Science of Quality

Agenda
• Enterprise Search defined
• Common search concepts and terms
• Search architecture
• SharePoint search technologies

What is Enterprise Search

• Why not use the Google Search Appliance, aka “Google Box”?

• Why not use an open-source engine like Lucene?

• Why isn’t SharePoint search enough?

• Do I need taxonomy & faceted search?

• Can users just go ahead and tag everything?

Enterprise is not just a large Intranet

• Large volumes of data
• Usually there exists a “right” or highly relevant document
• Security is critical
• Taxonomies and vocabularies are important
• Dates are important
• Corporate data does have structure
• Search is convenient for surfacing content
• Search is promising for future BI applications

Search Scenarios
• Two types of scenarios in an enterprise:
  o Productivity search
    • Intranet/team collaboration search
    • People search/Social computing
    • Site search
  o Search applications
    • Parts search (fuzzy search requirement)
    • Intelligence & Investigation (heavy use of entity extraction)
    • IP protection
    • Compliance/Records management
    • E-commerce
    • Knowledge management & Support
    • BI applications

Microsoft Search Technologies

• Desktop search, successor of Index Server
• SQL Server Search – Full Text Search (FTS)
• Exchange Search – uses the same iFilters as SharePoint
• Bing (formerly Live Search)
  o Bing + Yahoo = 9.5%
• SharePoint & FAST Search

SharePoint 2010 Search Technologies

• Microsoft SharePoint Foundation (Free)
  o Single site collection, 10 million items
  o No external search
  o Automatic configuration
• Microsoft Search Server 2010 Express (Free)
  o Enterprise-level search, 10 million items but single search server only
  o No people search
• Microsoft Search Server 2010
  o Enterprise-level, redundancy support, 100 million items
  o No people search
• Microsoft SharePoint Server 2010
  o 100 million items, added people search, tagging
• Microsoft FAST Search Server for SharePoint
  o Over 200 million items
  o Improved and flexible relevancy
  o Entity extraction
• Microsoft FAST ESP Server
  o Advanced entity extraction
  o Standalone product

Relevancy
• Google: PageRank algorithm

• Same approach is used in FAST and SharePoint 2010

• FAST provides ability to dynamically boost rank
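The link-based ranking idea named on this slide can be sketched in a few lines of Python. This is the textbook PageRank power iteration on a tiny made-up link graph, not the actual ranking code of SharePoint or FAST; the page names, the damping factor of 0.85, and the iteration count are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank scores for a small link graph.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical intranet pages, for illustration only.
links = {
    "home": ["docs", "news"],
    "docs": ["home"],
    "news": ["home", "docs"],
    "orphan": ["home"],
}
scores = pagerank(links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

Pages that many other pages link to end up with higher scores; a page with no inbound links keeps only the baseline share.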

Index

Linguistics
• Word stemming
• Word lemmatization
• Word morphology
  o Collapsing indices
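The difference between stemming and lemmatization can be shown with a toy example. The suffix list and the lemma dictionary below are invented for illustration; real search engines ship much larger, language-specific rules and dictionaries.

```python
# Toy suffix list and lemma dictionary (illustrative assumptions).
SUFFIXES = ("ing", "ed", "es", "s")
LEMMAS = {
    "ran": "run",
    "running": "run",
    "mice": "mouse",
    "indices": "index",
}


def stem(word):
    """Strip a known suffix; the result need not be a real word."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word):
    """Look up the dictionary form; unknown words are returned unchanged."""
    word = word.lower()
    return LEMMAS.get(word, word)


for word in ("searching", "crawled", "indices", "mice"):
    print(word, "->", stem(word), "/", lemmatize(word))
```

Stemming is cheap but crude ("indices" becomes "indic"), while lemmatization maps irregular forms to a proper dictionary entry ("indices" becomes "index") at the cost of maintaining a dictionary.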

Other Common Search Concepts

• Crawling
• Querying
• Crawled & Managed Properties
• Best Bets
• Refiners aka Facets
• Linguistics: Stemming & Lemmatization
• Entity Extraction
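Refiners (facets) boil down to counting how often each value of a property occurs in the result set, so that users can narrow the results by clicking a value. A minimal sketch, assuming results arrive as dictionaries of managed properties; the property names and documents are hypothetical.

```python
from collections import Counter


def compute_refiners(results, properties):
    """Count the values of each requested property across the results."""
    refiners = {}
    for prop in properties:
        counts = Counter(doc[prop] for doc in results if prop in doc)
        # most_common() lists the most frequent value first.
        refiners[prop] = counts.most_common()
    return refiners


# Hypothetical search results, for illustration only.
results = [
    {"Title": "Q3 plan", "FileType": "docx", "Author": "Ana"},
    {"Title": "Budget", "FileType": "xlsx", "Author": "Ana"},
    {"Title": "Q4 plan", "FileType": "docx", "Author": "Raj"},
]
refiners = compute_refiners(results, ["FileType", "Author"])
print(refiners)
```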

High Level Search Architecture

Demo: Search Experience

FAST Search Server 2010 for SharePoint

• Advanced scalability & performance
• Advanced content processing
• Extensibility

FAST Content Processing Pipeline:

FAST ESP
• Essentially re-packaged FAST ESP 5.3
• Planned two SKUs (according to SPC 2009)
  o FAST Search Server for Internet Sites
  o FAST Search Server for Internal Applications
• Updates?

Planning Enterprise Search

• Search is redundant and scalable

Planning FAST Search

Which Search Technology Is Appropriate?

• FAST Search Server requires enterprise CALs

Estimating Costs

Scenario: 40 million documents, medium dedicated search farm

SharePoint Enterprise Search
• 4 – 6 query and index servers
• 1 – 2 database clusters (share)

FAST Search Server for SharePoint
• 4 – 6 query and index servers, 0 – 2 content distributor & web analyzer servers
• 1 – 2 database clusters (share)

Search UI
• Search Web Parts
• Search Center
• Thick clients

Extending Search
• Federation - OpenSearch
• Query Object Model
• BCS Connectors
• RANK & XRANK
• Tapping into the document processing pipeline

Federation

[Diagram: federated search across locations: Madrid, Los Angeles, Hong Kong, South Africa]
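OpenSearch federation works by filling a URL template published by each remote search source and reading back the feed it returns. The sketch below only builds the query URLs; `{searchTerms}` and `{startIndex?}` are standard OpenSearch template parameters, while the endpoint addresses and the `build_query_url` helper are made up for illustration.

```python
import re
from urllib.parse import quote_plus


def build_query_url(template, **params):
    """Fill an OpenSearch URL template such as '...?q={searchTerms}'.

    Parameters written as {name?} are optional and become empty
    when no value is supplied.
    """
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in params:
            return quote_plus(str(params[name]))
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z:]+)(\?)?\}", substitute, template)


# Hypothetical federated locations, one template per remote farm.
locations = {
    "Madrid": "https://madrid.example.com/search?q={searchTerms}&start={startIndex?}&format=rss",
    "Hong Kong": "https://hk.example.com/search?q={searchTerms}&start={startIndex?}&format=rss",
}
urls = {
    name: build_query_url(template, searchTerms="sharepoint search")
    for name, template in locations.items()
}
for name, url in urls.items():
    print(name, url)
```

Each location is queried with the same terms, and the calling page merges or displays the returned feeds side by side.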

Demo: Search Federation

Connector Framework
• Leverage tooling (SPD, VS2010)

Entity Extraction in FAST

• Automatically create crawled properties for a given vocabulary

• Useful for advanced scenarios, for example:
  1. Extract a property at crawl time
  2. Enrich the property
  3. Index the enriched property
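The three steps above (extract, enrich, index) can be sketched as a small pipeline. This is a conceptual illustration in plain Python, not the FAST pipeline API; the vocabulary, the company-to-region lookup, and all function names are hypothetical.

```python
import re

# Hypothetical vocabulary: property name -> terms to look for.
VOCABULARY = {
    "company": ["Contoso", "Fabrikam"],
    "product": ["SharePoint", "FAST"],
}
# Hypothetical enrichment data: company -> sales region.
COMPANY_REGION = {"Contoso": "EMEA", "Fabrikam": "APAC"}


def extract_entities(text, vocabulary):
    """Step 1: find vocabulary terms in the text at crawl time."""
    found = {}
    for prop, terms in vocabulary.items():
        hits = [
            term for term in terms
            if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
        ]
        if hits:
            found[prop] = hits
    return found


def enrich(properties, company_region):
    """Step 2: derive a new property from an extracted one."""
    enriched = dict(properties)
    regions = {
        company_region[c] for c in properties.get("company", [])
        if c in company_region
    }
    if regions:
        enriched["region"] = sorted(regions)
    return enriched


def index_document(index, doc_id, properties):
    """Step 3: add every (property, value) pair to an inverted index."""
    for prop, values in properties.items():
        for value in values:
            index.setdefault((prop, value.lower()), set()).add(doc_id)


index = {}
text = "Contoso rolled out FAST search last quarter."
properties = enrich(extract_entities(text, VOCABULARY), COMPANY_REGION)
index_document(index, "doc1", properties)
print(properties)
print(index)
```

After indexing, a query or refiner on `region` can find the document even though the word "EMEA" never appears in its text.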

Search in the Enterprise: Future

• Amount of content will continue to grow
• Search will integrate with Business Intelligence applications
• Entity, Sentiment and Fact extraction
• Search as navigation
• Search visualization
• Search as a service
• Many more custom applications leveraging search

Resources
• Microsoft TechNet, MSDN
• Professional Microsoft Search 2010

Questions
