cloud powered search

Post on 16-Jul-2015


Cloud powered search

LIVIU MAZILU, RADU PINTILIE

April 25, 2015

© EXPERT NETWORK

CODECAMP

Previous subjects

Challenges in distributed applications

SQL Azure Federation

HDInsight

DocumentDB

Azure Search

Agenda

The need for search

Search explained

Development

Case Scenarios

The need for search

Why do we search for data?

How do we store it to search efficiently?

What’s important?

Is this a search engine?

WHERE [field] LIKE '%codecamp%'

WHAT IS A SEARCH ENGINE?

Efficient indexing of data

On all fields / combination of fields

Analyzing data

Text search

Tokenizing

Stemming

Filtering

Understanding locations

Relevance scoring
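The text analysis steps listed above (tokenizing, stemming, filtering) can be sketched in a few lines of Python. This is only an illustration: the stop-word list and the suffix rules are made up for the example, and real analyzers such as the ones shipped with Lucene are considerably more careful.

```python
import re

# Hypothetical stop-word list and suffix rules, chosen only for illustration.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "is"}
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Strip one common English suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Tokenize, filter out stop words, then stem what is left."""
    return [stem(token) for token in tokenize(text) if token not in STOP_WORDS]


terms = analyze("Searching the indexed documents")
print(terms)
```

With this pipeline a query for "search" and a document containing "Searching" end up on the same term, which a plain LIKE comparison would not do.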

Lucene

Document: collection of fields

Field: string based key-value pair

Collection: set of documents

Inverted index: maps each term to the list of documents that contain it

Score: relevancy for each document matching the query
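To make the idea of a relevance score concrete, here is a minimal sketch of plain TF-IDF ranking. It is not Lucene's actual scoring formula, which adds normalization and boosting; the toy collection and the function name are assumptions made for the example.

```python
import math
from collections import Counter

# Toy collection: document id -> already analyzed terms (made up for illustration).
DOCS = {
    1: ["cloud", "search", "search"],
    2: ["cloud", "storage"],
    3: ["search", "engine"],
}


def tf_idf_scores(query_terms, docs):
    """Return (doc_id, score) pairs for matching documents, best match first."""
    n_docs = len(docs)
    scores = {}
    for doc_id, terms in docs.items():
        counts = Counter(terms)
        score = 0.0
        for term in query_terms:
            # Document frequency: in how many documents the term occurs.
            df = sum(1 for other in docs.values() if term in other)
            if df == 0 or counts[term] == 0:
                continue
            idf = math.log(n_docs / df) + 1.0
            score += counts[term] * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


ranking = tf_idf_scores(["search"], DOCS)
print(ranking)
```

Document 1 ranks above document 3 because it contains the query term twice; document 2 does not match and is left out.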

How searching works

Id  Title                                                                        UserId  ViewCount  Tags
1   Controller Action ambiguity even with [HttpPost] decoration? (ASP.NET MVC4)  5       352        asp.net asp.net-mvc asp.net-mvc-4 f#
2   Why can't I use a scrollwheel on a webpage?                                  6       109        c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3
3   Access session variable of one site in another                               7       78         asp.net .net
4   Check if SIM card exists                                                     5       209        c# windows-phone-8

How searching works: inverted index

Title                                                                        Id
Access session variable of one site in another                               3
Check if SIM card exists                                                     4
Controller Action ambiguity even with [HttpPost] decoration? (ASP.NET MVC4)  1
Why can't I use a scrollwheel on a webpage?                                  2

UserID  Ids
5       1, 4
6       2
7       3

ViewCount  Id
78         3
109        2
209        4
352        1

Query: UserID = 5
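The lookup above can be sketched as a dictionary from field value to document ids. The rows are the four sample posts from the slides; the helper name is an assumption for the example.

```python
from collections import defaultdict

# The four sample posts, reduced to the fields that are indexed here.
ROWS = [
    {"Id": 1, "UserId": 5, "ViewCount": 352},
    {"Id": 2, "UserId": 6, "ViewCount": 109},
    {"Id": 3, "UserId": 7, "ViewCount": 78},
    {"Id": 4, "UserId": 5, "ViewCount": 209},
]


def build_field_index(rows, field):
    """Map every distinct value of `field` to the ids of the rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[field]].append(row["Id"])
    return dict(index)


user_index = build_field_index(ROWS, "UserId")
print(user_index.get(5, []))  # ids of the posts written by user 5
```

Answering the query is a single dictionary lookup instead of a scan over all rows.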

How searching works: full text search

Id  Tags
1   asp.net asp.net-mvc asp.net-mvc-4 f#
2   c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3
3   asp.net .net
4   c# windows-phone-8

Term                 Doc
.net                 3
asp.net              1, 2, 3
asp.net-mvc-4        1, 2
c#                   2, 4
f#                   1
javascript           2
mvc                  1
twitter-bootstrap-3  2
windows-phone-8      4

Query: “javascript” in Tags

Query: “asp.net” in Tags
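A small sketch of how such a term table could be built from the Tags column and then queried. It splits on whitespace only, so it will not reproduce every row of the slide's table (for example, it produces "asp.net-mvc" rather than "mvc"); the function names are assumptions for the example.

```python
from collections import defaultdict

# Tags column of the four sample posts.
TAGS = {
    1: "asp.net asp.net-mvc asp.net-mvc-4 f#",
    2: "c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3",
    3: "asp.net .net",
    4: "c# windows-phone-8",
}


def build_inverted_index(docs):
    """Map each whitespace-separated term to the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, term):
    """Return the ids of the documents that contain the term."""
    return index.get(term, [])


tag_index = build_inverted_index(TAGS)
print(search(tag_index, "javascript"))
print(search(tag_index, "asp.net"))
```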

Uses: auto-completion

Uses: auto-correction

Phrasing: "Iframe security" vs. "Security in an Iframe"

Word-level distance: grey/gray, color/colour
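One common way to measure how far apart two spellings are is the Levenshtein edit distance. The slides do not say which distance the engine uses, so the following is only a sketch of the general idea.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


for left, right in [("grey", "gray"), ("color", "colour")]:
    print(left, right, edit_distance(left, right))
```

Both pairs from the slide are one edit apart, which is why a fuzzy match with a small distance threshold can treat them as the same word.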

Elasticsearch

Distributed: aggregates the results of searches performed on multiple shards/indices

Schema-less: document oriented, with documents in JSON format

RESTful: exposes a REST interface

Faceted search: supports navigational search functionality

Replication: supports index replication

Failover: replication and the distributed design provide built-in failover

Near real time: supports near-real-time updates

Elasticsearch: distributed & highly available

• Multiple servers (nodes) running in a cluster
  • Acting as a single service
  • Nodes either store data or only help speed up search queries
• Sharding
  • Indices are sharded (the number of shards is configurable)
  • Each shard can have zero or more replicas
  • Replicas sit on different servers (server pools) for failover
• One node in the cluster goes down? No problem.

Azure search

Elasticsearch as a managed service

Platform as a service (PaaS)

Admin by Rest API

Data exchange with JSON

Where are we at

Service              Ease of use  Scalability  Easy administration
Manual search (SQL)  No           No           Partial
Elasticsearch        Yes          Yes          No
Azure Search         Yes          Yes          Yes

Resource model

Service
    Index (schema type 1)
    Index (schema type 2)
        Document
        Document
            Field1
            Field2
            Field3
            Field4
    Indexers

Demo: Management Portal

Index creation

POST https://codecamp-en.search.windows.net/indexes

{
  "name": "stackoverflow-posts",
  "fields": [ {
    "name": "name_of_field",
    "type": "data_type",
    "searchable": true (default where applicable) | false,
    "filterable": true (default) | false,
    "sortable": true (default where applicable) | false,
    "facetable": true (default where applicable) | false,
    "key": true | false (default),
    "retrievable": true (default) | false
  } ] …
}
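As a rough illustration, the index-creation request could be assembled in Python like this. The field list, the Edm.* types picked for each column, the `api-version` value, and the key placeholder are assumptions for the example and do not come from the slides; the request is only built, not sent.

```python
import json
import urllib.request

SERVICE_URL = "https://codecamp-en.search.windows.net"
API_VERSION = "2015-02-28"      # assumption: the slide does not show an api-version
ADMIN_KEY = "<admin-api-key>"   # placeholder, never hard-code a real key


def build_create_index_request(index_definition):
    """Build (but do not send) the POST request that creates an index."""
    url = f"{SERVICE_URL}/indexes?api-version={API_VERSION}"
    body = json.dumps(index_definition).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": ADMIN_KEY}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical schema for a few of the Stack Overflow columns.
index_definition = {
    "name": "stackoverflow-posts",
    "fields": [
        {"name": "Id", "type": "Edm.String", "key": True, "searchable": False},
        {"name": "Title", "type": "Edm.String", "searchable": True},
        {"name": "Tags", "type": "Edm.String", "searchable": True},
        {"name": "ViewCount", "type": "Edm.Int32", "filterable": True, "sortable": True},
    ],
}

request = build_create_index_request(index_definition)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; left out so the sketch runs offline.
```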

Index documents

Indexers

Data sources: Azure SQL Database, DocumentDB

Connects data sources with target search indexes

An indexer can be used in the following ways:

one-time copy of the data to populate an index

sync an index with changes from the data source on a schedule

invoke on-demand to update an index as needed

CRUD Operations

Add, Update, Delete

POST https://codecamp-en.search.windows.net/indexes/stackoverflow/docs/index

{
  "value": [ {
    "@search.action": "upload (default) | merge | mergeOrUpload | delete",
    "key_field_name": "unique_key_of_document",  (key/value pair for the key field from the index schema)
    "field_name": field_value  (key/value pairs matching the index schema)
  } ]
}
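A minimal sketch of building such a request body in Python. The REST API expects the documents wrapped in a "value" array, each tagged with its action; the helper name and the sample document are assumptions for the example.

```python
import json

ALLOWED_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}


def build_batch(documents, action="upload"):
    """Wrap documents in a {"value": [...]} envelope, tagging each with the action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return {"value": [{"@search.action": action, **doc} for doc in documents]}


# Hypothetical document; the field names must match the index schema.
batch = build_batch([{"Id": "4", "Title": "Check if SIM card exists", "ViewCount": 209}])
print(json.dumps(batch, indent=2))

# A delete only needs the key field of the document to remove.
delete_batch = build_batch([{"Id": "4"}], action="delete")
```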

Searching through data

GET https://codecamp-en.search.windows.net/indexes/stackoverflow/docs?

search=[string]

+ (AND operator: "code" and "camp")

| (OR operator: "code" or "camp" or both)

- (NOT operator: "code -camp" matches documents that have the "code" term and/or do not have "camp")

* (Suffix operator: "cod*" matches terms that start with "cod", ignoring case)

" (Phrase search operator)

( ) (Precedence operator: code+(camp|workshop))

searchMode=any|all

searchFields=[string]
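Because the operators include characters such as + and |, the query string needs URL encoding. A small sketch of building the search URL follows; the helper name, the field names, and the `api-version` value are assumptions for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://codecamp-en.search.windows.net/indexes/stackoverflow/docs"


def build_search_url(search, search_mode="any", search_fields=None,
                     api_version="2015-02-28"):
    """Return the GET URL for a search; api_version is an assumed value."""
    params = {"search": search, "searchMode": search_mode, "api-version": api_version}
    if search_fields:
        params["searchFields"] = ",".join(search_fields)
    return f"{BASE_URL}?{urlencode(params)}"


url = build_search_url("code+(camp|workshop)", search_mode="all",
                       search_fields=["Title", "Tags"])
print(url)
# Decoding the query string gives back the original operators.
print(parse_qs(urlsplit(url).query)["search"])
```

With searchMode=all every term of the query has to match; with the default any, a single matching term is enough.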

Filtering results

$filter=[string] (OData syntax)

$skip=#

$top=#

$count=true|false

$orderby=[string]

$select=[string]
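These parameters combine freely with the search text. A sketch of building the query string for one filtered, sorted page of results; the function name, field names, and filter expression are assumptions for the example.

```python
from urllib.parse import parse_qs, urlencode


def build_filtered_query(search, filter_expr=None, order_by=None, select=None,
                         skip=0, top=10, count=False):
    """Return the URL-encoded query string for a filtered, paged search."""
    params = {
        "search": search,
        "$skip": skip,
        "$top": top,
        "$count": "true" if count else "false",
    }
    if filter_expr:
        params["$filter"] = filter_expr      # OData expression
    if order_by:
        params["$orderby"] = order_by
    if select:
        params["$select"] = ",".join(select)
    return urlencode(params)


# Second page of five results: posts tagged asp.net with more than 100 views.
query = build_filtered_query(
    "asp.net",
    filter_expr="ViewCount gt 100",
    order_by="ViewCount desc",
    select=["Id", "Title"],
    skip=5,
    top=5,
    count=True,
)
print(query)
```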

Emphasizing results

facet=[string] (field names), optionally with the parameters:

count

sort

values

interval

highlight=[string] (field names)

highlightPreTag=[string] (default is <em>)

highlightPostTag=[string] (default is </em>)

Suggestions

GET https://codecamp-en.search.windows.net/indexes/stackoverflow/docs/suggest

search=[string]

suggesterName=[string]

fuzzy=[boolean]

searchFields=[string]

Sample Data

Stackoverflow Posts

5,215,584 records

212 MB in Title column

118 MB in Tags column

10.5 GB in Body column

Column Name   Data Type
Id            int
CreationDate  datetime
Score         float
ViewCount     int
Body          nvarchar
OwnerUserId   int
Title         nvarchar
Tags          nvarchar

Demo: Search API

Scaling

Capacity measured in Search Units

1 Search Unit: 1 partition, 1 replica

Horizontal scaling by increasing the number of partitions and/or replicas

Storage

Partition limitations:

15 million documents

25 GB data

Every index is split into 12 shards by default

Each partition can store 1, 2, 3, 4, 6, or 12 shards
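The numbers above can be turned into a small capacity calculation. This sketch assumes that search units are the product of partitions and replicas, that the per-partition limits add up linearly, and that the 12 shards are spread evenly over the partitions (which is one reading of the 1, 2, 3, 4, 6, 12 list); the function name is made up for the example.

```python
ALLOWED_PARTITIONS = (1, 2, 3, 4, 6, 12)  # assumption: the divisors of 12 shards
SHARDS_PER_INDEX = 12
DOCS_PER_PARTITION = 15_000_000
GB_PER_PARTITION = 25


def capacity(partitions, replicas):
    """Estimate search units and storage limits for a service configuration."""
    if partitions not in ALLOWED_PARTITIONS:
        raise ValueError(f"partitions must be one of {ALLOWED_PARTITIONS}")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return {
        "search_units": partitions * replicas,
        "max_documents": partitions * DOCS_PER_PARTITION,
        "max_storage_gb": partitions * GB_PER_PARTITION,
        "shards_per_partition": SHARDS_PER_INDEX // partitions,
    }


print(capacity(partitions=3, replicas=2))
```

Under these assumptions, adding partitions raises the storage limits while adding replicas only raises the number of search units.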

USE CASE SCENARIOS

Online retail/ecommerce

User generated/social content

Not just for the web

Hybrid Applications

Conclusions

The need for search

Search explained

Development

Case Scenarios

Questions?

THANK YOU
