Full-text search: how it works and what it can do – Couchbase Connect 2016

©2016 Couchbase Inc. Couchbase Full Text Search (FTS)

Upload: couchbase

Post on 15-Feb-2017

Category: Software

TRANSCRIPT

Page 1: Full-text search: how it works and what it can do – Couchbase Connect 2016

©2016 Couchbase Inc.

Couchbase Full Text Search (FTS)

Page 2: Full-text search: how it works and what it can do – Couchbase Connect 2016

about your speakers

Marty Schoch
Steve Yen

Page 3: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

why?
what is it?
how does it work?
how does it scale?
demo
best practices
status / roadmap / what’s next

Page 4: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

why?

Page 5: Full-text search: how it works and what it can do – Couchbase Connect 2016

couchbase users need to search their documents

Page 6: Full-text search: how it works and what it can do – Couchbase Connect 2016

dedicated search solutions

✗ Provision
✗ Install
✗ Integrate
✗ Transfer data
✗ Learn
✗ Manage
✗ Troubleshoot

Page 7: Full-text search: how it works and what it can do – Couchbase Connect 2016

why Full Text Search?

simple
integrated

80/20 of features

Page 8: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

what is it?

Page 9: Full-text search: how it works and what it can do – Couchbase Connect 2016

what’s Full Text Search?

Page 12: Full-text search: how it works and what it can do – Couchbase Connect 2016

search results

Result Text Snippets

Highlighted Search Terms

Page 13: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

how does it work?

Page 14: Full-text search: how it works and what it can do – Couchbase Connect 2016

how does it work?

• Inverted indexes
• Language awareness
• Scoring

Page 15: Full-text search: how it works and what it can do – Couchbase Connect 2016

inverted index

Terms: Where found

my: Doc 1, Doc 2, Doc 3
dog: Doc 1, Doc 2, Doc 81
has: Doc 1, Doc 2, Doc 3
fleas: Doc 1, Doc 81
…
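
A minimal sketch of the structure on this slide: each term maps to the list of documents that contain it. The function name `build_inverted_index`, the whitespace tokenizer, and the sample documents are assumptions made for illustration; they are not taken from the talk or from FTS itself.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical documents, not the ones behind the slide.
docs = {
    "Doc 1": "my dog has fleas",
    "Doc 2": "my dog has toys",
    "Doc 3": "my cat has toys",
}
index = build_inverted_index(docs)
```

Looking up a term such as `index["dog"]` then costs one dictionary access instead of a scan over every document.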

Page 16: Full-text search: how it works and what it can do – Couchbase Connect 2016

language aware

Document contains… Beauty
Text analysis (stemming)
Indexed as… beauti

User searches… Beautiful
Text analysis (stemming)
Searched as… beauti

✔ Match!
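
The same analysis has to run at index time and at query time for the match to happen. The sketch below uses a deliberately tiny, made-up analyzer (`toy_analyze`) that only reproduces the slide's example; it is not a real stemming algorithm and not what FTS ships.

```python
def toy_analyze(word):
    """Toy analyzer: lowercase, strip a 'ful' suffix, then turn a final 'y' into 'i'.

    This is NOT a real stemmer; it only mimics the example on the slide.
    """
    term = word.lower()
    if term.endswith("ful"):
        term = term[:-3]
    if term.endswith("y"):
        term = term[:-1] + "i"
    return term


indexed_as = toy_analyze("Beauty")       # what the document term becomes
searched_as = toy_analyze("Beautiful")   # what the user's query term becomes
is_match = indexed_as == searched_as
```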

Page 17: Full-text search: how it works and what it can do – Couchbase Connect 2016

scoring

Page 18: Full-text search: how it works and what it can do – Couchbase Connect 2016

TF/IDF scoring

• TF = Term Frequency
  • How often does a term occur in a document?
  • More often yields a higher score

• IDF = Inverse Document Frequency
  • How many documents have this term?
  • More documents yields a lower score (because it means the term is more common)
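
The two bullets can be sketched with a textbook TF-IDF formula. The exact weighting that FTS applies may differ; the formula, the function name `tf_idf`, and the sample documents here are assumptions chosen only to show the direction of each effect.

```python
import math


def tf_idf(term, doc_terms, all_docs):
    """Score one term for one document with a textbook TF-IDF formula."""
    tf = doc_terms.count(term)                    # term frequency in this document
    df = sum(1 for d in all_docs if term in d)    # number of documents with the term
    if tf == 0 or df == 0:
        return 0.0
    idf = math.log(len(all_docs) / df) + 1.0      # rarer terms get a larger weight
    return tf * idf


# Hypothetical tokenized documents.
docs = [
    ["my", "dog", "has", "fleas"],
    ["my", "dog", "has", "a", "dog", "bed"],
    ["my", "cat", "has", "toys"],
]
score_rare = tf_idf("fleas", docs[0], docs)     # term in one document only
score_common = tf_idf("my", docs[0], docs)      # term in every document
```

With these inputs the rare term "fleas" scores higher than the ubiquitous "my", and "dog" scores higher in the document that mentions it twice.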

Page 20: Full-text search: how it works and what it can do – Couchbase Connect 2016

index mapping

• Exclude fields/sub-sections
• Configure indexing behavior by type of document (beer vs brewery)
• Configure indexing behavior per-field
• Index Fields
• Nested structures
• Arrays
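
One way to picture an index mapping is as a per-type list of field paths to index, with everything else excluded. The sketch below is a conceptual model only: the `mapping` layout, the dotted-path convention, and both helper functions are hypothetical and do not reproduce the real FTS mapping format.

```python
def get_path(doc, path):
    """Follow a dotted path such as 'geo.city' through nested dicts."""
    value = doc
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def fields_to_index(doc, mapping):
    """Return {path: [values]} for the fields the mapping selects for this document type."""
    paths = mapping.get(doc.get("type"), [])    # unmapped types index nothing
    selected = {}
    for path in paths:
        value = get_path(doc, path)
        if value is None:
            continue
        # An array contributes every element; a scalar contributes itself.
        selected[path] = list(value) if isinstance(value, list) else [value]
    return selected


# Hypothetical mapping: which fields to index for each document type.
mapping = {
    "beer": ["name", "description"],
    "brewery": ["name", "geo.city", "tags"],
}
brewery = {
    "type": "brewery",
    "name": "Example Brewing",
    "geo": {"city": "Portland", "lat": 45.5},
    "tags": ["ales", "lagers"],
    "random_number": 4,
}
selected = fields_to_index(brewery, mapping)
```

Fields that the mapping does not name, such as `random_number` and `geo.lat`, never reach the index.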

Page 21: Full-text search: how it works and what it can do – Couchbase Connect 2016

precision vs. recall

• Precision – ratio of document matches that are actually relevant
• Recall – ratio of relevant documents that are actually matched
• High quality results depend on performing the right analysis for your text
• Beware: increasing precision may reduce recall (and vice versa)
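
The two ratios can be computed directly from the set of matched documents and the set of relevant documents. The function name and the document ids below are illustrative assumptions.

```python
def precision_recall(matched, relevant):
    """Return (precision, recall) for a set of matches against the relevant set."""
    matched, relevant = set(matched), set(relevant)
    hits = matched & relevant
    precision = len(hits) / len(matched) if matched else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four documents matched, three documents are truly relevant.
precision, recall = precision_recall(
    matched=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d2", "d5"],
)
```

Here two of the four matches are relevant (precision 0.5) and two of the three relevant documents were found (recall about 0.67). Matching fewer documents could raise the first number while lowering the second.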

Page 22: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

how does it scale?

Page 36: Full-text search: how it works and what it can do – Couchbase Connect 2016

how does it scale?

✔ auto index partitioning (hash partitioning)
✔ to multiple FTS nodes (auto-placement)
✔ rebalance (add/swap/remove)
✔ scatter/gather queries (partial results ok)
✔ replicas (only primaries queried)
✔ failover (replicas promoted)
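
The scatter/gather item, including "partial results ok", can be sketched as follows. This runs sequentially and in one process for clarity; the `search` callback, the `(score, doc_id)` hit shape, and the use of `ConnectionError` to model an unreachable partition are all assumptions, not the FTS implementation.

```python
import heapq


def scatter_gather(partitions, search, limit=3):
    """Query every partition, merge the top hits, and report failed partitions."""
    hits, failed = [], []
    for name, partition in partitions.items():
        try:
            hits.extend(search(partition))      # scatter: one sub-query per partition
        except ConnectionError:
            failed.append(name)                 # partial results are acceptable
    top = heapq.nlargest(limit, hits, key=lambda hit: hit[0])   # gather: merge by score
    return top, failed


def search(partition):
    """Hypothetical per-partition search returning (score, doc_id) pairs."""
    if partition is None:
        raise ConnectionError("partition unreachable")
    return partition


partitions = {
    "A": [(0.9, "doc1"), (0.2, "doc7")],
    "B": None,                                  # simulates a node that is down
    "C": [(0.7, "doc3"), (0.4, "doc9")],
}
top, failed = scatter_gather(partitions, search)
```

The caller still receives the best hits from partitions A and C, plus a list naming B as the partition that could not answer.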

Page 37: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

demo

Page 38: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

best practices

Page 39: Full-text search: how it works and what it can do – Couchbase Connect 2016

only use explicit field mappings in production

{ "type" : "brewery", "random_number" : 4, "edible" : false }

Dynamic mappings are great, until…

Page 40: Full-text search: how it works and what it can do – Couchbase Connect 2016

only use explicit field mappings in production

{ "type" : "brewery",
  "comments" : "(4k of text)",
  "random_number" : 4,
  "edible" : false }

Developer adds one small field
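
A rough way to see the risk: under a dynamic mapping every field is indexed, so a new 4k field grows the index for every document, while an explicit mapping ignores it. The sketch below measures only the length of the text that would be indexed; `indexed_bytes` and its size estimate are illustrative assumptions, not an FTS measurement.

```python
def indexed_bytes(doc, explicit_fields=None):
    """Rough size of the text that would be indexed for one document.

    explicit_fields=None models a dynamic mapping that indexes every field.
    """
    fields = doc.keys() if explicit_fields is None else explicit_fields
    return sum(len(str(doc[name])) for name in fields if name in doc)


before = {"type": "brewery", "random_number": 4, "edible": False}
after = dict(before, comments="x" * 4096)       # the developer's new 4k field

dynamic_growth = indexed_bytes(after) - indexed_bytes(before)
explicit_growth = indexed_bytes(after, ["type"]) - indexed_bytes(before, ["type"])
```

In this model the dynamic mapping picks up all 4096 new characters per document, and the explicit mapping picks up none.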

Page 42: Full-text search: how it works and what it can do – Couchbase Connect 2016

always use Index Aliases

Index Rebuilding

Page 43: Full-text search: how it works and what it can do – Couchbase Connect 2016

always use Index Aliases

/users /usersV1

Page 44: Full-text search: how it works and what it can do – Couchbase Connect 2016

always use Index Aliases

/users

/usersV1

/usersV2

Indexing 55%

Page 45: Full-text search: how it works and what it can do – Couchbase Connect 2016

always use Index Aliases

/users

/usersV1

/usersV2

Atomic Switch to /usersV2

Page 46: Full-text search: how it works and what it can do – Couchbase Connect 2016

always use Index Aliases

/users

/usersV2

Atomic Switch to /usersV2
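
The alias pattern across these slides can be modeled as a small lookup table: the application always queries the alias, and a rebuild ends with a single repointing step. `AliasTable` and its methods are hypothetical names for this sketch and are not an FTS API.

```python
import threading


class AliasTable:
    """Hypothetical alias registry: maps an alias name to one real index name."""

    def __init__(self):
        self._targets = {}
        self._lock = threading.Lock()

    def point(self, alias, index_name):
        # One assignment under a lock: readers see the old or the new target, never a mix.
        with self._lock:
            self._targets[alias] = index_name

    def resolve(self, alias):
        with self._lock:
            return self._targets[alias]


aliases = AliasTable()
aliases.point("users", "usersV1")       # the application always queries "users"
before = aliases.resolve("users")
aliases.point("users", "usersV2")       # switch once usersV2 has finished indexing
after = aliases.resolve("users")
```

Because queries never name `usersV1` or `usersV2` directly, the old index can be dropped after the switch without any application change.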

Page 47: Full-text search: how it works and what it can do – Couchbase Connect 2016

go watch!

Dave Starling

seenit

Page 48: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

status / roadmap / what’s next

Page 49: Full-text search: how it works and what it can do – Couchbase Connect 2016

project status

FTS is developer preview in 4.5, 4.6

planned GA in Spock

please help kick the tires

http://www.couchbase.com/download

Page 50: Full-text search: how it works and what it can do – Couchbase Connect 2016

Couchbase Full Text Search (FTS)

thanks!

Page 51: Full-text search: how it works and what it can do – Couchbase Connect 2016

links & Q+A

http://NICE-URL-TODO-HERE

downloads, getting started, tech docs

and, where you can ask questions

and share your feedback!

Page 52: Full-text search: how it works and what it can do – Couchbase Connect 2016

EXTRA SLIDES

Page 53: Full-text search: how it works and what it can do – Couchbase Connect 2016

FTS design

couchbase couchbase couchbase

FTS FTS FTS

cfg

DCP streams for incremental index updates

a cfg bucket holds metadata about the indexes

Page 60: Full-text search: how it works and what it can do – Couchbase Connect 2016

agenda

design

Page 68: Full-text search: how it works and what it can do – Couchbase Connect 2016

FTS design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

index partitions (groups of vbuckets):
A: 0-399
B: 400-799
C: 800-1023

assign to FTS nodes; replicas, too

FTS nodes: X Y Z
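
The grouping on this slide can be reproduced with a few lines: cut the 1024 vbuckets into ranges of 400 and give each range a primary and a replica node. The round-robin placement below is an assumption made for the sketch; the slide does not say how FTS chooses nodes, and `plan_partitions` is a hypothetical helper.

```python
def plan_partitions(num_vbuckets, size, nodes):
    """Group vbuckets into index partitions and place a primary and one replica."""
    plan = []
    for i, start in enumerate(range(0, num_vbuckets, size)):
        end = min(start + size, num_vbuckets) - 1
        plan.append({
            "partition": chr(ord("A") + i),            # A, B, C, ... (fewer than 26 partitions)
            "vbuckets": (start, end),
            "primary": nodes[i % len(nodes)],
            "replica": nodes[(i + 1) % len(nodes)],    # a different node when there are 2+ nodes
        })
    return plan


plan = plan_partitions(1024, 400, ["X", "Y", "Z"])
```

With three nodes this yields A (0-399) on X, B (400-799) on Y, and C (800-1023) on Z, each with its replica on the next node.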

Page 69: Full-text search: how it works and what it can do – Couchbase Connect 2016

FTS design / indexing

couchbase couchbase couchbase

FTS FTS FTS

DCP streams for incremental index updates
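
Incremental updating means each change event touches only the terms of the affected document instead of rebuilding the index. The event tuples below are a simplified stand-in for a change stream, since real DCP messages carry more metadata; `apply_mutations` and the `stored` bookkeeping dict are assumptions for this sketch.

```python
def apply_mutations(index, stored, stream):
    """Apply (op, doc_id, text) events to an inverted index in place."""
    for op, doc_id, text in stream:
        # Remove the document's old terms first, so updates leave no stale entries.
        for term in stored.pop(doc_id, set()):
            index[term].discard(doc_id)
            if not index[term]:
                del index[term]
        if op == "upsert":
            terms = set(text.lower().split())
            stored[doc_id] = terms
            for term in terms:
                index.setdefault(term, set()).add(doc_id)


index, stored = {}, {}
# Hypothetical change stream: two inserts, one update, one delete.
stream = [
    ("upsert", "doc1", "my dog has fleas"),
    ("upsert", "doc2", "my cat"),
    ("upsert", "doc1", "my dog is clean"),
    ("delete", "doc2", None),
]
apply_mutations(index, stored, stream)
```

After the stream is applied, the index reflects only the latest version of doc1, and every entry for the deleted doc2 is gone.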

Page 72: Full-text search: how it works and what it can do – Couchbase Connect 2016

FTS design / queries

your application

REST

a query sent to any FTS node…

…is scatter / gathered to the other FTS nodes

FTS FTS FTS
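
From the application's side, a query is a single REST call to one FTS node. The sketch below only builds the request and does not send it. The port 8094, the `/api/index/<name>/query` path, and the body fields are assumptions based on the FTS REST API as documented for later releases and may not match the developer preview; the host and index name are placeholders, and authentication is omitted.

```python
import json
import urllib.request


def build_search_request(host, index_name, text, size=10):
    """Build (but do not send) an HTTP POST for a full-text query."""
    body = {
        "query": {"query": text},       # query-string style query (assumed body shape)
        "size": size,                   # maximum number of hits to return
        "highlight": {},                # ask for highlighted snippets
    }
    return urllib.request.Request(
        url=f"http://{host}:8094/api/index/{index_name}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and index name.
request = build_search_request("localhost", "users", "beautiful")
```

Passing the request to `urllib.request.urlopen` would send it to one node, which would then scatter the query to the others and gather the merged result.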