Intro to Elasticsearch - Elasticsearch Bucharest Group @ Softbinator

Post on 21-Oct-2014

Category:

Technology

DESCRIPTION

A short intro to the Elasticsearch technology for young programmers in Bucharest.

TRANSCRIPT

Elasticsearch: Scalable Full-Text Search Engine

Thursday, February 27, 14

Goals for this talk

Outline

• What’s full text search and why do we use it?

• What can you do with Elasticsearch?

• Why is Elasticsearch different?

• DEMO TIME!

Text Search: do I really need to explain it?

%LIKE%

• In the beginning there was:

SELECT * FROM tweets WHERE content LIKE '%zuckerberg%'

But that’s not what you usually search for!

• You want:

Search by author

Search by time

Search by sentiment

Search by location

Search by everything!

That’s a lot of metadata!

• You can’t scan all of that on the fly if you want real-time results

• You need to index it first!

Inverted Index

• Some documents:

1: ‘Mark Zuckerberg sells Facebook’ [Monday]

2: ‘Facebook buys WhatsApp’ [Tuesday]

3: ‘Mark’s Facebook buys Instagram’ [Monday]

• Inverted index for them:

Facebook: {1, 2, 3}

Mark: {1, 3}

Instagram: {3}

WhatsApp: {2}

[Monday]: {1, 3}
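A minimal Python sketch of how such an inverted index could be built from the three example documents. The tokenizer here (whitespace split, possessive stripped) and the names `DOCS`, `tokenize` and `build_inverted_index` are assumptions for illustration; a real search engine uses far more elaborate text analysis.

```python
from collections import defaultdict

# The three example documents: id -> (text, day tag).
DOCS = {
    1: ("Mark Zuckerberg sells Facebook", "[Monday]"),
    2: ("Facebook buys WhatsApp", "[Tuesday]"),
    3: ("Mark's Facebook buys Instagram", "[Monday]"),
}


def tokenize(text):
    """Split on whitespace and strip a trailing possessive ('s)."""
    tokens = []
    for word in text.split():
        if word.endswith("'s"):
            word = word[:-2]
        tokens.append(word)
    return tokens


def build_inverted_index(docs):
    """Map each term and each metadata tag to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, (text, tag) in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
        index[tag].add(doc_id)
    return index


index = build_inverted_index(DOCS)
print(sorted(index["Facebook"]))
print(sorted(index["Mark"]))
print(sorted(index["[Monday]"]))
```

Looking up a term is now a single dictionary access instead of a scan over every document, which is the point of indexing first.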

Ok, now that we have data, we also want some numbers behind it!

• In our previous example:

• Facebook is mentioned 3 times

• There are 2 posts on [Monday]

• The most frequent words are Facebook and Mark
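The numbers above can be reproduced with a small counting sketch. This is only an illustration in plain Python, assuming the same naive whitespace tokenization; the names `DOCS` and `terms` are hypothetical and this is not how Elasticsearch computes its analytics internally.

```python
from collections import Counter

# The example documents, with the day kept as a separate metadata field.
DOCS = [
    {"text": "Mark Zuckerberg sells Facebook", "day": "Monday"},
    {"text": "Facebook buys WhatsApp", "day": "Tuesday"},
    {"text": "Mark's Facebook buys Instagram", "day": "Monday"},
]


def terms(text):
    """Whitespace tokens with a trailing possessive ('s) removed."""
    return [w[:-2] if w.endswith("'s") else w for w in text.split()]


# Count how often each term occurs across all documents.
term_counts = Counter(t for doc in DOCS for t in terms(doc["text"]))
# Count how many documents fall on each day.
day_counts = Counter(doc["day"] for doc in DOCS)

print(term_counts["Facebook"])     # mentions of Facebook
print(day_counts["Monday"])        # posts on Monday
print(term_counts.most_common(3))  # most frequent terms
```

With this naive count, "buys" occurs as often as "Mark"; which words count as interesting depends on how the text is analyzed.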

All 3 put together

Elasticsearch = Search (Content & Metadata) + Analytics

(oversimplified)

Let’s look at some search features of Elasticsearch

Features: Complex Queries

• Boolean Operators:

(apple OR pumpkin) AND pie

• Wildcards:

app*: apple, apples, appliance

appl?: apple, apply

• Fuzzy:

back~: back, pack, black, bank

• Ranged:
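The wildcard and fuzzy examples on this slide can be checked with a short Python sketch. It assumes shell-style matching from the standard `fnmatch` module for `*` and `?`, and plain Levenshtein distance with at most one edit for `~`; both are stand-ins chosen so that the slide's examples match, not the engine's actual implementation. `VOCAB`, `wildcard`, `edit_distance` and `fuzzy` are hypothetical names.

```python
from fnmatch import fnmatchcase

# A tiny vocabulary of indexed terms (illustrative only).
VOCAB = ["apple", "apples", "appliance", "apply",
         "back", "pack", "black", "bank", "pie"]


def wildcard(pattern, vocab):
    """Terms matching a pattern where * is any run of characters and ? is exactly one."""
    return [term for term in vocab if fnmatchcase(term, pattern)]


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


def fuzzy(term, vocab, max_edits=1):
    """Terms within max_edits single-character edits of the query term."""
    return [t for t in vocab if edit_distance(term, t) <= max_edits]


print(wildcard("app*", VOCAB))
print(wildcard("appl?", VOCAB))
print(fuzzy("back", VOCAB))
```

Note that `app*` also matches "apply" in this vocabulary, since `*` accepts any suffix.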

Features: Complex Queries

• Attribute filtering:

apple AND pie AND location:california

• Range filtering:

apple AND published:[1393100055 TO 1393427055]
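Expressions like these can be passed to Elasticsearch through its `query_string` query. The sketch below only builds and prints the JSON request bodies; it does not contact a server, and the helper name `query_string_body` is hypothetical. The field names `location` and `published` come from the examples above, and each body would typically be sent to an index's `_search` endpoint.

```python
import json


def query_string_body(expression):
    """Wrap a query expression in a query_string search body."""
    return {"query": {"query_string": {"query": expression}}}


# The boolean, attribute and range expressions from the slides.
EXPRESSIONS = [
    "(apple OR pumpkin) AND pie",
    "apple AND pie AND location:california",
    "apple AND published:[1393100055 TO 1393427055]",
]

bodies = [query_string_body(expression) for expression in EXPRESSIONS]
for body in bodies:
    print(json.dumps(body))
```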

Features: Geo Queries

• Bounding Box Queries

• Distance Range Queries
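To make the two geo query types concrete, here is a plain-Python sketch of the checks they imply: whether a point lies inside a box, and whether its distance from a center falls inside a range. It assumes a spherical Earth (haversine formula) and a box that does not cross the antimeridian; the function names and the sample coordinates are illustrative, not part of the talk.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def in_bounding_box(lat, lon, top, left, bottom, right):
    """True if (lat, lon) lies inside the box; assumes left <= right (no antimeridian crossing)."""
    return bottom <= lat <= top and left <= lon <= right


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_distance_range(center, point, min_km, max_km):
    """True if point is between min_km and max_km away from center."""
    return min_km <= haversine_km(*center, *point) <= max_km


bucharest = (44.4268, 26.1025)  # approximate coordinates
cluj = (46.7712, 23.6236)       # approximate coordinates

print(in_bounding_box(*bucharest, top=45.0, left=25.0, bottom=44.0, right=27.0))
print(round(haversine_km(*bucharest, *cluj)))
print(within_distance_range(bucharest, cluj, 100, 500))
```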

Feature: Built-in analytics

Feature: Built-in tag cloud

What’s special about Elasticsearch?

Distributed

• Distributing data across multiple servers is easy and abstracted away from the developer
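One way this kind of abstraction can work is hash-based routing: the index is split into a fixed number of shards, and a document's shard is derived from a hash of its routing value modulo the number of primary shards. The sketch below illustrates the idea only. MD5 is a stand-in for the engine's own hash function, the round-robin placement of shards on nodes is a simplification, and `shard_for`, `node_for` and the node names are hypothetical.

```python
import hashlib


def shard_for(doc_id, num_shards):
    """Pick a shard from a stable hash of the document id (MD5 is only a stand-in)."""
    digest = hashlib.md5(str(doc_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def node_for(shard, nodes):
    """Place shards on nodes round-robin (a simplification of real shard allocation)."""
    return nodes[shard % len(nodes)]


NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names
NUM_SHARDS = 5

for doc_id in ["tweet-1", "tweet-2", "tweet-3"]:
    shard = shard_for(doc_id, NUM_SHARDS)
    print(doc_id, "-> shard", shard, "on", node_for(shard, NODES))
```

The developer only supplies the document; the same id always lands on the same shard, so reads know where to look without any application-side bookkeeping.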

Performance/Scalability

• Add and remove nodes on the fly without ever stopping the search service

Performance/Scalability

• Indexing and searching can be scaled independently

Performance/Scalability

• With just a few nodes you can run complex queries on billions of documents

• 3 nodes: 20 million documents with 2 replicas each

Easy to back up

• Elasticsearch has a built-in backup solution, so you don’t have to worry about implementing one

Demo time!
