Search and Analytics (using Elasticsearch)

Post on 06-May-2015

DESCRIPTION

Slides from Costin Leau's talk on Search and Analytics (using Elasticsearch) at the 18th Big Data London meetup

TRANSCRIPT

Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited

Search and Analytics

(using Elasticsearch)

Costin Leau

Why search?

Search – what’s the big deal?

Basic/Metadata retrieval

“Find banks with more than (x) accounts”
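As a rough illustration, a metadata question like this one maps onto a range query in the Elasticsearch query DSL. The sketch below only builds the JSON request body. The field name `accounts` and the helper `more_than_accounts` are assumptions made for the example, not something taken from the talk.

```python
import json


def more_than_accounts(x):
    """Build a search body matching documents whose (hypothetical)
    numeric field `accounts` is strictly greater than x."""
    return {"query": {"range": {"accounts": {"gt": x}}}}


# Example: banks with more than 1000 accounts.
body = more_than_accounts(1000)
print(json.dumps(body, indent=2))
```

The body would typically be sent to the `_search` endpoint of whichever index holds the bank documents.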

“Find banks near my location”
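A location question like this can be sketched as a `geo_distance` filter. The example assumes a hypothetical field `location` mapped as a `geo_point`, and it uses the `bool`/`filter` wrapper accepted by recent Elasticsearch releases; releases from the time of the talk wrapped filters differently. Only the request body is built here, nothing is sent.

```python
import json


def banks_near(lat, lon, distance="2km"):
    """Build a search body that keeps only documents whose (hypothetical)
    geo_point field `location` lies within `distance` of (lat, lon)."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


# Example coordinates (central London), chosen only for illustration.
body = banks_near(51.5074, -0.1278)
print(json.dumps(body, indent=2))
```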

Search – What we’re all about

Search categories

Basic/Metadata retrieval

Full-text search

Highlighting

Geolocation

Fuzzy search (“did-you-mean”)

Natural Language
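Several of these categories can be combined in a single request. The sketch below, a minimal example rather than anything shown in the talk, builds a full-text `match` query that tolerates small typos (`fuzziness`) and asks for highlighted fragments. The field name `description` and the helper `text_search` are hypothetical, and the `"AUTO"` fuzziness setting is the one accepted by recent Elasticsearch releases.

```python
import json


def text_search(field, text):
    """Build a search body that runs a typo-tolerant full-text match on
    `field` and requests highlighted fragments from the same field."""
    return {
        "query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}},
        "highlight": {"fields": {field: {}}},
    }


# The misspelling is deliberate: fuzzy matching is what lets it still match.
body = text_search("description", "savngs account")
print(json.dumps(body, indent=2))
```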

data stores

search engines

‘Players’ in the search market

Search engines

- Google/Bing/Yahoo!/Ask.com/Yandex/Baidu

Open-Source

- Sphinx

- Apache Lucene

- Elasticsearch

- Solr

- Sensei

Enterprise Search

- Oracle Endeca / MDEX

- HP Autonomy

- Exalead

- IBM Enterprise Search

Elasticsearch

Open-Source Search & Analytics engine

- Structured & Unstructured Data

- Real Time

- Analytics capabilities (facets)

- REST based

Distributed

- Designed for the Cloud

- Designed for Big Data

Lightweight

Popular: >200K downloads/month
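The "REST based" and "Analytics capabilities (facets)" bullets above can be sketched together: a search is an HTTP request with a JSON body, and counts per field value come back alongside it. Later Elasticsearch releases replaced facets with aggregations, which is what this example uses. The index name `banks`, the field `city`, and the helper `count_by_field` are assumptions, and the request is prepared but never sent, so no running cluster is needed.

```python
import json
import urllib.request


def count_by_field(index, field, host="http://localhost:9200"):
    """Prepare (but do not send) a search request that counts documents
    per distinct value of `field` using a terms aggregation."""
    body = {
        "size": 0,  # return only the counts, no matching documents
        "aggs": {"by_" + field: {"terms": {"field": field}}},
    }
    return urllib.request.Request(
        url=f"{host}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = count_by_field("banks", "city")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would execute the search against a node listening on the given host.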

Users

Platform Adoption

http://www.thoughtworks.com/radar#platforms 2013

Use Case - Geolocation

Searches 50,000,000 venues every day using Elasticsearch

Use Case – Support/Reporting

Use Case - Centralized Logging
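For centralized logging, events from many hosts are usually shipped in batches. As a hedged sketch of what such a batch might look like, the helper below serializes log events into the newline-delimited JSON format expected by the `_bulk` endpoint. The index name, the event fields, and the sample events are all invented for illustration, and older Elasticsearch releases also expected a document type in the action line.

```python
import json


def to_bulk(index, events):
    """Serialize log events as newline-delimited JSON for the _bulk endpoint:
    one action line followed by one document line per event."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


# Invented sample events, not data from the talk.
events = [
    {"@timestamp": "2013-01-01T10:15:00Z", "host": "web-1", "level": "ERROR", "message": "timeout"},
    {"@timestamp": "2013-01-01T10:15:02Z", "host": "web-2", "level": "INFO", "message": "started"},
]
payload = to_bulk("logs-2013.01.01", events)
print(payload, end="")
```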

Use Case - Pure Analytics

Search and Big Data

A Holistic View of a Big Data System

- ETL

- Real Time Streams

- Unstructured Data (HDFS)

- RT Semi-structured Database (HBase, Cassandra, Mongo)

- Big SQL (Greenplum, AsterData, etc.)

- Batch Processing

- Real-Time Processing (S4, Storm)

- Analytics

Hadoop eco-system

Hadoop Distributed File System (HDFS)

Map Reduce Framework (MapRed)

Elasticsearch + Hadoop

[Figure: two bar charts, "Writing" and "Reading / Querying", each comparing M/R, Pig and Hive in two series, "Raw" and "w/ ES", on a vertical axis from 0 to 60.]

Explore data through (Elastic)Search

Thank you! @costinl

http://www.elasticsearch.org/
