Secondary Indexing in Phoenix

Jesse Yates, HBase Committer, Software Engineer

James Taylor, Phoenix Lead, Software Engineer

SF HBase User Group – September 26, 2013

Agenda

• About

• Indexes In Phoenix

• Immutable Indexes

• Mutable Indexes

• Demo!

• Roadmap

SF HUG – Sept 2013

Phoenix

• Open Source
  – https://github.com/forcedotcom/phoenix
• “SQL-skin” on HBase
  – Everyone knows SQL!
• JDBC Driver
  – Plug-and-play
• Faster than HBase
  – in some cases

Secondary Indexes

• Sort on ‘orthogonal’ axis

• Save full-table scan

• Expected database feature

• Hard in HBase because of ACID considerations

Indexes In Phoenix

• Creating an index
  – DDL statement
  – Creates another HBase table behind the scenes
• Deciding when an index is used
  – Transparent to the user
  – (but the user can override through a hint)
  – No stats yet
• Knowing which table was used
  – EXPLAIN <query>

Creating Indexes In Phoenix

• CREATE INDEX <index_name>
      ON <table_name> (<columns_to_index>…)
      INCLUDE (<columns_to_cover>…);

• Optionally add IMMUTABLE_ROWS=true property to CREATE TABLE statement

Creating Indexes In Phoenix

    CREATE TABLE baby_names (
        name VARCHAR PRIMARY KEY,
        occurrences BIGINT);

    CREATE INDEX baby_names_idx
        ON baby_names (occurrences DESC, name);

Deciding When To Use

• Transparent to the user
• Query optimizer does the following:
  – Compiles query against data and index tables
  – Chooses “best” one (not yet stats driven)
    • Can index even be used?
      – Active, using columns contained in index (no join back to data table)
    • Can ORDER BY be removed?
    • Which plan forms the longest start/stop scan key?
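The selection rules above can be sketched roughly as follows. This is only an illustration, not Phoenix's optimizer code: the `Plan` class, its fields, and the `choose` method are hypothetical names, and the tie-break order (ORDER BY removal first, then the longer scan key) is an assumption, since the slide lists the questions without ranking them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PlanChooserSketch {
    /** Hypothetical summary of one compiled plan (data table or an index table). */
    static final class Plan {
        final String table;
        final boolean active;         // index is built and enabled (always true for the data table)
        final Set<String> columns;    // columns readable without a join back to the data table
        final boolean orderByRemoved; // the table's row order already satisfies ORDER BY
        final int scanKeyColumns;     // leading row-key columns bounded by the WHERE clause

        Plan(String table, boolean active, Set<String> columns,
             boolean orderByRemoved, int scanKeyColumns) {
            this.table = table;
            this.active = active;
            this.columns = columns;
            this.orderByRemoved = orderByRemoved;
            this.scanKeyColumns = scanKeyColumns;
        }
    }

    /** Returns the preferred plan, or null if no candidate can answer the query. */
    static Plan choose(List<Plan> candidates, Set<String> referencedColumns) {
        Plan best = null;
        for (Plan p : candidates) {
            // "Can the index even be used?"
            if (!p.active || !p.columns.containsAll(referencedColumns)) {
                continue;
            }
            if (best == null || isBetter(p, best)) {
                best = p;
            }
        }
        return best;
    }

    // Assumed ranking: ORDER BY removal first, then the longer scan key.
    // Ties keep the earlier candidate, so list the data table first.
    static boolean isBetter(Plan a, Plan b) {
        if (a.orderByRemoved != b.orderByRemoved) {
            return a.orderByRemoved;
        }
        return a.scanKeyColumns > b.scanKeyColumns;
    }

    static Set<String> cols(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        Plan data = new Plan("BABY_NAMES", true, cols("NAME", "OCCURRENCES"), false, 0);
        Plan index = new Plan("BABY_NAMES_IDX", true, cols("NAME", "OCCURRENCES"), true, 0);
        // ORDER BY occurrences DESC LIMIT 10: the index is already sorted that way.
        Plan chosen = choose(Arrays.asList(data, index), cols("NAME", "OCCURRENCES"));
        System.out.println(chosen.table);
    }
}
```

An index that is inactive, or that lacks a referenced column, is skipped in the first check, which leaves the data table plan as the fallback.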

Deciding When To Use

    SELECT name, occurrences FROM baby_names ORDER BY occurrences DESC LIMIT 10;

is effectively run as:

    SELECT name, occurrences FROM baby_names_idx LIMIT 10;

ORDER BY not necessary since rows in index table are already ordered this way

Deciding When To Use

    SELECT name, occurrences FROM baby_names WHERE occurrences > 100;

is effectively run as:

    SELECT name, occurrences FROM baby_names_idx WHERE occurrences > 100;

Uses index, since we can form start row for scan based on filter of occurrences

Deciding When To Use

    SELECT /*+ NO_INDEX */ name FROM baby_names WHERE occurrences > 100;

Override optimizer by telling it not to use any indexes

    SELECT /*+ INDEX (baby_names baby_names_idx other_baby_names_idx) */ name, occurrences
    FROM baby_names WHERE occurrences > 100;

Tell optimizer priority in which it should consider using indexes

Knowing which table was used

EXPLAIN SELECT name, occurrences FROM baby_names ORDER BY occurrences DESC LIMIT 10;

    CLIENT PARALLEL 1-WAY FULL SCAN OVER BABY_NAMES_IDX
        SERVER FILTER BY PageFilter 10
    CLIENT 10 ROW LIMIT

Immutable Indexes

• Immutable Rows

• Much easier to implement

• Client-managed

• Bulk-loadable

Mutable Indexes

• Global Index

• Change row state
  – Common use-case
  – “expected” implementation
• Covered Columns/Join Index

1.5 years*

Internals

• Index Management
  – Build index updates
  – Ensures index is ‘cleaned up’
• Recovery Mechanism
  – Ensures index updates are “ACID”

“There is no magic”

- Every programming hipster (chipster)

Mutable Indexing: Standard Write Path

[Diagram: the write path through Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]

Mutable Indexing

[Diagram: the write path with indexing added. Boxes shown: RegionCoprocessorHost, WAL, RegionCoprocessorHost, Indexer, Builder, Codec, WAL Updater (labelled “Durable!”), and three Index Tables]

Index Management

• Lives within a RegionCoprocessorObserver
• Access to the local HRegion
• Specifies the mutations to apply to the index tables

    public interface IndexBuilder {
      public void setup(RegionCoprocessorEnvironment env);
      public Map<Mutation, String> getIndexUpdate(Put put);
      public Map<Mutation, String> getIndexUpdate(Delete delete);
    }

Why not write my own?

• Managing Cleanup
  – Efficient point-in-time correctness
  – Performance tricks
• Abstract access to HRegion
  – Minimal network hops
• Sorting correctness
  – Phoenix typing ensures correct index sorting

Example: Managing Cleanup

• Updates can arrive out of order
  – Client-managed timestamps

    ROW    FAMILY  QUALIFIER  TS  VALUE
    Row1   Fam     Qual       10  val1
    Row1   Fam2    Qual2      12  val2
    Row1   Fam     Qual       13  val3

Example: Managing Cleanup

Index Table

    ROW              FAMILY  QUALIFIER             TS
    Val1|Row1        Index   Fam:Qual              10
    Val1|Val2|Row1   Index   Fam:Qual, Fam2:Qual2  12
    Val3|Val2|Row1   Index   Fam:Qual, Fam2:Qual2  13

Example: Managing Cleanup

    ROW    FAMILY  QUALIFIER  TS  VALUE
    Row1   Fam     Qual       10  val1
    Row1   Fam2    Qual2      12  val2
    Row1   Fam     Qual       13  val3
    Row1   Fam     Qual       11  val4

Example: Managing Cleanup

    ROW    FAMILY  QUALIFIER  TS  VALUE
    Row1   Fam     Qual       10  val1
    Row1   Fam     Qual       11  val4
    Row1   Fam2    Qual2      12  val2
    Row1   Fam     Qual       13  val3

Example: Managing Cleanup

    ROW              FAMILY  QUALIFIER             TS
    Val1|Row1        Index   Fam:Qual              10
    Val4|Row1        Index   Fam:Qual              11
    Val4|Val2|Row1   Index   Fam:Qual, Fam2:Qual2  12
    Val1|Val2|Row1   Index   Fam:Qual, Fam2:Qual2  12
    Val3|Val2|Row1   Index   Fam:Qual, Fam2:Qual2  13

Managing Cleanup

• History “roll up”
• Out-of-order Updates
• Point-in-time correctness
• Multiple Timestamps per Mutation
• Delete vs. DeleteColumn vs. DeleteFamily

Surprisingly hard!
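The out-of-order example from the preceding tables can be replayed with a small in-memory model. This is a sketch under simplifying assumptions, not the Phoenix implementation: `IndexCleanupSketch` and its methods are hypothetical, the index key is just the latest value of each indexed column joined with `|` and the row key, and deletes are not modelled. It shows why a late write at timestamp 11 makes the index entry previously written for timestamp 12 stale.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class IndexCleanupSketch {
    // column -> (timestamp -> value): the versioned cells of one data row
    private final Map<String, TreeMap<Long, String>> cells = new LinkedHashMap<>();
    private final String rowKey;

    IndexCleanupSketch(String rowKey, List<String> indexedColumns) {
        this.rowKey = rowKey;
        for (String col : indexedColumns) {
            cells.put(col, new TreeMap<Long, String>());
        }
    }

    /** Records a write to an indexed column at a client-supplied timestamp. */
    void put(String column, long ts, String value) {
        cells.get(column).put(ts, value);
    }

    /** Index key the row should have at time ts, or null if nothing is set yet. */
    String indexKeyAt(long ts) {
        StringBuilder sb = new StringBuilder();
        for (TreeMap<Long, String> versions : cells.values()) {
            Map.Entry<Long, String> latest = versions.floorEntry(ts);
            if (latest != null) {
                sb.append(latest.getValue()).append('|');
            }
        }
        return sb.length() == 0 ? null : sb.append(rowKey).toString();
    }

    /** Timestamp -> expected index key, for every timestamp that has a write. */
    SortedMap<Long, String> expectedIndex() {
        SortedMap<Long, String> out = new TreeMap<>();
        for (TreeMap<Long, String> versions : cells.values()) {
            for (Long ts : versions.keySet()) {
                out.put(ts, indexKeyAt(ts));
            }
        }
        return out;
    }

    /** Entries of 'before' that no longer match 'after' and need an index delete. */
    static List<String> staleEntries(SortedMap<Long, String> before,
                                     SortedMap<Long, String> after) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<Long, String> e : before.entrySet()) {
            if (!e.getValue().equals(after.get(e.getKey()))) {
                stale.add(e.getKey() + ":" + e.getValue());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        IndexCleanupSketch row =
            new IndexCleanupSketch("Row1", Arrays.asList("Fam:Qual", "Fam2:Qual2"));
        row.put("Fam:Qual", 10, "val1");
        row.put("Fam2:Qual2", 12, "val2");
        row.put("Fam:Qual", 13, "val3");
        SortedMap<Long, String> before = row.expectedIndex();

        row.put("Fam:Qual", 11, "val4"); // the late, out-of-order update
        SortedMap<Long, String> after = row.expectedIndex();

        System.out.println("index entries now expected: " + after);
        System.out.println("stale entries to delete:    " + staleEntries(before, after));
    }
}
```

Recomputing the expected entries from the full cell history is the simplest way to get point-in-time correctness in a toy model; a real implementation would need to touch only the timestamps affected by the new write.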

Phoenix Index Builder

• Much simpler than full index management
• Hides cleanup considerations
• Abstracted access to local state

    public interface IndexCodec {
      public void initialize(RegionCoprocessorEnvironment env);
      public Iterable<IndexUpdate> getIndexDeletes(TableState state);
      public Iterable<IndexUpdate> getIndexUpserts(TableState state);
    }

Phoenix Index Codec

Dude, where’s my data?

Ensuring Correctness

HBase ACID

• Does NOT give you:
  – Cross-row consistency
  – Cross-table consistency
• Does give you:
  – Durable data on success
  – Visibility on success without partial rows

Key Observation

“Secondary indexing is inherently an easier problem than full transactions… secondary index updates are idempotent.”

- Lars Hofhansl

Idempotent Index Updates

• Doesn’t need full transactions

• Replay as many times as needed

• Can tolerate a little lag
  – As long as we get the order right
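A toy model makes the idempotence argument concrete. This is a sketch with hypothetical names (`IndexUpdate`, `apply`), not Phoenix code, and it assumes each index update writes a fixed entry at an explicit (index row, timestamp) coordinate, so that applying it again overwrites the entry with the same content.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class IdempotentReplaySketch {
    /** Hypothetical index update: write or delete one index row at an explicit timestamp. */
    static final class IndexUpdate {
        final String indexRow;
        final long ts;
        final boolean delete;

        IndexUpdate(String indexRow, long ts, boolean delete) {
            this.indexRow = indexRow;
            this.ts = ts;
            this.delete = delete;
        }
    }

    /**
     * Applies updates to a toy index table keyed by "row@ts".
     * The stored value is true for a live entry and false for a delete marker.
     */
    static void apply(Map<String, Boolean> table, List<IndexUpdate> updates) {
        for (IndexUpdate u : updates) {
            table.put(u.indexRow + "@" + u.ts, !u.delete);
        }
    }

    public static void main(String[] args) {
        List<IndexUpdate> batch = Arrays.asList(
            new IndexUpdate("val1|val2|Row1", 12, true),   // remove the stale entry
            new IndexUpdate("val4|val2|Row1", 12, false)); // write the corrected entry

        Map<String, Boolean> once = new TreeMap<>();
        apply(once, batch);

        Map<String, Boolean> replayed = new TreeMap<>();
        apply(replayed, batch);
        apply(replayed, batch); // e.g. a client retry or a WAL replay

        System.out.println("same state after replay: " + once.equals(replayed));
    }
}
```

Because the timestamp is part of the coordinate, a replayed batch lands on exactly the same entries, which is why retries and WAL replays need no transactional bookkeeping in this model.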

Failure Recovery

• Custom WALEditCodec
  – Encodes index updates
  – Supports compressed WAL
• Custom WAL Reader
  – Replay index updates from WAL

<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
  <name>hbase.regionserver.hlog.reader.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedHLogReader</value>
</property>

Failure Situations

• Any time before WAL, client replay

• Any time after WAL, HBase replay

• All-or-nothing

Failure #1: Before WAL

[Diagram: the write path through Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore, with the failure before the WAL]

No problem! No data is stored in the WAL, client just retries entire update.

Failure #2: After WAL

[Diagram: the write path through Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore, with the failure after the WAL]

WAL replayed via usual replay mechanisms

“Magic”

• Server short-circuit
• Lazy load columns
• Skip-scan for cache
• Parallel Writing
• Custom MemStore in Indexer
• Caching HTables
• Pluggable Index Writing/Failure Policy
• Minimize byte[] copy (ImmutableBytesPtr)

Demo

Roadmap

• Next release of Phoenix

• Performance improvements

• Functional Indexes

• Other indexing approaches (Huawei, SEP)

Open Source!

• Main: https://github.com/forcedotcom/phoenix

• Indexing: https://github.com/forcedotcom/phoenix/tree/mutable-si

(obligatory hiring slide)

We’re Hiring!

Questions? Comments?

jtaylor@salesforce.com, @jamesplusplus

jyates@salesforce.com, @jesse_yates

Appendix

• AsyncHBaseWriter
  – github.com/jyates/phoenix/tree/async-hbase
  – 2x+ slower*

* Written in 2hrs, not 100% correct either
