edb 100k-trans-qcon-rev1

55
Achieving 100,000 Transactions Per Second with a NoSQL Database Eric David Bloch @eedeebee 19 jun 2012

Upload: eric-bloch

Post on 14-Dec-2014

969 views

Category:

Technology


2 download

DESCRIPTION

Achieving 100 transactions a second with MarkLogic

TRANSCRIPT

Page 1: Edb 100k-trans-qcon-rev1

Achieving 100,000 Transactions Per Second with a NoSQL Database

Eric David Bloch@eedeebee

19 jun 2012

Page 2: Edb 100k-trans-qcon-rev1

• I’ve written software used by millions of people.

Apps, libraries, compilers, device drivers, operating

systems

• This is my second QCon and my first talk

• I’m the Community Director at MarkLogic, last 2 years.

• Born here in NY in 1965; now in CA

• I survived having 3 kids in less than 2 years.

A bit about me

Page 3: Edb 100k-trans-qcon-rev1

Me, 1967

Page 4: Edb 100k-trans-qcon-rev1

Me, today

Page 5: Edb 100k-trans-qcon-rev1

A: Why?

B: How?

Whirl-wind database architecture tour

Melody from part A again

Techniques to get to 100K

Musical Form for the Talk

Page 6: Edb 100k-trans-qcon-rev1
Page 7: Edb 100k-trans-qcon-rev1

It’s about money.

Top 5 bank needed to manage trades Trades look more like documents than tables Schemas for trades change all the time Transactions Scale and velocity (“Big Data”)

Page 8: Edb 100k-trans-qcon-rev1

1 million trades per day

Followed by 1 million position reports at end of day Roll up trades of current date for each “book, instrument” pair

Group-by, with key = “date, book, instrument”

Day1

Day2

1M trades1M positions

Day3

Day4

Day5

. . .

Trades and Positions

Page 9: Edb 100k-trans-qcon-rev1

Trades and Positions

<trade> <quantity>8540882</quantity> <quantity2>1193.71</quantity2> <instrument>WASAX</instrument> <book>679</book> <trade-date>2011-03-13-07:00</trade-date> <settle-date>2011-03-17-07:00</settle-date></trade>

<position> <instrument>EAAFX</instrument> <book>679</book> <quantity>3</quantity> <business-date>2011-03-25Z</business-date> <position-date>2011-03-24Z</position-date></position>

Page 10: Edb 100k-trans-qcon-rev1

Now show us

• 15K inserts per second• Linear scalability

Page 11: Edb 100k-trans-qcon-rev1

Requirements

NoSQL flexibility, performance & scale

Enterprise-grade

transactional guarantees

Page 12: Edb 100k-trans-qcon-rev1

What NoSQL, OldSQL, or NewSQLdatabase out there can we use?

?

Page 13: Edb 100k-trans-qcon-rev1
Page 14: Edb 100k-trans-qcon-rev1
Page 15: Edb 100k-trans-qcon-rev1
Page 16: Edb 100k-trans-qcon-rev1

Your database has a large effect on how you see the

world

Page 17: Edb 100k-trans-qcon-rev1
Page 18: Edb 100k-trans-qcon-rev1

Slide 18 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 18 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

What is MarkLogic?

Non-relational, document-oriented, distributed database Shared nothing clustering, linear scale out Multi-Version Concurrency Control (MVCC) Transactions (ACID)

Search Engine Web scale (Big Data) Inverted indexes (term lists) Real-time updates Compose-able queries

Page 19: Edb 100k-trans-qcon-rev1

Slide 19 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 19 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Architecture

Data model

Indexing

Clustering

Query execution

Transactions

Question

What does data model have to do with scalability?

Page 20: Edb 100k-trans-qcon-rev1

Key (URI) Value (Document)

/trade/153748994 <trade> <id>8</id> <time>2012-02-20T14:00:00</time> <instrument>BYME AAA</instrument> <price cur=“usd”>600.27</price></trade>

/user/eedeebee { “name” : “Eric Bloch”, “age” : 47, “hair” : “gray”, “kids” : [ “Grace”, “Ryan”, “Owen” ]}

/book5293 It was the best of times, it was the worst of times, it was the age of wisdom, ...

/2012-02-20T14:47:53/01445 .mp3.avi[your favorite binary format]

Page 21: Edb 100k-trans-qcon-rev1

Slide 21 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 21 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Inverted Index

“which”

“uniquely”

“identify”

“each”

“uniquely identify”<article>

article/abstract/@author<product>IMS</product>

123, 127, 129, 152, 344, 791 . . .

122, 125, 126, 129, 130, 167 . . .

123, 126, 130, 142, 143, 167 . . .

123, 130, 131, 135, 162, 177 . . .

126, 130, 167, 212, 219, 377 . . .

. . .

. . .

Document References

126, 130, 167, 212, 219, 377 . . .

Page 22: Edb 100k-trans-qcon-rev1

Slide 22 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 22 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Range Index

Value

Docid

0 287

8 1129

13 531

… …

… …

Docid

Value

287 0

531 13

1129 8

… …

… …

<trade> <trader_id>8</trader_id> <time>2012-02-20T14:00:00</time> <instrument>IBM</instrument> …</trade><trade> <trader_id>13</trader_id> <time>2012-02-20T14:30:00</time> <instrument>AAPL</instrument> …</trade><trade> <trader_id>0</trader_id> <time>2012-02-20T15:30:00</time> <instrument>GOOG</instrument> …</trade>

• Column Oriented• Memory Mapped

Rows

Page 23: Edb 100k-trans-qcon-rev1

Slide 23 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 23 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Shared-Nothing Clustering

Host D1 Host D2 Host D3 Host Dj…

Host E1

Page 24: Edb 100k-trans-qcon-rev1

Slide 24 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 24 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Query Evaluation – “Map”

Host E1 Host E2 Host E3 Host Ei

Host D1 Host D2 Host D3 Host Dj…

F1 Fn… F1 Fn… F1 Fn… F1 Fn…

Q

Q Q Q Q

Page 25: Edb 100k-trans-qcon-rev1

Slide 25 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 25 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Query Evaluation – “Reduce”

Host E1 Host E2 Host E3 Host Ei

Host D1 Host D2 Host D3 Host Dj…

F1 Fn… F1 Fn… F1 Fn… F1 Fn…

Top 10

Top 10

Top 10

Top 10

Top 10

Page 26: Edb 100k-trans-qcon-rev1

Slide 26 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 26 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Queries/Updates with MVCC

Every query has a timestamp Documents do not change

Reads are lock-free Inserts – see next slide Deletes – mark as deleted Edits –

copy edit insert the copy mark the original as deleted

Page 27: Edb 100k-trans-qcon-rev1

Slide 27 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 27 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Insert Mechanics

1) New URI+Document arrive at E-node

2) URI Probe – determine whether URI exists in any forest

3) URI Lock – write locks taken on D node(s)

4) Forest Assignment – URI is deterministically placed in Forest

5) Indexing

6) Journaling

7) Commit – transaction complete

8) Release URI Locks – D node(s) are notified to release lock

Page 28: Edb 100k-trans-qcon-rev1

Slide 28 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 28 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Save-and-merge (Log Structured Tree Merge)

Database

Forest 1 Forest 2

S1 S2

S3

S1 S2

S3

S0

Insert/Update

Save

Merge

Disk

Memory

J

Journaled

Page 29: Edb 100k-trans-qcon-rev1

Back to the money

Page 30: Edb 100k-trans-qcon-rev1

Slide 30 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 30 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Trades and Positions

1 million trades per day followed by 1 million position reports at end of day

Roll up trades of the current “date” for each “book:instrument” pair

Group-by, with key = “book:date:instrument”

Day1

Day2

1M trades1M positions

Day3

Day4

Day5

. . .

Page 31: Edb 100k-trans-qcon-rev1
Page 32: Edb 100k-trans-qcon-rev1

Slide 32 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 32 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Naive Query Pseudocode

for each book

for each instrument in that book

position = position(yesterday, book, instrument)

for each trade of that instrument in this book

position += trade(today, book, instrument).quantity

insert(today, book, instrument, position)

Page 33: Edb 100k-trans-qcon-rev1

Slide 33 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 33 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Initial results

Single node – 19,000 inserts per second

Page 34: Edb 100k-trans-qcon-rev1

Slide 34 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 34 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Initial results with cluster

1DE 2DE 3DE0

10000

20000

30000

40000

50000

60000

70000

Report Query 2

# of nodes

doc/s

ec

Page 35: Edb 100k-trans-qcon-rev1

Slide 35 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 35 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Techniques to get to 100K

Insert Query for Computing New Positions Materialized compound key, Co-Occurrence Query and

Aggregation

Insert of New Positions Batching Optimized insert mechanics

Page 36: Edb 100k-trans-qcon-rev1

Slide 36 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 36 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Materializing a compound key

<trade> <quantity2>1193.71</quantity2> <instrument>WASAX</instrument> <book>679</book> <trade-date>2011-03-13-07:00</trade-date> <settle-date>2011-03-17-07:00</settle-date></trade>

Page 37: Edb 100k-trans-qcon-rev1

Slide 37 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 37 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Materializing a compound key

<trade> <roll-up book-date-instrument=“151333445566782303”/> <quantity2>1193.71</quantity2> <instrument>WASAX</instrument> <book>679</book> <trade-date>2011-03-13-07:00</trade-date> <settle-date>2011-03-17-07:00</settle-date></trade>

Page 38: Edb 100k-trans-qcon-rev1

Slide 38 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 38 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Co-Occurrence and Distributed Aggregation

V D

… …

8540882

1129

… …

… …

… …

D V

… …

… …

… 8540882

… …

… …

• Co-occurrences: Find pairings of range indexed values

• Aggregate on the D nodes (Map/Reduce):

Sum up the quantities above

• Similar to a Group-by, • in a column-oriented, in-memory

database

V D

… …

… 1129

… …

… …

… …

D V

… …

… …

15137…

… …

… …

<trade> <roll-up book-date-instrument=“151373445566703”/> <quantity>8540882</quantity> <instrument>WASAX</instrument> <book>679</book> <trade-date>2011-03-13-07:00</trade-date> <settle-date>2011-03-17-07:00</settle-date></trade>

Page 39: Edb 100k-trans-qcon-rev1

Slide 39 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 39 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Initial results with new query

> 30K inserts/second on a single node

Page 40: Edb 100k-trans-qcon-rev1

Slide 40 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 40 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Co-Occurrence + Aggregate versus Naïve approach

1DE 2DE 3DE0

10000

20000

30000

40000

50000

60000

70000

Report Query 3Report Query 2

# of nodes

doc/s

ec

Page 41: Edb 100k-trans-qcon-rev1

Slide 41 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 41 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Techniques

Computing Positions Materialized compound key, Co-Occurrence Query and

Aggregation

Updates Batching Optimized insert mechanics

Page 42: Edb 100k-trans-qcon-rev1

Slide 42 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 42 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Transaction Size and Throughput

1 2 4 8 16 32 100 500 1000 100000

5000

10000

15000

20000

25000

30000

35000

insert throughput

# doc inserts per transaction

#docs p

er

sec.

Page 43: Edb 100k-trans-qcon-rev1

Slide 43 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 43 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Techniques

Computing Positions Materialized compound key, Co-Occurrence Query and

Aggregation

Updates (Transaction) Batching Optimized insert mechanics

Page 44: Edb 100k-trans-qcon-rev1

Slide 44 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 44 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Insert Mechanics, Again

1) New URI+Document arrive at E-node

2) *URI Probe – determine whether URI exists in any forest

3) *URI Lock – write locks are created

4) Forest Assignment – URI is deterministically placed in Forest

5) Indexing

6) Journaling

7) Commit – transaction complete

8) *Release URI Locks – D nodes are notified to release lock

* Overhead of these operations increases with cluster size

x

Page 45: Edb 100k-trans-qcon-rev1

Slide 45 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 45 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Deterministic Placement

4) Forest Assignment – URI is deterministically placed in Forest

Hash function

URI64-bit number

Bucketed into a Forest

Fi

• Done in C++ within server• But…• Can also be done in the client• Server allows queries to be evaluated against

only one forest…

Page 46: Edb 100k-trans-qcon-rev1

Slide 46 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 46 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Optimized Insert

D1 D2

F3 F4F1 F2

Q

E

Q

Page 47: Edb 100k-trans-qcon-rev1

Slide 47 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 47 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Optimized Insert Mechanics

1) New URI+Document arrive at E-nodea) Compute Fi using

b) Ask server to evaluated the insert query only against Fi

2) URI Probe – Fi Only

3) URI Lock – Fi Only

4) Forest Assignment – Fi Only

5) Indexing

6) Journaling

7) Commit – transaction complete

8) Lock Release - Fi Only

Page 48: Edb 100k-trans-qcon-rev1

Slide 48 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 48 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Regular Insert Vs. Optimized Insert

1DE 2DE 3DE0

10000

20000

30000

40000

50000

60000

70000

regular insertin-forest-eval

# of nodes

docs/s

econd

In-Forest Eval

Page 49: Edb 100k-trans-qcon-rev1

Slide 49 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 49 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Tada! Achieving 100K Updates

In-Forest Eval

Page 50: Edb 100k-trans-qcon-rev1

Slide 50 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 50 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

We weren’t content with 15K

So we showed them…

100,000 INSERTS PER SECOND

A quote from the bank

“We threw everything we had at MarkLogic and it didn’t break a

sweat”

Page 51: Edb 100k-trans-qcon-rev1

Enterprise-grade NoSQL document-oriented database

with real-time full-text search and transactions

Doing 100K transactions per second

Page 52: Edb 100k-trans-qcon-rev1
Page 53: Edb 100k-trans-qcon-rev1
Page 54: Edb 100k-trans-qcon-rev1
Page 55: Edb 100k-trans-qcon-rev1

Slide 55 Copyright © 2012 MarkLogic® Corporation. All rights reserved.Slide 55 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

Thank You!

@eedeebeehttp://[email protected]