transactional streaming: if you can compute it, you can probably stream it


Page 1: Transactional Streaming: If you can compute it, you can probably stream it

Transactional Streaming

If you can compute it, you can probably stream it.

John Hugg March 30th, 2016

@johnhugg / [email protected]

Page 2: Transactional Streaming: If you can compute it, you can probably stream it

Who Am I?

• First developer on the VoltDB project.

• Previously at Vertica and other data startups.

• Have made so many bad decisions over the years, that now I almost know what I'm talking about.

[email protected] • @johnhugg • http://chat.voltdb.com

Page 4: Transactional Streaming: If you can compute it, you can probably stream it

Operations at Scale

• Ingest data from several sources into a horizontally scalable system.

• Process data on arrival (i.e., transform, correlate, filter, and aggregate data).

• Understand, act, and record.

• Push relevant data to a downstream, big data system.

Page 5: Transactional Streaming: If you can compute it, you can probably stream it

Data Movement

Processing Logic

State Management

Page 8: Transactional Streaming: If you can compute it, you can probably stream it

Right Now

Page 11: Transactional Streaming: If you can compute it, you can probably stream it

One Size Fits All

• Analytics and operational stateful stores require different storage engines to be optimal.
  Columns vs. Rows
  Vertica vs. VoltDB

• Machine Learning
  Multi-Dim Math
  Search

• Microservices?

• Data Value?

Page 12: Transactional Streaming: If you can compute it, you can probably stream it

Where integration makes sense:

Leading Edge Operations

Specifically: Operational Stream Processing and Operational State

Page 13: Transactional Streaming: If you can compute it, you can probably stream it

What’s the Difference?

• Non-integrated systems mean you write glue code, or you use someone’s glue code.

• Operational glue code is different from batch-oriented glue code.

• Batch or OLAP has huge safety nets for glue code:

• HDFS, CSV, immutable data sets

• “Blow it away and reload”

• Much less time pressure

Page 14: Transactional Streaming: If you can compute it, you can probably stream it

[Diagram: components joined by glue code. Three components are labeled "Tested Well, 1000s of users" and one is labeled "Community Supplied, Many Users". The glue between them is labeled "You wrote this. 1 User."]

Page 15: Transactional Streaming: If you can compute it, you can probably stream it

But I’m not writing “glue code”

“I’m just using the well-tested Cassandra driver in my Storm code.”

• You’re using a computer network. They are not always reliable.

• Storm might fail in the middle of processing.

• Cassandra might fail in the middle of processing.

• Both systems are tested for this, but not together, using your glue code.

Page 16: Transactional Streaming: If you can compute it, you can probably stream it

Operational Glue Code is Hard

Main Point:

Minimize it

Page 17: Transactional Streaming: If you can compute it, you can probably stream it

Transactional Stream Processing

Page 18: Transactional Streaming: If you can compute it, you can probably stream it

Use the same system for state and processing.

Ensures they are tested together.

No independent failures.

Page 19: Transactional Streaming: If you can compute it, you can probably stream it

1 Transaction = 1 Event

ACID

• Atomic: Either 100% done or 0% done. No in-between.

• (Consistent)

• Isolated: Two concurrent operations can’t interfere with each other

• Durable: If it says it’s done, then it is done.

Page 20: Transactional Streaming: If you can compute it, you can probably stream it

Processing Code for a Single Event

Database / State

Page 21: Transactional Streaming: If you can compute it, you can probably stream it

Processing Code for a Single Event

Database / State

x x x x

Not Atomic

Page 22: Transactional Streaming: If you can compute it, you can probably stream it

Romeo And Juliet Explain “Atomicity”

Operation 1: Fake your death

Operation 2: Tell Romeo

Page 23: Transactional Streaming: If you can compute it, you can probably stream it

Processing Code for a Single Event

Database / State

Processing Code for a Single Event

Not Isolated

Page 24: Transactional Streaming: If you can compute it, you can probably stream it

“A good example is the best sermon.”

- Benjamin Franklin

Page 25: Transactional Streaming: If you can compute it, you can probably stream it

Call Center Management

http://www.publicdomainpictures.net/

3000 Agents

Millions of Customers

Dashboards & Alerts Billing

Actions

Events

Processing

State

Page 26: Transactional Streaming: If you can compute it, you can probably stream it

Call Center Management

Events

• “Begin Call” Calling Number, Agent Id, Start Time, etc…

• “End Call” Calling Number, Agent Id, End Time, etc…

Page 27: Transactional Streaming: If you can compute it, you can probably stream it

What Kind of Problems

• Correlation - Streaming Join

• Out-of-order delivery

• At least once delivery - How to dedup

• Generate new event on call completion - once

• Precise Accounting

• Precise Stats - Event time vs processing time

Page 28: Transactional Streaming: If you can compute it, you can probably stream it

Public Code

https://github.com/VoltDB/app-callcenter

It’s not finished as of today…

Page 29: Transactional Streaming: If you can compute it, you can probably stream it

What’s the Hardest Part?

BeginCall code

EndCall code

State

Fake Call Generator

(Makes event pairs with delay)

Bad Network Transformer

(Duplicate & delay)

My Client Code

Page 30: Transactional Streaming: If you can compute it, you can probably stream it

Correlation Requires State

Page 31: Transactional Streaming: If you can compute it, you can probably stream it

Schema for Call Center Example

    CREATE TABLE opencalls
    (
      call_id   BIGINT            NOT NULL,
      agent_id  INTEGER           NOT NULL,
      phone_no  VARCHAR(20 BYTES) NOT NULL,
      start_ts  TIMESTAMP         DEFAULT NULL,
      end_ts    TIMESTAMP         DEFAULT NULL,
      PRIMARY KEY (call_id, agent_id, phone_no)
    );

    CREATE TABLE completedcalls
    (
      call_id   BIGINT            NOT NULL,
      agent_id  INTEGER           NOT NULL,
      phone_no  VARCHAR(20 BYTES) NOT NULL,
      start_ts  TIMESTAMP         NOT NULL,
      end_ts    TIMESTAMP         NOT NULL,
      duration  INTEGER           NOT NULL,
      PRIMARY KEY (call_id, agent_id, phone_no)
    );

opencalls: Unpaired call begin/end events. Can arrive in any order.

completedcalls: Any match transactionally moves to the completed calls table.

Page 32: Transactional Streaming: If you can compute it, you can probably stream it

Filtering Duplicates Requires Idempotence

Page 33: Transactional Streaming: If you can compute it, you can probably stream it

Idempotence

is the property of certain operations in mathematics and computer science, that can be applied multiple times without changing the result beyond the initial application.

Page 34: Transactional Streaming: If you can compute it, you can probably stream it

Idempotent                                   Not Idempotent

set x = 5;                                   x++;
  same as                                      not same as
set x = 5; set x = 5;                        x++; x++;

if (x % 2 == 0) x++;                         if (x % 2 == 0) x *= 2;
  same as                                      not same as
if (x % 2 == 0) x++; if (x % 2 == 0) x++;    if (x % 2 == 0) x *= 2; if (x % 2 == 0) x *= 2;

spill coffee on brown pants                  eat whole plate of spaghetti
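
The comparison above can be checked mechanically. The following is a minimal sketch, not code from the talk: it treats each statement as a function from the old value of x to the new one, and calls an operation idempotent when applying it twice gives the same result as applying it once. The function names and the sample range are assumptions made for illustration.

```python
def is_idempotent(op, samples):
    """True if applying op twice equals applying it once, for every sample."""
    return all(op(op(x)) == op(x) for x in samples)


def set_to_five(x):
    # set x = 5;
    return 5


def increment(x):
    # x++;
    return x + 1


def bump_if_even(x):
    # if (x % 2 == 0) x++;
    return x + 1 if x % 2 == 0 else x


def double_if_even(x):
    # if (x % 2 == 0) x *= 2;
    return x * 2 if x % 2 == 0 else x


SAMPLES = list(range(-4, 5))

if __name__ == "__main__":
    for op in (set_to_five, increment, bump_if_even, double_if_even):
        print(op.__name__, is_idempotent(op, SAMPLES))
```

Checking a finite sample can only disprove idempotence, never prove it, so this is a smoke test rather than a proof.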

Page 35: Transactional Streaming: If you can compute it, you can probably stream it

At-Least-Once Delivery + Idempotent Operations = Exactly Once Semantics

Page 36: Transactional Streaming: If you can compute it, you can probably stream it

How to make BeginCall Idempotent?

• If call record is in completed calls, ignore.

• If the call record is in open calls and is missing end time, ignore.

• If call record is in open calls, check if this event completes the call. If yes, handle swapped begin & end.

• Otherwise, create a new record in open calls table.

open calls

completed calls

Tables

Page 38: Transactional Streaming: If you can compute it, you can probably stream it

• If call record is in completed calls, ignore.

• If the call record is in open calls and is missing end time, ignore.

• If call record is in open calls, check if this event completes the call. If yes, handle swapped begin & end.

• Otherwise, create a new record in open calls table.

This thing to the left is a transaction.
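
The rules for BeginCall can be sketched in a few lines. This is an illustration only, not the code from the app-callcenter repository: two in-memory dictionaries stand in for the open calls and completed calls tables, timestamps are plain integers, and the names `CallState`, `begin_call` and `end_call` are hypothetical. Each method is assumed to run as one ACID transaction, which a single-threaded Python object only imitates.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (call_id, agent_id, phone_no), mirroring the primary key in the schema.
Key = Tuple[int, int, str]


@dataclass
class OpenCall:
    start_ts: Optional[int] = None
    end_ts: Optional[int] = None


@dataclass(frozen=True)
class CompletedCall:
    start_ts: int
    end_ts: int
    duration: int


class CallState:
    """In-memory stand-in for the two tables (an assumption for this sketch)."""

    def __init__(self) -> None:
        self.open_calls: Dict[Key, OpenCall] = {}
        self.completed_calls: Dict[Key, CompletedCall] = {}

    def _complete(self, key: Key, start_ts: int, end_ts: int) -> None:
        # Move the matched pair from open calls to completed calls.
        del self.open_calls[key]
        self.completed_calls[key] = CompletedCall(start_ts, end_ts, end_ts - start_ts)

    def begin_call(self, key: Key, start_ts: int) -> str:
        if key in self.completed_calls:
            return "ignored"              # call already completed
        record = self.open_calls.get(key)
        if record is not None:
            if record.end_ts is None:
                return "ignored"          # duplicate begin event
            # The end event arrived first: swapped begin & end.
            self._complete(key, start_ts, record.end_ts)
            return "completed"
        self.open_calls[key] = OpenCall(start_ts=start_ts)
        return "opened"

    def end_call(self, key: Key, end_ts: int) -> str:
        if key in self.completed_calls:
            return "ignored"
        record = self.open_calls.get(key)
        if record is not None:
            if record.start_ts is None:
                return "ignored"          # duplicate end event
            self._complete(key, record.start_ts, end_ts)
            return "completed"
        self.open_calls[key] = OpenCall(end_ts=end_ts)
        return "opened"
```

Replaying the same events with duplicates, or with begin and end swapped, leaves the same single row in completed calls. That is the property that lets at-least-once delivery behave like exactly-once processing.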

Page 39: Transactional Streaming: If you can compute it, you can probably stream it

Actual Math

https://www.flickr.com/photos/kimmanleyort/13148718593

Accounting & Statistics May Require:

Page 40: Transactional Streaming: If you can compute it, you can probably stream it

Counting

• Counting is hard at scale.

• 2 Kinds of fail:

• Missed counts

• Extra counts

Page 41: Transactional Streaming: If you can compute it, you can probably stream it

Counting

Incrementer 1:   Read                Write 28

Incrementer 2:             Read                Write 28

Value:           x=27      x=27      x=28      x=28
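
The interleaving in this picture can be replayed deterministically. The sketch below is an illustration, not code from the talk: `lost_update` performs the two reads and two writes in the order shown, and `IsolatedCounter` uses a lock as a stand-in for isolation, so that each read-modify-write runs alone. The lock is an assumption of the sketch, not a claim about how any particular database implements isolation.

```python
import threading


def lost_update(start=27):
    """Replay the slide: both incrementers read before either one writes."""
    value = start
    seen_by_1 = value        # Incrementer 1 reads 27
    seen_by_2 = value        # Incrementer 2 reads 27
    value = seen_by_1 + 1    # Incrementer 1 writes 28
    value = seen_by_2 + 1    # Incrementer 2 writes 28, one count is lost
    return value


class IsolatedCounter:
    """Counter whose read-modify-write cannot interleave with another one."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self.value += 1


def count_with_isolation(start=27, workers=2, increments_each=1):
    counter = IsolatedCounter(start)

    def work():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With two incrementers starting from 27, the unisolated replay ends at 28 while the isolated counter ends at 29.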

Page 42: Transactional Streaming: If you can compute it, you can probably stream it

Processing Code for a Single Event

Database / State

Processing Code for a Single Event

Not Isolated

Page 43: Transactional Streaming: If you can compute it, you can probably stream it

Counting

Systems with single-key consistency

Systems with special features to enable counters

ACID transactional systems

Systems that enforce a single writer

As we say in New England…

Performance is wicked variable.

Not “Read Committed”

Page 44: Transactional Streaming: If you can compute it, you can probably stream it

Accounting

• Accounting is just counting, but more so.

• Need to be able to increment by amount (or decrement).

• Often need to increment/decrement things in groups.

Page 45: Transactional Streaming: If you can compute it, you can probably stream it

Accounting

• When a gamer buys a Mystical Sword of Hegemony, update the following:

• Debit the gamer’s rubies or whatever.

• Update real-world region stats, like swords sold in gamer’s geo-region, total money spent in gamer’s geo-region etc…

• Update game region stats for the current game location, say the “Tar Shoals of Dintymoore”, like number of MSoHs in the region.

• Increment any offer-related stats, like record whether the MSoH was offered because of customer engagement algorithm X15 or B12.
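
A grouped update like this is where atomicity pays off. The sketch below uses Python's built-in `sqlite3` module only because it is a small, widely available transactional store. It is not VoltDB, and the table names, column names and starting balance are invented for illustration. Using the connection as a context manager commits when the block succeeds and rolls back when it raises, so the region stats never record a sword the gamer could not pay for.

```python
import sqlite3


def make_db():
    """Create a tiny in-memory schema (hypothetical tables for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gamers (gamer_id INTEGER PRIMARY KEY, rubies INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE region_stats (region TEXT PRIMARY KEY, "
        "swords_sold INTEGER NOT NULL, rubies_spent INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO gamers VALUES (1, 100)")
    conn.execute("INSERT INTO region_stats VALUES ('New England', 0, 0)")
    conn.commit()
    return conn


def buy_sword(conn, gamer_id, region, price):
    """Apply every update for one purchase, or none of them."""
    try:
        with conn:  # commit on success, roll back if the block raises
            conn.execute(
                "UPDATE region_stats SET swords_sold = swords_sold + 1, "
                "rubies_spent = rubies_spent + ? WHERE region = ?",
                (price, region),
            )
            conn.execute(
                "UPDATE gamers SET rubies = rubies - ? WHERE gamer_id = ?",
                (price, gamer_id),
            )
            (rubies,) = conn.execute(
                "SELECT rubies FROM gamers WHERE gamer_id = ?", (gamer_id,)
            ).fetchone()
            if rubies < 0:
                raise ValueError("not enough rubies")
        return True
    except ValueError:
        return False


def snapshot(conn):
    """Return (rubies, swords_sold, rubies_spent) for the single test rows."""
    (rubies,) = conn.execute("SELECT rubies FROM gamers WHERE gamer_id = 1").fetchone()
    sold, spent = conn.execute(
        "SELECT swords_sold, rubies_spent FROM region_stats WHERE region = 'New England'"
    ).fetchone()
    return rubies, sold, spent
```

The region stats are deliberately updated before the balance check, so a failed purchase only leaves them untouched because of the rollback.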

Page 46: Transactional Streaming: If you can compute it, you can probably stream it

Processing Code for a Single Event

Database / State

x x x x

Not Atomic

Page 47: Transactional Streaming: If you can compute it, you can probably stream it

Accounting

Systems with single-key consistency

Systems with special features to enable counters

ACID transactional systems

Systems that enforce a single writer

As we say in New England…

Performance is wicked variable.

?

Page 48: Transactional Streaming: If you can compute it, you can probably stream it

Last Dollar Problem

• Ad-Tech app wants to show a user an ad from a campaign.

• The price of the ad is $0.90.

• Advertiser has $1.00 campaign budget left.

• If the budget check and the display aren’t ACID, it’s possible to decide to show the ad twice.

• Ad-Tech app is forced to choose between over or under-billing.
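
The scenario above can be sketched with the budget check and the debit in one critical section. This is an illustration under stated assumptions: amounts are integer cents, `CampaignBudget` and `race_for_last_dollar` are hypothetical names, and a lock stands in for the transaction that makes the check and the debit a single step.

```python
import threading


class CampaignBudget:
    """Campaign budget in integer cents (avoids floating-point money)."""

    def __init__(self, budget_cents):
        self.budget_cents = budget_cents
        self._lock = threading.Lock()

    def try_show_ad(self, price_cents):
        # Check and debit happen as one step, so two requests cannot both
        # see the same remaining budget.
        with self._lock:
            if self.budget_cents < price_cents:
                return False
            self.budget_cents -= price_cents
            return True


def race_for_last_dollar(attempts=8):
    """Many concurrent requests for a $0.90 ad against a $1.00 budget."""
    campaign = CampaignBudget(100)
    shown = [False] * attempts

    def request(slot):
        shown[slot] = campaign.try_show_ad(90)

    threads = [threading.Thread(target=request, args=(i,)) for i in range(attempts)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(shown), campaign.budget_cents
```

However the threads are scheduled, exactly one request wins and 10 cents remain, so the app neither over-bills nor refuses a sale it could have made.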

Page 49: Transactional Streaming: If you can compute it, you can probably stream it

Aggregation

• Aggregation is just counting and accounting that the system does for you.

• Often this is counting chopped up by groups.

• E.g., sword sales by region, % success by offer.

• In Call Center, it could be average call length by agent.

Page 50: Transactional Streaming: If you can compute it, you can probably stream it

Accounting Aggregation

Systems with single-key consistency

Systems with special features to enable counters

ACID transactional systems

Systems that enforce a single writer

As we say in New England…

Performance is wicked variable.

?

Page 51: Transactional Streaming: If you can compute it, you can probably stream it

How to Aggregate Without Consistency?

• Use a stand-alone stream processor.

• Best fit for aggregation by time, and specifically by processing time, not event time.

• Run a query on all the data every time you want the aggregation.

• BOO!

Page 52: Transactional Streaming: If you can compute it, you can probably stream it

Actual Math

What’s the mean and standard deviation of call length chopped up various ways?

Page 53: Transactional Streaming: If you can compute it, you can probably stream it

Running Variance is my next band name.

Page 54: Transactional Streaming: If you can compute it, you can probably stream it

Running Standard Deviation
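
The formula from this slide did not survive in the transcript. One well-known way to keep a running mean and standard deviation is Welford's online algorithm, sketched below. It is offered as an assumption about the kind of math meant here, not as the speaker's exact method, and the class name `RunningStats` is hypothetical. The state is three numbers per group, updated once per event, which is why it fits in a single transaction per call.

```python
import math


class RunningStats:
    """Welford's online algorithm for mean and population variance."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the old mean (delta) and the updated mean together.
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        # Population variance; returns 0.0 before any value has been added.
        return self._m2 / self.count if self.count else 0.0

    @property
    def stddev(self):
        return math.sqrt(self.variance)
```

To chop the statistics up by agent or by region, keep one `RunningStats` per group key and update the matching one when a call completes.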

Page 55: Transactional Streaming: If you can compute it, you can probably stream it

The Details (mostly) Don’t Matter

• Still need to think about performance and likely horizontal partitioning of work.

• Integration of State & Processing + Full ACID Transactions => I can program this math without thinking about:

• Failure

• Interference from weak isolation.

• Partial Visibility to State

Page 56: Transactional Streaming: If you can compute it, you can probably stream it

Bonus Topics!

Page 57: Transactional Streaming: If you can compute it, you can probably stream it

Latency

Page 58: Transactional Streaming: If you can compute it, you can probably stream it

Low Latency Can Affect the Decision

500ms

Want to be here You lose money here

Page 59: Transactional Streaming: If you can compute it, you can probably stream it

Get Into the “Fast Path”

• Policy Enforcement in Telco

• Fraud Detection “Smoke Tests”

• Change what a user sees in response to action:

• Change the next webpage content based on recent website actions.

• Pick what’s behind the magic door based on how the game is going.

Page 60: Transactional Streaming: If you can compute it, you can probably stream it

Does your data matter?

Page 61: Transactional Streaming: If you can compute it, you can probably stream it

Problem

Factory full of robots

Sometimes they break

They log metadata

Page 62: Transactional Streaming: If you can compute it, you can probably stream it

When Imperfect is Enough

• Before: No metadata. Maintenance works on stuff based on their experience, schedules and visual inspection.

• Now: Basic stream processing system is up 99% of the time, and provides a much richer guidance to maintenance. Robots fail less often and cost less to operate.

• Possible Future: More sophisticated stream processing is up 99.99% of the time and offers even more insight. Robots fail a tiny bit less often and costs are a tiny bit down.

Page 63: Transactional Streaming: If you can compute it, you can probably stream it

When Imperfect Isn’t Worth It

Cost of System X + (# of Operations x Probability of Failure (under system X) x Expected Average Failure Cost)

Cost of System X: Licenses, Hardware, Engineering (Switching Tech)

• I’ve worked on Ad-Tech use cases => High # Operations

• Complex Multi-Cluster/System Monsters => High % failure

• Billing systems and fraud systems => High cost per failure
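
The cost comparison on this slide is simple arithmetic, sketched below. The function name and parameter names are hypothetical, and any figures passed to it are the caller's own estimates. The talk gives no numbers.

```python
def expected_total_cost(system_cost, num_operations, failure_probability, cost_per_failure):
    """Cost of System X plus the expected cost of its failures.

    system_cost covers licenses, hardware and engineering (switching tech).
    failure_probability is the per-operation chance of failure under the system.
    """
    return system_cost + num_operations * failure_probability * cost_per_failure


def cheaper_system(option_a, option_b):
    """Return 'a' or 'b' for whichever option has the lower expected total cost.

    Each option is a tuple: (system_cost, num_operations,
    failure_probability, cost_per_failure). Ties go to 'a'.
    """
    return "a" if expected_total_cost(*option_a) <= expected_total_cost(*option_b) else "b"
```

A high operation count, a high failure rate, or a high cost per failure each inflate the second term, which is when paying more for a more consistent system can come out ahead.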

Page 64: Transactional Streaming: If you can compute it, you can probably stream it

More consistent systems don’t have to be more expensive

Easier to develop => Less Engineering

More Efficient => Less Hardware

Page 65: Transactional Streaming: If you can compute it, you can probably stream it

Conclusion - Thank You!

• Operations => Integration Wins
  Analytics, Batch => Use Specialized Tools

• With transactions, complex math becomes mostly typing.

• Many of these problems can be solved without transactional streaming, but…

• It’s going to be harder

• It might be less accurate

[Diagram with regions labeled "Stuff I Know", "Stuff I Don't Know", "BS" and "THIS TALK"]

http://chat.voltdb.com

@johnhugg [email protected]

all images from wikimedia w/ cc license unless otherwise noted