The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter

Stream Processing Systems Karthik Ramasamy Twitter @karthikz

Upload: the-hive

Post on 09-Feb-2017


Category: Data & Analytics



TRANSCRIPT

Page 1: The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter

Stream Processing Systems

Karthik Ramasamy
Twitter

@karthikz

Page 2

Value of Real Time Data
It’s contextual

[1] Courtesy Michael Franklin, BIRTE, 2015.

Page 3

Heron
Design: Goals

Batching of tuples
Amortizing the cost of transferring tuples

Task isolation
Ease of debug-ability/isolation/profiling

Fully API compatible with Storm
Directed acyclic graph
Topologies, Spouts and Bolts

Support for back pressure
Topologies should be self-adjusting

Use of mainstream languages
C++, Java and Python

Efficiency
Reduce resource consumption
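
Two of the goals on this slide, batching of tuples and back pressure, can be illustrated with a small sketch. This is not Heron's implementation or API: the class names `TupleBatcher` and `BackPressureBuffer`, the `send` callback, and the water-mark thresholds are all hypothetical, chosen only to show the general ideas. The batcher pays the per-transfer cost once per batch rather than once per tuple, and the buffer tells the upstream side to pause when its backlog grows too large.

```python
from collections import deque


class TupleBatcher:
    """Buffers tuples and passes them to `send` in batches (hypothetical helper)."""

    def __init__(self, send, batch_size):
        self.send = send              # callable that transfers one list of tuples
        self.batch_size = batch_size
        self.pending = []
        self.transfers = 0            # how many times `send` was invoked

    def emit(self, tup):
        self.pending.append(tup)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # One transfer carries every pending tuple, amortizing the fixed cost.
        if self.pending:
            self.send(self.pending)
            self.transfers += 1
            self.pending = []


class BackPressureBuffer:
    """Bounded buffer with high/low water marks (illustrative thresholds)."""

    def __init__(self, high_water, low_water):
        if low_water >= high_water:
            raise ValueError("low_water must be below high_water")
        self.high_water = high_water
        self.low_water = low_water
        self.queue = deque()
        self.spout_paused = False

    def offer(self, batch):
        """Upstream side: returns False while the spout is asked to pause."""
        if self.spout_paused:
            return False
        self.queue.extend(batch)
        if len(self.queue) >= self.high_water:
            self.spout_paused = True   # backlog too large: stop emitting
        return True

    def drain(self, n):
        """Downstream side: consumes up to n tuples."""
        taken = [self.queue.popleft() for _ in range(min(n, len(self.queue)))]
        if self.spout_paused and len(self.queue) <= self.low_water:
            self.spout_paused = False  # backlog cleared: resume emitting
        return taken
```

With a batch size of 4, emitting ten tuples and then calling `flush()` results in three transfers instead of ten. Using two water marks rather than a single threshold is a design choice in this sketch that keeps the pause signal from flapping when the backlog hovers around one value.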

Page 4

Better Storm

Twitter Heron

Container Based Architecture

Separate Monitoring and Scheduling

Simplified Execution Model

Much Better Performance

Page 5

Heron
Sample Topologies

Page 6

Heron @ Twitter
Storm is decommissioned

LARGEST CLUSTER

100’s OF TOPOLOGIES

BILLIONS OF MESSAGES

100’s OF TERABYTES

REDUCED INCIDENTS

GOOD NIGHT SLEEP

3X reduction in resource usage

Page 7

The Road Ahead
Technology Challenges

Auto scaling the system in the presence of unpredictability

Auto tuning of real time analytics jobs/queries

Exploiting faster networks for efficiently moving data

Page 8

Get in Touch
@karthikz