Alexander Kolb – Flink. Yet another Streaming Framework?

Alexander Kolb, Otto Group BI, Hamburg, Germany, 2015

Upload: flink-forward

Posted on 08-Jan-2017

Page 1: Alexander Kolb – Flink. Yet another Streaming Framework?

Page 2: Alexander Kolb – Flink. Yet another Streaming Framework?

Evaluation Streaming Frameworks Alexander Kolb, Otto Group BI, Hamburg, Germany, 2015

Alexander Kolb, Otto Group BI

@lofifnc

Page 3: Alexander Kolb – Flink. Yet another Streaming Framework?

Introduction

Page 4: Alexander Kolb – Flink. Yet another Streaming Framework?

Evaluation

Usability

Functionality

Architecture

Support

Non-Functional-Requirements

Page 5: Alexander Kolb – Flink. Yet another Streaming Framework?

Approach

Rating based on:
- Research
- Hands-on

Page 6: Alexander Kolb – Flink. Yet another Streaming Framework?

Use-case

Page 7: Alexander Kolb – Flink. Yet another Streaming Framework?

Use-case

Page 8: Alexander Kolb – Flink. Yet another Streaming Framework?

Frameworks

Page 9: Alexander Kolb – Flink. Yet another Streaming Framework?

Frameworks

SQLStream

Pulsar

SPQR

Apache Spark

Apache Flink

Page 10: Alexander Kolb – Flink. Yet another Streaming Framework?

SQLStream

Page 11: Alexander Kolb – Flink. Yet another Streaming Framework?

SQLStream

Architecture

source: sqlstream.com

Page 12: Alexander Kolb – Flink. Yet another Streaming Framework?

SQLStream

Page 13: Alexander Kolb – Flink. Yet another Streaming Framework?

SQLStream

Window Aggregation

SELECT STREAM
    B.pagetype,
    B.ecid,
    productid,
    SUM(QUANTITY) AS "views"
FROM VIEWS AS B
GROUP BY FLOOR((B.ROWTIME - TIMESTAMP '1970-01-01 00:00:00')
        MINUTE / 5 TO MINUTE),
    PRODUCTID, PAGETYPE, ECID;
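The FLOOR expression in the query above assigns each row to a 5-minute bucket of its ROWTIME, and QUANTITY is then summed per bucket and key. As a rough, framework-independent illustration of that tumbling-window idea, here is a minimal Python sketch. The event dictionaries, the field names (`rowtime`, `productid`, `pagetype`, `ecid`, `quantity`), the helper names and all sample values are assumptions made for this example; they mirror the columns of the query but are not an API of any framework from the talk.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 5 * 60  # 5-minute tumbling window, as in the query above


def window_start(ts):
    """Floor a timezone-aware timestamp to the start of its 5-minute window."""
    epoch_seconds = int(ts.timestamp())
    floored = epoch_seconds - epoch_seconds % WINDOW_SECONDS
    return datetime.fromtimestamp(floored, tz=timezone.utc)


def aggregate_views(events):
    """Sum quantity per (window start, productid, pagetype, ecid)."""
    totals = defaultdict(int)
    for ev in events:
        key = (window_start(ev["rowtime"]), ev["productid"],
               ev["pagetype"], ev["ecid"])
        totals[key] += ev["quantity"]
    return dict(totals)


def view(minute, second, quantity):
    """Build a hypothetical page-view event (all values are made up)."""
    return {
        "rowtime": datetime(2015, 10, 12, 9, minute, second, tzinfo=timezone.utc),
        "productid": "p1",
        "pagetype": "detail",
        "ecid": "e1",
        "quantity": quantity,
    }


# Two events fall into the 09:00 window, one into the 09:05 window.
sample_events = [view(1, 30, 2), view(4, 59, 3), view(5, 0, 1)]
views_per_window = aggregate_views(sample_events)

for (start, productid, pagetype, ecid), total in sorted(views_per_window.items()):
    print(start.isoformat(), productid, pagetype, ecid, total)
```

Unlike a streaming engine, this sketch processes a finished list in one pass and ignores late or out-of-order events; it only shows how the window key is derived and used for grouping.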

Page 14: Alexander Kolb – Flink. Yet another Streaming Framework?

Pulsar

Page 15: Alexander Kolb – Flink. Yet another Streaming Framework?

Pulsar

Architecture

source: github.com/pulsarIO

Page 16: Alexander Kolb – Flink. Yet another Streaming Framework?

Pulsar

Page 17: Alexander Kolb – Flink. Yet another Streaming Framework?

Pulsar

Window Aggregation

create context MCContext start @now end after 60 seconds;

context MCContext
insert into ViewAgg select count(*) as views, prid
from PageView group by prid output snapshot when terminated;

Page 18: Alexander Kolb – Flink. Yet another Streaming Framework?

SPQR

Page 19: Alexander Kolb – Flink. Yet another Streaming Framework?

SPQR

source: github.com/ottogroup/SPQR

Page 20: Alexander Kolb – Flink. Yet another Streaming Framework?

SPQR

Page 21: Alexander Kolb – Flink. Yet another Streaming Framework?

SPQR

Window Aggregation

select productid, ecid, sum(quantity)
from views.win:time_batch(5 min)
group by productid, ecid

Page 22: Alexander Kolb – Flink. Yet another Streaming Framework?

Apache Spark

Page 23: Alexander Kolb – Flink. Yet another Streaming Framework?

Apache Spark

Architecture

source: spark.apache.org

Page 24: Alexander Kolb – Flink. Yet another Streaming Framework?

Apache Spark

Page 25: Alexander Kolb – Flink. Yet another Streaming Framework?

Apache Spark

Aggregation

val aggViews = views.reduceByKeyAndWindow({
  case ((pageType, ecid, sum, price), (_, _, quant, _)) =>
    (pageType, ecid, sum + quant, price)
}, Minutes(5), Minutes(5))

Page 26: Alexander Kolb – Flink. Yet another Streaming Framework?

Apache Flink

Architecture

source: spark.apache.org

source: flink.apache.org

Page 27: Alexander Kolb – Flink. Yet another Streaming Framework?

Apache Flink

Page 28: Alexander Kolb – Flink. Yet another Streaming Framework?

Apache Flink

Window Aggregation

val aggViews = input.window(Time.of(5, TimeUnit.MINUTES))
  .groupBy("productId").sum("quantity")

Page 29: Alexander Kolb – Flink. Yet another Streaming Framework?

Result Evaluation

Page 30: Alexander Kolb – Flink. Yet another Streaming Framework?

Summary

Use-case

Topic                              Unit   Pulsar.io  SQLStream  SPQR      Flink  Spark
Time for building the stream       hours  40         35+ (POC)  8+ (POC)  13     4
Time for adding missing connector  hours  3          8          1         3      0.5
Points                                    3.14       2.06       3.44      4.16   4.45

Page 31: Alexander Kolb – Flink. Yet another Streaming Framework?

List of Rating Aspects

DSL/DDL/UI for creating Pipelines / Required know-how to define new Pipelines / Project documentation / Workflow / Testing Workflows / hot deploying / redeploying of pipelines / dynamic topology changes / Monitoring / Deployment / Dashboard for data visualization / Ease of defining UDFs / Merge / Sum / Count / Min/max/avg / Aggregate / Transform / Parsing (xml/json/csv) / Group-by / Join / Ease of defining new connectors / Kafka / WebSocket / JDBC / JMS / HDFS / File / Effort for cluster deployment / Configuration effort / Supports YARN / Supports Mesos / Scalability / Resilience / Predefined communication framework / Dependencies / Flexibility / Expandability / Buffering/Pressure handling / Partitioning/Parallelism / Strategy for Partitioning/Parallelism? / Ordering / Guarantees / State-Management / Fault tolerance / Licensing model / Professional support available / Community Activity / License / Maturity / Manageable code-base / Community Size

Page 32: Alexander Kolb – Flink. Yet another Streaming Framework?

Summary

Topic                    SQLStream  Pulsar.io  SPQR  Spark  Flink  Weight
Usability                2.6        3          2.2   3.6    3.9    15
Flexibility              2.5        4          3     1.5    1.5    8
User-Interface           3.3        1.8        1     3      3.3    6
Operators                4.8        4.3        4.3   3.9    4.7    10
Connectors               4.7        1.9        1.9   2.4    2.6    6
Deployment               2.4        3.2        2.8   3.8    3.7    10
Architecture/Concepts    2          3.2        3.3   4      4      12
Functional Requirements  2          3.2        2.4   4      4      14
Costs                    0          5          5     5      5      5
Service/Support          2.5        0.5        1     4.5    3.5    8
Project                  1.8        3.3        2.5   4.3    3.8    6
Sum                      2.58       3.11       2.74  3.74   3.80   100
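The slide does not say how the Sum row is derived from the topic ratings and weights. One plausible reading is a weight-normalized average per framework; the sketch below recomputes the totals under that assumption. The names `FRAMEWORKS`, `SUMMARY` and `weighted_scores` are made up for this example, and the numbers are copied from the table above.

```python
FRAMEWORKS = ("SQLStream", "Pulsar.io", "SPQR", "Spark", "Flink")

# topic: (ratings in FRAMEWORKS order, weight), copied from the summary table
SUMMARY = {
    "Usability":               ((2.6, 3.0, 2.2, 3.6, 3.9), 15),
    "Flexibility":             ((2.5, 4.0, 3.0, 1.5, 1.5), 8),
    "User-Interface":          ((3.3, 1.8, 1.0, 3.0, 3.3), 6),
    "Operators":               ((4.8, 4.3, 4.3, 3.9, 4.7), 10),
    "Connectors":              ((4.7, 1.9, 1.9, 2.4, 2.6), 6),
    "Deployment":              ((2.4, 3.2, 2.8, 3.8, 3.7), 10),
    "Architecture/Concepts":   ((2.0, 3.2, 3.3, 4.0, 4.0), 12),
    "Functional Requirements": ((2.0, 3.2, 2.4, 4.0, 4.0), 14),
    "Costs":                   ((0.0, 5.0, 5.0, 5.0, 5.0), 5),
    "Service/Support":         ((2.5, 0.5, 1.0, 4.5, 3.5), 8),
    "Project":                 ((1.8, 3.3, 2.5, 4.3, 3.8), 6),
}


def weighted_scores(summary):
    """Assumed scoring rule: sum(rating * weight) / sum(weight) per framework."""
    total_weight = sum(weight for _, weight in summary.values())
    scores = {}
    for column, framework in enumerate(FRAMEWORKS):
        weighted = sum(ratings[column] * weight
                       for ratings, weight in summary.values())
        scores[framework] = round(weighted / total_weight, 2)
    return scores


scores = weighted_scores(SUMMARY)
ranking = sorted(scores, key=scores.get)  # lowest to highest score

print(scores)
print(ranking)
```

With the ratings as printed on the slide, this yields about 2.62, 3.06, 2.67, 3.66 and 3.70 for SQLStream, Pulsar.io, SPQR, Spark and Flink. That is close to, but not identical with, the Sum row (2.58, 3.11, 2.74, 3.74, 3.80), so the original totals were presumably computed differently, perhaps from the individual rating aspects rather than the topic ratings shown. The ranking of the five frameworks is the same in both cases.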

Page 33: Alexander Kolb – Flink. Yet another Streaming Framework?

Summary

[Chart: SQLStream, Pulsar.io, SPQR, Spark and Flink compared across Usability, Flexibility, User-Interface, Operators, Connectors, Deployment, Architecture/Concepts, Functional Requirements]

Page 34: Alexander Kolb – Flink. Yet another Streaming Framework?

Final Scores

            SQLStream  Pulsar.io  SPQR  Spark  Flink
Evaluation  2.58       3.11       2.74  3.74   3.80
Use-case    2.06       3.14       3.44  4.45   4.16

Page 35: Alexander Kolb – Flink. Yet another Streaming Framework?

ottogroup.com

WE ARE HIRING!