Apache Flink
Big Data Stream Processing
Tilmann Rabl
Berlin Big Data Center
www.dima.tu-berlin.de | bbdc.berlin | rabl@tu-berlin.de
XLDB – 11.10.2017
© DIMA 2017
© 2013 Berlin Big Data Center • All Rights Reserved
Agenda
Disclaimer: I am neither a Flink developer nor affiliated with data Artisans.
Agenda
Flink Primer
• Background & APIs (-> Polystore functionality)
• Execution Engine
• Some key features
Stream Processing with Apache Flink
• Key features
With slides from data Artisans, Volker Markl, Asterios Katsifodimos
Flink Timeline
Stratosphere: General Purpose Programming + Database Execution

Draws on Database Technology
• Relational Algebra
• Declarativity
• Query Optimization
• Robust Out-of-core

Draws on MapReduce Technology
• Scalability
• User-defined Functions
• Complex Data Types
• Schema on Read

Adds
• Iterations
• Advanced Dataflows
• General APIs
• Native Streaming
The APIs
Process Function (events, state, time)
DataStream API (streams, windows)
Table API (dynamic tables)
Stream SQL
Stream- & Batch Processing
Analytics
Stateful Event-Driven Applications
Process Function
class MyFunction extends ProcessFunction[MyEvent, Result] {

  // declare state to use in the program
  lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)

  def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = {
    // work with event and state
    (event, state.value) match { … }

    out.collect(…)   // emit events
    state.update(…)  // modify state

    // schedule a timer callback
    ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
  }

  def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[Result]): Unit = {
    // handle callback when event-/processing-time instant is reached
  }
}
Data Stream API
val lines: DataStream[String] = env.addSource(new FlinkKafkaConsumer09[String](…))

val events: DataStream[Event] = lines.map((line) => parse(line))

val stats: DataStream[Statistic] = events
  .keyBy("sensor")
  .timeWindow(Time.seconds(5))
  .aggregate(new MyAggregationFunction())

stats.addSink(new RollingSink(path))
Table API & Stream SQL
What can I do with it?
An engine that can natively support all these workloads.
Flink
Stream processing
Batch processing
Machine Learning at scale
Graph Analysis
Complex event processing
Flink in the Analytics Ecosystem
Applications & Languages: Hive, Mahout, Cascading, Pig, Crunch, Giraph, …
Data processing engines: MapReduce, Flink, Spark, Storm, Tez
App and resource management: Yarn, Mesos
Storage, streams: HDFS, HBase, Kafka
Where in my cluster does Flink fit?
Stages: Gathering, Integration, Analysis
Systems shown: Server logs, Trxn logs, Sensor logs, Upstream systems

- Gather and backup streams
- Offer streams for consumption
- Provide stream recovery

- Analyze and correlate streams
- Create derived streams and state
- Provide these to upstream systems
Architecture
• Hybrid MapReduce and MPP database runtime
• Pipelined/Streaming engine
  – Complete DAG deployed
[Diagram: Job Manager with Worker 1, Worker 2, Worker 3, Worker 4]
Flink Execution Model
• Flink program = DAG* of operators and intermediate streams
• Operator = computation + state
• Intermediate streams = logical stream of records
Technology inside Flink

case class Path (from: Long, to: Long)
val tc = edges.iterate(10) {
  paths: DataSet[Path] =>
    val next = paths
      .join(edges)
      .where("to")
      .equalTo("from") {
        (path, edge) => Path(path.from, edge.to)
      }
      .union(paths)
      .distinct()
    next
}

Program → Dataflow Graph

Pre-flight (Client)
• Cost-based optimizer
• Type extraction stack

Master
• Task scheduling
• Recovery metadata

Workers
• Memory manager
• Out-of-core algorithms
• Batch & streaming
• State & checkpoints

Between Master and Workers: deploy operators, track intermediate results

Dataflow graph labels: DataSource (orders.tbl), Filter, Map, DataSource (lineitem.tbl), Join (Hybrid Hash; build HT, probe), hash-part [0], hash-part [0], GroupRed (sort), forward
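To see what each pass of the bulk iteration on this slide computes, the step function (join, union, distinct) can be mimicked on a single machine. The following is a minimal plain-Java sketch, not Flink code: the names `TransitiveClosureSketch`, `Path`, and `closure` are made up for illustration, an in-memory `Set` stands in for a distributed DataSet, and the early exit once nothing changes is an addition that the slide's `iterate(10)` call does not show.

```java
import java.util.HashSet;
import java.util.Set;

class TransitiveClosureSketch {
    // Hypothetical stand-in for the slide's Path case class.
    record Path(long from, long to) {}

    // Applies the step function at most maxIterations times:
    // join paths with edges on path.to == edge.from, union with the
    // existing paths, and deduplicate (the Set takes care of distinct).
    static Set<Path> closure(Set<Path> edges, int maxIterations) {
        Set<Path> paths = new HashSet<>(edges);
        for (int i = 0; i < maxIterations; i++) {
            Set<Path> next = new HashSet<>(paths);      // union(paths) + distinct()
            for (Path p : paths) {
                for (Path e : edges) {
                    if (p.to() == e.from()) {           // where("to").equalTo("from")
                        next.add(new Path(p.from(), e.to()));
                    }
                }
            }
            if (next.size() == paths.size()) {
                break;                                  // assumption: stop once a pass adds nothing
            }
            paths = next;
        }
        return paths;
    }

    public static void main(String[] args) {
        Set<Path> edges = Set.of(new Path(1, 2), new Path(2, 3), new Path(3, 4));
        System.out.println(closure(edges, 10).size()); // prints 6
    }
}
```

The nested loop is the most naive join imaginable; choosing between hash and sort strategies, partitioning, and caching the loop-invariant `edges` input are the kinds of decisions the slides attribute to the optimizer.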
Rich set of operators
Map, Reduce, Join, CoGroup, Union, Iterate, Delta Iterate, Filter, FlatMap, GroupReduce, Project, Aggregate, Distinct, Vertex-Update, Accumulators, …
Effect of optimization
Runs:
• Run on a sample on the laptop
• Run on large files on the cluster
• Run a month later after the data evolved
Plans: Execution Plan A, Execution Plan B, Execution Plan C
Optimizer choices:
• Hash vs. Sort
• Partition vs. Broadcast
• Caching
• Reusing partition/sort
Flink Optimizer: Transitive Closure
• What you write is not what is executed
• No need to hardcode execution strategies
• Flink Optimizer decides:
  – Pipelines and dam/barrier placement
  – Sort- vs. hash-based execution
  – Data exchange (partition vs. broadcast)
  – Data partitioning steps
  – In-memory caching

Logical dataflow labels: HDFS, paths, newPaths, Join, Union, Distinct, Step function, Iterate, replace

Optimized plan annotations:
• Hash Partition on [0]
• Hash Partition on [1]
• Hybrid Hash Join
• Loop-invariant data cached in memory
• Group Reduce (Sorted (on [0]))
• Co-locate JOIN + UNION
• Hash Partition on [1]
• Forward
• Co-locate DISTINCT + JOIN
Scale Out
Stream Processing with Flink
8 Requirements of Big Streaming
• Keep the data moving
  – Streaming architecture
• Declarative access
  – E.g. StreamSQL, CQL
• Handle imperfections
  – Late, missing, unordered items
• Predictable outcomes
  – Consistency, event time
• Integrate stored and streaming data
  – Hybrid stream and batch
• Data safety and availability
  – Fault tolerance, durable state
• Automatic partitioning and scaling
  – Distributed processing
• Instantaneous processing and response

The 8 Requirements of Real-Time Stream Processing – Stonebraker et al. 2005
8 Requirements of Streaming Systems
• Keep the data moving
  – Streaming architecture
• Declarative access
  – E.g. StreamSQL, CQL
• Handle imperfections
  – Late, missing, unordered items
• Predictable outcomes
  – Consistency, event time
• Integrate stored and streaming data
  – Hybrid stream and batch – see StreamSQL
• Data safety and availability
  – Fault tolerance, durable state
• Automatic partitioning and scaling
  – Distributed processing
• Instantaneous processing and response

The 8 Requirements of Real-Time Stream Processing – Stonebraker et al. 2005
How to keep data moving?
Discretized Streams (mini-batch)
Stream discretizer → Job, Job, Job, Job

while (true) {
  // get next few records
  // issue batch computation
}

Native streaming
Long-standing operators

while (true) {
  // process next record
}
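The two loops on this slide can be made concrete with a running sum. This is a minimal plain-Java sketch rather than code from any streaming engine: the class and method names are hypothetical, and a finite iterator replaces the slide's `while (true)` so that the example terminates.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class KeepDataMovingSketch {
    // Discretized streams: buffer a few records, then run one batch job over them.
    static List<Integer> miniBatch(Iterator<Integer> source, int batchSize) {
        List<Integer> emitted = new ArrayList<>();
        int sum = 0;
        while (source.hasNext()) {
            List<Integer> batch = new ArrayList<>();
            while (source.hasNext() && batch.size() < batchSize) {
                batch.add(source.next());       // get next few records
            }
            for (int r : batch) {
                sum += r;                       // issue batch computation
            }
            emitted.add(sum);                   // one result per batch
        }
        return emitted;
    }

    // Native streaming: a long-standing operator updates its state per record.
    static List<Integer> nativeStreaming(Iterator<Integer> source) {
        List<Integer> emitted = new ArrayList<>();
        int sum = 0;
        while (source.hasNext()) {
            sum += source.next();               // process next record
            emitted.add(sum);                   // result is available right away
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Integer> records = List.of(1, 2, 3, 4, 5);
        System.out.println(miniBatch(records.iterator(), 2));    // [3, 10, 15]
        System.out.println(nativeStreaming(records.iterator())); // [1, 3, 6, 10, 15]
    }
}
```

Both variants end at the same total; the difference is how long a record waits before it influences an output, which is what "keep the data moving" is about.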
Declarative Access – Stream SQL
Stream / Table Duality
Table with Primary Key
Table without Primary Key
Handle Imperfections - Event Time et al.
• Event time
  – Data item production time
• Ingestion time
  – System time when data item is received
• Processing time
  – System time when data item is processed
• Typically, these do not match!
• In practice, streams are unordered!
Image: Tyler Akidau
Time: Event Time Example
Processing Time | 1977       | 1980      | 1983       | 1999      | 2002       | 2005        | 2015
Event Time      | Episode IV | Episode V | Episode VI | Episode I | Episode II | Episode III | Episode VII
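The episode timeline can be treated as a tiny out-of-order stream: the release year plays the role of processing time and the episode number the role of event time. The sketch below is plain Java with hypothetical names (`EventTimeSketch`, `Film`); it only illustrates the mismatch between the two orders and is not how Flink tracks event time internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class EventTimeSketch {
    // Hypothetical event type: episode = event time, releaseYear = processing time.
    record Film(int episode, int releaseYear) {}

    // Events in the order they arrived (sorted by processing time).
    static List<Film> arrivalOrder() {
        return List.of(
            new Film(4, 1977), new Film(5, 1980), new Film(6, 1983),
            new Film(1, 1999), new Film(2, 2002), new Film(3, 2005),
            new Film(7, 2015));
    }

    // Reorders a fully buffered stream by event time.
    static List<Integer> episodesInEventTimeOrder(List<Film> arrived) {
        List<Film> sorted = new ArrayList<>(arrived);
        sorted.sort(Comparator.comparingInt(Film::episode));
        List<Integer> episodes = new ArrayList<>();
        for (Film f : sorted) {
            episodes.add(f.episode());
        }
        return episodes;
    }

    // Counts events that arrive after an event with a larger event time.
    static int lateArrivals(List<Film> arrived) {
        int maxSeen = Integer.MIN_VALUE;
        int late = 0;
        for (Film f : arrived) {
            if (f.episode() < maxSeen) {
                late++;
            }
            maxSeen = Math.max(maxSeen, f.episode());
        }
        return late;
    }

    public static void main(String[] args) {
        System.out.println(episodesInEventTimeOrder(arrivalOrder())); // [1, 2, 3, 4, 5, 6, 7]
        System.out.println(lateArrivals(arrivalOrder()));             // 3
    }
}
```

Sorting works here only because the whole stream is buffered; on an unbounded stream a system has to decide how long to wait for late items.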
Flink’s Windowing
• Windows can be any combination of (multiple) triggers & evictions
  – Arbitrary tumbling, sliding, session, etc. windows can be constructed.
• Common triggers/evictions part of the API
  – Time (processing vs. event time), Count
• Even more flexibility: define your own UDF trigger/eviction
• Examples:
    dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5)));
    dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5)));
• Flink will handle event time, ordering, etc.
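What a 5-second tumbling event-time window does to individual records can be sketched without any Flink dependency. The code below is an illustrative plain-Java approximation: `TumblingWindowSketch`, `windowStart`, and `countPerWindow` are hypothetical names, timestamps are assumed to be non-negative milliseconds with no window offset, and triggers, evictors, and watermarks are left out entirely.

```java
import java.util.Map;
import java.util.TreeMap;

class TumblingWindowSketch {
    // Start of the tumbling window that contains a non-negative timestamp.
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Counts events per window. Arrival order does not matter because the
    // assignment depends only on each event's own timestamp.
    static Map<Long, Integer> countPerWindow(long[] eventTimesMs, long sizeMs) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long t : eventTimesMs) {
            counts.merge(windowStart(t, sizeMs), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] outOfOrder = {7_000, 1_200, 4_999, 5_000, 12_345};
        System.out.println(countPerWindow(outOfOrder, 5_000)); // {0=2, 5000=2, 10000=1}
    }
}
```

The event at 1,200 ms lands in the first window even though it arrives after the event at 7,000 ms, which is the behavior the last bullet alludes to.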
Example Analysis: Windowed Aggregation
val windowedStream = stockStream.window(Time.of(10, SECONDS)).every(Time.of(5, SECONDS))  // (1)
val lowest = windowedStream.minBy("price")                                                // (2)
val maxByStock = windowedStream.groupBy("symbol").maxBy("price")                          // (3)
val rollingMean = windowedStream.groupBy("symbol").mapWindow(mean _)                      // (4)

(1) StockPrice(SPX, 2113.9), StockPrice(FTSE, 6931.7), StockPrice(HDP, 23.8), StockPrice(HDP, 26.6)
(2) StockPrice(HDP, 23.8)
(3) StockPrice(SPX, 2113.9), StockPrice(FTSE, 6931.7), StockPrice(HDP, 26.6)
(4) StockPrice(SPX, 2113.9), StockPrice(FTSE, 6931.7), StockPrice(HDP, 25.2)
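The three aggregates on this slide can be recomputed by hand from the four records of the example window. The following plain-Java sketch does that outside of Flink; the class and method names are hypothetical, and the window contents are copied from the slide.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WindowAggregationSketch {
    record StockPrice(String symbol, double price) {}

    // Contents of one window, as listed on the slide.
    static final List<StockPrice> WINDOW = List.of(
        new StockPrice("SPX", 2113.9), new StockPrice("FTSE", 6931.7),
        new StockPrice("HDP", 23.8), new StockPrice("HDP", 26.6));

    // Record with the smallest price in the window (minBy("price")).
    static StockPrice lowest(List<StockPrice> window) {
        StockPrice min = window.get(0);
        for (StockPrice p : window) {
            if (p.price() < min.price()) {
                min = p;
            }
        }
        return min;
    }

    // Highest price per symbol (groupBy("symbol").maxBy("price")).
    static Map<String, Double> maxBySymbol(List<StockPrice> window) {
        Map<String, Double> max = new TreeMap<>();
        for (StockPrice p : window) {
            max.merge(p.symbol(), p.price(), Math::max);
        }
        return max;
    }

    // Mean price per symbol (groupBy("symbol") followed by a mean over the window).
    static Map<String, Double> meanBySymbol(List<StockPrice> window) {
        Map<String, Double> sum = new TreeMap<>();
        Map<String, Integer> count = new TreeMap<>();
        for (StockPrice p : window) {
            sum.merge(p.symbol(), p.price(), Double::sum);
            count.merge(p.symbol(), 1, Integer::sum);
        }
        Map<String, Double> mean = new TreeMap<>();
        for (Map.Entry<String, Double> e : sum.entrySet()) {
            mean.put(e.getKey(), e.getValue() / count.get(e.getKey()));
        }
        return mean;
    }

    public static void main(String[] args) {
        System.out.println(lowest(WINDOW));
        System.out.println(maxBySymbol(WINDOW));
        System.out.println(meanBySymbol(WINDOW));
    }
}
```

For HDP the mean of 23.8 and 26.6 is 25.2 (up to floating-point rounding), which matches the last output box on the slide.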
Data Safety and Availability
• Ensure that operators see all events
  – “At least once”
  – Solved by replaying a stream from a checkpoint
  – No good for correct results
• Ensure that operators do not perform duplicate updates to their state
  – “Exactly once”
  – Several solutions
• Ensure the job can survive failure
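The difference between the two guarantees shows up as soon as a stateful operator fails and the stream is replayed. Below is a minimal plain-Java simulation of a counting operator; all names and parameters are hypothetical, and it assumes a single failure with 0 <= checkpointAt <= failAt < total. It models one possible way to avoid duplicate updates (rolling state back to the checkpoint), not Flink's actual recovery code.

```java
class DeliveryGuaranteeSketch {
    // Counts `total` records. The operator fails once when it reaches offset
    // `failAt` and replays the stream from the checkpoint taken at offset
    // `checkpointAt`. With restoreState == false only the stream position is
    // reset, so records between the checkpoint and the failure are counted
    // twice. With restoreState == true the state is rolled back as well.
    static int countWithFailure(int total, int checkpointAt, int failAt, boolean restoreState) {
        int count = 0;
        int checkpointedCount = 0;
        int offset = 0;
        boolean failed = false;
        while (offset < total) {
            if (offset == checkpointAt && !failed) {
                checkpointedCount = count;          // snapshot the state at this offset
            }
            if (offset == failAt && !failed) {
                failed = true;
                offset = checkpointAt;              // replay the stream from the checkpoint
                if (restoreState) {
                    count = checkpointedCount;      // undo updates made after the checkpoint
                }
                continue;
            }
            count++;                                // state update for one record
            offset++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithFailure(10, 4, 7, false)); // 13: three records counted twice
        System.out.println(countWithFailure(10, 4, 7, true));  // 10: every record counted once
    }
}
```

Replay alone guarantees that no record is missed, but the count of 13 for 10 records is why the slide calls it no good for correct results.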
Lessons Learned from Batch
• If a batch computation fails, simply repeat computation as a transaction
• Transaction rate is constant
• Can we apply these principles to a true streaming execution?
[Diagram: batch-1, batch-2]
Taking Snapshots – the naïve way
Initial approach (e.g., Naiad)
• Pause execution on t1, t2, ...
• Collect state
• Restore execution
[Diagram: execution timeline with snapshots at t1 and t2]
Asynchronous Snapshots in Flink
[Diagram: timeline with t1 and t2; snapshotting produces snap - t1 and snap - t2]
Propagating markers/barriers
Full or incremental
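The idea of propagating markers can be illustrated with a single operator that reads one input channel. In the plain-Java sketch below, barriers travel inside the stream; when the operator sees one, it copies its current state and forwards the barrier, and records on either side are processed without a global pause. All names are hypothetical, snapshots are full rather than incremental, and barrier alignment across several input channels is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BarrierSnapshotSketch {
    // A stream element is either a record (value = payload)
    // or a barrier (value = checkpoint id).
    record Element(boolean isBarrier, long value) {
        static Element rec(long payload) { return new Element(false, payload); }
        static Element barrier(long checkpointId) { return new Element(true, checkpointId); }
    }

    // Running-sum operator. Returns checkpoint id -> state captured at that
    // barrier and appends everything it emits to `output`.
    static Map<Long, Long> run(List<Element> input, List<Element> output) {
        Map<Long, Long> snapshots = new LinkedHashMap<>();
        long sum = 0;
        for (Element e : input) {
            if (e.isBarrier()) {
                snapshots.put(e.value(), sum);  // state reflects exactly the records before the barrier
                output.add(e);                  // propagate the barrier downstream
            } else {
                sum += e.value();
                output.add(Element.rec(sum));
            }
        }
        return snapshots;
    }

    public static void main(String[] args) {
        List<Element> input = List.of(
            Element.rec(1), Element.rec(2), Element.barrier(1),
            Element.rec(3), Element.barrier(2), Element.rec(4));
        List<Element> output = new ArrayList<>();
        System.out.println(run(input, output)); // {1=3, 2=6}
    }
}
```

Because each snapshot is tied to a position in the stream, a restart can restore the state for a checkpoint and replay only the records that followed its barrier.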
Conclusion
Apache Flink!
The case for Flink as a stream processor
• Ideal basis for polystore computations
• Full-featured big data streaming engine
Thank You
Contact: Tilmann Rabl, rabl@tu-berlin.de