Stateful distributed stream processing

Gyula Fóra (@GyulaFora)

Upload: gyula-fora

Post on 09-Feb-2017





Page 1: Stateful Distributed Stream Processing

Stateful distributed stream processing

Gyula Fóra


Page 2: Stateful Distributed Stream Processing

This talk

§ Stateful processing by example

§ Definition and challenges

§ State in current open-source systems

§ State in Apache Flink


Apache Flink Meetup @ MapR, 2015-08-27

Page 3: Stateful Distributed Stream Processing

Stateful processing by example

§ Window aggregations
  • Total number of customers in the last 10 minutes
  • State: Current aggregate
§ Machine learning
  • Fitting trends to the evolving stream
  • State: Model
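As an illustration of the window aggregation case, here is a minimal Java sketch of a sliding count. The class name SlidingCountSketch is hypothetical, and the sketch assumes events arrive in timestamp order. The state is the buffer of timestamps still inside the window, and the current aggregate is its size.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding-window counter; assumes in-order timestamps. */
final class SlidingCountSketch {
    private final long windowMillis;
    // State: timestamps of the events that are still inside the window.
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingCountSketch(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Records one event at time t and returns the count in (t - window, t]. */
    int onEvent(long timestampMillis) {
        timestamps.addLast(timestampMillis);
        // Evict everything that has fallen out of the window.
        while (!timestamps.isEmpty()
                && timestamps.peekFirst() <= timestampMillis - windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }

    public static void main(String[] args) {
        SlidingCountSketch customers = new SlidingCountSketch(10 * 60 * 1000L);
        customers.onEvent(0L);
        customers.onEvent(5 * 60 * 1000L);
        // The event at t = 0 is exactly 10 minutes old here and is evicted.
        System.out.println(customers.onEvent(10 * 60 * 1000L)); // prints 2
    }
}
```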


Page 4: Stateful Distributed Stream Processing

Stateful processing by example

§ Pattern recognition
  • Detect suspicious financial activity
  • State: Matched prefix
§ Stream-stream joins
  • Match ad views and impressions
  • State: Elements in the window


Page 5: Stateful Distributed Stream Processing

Stateful operators

§ All these examples use a common processing pattern

§ Stateful operator (in essence): $f\colon (\mathit{in}, \mathit{state}) \longrightarrow (\mathit{out}, \mathit{state}')$

§ State hangs around and can be read and modified as the stream evolves

§ Goal: Get as close as possible to this general model while maintaining scalability and fault tolerance
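The stateful operator pattern can be sketched in plain Java as a function from (in, state) to (out, new state), plus a loop that threads the state through the stream. The names Result, StatefulFn and StatefulOperatorSketch are hypothetical, chosen for illustration only; they do not belong to any of the systems discussed in this talk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical result pair: the emitted output and the updated state. */
final class Result<O, S> {
    final O out;
    final S state;
    Result(O out, S state) { this.out = out; this.state = state; }
}

/** Hypothetical interface for f: (in, state) -> (out, state'). */
interface StatefulFn<I, S, O> {
    Result<O, S> apply(I in, S state);
}

final class StatefulOperatorSketch {
    /** Runs f over the inputs in order, carrying the state along. */
    static <I, S, O> List<O> run(List<I> inputs, S initial, StatefulFn<I, S, O> f) {
        List<O> outputs = new ArrayList<>();
        S state = initial;
        for (I in : inputs) {
            Result<O, S> r = f.apply(in, state);
            outputs.add(r.out);
            state = r.state; // the state hangs around for the next element
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Running sum: the state is the sum so far, the output is its new value.
        StatefulFn<Integer, Integer, Integer> runningSum =
            (in, sum) -> new Result<Integer, Integer>(sum + in, sum + in);
        System.out.println(run(Arrays.asList(1, 2, 3), 0, runningSum)); // prints [1, 3, 6]
    }
}
```

A single-threaded loop like this is the unrestricted form of the pattern; the systems that follow differ in how they narrow the scope of the state so that the loop can be distributed and recovered.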


Page 6: Stateful Distributed Stream Processing

State-of-the-art systems

§ Most systems allow developers to implement stateful programs
§ Trick is to limit the scope of $f$ (state access) while maintaining expressivity
§ Issues to tackle:
  • Expressivity
  • Exactly-once semantics
  • Scalability to large inputs
  • Scalability to large states


Page 7: Stateful Distributed Stream Processing

Storm (Trident)

§ States available only in Trident API
§ Dedicated operators for state updates and queries
§ State access methods
  • stateQuery(…)
  • partitionPersist(…)
  • persistentAggregate(…)
§ It's very difficult to implement transactional states

Exactly-once guarantee


Page 8: Stateful Distributed Stream Processing

Storm Word Count


Page 9: Stateful Distributed Stream Processing

Spark Streaming

§ Stateless runtime by design
  • No continuous operators
  • UDFs are assumed to be stateless
§ State can be generated as a stream of RDDs: updateStateByKey(…)
  $f\colon (\mathrm{Seq}[\mathit{in}_k], \mathit{state}_k) \longrightarrow \mathit{state}'_k$
§ $f$ is scoped to a specific key
§ Exactly-once semantics
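The key-scoped update can be illustrated without Spark. The sketch below is plain Java, not the Spark API: the class KeyedBatchUpdateSketch and its methods are hypothetical, and a null state stands in for an absent Option. Each batch produces a new, unmodifiable state map instead of mutating the previous one, and the update function is called for every key that has state or new values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

final class KeyedBatchUpdateSketch {
    /** Applies f: (values of key k in this batch, state_k) -> state'_k per key. */
    static <K, V, S> Map<K, S> updateStateByKey(
            Map<K, S> previous,
            Map<K, List<V>> batch,
            BiFunction<List<V>, S, S> f) {
        Set<K> keys = new HashSet<>(previous.keySet());
        keys.addAll(batch.keySet());
        Map<K, S> next = new HashMap<>();
        for (K key : keys) {
            // Keys without new data see an empty list of values.
            List<V> values = batch.getOrDefault(key, Collections.<V>emptyList());
            next.put(key, f.apply(values, previous.get(key)));
        }
        return Collections.unmodifiableMap(next); // the old map is left untouched
    }

    /** Word-count update: new values plus the previous count, if any. */
    static Integer addCounts(List<Integer> values, Integer state) {
        int sum = (state == null) ? 0 : state;
        for (int v : values) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> s0 = Collections.emptyMap();
        Map<String, List<Integer>> batch1 = new HashMap<>();
        batch1.put("a", Arrays.asList(1, 1));
        batch1.put("b", Arrays.asList(1));
        Map<String, List<Integer>> batch2 = new HashMap<>();
        batch2.put("a", Arrays.asList(1));

        Map<String, Integer> s1 = updateStateByKey(s0, batch1, KeyedBatchUpdateSketch::addCounts);
        Map<String, Integer> s2 = updateStateByKey(s1, batch2, KeyedBatchUpdateSketch::addCounts);
        System.out.println(s2.get("a") + " " + s2.get("b")); // prints "3 1"
    }
}
```

Rebuilding the whole map on every batch is what makes small updates to a large state expensive in this model.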


Page 10: Stateful Distributed Stream Processing

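Spark Streaming Word Count

// Per-key update: add this batch's counts to the previous total
val updateFunc = (values: Seq[Int], state: Option[Int]) => {
  val currentCount = values.sum
  val previousCount = state.getOrElse(0)
  Some(currentCount + previousCount)
}

// Iterator-based wrapper required by this updateStateByKey overload.
// Not shown on the slide; assumed here to simply delegate to updateFunc.
val newUpdateFunc = (iterator: Iterator[(String, Seq[Int], Option[Int])]) =>
  iterator.flatMap(t => updateFunc(t._2, t._3).map(s => (t._1, s)))

// wordDstream, ssc and initialRDD are defined elsewhere in the program
val stateDstream = wordDstream.updateStateByKey[Int](
  newUpdateFunc,
  new HashPartitioner(ssc.sparkContext.defaultParallelism),
  true,
  initialRDD)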


Page 11: Stateful Distributed Stream Processing

Samza

§ Stateful dataflow operators (any task can hold state)
§ State changes are stored as a log by Kafka
§ Custom storage engines can be plugged in to the log
§ $f$ is scoped to a specific task
§ At-least-once processing
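The idea of storing state changes as a log can be sketched in a few lines of Java. This is an illustration only, not the Samza API: ChangelogStoreSketch and Change are hypothetical names, and an in-memory list stands in for the durable log. Every write is appended to the log, and a restarted task rebuilds its store by replaying the log in order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical key-value store that records every write in a changelog. */
final class ChangelogStoreSketch {
    /** One changelog entry: the key and the value written for it. */
    static final class Change {
        final String key;
        final int value;
        Change(String key, int value) { this.key = key; this.value = value; }
    }

    private final Map<String, Integer> store = new HashMap<>();
    private final List<Change> changelog; // stands in for a durable log

    ChangelogStoreSketch(List<Change> changelog) {
        this.changelog = changelog;
        // Recovery: replay the log in order; later writes overwrite earlier ones.
        for (Change c : changelog) {
            store.put(c.key, c.value);
        }
    }

    Integer get(String key) { return store.get(key); }

    void put(String key, int value) {
        changelog.add(new Change(key, value)); // record the change
        store.put(key, value);                 // then apply it locally
    }

    /** Word-count style update, one call per incoming message. */
    void increment(String word) {
        Integer count = get(word);
        put(word, count == null ? 1 : count + 1);
    }

    public static void main(String[] args) {
        List<Change> log = new ArrayList<>();
        ChangelogStoreSketch task = new ChangelogStoreSketch(log);
        task.increment("a");
        task.increment("b");
        task.increment("a");

        // Simulated restart: a fresh task rebuilds its state from the log.
        ChangelogStoreSketch recovered = new ChangelogStoreSketch(log);
        System.out.println(recovered.get("a")); // prints 2
    }
}
```

Nothing in this sketch ties the log to a position in the input, so a message that is delivered again after a restart would be counted twice. That is the at-least-once behaviour listed above.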



Page 12: Stateful Distributed Stream Processing

Samza Word Count

public class WordCounter implements StreamTask, InitableTask {

    // Some omitted details…

    private KeyValueStore<String, Integer> store;

    public void process(IncomingMessageEnvelope envelope,
                        MessageCollector collector,
                        TaskCoordinator coordinator) {

        // Get the current count
        String word = (String) envelope.getKey();
        Integer count = store.get(word);
        if (count == null) count = 0;

        // Increment, store and send
        count += 1;
        store.put(word, count);
        collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, word, count));
    }
}

Page 13: Stateful Distributed Stream Processing

What can we say so far?

§ Trident
  + Consistent state accessible from outside
  – Only works well with idempotent states
  – States are not part of the operators
§ Spark
  + Integrates well with the system guarantees
  – Limited expressivity
  – Immutability increases update complexity
§ Samza
  + Efficient log based state updates
  + States are well integrated with the operators
  – Lack of exactly-once semantics
  – State access is not fully transparent


Page 14: Stateful Distributed Stream Processing

Flink

§ Take what's good, make it work + add some more
§ Clean and powerful abstractions
  • Local (Task) state
  • Partitioned (Key) state
§ Proper API integration
  • Java: OperatorState interface
  • Scala: mapWithState, flatMapWithState…
§ Exactly-once semantics by checkpointing


Page 15: Stateful Distributed Stream Processing

Flink Word Count

words.keyBy(x => x).mapWithState {
  (word, count: Option[Int]) => {
    val newCount = count.getOrElse(0) + 1
    val output = (word, newCount)
    (output, Some(newCount))
  }
}



Page 16: Stateful Distributed Stream Processing

Local State

§ Task scoped state access
§ Can be used to implement custom access patterns
§ Typical usage:
  • Source operators (offset)
  • Machine learning models
  • Use cyclic flows to simulate global state access


Page 17: Stateful Distributed Stream Processing

Local State Example (Java)

public class MySource extends RichParallelSourceFunction {
    // Omitted details
    private OperatorState<Long> offset;

    @Override
    public void run(SourceContext ctx) {
        Object checkpointLock = ctx.getCheckpointLock();
        isRunning = true;
        while (isRunning) {
            // Update the offset and emit under the checkpoint lock
            synchronized (checkpointLock) {
                offset.update(offset.value() + 1);
                // ctx.collect(next);
            }
        }
    }
}




Page 18: Stateful Distributed Stream Processing

Partitioned State

§ Key scoped state access
§ Highly scalable
§ Allows for incremental backup/restore
§ Typical usage:
  • Any per-key operation
  • Grouped aggregations
  • Window buffers
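A rough Java sketch of why key-scoped state scales: each key is owned by exactly one partition, so an update touches only that partition's local map. PartitionedStateSketch is a hypothetical class written for illustration; it is not Flink code and says nothing about how Flink assigns keys internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PartitionedStateSketch {
    // One local state map per parallel instance.
    private final List<Map<String, Long>> partitions = new ArrayList<>();

    PartitionedStateSketch(int parallelism) {
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new HashMap<String, Long>());
        }
    }

    /** Picks the partition that owns a key; all records of that key land there. */
    int partitionOf(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Counts one occurrence of the key, touching only that key's state. */
    long count(String key) {
        Map<String, Long> local = partitions.get(partitionOf(key));
        long updated = local.getOrDefault(key, 0L) + 1;
        local.put(key, updated);
        return updated;
    }

    /** Number of keys currently held by one partition. */
    int keysIn(int partition) {
        return partitions.get(partition).size();
    }

    public static void main(String[] args) {
        PartitionedStateSketch counts = new PartitionedStateSketch(4);
        counts.count("a");
        counts.count("b");
        System.out.println(counts.count("a")); // prints 2
    }
}
```

Because no two partitions share a key, each one can in principle be backed up and restored on its own.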


Page 19: Stateful Distributed Stream Processing

Partitioned State Example (Scala)

// Compute the current average of each city's temperature
temps.keyBy("city").mapWithState {
  (in: Temp, state: Option[(Double, Long)]) => {
    // State per city: (sum of temperatures, number of readings)
    val current = state.getOrElse((0.0, 0L))
    val updated = (current._1 + in.temp, current._2 + 1)
    val avg = Temp(in.city, updated._1 / updated._2)
    (avg, Some(updated))
  }
}

case class Temp(city: String, temp: Double)


Page 20: Stateful Distributed Stream Processing

Exactly-once semantics

§ Based on consistent global snapshots
§ Algorithm designed for stateful dataflows
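To see why snapshots give exactly-once state updates, consider a deliberately simplified, single-operator Java sketch. It is not Flink's distributed snapshot algorithm, and CheckpointSketch and Snapshot are hypothetical names. The point it illustrates is that the snapshot captures the state together with the input position, so after a failure both are rolled back and the input is replayed from that position.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CheckpointSketch {
    /** Hypothetical snapshot: input position plus a copy of the state at that position. */
    static final class Snapshot {
        final int offset;
        final Map<String, Integer> counts;
        Snapshot(int offset, Map<String, Integer> counts) {
            this.offset = offset;
            this.counts = new HashMap<>(counts); // defensive copy
        }
    }

    private Map<String, Integer> counts = new HashMap<>();
    private int offset = 0; // index of the next input record to process

    /** Processes records from the current offset up to (excluding) 'until'. */
    void processUntil(List<String> input, int until) {
        while (offset < until) {
            counts.merge(input.get(offset), 1, Integer::sum);
            offset++;
        }
    }

    Snapshot checkpoint() { return new Snapshot(offset, counts); }

    /** Rolls back both the state and the input position to the snapshot. */
    void restore(Snapshot s) {
        counts = new HashMap<>(s.counts);
        offset = s.offset;
    }

    int countOf(String word) { return counts.getOrDefault(word, 0); }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "a", "a");
        CheckpointSketch op = new CheckpointSketch();
        op.processUntil(input, 2);
        Snapshot s = op.checkpoint();
        op.processUntil(input, 3);            // work done after the checkpoint...
        op.restore(s);                        // ...is discarded by a simulated failure
        op.processUntil(input, input.size()); // replay from the snapshot offset
        System.out.println(op.countOf("a"));  // prints 3
    }
}
```

Every record contributes to the final count once, even though the third record was processed twice.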


Detailed mechanism

Page 21: Stateful Distributed Stream Processing

Exactly-once semantics

§ Low runtime overhead
§ Checkpointing logic is separated from application logic


Blogpost on streaming fault-tolerance

Page 22: Stateful Distributed Stream Processing


§ State is essential to many applications
§ Fault-tolerant streaming state is a hard problem
§ There is a trade-off between expressivity and scalability/fault tolerance
§ Flink tries to hit the sweet spot by…
  • Providing very flexible abstractions
  • Keeping good scalability and exactly-once semantics



Page 23: Stateful Distributed Stream Processing

Thank you!