real time recommendations using spark streaming · filter consume filter impressions consume plays...
TRANSCRIPT
![Page 1: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/1.jpg)
Real Time Recommendations using Spark Streaming
Elliot Chow
![Page 2: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/2.jpg)
Why?
- React more quickly to changes in interest- Time-of-day effects- Real-world events
![Page 3: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/3.jpg)
![Page 4: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/4.jpg)
Data Systems
Stream Processing
Recommendation Systems
UI
Feedback Loop
![Page 5: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/5.jpg)
Trends Data
- What people browse: impressions- What people watch: plays
![Page 6: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/6.jpg)
Trends Data - Impressions
Appearance of a video in the viewport
![Page 7: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/7.jpg)
Trends Data - Plays
Member plays a video
![Page 8: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/8.jpg)
Why Spark Streaming?
- Existing Spark infrastructure- Experience with Spark- Batch and Streaming
![Page 9: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/9.jpg)
Components
![Page 10: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/10.jpg)
Join CassandraAggregate S3Transform
Filter
FilterConsume Impressions
Consume Plays
Design
![Page 11: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/11.jpg)
Join CassandraAggregateS3
Transform
Filter
FilterConsume Impressions
Consume Plays
Design
![Page 12: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/12.jpg)
Join Key
“Request Id” - a unique identifier of the source of a play or impression
![Page 13: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/13.jpg)
Design
JoinCassandraAggregate S3Transform
Filter
FilterConsume Impressions
Consume Plays
![Page 14: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/14.jpg)
Output
Video Epoch Plays Impressions
Stranger Things 1 (00:00-00:30) 4 5
Stranger Things 1 (00:00-00:30) 3 6
House Of Cards 2 (00:30-01:00) 8 10
Marseille 2 (00:30-01:00) 3 3
![Page 15: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/15.jpg)
- Instead of raw counts, output sets of request ids- Count = cardinality of the set of request ids
- Idempotent counting
Output
![Page 16: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/16.jpg)
Join CassandraAggregateS3Transform
Filter
FilterConsume Impressions
Consume Plays
Design
![Page 17: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/17.jpg)
Programming with Spark Streaming
![Page 18: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/18.jpg)
Streaming Joins
![Page 19: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/19.jpg)
Streaming Joins - Time
- Time to browse and select a video- Batched logging from client application- Delays in data sources
![Page 20: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/20.jpg)
- Window both plays and impressions by epoch duration- Join the two windows together- Slide by epoch duration
Streaming Joins - Attempt I
t
Plays
Impressions
![Page 21: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/21.jpg)
Streaming Joins - Attempt I
- Easy to implement- Tight coupling with processing time- Does not mesh well with absolute time windows- Failure can mean loss of all data for the entire window
Epoch 1
Window Start00:15
Epoch 2
Window End00:45
00:00 00:30 01:00
![Page 22: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/22.jpg)
Streaming Joins - Attempt II
- Join using mapWithState- Join key is the mapWithState key - State is the plays and impressions sharing the same join key- Use timeouts to expire unjoined data
![Page 23: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/23.jpg)
Streaming Joins - Attempt II
R1, I1
Plays & Impressions
MapWithStateRDD
![Page 24: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/24.jpg)
Streaming Joins - Attempt II
R1 => { I1 }R1, I1
Plays & Impressions
MapWithStateRDD
![Page 25: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/25.jpg)
Streaming Joins - Attempt II
R1 => { I1 }R2, I8
Plays & Impressions
MapWithStateRDD
![Page 26: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/26.jpg)
Streaming Joins - Attempt II
R1 => { I1 }R2 => { I8 }R2, I8
Plays & Impressions
MapWithStateRDD
![Page 27: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/27.jpg)
Streaming Joins - Attempt II
R1 => { I1 }R2 => { I8 }R1, P1
Plays & Impressions
MapWithStateRDD
![Page 28: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/28.jpg)
Streaming Joins - Attempt II
R1 => { I1, P1 }R2 => { I8 }
R1, P1R1, I1R1, P1
Plays & Impressions
MapWithStateRDD
![Page 29: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/29.jpg)
Streaming Joins - Attempt II
R1 => { I1, P1 }R2 => { I8 }R3, I5
Plays & Impressions
MapWithStateRDD
![Page 30: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/30.jpg)
Streaming Joins - Attempt II
R1 => { I1, P1 }R2 => { I8 }R3, I5R3 => { I5 }
Plays & Impressions
MapWithStateRDD
![Page 31: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/31.jpg)
Streaming Joins - Attempt II
R1 => { I1, P1 }R1, I6
R3 => { I5 }
Plays & Impressions
MapWithStateRDD
![Page 32: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/32.jpg)
Streaming Joins - Attempt II
R1 => { I1, P1, I6 }R1, I6
R3 => { I5 }
R1, I6
Plays & Impressions
MapWithStateRDD
![Page 33: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/33.jpg)
Streaming Joins - Attempt II
R1 => { I1, P1, I6 }...
R3 => { I5 }
Plays & Impressions
MapWithStateRDD
![Page 34: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/34.jpg)
Streaming Joins - Attempt II
- Make progress every batch- Too much “uninteresting” data- High memory usage- Large checkpoints
![Page 35: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/35.jpg)
Streaming Joins - An Observation
t
Plays
Impressions
![Page 36: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/36.jpg)
Streaming Joins - An Observation
t
Plays
Impressions
![Page 37: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/37.jpg)
Streaming Joins - An Observation
t
Plays
Impressions
Join incoming batch of plays to windowed impressions, and vice versa
![Page 38: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/38.jpg)
Streaming Joins - An Observation
t
Plays
Impressions
Slide by batch interval...
![Page 39: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/39.jpg)
Streaming Joins - An Observation
t
Plays
Impressions
Slide by batch interval again...
![Page 40: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/40.jpg)
Streaming Joins - Attempt III
- Counts are updated every batch- Uses Spark’s windowing - No checkpoints
![Page 41: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/41.jpg)
mapWithState
![Page 42: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/42.jpg)
mapWithState
- Can be used for more than sessionization
![Page 43: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/43.jpg)
mapWithState
- Can be used for more than sessionization- Be aware of cache evictions
- Lots of state may need to be recomputed
![Page 44: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/44.jpg)
mapWithState
val input: DStream[(VideoId, RequestId)] = // ...
val spec: StateSpec[VideoId, RequestId, Set[RequestId], (VideoId, Set[RequestId])] = // ...
val output: DStream[(VideoId, Set[RequestId])] = { input. mapWithState(spec)}
![Page 45: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/45.jpg)
mapWithStateval input: DStream[(VideoId, RequestId)] = // ...
val spec: StateSpec[VideoId, RequestId, Set[RequestId], (VideoId, Set[RequestId])] = // ...
val output: DStream[(VideoId, Set[RequestId])] = { input. mapWithState(spec). groupByKey. mapValues(_.maxBy(_.size))}
![Page 46: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/46.jpg)
mapWithStateval input: DStream[(VideoId, RequestId)] = // ...
val spec: StateSpec[VideoId, Iterable[RequestId], Set[RequestId], (VideoId, Set[RequestId])] = // ...
val output: DStream[(VideoId, Set[RequestId])] = { input. groupByKey. mapWithState(spec)}
![Page 47: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/47.jpg)
mapWithStateval input: DStream[(VideoId, RequestId)] = // ...
val spec: StateSpec[VideoId, RequestId, Set[RequestId], Unit] = // ...
val output: DStream[(VideoId, Set[RequestId])] = { input. mapWithState(spec). stateSnapshots}
![Page 48: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/48.jpg)
Productionizing Spark Streaming
![Page 49: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/49.jpg)
Metrics
- Monitoring system health- Aid in diagnosis of issues- Needs to be performant and accurate
![Page 50: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/50.jpg)
Metrics - Option I
- Use “traditional” stream processing metrics - Events/second, bytes/second, …
- Batching can make numbers hard to interpret- Susceptible to recomputation
![Page 51: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/51.jpg)
Metrics - Option II
- Spark Accumulators- Used internally by Spark- Susceptible to recomputation- Unclear when to report the metric
- Can make use of SparkListener & StreamingListener
![Page 52: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/52.jpg)
Metrics - Option III
- Explicit counts on RDDs- Counts will be accurate- Additional latency- Use caching to prevent duplicate work*
![Page 53: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/53.jpg)
Metrics
- Processing time < Batch interval- Time the different parts of the job
- Spark is lazy - may require forcing evaluation- Use Spark UI metrics
![Page 54: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/54.jpg)
Error Handling
- What exceptions cause the streaming job to crash?
![Page 55: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/55.jpg)
Error Handling
- What exceptions cause the streaming job to crash?- Most seem to be caught to keep the job running
- Exception handling is application-specific- Stop-gap: track the elapsed time since the batch started
![Page 56: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/56.jpg)
Future Work
![Page 57: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/57.jpg)
Future Work
- Red/Black deployment with zero data-loss
![Page 58: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/58.jpg)
Future Work
- Red/Black deployment with zero data-loss- Auto-scaling
![Page 59: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/59.jpg)
Future Work
- Red/Black deployment with zero data-loss- Auto-scaling- Improved back pressure per topic
![Page 60: Real Time Recommendations using Spark Streaming · Filter Consume Filter Impressions Consume Plays Design. Join Cassandra Aggregate Transform S3 Filter Filter ... - Idempotent counting](https://reader034.vdocument.in/reader034/viewer/2022050323/5f7c4b4f90b3dd25e87611e6/html5/thumbnails/60.jpg)
Future Work
- Red/Black deployment with zero data-loss- Auto-scaling- Improved back pressure per topic- Updating broadcast variables