qcon london 2018 foundations of streaming sql · reconciling streams & tables w/ the beam model...
TRANSCRIPT
![Page 1: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/1.jpg)
1
Foundations of streaming SQLor: how I learned to love stream & table theory
Slides: https://s.apache.org/streaming-sql-qcon-londonTyler AkidauApache Beam PMCSoftware Engineer at Google@takidau
Covering ideas from across the Apache Beam, Apache Calcite, Apache Kafka, and Apache Flink communities, with thoughts and contributions from Julian Hyde, Fabian Hueske, Shaoxuan Wang, Kenn Knowles, Ben Chambers, Reuven Lax, Mingmin Xu, James Xu, Martin Kleppmann, Jay Kreps and many more, not to mention that whole database community thing...
QCon London 2018
![Page 2: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/2.jpg)
2
Table of Contents
01
02
Stream & Table TheoryA BasicsB The Beam Model
Streaming SQLA Time-varying relationsB SQL language extensions
Chapter 7
Chapter 9
![Page 3: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/3.jpg)
3
01 Stream & Table TheoryTFW you realize everything you do was invented by the database community decades ago...
A BasicsB The Beam Model
![Page 4: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/4.jpg)
4
Stream & table basics
https://www.confluent.io/blog/making-sense-of-stream-processing/ https://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple/
![Page 5: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/5.jpg)
5
Special theory of stream & table relativity
streams → tables:
tables → streams:
The aggregation of a stream of updates over time yields a table.
The observation of changes to a table over time yields a stream.
![Page 6: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/6.jpg)
6
Non-relativistic stream & table definitions
Tables are data at rest.
Streams are data in motion.
![Page 7: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/7.jpg)
7
01 Stream & Table TheoryTFW you realize everything you do was invented by the database community decades ago...
A BasicsB The Beam Model
![Page 8: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/8.jpg)
8
The Beam Model
What results are calculated?
Where in event time are results calculated?
When in processing time are results materialized?
How do refinements of results relate?
![Page 9: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/9.jpg)
9
Reconciling streams & tables w/ the Beam Model
● How does batch processing fit into all of this?
● What is the relationship of streams to bounded and unbounded datasets?
● How do the four what, where, when, how questions map onto a streams/tables world?
![Page 10: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/10.jpg)
10
ReduceMap
MapReduce
input
output
![Page 11: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/11.jpg)
11
MapReduce
input
output
MapRead
Map
MapWrite
ReduceRead
Reduce
ReduceWrite
![Page 12: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/12.jpg)
12
MapReduce
MapRead
Map
MapWrite
ReduceRead
Reduce
ReduceWrite
?
?
?
?
?
? ?
![Page 13: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/13.jpg)
13
MapReduce
MapRead
Map
MapWrite
ReduceRead
Reduce
ReduceWrite
?
?
?
?
?
table
table
![Page 14: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/14.jpg)
14
Map phase
MapRead
Map
MapWrite
table
?
?
?
![Page 15: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/15.jpg)
15
Map phase API
void map(K1 key, V1 value, Emit<K2, V2>);
![Page 16: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/16.jpg)
16
Map phase API
void map(K1 key, V1 value, Emit<K2, V2>);
![Page 17: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/17.jpg)
17
Map phase
MapRead
Map
MapWrite
table
stream
?
?
![Page 18: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/18.jpg)
18
Map phase API
void map(K1 key, V1 value, Emit<K2, V2>);
![Page 19: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/19.jpg)
19
Map phase
MapRead
Map
MapWrite
table
stream
stream
?
![Page 20: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/20.jpg)
20
Map phase API
void map(K1 key, V1 value, Emit<K2, V2>);
void reduce(K2 key, Iterable<V2> value, Emit<V3>);
![Page 21: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/21.jpg)
21
Map phase
MapRead
Map
MapWrite
table
stream
stream
table
![Page 22: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/22.jpg)
22
MapReduce
MapRead
Map
MapWrite
ReduceRead
Reduce
ReduceWrite
table
stream
stream
table
?
?
table
![Page 23: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/23.jpg)
23
Map phase API
void map(K1 key, V1 value, Emit<K2, V2>);
void reduce(K2 key, Iterable<V2> value, Emit<V3>);
![Page 24: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/24.jpg)
24
Map phase API
void map(K1 key, V1 value, Emit<K2, V2>);
void reduce(K2 key, Iterable<V2> value, Emit<V3>);
![Page 25: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/25.jpg)
25
table
MapReduce
MapRead
Map
MapWrite
ReduceRead
Reduce
ReduceWrite
table
stream
stream
table
stream
stream
![Page 26: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/26.jpg)
26
Reconciling streams & tables w/ the Beam Model
● How does batch processing fit into all of this?
● What is the relationship of streams to bounded and unbounded datasets?
● How do the four what, where, when, how questions map onto a streams/tables world?
1. Tables are read into streams.
2. Streams are processed into new streams until a grouping operation is hit.
3. Grouping turns the stream into a table.
4. Repeat steps 1-3 until you run out of operations.
![Page 27: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/27.jpg)
27
Reconciling streams & tables w/ the Beam Model
● How does batch processing fit into all of this?
● What is the relationship of streams to bounded and unbounded datasets?
● How do the four what, where, when, how questions map onto a streams/tables world?
Streams are the in-motion form of data
both bounded and unbounded.
![Page 28: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/28.jpg)
28
Reconciling streams & tables w/ the Beam Model
● How does batch processing fit into all of this?
● What is the relationship of streams to bounded and unbounded datasets?
● How do the four what, where, when, how questions map onto a streams/tables world?
![Page 29: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/29.jpg)
29
The Beam Model
What results are calculated?
Where in event time are results calculated?
When in processing time are results materialized?
How do refinements of results relate?
![Page 30: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/30.jpg)
30
The Beam Model
What results are calculated?
Where in event time are results calculated?
When in processing time are results materialized?
How do refinements of results relate?
![Page 31: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/31.jpg)
31
Example data: individual user scores
![Page 32: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/32.jpg)
32
What is calculated?
PCollection<KV<Team, Score>> input = IO.read(...) .apply(ParDo.of(new ParseFn()); .apply(Sum.integersPerKey());
![Page 33: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/33.jpg)
What is calculated?
![Page 34: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/34.jpg)
34
The Beam Model
What results are calculated?
Where in event time are results calculated?
When in processing time are results materialized?
How do refinements of results relate?
![Page 35: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/35.jpg)
35
Windowing divides data into event-time-based finite chunks.
Often required when doing aggregations over unbounded data.
Fixed Sliding1 2 3
54
Sessions
2
431
Key 2
Key 1
Key 3
Time
1 2 3 4
Where in event time?
![Page 36: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/36.jpg)
36
Where in event time?
PCollection<KV<User, Score>> input = IO.read(...) .apply(ParDo.of(new ParseFn()); .apply(Window.into(FixedWindows.of(Duration.standardMinutes(2))) .apply(Sum.integersPerKey());
![Page 37: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/37.jpg)
Where in event time?
![Page 38: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/38.jpg)
38
The Beam Model
What results are calculated?
Where in event time are results calculated?
When in processing time are results materialized?
How do refinements of results relate?
![Page 39: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/39.jpg)
39
• Triggers control when results are emitted.
• Triggers are often relative to the watermark.
Proc
essi
ng T
ime
Event Time
~Watermark
Ideal
Skew
When in processing time?
![Page 40: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/40.jpg)
40
When in processing time?
PCollection<KV<User, Score>> input = IO.read(...) .apply(ParDo.of(new ParseFn()); .apply(Window.into(FixedWindows.of(Duration.standardMinutes(2)) .triggering(AtWatermark()) .apply(Sum.integersPerKey());
![Page 41: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/41.jpg)
When in processing time?
![Page 42: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/42.jpg)
42
The Beam Model: asking the right questions
What results are calculated?
Where in event time are results calculated?
When in processing time are results materialized?
How do refinements of results relate?
![Page 43: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/43.jpg)
43
How do refinements relate?
PCollection<KV<User, Score>> input = IO.read(...) .apply(ParDo.of(new ParseFn()); .apply(Window.into(FixedWindows.of(Duration.standardMinutes(2)) .triggering(AtWatermark().withLateFirings(AtCount(1))) .accumulatingFiredPanes()) .apply(Sum.integersPerKey());
![Page 44: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/44.jpg)
How do refinements relate?
![Page 45: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/45.jpg)
45
What/Where/When/How Summary
3. Streaming 4. Streaming + Late Data Handling
1. Classic Batch 2. Windowed Batch
![Page 46: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/46.jpg)
46
Reconciling streams & tables w/ the Beam Model
● How does batch processing fit into all of this?
● What is the relationship of streams to bounded and unbounded datasets?
● How do the four what, where, when, how questions map onto a streams/tables world?
![Page 47: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/47.jpg)
47
General theory of stream & table relativityPipelines : tables + streams + operationsTables : data at restStreams : data in motionOperations : (stream | table) → (stream | table) transformations
● stream → stream: Non-grouping (element-wise) operations Leaves stream data in motion, yielding another stream.
● stream → table: Grouping operations Brings stream data to rest, yielding a table. Windowing adds the dimension of time to grouping.
● table → stream: Ungrouping (triggering) operations Puts table data into motion, yielding a stream. Accumulation dictates the nature of the stream (deltas, values, retractions).
● table → table: (none) Impossible to go from rest and back to rest without being put into motion.
![Page 48: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/48.jpg)
48
02 Streaming SQLContorting relational algebra for fun and profit
A Time-varying relationsB SQL language extensions
![Page 49: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/49.jpg)
49
Relational algebra
User Score Time
Julie 7 12:01
Frank 3 12:03
Julie 1 12:03
Julie 4 12:07
Score Time
7 12:01
3 12:03
1 12:03
4 12:07
πScore,Time(UserScores)πUserScoresπ SELECT Score, Time FROM UserScores;-----------------| Score | Time |-----------------| 7 | 12:01 || 3 | 12:03 || 1 | 12:03 || 4 | 12:07 |-----------------
Relational algebra SQLRelation
![Page 50: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/50.jpg)
50
Relations evolve over time
12:07> SELECT * FROMUserScores;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 || Frank | 3 | 12:03 || Julie | 1 | 12:03 || Julie | 4 | 12:07 |-------------------------
12:03> SELECT * FROMUserScores;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 || Frank | 3 | 12:03 || Julie | 1 | 12:03 |-------------------------
12:00> SELECT * FROMUserScores;-------------------------| Name | Score | Time |--------------------------------------------------
12:01> SELECT * FROMUserScores;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |-------------------------
![Page 51: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/51.jpg)
51
Classic SQL vs Streaming SQL
Classic SQL classic relations single point in time:: ::
Streaming SQL time-varying relations every point in time:: ::
![Page 52: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/52.jpg)
52
Classic SQL vs Streaming SQL
Classic SQL classic relations single point in time:: ::
Streaming SQL time-varying relations every point in time:: ::
![Page 53: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/53.jpg)
53
Classic relations
12:07> SELECT * FROM UserScores;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || | | | | | | | | | | | Julie | 1 | 12:03 | | | Julie | 1 | 12:03 | || | | | | | | | | | | | | | | | | Julie | 4 | 12:07 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:07> SELECT * FROMUserScores;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 || Frank | 3 | 12:03 || Julie | 1 | 12:03 || Julie | 4 | 12:07 |-------------------------
12:03> SELECT * FROMUserScores;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 || Frank | 3 | 12:03 || Julie | 1 | 12:03 |-------------------------
12:00> SELECT * FROMUserScores;-------------------------| Name | Score | Time |--------------------------------------------------
12:01> SELECT * FROMUserScores;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |-------------------------
Time-varying relation
![Page 54: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/54.jpg)
54
Closure property of relational algebra remains intact with time-varying relations.
![Page 55: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/55.jpg)
55
Time-varying relations: variations12:07> SELECT * FROM UserScores;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || | | | | | | | | | | | Julie | 1 | 12:03 | | | Julie | 1 | 12:03 | || | | | | | | | | | | | | | | | | Julie | 4 | 12:07 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
![Page 56: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/56.jpg)
56
Time-varying relations: filtering12:07> SELECT * FROM UserScores;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || | | | | | | | | | | | Julie | 1 | 12:03 | | | Julie | 1 | 12:03 | || | | | | | | | | | | | | | | | | Julie | 4 | 12:07 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:07> SELECT * FROM UserScores WHERE Name = “Julie”;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | || | | | | | | | | | | | Julie | 1 | 12:03 | | | Julie | 1 | 12:03 | || | | | | | | | | | | | | | | | | Julie | 4 | 12:07 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
![Page 57: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/57.jpg)
57
Time-varying relations: grouping12:07> SELECT * FROM UserScores;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | | | Julie | 7 | 12:01 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || | | | | | | | | | | | Julie | 1 | 12:03 | | | Julie | 1 | 12:03 | || | | | | | | | | | | | | | | | | Julie | 4 | 12:07 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
![Page 58: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/58.jpg)
58
How does this relate to streams & tables?
![Page 59: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/59.jpg)
59
Time-varying relations: tables
12:07> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 12 | 12:07 || Frank | 3 | 12:03 |-------------------------
12:03> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 8 | 12:03 || Frank | 3 | 12:03 |-------------------------
12:01> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |-------------------------
12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:00> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |--------------------------------------------------
![Page 60: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/60.jpg)
60
Time-varying relations: tables
12:01> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |-------------------------
12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:07> SELECT TABLE Name, SUM(Score), MAX(Time) AS OF SYSTEM TIME ‘12:01’ FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |-------------------------
![Page 61: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/61.jpg)
61
Time-varying relations: tables12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:00> SELECT STREAM Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-------------------------| Name | Score | Time |-------------------------...
12:00
12:00> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |--------------------------------------------------
![Page 62: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/62.jpg)
62
Time-varying relations: streams12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:00> SELECT STREAM Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |...
12:01
12:00> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |--------------------------------------------------
![Page 63: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/63.jpg)
63
Time-varying relations: streams12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:00> SELECT STREAM Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |...
12:01
12:01> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |-------------------------
![Page 64: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/64.jpg)
64
Time-varying relations: streams12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:00> SELECT STREAM Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 || Frank | 3 | 12:03 || Julie | 8 | 12:03 |...
12:03
12:01> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 |-------------------------
![Page 65: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/65.jpg)
65
Time-varying relations: streams12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:00> SELECT STREAM Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 || Frank | 3 | 12:03 || Julie | 8 | 12:03 |...
12:03
12:03> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 8 | 12:03 || Frank | 3 | 12:03 |-------------------------
![Page 66: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/66.jpg)
66
Time-varying relations: streams12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:00> SELECT STREAM Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 || Frank | 3 | 12:03 || Julie | 8 | 12:03 || Julie | 12 | 12:07 |...
12:07
12:03> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 8 | 12:03 || Frank | 3 | 12:03 |-------------------------
![Page 67: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/67.jpg)
67
Time-varying relations: streams12:07> SELECT Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-----------------------------------------------------------------------------------------------------------------| [-inf, 12:01) | [12:01, 12:03) | [12:03, 12:07) | [12:07, now) || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | | | Name | Score | Time | || ------------------------- | ------------------------- | ------------------------- | ------------------------- || | | | | | | Julie | 7 | 12:01 | | | Julie | 8 | 12:03 | | | Julie | 12 | 12:07 | || | | | | | | | | | | | Frank | 3 | 12:03 | | | Frank | 3 | 12:03 | || ------------------------- | ------------------------- | ------------------------- | ------------------------- |-----------------------------------------------------------------------------------------------------------------
12:00> SELECT STREAM Name, SUM(Score), MAX(Time) FROM USER_SCORES GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 7 | 12:01 || Frank | 3 | 12:03 || Julie | 8 | 12:03 || Julie | 12 | 12:07 |...
12:07
12:07> SELECT TABLE Name, SUM(Score), MAX(Time) FROM UserScores GROUP BY Name;-------------------------| Name | Score | Time |-------------------------| Julie | 12 | 12:07 || Frank | 3 | 12:03 |-------------------------
![Page 68: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/68.jpg)
68
How does this relate to streams & tables?
capture a point-in-time snapshot of a time-varying relation.
capture the evolution of a time-varying relation over time.
Tables
Streams
![Page 69: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/69.jpg)
69
02 Streaming SQLContorting relational algebra for fun and profit
A Time-varying relationsB SQL language extensions
![Page 70: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/70.jpg)
70
When do you need SQL extensions for streaming?
As a table: As a stream:
SQL extensions rarely needed. SQL extensions sometimes needed.
How is output consumed?
good defaults = often not needed
![Page 71: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/71.jpg)
71
When do you need SQL extensions for streaming?*
Explicit table / stream selection● SELECT TABLE * from X;● SELECT STREAM * from X;
Timestamps and windowing● Event-time columns● Windowing. E.g.,
SELECT * FROM X GROUP BY SESSION(<COLUMN> INTERVAL '5' MINUTE);
○ Grouping by timestamp○ Complex multi-row transactions
inexpressible in declarative SQL (e.g., session windows)
Sane default table / stream selection● If all inputs are tables, output is a table● If any inputs are streams, output is a stream
Simple triggers● Implicitly defined by characteristics of the sink● Optionally be configured outside of query.● Per-query, e.g.: SELECT * from X EMIT <WHEN>; ● Focused set of use cases:
○ Repeated updates... EMIT AFTER <TIMEDELTA>
○ Completeness... EMIT WHEN WATERMARK PAST <COLUMN>
○ Repeated updates + completeness(e.g., early/on-time/late pattern)... EMIT AFTER <TIMEDELTA> AND WHEN WATERMARK PAST <COLUMN>
* Most of these extensions are theoretical at this point; very few have concrete implementations.
![Page 72: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/72.jpg)
72
Summary
streams ⇄ tables
streams & tables : Beam Model
time-varying relations
SQL language extensions
![Page 73: QCon London 2018 Foundations of streaming SQL · Reconciling streams & tables w/ the Beam Model How does batch processing fit into all of this? What is the relationship of streams](https://reader035.vdocument.in/reader035/viewer/2022070917/5fb737b7ced85a278a79637f/html5/thumbnails/73.jpg)
73
Thank you!
In early release nowstreamingsystems.net
These slides: http://s.apache.org/streaming-sql-big-data-spain
Streaming SQL spec (WIP: Apex, Beam, Calcite, Flink): http://s.apache.org/streaming-sql-specStreaming in Calcite (Julian Hyde): https://calcite.apache.org/docs/stream.htmlStreams, joins & temporal tables (Julian Hyde): http://s.apache.org/streams-joins-and-temporal-tables
Streaming 101: http://oreilly.com/ideas/the-world-beyond-batch-streaming-101Streaming 102: http://oreilly.com/ideas/the-world-beyond-batch-streaming-102
Apache Beam: http://beam.apache.orgApache Calcite: http://calcite.apache.orgApache Flink: http://flink.apache.org
Twitter: @takidau