SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"
TRANSCRIPT
![Page 2: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/2.jpg)
Let’s take a trip back in time. Each application has its
own database for storing information. But we want
that information elsewhere for analytics and
reporting.
![Page 3: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/3.jpg)
We don't want to query the transactional system, so
we create a process to extract data from the source
into a data warehouse or data lake.
![Page 4: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/4.jpg)
We want to unify data from multiple systems, so we
create conformed dimensions and batch processes
to federate our data. This is all batch-driven, so
latency is built in by design.
![Page 5: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/5.jpg)
As well as our data warehouse, we want to use our
transactional data to populate search replicas,
graph databases, NoSQL stores… all introducing
more point-to-point dependencies in our system.
![Page 6: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/6.jpg)
Ultimately we end up with a spaghetti architecture. It
can't scale easily, it's tightly coupled, it's generally
batch-driven and we can't get data when we want it
where we want it.
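To put a number on the spaghetti, here is a minimal sketch that counts the pipelines needed when every downstream store is fed directly from every application database. The system names are hypothetical, and the "every target reads from every source" wiring is a worst-case assumption, not a figure from the talk.

```python
from itertools import product

# Hypothetical systems, named only for illustration.
sources = ["orders_db", "crm_db", "billing_db"]
targets = ["warehouse", "search_replica", "graph_db", "nosql_store"]


def point_to_point_pipelines(sources, targets):
    """Return one (source, target) pair per dedicated extract process."""
    return list(product(sources, targets))


pipelines = point_to_point_pipelines(sources, targets)
print(len(pipelines))

# Adding a single new target requires one new pipeline per source.
more_pipelines = point_to_point_pipelines(sources, targets + ["ml_feature_store"])
print(len(more_pipelines) - len(pipelines))
```

Under this assumption the number of pipelines grows with the product of sources and targets, which is why each new consumer makes the architecture harder to change.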
![Page 7: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/7.jpg)
But…there's hope!
![Page 8: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/8.jpg)
Apache Kafka, a distributed streaming platform,
enables us to decouple the applications that create
data from those that use it. We can create
low-latency streams of data, transformed as necessary.
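The decoupling idea can be sketched with a toy in-memory log. This is not the Kafka client API: the `Topic` class, its methods, and the record fields are all hypothetical, chosen only to show that producers append to a shared log while each consumer group reads at its own pace from its own offset.

```python
class Topic:
    """Toy append-only log; each consumer group tracks its own offset."""

    def __init__(self):
        self._log = []        # records in arrival order
        self._offsets = {}    # consumer group -> next offset to read

    def produce(self, record):
        """Append a record and return its offset in the log."""
        self._log.append(record)
        return len(self._log) - 1

    def consume(self, group, max_records=10):
        """Return unread records for this group and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._log[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


orders = Topic()
orders.produce({"order_id": 1, "amount": 20})
orders.produce({"order_id": 2, "amount": 35})

# The warehouse loader reads first; the producer knows nothing about it.
warehouse_batch = orders.consume("warehouse")

orders.produce({"order_id": 3, "amount": 5})

# A search indexer added later still sees every record from the start.
search_batch = orders.consume("search")
# The warehouse loader only receives what it has not yet read.
warehouse_next = orders.consume("warehouse")
```

The producer never references a consumer, so adding the search indexer required no change on the producing side.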
![Page 9: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/9.jpg)
Kafka concepts
![Page 10: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/10.jpg)
Before
![Page 11: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/11.jpg)
After
![Page 12: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/12.jpg)
But…to use stream processing, we need to be Java
coders…don't we?
![Page 13: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/13.jpg)
Happy days! We can actually build streaming data
pipelines using just our bare hands, configuration
files, and SQL.
![Page 14: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/14.jpg)
![Page 15: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/15.jpg)
![Page 16: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/16.jpg)
![Page 17: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/17.jpg)
![Page 18: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/18.jpg)
![Page 19: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/19.jpg)
![Page 20: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/20.jpg)
A Developer Preview of KSQL:
An Open Source Streaming SQL Engine for Apache Kafka
![Page 21: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/21.jpg)
• Enables stream processing with zero coding required
• The simplest way to process streams of data in real time
• Powered by Kafka: scalable, distributed, battle-tested
• All you need is Kafka: no complex deployments
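For a sense of the hand-written logic that a streaming SQL engine replaces, here is a sketch in plain Python of a tumbling-window count per key, the kind of aggregation a windowed `GROUP BY` query expresses declaratively. It is not how KSQL is implemented; the function name, the event tuples, and the 30-second window are assumptions made for illustration.

```python
from collections import Counter


def tumbling_counts(events, window_ms):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    events: iterable of (timestamp_ms, key) tuples.
    """
    counts = Counter()
    for timestamp_ms, key in events:
        # Align each timestamp to the start of its window.
        window_start = timestamp_ms - timestamp_ms % window_ms
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical page-view events: (timestamp in ms, user id).
events = [
    (1_000, "alice"),
    (12_000, "bob"),
    (29_000, "alice"),
    (31_000, "alice"),
]

result = tumbling_counts(events, window_ms=30_000)
for (window_start, user), count in sorted(result.items()):
    print(window_start, user, count)
```

A real engine also has to handle late or out-of-order events, state storage, and scaling across partitions, which this sketch ignores.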
![Page 22: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/22.jpg)
![Page 23: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/23.jpg)
![Page 24: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/24.jpg)
![Page 25: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/25.jpg)
![Page 26: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/26.jpg)
![Page 27: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/27.jpg)
![Page 28: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/28.jpg)
![Page 29: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/29.jpg)
![Page 30: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/30.jpg)
![Page 31: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/31.jpg)
![Page 32: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/32.jpg)
![Page 33: SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"](https://reader031.vdocument.in/reader031/viewer/2022022415/5a64c4187f8b9a735d8b48c3/html5/thumbnails/33.jpg)