![Page 1: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/1.jpg)
Let’s introduce Amazon Kinesis
Inaugural meetup of the Amazon Kinesis - London User Group
![Page 2: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/2.jpg)
This evening
• Introducing Amazon Kinesis, Ian Meyers, AWS
• Pizza and drinks break
• Kinesis and Snowplow, Alex Dean, Snowplow Analytics
• Drinks
• All courtesy of our hosts:
![Page 3: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/3.jpg)
Introducing Amazon Kinesis
![Page 4: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/4.jpg)
Snowplow and Kinesis
1. Snowplow – who we are
2. Why are we excited about Kinesis?
3. Adding Kinesis support to Snowplow
4. Live demo!
5. Questions
![Page 5: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/5.jpg)
Snowplow – who we are
![Page 6: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/6.jpg)
Today, Snowplow is primarily an open source web analytics platform
Website / webapp → Snowplow data pipeline (Collect → Transform and enrich) → Amazon S3, Amazon Redshift / PostgreSQL
• Your granular, event-level and customer-level data, in your own data warehouse
• Connect any analytics tool to your data
• Join your web analytics data with any other data set
![Page 7: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/7.jpg)
Snowplow was born out of our frustration with traditional web analytics tools…
• Limited set of reports that don’t answer business questions
  • Traffic levels by source
  • Conversion levels
  • Bounce rates
  • Pages / visit
• Web analytics tools don’t understand the entities that matter to business
  • Customers, intentions, behaviours, articles, videos, authors, subjects, services…
  • …vs pages, conversions, goals, clicks, transactions
• Web analytics tools are siloed
  • Hard to integrate with other data sets incl. digital (marketing spend, ad server data), customer data (CRM), financial data (cost of goods, customer lifetime value)
![Page 8: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/8.jpg)
…and out of the opportunities that new technologies for taming big data presented
These tools make it possible to capture, transform, store and analyse all your granular, event-level data, so you can perform any analysis
![Page 9: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/9.jpg)
Snowplow is composed of a set of loosely coupled subsystems, architected to be robust and scalable
1. Trackers: generate event data
   • Examples: Javascript tracker; Python / Lua / No-JS / Arduino tracker
2. Collectors: receive data from trackers and log it to S3
   • Examples: Cloudfront collector; Clojure collector for Amazon EB
3. Enrich: clean and enrich raw data
   • Built on Scalding / Cascading / Hadoop and powered by Amazon EMR
4. Storage: store data ready for analysis
   • Examples: Amazon Redshift, PostgreSQL, Amazon S3
5. Analytics
A, B, C, D: Standardised data protocols
• Batch-based
• Normally run overnight; sometimes every 4-6 hours
![Page 10: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/10.jpg)
Why are we excited about Kinesis?
![Page 11: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/11.jpg)
A quick history lesson: the three eras of business data processing
1. The classic era, 1996+
2. The hybrid era, 2005+
3. The unified era, 2013+
For more see http://snowplowanalytics.com/blog/2014/01/20/the-three-eras-of-business-data-processing/
![Page 12: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/12.jpg)
The classic era, 1996+
OWN DATA CENTER
• Narrow data silos, each with its own low-latency local loop: CMS, CRM, E-comm, ERP
• Point-to-point connections
• Nightly batch ETL process into the data warehouse
• Data warehouse: wide data coverage, full data history, high latency
• Management reporting runs off the data warehouse
![Page 13: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/13.jpg)
The hybrid era, 2005+
• Systems, each with a low-latency local loop: Search, E-comm, ERP and CMS (narrow data silos), plus CRM, Email marketing and Web analytics, spread across a cloud vendor / own data center and SaaS vendors #1, #2 and #3
• Data is extracted via APIs and bulk exports
• Low latency: stream processing (product rec’s), micro-batch processing (systems monitoring)
• High latency: batch processing into a data warehouse (management reporting), batch processing into Hadoop (ad hoc analytics)
![Page 14: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/14.jpg)
The unified era, 2013+
• Systems: Search, E-comm, ERP and CMS (narrow data silos, some with low-latency local loops), plus CRM and Email marketing, spread across a cloud vendor / own data center and SaaS vendors #1 and #2
• Systems feed the unified log via streaming APIs / web hooks
• Unified log: low latency, wide data coverage, few days’ data history
• Archiving to Hadoop: high latency, wide data coverage, full data history
• Low-latency uses of the event stream: systems monitoring, product rec’s, fraud detection, churn prevention, APIs
• High-latency uses: ad hoc analytics, management reporting
![Page 15: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/15.jpg)
The unified log is Kinesis (or Kafka)
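To make the unified log idea concrete, here is a minimal in-memory sketch. It does not call Kinesis or Kafka; `UnifiedLog`, `Consumer` and the event fields are hypothetical names chosen for illustration. It shows only the core property: one append-only log, with each consumer tracking its own read position so that fast and slow consumers do not interfere.

```python
from dataclasses import dataclass, field


@dataclass
class UnifiedLog:
    """Toy append-only log; a stand-in for a Kinesis or Kafka stream (assumption)."""
    records: list = field(default_factory=list)

    def append(self, record):
        """Append a record and return its position (offset) in the log."""
        self.records.append(record)
        return len(self.records) - 1

    def read_from(self, offset):
        """Return every record at or after the given offset."""
        return self.records[offset:]


@dataclass
class Consumer:
    """A reader that remembers how far it has read, independently of other readers."""
    name: str
    offset: int = 0

    def poll(self, log):
        new_records = log.read_from(self.offset)
        self.offset += len(new_records)
        return new_records


log = UnifiedLog()
log.append({"event": "page_view", "user": "u1"})
log.append({"event": "add_to_basket", "user": "u1"})

fraud = Consumer("fraud_detection")   # hypothetical low-latency consumer
archive = Consumer("archiving")       # hypothetical slower consumer

fraud_batch = fraud.poll(log)         # reads the first two events
log.append({"event": "purchase", "user": "u1"})
archive_batch = archive.poll(log)     # catches up on all three events
fraud_next = fraud.poll(log)          # sees only the event it has not read yet
```

Because the log itself never changes when a consumer reads it, adding a new consumer (say, churn prevention) needs no change to the producers or to the other consumers.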
![Page 16: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/16.jpg)
Can we implement Snowplow on top of Kinesis?
![Page 17: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/17.jpg)
Adding Kinesis support to Snowplow
![Page 18: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/18.jpg)
Where we are heading with our Kinesis architecture
• Snowplow Trackers → Scala Stream Collector → Raw event stream
• Raw event stream → Enrich Kinesis app → Enriched event stream (plus Bad raw events stream)
• S3 sink Kinesis app → S3
• Redshift sink Kinesis app → Redshift
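The role of the Enrich Kinesis app in this architecture can be sketched as below. This is an illustration only, not Snowplow's actual code: the streams are plain Python lists rather than Kinesis streams, and `enrich`, `run_enrich`, `page_url` and `page_host` are hypothetical names. The point is the routing: each raw event ends up either on the enriched stream or on the bad events stream, never dropped silently.

```python
import json
from urllib.parse import urlsplit


def enrich(raw_line):
    """Parse one raw event and add a derived field; raise ValueError if unusable."""
    event = json.loads(raw_line)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(event, dict) or not isinstance(event.get("page_url"), str):
        raise ValueError("missing page_url")
    enriched = dict(event)
    enriched["page_host"] = urlsplit(event["page_url"]).hostname
    return enriched


def run_enrich(raw_stream):
    """Route each raw event to the enriched stream or the bad events stream."""
    enriched_stream, bad_stream = [], []
    for raw_line in raw_stream:
        try:
            enriched_stream.append(enrich(raw_line))
        except ValueError as err:
            # Keep the original line so the failure can be inspected or replayed.
            bad_stream.append({"line": raw_line, "error": str(err)})
    return enriched_stream, bad_stream


raw_stream = [
    '{"page_url": "http://example.com/a"}',
    "not json",
    '{"x": 1}',
]
enriched_stream, bad_stream = run_enrich(raw_stream)
```

Keeping the failed line alongside its error message is a design choice in this sketch: it means bad events can be fixed and reprocessed later instead of being lost.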
![Page 19: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/19.jpg)
We took an important first step in our last release…
• pre-0.8.12: hadoop-etl
• 0.8.12: scala-common-enrich (record-level enrichment functionality), shared by scala-hadoop-enrich and scala-kinesis-enrich
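The idea behind factoring out record-level enrichment can be sketched as follows. This is an assumption-laden illustration in Python, not the Scala code in the Snowplow repository: `enrich_record`, `run_batch`, `run_stream` and the `useragent` / `is_mobile` fields are hypothetical. One function handles a single record, and thin batch and streaming wrappers both call it.

```python
def enrich_record(event):
    """Record-level enrichment: one raw event in, one enriched event out."""
    enriched = dict(event)
    enriched["is_mobile"] = "Mobile" in event.get("useragent", "")
    return enriched


def run_batch(events):
    """Batch wrapper: process a whole collected data set in one go."""
    return [enrich_record(event) for event in events]


def run_stream(events):
    """Streaming wrapper: lazily emit each enriched event as soon as it is read."""
    for event in events:
        yield enrich_record(event)


events = [
    {"useragent": "Mozilla/5.0 (iPhone) Mobile Safari"},
    {"useragent": "Mozilla/5.0 (X11; Linux x86_64)"},
]
batch_result = run_batch(events)
stream_result = list(run_stream(iter(events)))
```

Since both wrappers delegate to the same function, the batch and streaming paths produce identical output for the same input, which is the property the shared library is meant to guarantee.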
![Page 20: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/20.jpg)
… and the next release should get us much closer
• Snowplow Trackers → Scala Stream Collector → Raw event stream
• Raw event stream → Enrich Kinesis app → Enriched event stream (plus Bad raw events stream)
• S3 sink Kinesis app → S3
• Redshift sink Kinesis app → Redshift
![Page 21: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/21.jpg)
Live demo!
![Page 22: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/22.jpg)
Questions?
http://snowplowanalytics.com
https://github.com/snowplow/snowplow
@snowplowdata
![Page 23: Snowplow and Kinesis - Presentation to the inaugural Amazon Kinesis London User Group](https://reader038.vdocument.in/reader038/viewer/2022102804/54c6cf1e4a79593b028b45bd/html5/thumbnails/23.jpg)
And finally…
Huge thanks to our hosts!