Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman
![Page 1: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/1.jpg)
Simplifying Event Streaming: Tools for Location Transparency & Data Evolution
Paul Osman - @paulosman - [email protected] - Staff Software Engineer, Under Armour Connected Fitness - Kafka Summit 2016
![Page 2: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/2.jpg)
Introduction
• Paul Osman
• Staff Software Engineer - Under Armour Connected Fitness
• Formerly at PagerDuty, 500px, SoundCloud
• @paulosman
• [email protected]
![Page 3: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/3.jpg)
![Page 4: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/4.jpg)
![Page 5: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/5.jpg)
![Page 6: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/6.jpg)
Under Armour Connected Fitness
• November 2013 - Under Armour acquires MapMyFitness Inc
• February 2015 - Under Armour acquires MyFitnessPal
• February 2015 - Under Armour acquires Endomondo
• January 2016 - Under Armour announces HealthBox, Gemini 2 RE
![Page 7: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/7.jpg)
Under Armour Connected Fitness
![Page 8: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/8.jpg)
MyFitnessPal and Kafka
• MFP started as a Rails monolith
• Broken into microservices written in Scala and Ruby
• Data integration challenges
• Service dependencies difficult to manage
![Page 9: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/9.jpg)
Solution
![Page 10: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/10.jpg)
Pushing a data migration…
![Page 11: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/11.jpg)
You broke my consumer!
![Page 12: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/12.jpg)
My bad!
![Page 13: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/13.jpg)
Fix it?
![Page 14: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/14.jpg)
Other Challenges…
• Client libraries for non-JVM languages were of varying quality
• Developers needed to know about Kafka
• Wanted to federate Kafka clusters - no one team should have to maintain all clusters
![Page 15: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/15.jpg)
MyFitnessPal joined Under Armour
![Page 16: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/16.jpg)
![Page 17: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/17.jpg)
Project Golden Gate
![Page 18: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/18.jpg)
![Page 19: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/19.jpg)
Challenges Recap
• Engineers needed to know a lot about Kafka clusters
• Data migrations broke consumer contracts
• Client libraries for non-JVM languages
• Management of Kafka clusters
• Data retention policies
![Page 20: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/20.jpg)
Location Transparency
• A publishing client shouldn’t need to be concerned with things like clusters, topics, etc.
• We need some kind of source of truth for event locations
![Page 21: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/21.jpg)
Topology Service
• Each event has a namespace and event type (globally unique)
• The topology service instructs clients where to publish or consume those messages
• Introduces the concept of “zones”, each of which represents one or more clusters
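The lookup described on this slide can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual topology service: the zone table, broker addresses, the `locate` function and the topic naming rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    zone: str
    clusters: tuple  # bootstrap addresses of the clusters in the zone
    topic: str


# Hypothetical data: zone names, hosts and event assignments are illustrative.
ZONES = {
    "mfp": ("kafka-mfp-1:9092",),
    "mmf": ("kafka-mmf-1:9092", "kafka-mmf-2:9092"),
}
EVENT_ZONES = {
    ("mfp", "FoodEntryCreated"): "mfp",
    ("mmf", "ActivityFeedStoryUpdate"): "mmf",
}


def locate(namespace: str, event_type: str) -> Location:
    """Tell a client where to publish or consume a (namespace, event type) pair."""
    try:
        zone = EVENT_ZONES[(namespace, event_type)]
    except KeyError:
        raise LookupError(f"unknown event {namespace}/{event_type}") from None
    # Assumed naming rule: lowercased "<namespace>_<event type>".
    topic = f"{namespace}_{event_type}".lower()
    return Location(zone=zone, clusters=ZONES[zone], topic=topic)
```

A client would call `locate("mmf", "ActivityFeedStoryUpdate")` and never hard-code a cluster or topic, which is the point of location transparency.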
![Page 22: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/22.jpg)
Data Migrations
• Solved problem - use Schemas
• Confluent Schema Registry + Small Service to capture Metadata (event type and namespace)
• Confluent Schema Registry uses Avro, so we do too
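To illustrate why schemas help with data migrations, here is a simplified sketch of one Avro record resolution rule: a consumer on a newer schema can still read older data only if every field it adds declares a default. The function name and the example fields (`user_id`, `calories`) are hypothetical, and a real registry compatibility check covers much more (type promotion, aliases, nested records).

```python
def missing_defaults(reader: dict, writer: dict) -> list:
    """Return reader fields that would fail when decoding data written with `writer`.

    Simplified Avro rule: a field present in the reader schema but absent
    from the writer schema must declare a default value.
    """
    written = {f["name"] for f in writer["fields"]}
    return [
        f["name"]
        for f in reader["fields"]
        if f["name"] not in written and "default" not in f
    ]


# Hypothetical schema versions for illustration only.
v1 = {"type": "record", "name": "FoodEntryCreated", "fields": [
    {"name": "user_id", "type": "string"},
]}
v2_safe = {"type": "record", "name": "FoodEntryCreated", "fields": [
    {"name": "user_id", "type": "string"},
    {"name": "calories", "type": ["null", "int"], "default": None},
]}
v2_breaking = {"type": "record", "name": "FoodEntryCreated", "fields": [
    {"name": "user_id", "type": "string"},
    {"name": "calories", "type": "int"},  # no default: old data cannot be read
]}
```

Here `missing_defaults(v2_safe, v1)` is empty, while `missing_defaults(v2_breaking, v1)` flags `calories`, the kind of change that produces a "You broke my consumer!" moment.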
![Page 23: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/23.jpg)
Data Migrations
Status values: pending, available
{
  event_type: "ActivityFeedStoryUpdate",
  namespace: "mmf",
  status: "pending",
  confluent_subject: "mmf_activityfeedstoryupdate",
  schema_id: "bb68e5381e88d52574b0f50a000fbe9b"
}
![Page 24: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/24.jpg)
Registering a Schema
$> schema-registry register schema --event-type FoodEntryCreated --file-name FoodEntryCreated.avsc \
     --namespace mfp -p
$> schema-registry activate schema 20afc5a8f9c017c1f4e82757a7a88f5b -p
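The two commands suggest a small lifecycle for schema metadata. The sketch below models it in memory under stated assumptions: that `register` creates a record in the "pending" status, that `activate` moves it to "available", that the schema id is an MD5 hex digest of the schema text (the ids on the slides are 32 hex characters, but the real derivation is not shown), and that the subject is the lowercased namespace and event type joined by an underscore, as in the example record. The class and method names are hypothetical.

```python
import hashlib


class SchemaMetadataStore:
    """In-memory stand-in for the small metadata service; illustrative only."""

    def __init__(self):
        self._records = {}

    def register(self, namespace: str, event_type: str, schema_text: str) -> dict:
        # Assumption: id derived from the schema text via MD5.
        schema_id = hashlib.md5(schema_text.encode("utf-8")).hexdigest()
        record = {
            "event_type": event_type,
            "namespace": namespace,
            "status": "pending",
            "confluent_subject": f"{namespace}_{event_type}".lower(),
            "schema_id": schema_id,
        }
        self._records[schema_id] = record
        return record

    def activate(self, schema_id: str) -> dict:
        # Assumption: activation flips the status from pending to available.
        record = self._records[schema_id]
        record["status"] = "available"
        return record
```

Keeping registration and activation separate gives schema authors a chance to review a schema before publishers and consumers start relying on it.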
![Page 25: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/25.jpg)
![Page 26: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/26.jpg)
Publishing
![Page 27: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/27.jpg)
JVM Languages
• Java and Scala client libraries
![Page 28: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/28.jpg)
Non-JVM Languages
projects/GoldenGate $ curl -D - -H'Content-Type: application/json' http://localhost:3005/golden-gate-proxy/produce -d'@schemas/integ-message.json'
HTTP/1.1 202 Accepted
Server: spray-can/1.3.3
Date: Tue, 19 Apr 2016 17:34:21 GMT
Content-Type: application/json; charset=UTF-8
Content-Length: 419

{
  "items": [{
    "producer_id": "foo",
    "schema_id": "5635ce15a15213105c091d5e0945b0c2",
    "zone": "mfp",
    "payload": {
      "context": null,
      "email_address": "vneo",
      "email_source": "fadipwxfmotvav",
      "first_name": null,
      "last_name": null,
      "country": {"string": "US"}
    }
  }]
}
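For a non-JVM service, the same call can be made from application code. The sketch below builds the HTTP request in Python, assuming the request body has the same shape as the JSON shown on this slide (`items` with `producer_id`, `schema_id`, `zone`, `payload`). The URL comes from the curl example; the function names are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the curl example on the slide.
PROXY_URL = "http://localhost:3005/golden-gate-proxy/produce"


def build_produce_request(producer_id, schema_id, zone, payload):
    """Build (but do not send) a POST carrying one event for the proxy."""
    body = {"items": [{
        "producer_id": producer_id,
        "schema_id": schema_id,
        "zone": zone,
        "payload": payload,
    }]}
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def publish(request):
    """Send the request; the slide's example response is 202 Accepted."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

The publisher only needs an HTTP client and a schema id; it never talks to Kafka directly.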
![Page 29: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/29.jpg)
Consuming
![Page 30: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/30.jpg)
JVM Languages
![Page 31: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/31.jpg)
Non-JVM Languages
projects/GoldenGate $ ./gg-consumer-proxy --subscriptions=mmf/activity_feed_updated=http://localhost:3000/callback
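On the receiving side, the non-JVM service only has to expose the callback URL. The sketch below shows how the subscription string splits into its parts and what a minimal callback endpoint might look like. It assumes the proxy delivers each event as a JSON body in an HTTP POST, which the slide does not confirm; the handler, the `received` list and the 204 response are illustrative choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_subscription(spec: str):
    """Split 'namespace/event_type=callback_url' into its three parts."""
    event, _, callback_url = spec.partition("=")
    namespace, _, event_type = event.partition("/")
    if not (namespace and event_type and callback_url):
        raise ValueError(f"malformed subscription: {spec!r}")
    return namespace, event_type, callback_url


received = []  # events seen so far, kept in memory for the example


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumption: the proxy POSTs one JSON document per delivery.
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length) or b"null"))
        self.send_response(204)
        self.end_headers()


def serve(port: int = 3000):
    """Listen on the port used in the slide's callback URL."""
    HTTPServer(("localhost", port), CallbackHandler).serve_forever()
```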
![Page 32: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/32.jpg)
Federation of Kafka Clusters
[Diagram: Publisher, Topology Service, Kafka, Consumer]
![Page 33: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/33.jpg)
Data Retention
• Archiving is now just a job for a specialized consumer
• Archiving is done “per-zone”. Data that shouldn’t be archived is only published to zones that are not archived (decided per event type)
• In our case, data is stored in S3 and then accessed through a variety of tools for analysis, batch processing, etc.
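As a rough illustration of archiving as "just another consumer", the sketch below copies events from a zone into an object store only when that zone is marked as archived. Everything here is hypothetical: the zone flags, the key layout, the event dictionary shape, and the plain dict standing in for S3.

```python
# Hypothetical per-zone archiving flags.
ARCHIVED_ZONES = {"mfp": True, "ephemeral": False}


def archive_key(zone, namespace, event_type, day, offset):
    """Build an object key; this layout is an assumption, not the real one."""
    return f"{zone}/{namespace}/{event_type}/{day}/{offset:012d}.avro"


def archive(zone, events, store):
    """Copy a list of events from one zone into `store` (a dict standing in for S3).

    Returns the number of events archived; zones that are not archived
    are skipped entirely.
    """
    if not ARCHIVED_ZONES.get(zone, False):
        return 0
    for event in events:
        key = archive_key(
            zone, event["namespace"], event["event_type"],
            event["day"], event["offset"],
        )
        store[key] = event["payload"]
    return len(events)
```

Because the decision is made per zone, publishers control retention simply by which zone an event type is assigned to.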
![Page 34: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/34.jpg)
Adoption Pains
![Page 35: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/35.jpg)
Adoption Pains
Leaky Abstractions
![Page 36: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/36.jpg)
Adoption Pains
• Publishers
• Consumers
• Schema Designers
![Page 37: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/37.jpg)
Help Publishers - Avro Helper Library
• helpful-avro Scala library
• Adds a layer of robustness
• Tries a few tricks to make a payload validate against a schema
![Page 38: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/38.jpg)
Example Schema
{
  "name": "Person",
  "namespace": "org.example",
  "type": "record",
  "fields": [
    {"name": "first_name", "type": "string"},
    {"name": "last_name", "type": "string"},
    {"name": "age", "type": ["null", "int"]},
    {"name": "username", "type": ["null", "string"]}
  ]
}
![Page 39: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/39.jpg)
Nullable / Optional Fields
{
  "name": "Person",
  "namespace": "org.example",
  "type": "record",
  "fields": [
    {"name": "first_name", "type": "string"},
    {"name": "last_name", "type": "string"},
    {"name": "age", "type": ["null", "int"]},
    {"name": "username", "type": ["null", "string"]}
  ]
}
Optional fields: age, username
![Page 40: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/40.jpg)
Nullable / Optional Fields
{ "first_name": "Paul", "last_name": "Osman", "username": "paulosman" }
(age omitted; username not type annotated)
{ "first_name": "Paul", "last_name": "Osman", "age": {"null":null}, "username": {"string": "paulosman"} }
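The kind of "tricks" a helper library could apply to get from the first payload to the second can be sketched as below. This is not the helpful-avro implementation (which is a Scala library); it is a small Python illustration that handles only flat records with primitive unions. The function names are hypothetical.

```python
PERSON = {
    "name": "Person", "namespace": "org.example", "type": "record",
    "fields": [
        {"name": "first_name", "type": "string"},
        {"name": "last_name", "type": "string"},
        {"name": "age", "type": ["null", "int"]},
        {"name": "username", "type": ["null", "string"]},
    ],
}


def avro_type_name(value):
    """Map a Python value to the Avro primitive name used as a union tag."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported value: {value!r}")


def coerce(payload: dict, schema: dict) -> dict:
    """Fill omitted nullable fields and add union type annotations."""
    out = {}
    for field in schema["fields"]:
        name, ftype = field["name"], field["type"]
        if isinstance(ftype, list):  # union type
            value = payload.get(name)  # an omitted field is treated as null
            if isinstance(value, dict) and len(value) == 1 and next(iter(value)) in ftype:
                out[name] = value  # already annotated, keep as-is
                continue
            tag = avro_type_name(value)
            if tag not in ftype:
                raise ValueError(f"{name}: {tag} is not a branch of {ftype}")
            # Follows the slide, which tags null as {"null": null}; the Avro
            # specification's JSON encoding writes a null branch as bare null.
            out[name] = {tag: value}
        else:
            if name not in payload:
                raise ValueError(f"missing required field: {name}")
            out[name] = payload[name]
    return out
```

Calling `coerce` on the first payload above with the `Person` schema yields the second, fully annotated payload.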
![Page 41: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/41.jpg)
Observability
• Browser and CLI based tools that allow people to observe activity being published to a specific zone
• Give people a way to see their event go through the system
• End to end monitoring, monitoring of consumer lag
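Consumer lag, mentioned on this slide, is usually computed per partition as the log end offset minus the consumer group's committed offset. A minimal sketch, assuming both sets of offsets have already been fetched into dictionaries keyed by partition (the function name is hypothetical):

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag: log end offset minus the group's committed offset.

    A partition with no committed offset is counted from offset 0 here,
    which is a simplification.
    """
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }
```

For example, end offsets `{0: 120, 1: 80}` and committed offsets `{0: 100, 1: 80}` give a lag of 20 on partition 0 and 0 on partition 1; a steadily growing total is the signal worth alerting on.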
![Page 42: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/42.jpg)
Future Plans
• Make schema authoring and registration easier and more automated
• Extend helpful-avro to work with Case Classes and POJOs
• Further hide implementation details
![Page 43: Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman](https://reader035.vdocument.in/reader035/viewer/2022070606/58a9ad5f1a28ab9c758b5b6d/html5/thumbnails/43.jpg)
CONFIDENTIAL & BUSINESS PROPRIETARY INFORMATION OF UNDER ARMOUR, INC. COPYRIGHT (C) 2015
Thank You http://underarmour.jobs