Rebuilding Web Tracking Infrastructure for Scale
Stephen Oakley
Principal Engineer, Marketo
What is Marketo?
Page 3Marketo Proprietary and Confidential | © Marketo, Inc. 05/02/2023
What is Web Tracking at Marketo?
• Ingest web page visits and clicks on customers' websites
• Trigger campaigns in response to web activity
• Trigger real-time personalization of the web experience
• Provide lead-level analytics for known leads
• Provide aggregate analytics for all lead activity
• Known leads are typically < 10% of all traffic
Legacy Web Tracking Infrastructure
Legacy Problems
• Throughput limitations: 2 million activities per day
• Processing delays can be on the order of hours
• Large customers cause web server brownouts
• Web reporting does not scale
• Fixed-size clusters prohibit horizontal scaling
• Brittle infrastructure prevents feature development
The Vision
Orion Initiative
• Increase scale to support IoT for Marketers
• Support billions of marketing activities each day
• Trigger on activities in near real time (< 2 minutes at the 99th percentile)
• Reduce operational costs
• Improve multitenancy and QoS
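The near-real-time target is stated as a tail percentile rather than an average. A minimal sketch of checking such a target, assuming per-activity lag samples in seconds; the `percentile` and `meets_target` helpers and the sample values are hypothetical and not taken from the talk:

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

def meets_target(samples, pct=99, limit_seconds=120):
    """True if the pct-th percentile lag is under the limit (2 minutes by default)."""
    return percentile(samples, pct) < limit_seconds

# Hypothetical lag samples: 99 fast activities and one straggler.
lags = [5.0] * 99 + [300.0]
p99 = percentile(lags, 99)
ok = meets_target(lags)
```

A single straggler does not break a 99th-percentile target over 100 samples, but it would break a target on the maximum, which is why the percentile is part of the requirement.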
Requirements
Business Requirements
• 200 MM activities per customer per day
• Near real-time web activity processing (SLA of < 1 minute lag)
• Improve cost efficiency
• Improve flexibility for feature enhancements
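For a sense of scale, the per-customer volume target can be converted to an average rate. This is plain arithmetic on the 200 MM/day requirement and the 2 million/day legacy limit; it is only an average, and bursty traffic will peak above it.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def average_rate_per_second(activities_per_day):
    """Average activities per second if traffic were spread evenly over a day."""
    return activities_per_day / SECONDS_PER_DAY

target_rate = average_rate_per_second(200_000_000)  # roughly 2,315 per second
legacy_rate = average_rate_per_second(2_000_000)    # roughly 23 per second
```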
Technical Requirements
• Multitenancy support with brownout protections
• Infrastructure must scale horizontally
• Decouple web processing from downstream processing
• Anonymous leads should cost next to nothing to track
Architecture & Design
Why HBase + Phoenix?
• Horizontally scalable
• Leverages the Hadoop cluster for storage and scaling
• Provides secondary indices for query patterns through Phoenix
• Natural integration with JDBC and Spark JDBC RDDs
Why Spark Streaming?
• Micro-batching provides sink-side efficiencies
• This is especially important with MySQL touchpoints
• Great integration with Kafka
• No strict real-time processing requirements
• Great community and industry adoption
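The sink-side efficiency of micro-batching comes from turning many single-event writes into a few bulk writes. A minimal sketch in plain Python, not Spark code: the `MicroBatcher` class is hypothetical, and it batches by count for simplicity, whereas Spark Streaming forms batches by time interval.

```python
class MicroBatcher:
    """Buffers events and hands them to the sink in batches."""

    def __init__(self, sink_write, max_batch_size):
        self.sink_write = sink_write          # callable that takes a list of events
        self.max_batch_size = max_batch_size
        self.buffer = []

    def add(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.max_batch_size:
            self.flush()

    def flush(self):
        """Write whatever is buffered as one bulk call, then start a new buffer."""
        if self.buffer:
            self.sink_write(self.buffer)
            self.buffer = []

# Record each bulk write instead of calling a real database.
sink_calls = []
batcher = MicroBatcher(sink_write=sink_calls.append, max_batch_size=100)
for i in range(250):
    batcher.add({"id": i})
batcher.flush()  # flush the final partial batch
```

Here 250 events reach the sink in 3 writes rather than 250, which matters most for sinks with a high per-call cost such as a relational database.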
Multitenancy
• One topic per customer (sized by volume)
• Traffic storms are isolated to a single customer
• Fairness/throttling is easy to control
• Spark Streaming job consumes from many topics
• Allows us to turn a customer off under error conditions
• See “Elastic Streaming” by Neelesh Shastry (Spark Summit)
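The per-customer topic layout can be sketched as follows. This is an illustration in plain Python rather than the Kafka or Spark APIs; the `TenantRouter` class, the `web-activity` topic prefix, and the `fair_batch` helper are all hypothetical names chosen for the example.

```python
class TenantRouter:
    """Maps each customer to its own topic and tracks which customers are switched off."""

    def __init__(self, topic_prefix="web-activity"):
        self.topic_prefix = topic_prefix
        self.disabled = set()

    def topic_for(self, customer_id):
        return f"{self.topic_prefix}.{customer_id}"

    def disable(self, customer_id):
        """Stop consuming one customer's topic, e.g. under error conditions."""
        self.disabled.add(customer_id)

    def enable(self, customer_id):
        self.disabled.discard(customer_id)

    def active_topics(self, customer_ids):
        return [self.topic_for(c) for c in customer_ids if c not in self.disabled]


def fair_batch(pending, per_customer_limit):
    """Take at most per_customer_limit events from each customer for one batch."""
    batch = []
    for customer_id, events in pending.items():
        batch.extend(events[:per_customer_limit])
    return batch


router = TenantRouter()
router.disable("noisy")
topics = router.active_topics(["acme", "noisy", "globex"])

# A traffic storm from one customer cannot crowd out the others.
pending = {"acme": ["a1", "a2"], "storm": [f"s{i}" for i in range(1000)]}
batch = fair_batch(pending, per_customer_limit=3)
```

Because each customer has its own topic, both the kill switch and the throttle are simple per-key decisions and need no inspection of individual messages.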
Making Spark Streaming Performant
• Coalesce small partitions for the same customer
• Aggressive caching of metadata (mostly from MySQL)
• Heavily leverage Scala future composition for parallelism
• Persist RDDs that are used for multiple outputs (e.g. write to Kafka and Activity Service)
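Three of these ideas can be illustrated together: cache metadata lookups, compute a result once and reuse it for several outputs, and issue the output writes in parallel. The talk uses Scala futures and Spark RDD persistence; the sketch below swaps in Python's `concurrent.futures` and `functools.lru_cache` as stand-ins, and every function name in it is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

db_lookups = []  # records simulated database hits

@lru_cache(maxsize=10_000)
def customer_metadata(customer_id):
    """Stand-in for a MySQL lookup; cached so repeat calls skip the database."""
    db_lookups.append(customer_id)
    return {"customer_id": customer_id}

def write_to_kafka(records):
    return ("kafka", len(records))             # placeholder for a real producer call

def write_to_activity_service(records):
    return ("activity-service", len(records))  # placeholder for a real service call

def process_batch(batch, executor):
    # Enrich once, then reuse the same result for both outputs.
    enriched = [(event, customer_metadata(event["customer_id"])) for event in batch]
    # Submit both writes before waiting so they run concurrently.
    futures = [
        executor.submit(write_to_kafka, enriched),
        executor.submit(write_to_activity_service, enriched),
    ]
    return [f.result() for f in futures]

batch = [{"customer_id": "acme", "url": f"/page/{i}"} for i in range(5)]
with ThreadPoolExecutor(max_workers=2) as executor:
    results = process_batch(batch, executor)
```

Five events from the same customer cause one metadata lookup, and the enrichment is done once even though two sinks consume it.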
Making Anonymous Traffic Cheap
• High costs of web traffic in legacy system
• MySQL storage for all traffic
• Downstream processing of all events (even anonymous)
• V2 only processes and stores known traffic in MySQL
• Defer triggering for anonymous data until promotion
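Deferring work for anonymous visitors can be sketched as below. This assumes anonymous events are kept in a cheap store keyed by a visitor cookie and are replayed for triggering when the visitor is promoted to a known lead; the slides state only that triggering is deferred until promotion, so the replay step, the `WebTracker` class, and its fields are assumptions for illustration.

```python
from collections import defaultdict

class WebTracker:
    def __init__(self):
        self.known_leads = {}                    # cookie id -> lead id
        self.anonymous_log = defaultdict(list)   # cookie id -> events (cheap store)
        self.triggered = []                      # (lead id, event) sent downstream

    def track(self, cookie_id, event):
        lead_id = self.known_leads.get(cookie_id)
        if lead_id is None:
            # Anonymous: store only, no downstream processing yet.
            self.anonymous_log[cookie_id].append(event)
        else:
            self.triggered.append((lead_id, event))

    def promote(self, cookie_id, lead_id):
        """Visitor becomes a known lead: replay the deferred events for triggering."""
        self.known_leads[cookie_id] = lead_id
        for event in self.anonymous_log.pop(cookie_id, []):
            self.triggered.append((lead_id, event))

tracker = WebTracker()
tracker.track("cookie-1", "/pricing")
before_promotion = list(tracker.triggered)
tracker.promote("cookie-1", lead_id=42)
tracker.track("cookie-1", "/signup")
```

Since known leads are typically under 10% of traffic, most events take only the store-only path.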
Impact and Results
• Rolled out to our highest volume customers
• Processing latencies < 30s (at the 99.9th percentile)
• Allowed key customers to scale from ~2 MM/day to > 20 MM/day
Future Work
• Mitigating straggler effects on processing delays
• Adding sessionization for web reporting
• Scaling Kafka topics as customers increase volume
• Globally distributed ingestion for a single customer
We’re Hiring! Http://Marketo.Jobs
Q & A