Use of EMR for Marketing Segmentation
by Mandhir Gidda, Technical Director, Razorfish
© 2009 Razorfish. All rights reserved.
Razorfish: Use of EMR for Marketing Segmentation
Agenda
• Who we are
• Razorfish, ATLAS, Microsoft
• ATLAS: What is it? Problems
• AWS – EMR – Why move?
• EMR Solution Outline
• Benefits gained, Opportunities
Who we are
– Razorfish London is a full-service digital agency.
– Founded in London in 1996
– We are now 250 people strong and experts at creative, design, social media, digital media, analytics, technology, service operations and user experience.
– We are part of one of the world's largest interactive agency networks with more than 2,800 people.
– According to LinkedIn, Razorfish is the 31st most desirable employer in the world (even beating Starbucks).
– For the last three years we’ve been the only agency recognised by Forrester Research as a ‘leader’ in both the Media & Interactive Marketing and Experience Design & Technology categories.
– We are Adobe’s ‘Digital Marketing Global Partner of the Year, 2012’
– We are No. 4 in the last Ad Age ‘Agency A-List’, the highest ranked digital agency.
RF – Atlas - Microsoft
• Razorfish: Developed the ATLAS ad serving engine
• Atlas was separated from Razorfish, but the two kept a symbiotic relationship
• Google bought DoubleClick
• Microsoft bought aQuantive
• Microsoft incorporated Atlas into MS Advertising and Publishing
• Microsoft sold Razorfish to Publicis Groupe
• RF continues to have a strong relationship with Atlas, and has gone on to develop Razorfish Edge and Insight On Demand (IoD), which use Atlas data extensively.
Atlas
• Razorfish: Developed the ATLAS ad serving engine
• Single cookie & Atlas tags
• 90% of browsers
• Clickstream analysis of current and historical log file data. Users are placed into buckets (segmented)
• Segmentation used to serve ads and cross-sell
Problem
45 Terabytes of raw clickstream (log) data
Business logic and metrics against loosely structured data
Custom user profiling
• ROI
• Custom ROI based on complex, client-specific business rules
• Rich Media and Analytics
Custom analysis of web surfing activity
Targeting
Problem
• Giant datasets
• Building infrastructure requires large, continuous investment
• Building for peak/holiday traffic
• Data mining apps / physical DBs at or near limit
• Client expectations/data volumes increasing
Previously (2009)
• Custom Distributed Log Processing Engine
  • Sorted by cookie_id, then by time
  • Need to segment granularly across a larger number of segments (customer or prospect)
• SQL
  • 60 SQL Server boxes
  • Shared resources (contention issues)
  • In a DR configuration
• OLAP
  • In-house, constrained
By the end of 2009 (Christmas holiday season), RF needed $500k to keep up with data processing needs.
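The "sorted by cookie_id, then by time" layout can be illustrated with a small sketch. This is not the engine described on the slide; the record layout (cookie_id, timestamp, URL) and the sample values are assumptions made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical log records: (cookie_id, unix_timestamp, url).
# The field layout and values are assumptions, not the Atlas log format.
records = [
    ("c2", 1700000300, "/cart"),
    ("c1", 1700000100, "/home"),
    ("c1", 1700000050, "/landing"),
    ("c2", 1700000200, "/product/42"),
]

def history_by_cookie(rows):
    """Sort by (cookie_id, time) and return {cookie_id: [urls in time order]}."""
    ordered = sorted(rows, key=itemgetter(0, 1))
    return {
        cookie: [url for _, _, url in group]
        for cookie, group in groupby(ordered, key=itemgetter(0))
    }

history = history_by_cookie(records)
```

Once each cookie's events are contiguous and time-ordered, per-user segmentation rules can be applied in a single pass over the data.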
AWS + EMR
• Efficient: Elastic infrastructure from AWS allows capacity to be provisioned as needed based on load, reducing cost and the risk of processing delays.
• Configuration: Amazon Elastic MapReduce and Cascading let Razorfish focus on application development without having to worry about time-consuming set-up, management, or tuning of clusters or the compute capacity upon which they sit.
• Ease of integration: Amazon Elastic MapReduce with Cascading allows data processing in the cloud without any changes to the underlying algorithms.
• Flexible: EMR with Cascading is flexible enough to allow “agile” implementation and unit testing of sophisticated algorithms.
• Adaptable: Cascading simplifies the integration of Hadoop with external ad systems.
• Scalable: AWS infrastructure helps Razorfish reliably store and process huge (petabyte-scale) data sets.
AWS EMR Segmentation
• Actionable
• Rules flexible / customizable
• Measurement of customer value
• Measurement of customer affinity
• Joining 2.8 billion transactions against known site categorization information
• The join is unbalanced, so some reducers take a heavy hit
• S3 storage: 45 TB of log data
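Why an unbalanced join hurts reducers can be seen in a toy sketch. It assumes keys are assigned to reducers by hashing the join key modulo the number of reducers (the general idea behind hash partitioning). The site names, record counts and reducer count are invented for illustration.

```python
import zlib
from collections import Counter

NUM_REDUCERS = 4  # illustrative value, not the real cluster configuration

def reducer_for(key: str, n: int = NUM_REDUCERS) -> int:
    """Assign a join key to a reducer with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % n

# Hypothetical workload: one very popular site plus many small ones.
transactions = ["big-portal.example"] * 9000 + [
    f"site-{i}.example" for i in range(1000)
]

# Count how many records each reducer would receive.
load = Counter(reducer_for(site) for site in transactions)
hot_reducer = reducer_for("big-portal.example")
```

Every record for the popular key lands on the same reducer, so that reducer receives at least 9,000 of the 10,000 records while the others share the remainder.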
We import a lot of Atlas Data
( ½ trillion ICA records )
24 servers
Upload 200+ GB of data per day
Cloud Storage
( about 71 million unique cookies a day)
100-machine cluster created on demand. We filter for only the transactions that we need to process (more than 3.5 billion)
Elastic MapReduce
We filter out the relevant cookies
Filter by behavior
( Match these cookies to hundreds of thousands of SKUs )
Filtered Transactions
Generate list of products that have been seen
SKU Table
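The behavior filter above can be sketched as a lookup of each transaction against a SKU table, keeping only the matches. The table contents, the URL-to-SKU mapping and the function name are hypothetical. The real job ran on Elastic MapReduce with Cascading, not in plain Python.

```python
from collections import defaultdict

# Hypothetical SKU table mapping a product page to a SKU (assumed layout).
sku_table = {"/product/42": "SKU-42", "/product/7": "SKU-7"}

# Hypothetical filtered transactions: (cookie_id, url).
transactions = [
    ("c1", "/home"),
    ("c1", "/product/42"),
    ("c2", "/product/7"),
    ("c2", "/product/42"),
    ("c3", "/about"),
]

def products_seen(rows, skus):
    """Return {cookie_id: set of SKUs seen}, dropping non-product pages."""
    seen = defaultdict(set)
    for cookie, url in rows:
        sku = skus.get(url)
        if sku is not None:
            seen[cookie].add(sku)
    return dict(seen)

seen_by_cookie = products_seen(transactions, sku_table)
```

Cookies with no product views (c3 here) drop out of the result, which is the point of filtering before the more expensive joins.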
Match to their affinity
( Cookies are matched to 3.5 billion ICA records )
Filtered Transactions
Determine profile information by the types of sites the user has visited
70 million placements
Join transactions to site genre information
Sport Enthusiast
…and run custom business rules
( super–computing power determines some key categories )
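As a rough sketch of the genre join plus a business rule, the example below counts visits per genre for each cookie and labels a cookie a "Sport Enthusiast" once it passes a threshold. The site names, the genre table and the threshold of three visits are assumptions. The slides do not state the actual client rules.

```python
from collections import Counter, defaultdict

# Hypothetical site categorization table (site -> genre).
site_genre = {
    "scores.example": "sport",
    "league.example": "sport",
    "news.example": "news",
}

# Hypothetical filtered transactions: (cookie_id, site).
visits = [
    ("c1", "scores.example"),
    ("c1", "league.example"),
    ("c1", "scores.example"),
    ("c1", "news.example"),
    ("c2", "news.example"),
    ("c2", "scores.example"),
]

MIN_SPORT_VISITS = 3  # made-up rule; real rules were client specific

def genre_counts(rows, genres):
    """Join visits to genres and count visits per genre for each cookie."""
    counts = defaultdict(Counter)
    for cookie, site in rows:
        genre = genres.get(site)
        if genre is not None:
            counts[cookie][genre] += 1
    return counts

def sport_enthusiasts(rows, genres, threshold=MIN_SPORT_VISITS):
    """Apply the illustrative rule: enough sport visits makes an enthusiast."""
    counts = genre_counts(rows, genres)
    return {cookie for cookie, c in counts.items() if c["sport"] >= threshold}

enthusiasts = sport_enthusiasts(visits, site_genre)
```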
Determine the types of products the user is interested in from what they have done on the site
Join site behavior to product info
In market Gamer
Filtered Transactions
SKU Table
category
We bring it all together
( 1 of N “Personalization” segments )
In market Gamer + Sport Enthusiast + Purchaser Home Theater
affinity generation
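Bringing the segments together can be pictured as intersecting the sets of cookies that qualified for each one. The cookie IDs and segment membership below are invented, and a set intersection is only one plausible way to combine segments, not necessarily the production logic.

```python
# Hypothetical segment membership produced by earlier steps.
segments = {
    "In market Gamer": {"c1", "c3"},
    "Sport Enthusiast": {"c1", "c2"},
    "Purchaser Home Theater": {"c1", "c2", "c3"},
}

def combine(segment_map, names):
    """Return the cookies that belong to every named segment."""
    return set.intersection(*(segment_map[name] for name in names))

personalization = combine(
    segments,
    ["In market Gamer", "Sport Enthusiast", "Purchaser Home Theater"],
)
```

Only cookies present in all three segments (c1 in this toy data) would receive the combined personalized message.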
Drive a personalized message
( 1.7 million per day )
User recently purchased a home theater system and is now looking for sports games
Target Ad
This all happens in about 8 hours every day
Each and every day
( not bad )
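For a sense of scale, a back-of-the-envelope calculation using the figures quoted on these slides (more than 3.5 billion filtered transactions, a 100-machine cluster, about 8 hours) gives an approximate per-machine rate. It assumes the work is spread evenly across machines and over the whole run, which the slides do not claim.

```python
# Figures taken from the slides; treated here as round numbers.
transactions_per_day = 3.5e9   # "more than 3.5 billion"
cluster_machines = 100         # "100 Machine Cluster"
run_hours = 8                  # "about 8 hours every day"

run_seconds = run_hours * 3600
cluster_rate = transactions_per_day / run_seconds   # transactions per second
per_machine_rate = cluster_rate / cluster_machines  # per machine, if evenly spread

print(f"cluster: {cluster_rate:,.0f} tx/s, per machine: {per_machine_rate:,.0f} tx/s")
```

Under these assumptions each machine handles on the order of a thousand transactions per second, sustained for the whole run.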
AWS + EMR
– Perfect clarity of cost
– No upfront infrastructure investment
– No client processing contention
– We couldn’t have done it.
– Without EMR/Hadoop, the process took 3 days and relied heavily on manual processes. Now it takes 5-8 hrs
– Elasticity to complete a job faster if it’s worth the cost.
– We can meet our SLAs
Expanding Data Landscape
• EMR allows us to deal with the ever-expanding number of channels and user interactions with sites and data:
• Clickstream data available from tools like Atlas and DoubleClick, which have cookied over 90% of the Internet
• Digital experience tracked through tools like Omniture, Webtrends and Google Analytics
• Other channel data across touchpoints (email, call center, mobile)
• Client data
• Transactional data
• Survey-based (Nielsen’s)
• Social data available through open APIs (hosepipes)
Thank you
• Mandhir Gidda