![Page 1: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/1.jpg)
How Criteo Scaled and Supported Massive Growth with MongoDB
Julien SIMON
Vice President, [email protected] @julsimon
![Page 2: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/2.jpg)
CRITEO
PHASE 1: 2005-2008, CRITEO CREATION
• R&D EFFORT
• RETARGETING
• CPC
PHASE 2: 2008-2012, GLOBAL LEADER: +700 EMPLOYEES!
• MORE THAN 3000 CLIENTS
• 35 COUNTRIES, 15 OFFICES
• R&D: MORE THAN 300 PEOPLE
2005: 6 EMPLOYEES
2006
2007: 15 EMPLOYEES
2008: 33 EMPLOYEES
2009: 84 EMPLOYEES
2010: 203 EMPLOYEES
2011: 395 EMPLOYEES
2012: +700 EMPLOYEES SO FAR
![Page 3: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/3.jpg)
GLOBAL PRESENCE
SYDNEY
PARIS
LONDON
BARCELONA
MILAN
MUNICH
BOSTON
NEW YORK
SAO PAULO
PALO ALTO
TOKYO
SEOUL
STOCKHOLM
AMSTERDAM
15 OFFICES, 30+ COUNTRIES
CHICAGO
![Page 4: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/4.jpg)
PERFORMANCE DISPLAY
1. A user sees products on your website…
2. …then browses the internet…
3. …and sees personalized banners.
4. After clicking on the banner, the user goes back to the product page.
Copyright © 2013 Criteo. Confidential
![Page 5: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/5.jpg)
REAL-TIME PERSONALIZATION
• Buttons
• Colors
• Background
• Layout
• Slogans: WARM MEETS LIGHT, SWEET NOTHING, ADIDAS IS ALL IN, ALL ORIGINALS #REPRESENT
• “Call to action”: SHOP NOW, JOIN NOW, SEE MORE, CLICK HERE
• Opt-out link
![Page 6: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/6.jpg)
PREDICTION & RECOMMENDATION
2 CORE TECHNOLOGIES
• RECOMMENDATION ENGINE: choose the right product to display
• PREDICTION ENGINE: choose the right users / advertiser / publisher to display
• Increase CTR + CR
![Page 7: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/7.jpg)
INFRASTRUCTURE
DAILY TRAFFIC
- HTTP REQUESTS: 30+ BILLION
- BANNERS SERVED: 1+ BILLION
PEAK TRAFFIC (PER SECOND)
- HTTP REQUESTS: 500,000+
- BANNERS: 25,000+
7 DATA CENTERS
SET UP AND MANAGED IN-HOUSE
AVAILABILITY > 99.95%
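As a rough sanity check on the figures above, the daily totals can be converted to average per-second rates, and the availability target to a yearly downtime budget. This is a back-of-the-envelope sketch: the inputs are the slide's lower bounds ("30+ billion", "1+ billion", "> 99.95%"), and the helper names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 365 * 24


def average_per_second(daily_total: float) -> float:
    """Average rate implied by a daily total, assuming a flat 24-hour day."""
    return daily_total / SECONDS_PER_DAY


def downtime_budget_hours(availability: float) -> float:
    """Hours per (non-leap) year that may be lost at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Lower bounds quoted on the slide.
avg_http = average_per_second(30e9)    # HTTP requests per second
avg_banners = average_per_second(1e9)  # banners served per second
budget = downtime_budget_hours(0.9995)

print(f"average HTTP requests/s: {avg_http:,.0f}")
print(f"average banners/s:       {avg_banners:,.0f}")
print(f"peak/average (HTTP):     {500_000 / avg_http:.2f}")
print(f"peak/average (banners):  {25_000 / avg_banners:.2f}")
print(f"downtime budget:         {budget:.2f} hours/year")
```

Under these assumptions the average is roughly 347,000 HTTP requests per second, so the quoted 500,000+ peak is about 1.4 times the daily average, and 99.95% availability leaves a little over 4 hours of downtime per year.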
![Page 8: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/8.jpg)
HIGH PERFORMANCE COMPUTING
FETCH, STORE, CRUNCH, QUERY 20 ADDITIONAL TB EVERY DAY?
…SUBTITLED « HOW I LEARNED TO STOP WORRYING AND LOVE HPC »
Storm Kafka
![Page 9: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/9.jpg)
PRODUCT CATALOGUES
• Catalogue = product feed provided by advertisers (product id, description, category, price, URL, etc.)
• 3000+ catalogues, ranging from a few MB to several tens of GB
• About 50% of products change every day
• Imported at least once a day by an in-house application
• Data replicated within a geographical zone
• Accessed through a cache layer by web servers
• Microsoft SQL Server used from day 1
• Running fine in Europe, but…
– Number of databases (1 per advertiser)… and servers
– Size of databases
– SQL Server issues hard to debug and understand
• Running kind of fine in the US, until dead end in Q1 2011
– transactional replication over high latency links
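To make the daily import concrete, here is a minimal sketch of a product record and of a diff between two feed snapshots. The field names follow the feed description above; the `Product` class, the `diff_catalogue` function, and the sample data are hypothetical illustrations, not Criteo's in-house importer.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Product:
    """One row of an advertiser's product feed (fields taken from the slide)."""
    product_id: str
    description: str
    category: str
    price: float
    url: str


def diff_catalogue(
    old: Iterable[Product], new: Iterable[Product]
) -> Tuple[List[Product], List[str]]:
    """Compare two feed snapshots.

    Returns the products to insert or update, and the ids to delete.
    """
    old_by_id = {p.product_id: p for p in old}
    new_by_id = {p.product_id: p for p in new}
    # A product is upserted if it is new or if any field changed.
    upserts = [p for pid, p in new_by_id.items() if old_by_id.get(pid) != p]
    # Ids present in the old snapshot but missing from the new one are deleted.
    deleted = sorted(old_by_id.keys() - new_by_id.keys())
    return upserts, deleted


# Hypothetical sample data: two snapshots of a tiny catalogue.
yesterday = [
    Product("A", "Running shoe", "shoes", 89.0, "https://shop.example/a"),
    Product("B", "Rain jacket", "outerwear", 120.0, "https://shop.example/b"),
    Product("C", "Wool scarf", "accessories", 25.0, "https://shop.example/c"),
]
today = [
    Product("A", "Running shoe", "shoes", 79.0, "https://shop.example/a"),  # price change
    yesterday[1],  # unchanged
    Product("D", "Leather belt", "accessories", 35.0, "https://shop.example/d"),  # new
]

upserts, deleted = diff_catalogue(yesterday, today)
print([p.product_id for p in upserts], deleted)  # ['A', 'D'] ['C']
```

With about half of the products changing every day, a diff like this would still touch a large share of each catalogue, which is one reason the write load matters as much as the read load.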
Copyright © 2010 Criteo. Confidential.
![Page 10: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/10.jpg)
REQUIREMENTS FOR A NEW DB
• Scale-out architecture running on commodity hardware (aka "Intel CPUs in metal boxes")
• No transactions needed, eventual consistency OK
• High availability
• Distributed clusters, with replication over high latency links
• Queryable (key-value not enough)
• Open source
– … with active user community
– … backed by a stable organization with long-term commitment (not one guy in a garage)
– … no licence fees for production use
– … commercial support available at reasonable cost
• Easy to learn, (re)deploy, monitor and upgrade
• "Low maintenance" (don’t need a 10-person team just to run it)
• Multi-language support
• Ability to export everything to Hadoop multiple times per day
![Page 11: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/11.jpg)
FROM SQL SERVER TO MONGODB
• Ah, database migrations… everyone loves them
• 1st step: solve replication issue
– Import and replicate catalogues in MongoDB
– Push content to SQL Server, still queried by web servers
• 2nd step: prove that MongoDB can survive our web traffic
– Modify web applications to query MongoDB
– C-a-r-e-f-u-l-l-y switch web queries to MongoDB for a small set of catalogues
– Observe, measure, A/B test… and generally make sure that the system still works
• 3rd step: scale!
– Migrate thousands of catalogues away from SQL Server
– Monitor and tweak the MongoDB clusters
– Add more MongoDB servers… and more shards
– Update ops processes (monitoring, backups, etc.)
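The second step, switching web queries for a small set of catalogues first, can be sketched as a read router driven by an allowlist. This is an illustration only: `CatalogueRouter`, the two lookup callables, and the catalogue ids are hypothetical stand-ins (plain dictionaries play the role of both databases), not Criteo's code or a MongoDB API.

```python
from typing import Callable, Iterable, Optional

# (catalogue_id, product_id) -> product document, or None if not found.
Lookup = Callable[[str, str], Optional[dict]]


class CatalogueRouter:
    """Send reads for migrated catalogues to the new store, the rest to the old one."""

    def __init__(self, legacy: Lookup, candidate: Lookup,
                 migrated: Iterable[str] = ()) -> None:
        self._legacy = legacy
        self._candidate = candidate
        self._migrated = set(migrated)
        # Per-backend counters, so the split can be observed and measured.
        self.calls = {"legacy": 0, "candidate": 0}

    def migrate(self, catalogue_id: str) -> None:
        """Start serving this catalogue from the new store."""
        self._migrated.add(catalogue_id)

    def rollback(self, catalogue_id: str) -> None:
        """Send this catalogue back to the old store if something looks wrong."""
        self._migrated.discard(catalogue_id)

    def get_product(self, catalogue_id: str, product_id: str) -> Optional[dict]:
        if catalogue_id in self._migrated:
            self.calls["candidate"] += 1
            return self._candidate(catalogue_id, product_id)
        self.calls["legacy"] += 1
        return self._legacy(catalogue_id, product_id)


# In-memory stand-ins for the two databases (hypothetical data).
sql_server = {("shoes-fr", "A"): {"price": 89.0}, ("books-us", "B"): {"price": 12.0}}
mongodb = dict(sql_server)  # after step 1, both stores hold the same content

router = CatalogueRouter(
    legacy=lambda c, p: sql_server.get((c, p)),
    candidate=lambda c, p: mongodb.get((c, p)),
    migrated=["shoes-fr"],
)
router.get_product("shoes-fr", "A")  # served by the new store
router.get_product("books-us", "B")  # still served by the old store
print(router.calls)
```

Because step 1 keeps both stores populated, a rollback in this sketch is just a change to the allowlist, which is what makes a careful, catalogue-by-catalogue switch possible.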
![Page 12: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/12.jpg)
OUR MONGODB DEPLOYMENT
• Europe
– 18 3-server shards (1+1+1)
– 800M products, 1TB
– 1B requests/day (peak at 40K/s)
– 350M updates/day (peak at 11K/s)
• US
– 14 4-server shards (2+2)
– 400M products, 650GB
• APAC
– 12 3-server shards (2+1)
– 300M products, 500GB
• 146 servers total
• MongoDB versions: 2.0 (+ Criteo patches), 2.2, 2.4.3
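The deployment figures can be cross-checked with a few lines of arithmetic. The sketch below recomputes the server total from the shard counts and derives an average stored size per product; it assumes decimal units (1 GB = 10^9 bytes, 1 TB = 1000 GB), and the `Region` class is an illustrative helper.

```python
from dataclasses import dataclass

GB = 10**9  # assumption: decimal gigabytes


@dataclass(frozen=True)
class Region:
    name: str
    shards: int
    servers_per_shard: int
    products: int
    data_bytes: int

    @property
    def servers(self) -> int:
        return self.shards * self.servers_per_shard

    @property
    def bytes_per_product(self) -> float:
        return self.data_bytes / self.products


# Figures as quoted on the slide.
REGIONS = [
    Region("Europe", 18, 3, 800_000_000, 1000 * GB),
    Region("US", 14, 4, 400_000_000, 650 * GB),
    Region("APAC", 12, 3, 300_000_000, 500 * GB),
]

total_servers = sum(r.servers for r in REGIONS)
for r in REGIONS:
    print(f"{r.name}: {r.servers} servers, "
          f"{r.bytes_per_product:,.0f} bytes per product on average")
print(f"total servers: {total_servers}")

# Europe only: requests per update, from the daily totals on the slide.
europe_requests_per_update = 1_000_000_000 / 350_000_000
print(f"Europe requests per update: {europe_requests_per_update:.2f}")
```

The shard counts add up to the 146 servers quoted (54 + 56 + 36), and each product averages between roughly 1.2 and 1.7 KB of stored data under the decimal-unit assumption.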
![Page 13: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/13.jpg)
MONGODB, 2+ YEARS LATER
• Stable (2.4.3 much better)
• Easy to (re)install and administer
• Great for small datasets (i.e. smaller than server RAM)
• Good performance if read/write ratio is high
• Failover and inter-DC replication work (but shard early!)
• Performance suffers when:
– dataset much larger than RAM
– read/write ratio is low
– multiple applications coexist on the same cluster
• Some scalability issues remain (master-slave, connections)
• Criteo is very interested in the 10gen roadmap
![Page 14: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/14.jpg)
THANKS A LOT FOR YOUR ATTENTION!
www.criteo.com
engineering.criteo.com
![Page 15: Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB](https://reader036.vdocument.in/reader036/viewer/2022081403/554a5e8bb4c90531228b53f1/html5/thumbnails/15.jpg)