![Page 1: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/1.jpg)
How Do you Scale for both Predictable and
Unpredictable Events on such a Large Scale?
Surge 2013
![Page 2: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/2.jpg)
We’re going to talk about this:
Whitney Houston Death: February 11, 2012
![Page 3: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/3.jpg)
… and this:
![Page 4: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/4.jpg)
Without your site going down…
![Page 5: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/5.jpg)
Who Am I?
• Team Lead of CBC.ca System Administration team.
• Been with CBC for over 11 years (since 2002).
• @blakecrosby
![Page 6: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/6.jpg)
![Page 7: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/7.jpg)
![Page 8: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/8.jpg)
Let’s go back in time… way back
![Page 9: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/9.jpg)
2010
![Page 10: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/10.jpg)
2008
![Page 11: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/11.jpg)
2007
![Page 12: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/12.jpg)
2006
![Page 13: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/13.jpg)
2005
![Page 14: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/14.jpg)
2004
![Page 15: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/15.jpg)
2003
![Page 16: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/16.jpg)
“News stories must appear on the site as fast as possible!”
- Every Journalist at CBC
![Page 17: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/17.jpg)
![Page 18: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/18.jpg)
![Page 19: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/19.jpg)
![Page 20: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/20.jpg)
![Page 21: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/21.jpg)
This architecture doesn’t work for news websites.
![Page 22: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/22.jpg)
This was an important lesson for CBC
![Page 23: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/23.jpg)
Breaking news traffic: it’s unpredictable and short-lived.
![Page 24: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/24.jpg)
From 12k hits/s to 30k hits/s
Royal Baby: July 22, 2013
![Page 25: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/25.jpg)
From 1 Gbps to 2.5 Gbps in ~7 minutes.
Boston Marathon Bombing: April 15, 2013
![Page 26: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/26.jpg)
From 1 Gbps to 14 Gbps in ~10 minutes.
Whitney Houston Death: February 11, 2012
![Page 27: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/27.jpg)
Challenges we (or you) face
![Page 28: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/28.jpg)
Too expensive to build out infrastructure for traffic levels that are sustained < 1% of the year.
![Page 29: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/29.jpg)
Content must be flexible to changing traffic conditions
![Page 30: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/30.jpg)
We have valuable information that users need in a crisis.
![Page 31: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/31.jpg)
“News stories must appear on the site as fast as possible!”
- Every Journalist at CBC
![Page 32: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/32.jpg)
How we fixed this problem (back in 2003, remember?)
![Page 33: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/33.jpg)
![Page 34: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/34.jpg)
Save everything to disk.
![Page 35: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/35.jpg)
Advantages
• Observes the principle of least surprise.
• Fast
• Takes advantage of OS and FS caches
• Easy to turn off certain site features.
![Page 36: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/36.jpg)
![Page 37: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/37.jpg)
Using SSIs (Server Side Includes)
• Primitive, but fast and secure.
• Can turn off site features or change look and feel by editing one file.
• All pages are updated instantly, without having to wait for pages to be republished.
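To illustrate why editing one include file changes every page at once, here is a minimal Python sketch of include expansion at request time. It is not the web server's SSI implementation: the `render` function, the fragment dictionary, and the `/includes/comments.html` path are hypothetical, and nested includes are not handled.

```python
import re
from typing import Dict

# Matches the common SSI directive form: <!--#include virtual="/path" -->
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def render(page: str, fragments: Dict[str, str]) -> str:
    """Expand include directives using fragments looked up at request time.

    Because the lookup happens on every request, changing one fragment
    changes every page that includes it, with no republishing step.
    Unknown fragments expand to an empty string.
    """
    return INCLUDE.sub(lambda match: fragments.get(match.group(1), ""), page)


# Hypothetical story page that pulls in a shared comments widget.
page = '<h1>Story</h1><!--#include virtual="/includes/comments.html" -->'

normal = render(page, {"/includes/comments.html": "<div>comments</div>"})
# Emptying the shared fragment turns the feature off on every page.
degraded = render(page, {"/includes/comments.html": ""})
```

In this sketch, `degraded` contains only the story markup, which is the "turn off a site feature by editing one file" behaviour described above.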
![Page 38: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/38.jpg)
Use a Content Delivery Network
![Page 39: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/39.jpg)
Use Conditional GETs (If-Modified-Since)
![Page 40: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/40.jpg)
![Page 41: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/41.jpg)
Using Expiry and Validation
• Object has a TTL of 30 Seconds.
• Object has a last modified time of Jan 1, 2013 00:00:00
• Once TTL has expired, cache/CDN will check if object is updated.
• Origin will return "304 Not Modified" and cache will reset TTL and serve object from cache store.
• The 30-second TTL protects the origin from a deluge of "If-Modified-Since" requests.
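The expiry-and-validation flow above can be sketched in a few lines of Python. This is a toy model, not how any particular CDN is implemented: `TtlCache`, `make_static_origin`, and the tuple-returning origin function are assumptions made for illustration, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Toy origin signature: (path, if_modified_since) -> (status, body, last_modified)
Origin = Callable[[str, Optional[float]], Tuple[int, Optional[bytes], float]]


@dataclass
class CachedObject:
    body: bytes
    last_modified: float  # origin's last modified time (Unix timestamp)
    expires_at: float     # moment the TTL runs out


class TtlCache:
    def __init__(self, origin: Origin, ttl: float = 30.0) -> None:
        self.origin = origin
        self.ttl = ttl
        self.store: Dict[str, CachedObject] = {}
        self.origin_requests = 0  # counts how often the origin is contacted

    def get(self, path: str, now: float) -> bytes:
        entry = self.store.get(path)
        if entry is not None and now < entry.expires_at:
            return entry.body  # still fresh: the origin is not contacted

        # TTL expired (or first request): revalidate with a conditional GET.
        since = entry.last_modified if entry is not None else None
        self.origin_requests += 1
        status, body, last_modified = self.origin(path, since)

        if status == 304 and entry is not None:
            entry.expires_at = now + self.ttl  # reset TTL, reuse stored body
            return entry.body

        assert body is not None  # a 200 response carries the full object
        self.store[path] = CachedObject(body, last_modified, now + self.ttl)
        return body


def make_static_origin(body: bytes, last_modified: float) -> Origin:
    """Hypothetical origin serving one unchanging object."""
    def origin(path: str, if_modified_since: Optional[float]):
        if if_modified_since is not None and last_modified <= if_modified_since:
            return 304, None, last_modified
        return 200, body, last_modified
    return origin
```

With a 30-second TTL, any number of client requests inside the window cost the origin nothing, and each expiry costs one small 304 exchange instead of a full transfer.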
![Page 42: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/42.jpg)
![Page 43: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/43.jpg)
Use Last Mile Acceleration (GZIP Compression)
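As a rough illustration of the GZIP step, the sketch below compresses a response body only when the client advertises gzip support. The function name `compress_if_accepted` is hypothetical, and the `Accept-Encoding` parsing is deliberately simplified (it ignores `q` values), so treat it as a sketch rather than a complete negotiation.

```python
import gzip
from typing import Dict, Tuple


def compress_if_accepted(body: bytes, accept_encoding: str) -> Tuple[bytes, Dict[str, str]]:
    """Return (payload, headers), gzip-compressing when the client allows it."""
    # Simplified parsing: take each token before any ";q=..." parameter.
    encodings = [token.split(";")[0].strip().lower()
                 for token in accept_encoding.split(",")]
    # Vary tells caches that the response depends on Accept-Encoding.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in encodings:
        headers["Content-Encoding"] = "gzip"
        return gzip.compress(body), headers
    return body, headers


# Repetitive HTML compresses well, which is typical of page markup.
html = b"<p>breaking news</p>" * 200
payload, headers = compress_if_accepted(html, "gzip, deflate")
```

Text assets such as HTML, CSS, and JavaScript benefit most; already-compressed formats such as JPEG gain little.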
![Page 44: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/44.jpg)
Use persistent HTTP connections
![Page 45: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/45.jpg)
Use Appropriate Cache TTLs. Keep them simple!
![Page 46: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/46.jpg)
Keep tunable options at the origin
![Page 47: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/47.jpg)
Move personalization to the client
![Page 48: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/48.jpg)
Outcomes (Where we are now in 2013)
![Page 49: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/49.jpg)
Outcomes
• 2003 to 2010 – No need to grow origin
• 2010 to today – 9 origin web servers
  • HP DL360 G7
• Average 45-50% CPU utilization
• Capital cost for hardware? $15,000!
![Page 50: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/50.jpg)
Our secret sauce. (or how to serve 800M requests a day from 9 webservers)
![Page 51: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/51.jpg)
Offload (Bandwidth)
![Page 52: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/52.jpg)
Offload (Hits)
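Offload is the share of traffic the CDN answers without touching the origin. The arithmetic can be sketched as follows; only the 800M requests/day and 9 servers figures come from the slides, while the function names and any offload percentage you plug in are assumptions, since the actual offload graphs are not reproduced here.

```python
SECONDS_PER_DAY = 86_400


def average_rate(requests_per_day: float) -> float:
    """Average requests per second over a day."""
    return requests_per_day / SECONDS_PER_DAY


def offload_ratio(total_requests: float, origin_requests: float) -> float:
    """Fraction of requests served by the CDN instead of the origin."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 1.0 - origin_requests / total_requests


def per_server_rate(requests_per_day: float, offload: float, servers: int) -> float:
    """Average requests per second reaching each origin server."""
    origin_per_day = requests_per_day * (1.0 - offload)
    return origin_per_day / SECONDS_PER_DAY / servers


# 800M requests/day (figure from the slides) averages roughly 9,259 requests/s.
total_rate = average_rate(800_000_000)
```

Averages hide peaks, so this kind of estimate is a floor for capacity planning, not a target.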
![Page 53: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/53.jpg)
Scaling for Unpredictable Events
![Page 54: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/54.jpg)
Checking the last time a file has changed is faster than delivering that file to a user.
![Page 55: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/55.jpg)
Conditional GETs (304s) will save you.
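On the origin side, answering a conditional GET for a file on disk only needs the file's modification time. The sketch below shows one way that could look; `serve_static` is a hypothetical helper, not the configuration of any real web server, which would normally handle this for static files itself.

```python
import os
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def serve_static(path: str, if_modified_since: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a file, honouring If-Modified-Since."""
    # A stat() call is much cheaper than reading and sending the file.
    mtime = int(os.stat(path).st_mtime)
    headers = {"Last-Modified": formatdate(mtime, usegmt=True)}

    if if_modified_since:
        try:
            parsed = parsedate_to_datetime(if_modified_since)
            if parsed.tzinfo is None:
                parsed = parsed.replace(tzinfo=timezone.utc)
            since = parsed.timestamp()
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full response
        if since is not None and mtime <= since:
            return 304, headers, b""  # not modified: no body is sent

    with open(path, "rb") as handle:
        return 200, headers, handle.read()
```

A 304 response carries headers only, so during a traffic spike the origin spends its time on metadata lookups rather than on pushing bytes.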
![Page 56: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/56.jpg)
Make sure users don’t have to search for content
![Page 57: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/57.jpg)
Increase your TTLs
![Page 58: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/58.jpg)
Turn off dynamic components
![Page 59: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/59.jpg)
Scaling for predictable events
![Page 60: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/60.jpg)
Predicting traffic levels is impossible
![Page 61: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/61.jpg)
Some (loose) rules.
• Scheduled events don't peak as high as unpredictable ones.
• Scheduled events last longer, so the increase in traffic is spread out over hours, days, or weeks.
• Scheduled events are more "niche", unlike breaking news, where everyone wants to know what's going on.
• Might have to worry about 95/5 (95th percentile) billing and bandwidth overages.
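The 95/5 concern refers to 95th-percentile bandwidth billing: the provider samples your usage over the billing period, discards the top 5% of samples, and bills the highest remaining one. A minimal sketch, assuming equally spaced samples in Mbps (the function name and sample values are hypothetical, and providers differ in the exact rounding rule):

```python
from typing import Sequence


def billable_rate_95th(samples_mbps: Sequence[float]) -> float:
    """Return the 95th-percentile sample used for burstable billing."""
    if not samples_mbps:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_mbps)
    # Integer ceiling of 95% of the sample count, then convert to an index.
    keep = (95 * len(ordered) + 99) // 100
    return ordered[keep - 1]


# A short spike (5% of samples) is discarded and costs nothing extra.
short_spike = [1000.0] * 95 + [14000.0] * 5
# A spike lasting just over 5% of the period sets the billable rate.
long_spike = [1000.0] * 94 + [14000.0] * 6
```

This is why a breaking-news burst of a few minutes is usually free under 95/5 billing, while a scheduled event that runs for days can push the billable rate up.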
![Page 62: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/62.jpg)
How do you scale for write operations?
![Page 63: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/63.jpg)
We let someone else deal with that:
![Page 64: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/64.jpg)
In Summary…
![Page 65: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/65.jpg)
• Ensure your TTLs are appropriate
• Make sure your applications/content return Last-Modified headers.
• Don't be afraid to change your site to turn off components that aren't critical during high traffic periods.
• Keep tunables at the Origin. This allows you to make changes quickly without waiting for CDN propagation.
• A CDN will not replace or fix bad origin infrastructure!
![Page 66: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/66.jpg)
• Predicting the scale of a scheduled event is impossible. You will either overestimate or underestimate.
• Use previous traffic levels during unscheduled events as a high water mark.
• Don't be afraid to ask someone else (SaaS provider) to implement a feature that is not your core business/expertise.
![Page 67: How Do you Scale for both Predictable and Unpredictable Events on such a Large Scale?](https://reader035.vdocument.in/reader035/viewer/2022070322/559436d41a28ab9a628b45c4/html5/thumbnails/67.jpg)
Usenix Paper
http://tinyurl.com/lisa-paper