design for scale / surge 2010
DESCRIPTION
Christopher Brown's surgecon2010 talk on resilient, scalable systems based on his work on Amazon's EC2 and the Opscode Platform.
TRANSCRIPT
Copyright © 2010 Opscode, Inc - All Rights Reserved
‣ [email protected]
‣ @skeptomai
‣ www.opscode.com
Christopher Brown VP, Engineering
Design for Scale
Who am I?
• Amazon EC2
• Microsoft Edge Computing Network
• Opscode
Google, Amazon, Microsoft built their own tools
almost everyone else is here...
... inexperienced or poorly equipped for the world in which we now operate.
The Method
http://www.flickr.com/photos/wonderlane/2090966628/sizes/l/
Bootstrapping
Configuration
Command & Control
Nanite!
Got it?
Defining the cloud is like this...
Origin Myth of EC2
Dynamism
...not about excess capacity...
Dynamism
• Disintermediation: Developers can freely experiment
This is what you are paying for
• Isolation: Applications safely co-exist
• Utilization: Best use of expensive resources
Scale
You are not this BIG
You are not that BIG
• LAMP can scale on generic architecture
• 2008 - Facebook has over 800 memcached servers, with 28 terabytes of RAM
• 2010 - GitHub has 16 physical machines, 128 cores, 288 GB RAM
• Don’t design for A Million Users
• Ship early, Ship ugly, Ship often!
EC2 Design Principles
• Minimize management footprint
• Run in VMs just like customers.
• Forced to analyze what must run in privileged space
• “Harden everything” means separate network traffic inside the datacenter – customers and management run there
• True multi-tenancy - Customers run side-by-side
• Design by Fight Club
• “You are not a beautiful and unique snowflake”
• “On a large enough time line, the survival rate for everyone will drop to zero.”
http://www.flickr.com/photos/europedistrict/4058066840/
Primitives
• Simple API, single unit of work
• think of early Unix tools (MH)
• Can compose with other APIs
• Does not define policy / coupling
• Customers will surprise you
APIs, Mashups
http://www.flickr.com/photos/jfseesthings/4293062294/sizes/l/
Simplify
• Move complexity “up the stack”
• Easier to debug
• “Simple and Open” wins
• OAuth, OpenID
• ATOM, REST
• Example: EC2 Metadata - HTTP
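The "EC2 Metadata - HTTP" example refers to the instance metadata service, which an instance reads with a plain HTTP GET against a link-local address. The sketch below is one way to wrap that call; the helper names and the injectable `fetch` parameter are assumptions made for illustration, and it omits the session-token step that newer instance configurations may require.

```python
from urllib.request import urlopen

# Link-local address of the EC2 instance metadata service.
METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def metadata_url(key: str) -> str:
    """Build the URL for one metadata key, e.g. 'instance-id'."""
    return METADATA_BASE + key.lstrip("/")


def http_get(url: str, timeout: float = 2.0) -> str:
    """Plain HTTP GET; this only succeeds from inside an EC2 instance."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def get_metadata(key: str, fetch=http_get) -> str:
    """Read one metadata value.

    `fetch` is a hypothetical injection point so the sketch can be
    exercised offline with a stand-in function.
    """
    return fetch(metadata_url(key)).strip()


if __name__ == "__main__":
    # Offline demonstration with a stand-in fetcher and a made-up value.
    fake_fetch = lambda url: "i-0123456789abcdef0\n"
    print(get_metadata("instance-id", fetch=fake_fetch))
```

No client library is needed: anything that can issue an HTTP GET can read the data, which is the "simple and open" point of the slide.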
Cost
What are you willing to pay for?
• CapEx versus OpEx
• The Cloud is not “Cheaper”
• Do you have money, time, or experience?
Power
Nobody ever imagined a band of Orcs would steal a database table
Charles Stross - Halting State
MTTF & MTTR
Understanding how, when and why things fail is great ... but
If your Mean Time to Recover exceeds the time value of your data, your business is DEAD
http://www.flickr.com/photos/dierken/948171048/sizes/z/
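A small worked sketch of the MTTF and MTTR relationship, using the standard steady-state formula availability = MTTF / (MTTF + MTTR). The function names and the `data_value_window_hours` parameter (how long the data stays useful to the business) are assumptions introduced here, not terms from the talk.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the system is up."""
    return mttf_hours / (mttf_hours + mttr_hours)


def recovery_is_too_slow(mttr_hours: float, data_value_window_hours: float) -> bool:
    """True when recovery takes longer than the data stays valuable."""
    return mttr_hours > data_value_window_hours


if __name__ == "__main__":
    # Hypothetical numbers: fails about every 999 hours, 1 hour to recover.
    print(f"availability: {availability(999, 1):.4f}")
    # Same recovery time, but the data loses its value after 30 minutes.
    print("too slow:", recovery_is_too_slow(1.0, 0.5))
```

Halving MTTR improves availability about as much as doubling MTTF, and only MTTR is compared against the value window.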
Testing
• Test with production-like dataset and performance
• Don’t do “Design by Laptop”
• A/B Testing
• API versioning
Pull the Plug
• Create test environment
• Pull the plug
• Document
• Pull the plug again!
http://www.flickr.com/photos/rosipaw/5033284534/sizes/m/in/photostream/
Free your mind...
• Vertical vs Horizontal Scale
• Availability
• Reliability
• 99% vs 99.x% per unit?
Theo vs Morpheus
You are not Theo. You’re probably not Morpheus either.
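The "99% vs 99.x% per unit" bullet can be made concrete with basic probability. This is a minimal sketch that assumes units fail independently, which real systems only approximate; the function names are illustrative.

```python
def parallel_availability(unit: float, n: int) -> float:
    """Service is up if at least one of n redundant units is up."""
    return 1.0 - (1.0 - unit) ** n


def serial_availability(unit: float, n: int) -> float:
    """Service is up only if all n units in the request path are up."""
    return unit ** n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "redundant 99% units:", f"{parallel_availability(0.99, n):.6f}")
    for n in (1, 5, 10):
        print(n, "chained 99% units:", f"{serial_availability(0.99, n):.6f}")
```

Under the independence assumption, cheap 99% units placed side by side can beat one expensive high-availability unit, while chaining dependencies pulls the total down.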
Availability
• For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response.
• “Read globally, Write locally” with inconsistent cache
• Service Level Agreements, even (especially?) internally
Think Globally, Act Locally
• Global but inconsistent aggregate view
• Local action where data is authoritative
• Autonomy
• “Rightsizing” your failure domain
http://www.flickr.com/photos/28634332@N05/3872137437/sizes/m/in/photostream/
Distributed Systems Design
• Avoid execution caching
• “Don’t lie, don’t retry”
• Embrace failure
• Don’t block the client
• Avoid internal policy
• Ensure the system makes forward progress
Embrace Failure
• It’s OK to apologize
• It’s better to completely fail for some users than penalize all of them
• The Web is all about “Hit Refresh”
Apologize...to Pat Helland
Admission Control
• Distributed Throttling
• Staged / Pipeline with back pressure
• Measure scalability at each stage
• Degraded performance
• Make progress for admitted requests
• At odds with “stateless” / session-less
http://www.flickr.com/photos/jayneandd/4450623309/sizes/m/in/photostream/
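One way to picture admission control with back pressure is a pipeline stage with a bounded queue: requests are refused at the door when the stage is full, and everything already admitted is still carried through. The `Stage` class below is a hypothetical single-process sketch, not the mechanism used in EC2.

```python
import queue


class Stage:
    """A pipeline stage that admits work only while it has capacity."""

    def __init__(self, capacity: int):
        # capacity must be > 0; queue.Queue treats 0 as "unbounded".
        self._pending = queue.Queue(maxsize=capacity)
        self.rejected = 0

    def admit(self, request) -> bool:
        """Accept the request, or reject it immediately when full."""
        try:
            self._pending.put_nowait(request)
            return True
        except queue.Full:
            self.rejected += 1  # caller sees back pressure and can shed load
            return False

    def drain(self, handler) -> list:
        """Process every admitted request and return the results."""
        results = []
        while True:
            try:
                request = self._pending.get_nowait()
            except queue.Empty:
                return results
            results.append(handler(request))


if __name__ == "__main__":
    stage = Stage(capacity=2)
    print([stage.admit(i) for i in range(4)])  # later requests are refused
    print(stage.drain(lambda r: r * 10))
```

Rejecting early keeps latency bounded for admitted work; the `rejected` counter is the kind of per-stage measurement the slide asks for.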
Make Forward Progress
• MVCC, vector clocks, & reconciliation
• Don’t resurrect objects
• always go forward, never go back
• "name" is a property of an object, not its unique key
• Break the link, garbage collect later
• Model “degraded service” performance
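The vector clocks mentioned above can be sketched with plain dictionaries mapping node name to counter. This is a generic textbook formulation, not the Opscode or EC2 implementation; `tick`, `compare`, and `merge` are illustrative names.

```python
def tick(clock: dict, node: str) -> dict:
    """Return a copy of the clock with this node's counter advanced."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def compare(a: dict, b: dict) -> str:
    """Classify a relative to b: 'before', 'after', 'equal', 'concurrent'."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither version descends from the other
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def merge(a: dict, b: dict) -> dict:
    """Clock for a reconciled version: element-wise maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


if __name__ == "__main__":
    v1 = tick({}, "node-a")
    v2 = tick(v1, "node-a")   # descends from v1
    v3 = tick(v1, "node-b")   # written elsewhere, also from v1
    print(compare(v2, v1), compare(v2, v3), merge(v2, v3))
```

A "concurrent" result is the signal to reconcile; merging and then ticking yields a version that is after both inputs, so the system only moves forward.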
Request Signing
• Stateless - no session tracking to lose or to purge later
• X509 - only public information on front-end boxes. More secure against exploit
• Shared secret - faster, smaller signature but requires secret info close to request front-end
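A minimal sketch of the shared-secret variant using HMAC-SHA256 from the Python standard library. The canonical string layout (method, path, timestamp, body digest) is an assumption made for illustration and does not reproduce any particular service's signing scheme.

```python
import hashlib
import hmac


def canonical_request(method: str, path: str, timestamp: str, body: bytes) -> bytes:
    """Hypothetical canonical form: the fields the signature covers."""
    body_digest = hashlib.sha256(body).hexdigest()
    return "\n".join([method.upper(), path, timestamp, body_digest]).encode("utf-8")


def sign(secret: bytes, method: str, path: str, timestamp: str, body: bytes) -> str:
    """Client side: compute the signature sent along with the request."""
    message = canonical_request(method, path, timestamp, body)
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, method: str, path: str, timestamp: str,
           body: bytes, signature: str) -> bool:
    """Server side: recompute and compare in constant time."""
    expected = sign(secret, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    secret = b"example-shared-secret"  # made-up value
    sig = sign(secret, "GET", "/nodes", "2010-09-30T12:00:00Z", b"")
    print(verify(secret, "GET", "/nodes", "2010-09-30T12:00:00Z", b"", sig))
```

Each request carries everything needed to authenticate it, so no session state is kept. The trade-off from the slide is visible here: `verify` needs the secret itself, whereas a public-key scheme would need only public material on the front end.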
Measure, Monitor, Respond
• Save *everything* *forever*
• Histograms / Pareto Chart
• tp99.9, tp99, and tp90
• ignore tp50, “average”
• http://en.wikipedia.org/wiki/Control_chart
• http://www.newrelic.com/
• http://www.splunk.com/
• skewness, kurtosis
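The tp90, tp99, and tp99.9 figures are percentiles of the latency distribution. The sketch below uses the nearest-rank definition, which is one of several common percentile conventions; the `tp` helper and the sample data are made up for illustration.

```python
import math
import statistics


def tp(samples, percentile: float) -> float:
    """Nearest-rank percentile: smallest value with at least
    `percentile` percent of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical latencies in ms: 990 fast requests, 10 very slow ones.
    latencies = [10] * 990 + [2000] * 10
    print("mean  :", statistics.mean(latencies))
    for p in (50, 90, 99, 99.9):
        print(f"tp{p:<5}:", tp(latencies, p))
```

With this sample the mean and tp50 look healthy while tp99.9 exposes the slow requests, which is why the slide says to ignore tp50 and "average".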
Control Chart
• Day over Day
• Same Day, Year over Year
• Confidence Intervals
“Shewhart stressed that bringing a production process into a state of statistical control, where there is only common-cause variation, and keeping it in control, is necessary to predict future output and to manage a process economically.”
• http://en.wikipedia.org/wiki/Control_chart
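A control chart flags points that fall outside limits derived from a baseline period. The sketch below uses the baseline mean plus or minus three sample standard deviations as a simplification; textbook individuals charts usually estimate sigma from moving ranges. All names and numbers are illustrative.

```python
import statistics


def control_limits(baseline, sigmas: float = 3.0):
    """Return (lower, center, upper) limits from a baseline sample."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(baseline, observations, sigmas: float = 3.0):
    """Observations outside the limits: candidates for special-cause variation."""
    lower, _, upper = control_limits(baseline, sigmas)
    return [x for x in observations if x < lower or x > upper]


if __name__ == "__main__":
    # Hypothetical requests/sec at the same hour on previous days.
    baseline = [100, 102, 98, 101, 99]
    today = [100, 104, 106, 94]
    print(control_limits(baseline))
    print(out_of_control(baseline, today))
```

Choosing the baseline is where "day over day" and "same day, year over year" come in: compare like with like so that normal periodicity is not flagged as an anomaly.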
Characteristic Curves
Periodicity
SLA, Variance, Troubleshooting
Data Taxonomy
• Precious
• Cachable
• Expensive
• Cheap
Consistency
• Authoritative vs. Consultative
• is_authorized? vs list group
Performance
• Call length
• Cyclomatic Complexity
• Request ID flow
• Vertical vs Horizontal Scale
• tension between unit performance and scalability
Failure Domains
• EC2 “droplets”
• EC2 DNS
• Coordinator zones
Still with me?
Successes
• Sharable “AMI”s
• Metadata (Simple and open again)
• Open API (think Eucalyptus)
• No API throttling
• Primitives
• Pay-as-you-go
• Free traffic between S3 and EC2
• Data and Compute together
Failures
• SOAP makes little girls cry
• Amazon Web Services, circa 2006, was > 75% REST or Query
• SOAP well supported by commercial vendors, with their libraries
• Still *Way* too hard to use.
• Commodity business. Driving the bottom out of cost causes quality to suffer.
• API vs UI?, User Experience in general
• IaaS (Infrastructure as a Service) is insufficient by itself
• a hangman's noose. EC2, and the other offerings,
Where are we going?