jez humble qcon london 2011gotocon.com/dl/qcon-london-2011/slides/jezhumble... · 2011. 3. 11. ·...
TRANSCRIPT
-
remediation patterns Jez Humble
QCon London 2011
[email protected] @jezhumble #continuousdelivery
1Thursday, March 10, 2011
-
ITIL: “Recovery to a known state after a failed Change or Release.”
Recovery: “Returning a Configuration Item or an IT Service to a working state.”
Jez: “Fixing shit when it breaks”
remediation
2Thursday, March 10, 2011
http://www.knowledgetransfer.net/dictionary/ITIL/en/Recovery.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/Recovery.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/Change.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/Change.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/Release.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/Release.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/Configuration_Item.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/Configuration_Item.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/IT_Service.htmhttp://www.knowledgetransfer.net/dictionary/ITIL/en/IT_Service.htm
-
prevention
patterns for low-risk release
patterns for incremental delivery
strategies for remediation
3Thursday, March 10, 2011
-
1oz of prevention
4Thursday, March 10, 2011
-
deployment pipeline
5Thursday, March 10, 2011
-
testing on production environments
creating maintainable acceptance tests
testing cross-functional requirements
the hard bits
6Thursday, March 10, 2011
-
automate provisioning and deployment
ensure devs, testers and ops collaborate throughout
reducing release risk
7Thursday, March 10, 2011
-
canary releasing
8Thursday, March 10, 2011
-
canary releasing
9Thursday, March 10, 2011
-
reduce risk of release
multivariant testing
performance testing
canary releasing
10Thursday, March 10, 2011
-
immune system
what if someone replaced your “buy” button with spacer.gif?
T cells http://www.flickr.com/photos/gehealthcare/3326186490/11Thursday, March 10, 2011
http://www.flickr.com/photos/gehealthcare/3326186490/http://www.flickr.com/photos/gehealthcare/3326186490/
-
Business metrics - revenue, # orders, # users
Ops metrics - changes, incidents, TTD, TTR, TBF
Technical metrics - TPS, response time, hits
monitoring
http://www.flickr.com/photos/wwarby/3296379139/
12Thursday, March 10, 2011
http://www.flickr.com/photos/wwarby/3296379139/http://www.flickr.com/photos/wwarby/3296379139/
-
root cause analysis
collboration
data
the hard bits
13Thursday, March 10, 2011
-
incremental delivery
John Allspaw: “Ops Metametrics” http://slidesha.re/dsSZIr
14Thursday, March 10, 2011
http://slidesha.re/dsSZIrhttp://slidesha.re/dsSZIr
-
incremental deployments
develop on mainline
feature toggles and branch by abstraction
dark launching
incremental delivery
15Thursday, March 10, 2011
-
feature toggles
[featureToggles]wobblyFoobars: trueflightyForkHandles: false
Config File
... various UI elements
some.jsp
forkHandle = (featureConfig.isOn(‘flightlyForkHandles)) ? new FlightyForkHander(aCandle) : new ForkHandler(aCandle)
other.java
Stolen from Martin Fowler
16Thursday, March 10, 2011
-
branch by abstraction
Component A
Component B
Seam
Component A
17Thursday, March 10, 2011
-
branch by abstraction
Component AComponent A
Component B’
Abstraction layer
Component B
18Thursday, March 10, 2011
-
incremental deployment
STATIC CONTENT
/static/1.1
/static/1.0
DEPENDENT SERVICE
1.0 1.1
Abstraction layer Abstraction layer
APPLICATION
Database
Router /Load balancer
Interwebs
19Thursday, March 10, 2011
-
dark launching
20Thursday, March 10, 2011
-
dark launching
21Thursday, March 10, 2011
-
How long would it take you to release a change to a single line of code?
Ops metrics - changes, incidents, TTD, TTR, TBF
If your data center blew up, how long would you take to restore service?
measuring effectiveness
22Thursday, March 10, 2011
-
questionsJez Humble
[email protected] @jezhumble #continuousdelivery
23Thursday, March 10, 2011
mailto:[email protected]:[email protected]