Software operability and run book collaboration London Feb 2014
![Page 1: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/1.jpg)
#unidevops
Software Operability,
Run Book Collaboration,
and DevOps
Matthew Skelton
27th February 2014
DevOps Summit,
London, UK
www.devopssummit.com
@matthewpskelton
softwareoperability.com
![Page 2: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/2.jpg)
Agenda
• Software Operability
• Run Book Collaboration
• Making Operability Work
• Questions
![Page 3: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/3.jpg)
Background
• Software systems since 1998
• Continuous Delivery specialist, DevOps enthusiast, Operability nut
• London Continuous Delivery meetup group - londoncd.org.uk
• Experience DevOps workshops
• PIPELINE Conference
![Page 4: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/4.jpg)
Software
Operability
![Page 5: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/5.jpg)
Software Operability
• Definitions
• Examples
• Why focus on operability?
• How DevOps can help
![Page 6: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/6.jpg)
Operability?
![Page 7: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/7.jpg)
Etymology of Operability?
• Cognates:
– Opera
– Operate
– Operational
– Inter-operability
![Page 8: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/8.jpg)
![Page 9: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/9.jpg)
Software Operability
• Operability: the properties of a system which make it work well in Production
![Page 10: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/10.jpg)
Operable Systems
Since 1929,
Mallorca, Spain
![Page 11: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/11.jpg)
Software Operability
• David Copeland (@davetron5000):
“How your software runs in production is all that matters. The most amazing abstractions, cleanest code, or beautiful algorithms are meaningless if your code doesn’t run well on production.”
• http://www.naildrivin5.com/blog/2013/06/16/production-is-all-that-matters.html
![Page 12: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/12.jpg)
Operational Criteria
• Deploy
• Monitor
• Diagnose
• Debug
• Query
• Control
• Inspect
• Clear
• ...
![Page 13: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/13.jpg)
“Non-Functional”
![Page 14: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/14.jpg)
Shaped by Operability
• Hooks (internal APIs) for:
– Logging
– Monitoring
– Diagnostics
– Health checks
– Data clear-down
– Service / daemon / container control
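As a rough illustration of what such hooks might look like, here is a minimal Python sketch of a registry that exposes health checks and basic diagnostics for one service. The class name `OperabilityHooks`, its methods, and the `orders` service are hypothetical and do not come from the talk; a real system would likely expose these over HTTP or a management port.

```python
import json
import logging
import time
from dataclasses import dataclass, field
from typing import Callable, Dict

logger = logging.getLogger("operability")


@dataclass
class OperabilityHooks:
    """Hypothetical registry of operational hooks for a single service."""

    service_name: str
    started_at: float = field(default_factory=time.time)
    _checks: Dict[str, Callable[[], bool]] = field(default_factory=dict)

    def register_check(self, name: str, check: Callable[[], bool]) -> None:
        """Register a named health check that returns True when healthy."""
        self._checks[name] = check

    def health(self) -> dict:
        """Run every registered check; a check that raises counts as failed."""
        results = {}
        for name, check in self._checks.items():
            try:
                results[name] = bool(check())
            except Exception:
                # Log the failure so Ops can diagnose it, then carry on.
                logger.exception("health check %r raised", name)
                results[name] = False
        return {
            "service": self.service_name,
            "healthy": all(results.values()),
            "checks": results,
        }

    def diagnostics(self) -> dict:
        """Return basic facts an operator might want to inspect."""
        return {
            "service": self.service_name,
            "uptime_seconds": round(time.time() - self.started_at, 1),
            "registered_checks": sorted(self._checks),
        }


if __name__ == "__main__":
    hooks = OperabilityHooks("orders")
    hooks.register_check("database", lambda: True)
    hooks.register_check("queue", lambda: False)
    print(json.dumps(hooks.health(), indent=2))
    print(json.dumps(hooks.diagnostics(), indent=2))
```

Treating an exception inside a check as "unhealthy" rather than letting it propagate is one possible design choice: it keeps the health endpoint itself available while a dependency is failing.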
![Page 15: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/15.jpg)
Ops Folk are Users Too!
![Page 16: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/16.jpg)
![Page 17: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/17.jpg)
Why focus on Operability?
• Deploy more rapidly, frequently
• High cost of Production outage
• Systems now more complicated
![Page 18: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/18.jpg)
Outages are Embarrassing!
![Page 19: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/19.jpg)
Operational considerations
![Page 20: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/20.jpg)
Operational considerations
![Page 21: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/21.jpg)
Operational considerations
![Page 22: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/22.jpg)
How DevOps can help
• DevOps is one way to address poor operability
• Improved collaboration and communication between Dev teams and Ops teams
• Example: Run Book Collaboration
![Page 23: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/23.jpg)
Run Book
Collaboration
![Page 24: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/24.jpg)
Run Book Collaboration
• Feedback loops and learning
• What is a run book?
• How can run book collaboration help operability?
![Page 25: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/25.jpg)
Feedback Loops
Gene Kim:
http://itrevolution.com/the-three-ways-principles-underpinning-devops/
![Page 26: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/26.jpg)
Run Book
![Page 27: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/27.jpg)
Templates
![Page 28: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/28.jpg)
Example
• 1 Table of Contents
• 2 System Overview
  – 2.1 Service Overview
  – 2.2 Contributing Applications, Daemons, and Windows Services
  – 2.3 Hours of Operation
  – 2.4 Execution Design
  – 2.5 Infrastructure and Network Design
  – 2.6 Resilience, Fault Tolerance and High-Availability
  – 2.7 Throttling and Partial Shutdown
  – 2.8 Required Resources
  – 2.9 Expected Traffic and Load
    • 2.9.1 Hot or Peak Periods
    • 2.9.2 Warm Periods
    • 2.9.3 Cool or Quiet Periods
  – 2.10 Environmental Differences
  – 2.11 Tools
• 3 Security and Access Control
• 4 System Configuration
  – 4.1 Configuration Management
• 5 System Backup and Restore
  – 5.1 Backup Requirements
    • 5.1.1 Special Files
  – 5.2 Backup Procedures
  – 5.3 Restore Procedures
• 6 Monitoring and Alerting
  – 6.1 Error Messages
  – 6.2 Events
  – 6.3 Health Checks
  – 6.4 Other Messages
• 7 Operational Tasks
  – 7.1 Deployment
  – 7.2 Batch Processing
  – 7.3 Power Procedures
  – 7.4 Routine Checks
    • 7.4.1 System Rebuilds
  – 7.5 Troubleshooting
• 8 Maintenance Tasks
  – 8.1 Maintenance Procedures
    • 8.1.1 Patching
      – 8.1.1.1 Normal Cycle
      – 8.1.1.2 Zero-Day Vulnerabilities
    • 8.1.2 GMT/BST time changes
    • 8.1.3 Cleardown Activities
      – 8.1.3.1 Log Rotation
  – 8.2 Testing
    • 8.2.1 Technical Testing
    • 8.2.2 Post-Deployment
• 9 Failure and Recovery Procedures
  – 9.1 Failover
  – 9.2 Recovery
  – 9.3 Troubleshooting Failover and Recovery
• 10 Contact Details
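One way a team might work through a template like this is to keep the top-level headings as data and track which ones Dev and Ops have not yet discussed. The Python sketch below is an assumption-laden illustration: the section titles are taken from the slide, but the function names (`new_run_book`, `unanswered`, `render`) and the "TODO" wording are invented for this example.

```python
# Top-level headings from the example template, sections 2 to 10
# (section 1 is the table of contents itself).
RUN_BOOK_SECTIONS = [
    "System Overview",
    "Security and Access Control",
    "System Configuration",
    "System Backup and Restore",
    "Monitoring and Alerting",
    "Operational Tasks",
    "Maintenance Tasks",
    "Failure and Recovery Procedures",
    "Contact Details",
]


def new_run_book() -> dict:
    """Create an empty run book: every section starts with no notes."""
    return {title: "" for title in RUN_BOOK_SECTIONS}


def unanswered(run_book: dict) -> list:
    """List the sections that still have no notes, in template order."""
    return [t for t in RUN_BOOK_SECTIONS if not run_book.get(t, "").strip()]


def render(run_book: dict) -> str:
    """Render the run book as plain text, flagging empty sections."""
    lines = []
    for number, title in enumerate(RUN_BOOK_SECTIONS, start=2):
        lines.append(f"{number} {title}")
        body = run_book.get(title, "").strip()
        lines.append(f"    {body or 'TODO: discuss with Dev and Ops'}")
    return "\n".join(lines)


if __name__ == "__main__":
    book = new_run_book()
    book["Contact Details"] = "On-call rota held by the product team"
    print(render(book))
    print("Still to discuss:", ", ".join(unanswered(book)))
```

The list of unanswered sections is arguably the more useful output here, since it gives the two teams an agenda for their next conversation.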
![Page 29: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/29.jpg)
Example
• 1 Table of Contents
• 2 System Overview
  – 2.1 Service Overview
  – 2.2 Contributing Applications, Daemons, and Windows Services
  – 2.3 Hours of Operation
  – 2.4 Execution Design
  – 2.5 Infrastructure and Network Design
  – 2.6 Resilience, Fault Tolerance and High-Availability
  – 2.7 Throttling and Partial Shutdown
  – 2.8 Required Resources
  – 2.9 Expected Traffic and Load
• 3 Security and Access Control
• 4 System Configuration
• 5 System Backup and Restore
• 6 Monitoring and Alerting
• 7 Operational Tasks
• 8 Maintenance Tasks
• 9 Failure and Recovery Procedures
• 10 Contact Details
![Page 30: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/30.jpg)
Example
2.1 Service Overview
2.2 Contributing Applications, Daemons, and Windows Services
2.3 Hours of Operation
2.4 Execution Design
2.5 Infrastructure and Network Design
2.6 Resilience, Fault Tolerance and High-Availability
2.7 Throttling and Partial Shutdown
2.8 Required Resources
2.9 Expected Traffic and Load
![Page 31: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/31.jpg)
It’s Not Documentation
![Page 32: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/32.jpg)
Focus on Collaboration
![Page 33: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/33.jpg)
Outcomes
• Better understanding
• Better cross-team working
• Reduction in operational problems
• Fewer outages
• Reduced long-term cost-of-ownership
![Page 34: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/34.jpg)
Run Book as Collaboration
• Focus on the collaboration
• Run book is a means, not an end
• Throw it away when complete (?)
• Aim to automate more over time
• See http://runbookcollab.info/
![Page 35: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/35.jpg)
Making Operability
Work
![Page 36: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/36.jpg)
Making Operability Work
• NFRs vs Operational Features
• Budget changes
• Organisation changes
• Responsibility changes
• Avoid on-call anti-patterns
![Page 37: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/37.jpg)
“Non-Functional”
![Page 38: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/38.jpg)
Operational Features
Features
![Page 39: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/39.jpg)
Taking Operability Seriously
• Single product backlog
– End-user + Operational features
– New features + bugs
• Product Owner on call
– Accountable for operational failures
– Seriously!
![Page 40: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/40.jpg)
![Page 41: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/41.jpg)
Budget changes
• “What is your budget code?”
• Capex vs. Opex?
• Remove budget barriers to regular, effective communication
![Page 42: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/42.jpg)
Niek Bartholomeus (@niekbartho) - http://niek.bartholomeus.be/
https://speakerdeck.com/niekbartho/self-organization-vs-global-optimization-a-comparison-between-traditional-and-modern-organizations
![Page 43: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/43.jpg)
Organisation changes
• “I’ll need to ask my manager first”
• Lack of autonomy
• Remove reporting barriers to regular, effective communication
• More at http://bit.ly/DevOpsTopologies
![Page 44: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/44.jpg)
“I just want to write code”
![Page 45: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/45.jpg)
Mysterious Coding Tricks
![Page 46: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/46.jpg)
On-call for Responsibility
![Page 47: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/47.jpg)
On-call Anti-Patterns
• Too much overtime pay
• Too little overtime pay
• Rota team too small
• No training in incident response
• No team ownership of product
• No team autonomy for changes
![Page 48: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/48.jpg)
On call - Goal
• Team members want to help make things better
• Empowered to fix problems
• Reduce the times they are woken up
![Page 49: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/49.jpg)
The operability of operability
• Operational Features, not “NFRs”
• Sustainable collaboration
• Sensible, fair on-call rotas
• Over-compensate in time off
• Avoid burn-out
![Page 50: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/50.jpg)
Recapitulation
![Page 51: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/51.jpg)
Software Operability
Making software
systems work well
in Production
![Page 52: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/52.jpg)
Run Book Collaboration
Shared focus on operability throughout the delivery cycle
![Page 53: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/53.jpg)
Making Operability Operable
Use DevOps team patterns for sustainable operability
![Page 54: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/54.jpg)
What’s Next?
![Page 55: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/55.jpg)
Further Reading
• Patterns for Performance and Operability
– Ford, Gileadi, Purba, Moerman
• http://whoownsmyoperability.com/
– Recommended reading lists
![Page 56: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/56.jpg)
Further Reading
• Release It! – Michael Nygard (@mnygard)
• http://www.michaelnygard.com/
![Page 57: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/57.jpg)
Operability Book
• Software Operability
– How to make software work well in Production
– Due early late 2014
• Sign up at OperabilityBook.com
• Discount code for DevOps Summit attendees
![Page 58: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/58.jpg)
Experience DevOps
• A hands-on workshop for DevOps culture
• Forthcoming dates:
– London: 28th February 2014
• http://experiencedevops.org/
![Page 59: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/59.jpg)
PIPELINE
• Continuous Delivery
• ‘Unconference’ format
• Tuesday 8th April 2014
• London, UK
• http://pipelineconf.info/
• @PipelineConf
![Page 60: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/60.jpg)
Questions &
Discussion
Matthew Skelton
@matthewpskelton
softwareoperability.com
operabilitybook.com
bit.ly/DevOpsTopologies
![Page 61: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/61.jpg)
Acknowledgements
http://pianofortekeys.files.wordpress.com/2013/04/ariadnne_wideweb__470x3300.jpg
http://www.blinkenlights.nl/images/blinkenlights-big.jpeg
http://www.danatronics.com/sdb_apps.html
http://riverbankoftruth.com/wp-content/uploads/2013/07/embarrassed-chimp22.jpg
http://www.thinkgeek.com/edm/20040709.html
http://indianaohindiana.com/wp-content/uploads/2013/10/Tome.jpg
http://www.guavaworks.com/company-blog/guava-doesnt-do-cookie-cutter.html
http://www.carpages.co.uk/ford/ford-sand-sculptures-05-09-11.asp
http://www.thisismoney.co.uk/money/experts/article-2324270/Take-smaller-pension-pots-tax-free-leave-final-salary-untouched.html
http://paranoidnews.org/wp-content/uploads/2010/10/Alien-Hunt-Alarm-Clock.jpg
http://particulations.blogspot.co.uk/2010/08/headingley-hole.html
http://marvel.wikia.com/Stephen_Strange_(Earth-616)
![Page 62: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/62.jpg)
Further Slides
![Page 63: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/63.jpg)
The Phoenix Project
![Page 64: Software operability and run book collaboration London Feb 2014](https://reader033.vdocument.in/reader033/viewer/2022060109/55534e6cb4c90503618b51f0/html5/thumbnails/64.jpg)
Continuous Delivery