how to make dev ops work @ netlight edge x berlin
How to Make DevOps Work
Till Krullmann
About Me
Till Krullmann
from Munich
7 years in IT consulting
with Netlight since 2013
Software development / architecture background
1 year in DevOps-related roles
About this Talk
Structure
About DevOps
Continuous Delivery
Service-Level Objectives
Organizing Work
Driving DevOps Culture
DevOps
DevOps
is ...                                          isn't ...
a direction                                     a destination
tearing down the "fence" between dev & ops      moving or repainting the fence
a cultural meme                                 a person you hire
applying Agile ideas to Operations              "Operations 2.0"
enabling devs to do "self-service" operations   putting extra work on developers' shoulders
5 DevOps Tenets
Operations Perspective
Take an operations perspective when looking at requirements. Consider service-level requirements like availability, latency, and security.
Developer Responsibility
You build it, you run it - developers' responsibilities do not end when code is committed. Devs are involved in all steps including deployment and incident handling.
Deployment Process
There is one deployment process that everyone is aware of, agrees on and adheres to. The process is documented and is subject to further refinement and evolution.
Continuous Delivery
Release often and with minimal need for human interaction. Automate as much as possible.
Infrastructure as Code
Use scripts and configuration files instead of manual configuration steps. Develop infrastructure code with the same set of practices as application code (e.g. version control, code reviews and tests).
Continuous Delivery
Continuous Delivery
Make small, frequent releases
Start releasing into "production" as early as possible
Create a build pipeline with a high degree of automation
Keep everything under version control
Release Trains & Feature Toggles
Source: https://labs.spotify.com/2014/03/27/spotify-engineering-culture-part-1/
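The slide itself only shows a picture, so the following is a minimal sketch of the feature-toggle idea rather than anything taken from the talk or from Spotify: code for an unfinished feature ships with every release but stays hidden until its toggle is switched on. The class `FeatureToggles`, the function `render_home` and the toggle name `new_recommendations` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureToggles:
    """In-memory toggle registry (hypothetical; real setups often load this from config)."""
    enabled: dict[str, bool] = field(default_factory=dict)

    def is_enabled(self, name: str) -> bool:
        # Unknown toggles default to off, so unfinished code stays hidden.
        return self.enabled.get(name, False)


def render_home(toggles: FeatureToggles) -> list[str]:
    """Return the sections shown on a hypothetical home page."""
    sections = ["playlists", "search"]
    if toggles.is_enabled("new_recommendations"):
        # The code is deployed with the release, but only visible once toggled on.
        sections.append("recommendations")
    return sections


toggles = FeatureToggles({"new_recommendations": False})
print(render_home(toggles))  # feature is deployed but hidden

toggles.enabled["new_recommendations"] = True
print(render_home(toggles))  # same build, feature now visible
```

Because the toggle is data rather than code, switching a feature on or off does not require a new release.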
Architecture
The system architecture should facilitate the delivery process.
The team structure is reflected in the product structure and deployment process.
An approach based on microservices and containerization is well-suited for small, autonomous teams.
Service-Level Objectives
Service-Level Terminology
Service-Level Indicator (SLI)
A quantitative measure of some aspect of the level of service that is provided.
Examples: request latency, throughput, availability, error rate
Service-Level Objective (SLO)
A target value or range for a service level that is measured by an SLI.
Examples:
99% ≤ availability ≤ 99.9%
request latency < 500ms
Service-Level Agreement (SLA)
An agreement with the users of the service that includes consequences of meeting (or missing) the SLOs it contains.
Defining SLOs - Precision
"The request latency should be below 500ms"
vs.
"The request latency of all HTTP GET requests that return a status code 200 should be below 500ms"
vs.
"The 90th percentile of the request latency of HTTP GET requests that return a status code 200, aggregated over 1-minute intervals, should be below 500ms"
Defining SLOs - Guidelines
Define SLOs early
SLOs can provide a valuable base for architectural decisions and planning.
Keep it simple
Avoid overly complicated aggregations when defining SLOs.
Have as few SLOs as possible
If you can't win an argument by quoting an SLO, it's probably not worth keeping.
Avoid absolutes
No system will ever be "infinitely scalable" or "always available".
Don't reach too high
It's better to start with a loose target and tighten it later.
Monitoring
Alerts
The system can tell us when it's broken, or about to break soon.
Some monitoring tools can also perform simple actions to repair the system automatically.
Analysis
Monitoring can help analyze bugs and other incidents by looking at what happened at the same time.
Visibility
Set up monitoring dashboards as early as possible!
This also helps the team understand DevOps as an integral part of the delivery process.
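The distinction in the Alerts point between "broken" and "about to break soon" is often expressed as two thresholds on the same metric. The sketch below is a minimal illustration of that idea, not a description of any particular monitoring tool. The `Severity` levels, the `evaluate` function, the disk-usage metric and the threshold values are all assumptions made for the example.

```python
from enum import Enum


class Severity(Enum):
    OK = "ok"
    WARNING = "warning"    # "about to break soon"
    CRITICAL = "critical"  # "broken"


def evaluate(value: float, warning_at: float, critical_at: float) -> Severity:
    """Classify a metric where higher values are worse (e.g. disk usage in %)."""
    if value >= critical_at:
        return Severity.CRITICAL
    if value >= warning_at:
        return Severity.WARNING
    return Severity.OK


# Hypothetical disk-usage samples in percent, checked against assumed thresholds.
for usage in (72.0, 85.0, 97.0):
    level = evaluate(usage, warning_at=80.0, critical_at=95.0)
    print(f"disk usage {usage:.0f}% -> {level.value}")
```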
Organizing Work
DevOps & Agile
User Stories
Try to fit DevOps stories into the development process (e.g. Scrum)
Developers are stakeholders too!
Process
Experiment with different approaches to DevOps work. Scrum may not always be the best methodology.
DevOps-related roles
DevOps Engineer
Sets up and tailors the tools for the development team
Builds and maintains the continuous delivery pipeline
Sets up and maintains infrastructure for automated tests
Automates provisioning and deployment
Sets up and configures monitoring
DevOps Architect
Addresses cross-team technical concerns: releases, versioning, 3rd-party integration...
May work as a technical counterpart to the product owner
DevOps Coach
Spreads DevOps culture in the team
Analyzes and assesses current processes
Supports developers in taking on operations-oriented tasks
Hiring for DevOps
Developer                                     Systems administrator
knows the application / architecture          knows the infrastructure
likes experiments & cutting edge technology   likes stability & predictability
self-service mentality                        gatekeeper mentality
focus on broad skills                         focus on specialized skills
ensures quality by automated tests            assumes quality based on experience
research once, then automate                  tweak manually, automate if needed
80%                                           20%
Team Structure
Autonomous teams - should cover the entire delivery process
Team structure always influences the product structure, and vice versa
At least 1 DevOps support person in every team
Don't have a "DevOps team"!
Definition of Done
Deploy it to production when it's done
vs.
It's done when it runs in production
Discussion points
Do I still need an Operations team?
Do I need a DevOps Product Owner?
Driving DevOps Culture
Explore
Create platforms where people can try out new stuff, and show off their creations.
Creating a side project in a small team and seeing it through to production makes a great DevOps exercise.
Trying to automate something is a great way to learn how it works.
Fail fast, fail often - failure should be embraced as a lesson learned.
Trust
Give everyone in the team access to everything. "Trust barriers" kill motivation.
Empower the team by giving them freedom and responsibilities. Don't micromanage.
Cross-pollination is a great driver of culture. Decree from above is not.
Transparency
If you do not have full knowledge of your system, it will take much longer to investigate or fix problems.
Everything concerning operations and infrastructure should be clear and transparent for everyone.
Scripting and automation lead to a self-documenting system.
Questions to ask:
How quickly can someone I hire today start working productively?
How good is my bus factor?
Practices
Error Budget
Define an SLO, e.g. 99% availability.
As long as you're above that goal, keep pushing out new features, even though they may destabilize the system.
As soon as you drop below the threshold, stall or slow down releases and focus on stability.
Rotating Roles
Put a team member (or one team in a multi-team environment) on operations-related duties.
Rotate in fixed intervals (e.g. each sprint).
Ops Hat
In planning or estimation sessions, one participant explicitly takes the "ops hat" and puts extra scrutiny on operations-relevant aspects.
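The Error Budget practice above boils down to simple arithmetic, which the following sketch illustrates. It assumes availability is measured as the share of successful requests, which the talk does not specify. The function names `budget_remaining` and `release_decision` and the request counts are hypothetical.

```python
def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once it is overspent)."""
    if not 0 < slo_target < 1 or total_requests <= 0:
        raise ValueError("need 0 < slo_target < 1 and total_requests > 0")
    # With a 99% target, 1% of all requests are allowed to fail.
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


def release_decision(slo_target: float, total_requests: int, failed_requests: int) -> str:
    """Ship while there is budget left, otherwise stabilize."""
    if budget_remaining(slo_target, total_requests, failed_requests) > 0:
        return "ship features"
    return "focus on stability"


# 99% availability over 100,000 requests allows roughly 1,000 failures.
print(release_decision(0.99, 100_000, 400))    # ship features
print(release_decision(0.99, 100_000, 1_500))  # focus on stability
```

Expressing the budget as a fraction rather than a yes/no flag also lets a team slow down gradually as the budget runs low, instead of stopping releases abruptly.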
Resources
Site Reliability Engineering - How Google Runs Production Systems
DevOps: A Software Architect's Perspective
Spotify Resources
Spotify Engineering Culture - Part 1
https://labs.spotify.com/2014/03/27/spotify-engineering-culture-part-1/
Spotify Engineering Culture - Part 2
https://labs.spotify.com/2014/09/20/spotify-engineering-culture-part-2/
Netflix Resources
DevOps at Netflix (2012 re:Invent talk)
http://www.slideshare.net/jedberg/devops-at-netflix-reinvent
Beyond DevOps: How Netflix Bridges the Gap (2015 QCon talk)
https://www.infoq.com/presentations/netflix-operations-devops
Thank you!
Feel free to contact me or send feedback at [email protected]