cloud-native workshop nyc - the cloud-native landscape
TRANSCRIPT
pivotal.io/roadshow #cnr
Cloud-Native Roadshow New York City
Pivotal
“Our Mission is to transform how the world builds software.”
Cameron Stewart @cws322
Casey West @caseywest
Yes
Will I get a copy of these materials?
Yes
Jamie Dimon, CEO JPMC Source: JPMC Annual Shareholder Letter (2015)
“Silicon Valley is coming… and they want to eat our lunch.”
Casey West
“We’re from Silicon Valley. We brought lunch.”
The Pivotal Cloud Foundry Ecosystem
Pivotal Google Cloud
Google Cloud
Team Google
Cloud Marketing Software Engineer Software Engineer
Meaghan Kjelland
Solutions Stuff
Jay Marshall Nicole Rogers Colleen Briant
Solace
• Open protocol-based
• Hybrid cloud ready
• Proven
Monitoring redefined. Every user, every app, everywhere. AI powered, full stack, automated.
Full lifecycle: development, test, and production
The Cloud Foundry Ecosystem
What is Cloud-Native?
Cloud-Native is
• Composable Architectures
• Automated Process
• Collaborative Culture
• Structured Platform
Architecture Process Culture Platform
• Microservices
• Functions as a Service a.k.a. “serverless”
• Migration from Monolith to µServices
• Spring Boot
Your architecture plays a key role in your operational maturity.
• Test Driven Development
• Continuous Delivery
• Automated Software Delivery Life Cycle (SDLC)
Automate integration tests.
Automate the path to production.
Increase velocity and reduce risk with frequent, small batches.
Choose architectures that are less likely to resist automation.
• Devops
• C.A.L.M.S.
• Site Reliability Engineering (SRE)
• Customer Reliability Engineering (CRE)
Kelsey Hightower, Google
“Devops is group therapy for inefficient tools.”
• Collaboration
• Automation
• Learning
• Measuring
• Sharing
Ben Treynor, Founder of Google’s Site Reliability Team
“Site Reliability Engineering is what happens when you ask a software engineer to design an operations function.”
Dave Rensin, Director of Google Customer Reliability Engineering
“Customer Reliability Engineering’s mission is to create a shared operational fate between Google and our Google Cloud Platform customers.”
• Minimum Viable Platform
• Infrastructure Integration
• Service Integration
• Polyglot
• Dynamic DNS, routing, and load balancing
• Automated service discovery and brokering
• Infrastructure automation
• Health management, monitoring, and recovery
• Immutable artifact repository
• Log aggregation
Pivotal Cloud Foundry Elastic Runtime
Pivotal Cloud Foundry Operations Manager
Spring Boot and Spring Cloud Services
Cloud Provider Interface (CPI)
BOSH Release
12 Factor
• Apigee • Cloud Storage • BigQuery • PubSub • Cloud SQL • Machine Learning APIs • Bigtable • Spanner • Stackdriver
Programmable Infrastructure (compute, storage & networking)
Java, .Net, Static, Node, Python, Go, PHP, Ruby, Binary
Production should keep promises about resiliency, repeatability, and reliability.
The Evolution of Cloud-Native
Agile, Config Mgmt, TDD, 12 Factor
µServices, Devops, CI/CD, Platforms
Observability, SLI/SLO, Reliability, Availability
Verma et al, “Large-scale cluster management at Google with Borg”
“Almost every task run under Borg contains a built-in HTTP server that publishes information about the health of the task and thousands of performance metrics (e.g., RPC latencies).”
Observability
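The idea in the Borg quote, a task that carries its own small HTTP server and reports on its health, can be sketched with the Python standard library. This is only an illustration, not how Borg or Spring Boot implement it: the `/health` path, the `uptime_seconds` field, and the names `health_payload` and `start_health_server` are assumptions made for the example.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.monotonic()


def health_payload():
    """Build the JSON body for /health (field names are illustrative)."""
    uptime = round(time.monotonic() - START_TIME, 3)
    return {"status": "UP", "uptime_seconds": uptime}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the hypothetical /health path is served; anything else is a 404.
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps(health_payload()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging to keep the example quiet.
        pass


def start_health_server(port=0):
    """Serve /health on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_health_server()` returns the server object; its `server_address[1]` holds the chosen port, and `shutdown()` stops it.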
Spring Boot Actuator – Health
$ curl -s http://my-app/health | jq
{
  "status": "UP",
  "diskSpace": {
    "status": "UP",
    "total": 1056858112,
    "free": 907612160,
    "threshold": 10485760
  }
}
Spring Boot Actuator – Metrics
$ curl -s http://my-app/metrics | jq
{
  "mem": 734352,
  "mem.free": 459292,
  "processors": 4,
  "instance.uptime": 17072859,
  "uptime": 17078694,
  "systemload.average": 0.6,
  "heap.committed": 664064,
  . . .
Service Level Indicators are data about the operational characteristics of a service.
SLIs
Service Level Objectives set reliability expectations based on SLIs.
SLOs
If a system should be 99.99% available then it can be 0.01% unavailable.
If we have error budget left, development can take risks. If not, we have to fix reliability first.
SLAs – Error Budgets
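The arithmetic behind an error budget can be sketched as follows. The 30-day window, the request counts, and the function names are assumptions chosen for illustration; the slide only gives the 99.99% target.

```python
def error_budget(slo):
    """Fraction of time (or requests) allowed to fail, e.g. 0.9999 -> about 0.0001."""
    return 1.0 - slo


def allowed_downtime_minutes(slo, window_days=30):
    """Minutes of unavailability the budget permits over the window (assumed 30 days)."""
    return error_budget(slo) * window_days * 24 * 60


def budget_remaining(slo, total_requests, failed_requests):
    """Failed requests still affordable before the budget is spent."""
    return error_budget(slo) * total_requests - failed_requests


def can_take_risks(slo, total_requests, failed_requests):
    """True while budget remains; otherwise reliability work comes first."""
    return budget_remaining(slo, total_requests, failed_requests) > 0
```

Under these assumptions a 99.99% target over 30 days leaves roughly 4.3 minutes of downtime, and a service that has handled 1,000,000 requests can afford about 100 failures.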
Minimize the number of errors so we can launch code as fast as possible.
Error Budgets – Aligned Incentives
Service Level Objective: 99.99% of requests return under 50ms.
The error budget allows 0.01% of requests to take 50ms or longer.
Error Budgets – Latency
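A latency objective like this one can be checked against measured request times. The sketch below assumes latencies are collected as a plain list of milliseconds; the function names are hypothetical and a real system would read these numbers from its monitoring stack.

```python
def latency_sli(latencies_ms, threshold_ms=50.0):
    """Fraction of requests that returned under the threshold."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    fast = sum(1 for latency in latencies_ms if latency < threshold_ms)
    return fast / len(latencies_ms)


def meets_latency_slo(latencies_ms, slo=0.9999, threshold_ms=50.0):
    """True when the measured SLI is at or above the objective."""
    return latency_sli(latencies_ms, threshold_ms) >= slo
```

With 10,000 samples, one slow request stays within the 0.01% budget and a second one exceeds it.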
Service Reliability Hierarchy
Monitoring
Incident Response
Post Mortem / Root Cause Analysis
Testing / Release Procedure
Capacity Planning
Development
Product
John Allspaw
“Ways in which things go right are special cases of the ways in which things go wrong.”
Susan J. Fowler, “Production-Ready Microservices”
“Every µService at Uber should be stable, reliable, scalable, fault tolerant, performant, monitored, documented, and prepared for any catastrophe.”
A distributed system cannot simultaneously have consistent views of the data at each node and availability of the data at each node if the network becomes partitioned.
The CAP Theorem
Requests aren’t being served! Unavailable!
Serving requests like normal! Inconsistent!
Raymond Blum and Rhandeev Singh, “Site Reliability Engineering”
“Data integrity is a function of availability of a given entity over its lifetime. This is analogous to system uptime and even more critical.”
Raymond Blum and Rhandeev Singh, “Site Reliability Engineering”
“Data availability must be a foremost concern of any data-centric system.”
Raymond Blum and Rhandeev Singh, “Site Reliability Engineering”
“From the user’s point of view, data integrity without expected and regular data availability is effectively the same as having no data at all.”
Availability is a User Experience problem.
Building software like SRE—with a focus on observability, reliability, and availability—makes you cloud-native.
Ready?
Cloud-Native Roadshow Closing
Wikipedia Article “Operability”
“Operability is the ability to keep an equipment, a system, or a whole industrial installation in a safe and reliable functioning condition, according to pre-defined operational requirements.”
What is operability?
Kenny Bastani, Pivotal
“A microservice is an application small enough that an engineer new to the source code can reason about it in a day or less.”
Microservice
The ability to deploy to production whenever the organization chooses without anyone setting themselves on fire.
Continuous Delivery
Engineer your operations.
SRE Culture
It doesn’t matter how beautiful your architecture is, how easy deployment is, or how great your culture is if production is a tire fire.
Pivotal Cloud Foundry
No CEO Ever
“I appreciate the progress you made on not delivering anything.”
Undifferentiated Heavy Lifting
Unique Business Value is the tools, systems, and processes which improve the unique value your organization provides.
The only thing that matters
Acacio Cruz and Ashish Bhambhani, “Site Reliability Engineering”
“Provide product development with a platform of SRE-validated infrastructure, upon which they can build their systems. This platform will have the double benefit of being both reliable and scalable.”
PCF is the first platform in CRE review.
Ben Treynor, Founder of Google’s Site Reliability Team
“The SRE Benediction: May the Queries Flow, And the Pagers Remain Silent”
Cameron Stewart @cws322
Casey West @caseywest
The Pivotal Cloud Foundry Ecosystem
Pivotal Google Cloud
You are all cloud-native now.
You learned how to deliver software like Pivotal and Google.
Read for free: landing.google.com/sre/book.html
Save $100 on registration with code S1P_EVENT_CNR100