deploy microservices in the real world
Deploying Microservices
Russell Perkins, 5/4/2017
About Kenzan
Core offerings: Application development, platform as a service, cloud virtualization, platform engineering, consulting services, and business transformation.
Primary clients: Multi-billion-dollar companies and media/content providers such as Thomson Reuters, Charter & Cablevision.
Locations: Providence (RI), New York (NY), Denver (CO), Los Angeles (CA), and a London presence.
Founded in 2004.
We are a software engineering and digital consulting firm that has been helping clients Make Next Possible for over a decade:
Full Service Consulting Firm: Architecture, front and back end development, business analysis, and DevTest.
Cloud Virtualization Experts and Enablers: AWS, Netflix stack, enterprise architecture, and beyond.
DevOps Leadership: Platform builds, continuous delivery, and scalable resourcing.
Veterans of the Media Industry: Migrations, enterprise-wide solutions, digital experts, and thought leaders.
Employee Focused: Collaboration, communication, and culture are key.
Agenda
● What is CI/CD
● Deployment types
● On prem physical servers
● Calculating health against SLOs and SLAs
● Canary Deployments
● Common causes of outages
Continuous Integration
Continuous Deployments
Of course... But how?
Deployment Pipelines
● Central code repository
● Automated builds
● Self-testing
● Automated deployment
Deployment Pipelines: Simple
Git Push → Unit Tests → Elastic Beanstalk
Deployment Pipelines: Complex
[Pipeline diagram. Stages shown: Git Push, Unit Tests, Integration Tests, End to End Tests, Stress Tests, Test AWS account, Stable AWS account, Manual Judgment, Production]
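A minimal sketch of how a multi-stage pipeline with a manual gate might be modeled. The stage names come from the slide, but their exact ordering is an assumption (the diagram did not survive extraction), and `run_pipeline`, `build_complex_pipeline`, and `always_pass` are hypothetical names used only for illustration.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first one that fails.

    Returns (succeeded, names of the stages that completed).
    """
    completed: List[str] = []
    for name, check in stages:
        if not check():
            return False, completed
        completed.append(name)
    return True, completed


def always_pass() -> bool:
    """Placeholder for a real test run or deployment step."""
    return True


def build_complex_pipeline(manual_approval: Callable[[], bool]) -> List[Stage]:
    # Assumed ordering: tests first, then promotion through AWS accounts,
    # with a human decision before production.
    return [
        ("unit tests", always_pass),
        ("integration tests", always_pass),
        ("deploy to test AWS account", always_pass),
        ("end to end tests", always_pass),
        ("stress tests", always_pass),
        ("deploy to stable AWS account", always_pass),
        ("manual judgment", manual_approval),
        ("deploy to production", always_pass),
    ]
```

If the manual judgment callback returns False, the run stops after the stable account and production is never touched.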
Cattle Not Pets
Pets: Servers or server pairs that are treated as indispensable or unique systems that can never be down. Typically they are manually built, managed, and “hand fed”.
Cattle: Arrays of more than two servers that are built using automated tools and are designed for failure, where no one, two, or even three servers are irreplaceable. Typically, during failure events no human intervention is required, as the array exhibits attributes of “routing around failures” by restarting failed servers or replicating data through strategies like triple replication or erasure coding.
Types of Deployments
Rolling Deployment
Red / Black Deployment (A/Z, Blue/Green)
On-Premise Physical Servers
Cattle Not Pets (again)
Seriously
Kubernetes
Cloud Bursting
Cloud bursting is an application deployment model in which an application runs in a private cloud or data center and bursts into a public cloud when the demand for computing capacity spikes.
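The bursting decision can be sketched as simple arithmetic: fill private capacity first, then send the overflow to the public cloud. This is only an illustration; `plan_capacity`, `BurstPlan`, and the requests-per-second sizing model are assumptions, not something the talk specifies.

```python
import math
from dataclasses import dataclass


@dataclass
class BurstPlan:
    private_instances: int  # instances running on prem / private cloud
    public_instances: int   # overflow instances in the public cloud


def plan_capacity(demand_rps: float,
                  rps_per_instance: float,
                  private_max_instances: int) -> BurstPlan:
    """Split the required instance count between private and public capacity."""
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    needed = math.ceil(demand_rps / rps_per_instance)
    private = min(needed, private_max_instances)
    # Only demand beyond the private ceiling bursts into the public cloud.
    public = max(0, needed - private_max_instances)
    return BurstPlan(private_instances=private, public_instances=public)
```

With a private ceiling of 10 instances at 100 requests per second each, a demand of 500 stays entirely private, while 1550 needs 16 instances and bursts 6 of them.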
Hybrid Cloud Models
Pilot Light
Design:
● Images (AMIs, containers, etc.) are copied to the cloud
● Auto Scale Groups update to use the latest image
● Cluster sizes set to 0
Benefits:
● Low overhead costs
● Can be activated fairly quickly
Warm Standby
Design:
● Images (AMIs, containers, etc.) are copied to the cloud
● Auto Scale Groups update to use the latest image
● Cluster sizes set to a reasonable number, no less than 2
Benefits:
● Can be activated instantly
Multi-Site
Design:
● Images (AMIs, containers, etc.) are copied to the cloud
● Auto Scale Groups update to use the latest image
● Cluster sizes set to a reasonable number, no less than 2
● Some traffic is always directed to the cloud servers
Benefits:
● Always active
● Far away regions can use the cloud for reduced latency
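The three hybrid models differ mainly in the cloud cluster size and in whether the cloud side serves live traffic. A small sketch that captures those differences as data, assuming a hypothetical `HybridModel` record; the size of 2 is the minimum the slides mention, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridModel:
    name: str
    cloud_cluster_size: int    # instances kept running in the cloud
    serves_live_traffic: bool  # whether some traffic always goes to the cloud


MODELS = {
    "pilot_light": HybridModel("Pilot Light", 0, False),
    "warm_standby": HybridModel("Warm Standby", 2, False),
    "multi_site": HybridModel("Multi-Site", 2, True),
}


def needs_scale_up_before_failover(model: HybridModel) -> bool:
    """Pilot Light keeps images ready but runs nothing, so it must scale up first."""
    return model.cloud_cluster_size == 0
```

Only the pilot light model has to scale up before it can take traffic, which is why it is the cheapest to run and the slowest of the three to activate.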
Uptime with SLOs and SLAs
Service Level Objectives & Service Level Agreements
SLO: SLOs are specific measurable characteristics of the SLA such as availability, throughput, frequency, response time, or quality.
SLA: The SLA is the entire agreement that specifies what service is to be provided, how it is supported, times, locations, costs, performance, and responsibilities of the parties involved.
Uptime and Automation
Uptime Percentage | Acceptable yearly outages
99.9% | 8 hrs 45 mins
99.99% | 52 mins
99.999% | 5 mins
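The yearly outage figures follow directly from the uptime percentage. A short sketch of the arithmetic, assuming a 365-day year and truncation to whole minutes; the function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes per year a service may be down and still meet the uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - uptime_percent) / 100


def format_downtime(minutes: float) -> str:
    """Render a downtime budget as whole hours and minutes (truncated)."""
    hours, mins = divmod(int(minutes), 60)
    return f"{hours} hrs {mins} mins" if hours else f"{mins} mins"


for target in (99.9, 99.99, 99.999):
    print(target, format_downtime(allowed_downtime_minutes(target)))
```

Each added nine divides the downtime budget by ten, which is why the higher targets are hard to meet without automation.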
Traditional Uptime Monitoring
Monitoring via logs
Logging tools:
● AWS CloudWatch
● GCP Stackdriver Logging
● Graylog
● ELK Stack
Metrics to track:
● HTTP status codes
● CPU usage
● Memory
● Disk space
● Network
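As one example of turning a tracked metric into a health signal, HTTP status codes pulled from logs can be reduced to an error rate and compared with an objective. This is a sketch only; the 0.1% default threshold and the choice to count only 5xx responses as errors are assumptions.

```python
from collections import Counter
from typing import Iterable


def error_rate(status_codes: Iterable[int]) -> float:
    """Fraction of responses that were server errors (5xx)."""
    counts = Counter("5xx" if code >= 500 else "ok" for code in status_codes)
    total = sum(counts.values())
    return counts["5xx"] / total if total else 0.0


def meets_slo(status_codes: Iterable[int], max_error_rate: float = 0.001) -> bool:
    """True if the observed error rate is within the assumed objective."""
    return error_rate(status_codes) <= max_error_rate
```

In practice the status codes would come from one of the logging tools listed above; here they are just a list of integers.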
Canary Deployments
Slow is better
We want to make sure our software works in the real world.
But...
Users are both predictable and unpredictable.
Different regions and devices may behave differently.
Some issues (such as memory leaks) only appear over time.
Canary Watcher
A simple script that runs every 10 minutes and monitors health and logs.
Keeps track of the deployment state (10%, 50%, etc.).
Automatically removes the canary from the load balancer if an SLO is missed.
Can be run ad hoc.
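One pass of such a watcher could look roughly like the sketch below. It is not the script from the talk: `CanaryState`, `watch_once`, and the final 100% step are assumptions, and the scheduling (for example a cron job every 10 minutes) and the actual load balancer call are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Rollout percentages; 10 and 50 come from the slide, 100 is assumed.
ROLLOUT_STEPS = [10, 50, 100]


@dataclass
class CanaryState:
    step_index: int = 0
    in_load_balancer: bool = True
    history: List[str] = field(default_factory=list)

    @property
    def traffic_percent(self) -> int:
        return ROLLOUT_STEPS[self.step_index] if self.in_load_balancer else 0


def watch_once(state: CanaryState, slo_met: Callable[[], bool]) -> CanaryState:
    """Check health once: promote the canary on success, pull it on a missed SLO."""
    if not state.in_load_balancer:
        return state  # already removed; nothing left to watch
    if not slo_met():
        state.in_load_balancer = False
        state.history.append("removed from load balancer")
    elif state.step_index < len(ROLLOUT_STEPS) - 1:
        state.step_index += 1
        state.history.append(f"promoted to {ROLLOUT_STEPS[state.step_index]}%")
    else:
        state.history.append("healthy at 100%")
    return state
```

Because each call is a single check against stored state, the same function serves both the scheduled run and an ad hoc run.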
Real Time
Applications can be self-aware.
Alerts can trigger removal from a load balancer or an automatic rollback.
Common Causes of Outages
● Overload
● Retry spikes
● Pets
● Monitoring gaps
● Scaling boundaries
● Bad configuration
● Lengthy startup times
Want to learn more? Follow us!
@kenzanmedia
www.linkedin.com/company/kenzan-media
techblog.kenzan.com
www.facebook.com/kenzanmedia/