Amazon ECS with Docker | AWS Public Sector Summit 2016
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Chad Schmutzer, Solutions Architect, AWS
June 21, 2016
Amazon ECS with Docker: It’s All About Containers
Agenda
Why containers?
What is Docker?
Amazon EC2 Container Service (Amazon ECS)
• Cluster management
• Benefits
• Running services
Why Containers?
The Problem
Different application stacks
Different hardware deployment environments
How to run all applications across different environments?
How to easily migrate from one environment to another?
[Diagram: application stacks (static website, web front end, background workers, user DB, analytics DB, queue) set against deployment environments (development VM, QA server, single prod server, on-site cluster, public cloud, contributor’s laptop, customer servers)]
The Solution
Unit of software delivery
Lightweight, portable, consistent
Deploy and run everywhere
Deploy and run anything
Containers
User space running on OS kernel
Little overhead
Guest OS choices limited to host OS kernel
Been around for a while: chroot, FreeBSD jails, Solaris containers, OpenVZ, LXC
VMs vs. Containers
VMs Containers
https://www.docker.com/what-docker
Container Advantages
Portable
Flexible
Fast
Efficient
[Diagram: one server running a guest OS, with App1 and App2 each on their own bins/libs]
Benefits
Portable runtime application environment
Package application and dependencies in a single artifact
Run different application versions (different dependencies) simultaneously
Faster development & deployment cycles
Better resource utilization
Use Cases
Consistent environment between development & production
Service-oriented architectures / microservices
Short lived workflows
Isolated environments for testing
Services Evolve to Microservices
Monolithic Application
Order UI User UI Shipping UI
OrderService
UserService
ShippingService
DataAccess
Host 1
Service A
Service B
Host 2
Service B
Service D
Host 3
Service A
Service C
Host 4
Service B
Service C
Containers Are Natural for Microservices
Simple to model
Any app, any language
Image is the version
Test & deploy same artifact
Stateless servers decrease change risk
What is Docker?
Docker
Lightweight container virtualization platform
Tools to manage and deploy your applications
Licensed under the Apache 2.0 license
Built by Docker, Inc.
Docker Engine
Docker daemon
Docker client
Image source - https://docs.docker.com/engine/introduction/understanding-docker/
[Diagram: the client issues docker build, docker pull, and docker run to the Docker daemon on DOCKER_HOST, which holds containers and images; images come from a registry]
Amazon ECS: Cluster Management
Scheduling
[Diagram: one server running a guest OS, with App1 and App2 each on their own bins/libs]
Scheduling One Resource Is Straightforward
Scheduling a Cluster Is Hard
[Diagram: a grid of sixty servers, each running a guest OS]
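To make the jump from one server to a cluster concrete, here is a minimal Python sketch of a first-fit scheduler. The `Task` and `Instance` classes, the CPU and memory units, and the sample numbers are all assumptions made for illustration; this is not how Amazon ECS places tasks internally.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    cpu: int      # CPU units requested (hypothetical scale)
    memory: int   # memory requested, in MiB


@dataclass
class Instance:
    name: str
    cpu: int      # free CPU units
    memory: int   # free memory, in MiB
    tasks: list = field(default_factory=list)

    def fits(self, task):
        return task.cpu <= self.cpu and task.memory <= self.memory

    def place(self, task):
        self.cpu -= task.cpu
        self.memory -= task.memory
        self.tasks.append(task.name)


def first_fit(instances, tasks):
    """Place each task on the first instance with enough free CPU and memory.

    Returns the names of tasks that could not be placed anywhere.
    """
    unplaced = []
    for task in tasks:
        for instance in instances:
            if instance.fits(task):
                instance.place(task)
                break
        else:
            # No instance had room for this task.
            unplaced.append(task.name)
    return unplaced


cluster = [Instance("host-1", cpu=1024, memory=2048),
           Instance("host-2", cpu=1024, memory=2048)]
work = [Task("web", 512, 1024), Task("worker", 768, 512),
        Task("queue", 512, 1024), Task("db", 512, 1024)]
leftover = first_fit(cluster, work)
```

With these made-up numbers, `db` goes unplaced even though the cluster still has free memory, because free CPU runs out first. A scheduler has to track that state for every instance, which is what makes the cluster case hard.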
General Cluster Management: Resource Management
[Diagram: EC2 instances in AZ 1 and AZ 2, each running Docker with tasks made up of containers]
General Cluster Management: Scheduling
[Diagram: EC2 instances in AZ 1 and AZ 2, each running Docker with tasks made up of containers]
Amazon ECS: Resource Management
[Diagram: container instances in AZ 1 and AZ 2, each running Docker with tasks made up of containers; Cluster Management Engine]
Amazon ECS: Agent Communication
[Diagram: container instances in AZ 1 and AZ 2, each running Docker, the ECS Agent, and tasks made up of containers; Cluster Management Engine; Agent Communication Service]
Amazon ECS: Key/Value Store
[Diagram: container instances in AZ 1 and AZ 2, each running Docker, the ECS Agent, and tasks made up of containers; ELBs facing the Internet; Key/Value Store; Cluster Management Engine; Agent Communication Service]
Amazon ECS: APIs
[Diagram: container instances in AZ 1 and AZ 2, each running Docker, the ECS Agent, and tasks made up of containers; ELBs facing the Internet; User / Scheduler calling the API; Cluster Management Engine; Key/Value Store; Agent Communication Service]
Amazon ECS: Scheduling
[Diagram: container instances in AZ 1 and AZ 2, each running Docker, the ECS Agent, and tasks made up of containers; ELBs facing the Internet; User / Scheduler calling the API; Cluster Management Engine; Key/Value Store; Agent Communication Service]
Amazon ECS: Benefits
Easily Manage Clusters for Any Scale
Nothing to run
Complete state
Control and monitoring
Scale
Scalable
Flexible Container Placement
Applications
Batch jobs
Multiple schedulers
Designed for Use with Other AWS Services
Elastic Load Balancing
Amazon Elastic Block Store
Amazon Virtual Private Cloud
Amazon CloudWatch
AWS Identity and Access Management
AWS CloudTrail
Extensible
Comprehensive APIs
Custom schedulers
Open source agent and CLI
Amazon ECS
[Diagram: the full Amazon ECS picture: container instances in AZ 1 and AZ 2, each running Docker, the ECS Agent, and tasks made up of containers; ELBs facing the Internet; User / Scheduler calling the API; Cluster Management Engine; Key/Value Store; Agent Communication Service]
Amazon ECS: Running Services
Task Definitions
Volume Definitions
Container Definitions
Key Components: Task Definitions
Key Components: Task Definitions
Tasks
Shared Data Volume
Containers
schedule
Container Instance
Volume Definitions
Container Definitions
Unit of work
Grouping of related containers
Run on container instances
Tasks
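A task definition pairs container definitions with volume definitions. The Python sketch below builds one as a plain dictionary, using field names from the ECS task definition JSON format. The family name, images, paths, and resource numbers are hypothetical, and the `undefined_volumes` helper is an illustrative check, not part of any AWS tool.

```python
import json

# Hypothetical task definition: names, images, paths, and sizes are made up.
task_definition = {
    "family": "web-app",
    "volumes": [
        {"name": "shared-data", "host": {"sourcePath": "/ecs/shared-data"}},
    ],
    "containerDefinitions": [
        {
            "name": "web",
            "image": "nginx:latest",
            "cpu": 256,
            "memory": 512,
            "essential": True,
            "portMappings": [{"containerPort": 80, "hostPort": 80}],
            "mountPoints": [
                {"sourceVolume": "shared-data",
                 "containerPath": "/usr/share/nginx/html"},
            ],
        },
        {
            "name": "content-writer",
            "image": "busybox:latest",
            "cpu": 128,
            "memory": 128,
            "essential": False,
            "mountPoints": [
                {"sourceVolume": "shared-data", "containerPath": "/data"},
            ],
        },
    ],
}


def undefined_volumes(definition):
    """Return volume names that containers mount but the task never defines."""
    defined = {v["name"] for v in definition["volumes"]}
    used = {m["sourceVolume"]
            for c in definition["containerDefinitions"]
            for m in c.get("mountPoints", [])}
    return sorted(used - defined)


missing = undefined_volumes(task_definition)
document = json.dumps(task_definition, indent=2)   # JSON form of the definition
```

Both containers mount the same `shared-data` volume, which is the "shared data volume" grouping that makes them one unit of work.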
Create a Service
Good for long-running applications and services
Create Service
Load balance traffic across containers
Automatically recover unhealthy containers
Discover services
[Diagram: Elastic Load Balancing in front of three tasks, each with containers and a shared data volume]
Scale Service
Scale up
Scale down
[Diagram: Elastic Load Balancing in front of four tasks, each with containers and a shared data volume]
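Creating a service, recovering unhealthy containers, and scaling up or down can all be read as one idea: keep the running task count equal to a desired count. The Python sketch below shows that reconciliation under stated assumptions. The `reconcile` function, the task naming, and the health callbacks are hypothetical and do not reflect the actual ECS service scheduler.

```python
import itertools

_task_ids = itertools.count(1)   # hypothetical source of task names


def reconcile(running, desired_count, healthy):
    """Return the task list after one reconciliation pass.

    Unhealthy tasks are dropped, then tasks are started or stopped
    until exactly desired_count remain.
    """
    tasks = [t for t in running if healthy(t)]
    while len(tasks) < desired_count:
        tasks.append(f"task-{next(_task_ids)}")   # replacement or scale-up
    return tasks[:desired_count]                   # scale down by stopping extras


def always_healthy(task):
    return True


def task_2_failed(task):
    return task != "task-2"


tasks = reconcile([], 3, always_healthy)               # create service
tasks = reconcile(tasks, 4, always_healthy)            # scale up
recovered = reconcile(tasks, 4, task_2_failed)         # replace an unhealthy task
scaled_down = reconcile(recovered, 2, always_healthy)  # scale down
```

The same function handles all three cases because each one only changes either the desired count or the set of healthy tasks.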
Update Service
Deploy new version
Drain connections
[Diagram: Elastic Load Balancing in front of three old tasks, with three new tasks alongside; each task has containers and a shared data volume]
Update Service (cont.)
Deploy new version
Drain connections
[Diagram: Elastic Load Balancing in front of the three new tasks only; each task has containers and a shared data volume]
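The update sequence (deploy the new version, drain connections from the old one) can be sketched as a rolling replacement. This Python example is an illustration only: the `rolling_update` function, the batch size, and the task names are assumptions, and the order in which ECS actually starts and stops tasks depends on the service's deployment settings.

```python
def rolling_update(old_tasks, new_version, batch_size=1):
    """Replace old tasks with new ones in batches.

    Yields the tasks registered with the load balancer after each step.
    """
    registered = list(old_tasks)
    remaining = list(old_tasks)
    new_id = 0
    while remaining:
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        # Start new tasks and register them before removing anything.
        for _ in batch:
            new_id += 1
            registered.append(f"{new_version}-{new_id}")
        yield list(registered)
        # Drain connections from this batch of old tasks, then stop them.
        for task in batch:
            registered.remove(task)
        yield list(registered)


steps = list(rolling_update(["old-1", "old-2", "old-3"], "new"))
```

Because new tasks are registered before old ones are drained, the number of tasks behind the load balancer never drops below the original three in this sketch.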
Thank You!
Chris Malek, Associate Director of Academic Development
June 21, 2016
access.caltech in AWS
• California Institute of Technology
• Pasadena, CA
• Top tier university: #1 in Times Higher Education world rankings
• Small: 6400 people (1000 undergrads, 1200 grads, 300 faculty, 3900 staff)
• 3:1 undergrad-faculty ratio
• JPL: Founded by Caltech in the 1930s, managed for NASA since 1958
• Part of IMSS, the central IT org
• Lean, 6 people, all developers, even management
• 35 years of collective systems administration experience
• 50 years of collective development experience
• ~130 websites and web applications, including www.caltech.edu and the campus intranet portal
• Much smaller than counterparts at peer institutions
Our job: Enable research and instruction through software
Academic Development Services
Upper management, operations and developers pro-cloud
Move all on-premises services to cloud within 3 years
We've been in production in AWS since 2010
Many Caltech production workloads currently in AWS
Strategy: DevOps, public data, low-hanging fruit, Field of Dreams model
Cloud Adoption (2010-present)
Leverage AWS scale, expertise, and capabilities
• AZs, APIs, Infrastructure as code
• AWS better than us at many things
• AWS allows us to do things we can’t on-premises
• Don’t have to run low level services
Allows us to concentrate on how we add value
Why cloud?
access.caltech in AWS
• Distributed system comprised of many interconnected systems.
• Authenticating proxy server with around 90 applications behind it
• Covers most of the academic and administrative apps people might use
Two parts: core system and proxied apps
access.caltech: Caltech’s intranet portal
• Needs to be highly-available
• Be performant at variable loads
• Typical traffic: 5-10 hits/s
• Must scale to 800 hits/s during registration
• Protect and secure proxied apps and data
• Certain core components should stay up during disaster
• Be able to easily deploy new versions of core software
• Need many DEV, TEST, QA and production support envs
access.caltech: key requirements
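The gap between typical and peak traffic is worth working through. The Python sketch below takes the 5-10 hits/s and 800 hits/s figures from the requirements; the per-container capacity and the headroom are hypothetical numbers chosen only to show the arithmetic.

```python
import math

TYPICAL_HITS_PER_S = (5, 10)     # from the requirements
PEAK_HITS_PER_S = 800            # registration peak, from the requirements
PER_CONTAINER_HITS_PER_S = 50    # hypothetical capacity of one container
HEADROOM = 0.25                  # hypothetical 25% safety margin


def containers_needed(hits_per_s):
    """Containers required to serve a load with the headroom applied."""
    return math.ceil(hits_per_s * (1 + HEADROOM) / PER_CONTAINER_HITS_PER_S)


# Peak load relative to the top of the typical range.
scale_factor = PEAK_HITS_PER_S / TYPICAL_HITS_PER_S[1]
typical = containers_needed(TYPICAL_HITS_PER_S[1])
peak = containers_needed(PEAK_HITS_PER_S)
```

Whatever the real per-container capacity is, the peak is at least 80 times the typical load, so sizing permanently for registration would leave most capacity idle the rest of the year.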
user
AUTH SERVICE
REDIS
CONTROL SERVICE
haproxy
home
admin
prefs
my_account
loadapp
challenge_questions_api
LDAP
LEGACY LDAP
LDAP
LEGACY LDAP
Active Directory
mail servers
mailman API
mailman
MySQL
PROXY SERVERS
CORE SERVER
LDAP SLAVES LDAP MASTERS AD
~90 PROXIED APPS
ON-PREM ARCHITECTURE
user
AUTH SERVICE
REDIS
CONTROL SERVICE
haproxy
home
admin
prefs
my_account
loadapp
challenge_questions_api
LDAP
LEGACY LDAP
LDAP
LEGACY LDAP
Active Directory
mail servers
mailman API
mailman
MySQL
PROXY SERVERS
CORE SERVER
LDAP SLAVES LDAP MASTERS AD
~90 PROXIED APPS
CLOUD MIGRATION: PHASE 1
• Move access.caltech core PROD to VPC in AWS
• Continuous deployment system based on Jenkins, Docker containers, and Consul
• Be able to build DEV and TEST environments in AWS
• Proxy from AWS to on-premises apps via VPN tunnel
Later phases: move proxied apps individually to AWS
access.caltech in AWS: phase 1
ELB
NAT
NAT
RDS
AWS
VPN
VPC 1
AZ1
ELASTICACHE (REDIS)
AZ2
PROXY CORE
CONSUL
CONSUL
PROXY CORE
ELASTICACHE (REDIS)
RDS
LDAP MASTER
LDAP SLAVE
VPC 2
AZ1
LDAP SLAVE
LDAP MASTER
LDAP SLAVE
AZ2
LDAP SLAVE
ELB ELB
ELASTICSEARCH
~90 PROXIED APPS
CALTECH
PEERING
ECS MACHINE
AWS SERVICE
EC2 INSTANCE
PRIVATE
PUBLIC
SUBNETS
JENKINS
Need a more rapid, consistent deployment mechanism
• Our current process takes weeks to months to get new versions to production, and deployments are rocky
• Raw vs cooked. Cooked: build as much before deployment as possible.
• encapsulation of entire OS as a software artifact
• guaranteed same code and OS build for DEV, TEST, PROD
• easily replicate whole systems architectures in DEV
Docker image community
Why Docker?
Deployment pipeline (Jenkins)
Build Test Image
Run Tests
Build and Push final image
Deploy to QA infrastructure
Run integration tests
Human Review
Deploy to prod infrastructure
Run integration tests
Deploy to prod support infrastructure
QA Pipeline
Developer pushes code
Promote to Prod pipeline
No orchestration infrastructure to run
• Container scheduling and placement are implicitly at cloud scale: no need to plan for HA, throughput, etc.
• Built-in monitoring via CloudWatch and ECS event stream
• Powerful ECS command line tools
AWS API for managing tasks and services
AWS service integration, especially for load balancers and VPCs
ECS repositories
Why ECS? (vs Docker Swarm) PROS
Painful to debug container launch failures
Docker version lags behind current, sometimes significantly
No equivalent to Swarm overlay network
Different strategies for deploying containers
• Swarm has spread, binpack and random
• ECS has task and service strategies, which both seem to be like Swarm’s “spread” strategy
• ECS does allow you to develop your own strategies through custom schedulers that call the StartTask API
Why ECS? (vs Docker Swarm) CONS
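The three Swarm strategy names describe different ways of choosing a node. The Python sketch below is a simplified illustration of the ideas behind spread, binpack, and random. The node dictionaries, the memory-only resource model, and the function names are assumptions, not the Swarm or ECS implementation.

```python
import random


def spread(nodes):
    """Pick the node running the fewest containers."""
    return min(nodes, key=lambda n: n["containers"])


def binpack(nodes):
    """Pick the node with the least free memory, filling one node at a time."""
    return min(nodes, key=lambda n: n["free_memory"])


def random_choice(nodes, rng=random):
    """Pick any eligible node at random."""
    return rng.choice(nodes)


def place(nodes, memory, strategy):
    """Place one container needing `memory` MiB; return the chosen node name."""
    eligible = [n for n in nodes if n["free_memory"] >= memory]
    if not eligible:
        return None
    node = strategy(eligible)
    node["free_memory"] -= memory
    node["containers"] += 1
    return node["name"]


def fresh_cluster():
    """Three identical, empty, hypothetical nodes."""
    return [{"name": f"node-{i}", "free_memory": 2048, "containers": 0}
            for i in range(1, 4)]


cluster = fresh_cluster()
spread_result = [place(cluster, 512, spread) for _ in range(4)]
cluster = fresh_cluster()
binpack_result = [place(cluster, 512, binpack) for _ in range(4)]
```

In this sketch, spread distributes the four containers across all three nodes, while binpack puts all four on `node-1` and leaves the other two nodes empty.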
The entire container is your software
• not just your own code.
• OS + code becomes a software artifact
Development team will need to have or develop systems experience
• Or work closely with systems people
Probably need to remediate your code in order to take advantage of the container environment
Docker/ECS Challenges
Containers are truly disposable and anonymous
• Figuring out which container is having issues is interesting
• Entire OS is destroyed when re-deploying containers
Containers are not VMs
• No ssh interface to containers
• Containers are minimal systems: no ssh, no cron, no syslogd, etc.
Need to change your architecture and practices
Logging, monitoring
Docker/ECS Challenges, cont.
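With no syslogd in the container and no ssh access, one common adjustment is to write structured logs to stdout and tag each line with the container's hostname, so that a log collector can tell anonymous containers apart. The Python sketch below shows one way to do that with the standard `logging` module. The `JsonFormatter` and `make_logger` names and the JSON fields are hypothetical, and this is not presented as Caltech's actual setup.

```python
import io
import json
import logging
import socket
import sys


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON line tagged with the hostname."""

    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            # Inside Docker the hostname defaults to the container ID.
            "host": socket.gethostname(),
            "message": record.getMessage(),
        })


def make_logger(name, stream=None):
    """Build a logger that writes JSON lines to stdout (or a given stream)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


buffer = io.StringIO()   # stands in for stdout so the output can be inspected
log = make_logger("proxy", stream=buffer)
log.info("login failed for %s", "user123")
entry = json.loads(buffer.getvalue())
```

Writing to stdout leaves log shipping to the container runtime, so nothing has to run inside the container besides the application.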
Thank You!