Weaveworks at AWS re:Invent 2016: Operations Management with Amazon ECS

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Paul Maddox, Amazon Web Services Alfonso Acosta, Weaveworks December 1, 2016 Operational Management with Amazon ECS CON301

Upload: weaveworks

Post on 12-Jan-2017

What to Expect from the Session

Shared model of operational responsibility
  Deployment
  Availability
  Cost optimization
  Scaling
  Security
  Monitoring & logging

Weaveworks: Networking and monitoring in ECS
  Weave Net
  Weave Scope

What *not* to Expect from the Session

CON302 - Development Workflow with Docker and Amazon ECS (CI/CD)

CON309 - Running Microservices on Amazon ECS (service discovery)

Key Components

Amazon EC2 Container Service (Amazon ECS)
  Development cluster: three container instances
  Production cluster: three container instances

Task definition: containers and a volume

Amazon EC2 Container Registry (Amazon ECR)

Component: ECS

AWS is responsible for operations of the cloud. You are responsible for operations in the cloud, using the building blocks provided.

Responsibility areas, divided between AWS and Customer: Deployment, Security, Patching, Monitoring, Scaling, Availability, Cost Control

$ aws ecs create-cluster --cluster-name dev

Component: ECR

AWS is responsible for operations of the cloud. You are responsible for operations in the cloud, using the building blocks provided.

Responsibility areas, divided between AWS and Customer: Deployment, Security, Cost Control, Monitoring, Scaling, Availability, Patching

Component: Container Instances

Development cluster: three cluster instances

AWS is responsible for operations of the cloud. You are responsible for operations in the cloud, using the building blocks provided.

Responsibility areas, divided between AWS and Customer: Deployment, Cost Control, Patching, Monitoring, Scaling, Availability, Security

Component: Container Instances

An EC2 instance (or a collection of instances)
Running Docker
With the open-source ECS agent running

Tip: Use ECS-optimized AMIs

echo ECS_CLUSTER=dev >> /etc/ecs/ecs.config

https://github.com/aws/amazon-ecs-agent
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html

Container Instances: Building Blocks Provided

Areas: Deployment, Security, Patching, Monitoring, Scaling, Availability, Cost Control

Building blocks: CloudFormation; update your AMI and replace instances; CloudWatch; Auto Scaling group; Reserved Instances; CLI, SDKs, etc.; IAM, Inspector, VPC Flow Logs, etc.; Spot Fleet

Component: Tasks & Containers

Task: containers and a volume

AWS is responsible for operations of the cloud. You are responsible for operations in the cloud, using the building blocks provided.

Responsibility areas, divided between AWS and Customer: Deployment, Security, Patching, Monitoring, Scaling, Availability, Logging

How Should I Set This Up?

Use the AWS Management Console?

Time-consuming

Error-prone

Not repeatable

How Should I Set This Up?

Flex your scripting skills?

What happens if my script fails halfway through?
How long should I pause?
How do I upgrade / roll back?

#!/bin/bash
set -e

CLUSTER_NAME=dev
AMI=ami-c8337dbb

# Create the cluster and capture its ARN
CLUSTER_ID=$(aws ecs create-cluster --cluster-name "$CLUSTER_NAME" | jq -r '.cluster.clusterArn')
# TODO: Don't forget to add error checks here

# Launch a container instance; the user data script joins it to the cluster
aws ec2 run-instances \
  --instance-type t2.medium \
  --image-id "$AMI" \
  --user-data "#!/bin/bash
echo ECS_CLUSTER=$CLUSTER_NAME >> /etc/ecs/ecs.config"

# ??? How long is long enough?
sleep 120
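
One way to approach the "How long should I pause?" question is to poll for a condition with a deadline instead of sleeping for a fixed time. The following is a minimal sketch in Python, not part of the original deck. The names `wait_until` and `check` are hypothetical; in practice `check` might wrap a call that asks ECS whether the container instance has registered.

```python
import time


def wait_until(check, timeout=120.0, interval=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or the timeout elapses.

    Returns True if the condition was met, False if the deadline passed.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Hypothetical stand-in for "has the instance registered yet?"
    attempts = {"count": 0}

    def instance_registered():
        attempts["count"] += 1
        return attempts["count"] >= 3

    ok = wait_until(instance_registered, timeout=10, interval=0.1)
    print("registered" if ok else "timed out")
```

This still leaves the other questions on the slide open (partial failure, upgrade, rollback), which is the argument for a declarative tool.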

AWS CloudFormation: Infrastructure as Code

This is Alice. She needs to build a new environment.

It needs to be:

A self-contained, deployable unit
Repeatable
Auditable
Self-documenting

Luckily, Alice knows about CloudFormation

Time to deploy!

alice@macbook:~$ aws cloudformation create-stack --stack-name preprod --template-body file:///Users/alice/env.yaml

Speaker note: mention tagging.

Time to update

alice@macbook:~$ aws cloudformation update-stack --stack-name preprod --template-body file:///Users/alice/env.yaml

Speaker note: mention change sets.

When a new environment is required

alice@macbook:~$ aws cloudformation create-stack --stack-name production --template-body file:///Users/alice/env.yaml

ECR

AWS CLI

$ aws ecr create-repository --repository-name myapp

{
  "repository": {
    "registryId": "123456789012",
    "repositoryName": "myapp",
    "repositoryArn": "arn:aws:ecr:us-east...",
    "repositoryUri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp"
  }
}

CloudFormation (YAML)

Resources:
  ECRRepository:
    Type: AWS::ECR::Repository
    Properties:
      RepositoryName: myapp

Using ECR

Use AWS CLI to perform docker login

Tip: Use the Amazon ECR Credential Helper for automatic logins
https://github.com/awslabs/amazon-ecr-credential-helper

$ $(aws ecr get-login)
$ docker pull <registry>/<repository>:<tag>

ECS Cluster

AWS CLI

$ aws ecs create-cluster --cluster-name preprod
{
  "cluster": {
    "status": "ACTIVE",
    "clusterName": "preprod",
    "registeredContainerInstancesCount": 0,
    "pendingTasksCount": 0,
    "runningTasksCount": 0,
    "activeServicesCount": 0,
    "clusterArn": "arn:aws:ecs:us-east..."
  }
}

CloudFormation (YAML)

Resources:
  ECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: preprod

ECS Container Instances

Highly available architecture, distributed across multiple Availability Zones

VPC with public and private subnets

Application Load Balancer with path-based routing for inbound traffic

NAT gateways for outbound traffic

Auto Scaling group of container instances

CloudWatch Logs for centralized container logging

Inbound Traffic

$ curl -v https://api.example.com/v1/products/1
> GET /v1/products/1 HTTP/1.1
> Host: api.example.com
> User-Agent: curl/7.43.0
> Accept: */*

Incoming HTTP/HTTPS traffic arrives via the Application Load Balancer (ALB) in public subnets.

The ALB uses path-based routing to route /products/* to the container instances in private subnets running our products service.

The ALB supports dynamic host port mapping, allowing multiple containers of the same type on each host.
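
To illustrate the idea of path-based routing, here is a minimal sketch in Python. It is not the ALB's implementation: the rule list, target names, and the `route` function are hypothetical, and `fnmatchcase` is only an approximation of ALB path patterns (it also gives `[...]` a special meaning).

```python
from fnmatch import fnmatchcase

# Hypothetical listener rules: (path pattern, target), checked in order.
RULES = [
    ("/products/*", "products-service"),
    ("/customers/*", "customers-service"),
]
DEFAULT_TARGET = "default-service"


def route(path, rules=RULES, default=DEFAULT_TARGET):
    """Return the target of the first rule whose pattern matches the path."""
    for pattern, target in rules:
        if fnmatchcase(path, pattern):
            return target
    return default


if __name__ == "__main__":
    for path in ("/products/1", "/customers/42/orders", "/health"):
        print(path, "->", route(path))
```

Requests that match no rule fall through to a default target, which mirrors how a listener's default action is typically used.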

Outbound Traffic

Our container instances are in private subnets, with no direct internet access

At some point, they might need access to external services

NAT gateways provide a highly scalable and available solution

Logging

ECS integrates directly with CloudWatch Logs (as well as others)

Centralized collection of container logs

Search, filter, and alert on log conditions

(more to come later)

tl;dr - ECS Reference Architecture on GitHub

https://github.com/awslabs/ecs-refarch-cloudformation

Cost Optimization

Reserved Instances

Up to 75% Savings*

Use Auto Scaling groups

Reserve ECS container instances when you have known baseline capacity requirements.

Use On-Demand pricing for capacity peaks.

* Dependent on specific AWS service, size/type, and region
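
The baseline-plus-peaks approach can be sketched as simple arithmetic. The Python below is an illustration only: every number in it (hourly rate, discount, instance counts, peak hours) is a made-up assumption, not a real AWS price, and `monthly_cost` is a hypothetical helper.

```python
HOURS_PER_MONTH = 730  # common approximation for a month


def monthly_cost(baseline_instances, peak_instances, peak_hours,
                 on_demand_rate, reserved_discount):
    """Estimate monthly cost: reserved baseline plus On-Demand peak capacity.

    reserved_discount is a fraction, e.g. 0.40 means reserved capacity
    costs 40% less per hour than On-Demand.
    """
    reserved_rate = on_demand_rate * (1 - reserved_discount)
    baseline = baseline_instances * HOURS_PER_MONTH * reserved_rate
    peaks = peak_instances * peak_hours * on_demand_rate
    return baseline + peaks


if __name__ == "__main__":
    # Hypothetical inputs: 4 baseline instances, 6 extra instances
    # for 100 peak hours, at an assumed $0.10/hour On-Demand rate.
    mixed = monthly_cost(4, 6, 100, on_demand_rate=0.10, reserved_discount=0.40)
    all_on_demand = monthly_cost(4, 6, 100, on_demand_rate=0.10, reserved_discount=0.0)
    print(f"reserved baseline: {mixed:.2f}, all On-Demand: {all_on_demand:.2f}")
```

The actual discount depends on the instance type, term, and region, as the footnote says.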

Spot Instances

Up to 90% Savings*

Use Spot Fleet to maintain instance availability and define cluster based on required CPU/memory.

* Compared to On-Demand price based on specific EC2 instance type, region, and Availability Zone

Multiple ECS Clusters

Creating multiple ECS clusters is easy, and often more cost efficient. Consider availability and compute requirements.

Example: Development Cluster
Spot Fleet

Example: Production Cluster
Auto Scaling group with Reserved Instances for baseline and On-Demand for capacity peaks

Example: Batch Processing Cluster
Spot Fleet of GPU Instances

Scaling

Scaling ECS Container Instances Automatically

Use Auto Scaling groups

Diagram labels: Min, Desired, Max; scale out as needed

Set Auto Scaling group min, max, desired

Scale in and out based on CloudWatch alarms

Tip: Use the ECS cluster MemoryReservation CloudWatch metric

Tutorial: Scaling Container Instances with CloudWatch Alarms
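
As a rough illustration of what the MemoryReservation metric measures, the sketch below computes the percentage of registered cluster memory that running tasks have reserved, and compares it with an alarm threshold. This is Python written for illustration: the function names and the 75% threshold are assumptions, and in a real setup CloudWatch computes the metric and the alarm triggers the Auto Scaling policy.

```python
def memory_reservation(reserved_mib_per_task, registered_mib_per_instance):
    """Percent of registered cluster memory reserved by running tasks."""
    total_registered = sum(registered_mib_per_instance)
    if total_registered == 0:
        return 0.0
    return 100.0 * sum(reserved_mib_per_task) / total_registered


def should_scale_out(reservation_percent, threshold=75.0):
    """True when the reservation exceeds the (hypothetical) alarm threshold."""
    return reservation_percent > threshold


if __name__ == "__main__":
    # Hypothetical cluster: two instances with 2048 MiB each,
    # seven tasks reserving 512 MiB each.
    percent = memory_reservation([512] * 7, [2048, 2048])
    print(percent, should_scale_out(percent))
```

Scaling on reservation rather than utilization means new instances are added before the scheduler runs out of room to place tasks.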

Application Auto Scaling for ECS Services

Security

Patching ECS Container Instances

ECSLaunchConfiguration:
  Type: AWS::AutoScaling::LaunchConfiguration
  Properties:
    ImageId: ami-1924770e

ECSAutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    MinSize: 2
    MaxSize: 8
    DesiredCapacity: 2
  UpdatePolicy:
    AutoScalingRollingUpdate:
      MinInstancesInService: 2
      MaxBatchSize: 2
      PauseTime: PT15M
      WaitOnResourceSignals: true

Ensure you have an AutoScalingRollingUpdate policy on your Auto Scaling group

Update the AMI in your CloudFormation template

aws cloudformation update-stack

Let CloudFormation perform a rolling update to your ECS container instances

Patching Containers

Minimal Containers

Use the smallest FROM base image to minimize the attack surface

FROM scratch is ideal for Go and other languages that compile a (near) static binary

IAM Roles

IAM roles for container instances:
Bound to the ECS container instance
Applies to all containers running on the host
Examples: pulling images from ECR, CloudWatch Logs

IAM roles for tasks:
Bound to specific ECS tasks
Task-specific access to AWS services

Tip: Use the principle of least privilege; prefer IAM roles for tasks where applicable.
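
As an illustration of least privilege for a task role, the sketch below builds an IAM policy document that allows reading objects under a single S3 key prefix and nothing else. It is a Python sketch under assumptions: the bucket name, the prefix, and the helper `read_only_prefix_policy` are hypothetical and not from the original deck.

```python
import json


def read_only_prefix_policy(bucket, prefix):
    """Build an IAM policy document allowing s3:GetObject under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix for a "products" task.
    policy = read_only_prefix_policy("example-config-bucket", "products")
    print(json.dumps(policy, indent=2))
```

Attaching a narrow policy like this to a task role, rather than to the container instance role, keeps other tasks on the same host from inheriting the access.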

Speaker note: security is the #1 priority.

Configuration & Secrets Management

Environment Variables

Quick and easy

Configuration stored in task definition (or passed in)

Versioned in an immutable definition; easy rollback

Good for configuration items

Bad for secrets (API keys, passwords, etc.)

KMS + S3 / DynamoDB

Use environment variables to provide pointer to encrypted data in S3/DynamoDB

Use KMS or AWS encryption clients to encrypt secrets at rest

Use VPC endpoints, IAM policies, and IAM roles to restrict decryption
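
The "pointer in an environment variable" pattern can be sketched as follows. This Python example only covers the first step, reading and validating the pointer; fetching the object and decrypting it with KMS would follow and is not shown. The variable name `DB_PASSWORD_S3_URI`, the `s3://bucket/key` convention, and the helper `secret_location` are assumptions made for illustration.

```python
import os
from urllib.parse import urlparse


def secret_location(env_var="DB_PASSWORD_S3_URI", environ=os.environ):
    """Parse an s3://bucket/key pointer from the environment into (bucket, key).

    The environment holds only the location of the encrypted secret,
    never the secret itself.
    """
    uri = environ.get(env_var)
    if not uri:
        raise KeyError(f"{env_var} is not set")
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"expected s3://bucket/key, got {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    # Hypothetical pointer, as it might be set in a task definition.
    env = {"DB_PASSWORD_S3_URI": "s3://example-secrets/products/db-password.enc"}
    bucket, key = secret_location(environ=env)
    print(bucket, key)
    # Next steps (not shown): fetch the object using the task's IAM role,
    # then decrypt it with KMS.
```

Because the task definition carries only the pointer, rolling back a task definition revision never exposes or reverts the secret value itself.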

Monitoring & Logging

Monitoring with CloudWatch

Centralized Logging with CloudWatch Logs

Defined within the task definition:

{
  "image": "nginx:latest",
  ...
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "nginx",
      "awslogs-region": "us-east-1"
    }
  }
}

Available log drivers: awslogs, fluentd, gelf, journald, json-file, splunk, syslog

Submit a pull request on the ECS agent GitHub repo if you would like others

Speaker note: mention expiring the logs.

Tip: Use Metric Filters with CloudWatch Logs

Summary

AWS is responsible for operations of the cloud. You are responsible for operations in the cloud using the building blocks provided.

Networking and Monitoring: Weave Net and Weave Scope

Speaker notes: what Weave is and its goal; how it complements ECS; AMIs / CloudFormation.

Weave Net

Overlay network between hosts
First container networking solution
Automatic DNS-based service discovery
Automatic IP allocation (IPAM)
Minimum overhead (VXLAN)
Gossip protocol to share updates (no explicit DB)
Multi-DC
Encryption

Speaker notes: each container gets an IP. Also: multicast, AWS VPC. Data-center agnostic.

Weave Net: Overlay Network

Each hexagon is a container.
Each container gets its own IP (no port clashes).
The topology does not need to be fully connected.
Multi-cloud, multi-region, even multi-orchestrator.
Routing and naming information is propagated through gossip without a central DB, so the network is tolerant to partitions.

Weave Net: Service Discovery

All the containers are created with the name NAME. Weave creates DNS records for each container and propagates them through gossip. A client can access the containers by that name, and requests are load balanced randomly on the client side.
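
The client-side random load balancing described above can be sketched in a few lines of Python. This is an illustration, not Weave's code: `pick_address` is a hypothetical helper and the addresses are made up. In a real client the list would come from a DNS lookup of the service name.

```python
import random


def pick_address(addresses, rng=random):
    """Choose one address at random from the records returned for a name."""
    if not addresses:
        raise LookupError("no addresses for service")
    return rng.choice(list(addresses))


if __name__ == "__main__":
    # Hypothetical records returned for one service name,
    # one per container registered under that name.
    records = ["10.32.0.1", "10.32.0.2", "10.32.0.3"]
    for _ in range(5):
        print(pick_address(records))
```

Because each client picks independently, no central load balancer has to be provisioned or updated as containers come and go.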

Weave Net: Service Discovery

Sample 2-tier application

Weave Net on ECS??

Speaker notes: explain the ECS infrastructure.

How is service discovery done?
Statically (list of IPs associated with each service)
ELB
ALB

This requires management.

Weave Net on ECS

Speaker notes: This is what we provide in the AMIs and CloudFormation templates. This is how we solve service discovery with Weave Net. Explain how:
Each node is equipped with Weave Router/DNS, propagating routing and DNS information
Traffic itself doesn't normally go through Weave: VXLAN
How Weave Proxy intercepts calls

Weave Scope

Visualization, monitoring, and control solution

NO INSTRUMENTATION!!!

Weave Scope describes your microservice application and lets you interact with it without any instrumentation; you just need to run an agent (probe) on each of your hosts.

Weave Scope

Diagram labels: Scope Probe (host 1), Scope Probe (host 2), Scope Probe (host n); Scope App; Reports (CRDT-like semantics); Controls

Weave Scope standalone is open source

Weave Cloud

Diagram labels: Scope Probe (host 1), Scope Probe (host 2), Scope Probe (host n)

https://cloud.weave.works

Weave Cloud hosts Scope for you
Provides enterprise features: authentication, team management
Zero management and firewall problems

Weave Net + Scope on ECS

https://cloud.weave.works

The Weave AMIs and CloudFormation templates also come equipped with Weave Scope

Thank you!

Remember to complete your evaluations!