aws re:invent 2016| gam401 | riot games: standardizing application deployments using amazon ecs and...

71
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Adam Rozumek November 28, 2016 Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform GAM401

Upload: amazon-web-services

Post on 16-Apr-2017

1.160 views

Category:

Technology


2 download

TRANSCRIPT

Page 1: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Adam Rozumek

November 28, 2016

Riot Games: Standardizing

Application Deployments Using

Amazon ECS and Terraform

GAM401

Page 2: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

RIOT GAMES | GAM 401Standardizing Application

Deployments Using Amazon ECS

and Terraform

Page 3: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

SYSTEMS ENGINEER

ADAM ROZUMEK

INTRODUCTIONSWHO AM I?

Page 4: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON ECS. TERRAFORM. VIDEO GAMES.

WHAT TO EXPECT

ECS CONSOLIDATION WINS & LESSONS

Adapting existing deployments & infrastructure

TERRAFORM ENCAPSULATION STRATS

Object Oriented Development Operations

MULTI-GAME MODULAR SERVICE DESIGN

A different kind of scaling

1

2

3

Page 5: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON ECS. TERRAFORM. VIDEO GAMES.

WHAT TO EXPECT

ECS CONSOLIDATION WINS & LESSONS

Adapting existing deployments & infrastructure

TERRAFORM ENCAPSULATION STRATS

Object Oriented Development Operations

MULTI-GAME MODULAR SERVICE DESIGN

A different kind of scaling

1

2

3

Page 6: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON ECS. TERRAFORM. VIDEO GAMES.

WHAT TO EXPECT

ECS CONSOLIDATION WINS & LESSONS

Adapting existing deployments & infrastructure

TERRAFORM ENCAPSULATION STRATS

Object Oriented Development Operations

MULTI-GAME MODULAR SERVICE DESIGN

A different kind of scaling

1

2

3

Page 7: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON ECS. TERRAFORM. VIDEO GAMES.

WHAT TO EXPECT

ECS CONSOLIDATION WINS & LESSONS

Adapting existing deployments & infrastructure

TERRAFORM ENCAPSULATION STRATS

Object Oriented Development Operations

MULTI-GAME MODULAR SERVICE DESIGN

A different kind of scaling

1

2

3

Page 8: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform
Page 9: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform
Page 10: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

7.5MILLION

PEAK CONCURRENT

PLAYERS

100MILLION

MONTHLY ACTIVE

PLAYERS

MORE THAN

27MILLION

DAILY ACTIVE

PLAYERS

MORE THAN MORE THAN

2016 LEAGUE OF LEGENDS STATS

Page 11: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

DATA PRODUCTS & SERVICESOUR MISSION

Empower teams at Riot to make timely, data-informed products by maintaining a

scalable and reliable data platform

AWS re:Invent 2015 | (GAM303) Riot Games: Migrating Mountains of Big Data to AWS

Sean Maloney

Page 12: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PROBLEM

Page 13: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PROBLEMSCALING TOTAL OWNERSHIP ON AWS

Total ownership

We want to empower developers to:

• Provision their own infrastructure

• Execute their own deployments

• Monitor their own metrics

Page 14: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PROBLEMSCALING TOTAL OWNERSHIP ON AWS

Resource attribution

• Who owns these EBS volumes?

• What applications depend on these security groups?

• Can these AMIs be deleted?

Page 15: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PROBLEMSCALING TOTAL OWNERSHIP ON AWS

Security

Page 16: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PROBLEMSCALING TOTAL OWNERSHIP ON AWS

Security

• Auditing is important, but it’s reactive

• Operational time sink

Security Monkey AWS Trusted Advisor

Page 17: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PROBLEMSCALING TOTAL OWNERSHIP ON AWS

Enforcing conventions

• Common syntax for tags

• Simplify reservations

• Standardize networks

AWS re:Invent 2014 | (GAM304) How Riot Games re:Invented Their AWS Model

Jonathan McCaffrey

Marty Chong

Page 18: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PROBLEMSCALING TOTAL OWNERSHIP ON AWS

Enforcing conventions

• Common syntax for tags

• Simplify reservations

• Standardize networks

AWS re:Invent 2014 | (GAM304) How Riot Games re:Invented Their AWS Model

Jonathan McCaffrey

Marty Chong

Page 19: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PROBLEMSCALING TOTAL OWNERSHIP ON AWS

Amazon EC2

Container Service Terraform

Page 20: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS

Page 21: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

• Dockerfiles capture application dependencies

Page 22: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

• Dockerfiles capture application dependencies

• Common use cases have great community support

Page 23: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

• Dockerfiles capture application dependencies

• Common use cases have great community support

• Profit from our own engineering community

Page 24: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

Page 25: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

Embrace the abstraction

• Plan for failure at all levels

• Avoid manual intervention whenever possible

• It’s ephemeral all the way down

Page 26: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

Scheduling is hard

We need to:

• Quickly and “fairly” run tasks

• Prevent resource conflicts

• Provide reasonable fault tolerance

Page 27: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

We need AWS hardware

• Enable total ownership of the AWS resources backing our containers

• Avoid the security, resource attribution, and convention degradation pitfalls

Page 28: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

HISTORY

Page 29: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

E

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

COS

First iteration (2013)

• Traditional ASGs

• AMIs configured to launch a single Docker container

• Managed by Netflix’s Asgard

Page 30: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

E

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

COS

First iteration (2013)

• Easy to adopt

• Familiar AWS concepts

Page 31: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

E

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

COS

First iteration (2013)

• No increased resource utilization

• No elasticity improvements

Page 32: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON AWS

Second iteration (2013)

• Third party container orchestration

• Production scale had unexpected challenges

Page 33: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CONTAINERS STANDARDIZED APPLICATION UNITS… ON ECS!

Third iteration: Amazon EC2 Container Service

• ECS AMI provides all necessary software

• Designed with AWS integration in mind

• Free!

Page 34: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON EC2 CONTAINER SERVICEKEY FEATURE TIMELINE

Nov 2014

ECS ANNOUNCEDRe:Invent 2014

Page 35: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON EC2 CONTAINER SERVICEKEY FEATURE TIMELINE

Nov 2014 April 2015 Dec 2015 August 2016

ECS ANNOUNCEDRe:Invent 2014

ECS GENERALLY

AVAILABLE

ECR & NEW REGIONSECS becomes available in the final

missing region we need (Frankfurt)

for our globally deployed

applications

Page 36: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON EC2 CONTAINER SERVICEKEY FEATURE TIMELINE

Nov 2014 April 2015 Dec 2015 May 2016 July 2016 August 2016

ECS ANNOUNCEDre:Invent 2014

ECS GENERALLY

AVAILABLE

ECR & NEW REGIONSECS becomes available in the final

missing region we need (Frankfurt)

for our globally deployed

applications

SERVICE

SCALINGAutomatic task count

scaling based on

CloudWatch metrics

introduced

Page 37: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON EC2 CONTAINER SERVICEKEY FEATURE TIMELINE

Nov 2014 April 2015 Dec 2015 May 2016 July 2016 August 2016

ECS ANNOUNCEDRe:Invent 2014

ECS GENERALLY

AVAILABLE

ECR & NEW REGIONSECS becomes available in the final

missing region we need (Frankfurt)

for our globally deployed

applications

SERVICE

SCALINGAutomatic task count

scaling based on

CloudWatch metrics

introduced

TASK SPECIFIC

IAM ROLESUnlocked a lot of cluster

sharing potential

Page 38: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

AMAZON EC2 CONTAINER SERVICEKEY FEATURE TIMELINE

Nov 2014 April 2015 Dec 2015 May 2016 July 2016 August 2016

ECS ANNOUNCEDre:Invent 2014

ECS GENERALLY

AVAILABLE

ECR & NEW REGIONSECS becomes available in the final

missing region we need (Frankfurt)

for our globally deployed

applications

SERVICE

SCALINGAutomatic task count

scaling based on

CloudWatch metrics

introduced

TASK SPECIFIC

IAM ROLESUnlocked a lot of cluster

sharing potential

APPLICATION

LOAD BALANCERSSeveral key improvements

for ECS

Page 39: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

BEYOND ECSINFRASTRUCTURE AS CODE

• At scale, orchestrating infrastructure in a consistent, reproducible way is key

Page 40: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

BEYOND ECSINFRASTRUCTURE AS CODE

• At scale, orchestrating infrastructure in a consistent, reproducible way is key

Total ownership

Page 41: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

TERRAFORM

Page 42: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

TERRAFORMINFRASTRUCTURE AS OBJECT-ORIENTED CODE

• Intelligent & flexible flow control

Page 43: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

TERRAFORMINFRASTRUCTURE AS OBJECT-ORIENTED CODE

• Intelligent & flexible flow control

• Configuration drift managed

Page 44: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

TERRAFORMINFRASTRUCTURE AS OBJECT-ORIENTED CODE

• Intelligent & flexible flow control

• Configuration drift managed

• AWS resources documented

Page 45: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

TERRAFORMINFRASTRUCTURE AS OBJECT-ORIENTED CODE

• Intelligent & flexible flow control

• Configuration drift managed

• AWS resources documented

• Share common use cases

Page 46: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

VPC NATGateway

Route 53Hosted Zone

RouteTables

VPNGateway

VPC InternetGateway

Application Subnets Tools

Instances

Instances

Instances

Availability

Zone C

Availability

Zone B

Availability

Zone A

TERRAFORMINFRASTRUCTURE AS OBJECT-ORIENTED CODE

Page 47: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

TERRAFORMINFRASTRUCTURE AS OBJECT-ORIENTED CODE

Simplified infrastructure definitions

Page 48: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

CASE STUDIES

Page 49: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

ECS CLUSTERTERRAFORM BUILDING BLOCKS

VPC NATGateway

Route 53Hosted Zone

RouteTables

VPNGateway

VPC InternetGateway

Application Subnets Tools

Instances

Instances

Instances

Availability

Zone C

Availability

Zone B

Availability

Zone A

ECS Cluster

Auto Scaling Group

Security Group

Security Group

Security Group

Instance Instance Instance Instance Instance Instance

Launch Configuration User DataIAM Role CloudWatch

Alarms

Page 50: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

ECS CLUSTERTERRAFORM BUILDING BLOCKS

Module 1

Module 2

Module 3

Page 51: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

ECS CLUSTERTERRAFORM BUILDING BLOCKS

Page 52: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

MICROSERVICESWITHOUT SERVICE ENDPOINTS

ECS Cluster

Autoscaling Group

Security Group

Security Group

Security Group

Instance Instance Instance Instance Instance Instance

Launch Configuration User DataIAM Role CloudWatch

Alarm

ECS Service Task Definition

CloudWatch Alarms IAM Role

Page 53: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

MICROSERVICESWITH SERVICE ENDPOINTS

ECS Cluster

ECS Service Task Definition

CloudWatch Alarms IAM Role

Application Load Balancer

Monitoring

CloudWatch Alarms SNS Topics

Security Group

Security Group

Listeners

Target GroupsRoute 53

Page 54: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

PERSISTENT DATALOSE ECS HOSTS WITHOUT LOSING DATA

ECS Cluster

ECS Service

Application Load Balancer

Monitoring

CloudWatch Alarms SNS Topics

Security Group

Security Group

Listeners

Target GroupsRoute 53

Attachment Group

EBS EBS EBS

Elastic Network Interface

Elastic IP

Page 55: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONS LEARNED

Page 56: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Socialize your templates & take advantage of the community!

• Terraform community modules

• Terraform AWS blog

Page 57: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Break apart your stacks

ECS Cluster

Page 58: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Break apart your stacks, but don’t overdo it

ECS Service

ALB

SNS Topics

Route 53

IAM Role

Task Definition

CloudWatch Alarms

Page 59: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Break apart your stacks, but don’t overdo it

• Use modules to reduce code duplication

CloudWatch Alarms SNS Topics

ECS Cluster ALB

Page 60: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Use remote state files

Page 61: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Be liberal with your cluster provisioning

• Don’t risk resource contention in production

• With good orchestration, additional clusters != additional operational overhead

Page 62: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Tag everything all of the time

• Keep your tags organized in your Terraform templates

• Have top level variables that get applied to every resource

• Create a tag for every dimension that is useful to your business

Page 63: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Centralize your logs

or

Amazon

CloudWatch

Logs

Page 64: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Stay up to date with release blogs and application updates

• ECS updates on the AWS blog

• Terraform’s open source GitHub repository changelog

Page 65: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Capture your AMI generation process

Page 66: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Profile your memory requirements, monitor for scheduling issues

Page 67: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

LESSONSWHAT WORKED FOR US

• Take advantage of flow control in Terraform

• Use the Terraform graph tool to visualize your dependencies

Page 68: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

RIOT ENGINEERING BLOGTHESE PROBLEMS AND MORE

https://engineering.riotgames.com/

Riot engineering

How we use data

http://na.leagueoflegends.com/en/tag/insights

Page 69: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

JOIN US!

ENJOY NOMS.

DRINK DRINKS.

PLAY GAMES.

Tuesday

November 29th, 2016

6:30 – 10:30PM

Paiza Lounge

The Venetian

36th Floor

Page 70: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

Thank you!

Page 71: AWS re:Invent 2016| GAM401 | Riot Games: Standardizing Application Deployments Using Amazon ECS and Terraform

Remember to complete

your evaluations!