how netflix’s tools can help accelerate your start-up (svc202) | aws re:invent 2013

79
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc. Asgard, Aminator, Simian Army and More: How Netflix’s proven Tools Can Help Accelerate Your Startup Adrian Cockcroft, Ruslan Meshenberg, Netflix November 2013

Upload: amazon-web-services

Post on 06-May-2015

1.746 views

Category:

Technology


2 download

DESCRIPTION

You're on the verge of a new startup and you need to build a world-class, high-scale web application on AWS so it can handle millions of users. How do you build it quickly without having to reinvent and re-implement the best-practices of large successful Internet companies? NetflixOSS is your answer. In this session, we’ll cover how an emerging startup can leverage the different open source tools that Netflix has developed and uses every day in production, ranging from baking and deploying applications (Asgard, Aminator), to hardening resiliency to failures (Hystrix, Simian Army, Zuul), making them highly distributed and load balanced (Eureka, Ribbon, Archaius) and managing your AWS resources efficiently and effectively (Edda, Ice). You’ll learn how to get started using these tools, learn best practices from engineers who actually created them, so, like Netflix, you can too unleash the power of AWS and scale your application processes as you grow.

TRANSCRIPT

Page 1: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Asgard, Aminator, Simian Army and More: How Netflix’s proven

Tools Can Help Accelerate Your Startup

Adrian Cockcroft, Ruslan Meshenberg, Netflix

November 2013

Page 2: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Congratulations, Your Startup Got Funding!

• More developers

• More customers

• Higher availability

• Global distribution

• No time….

Growth

Page 3: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

AWS Zone A

Your Architecture Looks like This:

Web UI / Front End API

Middle Tier

RDS/MySQL

Page 4: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

And It Needs to Look More like This…

Cassandra Replicas

Zone A

Cassandra Replicas

Zone B

Cassandra Replicas

Zone C

Regional Load Balancers

Cassandra Replicas

Zone A

Cassandra Replicas

Zone B

Cassandra Replicas

Zone C

Regional Load Balancers

Page 5: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Inside Each AWS Zone:

Microservices and Denormalized Data Stores

API or Web Calls

Memcached

Cassandra

Web Service

S3 Bucket

Page 6: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

We’re here to help you get to global scale…

Apache Licensed Cloud Native OSS Platform

http://netflix.github.com

Page 7: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Technical Indigestion – What Do All These Do?

Page 8: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Updated Site – Make It Easier to Find What You Need

Page 9: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Getting Started with NetflixOSS Step by Step

1. Set up AWS accounts to get the foundation in place

2. Security and access management setup

3. Account management tools: Asgard for deploy & Ice for cost monitoring

4. Build tools: Aminator to automate baking AMIs

5. Service registry and searchable account history: Eureka & Edda

6. Configuration management: Archaius dynamic property system

7. Data storage: Cassandra, Astyanax, Priam, EVCache

8. Dynamic traffic routing: Denominator, Zuul, Ribbon, Karyon

9. Availability: Simian Army (Chaos Monkey), Hystrix, Turbine

10. Developer productivity: Blitz4J, GCViz, Pytheas, RxJava

11. Big data: Genie for Hadoop PaaS, Lipstick visualizer for Pig

12. Sample apps to get started: RSS Reader, ACME Air, FluxCapacitor

Page 10: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

AWS Account Setup

Page 11: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Flow of Code and Data between AWS Accounts

Production

Account

Archive Account

Auditable

Account

Dev Test Build

Account

AMI

AMI

Backup

Data to S3

Weekend

S3 restore

New Code

Backup

Data to S3

Page 12: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Account Security

• Protect accounts – Two-factor authentication for primary login

• Delegated minimum privilege – Create IAM roles for everything

• Security groups – Control who can call your services

Page 13: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Cloud Access Control

www-prod

• Userid wwwprod

Dal-prod

• Userid dalprod

Cass-prod

• Userid cassprod

Cloud

access

audit log

SSH/sudo

bastion Security groups don’t allow

SSH between instances

Developers

Page 14: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Tooling and Infrastructure

Page 15: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Fast Start Amazon Machine Images

https://github.com/Answers4AWS/netflixoss-ansible/wiki/AMIs-for-NetflixOSS

• Pre-built AMIs for – Asgard – developer self-service deployment console

– Aminator – build system to bake code onto AMIs

– Edda – historical configuration database

– Eureka – service registry

– Simian Army – Janitor Monkey, Chaos Monkey, Conformity Monkey

• NetflixOSS Cloud Prize Contribution – Produced by Answers4aws – Peter Sankauskas

Page 16: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Fast Setup CloudFormation Templates

http://answersforaws.com/resources/netflixoss/cloudformation/

• CloudFormation templates for – Asgard – developer self service deployment console

– Aminator – build system to bake code onto AMIs

– Edda – historical configuration database

– Eureka – service registry

– Simian Army – Janitor Monkey for cleanup

Page 17: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

CloudFormation Walkthrough for Asgard (Repeat for Prod, Test, and Audit Accounts)

Page 18: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013
Page 19: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Setting Up Asgard – Step 1 Create New Stack

Page 20: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Setting Up Asgard – Step 2 Select Template

Page 21: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Setting Up Asgard – Step 3 Enter IP and Keys

Page 22: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Setting Up Asgard – Step 4 Skip Tags

Page 23: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Setting Up Asgard – Step 5 Confirm

Page 24: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Setting Up Asgard – Step 6 Watch CloudFormation

Page 25: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Setting Up Asgard – Step 7 Find PublicDNS Name

Page 26: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Open Asgard – Step 8 Enter Credentials

Page 27: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Use Asgard – AWS Self-service Portal

Page 28: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Use Asgard – Manage Red/Black Deployments

Page 29: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Track AWS Spend in Detail with ICE

Page 30: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Ice – Slice and Dice Detailed Costs and Usage

Page 31: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Setting Up ICE

• Visit github site for instructions

• Currently depends on HiCharts – Non–open source package license

– Free for noncommercial use

– Download and license your own copy

– We can’t provide a prebuilt AMI – sorry!

• Long-term plan to make ICE fully OSS – Anyone want to help?

Page 32: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Build Pipeline Automation Jenkins in the Cloud Autobuilds NetflixOSS Pull Requests

http://www.cloudbees.com/jenkins

Page 33: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Automatically Baking AMIs with Aminator

• Autoscale group instances should be identical

• Base plus code/config

• Immutable instances

• Works for 1 or 1000…

• Aminator launch – Use Asgard to start AMI or

– AWS CloudFormation recipe

Page 34: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Discovering your Services – Eureka

• Map applications by name to – AMI, instances, zones

– IP addresses, URLs, ports

– Keep track of healthy, unhealthy and initializing instances

• Eureka Launch – Use Asgard to launch AMI or use AWS CloudFormation template

Page 35: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Deploying Eureka Service – 1 per Zone

Page 36: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Edda

AWS Instances, ASGs, etc.

Eureka Services Metadata

Your Own Custom State

Searchable State History for a Region / Account

Monkeys

Timestamped delta

cache of JSON describe

call results for anything

of interest…

Edda Launch

Use Asgard to launch AMI or

use CloudFormation template

Page 37: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Edda Query Examples

Find any instances that have ever had a specific public IP address $ curl "http://edda/api/v2/view/instances;publicIpAddress=1.2.3.4;_since=0"

["i-0123456789","i-012345678a","i-012345678b”]

Show the most recent change to a security group $ curl "http://edda/api/v2/aws/securityGroups/sg-0123456789;_diff;_all;_limit=2"

--- /api/v2/aws.securityGroups/sg-0123456789;_pp;_at=1351040779810

+++ /api/v2/aws.securityGroups/sg-0123456789;_pp;_at=1351044093504

@@ -1,33 +1,33 @@

{

"ipRanges" : [

"10.10.1.1/32",

"10.10.1.2/32",

+ "10.10.1.3/32",

- "10.10.1.4/32"

}

Page 38: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Archaius – Property Console

Page 39: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Archaius Library – Configuration Management

SimpleDB or DynamoDB for

NetflixOSS. Netflix uses

Cassandra for multi-region…

Based on Pytheas.

Not open sourced yet

Page 40: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Data Storage and Access

Page 41: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Data Storage Options

• RDS for MySQL – Deploy using Asgard

• DynamoDB – Fast, easy to setup and scales up from a very low cost base

• Cassandra – Provides portability, multiregion support, very large scale

– Storage model supports incremental/immutable backups

– Priam: easy deployment automation for Cassandra on AWS

Page 42: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Priam – Cassandra co-process

• Runs alongside Cassandra on each instance

• Fully distributed, no central master coordination

• Amazon S3-based backup and recovery automation

• Bootstrapping and automated token assignment

• Centralized configuration management

• RESTful monitoring and metrics

• Underlying config in SimpleDB (Cass_turtle for MR)

Page 43: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Astyanax Cassandra Client for Java

• Features – Abstraction of connection pool from RPC protocol

– Fluent style API

– Operation retry with backoff

– Token aware

– Batch manager

– Many useful recipes

– Entity mapper based on JPA annotations

Page 44: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Cassandra Astyanax Recipes

• Distributed row lock (without needing zookeeper)

• Multiregion row lock

• Uniqueness constraint

• Multirow uniqueness constraint

• Chunked and multithreaded large file storage

• Reverse index search

• All rows query

• Durable message queue

• Contributed: High cardinality reverse index

Page 45: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

EVCache - Low latency data access

• Multi-AZ and multiregion replication

• Ephemeral data, session state (sort of)

• Client code

• Memcached

Page 46: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Routing Customers to Code

Page 47: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Denominator: DNS for Multiregion Availability

Cassandra Replicas

Zone A

Cassandra Replicas

Zone B

Cassandra Replicas

Zone C

Regional Load Balancers

Cassandra Replicas

Zone A

Cassandra Replicas

Zone B

Cassandra Replicas

Zone C

Regional Load Balancers

UltraDNS DynECT

DNS

AWS Route53

Denominator – manage traffic via multiple DNS providers with Java code

Denominator

Zuul API Router

Page 48: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Zuul – Smart and Scalable Routing Layer

Page 49: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Ribbon Library for Internal Request Routing

Page 50: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Ribbon – Zone Aware LB

Page 51: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Karyon – Common Server Container

• Bootstrapping

o Dependency & lifecycle management via

Governator.

o Service registry via Eureka.

o Property management via Archaius

o Hooks for Latency Monkey testing

o Preconfigured status page and heathcheck servlets

Page 52: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

• Embedded status page console o Environment

o Eureka

o JMX

Karyon

Page 53: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

• Sample service using Karyon available as

"Hello-netflix-oss" on github

• Recipes ...

Karyon

Page 54: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Availability

Page 55: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Either You Break It, or Users Will

Page 56: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Add Some Chaos to Your System

Page 57: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Clean Up Your Room! – Janitor Monkey Works with Edda history to clean up after Asgard

Page 58: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Conformity Monkey Track and alert for old code versions and known issues

Walks Karyon status pages found via Edda

Page 59: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Hystrix Circuit Breaker: Fail Fast -> Recover Fast

Page 60: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Hystrix Circuit Breaker State Flow

Page 61: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Turbine Dashboard Per second update circuit breakers in a web browser

Page 62: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Developer Productivity

Page 63: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Blitz4J – Nonblocking Logging

• Better handling of log messages during storms

• Replace sync with concurrent data structures.

• Extreme configurability

• Isolation of app threads from logging threads

Page 64: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

JVM Garbage Collection issues? GCViz!

• Convenient

• Visual

• Causation

• Clarity

• Iterative

Page 65: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Pytheas – OSS-based Tooling Framework

• Guice

• Jersey

• FreeMarker

• JQuery

• DataTables

• D3

• JQuery-UI

• Bootstrap

Page 66: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

RxJava - Functional Reactive Programming

• A simpler approach to concurrency – Use Observable as a simple stable composable abstraction

• Observable service layer enables any of these – Conditionally return immediately from a cache

– Block instead of using threads if resources are constrained

– Use multiple threads

– Use nonblocking I/O

– Migrate an underlying implementation from network based to in-memory cache

Page 67: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Big Data and Analytics

Page 68: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Hadoop Jobs – Genie

Page 69: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Lipstick – Visualization for Pig Queries

Page 70: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Putting It All Together…

Page 71: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Sample Application – RSS Reader

Page 72: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Third-party Sample App by Chris Fregly fluxcapacitor.com Flux Capacitor is a Java-based reference application demonstrating the following:

archaius (zookeeper-based dynamic configuration)

astyanax (cassandra client)

blitz4j (asynchronous logging)

curator (zookeeper client)

eureka (discovery service)

exhibitor (zookeeper administration)

governator (guice-based DI extensions)

hystrix (circuit breaker)

karyon (common base web service)

ribbon (eureka-based REST client)

servo (metrics client)

turbine (metrics aggregation)

Flux uses many popular open source tools such as Graphite, Jersey, Jetty, Netty, and Tomcat.

Page 73: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Third-party Sample App by IBM https://github.com/aspyker/acmeair-netflix/

Page 74: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Some of the Companies Using NetflixOSS (There are many more, please send us your logo!)

Page 75: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Takeaway

Page 76: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Use NetflixOSS to scale your startup or re:Invent your Enterprise

Contribute to existing github projects and add your own

Talk to us about @NetflixOSS at the Netflix booth in the Expo

Page 77: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Topic Session # When

How Netflix’s Proven Tools Can Help Accelerate Your Start-up SVC202 Wednesday, Nov 13, 1:30 PM - 2:30 PM

What Enterprises Can Learn from “All-in” Cloud Users Wednesday, Nov 13, 2:30 PM - 3:00 PM

Accelerating Netflix Product Development Using AWS DMG206 Wednesday, Nov 13, 3:00 PM - 4:00 PM

How Netflix Leverages Multiple Regions to Increase Availability: An

Isthmus and Active-Active Case Study ARC305 Wednesday, Nov 13, 4:15 PM - 5:15 PM

Data Science at Netflix with Amazon EMR BDT306 Wednesday, Nov 13, 4:15 PM - 5:15 PM

What an Enterprise Can Learn from Netflix, a Cloud-native Company ENT203 Thursday, Nov 14, 4:15 PM - 5:15 PM

Maximizing Audience Engagement in Media Delivery MED303 Thursday, Nov 14, 4:15 PM - 5:15 PM

Scaling your Analytics with Amazon Elastic MapReduce BDT301 Thursday, Nov 14, 4:15 PM - 5:15 PM

Automated Media Workflows in the Cloud MED304 Thursday, Nov 14, 5:30 PM - 6:30 PM

Deft Data at Netflix: Using Amazon S3 and Amazon Elastic MapReduce

for Monitoring at Gigascale BDT302 Thursday, Nov 14, 5:30 PM - 6:30 PM

Encryption and Key Management in AWS SEC304 Friday, Nov 15, 9:00 AM - 10:00 AM

Your Linux AMI: Optimization and Performance CPN302 Friday, Nov 15, 11:30 AM - 12:30 PM

Page 78: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

Thank You!

Page 79: How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Invent 2013

We are sincerely eager to hear

your feedback on this

presentation and on re:Invent.

SVC202 Please fill out an evaluation form

when you have a chance.