A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

A Day in the Life of a Billion Packets

Eric Brandwine, AWS Security

November 14, 2013

Upload: amazon-web-services

Post on 18-Nov-2014

DESCRIPTION

In this talk, we walk through the VPC network presentation, and describe the problems we were trying to solve. Next, we walk through how these problems are traditionally solved, and why those solutions are not scalable, cheap, or secure enough for AWS. Finally, we provide an overview of the solution that we've implemented and discuss some of the unique mechanisms that we use to ensure customer isolation.

TRANSCRIPT

Page 1: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

A Day in the Life of a Billion Packets

Eric Brandwine, AWS Security

November 14, 2013

Page 2: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

We have the cloud

[Diagram: the AWS Cloud, containing EC2, Elastic Load Balancing, EBS, RDS, ElastiCache, and Redshift]

Page 3: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Customers have Data centers

Page 4: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Whiteboard Engineering

[Diagram: the AWS Cloud, containing EC2, Elastic Load Balancing, EBS, RDS, ElastiCache, and Redshift]

Page 5: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013
Page 6: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

EC2 as it was

[Diagram: Amazon EC2 with instances at 10.44.12.4, 10.44.12.5, 10.44.92.17, 10.44.12.27, and 10.108.6.4]

Page 7: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Why that doesn’t work

Customer network: 192.168.0.0/16

Routing Table

• 192.168.0.0/16: stay here

• 10.44.12.4/32: AWS

• 10.44.92.17/32: AWS

• 10.108.6.4/32: AWS

[Diagram: Amazon EC2 with instances at 10.44.12.4, 10.44.12.5, 10.44.92.17, 10.44.12.27, and 10.108.6.4; the range 10.44.0.0/16 is also marked]
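
The routing table on this slide needs one /32 entry per instance. A minimal sketch of why a single aggregate does not help, using Python's standard `ipaddress` module: it finds the smallest prefix covering the three addresses in the routing table and checks which other addresses from the diagram fall inside it. Treating 10.44.12.5 and 10.44.12.27 as instances that belong to someone else is an assumption read off the diagram, and the helper name `smallest_covering_prefix` is made up for this example.

```python
import ipaddress


def smallest_covering_prefix(addresses):
    """Return the smallest single network that contains every address."""
    nets = [ipaddress.ip_network(a) for a in addresses]  # each becomes a /32
    candidate = nets[0]
    # Widen the prefix one bit at a time until it covers all addresses.
    while not all(n.subnet_of(candidate) for n in nets):
        candidate = candidate.supernet()
    return candidate


# Addresses that appear in the slide's routing table.
customer_instances = ["10.44.12.4", "10.44.92.17", "10.108.6.4"]
# Assumption: the remaining addresses in the diagram are not this customer's.
other_instances = ["10.44.12.5", "10.44.12.27"]

aggregate = smallest_covering_prefix(customer_instances)
swept_in = [a for a in other_instances if ipaddress.ip_address(a) in aggregate]
print(aggregate, swept_in)
```

Under these assumptions the only single route that covers the customer's instances also covers addresses that are not theirs, which is the motivation for customer-selected address ranges on the next slides.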

Page 8: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Requirements

• Customer Selected IP Addresses

• Route Aggregation for External Connectivity

• Conformance with Existing Network Designs

Page 9: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Virtual Private Cloud

Customer network: 192.168.0.0/16

Routing Table

• 192.168.0.0/16: stay here

• 172.31.0.0/18: AWS

VPC: 172.31.0.0/18

• Subnet 172.31.1.0/24: instances 172.31.1.7, 172.31.1.8, 172.31.1.9

• Subnet 172.31.2.0/24: instances 172.31.2.12, 172.31.2.51
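
With a customer-selected range, the routing table shrinks to two entries. The sketch below is a simple longest-prefix-match lookup over that table, written with the standard `ipaddress` module. The function name `next_hop` and the dictionary layout are illustrative choices, not anything from the talk.

```python
import ipaddress

# The two-entry routing table from the slide.
ROUTES = {
    "192.168.0.0/16": "stay here",
    "172.31.0.0/18": "AWS",
}


def next_hop(destination, routes=ROUTES):
    """Return the action for the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[str(best)]


vpc_instances = ["172.31.1.7", "172.31.1.8", "172.31.1.9", "172.31.2.12", "172.31.2.51"]
print({ip: next_hop(ip) for ip in vpc_instances})
```

Every instance address in both subnets resolves through the single 172.31.0.0/18 entry, so adding instances or subnets inside the VPC range requires no routing change on the customer side.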

Page 10: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

This is just virtual networking!

• Subnet ~= VLAN

• VPC ~= VRF (Virtual Routing and Forwarding)

• But…

Page 11: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Scaling Challenges

• VLAN ID space is constrained – 12 bits => 4096 total VLANs

• VRF support is constrained – Large routers => 1-2 thousand VRFs

• Fixed ratio of VLANs:VRFs

Page 12: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Router and capacity dimensions

[Diagram: two Big Routers, each with a Data Plane and a Control Plane]

Page 13: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

An Example

• Average Router Configuration Line: 50 chars

• Config per VPC: 10 lines

• Subnets per VPC: 4

• Config per Subnet: 5 lines

• Total VPCs: 2,000

• Config size: 3MB
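
The 3MB figure follows directly from the numbers on the slide. A short check of the arithmetic, assuming one character is one byte and 1MB means 10^6 bytes (neither is stated on the slide):

```python
# Figures taken from the slide.
CHARS_PER_LINE = 50
LINES_PER_VPC = 10
SUBNETS_PER_VPC = 4
LINES_PER_SUBNET = 5
TOTAL_VPCS = 2_000

# Each VPC contributes its own lines plus the lines for each of its subnets.
lines_per_vpc = LINES_PER_VPC + SUBNETS_PER_VPC * LINES_PER_SUBNET
total_lines = TOTAL_VPCS * lines_per_vpc
total_chars = total_lines * CHARS_PER_LINE

# Assumption: one character per byte, 1 MB = 1,000,000 bytes.
print(lines_per_vpc, total_lines, total_chars / 1_000_000)
```

That is 30 lines per VPC and 60,000 lines in total for a single router configuration.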

Page 14: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Silos of Capacity

[Diagram: items labeled A through G distributed across separate silos of capacity, with numeric labels per silo; the layout is not recoverable from the transcript]

Page 15: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Implementation Requirements

• Scale to millions of environments the size of Amazon.com

• Any server, anywhere in a region can host an instance attached to any Subnet in any VPC

Page 16: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Concepts

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 hosting instances with overlapping addresses (10.0.0.2, 10.0.0.3, 10.0.0.4, 10.0.0.5), and the Mapping Service]

• Server: Physical host in an Amazon datacenter

• Instance: Amazon EC2 instance owned by a customer

• VPC: Amazon Virtual Private Cloud owned by a customer

• VPC ID: Identifier for a VPC, such as vpc-1a2b3c4d

• Mapping Service: Distributed lookup service. Maps VPC + Instance IP to server
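
The Mapping Service definition above can be pictured as a table keyed by the pair (VPC, instance IP). The sketch below is a single-process stand-in, not the distributed service itself. The class and method names are invented for illustration, and only the Orange placements come from later slides; the Grey placement is hypothetical.

```python
from typing import Dict, Optional, Tuple


class MappingService:
    """Toy in-memory lookup: (VPC, instance IP) -> server address."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, str], str] = {}

    def register(self, vpc: str, instance_ip: str, server: str) -> None:
        self._table[(vpc, instance_ip)] = server

    def lookup(self, vpc: str, instance_ip: str) -> Optional[str]:
        return self._table.get((vpc, instance_ip))


service = MappingService()
# Placements shown on later slides.
service.register("Orange", "10.0.0.2", "192.168.0.3")
service.register("Orange", "10.0.0.3", "192.168.1.4")
# Hypothetical placement, used only to show an overlapping address.
service.register("Grey", "10.0.0.3", "192.168.1.3")

print(service.lookup("Orange", "10.0.0.3"), service.lookup("Grey", "10.0.0.3"))
```

Because the VPC is part of the key, two customers can use the same instance address without colliding.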

Page 17: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

L2 - Ethernet

[Diagram: instances 10.0.0.2 and 10.0.0.3 attached to an Ethernet Switch]

• ARP request from 10.0.0.2: L2 Src: MAC(10.0.0.2), L2 Dst: ff:ff:ff:ff:ff:ff, ARP Who has 10.0.0.3? The switch floods the ARP request out all ports.

• ARP response from 10.0.0.3: L2 Src: MAC(10.0.0.3), L2 Dst: MAC(10.0.0.2), ARP 10.0.0.3 is at MAC(10.0.0.3). The switch snoops the ARP response and learns the port for MAC(10.0.0.3).

• Data frame: L2 Src: MAC(10.0.0.2), L2 Dst: MAC(10.0.0.3), L3 Src: 10.0.0.2, L3 Dst: 10.0.0.3, ICMP/TCP/UDP/…
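
The flood-then-learn behavior on this slide can be sketched as a generic learning switch. This is the textbook behavior (learn the port from each frame's source MAC, flood when the destination is unknown or broadcast), not a model of any particular product. The port numbers are made up.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Minimal learning switch: remembers which port each source MAC was seen on."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}

    def receive(self, in_port, src_mac, dst_mac):
        """Record the sender's port and return the list of output ports."""
        self.mac_table[src_mac] = in_port
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        # Unknown or broadcast destination: flood out every other port.
        return [p for p in self.ports if p != in_port]


switch = LearningSwitch(ports=[1, 2, 3])
# Hypothetical wiring: 10.0.0.2 on port 1, 10.0.0.3 on port 2.
arp_request = switch.receive(1, "MAC(10.0.0.2)", BROADCAST)
arp_response = switch.receive(2, "MAC(10.0.0.3)", "MAC(10.0.0.2)")
data_frame = switch.receive(1, "MAC(10.0.0.2)", "MAC(10.0.0.3)")
print(arp_request, arp_response, data_frame)
```

The ARP request is flooded, while the response and the data frame each go out exactly one port once the switch has seen both senders.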

Page 18: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

L2 - VPC

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 with their instances, and the Mapping Service]

• ARP request from instance 10.0.0.2: L2 Src: MAC(10.0.0.2), L2 Dst: ff:ff:ff:ff:ff:ff, ARP Who has 10.0.0.3?

• Mapping query: Src: 192.168.0.3, Dst: Mapping Service, Query: Orange 10.0.0.3

• Mapping reply: Src: Mapping Service, Dst: 192.168.0.3, Reply: Host: 192.168.1.4, MAC: MAC(10.0.0.3)

• ARP response to instance 10.0.0.2: L2 Src: MAC(10.0.0.3), L2 Dst: MAC(10.0.0.2), ARP 10.0.0.3 is at MAC(10.0.0.3)
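
In the VPC case no broadcast reaches the network: the messages on this slide show the server turning the ARP request into a Mapping Service query and an ARP response coming back to the instance. A minimal sketch of that translation, with the mapping table reduced to a local dictionary. The function name and the dictionary shapes are assumptions made for illustration.

```python
# Stand-in for the Mapping Service reply shown on the slide.
MAPPINGS = {
    ("Orange", "10.0.0.3"): {"host": "192.168.1.4", "mac": "MAC(10.0.0.3)"},
}


def handle_arp_request(vpc, sender_ip, target_ip, mappings=MAPPINGS):
    """Answer an instance's ARP request from the mapping table, or return None."""
    entry = mappings.get((vpc, target_ip))
    if entry is None:
        return None  # no such address in this VPC, so no response is produced
    return {
        "l2_src": entry["mac"],
        "l2_dst": f"MAC({sender_ip})",
        "arp": f"{target_ip} is at {entry['mac']}",
    }


reply = handle_arp_request("Orange", "10.0.0.2", "10.0.0.3")
print(reply)
```

The instance sees an ordinary ARP response, even though the answer came from a lookup rather than from the target instance.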

Page 19: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

L2 - VPC

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 with their instances, and the Mapping Service]

• Frame from instance 10.0.0.2: L2 Src: MAC(10.0.0.2), L2 Dst: MAC(10.0.0.3), L3 Src: 10.0.0.2, L3 Dst: 10.0.0.3, ICMP/TCP/UDP/…

• Encapsulation between servers: VPC: Orange, Src: 192.168.0.3, Dst: 192.168.1.4

• Validation request: Src: 192.168.1.4, Dst: Mapping Service, Validate: Orange 10.0.0.2 is at 192.168.0.3

• Validation reply: Src: Mapping Service, Dst: 192.168.1.4, Mapping valid: Orange 10.0.0.2 is at 192.168.0.3
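
The data path on this slide has two halves: the sending server wraps the instance's frame in an outer header that names the VPC and both servers, and the receiving server checks the claimed source against the Mapping Service before delivery. The sketch below models both halves with plain dataclasses. The type names, field names, and the local `MAPPINGS` dictionary are illustrative stand-ins; the real encapsulation format is not given in the transcript.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Frame:
    l2_src: str
    l2_dst: str
    l3_src: str
    l3_dst: str


@dataclass(frozen=True)
class Encapsulated:
    vpc: str
    outer_src: str  # sending server
    outer_dst: str  # receiving server
    inner: Frame


# (VPC, instance IP) -> server, as shown on the slide.
MAPPINGS: Dict[Tuple[str, str], str] = {
    ("Orange", "10.0.0.2"): "192.168.0.3",
    ("Orange", "10.0.0.3"): "192.168.1.4",
}


def encapsulate(vpc, frame, local_server, mappings=MAPPINGS) -> Optional[Encapsulated]:
    """Sending side: wrap the frame and address it to the destination's server."""
    dst_server = mappings.get((vpc, frame.l3_dst))
    if dst_server is None:
        return None
    return Encapsulated(vpc, local_server, dst_server, frame)


def validate_and_deliver(packet, mappings=MAPPINGS) -> Optional[Frame]:
    """Receiving side: deliver only if the source instance really lives on outer_src."""
    if mappings.get((packet.vpc, packet.inner.l3_src)) != packet.outer_src:
        return None
    return packet.inner


frame = Frame("MAC(10.0.0.2)", "MAC(10.0.0.3)", "10.0.0.2", "10.0.0.3")
packet = encapsulate("Orange", frame, "192.168.0.3")
print(packet, validate_and_deliver(packet))
```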

Page 20: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

VPC Isolation

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 with their instances, and the Mapping Service]

• ARP request from instance 10.0.0.4: L2 Src: MAC(10.0.0.4), L2 Dst: ff:ff:ff:ff:ff:ff, ARP Who has 10.0.0.3?

• Mapping query: Src: 192.168.0.4, Dst: Mapping Service, Query: Grey 10.0.0.3

Page 21: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

VPC Isolation

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 with their instances, and the Mapping Service]

• ARP request from instance 10.0.0.4: L2 Src: MAC(10.0.0.4), L2 Dst: ff:ff:ff:ff:ff:ff, ARP Who has 10.0.0.3?

• Mapping query: Src: 192.168.0.4, Dst: Mapping Service, Query: Orange 10.0.0.3

• Mapping Service result: 192.168.0.4 is not hosting any instances in VPC Orange. Mapping Denied. Alarm Raised.
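
The denial on this slide depends on the Mapping Service knowing which VPCs each server hosts. A minimal sketch of that check follows. The table of hosted VPCs, the Grey placement, and the function name `query` are all hypothetical; only the rule itself (a server may query a VPC only if it hosts an instance in it, otherwise the query is denied and an alarm is raised) comes from the slide.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumption: which VPCs each server currently hosts instances for.
HOSTED_VPCS: Dict[str, Set[str]] = {
    "192.168.0.3": {"Orange"},
    "192.168.0.4": {"Grey"},
}

MAPPINGS: Dict[Tuple[str, str], str] = {
    ("Orange", "10.0.0.3"): "192.168.1.4",
    ("Grey", "10.0.0.3"): "192.168.1.3",  # hypothetical placement
}

alarms: List[str] = []


def query(requesting_server: str, vpc: str, instance_ip: str) -> Optional[str]:
    """Return the hosting server, or None (and record an alarm) if the query is denied."""
    if vpc not in HOSTED_VPCS.get(requesting_server, set()):
        alarms.append(f"{requesting_server} is not hosting any instances in VPC {vpc}")
        return None
    return MAPPINGS.get((vpc, instance_ip))


allowed = query("192.168.0.4", "Grey", "10.0.0.3")
denied = query("192.168.0.4", "Orange", "10.0.0.3")
print(allowed, denied, alarms)
```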

Page 22: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

VPC Isolation

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 with their instances, and the Mapping Service]

• Frame from instance 10.0.0.4: L2 Src: MAC(10.0.0.4), L2 Dst: MAC(10.0.0.3), L3 Src: 10.0.0.4, L3 Dst: 10.0.0.3, ICMP/TCP/UDP/…

• Encapsulation between servers: VPC: Orange, Src: 192.168.0.4, Dst: 192.168.1.4

• Validation request: Src: 192.168.1.4, Dst: Mapping Service, Validate: Orange 10.0.0.4 is at 192.168.0.4

• Validation reply: Src: Mapping Service, Dst: 192.168.1.4, Mapping invalid!

• Result: 192.168.1.4 does not deliver the packet to the instance. Alarm Raised.

Page 23: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

L3 – IP Routing

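[Diagram: instance 10.0.0.2 on one Ethernet Switch, instance 10.0.1.3 on another Ethernet Switch, and a Router between them]

• ARP request from 10.0.0.2: L2 Src: MAC(10.0.0.2), L2 Dst: ff:ff:ff:ff:ff:ff, ARP Who has 10.0.0.1?

• ARP response from the router: L2 Src: MAC(10.0.0.1), L2 Dst: MAC(10.0.0.2), ARP 10.0.0.1 is at MAC(10.0.0.1)

• Frame from 10.0.0.2 to the router: L2 Src: MAC(10.0.0.2), L2 Dst: MAC(10.0.0.1), L3 Src: 10.0.0.2, L3 Dst: 10.0.1.3, ICMP/TCP/UDP/…

• Frame from the router to 10.0.1.3: L2 Src: MAC(10.0.1.1), L2 Dst: MAC(10.0.1.3), L3 Src: 10.0.0.2, L3 Dst: 10.0.1.3, ICMP/TCP/UDP/…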

Page 24: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

L3 - VPC

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 with their instances, including 10.0.0.2 and 10.0.1.3, and the Mapping Service]

• ARP request from instance 10.0.0.2: L2 Src: MAC(10.0.0.2), L2 Dst: ff:ff:ff:ff:ff:ff, ARP Who has 10.0.0.1?

• Mapping query: Src: 192.168.0.3, Dst: Mapping Service, Query: Orange 10.0.0.1

• Mapping reply: Src: Mapping Service, Dst: 192.168.0.3, Reply: Host: Gateway, MAC: MAC(10.0.0.1)

• ARP response to instance 10.0.0.2: L2 Src: MAC(10.0.0.1), L2 Dst: MAC(10.0.0.2), ARP 10.0.0.1 is at MAC(10.0.0.1)

Page 25: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

L3 - VPC

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 with their instances, including 10.0.0.2 and 10.0.1.3, and the Mapping Service]

• Frame from instance 10.0.0.2: L2 Src: MAC(10.0.0.2), L2 Dst: MAC(10.0.0.1), L3 Src: 10.0.0.2, L3 Dst: 10.0.1.3, ICMP/TCP/UDP/…

• Mapping query: Src: 192.168.0.3, Dst: Mapping Service, Query: Orange 10.0.1.3

• Mapping reply: Src: Mapping Service, Dst: 192.168.0.3, Reply: Host: 192.168.1.4, MAC: MAC(10.0.1.3)

• Encapsulation between servers: VPC: Orange, Src: 192.168.0.3, Dst: 192.168.1.4

• Validation request: Src: 192.168.1.4, Dst: Mapping Service, Validate: Orange 10.0.0.2 is at 192.168.0.3

• Validation reply: Src: Mapping Service, Dst: 192.168.1.4, Mapping valid: Orange 10.0.0.2 is at 192.168.0.3

• Frame delivered to instance 10.0.1.3: L2 Src: MAC(10.0.1.1), L2 Dst: MAC(10.0.1.3), L3 Src: 10.0.0.2, L3 Dst: 10.0.1.3, ICMP/TCP/UDP/…
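
The frame that reaches 10.0.1.3 looks as if a router had forwarded it: the L3 addresses are unchanged, but the L2 source is the gateway of the destination subnet and the L2 destination is the target instance. A small sketch of that header rewrite, using the standard `ipaddress` module. It assumes /24 subnets whose gateway is the first address in the subnet, which matches the 10.0.0.1 and 10.0.1.1 addresses on the slides but is not stated as a rule there. The function names are invented.

```python
import ipaddress


def gateway_ip(instance_ip, prefixlen=24):
    """Assumption: the gateway is the first address of the instance's /24 subnet."""
    net = ipaddress.ip_network(f"{instance_ip}/{prefixlen}", strict=False)
    return str(net.network_address + 1)


def rewrite_for_delivery(frame):
    """Rewrite the L2 header as a router would; leave the L3 header untouched."""
    dst = frame["l3_dst"]
    return {**frame, "l2_src": f"MAC({gateway_ip(dst)})", "l2_dst": f"MAC({dst})"}


sent = {
    "l2_src": "MAC(10.0.0.2)",
    "l2_dst": "MAC(10.0.0.1)",
    "l3_src": "10.0.0.2",
    "l3_dst": "10.0.1.3",
}
delivered = rewrite_for_delivery(sent)
print(delivered)
```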

Page 26: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Caching

[Diagram: servers 192.168.0.3, 192.168.0.4, 192.168.1.3, and 192.168.1.4 with their instances, and the Mapping Service]

• Frame shown: L2 Src: MAC(10.0.1.1), L2 Dst: MAC(10.0.1.3), L3 Src: 10.0.0.2, L3 Dst: 10.0.1.3, ICMP/TCP/UDP/…
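
The transcript gives only the title of this slide, so how mappings are cached, refreshed, or invalidated is not recoverable here. As a generic illustration of the idea only, the sketch below puts a per-server cache in front of a lookup function so that repeated packets to the same destination do not each trigger a query. The class name, the `backend` callable, and the absence of any expiry are all assumptions.

```python
class CachingResolver:
    """Remember lookup results so each (VPC, IP) pair is queried at most once."""

    def __init__(self, backend):
        self._backend = backend  # callable: (vpc, ip) -> server or None
        self._cache = {}
        self.backend_calls = 0

    def resolve(self, vpc, ip):
        key = (vpc, ip)
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._backend(vpc, ip)
        return self._cache[key]


# Hypothetical backend standing in for a Mapping Service query.
TABLE = {("Orange", "10.0.1.3"): "192.168.1.4"}
resolver = CachingResolver(lambda vpc, ip: TABLE.get((vpc, ip)))

first = resolver.resolve("Orange", "10.0.1.3")
second = resolver.resolve("Orange", "10.0.1.3")
print(first, second, resolver.backend_calls)
```

A real cache would also need a way to drop stale entries when an instance moves or terminates; that part is deliberately left out because the transcript does not describe it.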

Page 27: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

VPC Pricing

Cost per VPC: $0.00

Cost per Subnet: $0.00

Upcharge per Instance: $0.00

Page 28: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Nov 10, 2010

Page 29: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

VPC as a Platform

VPC: 172.31.0.0/18

• Subnet 172.31.1.0/24: instances 172.31.1.7, 172.31.1.8

• Subnet 172.31.2.0/24: instances 172.31.2.12, 172.31.2.51

Page 30: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

VPC as a Platform

• VPN and Direct Connect

• Security Group Egress Filtering

• Network ACLs

• Routing Tables

• Elastic Network Interfaces (ENIs)

• Multiple IPs

Page 31: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

[Diagram: a spectrum labeled Simple and Limited at one end, Complex and Flexible at the other, with EC2 and VPC placed on it]

Page 32: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Default VPC

VPC: 172.31.0.0/18

• Subnet 172.31.1.0/24: instances 172.31.1.7, 172.31.1.8, 172.31.1.9

• Subnet 172.31.2.0/24: instances 172.31.2.12, 172.31.2.51

Page 33: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

[Diagram: the same spectrum, Simple and Limited at one end, Complex and Flexible at the other, now with a single label "EC2 - VPC"]

Page 34: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Other VPC Sessions

ARC202: High Availability Application Architectures in Amazon VPC

ARC401: From One to Many: Evolving VPC Design

CPN208: Selecting the Best VPC Network Architecture (single VPC vs. multiple VPCs)

CPN301: Amazon EC2 to Amazon VPC: A case study (this is the migration story)

Page 35: A Day in the Life of a Billion Packets (CPN401) | AWS re:Invent 2013

Please give us your feedback on this presentation

As a thank you, we will select prize winners daily for completed surveys!

CPN401