AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Flow (NET302)
Post on 16-Apr-2017
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Sergey Royt, SDM, Amazon Route 53
11/30/2016
NET302
Managing Global Traffic with Amazon
Route 53 Traffic Flow
What to Expect from the Session
• Concepts for managing global traffic
• Introducing Amazon Route 53 traffic flow
• Using traffic flow for traffic management
• Case study: Amazon VPN endpoint selection
What is traffic management?
• Connecting clients to servers
• End users
• Programmatic clients
• Internal clients (components of your systems)
How to manage traffic
• Load balancing: network proxy to servers
• Service discovery: let clients decide
• Content delivery network (CDN): fully managed
distribution
• DNS level: more flexibility than CDN; multiple origins
What are typical application architectures?
• One typical progression
• Growing demands lead to increased level of
sophistication
Load balancing
Fleet of servers behind a load balancer
Auto Scaling group
Elastic Load Balancing
instances
Multi-AZ service
Load balancers across multiple Availability Zones
Auto Scaling group
Elastic Load
Balancing
instances
Availability Zone 1
Auto Scaling group
instances
Availability Zone 2
Global service
Load balancers and/or servers across multiple regions
globally
Region A
Region B
Region C
user
user
user
Serving a global user base
• Get closer to users; latency matters
• In the old days: create separate “stacks” with domain
names: mycorp.com (USA), mycorp.ca (Canada),
mycorp.fr (France), etc.
• Can more advanced DNS help?
The role of DNS
• Point of entry to your service
• The one global piece of infrastructure
• Recognize geographical locations
• Recognize client networks
Evolution of DNS for a global service
Static records
• Need capabilities to route traffic
MX 10 mail.example.com.
MX 20 mail2.example.com.
mail A 10.0.1.2
svc A 10.0.1.3
www CNAME svc.example.com.
Evolution of DNS for a global service
Dynamic source-discriminating DNS records
• All these records -- what about management?
Evolution of DNS for a global service
Policy-based configuration
• More meaning, less overhead!
mycorp.com
Europe
eu-central-1
Madrid
Americas
East Coast
California
Introduction to Route 53 traffic flow
• Traffic policy is a versioned document composed of rules
and endpoints
• Versioning provides atomic roll back/roll forward
• Traffic policy is applied to an actual domain name, so all
rules and endpoints apply to that domain name
• You can use the same traffic policy for more than one
domain name
Traffic flow terminology
• Traffic policy – rules routing to endpoints
• Traffic policy record – domain name with an applied
traffic policy version
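The relationship between a traffic policy, its versions, and the policy records that apply a version to a domain name can be sketched as a small in-memory model. This is only an illustration of the concept: the class names, the `switch_to` method, the document contents, and the example domain names are assumptions made for this sketch, not the Route 53 API or its policy document format.

```python
from dataclasses import dataclass, field


@dataclass
class TrafficPolicy:
    """Hypothetical model: a named policy holding numbered versions."""
    name: str
    versions: dict = field(default_factory=dict)  # version number -> document

    def add_version(self, document: dict) -> int:
        # Each new document gets the next version number; old ones are kept.
        version = len(self.versions) + 1
        self.versions[version] = document
        return version


@dataclass
class PolicyRecord:
    """Hypothetical model: a domain name bound to one policy version."""
    domain: str
    policy: TrafficPolicy
    version: int

    def switch_to(self, version: int) -> None:
        # Roll forward or back by repointing to another existing version.
        if version not in self.policy.versions:
            raise KeyError(f"unknown version {version}")
        self.version = version

    def document(self) -> dict:
        return self.policy.versions[self.version]


policy = TrafficPolicy("vpn-routing")
v1 = policy.add_version({"endpoints": ["1.1.1.1"]})
v2 = policy.add_version({"endpoints": ["1.1.1.1", "2.2.2.2"]})

# The same policy can back more than one domain name.
records = [
    PolicyRecord("vpn.example.com", policy, v2),
    PolicyRecord("vpn.example.net", policy, v2),
]
records[0].switch_to(v1)  # roll back one record; the other stays on v2
```

Because a record only stores a reference to a version, switching versions is a single assignment in this model, which is one way to picture the atomic roll back/roll forward described above.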
Traffic flow: endpoints
• Hybrid/low level infrastructure: IP address or CNAME
• ELB Classic Load Balancer / Application Load Balancer
• Amazon S3 website
• Amazon CloudFront distribution
• AWS Elastic Beanstalk environment
Traffic flow: basic rules
• Failover
• Primary/secondary
• Weighted
• Round robin across multiple items
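The two basic rule types can be pictured with a minimal sketch. The function names and the `healthy` mapping are hypothetical, and the weighted rule is modeled here as weighted random selection, which is an assumption about how the distribution might be approximated rather than a description of the service's internals.

```python
import random


def failover(primary, secondary, healthy):
    """Return the primary endpoint if it is healthy, otherwise the secondary."""
    return primary if healthy.get(primary, False) else secondary


def weighted(items, rng=random):
    """Pick one endpoint with probability proportional to its weight.

    items: list of (endpoint, weight) pairs; equal weights give an even spread.
    """
    endpoints = [endpoint for endpoint, _ in items]
    weights = [weight for _, weight in items]
    return rng.choices(endpoints, weights=weights, k=1)[0]


# Hypothetical endpoints: the primary is down, so the secondary is returned.
chosen = failover("primary.example.com", "backup.example.com",
                  {"primary.example.com": False})
```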
Traffic flow: geo
• Routes traffic to endpoints based on location
• Location is [Continent [Country [Subdivision]]]
• Used for:
• Specializing content
• Balancing traffic distribution between data centers
• Pre-optimizing network link selection
Under the hood: geo
• Identify request
• DNS resolver address
• EDNS0 client subnet option (if available)
• Check geo database for location – continent, country,
subdivision
• Find the most specific entry in the rule items to match
the location
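The "most specific entry" step can be sketched as a longest-prefix lookup over the continent/country/subdivision hierarchy. This is a simplified illustration: the `match_geo` name, the tuple keys, the location codes, and the endpoint labels are all assumptions for the example, and the real geo database lookup is not shown.

```python
def match_geo(rule_items, location):
    """Return the endpoint of the most specific rule item matching location.

    location: (continent, country, subdivision); trailing parts may be None.
    rule_items: dict mapping location prefixes (tuples) to endpoints; the
    empty tuple () acts as the default item.
    """
    # Keep only the known leading parts of the location.
    parts = []
    for part in location:
        if part is None:
            break
        parts.append(part)

    # Try the full location first, then progressively shorter prefixes.
    for length in range(len(parts), -1, -1):
        key = tuple(parts[:length])
        if key in rule_items:
            return rule_items[key]
    return None


# Hypothetical rule items, loosely following the policy diagram in the slides.
rules = {
    ("EU",): "eu-endpoint",
    ("EU", "ES"): "madrid-endpoint",
    ("NA", "US", "CA"): "california-endpoint",
    (): "default-endpoint",
}
```

With these items, a request located in Spain matches the country-level entry, a request from elsewhere in Europe falls back to the continent entry, and anything unmatched reaches the default.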
Traffic flow: latency
• Routes traffic to closest AWS Region based on latency
• Good default choice for routing between endpoints in
AWS Regions
Under the hood: latency
• Identify request
• Check latency database for preference order of all
regions
• Go to top healthy choice among the rule items
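The selection step described above might look like the following sketch. The preference list stands in for the latency database's output for one requester; the function name, region keys, and endpoint labels are hypothetical.

```python
def pick_by_latency(preference, rule_items, healthy):
    """Return the first healthy endpoint in the requester's preference order.

    preference: regions ordered best-first for this requester.
    rule_items: dict mapping a region to the endpoint configured for it.
    healthy: dict mapping an endpoint to its current health status.
    """
    for region in preference:
        endpoint = rule_items.get(region)  # regions without a rule item are skipped
        if endpoint is not None and healthy.get(endpoint, False):
            return endpoint
    return None


# Hypothetical data: the closest region has no rule item, the next is unhealthy.
preference = ["region-c", "region-a", "region-b"]
rule_items = {"region-a": "elb-a", "region-b": "elb-b"}
healthy = {"elb-a": False, "elb-b": True}
best = pick_by_latency(preference, rule_items, healthy)
```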
Where is the latency data from?
• Large scale experimentation system
• Measures web client latency to different AWS Regions
• Relates latency data to the DNS resolver used by the
client
• Output: IP address -> region preference vector
Test record set
• Simple “what-if” testing for troubleshooting policy
configurations
• Check a source based on resolver or client subnet
• Works for traffic policy records or regular records
Case study: Amazon VPN endpoints
• AWS and Amazon retail are highly operationally focused
• VPN enables 24/7 support of all our services
• Need to manage endpoints for end users to connect to
the VPN
• Multi-regional highly available service
The problem
• Manual region selection by users
• Overloaded VPN servers
• Single server faults require retry by users in order to
switch to a healthy server
Implementing with traffic flow
• Model the desired flow based on available endpoints
• Convert the model into a traffic policy
User’s country
Default: closest region
South Africa
Romania
…
[AWS Region]
US East 1
US West 1
…
Server round robin
1.1.1.1
2.2.2.2
…
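The three-level flow in the diagram (country rule, then region, then server round robin) can be sketched as chained lookups. The slides do not say which region South Africa or Romania map to, so the country overrides below are purely hypothetical, as are the region identifiers, the extra placeholder server addresses, and the `select_server` helper.

```python
import itertools

# Hypothetical country overrides; any other country uses the closest region.
COUNTRY_OVERRIDES = {"ZA": "us-east-1", "RO": "us-west-1"}

# Placeholder server addresses per region (not real VPN endpoints).
SERVERS = {
    "us-east-1": ["1.1.1.1", "2.2.2.2"],
    "us-west-1": ["3.3.3.3", "4.4.4.4"],
}

# One round-robin iterator per region.
_cyclers = {region: itertools.cycle(servers) for region, servers in SERVERS.items()}


def select_server(country, closest_region):
    """Apply the country rule, fall back to the closest region, then rotate servers."""
    region = COUNTRY_OVERRIDES.get(country, closest_region)
    return next(_cyclers[region])


first = select_server("ZA", "us-west-1")   # country override wins
second = select_server("DE", "us-east-1")  # default: closest region
third = select_server("ZA", "us-west-1")   # rotation wraps within the region
```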
Case study: endpoint selection improvement
“Our fleet consists of different types of hardware and
Route 53 allows us to send more connections to VPN
servers with higher capacity than to the ones with
lower capacity”
Case study: endpoint health checking
Pre-create per endpoint health checks for non-AWS
endpoints
“With the use of Route 53 policies and health checks
we have been able to avoid bad user experience in
cases of downtime in the VPN servers”
How does DNS failover work?
Route 53 health checks provide highly available failover
with a predictable window
Failover Time =
Interval (30s or 10s) * Failure Threshold (min: 1) +
Health check aggregation time (10s) +
Record TTL (60s is typical for dynamic domain names)
~= 70 to 90 seconds
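The formula can be evaluated directly for the listed values. The sketch below is a plain sum of the three terms at a failure threshold of 1; the function name and defaults are assumptions for the example. Note that the straightforward sum gives 80 and 100 seconds for the 10-second and 30-second intervals, so treat it as an upper-bound illustration rather than a reproduction of the slide's approximate 70 to 90 second figure.

```python
def failover_time(interval_s, failure_threshold, aggregation_s=10, ttl_s=60):
    """Sum of detection time, health check aggregation time, and record TTL."""
    return interval_s * failure_threshold + aggregation_s + ttl_s


fast = failover_time(10, 1)      # 10 * 1 + 10 + 60
standard = failover_time(30, 1)  # 30 * 1 + 10 + 60
print(fast, standard)
```

Raising the failure threshold or the record TTL lengthens the window linearly, which is why short TTLs are typical for names that rely on DNS failover.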
Failover considerations
For planned downtime, let traffic drain to avoid the failover
window and accommodate non-conforming DNS resolvers:
• Update policy record to remove the endpoint from DNS
• Wait for server request metric to go down
• Deactivate the server
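The drain steps above can be expressed as a small procedure. Everything here is a sketch under assumptions: the three callables are hypothetical hooks that the operator would supply (for example, one that updates the policy record and one that reads a request metric), and the threshold, polling interval, and timeout are illustrative values.

```python
import time


def drain_and_deactivate(remove_from_dns, request_rate, deactivate,
                         threshold=0.0, poll_s=5.0, timeout_s=600.0,
                         sleep=time.sleep):
    """Remove an endpoint from DNS, wait for traffic to drain, then deactivate.

    Returns True if the server was deactivated, or False if traffic did not
    drop to the threshold within the timeout (the server is left running).
    """
    remove_from_dns()  # step 1: stop handing out this endpoint
    waited = 0.0
    while request_rate() > threshold:  # step 2: watch the request metric
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
    deactivate()  # step 3: safe to take the server down
    return True
```

Waiting on the observed request rate, rather than on the record TTL alone, is what accommodates resolvers that keep serving a cached answer longer than the TTL allows.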
Note on circular dependencies
• When designing for high availability, don’t forget a
fail-safe
• The VPN client includes an option that bypasses Route
53 as a way to break out of the dependency cycle for
Route 53 operators
Case study: management benefits
“Route 53 has helped to make simple and easy the
execution of operational tasks in our VPN fleet: with
the use of traffic policies we are able to add or remove
VPN servers from production before maintenance
without user impact”
Related Sessions
• NET202 - DNS Demystified: Getting Started with
Amazon Route 53, featuring Warner Bros.
Entertainment
• NET203 - From EC2 to ECS: How Capital One uses
Application Load Balancer Features to Serve Traffic at
Scale
• NET403 - Elastic Load Balancing Deep Dive and Best
Practices
Amazon Route 53 survey
Give us your feedback about Route 53’s features and
usability at http://amzn.to/Route53_300
Meet the Route 53 team and get Route 53 swag at the
Networking, Content Delivery, & Media Solutions booth.