more nines for your dimes: improving availability and lowering costs using auto scaling and amazon...
DESCRIPTION
Running your Amazon EC2 instances in Auto Scaling groups allows you to improve your application's availability right out of the box. Auto Scaling replaces impaired or unhealthy instances automatically to maintain your desired number of instances (even if that number is one). You can also use Auto Scaling to automate the provisioning of new instances and software configurations, as well as to keep track of usage and costs by app, project, or cost center. Of course, you can also use Auto Scaling to adjust capacity as needed - on demand, on a schedule, or dynamically based on demand. In this session, we show you a few of the tools you can use to enable Auto Scaling for the applications you run on Amazon EC2.
TRANSCRIPT
© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
More Nines for Your Dimes: Improving Availability and Lowering Costs using Auto Scaling
Ran Tessler, AWS Solutions Architecture ([email protected])
September 17, 2014
Topics We’ll Cover Today
• Auto Scaling introduction
• Console demo
• Maintaining application response times and fleet utilization
• Handling cyclical demand, unexpected “weather events”
• Auto Scaling for 99.9% Uptime
• Single-instance groups
• The opportunity cost of NOT scaling
• Auto Scaling to reduce costs
AWS
The Weather Channel
Nokia
Dreambox
Ways You Can Use Auto Scaling
Launch EC2 instances and groups from reusable templates
Scale up and down as needed automatically
Auto-replace instances and maintain EC2 capacity
Launch Configurations, Auto Scaling Groups, Auto Scaling Policies
Demo
Learn the new terms:
Launch Configuration
Auto Scaling Group
Scaling Policy
Amazon CloudWatch Alarm
Amazon SNS Notification
What’s New in Auto Scaling
Better integration
• EC2 console support
• Scheduled scaling policies in CloudFormation templates
• ELB connection draining
• Auto-assign public IPs in VPC
• Spot + Auto Scaling
More APIs
• Create groups based on running instances
• Create launch configurations based on running instances
• Attach or detach running instances from a group
• Perform lifecycle actions on group instances
• Place instances in standby state for troubleshooting
Why Auto Scaling?
Scale Up, Control Costs, Improve Availability
The Weather Company
• Top 30 web property in the U.S.
• 2nd most viewed television channel in the U.S.
• 85% of U.S. airlines depend on our forecasts
• Major retailers base marketing spend and store displays on our forecasts
• 163 million unique visitors across TV and web
Wunderground Radar and Maps
100 million hits a day
One billion data points per day
Migrated real-time radar mapping system wunderground.com to AWS Cloud
30,000 Personal Weather Stations
Source: Wunderground, Inc. 2013
Why Auto Scaling? Hurricane Sandy
Before Migration – Traditional IT Model doesn’t scale well
[Charts: Server Count (110 servers), Avg. CPU Load, HTTP Response Latency (~6000 ms)]
After Migration - Wunderground Radar App
[Charts: Server Count (from 110 to 170 instances), Avg. CPU Load, HTTP Response Latency (5-15 ms)]
Radar on AWS Auto Scaling Architecture
Radar on AWS: CPU Utilization
Radar on AWS: Host Count
Scale up to ensure consistent performance during high-demand
Why Auto Scaling?
Scale Up, Control Costs, Improve Availability
Auto Scaling for 99.9% Uptime
Here.com Local Search Application
• Local Search app
• First customer-facing application on AWS
• Obvious need for Uptime
Here.com Local Search Architecture
Regions: US-East-1, US-West-2, EU-West-1, AP-Southeast-1
US-East-1a: Zookeeper1, Zookeeper2, Zookeeper3, Frontend Group, Backend Groups
US-East-1b: Zookeeper1, Zookeeper2, Zookeeper3, Frontend Group, Backend Groups
Single-Instance Auto Scaling Groups (Zookeeper)
1. Auto-healing: Instances auto-register in DNS via Route53
2. Dynamic: Auto Scaling Group Names are used for cluster-node lookups (cluster1-zookeeper1)
3. Used Standard Tools such as DNS instead of Queries or Elastic IPs
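The single-instance pattern above can be sketched as plain request parameters. This is a minimal illustration, not the Here.com implementation: the helper names, the launch configuration name, and the `example.internal` domain are assumptions made for the example. The dictionaries follow the parameter shapes accepted by the boto3 calls `create_auto_scaling_group` (Auto Scaling client) and `change_resource_record_sets` (Route 53 client), but nothing here calls AWS.

```python
def single_instance_group_params(cluster, role, index, launch_config, zone):
    """Build parameters for an Auto Scaling group that always holds one instance.

    The group name doubles as the node's lookup name, e.g. cluster1-zookeeper1.
    """
    name = f"{cluster}-{role}{index}"
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_config,  # hypothetical name
        # Min = Max = Desired = 1: a failed instance is replaced, never scaled.
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        "AvailabilityZones": [zone],
        "HealthCheckType": "EC2",
        "Tags": [{"Key": "Name", "Value": name, "PropagateAtLaunch": True}],
    }


def dns_upsert_batch(group_name, domain, private_ip, ttl=60):
    """Build a Route 53 change batch that points <group_name>.<domain> at an IP.

    A replacement instance would submit this at boot to take over the name.
    """
    return {
        "Changes": [
            {
                "Action": "UPSERT",  # create the record or overwrite the old IP
                "ResourceRecordSet": {
                    "Name": f"{group_name}.{domain}.",
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": private_ip}],
                },
            }
        ]
    }


if __name__ == "__main__":
    group = single_instance_group_params(
        "cluster1", "zookeeper", 1, "zk-launch-config", "us-east-1a"
    )
    batch = dns_upsert_batch(group["AutoScalingGroupName"], "example.internal", "10.0.0.12")
    print(group["AutoScalingGroupName"])
    print(batch["Changes"][0]["ResourceRecordSet"]["Name"])
```

With boto3, the first dictionary would be passed as `create_auto_scaling_group(**group)` and the second as the `ChangeBatch` argument of `change_resource_record_sets`, alongside a hosted zone ID. A short TTL keeps clients from caching the address of a replaced instance for long.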
Here.com Local Search Success
• Increased Uptime to 99.9%
• All instances with detected health problems have been successfully replaced by Auto Scaling with zero intervention.
• Zookeeper setup has performed flawlessly
“We’ve been paranoid so it still pages us; It’s beginning to feel silly.”
Why Auto Scaling?
Scale Up, Control Costs, Improve Availability
A little background on our application
• Ruby on Rails
• Unicorn
• We teach kids math!
A workload well suited for auto scaling
The opportunity cost of NOT scaling
• Our usage curve from 3/20
• Low of about 5 concurrent users
• High of about 10,000 concurrent users
The opportunity cost of NOT scaling
• No autoscaling
• 672 instance hours
• $302.40 at on-demand prices
The opportunity cost of NOT scaling
• Autoscaling four times per day
• 360 instance hours
• $162 at on-demand prices
• 46% savings vs no autoscaling
The opportunity cost of NOT scaling
• Autoscaling as needed, twelve times per day
• 272 instance hours
• $122.40 at on-demand prices
• 24% savings vs scaling 4 times per day
• 60% savings vs no autoscaling
The opportunity cost of NOT scaling
$302/day
$162/day
$122/day
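The daily figures above can be reproduced with a few lines of arithmetic. The sketch below assumes a single flat on-demand rate, which the slides do not state directly; it is derived here from the no-autoscaling case ($302.40 for 672 instance hours, or $0.45 per instance hour). The function names are illustrative.

```python
# Hourly rate implied by the no-autoscaling slide (an inference, not a quoted price).
HOURLY_RATE = 302.40 / 672


def daily_cost(instance_hours, rate=HOURLY_RATE):
    """Daily cost in dollars for a given number of instance hours."""
    return round(instance_hours * rate, 2)


def savings_pct(baseline_hours, scaled_hours):
    """Percentage saved by running scaled_hours instead of baseline_hours."""
    return round(100 * (1 - scaled_hours / baseline_hours))


if __name__ == "__main__":
    scenarios = {
        "no autoscaling": 672,
        "scaling 4 times per day": 360,
        "scaling 12 times per day": 272,
    }
    for label, hours in scenarios.items():
        print(f"{label}: {hours} instance hours, ${daily_cost(hours):.2f}/day")
    print("4x vs none:", savings_pct(672, 360), "%")
    print("12x vs 4x:", savings_pct(360, 272), "%")
    print("12x vs none:", savings_pct(672, 272), "%")
```

Because the rate is constant, the savings percentages depend only on the ratio of instance hours, which is why they can be computed without knowing the price.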
Demand curve hugs the usage curve…
“Auto Scaling saves us a lot of money; with a little bit of math, flexibility of AWS allows us to further save by aligning our demand curve with usage curve.” -- Dreambox
Why Auto Scaling?
Scale Up, Control Costs, Improve Availability
Key Takeaways
• Maintaining application response times and fleet utilization
• Scaling up and handling unexpected “weather events”
• Auto Scaling for 99.9% Uptime
• Single-instance groups
• The opportunity cost of NOT scaling
• Auto Scaling to reduce costs
The Weather Channel
Nokia
Dreambox
Common Scenarios
• Prepare for a Big Launch: Schedule a one-time scale out and flip to production
• Fit Capacity to Demand: Follow daily, weekly, or monthly cycles
• Be Ready for Spikes: Provision capacity dynamically by scaling on CPU, memory, request rate, queue depth, users, etc.
• Simplify Cost Allocation: Auto-tag instances with cost center, project, version, stage
• Maintain Stable Capacity: Auto-replace instances that fail ELB or EC2 checks
• Go Multi-AZ: Auto-balance instances across multiple zones.
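Two of the scenarios above, following a recurring cycle and scaling dynamically, can be sketched as request parameters. This is an illustration under assumptions: the group name, action and policy names, cron schedule, and sizes are invented for the example. The dictionary keys follow the shapes accepted by the boto3 Auto Scaling client calls `put_scheduled_update_group_action` and `put_scaling_policy`, but the code below does not call AWS.

```python
def scheduled_action_params(group, name, cron, min_size, max_size, desired):
    """Parameters for a recurring scheduled scaling action (cron is in UTC)."""
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max size")
    return {
        "AutoScalingGroupName": group,
        "ScheduledActionName": name,
        "Recurrence": cron,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
    }


def simple_scaling_policy_params(group, name, adjustment, cooldown=300):
    """Parameters for a policy that adds or removes a fixed number of instances.

    A CloudWatch alarm (for example on average CPU) would invoke the policy;
    creating that alarm is outside this sketch.
    """
    return {
        "AutoScalingGroupName": group,
        "PolicyName": name,
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": adjustment,  # positive scales out, negative scales in
        "Cooldown": cooldown,  # seconds to wait before another scaling activity
    }


if __name__ == "__main__":
    # Hypothetical weekday-morning scale out for a group named "web-frontend".
    morning = scheduled_action_params(
        "web-frontend", "weekday-morning", "0 7 * * 1-5", 4, 20, 8
    )
    scale_out = simple_scaling_policy_params("web-frontend", "cpu-high-add-2", 2)
    scale_in = simple_scaling_policy_params("web-frontend", "cpu-low-remove-1", -1)
    for params in (morning, scale_out, scale_in):
        print(params)
```

Pairing a scheduled action for the predictable part of the cycle with alarm-driven policies for the unexpected part is one way to cover both "Fit Capacity to Demand" and "Be Ready for Spikes" with the same group.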
Thank You!