(ISM402) Cost Optimization at Scale
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Evan Crawford
Commercial Optimization - APAC Lead
Michael Fuller
Principal Systems Engineer - Atlassian
October 8, 2015 | Las Vegas, NV
What to Expect from the Session
All of our customers only pay for what they use.
Some of our customers only pay for what they need.
We will share tips from our largest and most innovative customers who only pay for what they need.
Your business can save lots with these tips!
Example 1
Financial Services Enterprise
A Financial Services Enterprise
In twelve months, it increased its CPU usage nine-fold but only increased its spend four-fold.
$260k saving per month!
Example 2
Technology Company
A Technology Company
In the last three months, it doubled the CPU and traffic used by its Web servers and reduced its instance spend by 33%.
$72k saving per month!
What levers did they pull?
Commercial Optimization Levers
Elasticity
Step 1: Turn off Non-Production
Step 2: Auto Scale Production
Right-Sizing
Step 1: Use the Cheapest Available Instance
Reserved Capacity
Step 1: Cover always-on resources. Target = 70% of always-on resources covered
Step 2: Leverage RI flexibility to increase utilization. Target = 95% RI Utilization
Example 1
Financial Services Enterprise
Elastic Compute Unit (ECU)
A consistent measure of CPU processing power
Financial Services Enterprise
What                 Apr ’14      Apr ’15       Δ
Peak Compute Usage   1,601k ECU   13,957k ECU   +772%
Instance Costs       $59k pm      $244k pm      +313%
$270k saving per month!
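One way to read the table: had spend grown in step with peak compute usage, the April 2015 bill would have been much higher than $244k. The Python sketch below reproduces that arithmetic. Treating the saving as "usage-scaled cost minus actual cost" is an assumption about how the figure was derived, and the function name is illustrative.

```python
def saving_vs_linear_growth(usage_before, usage_after, cost_before, cost_after):
    """Saving relative to a bill that grows in proportion to usage."""
    projected_cost = cost_before * (usage_after / usage_before)
    return projected_cost - cost_after


# Figures from the table above: usage in k ECU, cost in $k per month.
saving = saving_vs_linear_growth(1601, 13957, 59, 244)
print(f"about ${saving:.0f}k per month")
```

Under that assumption the result lands close to the per-month saving quoted with the table.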
Financial Services Enterprise
Unit Cost: Elastic Compute Unit Per Hour
60% Reduction in Unit Cost
Chart, Apr 1 2014 to Apr 30 2014: consistent 1,000 to 1,100 ECUs provisioned
Chart, Apr 1 2015 to Apr 30 2015
40% Reduction in Unit Cost
30% Reduction in Unit Cost
Example 2
Technology Company
Technology Company
What             June       Aug          Δ
Data Out (TB)    36         95           +163%
Compute          584k ECU   1,192k ECU   +104%
Instance Costs   $36k pm    $24k pm      -33%
$72k saving per month!
Technology Company
Cost: Elastic Compute Unit Per Hour
70% Reduction in Unit Cost
Technology Company
C4 On Demand = $0.02 / ECU
M1 On Demand = $0.07 / ECU
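The two unit costs above are enough to check the right-sizing lever with a few lines of Python. This is a minimal sketch: the dictionary holds only the two figures quoted on the slide, and the variable names are illustrative.

```python
# On-Demand cost in $ per ECU-hour, as quoted above.
unit_cost = {"m1": 0.07, "c4": 0.02}

# Right-sizing step 1: pick the family with the lowest cost per ECU.
cheapest = min(unit_cost, key=unit_cost.get)

# Relative reduction in unit cost when moving from m1 to the cheapest family.
reduction_pct = (unit_cost["m1"] - unit_cost[cheapest]) / unit_cost["m1"] * 100
print(cheapest, f"{reduction_pct:.0f}% lower cost per ECU")
```

The move from M1 to C4 alone comes out at roughly a 70% lower cost per ECU.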
Technology Company
60% Reduction in Unit Cost
30% Reduction in Unit Cost
How to use those levers at scale?
Automation: What we need to do
Understand Opportunities
1. Auto-tag resources
2. Identify ‘always on’ Non-Prod
3. Identify instances to down-size
4. Recommend RIs to purchase
5. Dashboard our status
6. Report on savings
Action Changes
1. Allocate costs by tag & account
2. Turn off Non-Prod instances daily
3. Quickly change instance sizes
4. Move underutilized RIs
How to set up Automated Tools
Dashboards
https://github.com/evancraw/AWSOptimizationTemplates
A Reasonably Optimized Dashboard
A Dashboard ripe with opportunity
Reserved Instances and Right-Sizing
Build Your Own
Reserved Capacity Recommendations
Right-Sizing Recommendations
https://github.com/evancraw/AWSOptimizationTemplates
From Automatic Tagging to Tagging Governance with AWS Config
Right-size with cloud native provisioning
AWS CloudFormation
AWS OpsWorks
Handy Tools
Move RIs automatically
https://github.com/jros2300/reservedinstances
Tableau Templates
https://github.com/evancraw/AWSOptimizationTemplates (Dashboards, right-sizing, reserved capacity)
Start / Stop Non-Prod Daily
ape.gs/PowerCycleReInvent
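The scheduling decision behind a daily start/stop job can be sketched in Python as follows. This does not describe how the linked tool works. The `Environment` tag key, the `nonprod` value, and the 08:00 to 18:00 weekday window are all assumptions, and the actual start and stop calls are left out.

```python
from datetime import datetime


def should_be_running(tags, now, start_hour=8, end_hour=18):
    """Decide whether an instance should be up at the given local time.

    Only instances explicitly tagged as non-production are scheduled;
    anything else (including untagged instances) is left running.
    """
    if tags.get("Environment", "").lower() != "nonprod":
        return True
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour


dev_box = {"Environment": "nonprod"}
print(should_be_running(dev_box, datetime(2015, 10, 8, 10, 0)))   # Thursday morning
print(should_be_running(dev_box, datetime(2015, 10, 10, 10, 0)))  # Saturday morning
```

Leaving untagged instances running is the conservative choice here. It avoids stopping production systems that have not been tagged yet.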
Something missing?
Unit Cost
Why will those levers be used?
A Lean Culture
What Lean Culture Looks Like
Users:
• Understand
• Take responsibility for
• Act to lower
the costs of their usage as a normal part of their day
Build a Lean Culture
Targets and Metrics
Cloud Competency Center
AWS Enterprise Support
A Cycle of Cost Optimization
Metrics
1. % Instances turned off daily
2. % Instances right-sized
3. % Always On Resources Covered by RI
4. % RI Utilization
All weighted by ECU
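"Weighted by ECU" means a large instance counts for more than a small one in each percentage. A minimal Python sketch of that weighting follows. The dictionary layout and the sample fleet are hypothetical.

```python
def ecu_weighted_percent(instances, flag):
    """Percentage of total ECU belonging to instances where `flag` is true."""
    total_ecu = sum(i["ecu"] for i in instances)
    if total_ecu == 0:
        return 0.0
    matching_ecu = sum(i["ecu"] for i in instances if i[flag])
    return 100.0 * matching_ecu / total_ecu


# Hypothetical fleet: one large right-sized instance, one small oversized one.
fleet = [
    {"ecu": 8, "right_sized": True},
    {"ecu": 2, "right_sized": False},
]
weighted = ecu_weighted_percent(fleet, "right_sized")
print(weighted)
```

A plain instance count would put this fleet at 50% right-sized. Weighting by ECU gives 80%, which better reflects where the spend sits.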
What Works
Think Big
Start Small
Learn Fast
Trust and Verify
Lean Culture
Watch out for
Waiting until you get big
Analysis paralysis
Trying to do it all at once
Constraining innovation by making teams wait
2 Years Ago
How do we explain the costs?
How do we allocate the charges to the right team?
How do we save money?
Whose responsibility is it to save money?
Cloud Engineering Team
1UP
Finance Team
Track Down AWS Accounts
Consolidated Billing
Account A   $$
Account B   $$$$
Account C   $$$
Total       $$$$$$$
Single Monthly Invoice
Centralize AWS Account Creation
What is in the Accounts?
TAGS
Create a Tagging Policy
CSV
JSON
?
Our Tags
Cost Center
Responsible
Owner
Service
Name
Number of Tags vs Enough Information
Report on Tagging Progress
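Tagging progress can be reported as the share of resources that carry every tag the policy requires. The Python sketch below assumes each resource is represented by a dictionary of its tags. The required keys are placeholders for your own policy.

```python
# Placeholder policy: substitute the tag keys your organization requires.
REQUIRED_TAGS = ("Cost Center", "Owner", "Service")


def is_compliant(tags):
    """A resource complies when every required tag is present and non-empty."""
    return all(tags.get(key) for key in REQUIRED_TAGS)


def tagging_progress(resources):
    """Percentage of resources that satisfy the tagging policy."""
    if not resources:
        return 100.0
    compliant = sum(1 for tags in resources if is_compliant(tags))
    return 100.0 * compliant / len(resources)


# Hypothetical inventory of four resources.
inventory = [
    {"Cost Center": "1001", "Owner": "alice", "Service": "web"},
    {"Cost Center": "1001", "Owner": "", "Service": "web"},
    {"Owner": "bob"},
    {"Cost Center": "2002", "Owner": "carol", "Service": "batch"},
]
progress = tagging_progress(inventory)
print(f"{progress:.0f}% of resources fully tagged")
```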
Tag at Resource Creation
Automate Tagging
Build Reports from the Tags
How do we Allocate the Charges to the Right Team?
Untagged Costs
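Allocation by tag can be sketched in Python as a simple grouping, with anything lacking the tag collected into an "Untagged" bucket so that it stays visible. The line-item layout and the `Cost Center` key are assumptions. A real billing report has many more fields.

```python
from collections import defaultdict


def allocate_costs(line_items, tag_key="Cost Center"):
    """Sum cost per tag value; items without the tag go to 'Untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key) or "Untagged"
        totals[owner] += item["cost"]
    return dict(totals)


# Hypothetical line items.
bill = [
    {"cost": 120.0, "tags": {"Cost Center": "1001"}},
    {"cost": 30.0, "tags": {}},
    {"cost": 50.0, "tags": {"Cost Center": "1001"}},
]
allocation = allocate_costs(bill)
print(allocation)
```

Tracking the size of the "Untagged" bucket over time gives a direct measure of how much spend still cannot be charged to a team.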
How do we explain the costs?
How do we allocate the charges to the right team?
How do we save money?
Whose responsibility is it to save money?
Reserved Instances
Reserved Instances without Statistics
Statistics on Usage and Costs
EC2 Usage by Hour
Always-On Load
Elastic Load
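Given hourly counts of running instances, the always-on load can be approximated as the minimum seen in any hour, with everything above it treated as elastic. The Python sketch below uses that definition, which is a simplifying assumption. The sample counts are hypothetical.

```python
def split_load(hourly_instance_counts):
    """Split usage into an always-on baseline and the elastic remainder per hour."""
    always_on = min(hourly_instance_counts)
    elastic = [count - always_on for count in hourly_instance_counts]
    return always_on, elastic


# Hypothetical running-instance counts for six consecutive hours.
counts = [4, 4, 6, 10, 7, 4]
always_on, elastic = split_load(counts)
print(always_on, elastic)
```

The baseline is the part Reserved Instances would cover. Lowering it is what "Convert Always-On into Elastic Load" refers to.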
Convert Always-On into Elastic Load
Convert Always on to Elastic
EC2 Instance (M4.Large)               Running Cost   Savings
4 Weeks On Demand 24/7                $85            $0
4 Weeks Reserved Instance 24/7        $50            $35
4 Weeks On Demand Mon-Fri 10hrs/day   $25            $60
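The On-Demand rows of the table follow from hours run multiplied by an hourly rate. The Python sketch below assumes a rate of $0.126 per hour, chosen because it reproduces the rounded figures above. It is not taken from the slides, and current pricing will differ. The Reserved Instance row depends on the specific offering and is not modeled.

```python
HOURLY_RATE = 0.126  # assumed On-Demand $/hour; not stated in the slides
WEEKS = 4

always_on_hours = WEEKS * 7 * 24   # running 24/7
scheduled_hours = WEEKS * 5 * 10   # Mon-Fri, 10 hours per day

always_on_cost = always_on_hours * HOURLY_RATE
scheduled_cost = scheduled_hours * HOURLY_RATE

print(f"24/7:      {always_on_hours} h, about ${always_on_cost:.0f}")
print(f"Scheduled: {scheduled_hours} h, about ${scheduled_cost:.0f}")
```

The scheduled instance runs 200 of the 672 hours, which is why it costs less than the Reserved Instance running 24/7.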
Scheduled Run Times
Scheduled Run Times (Batch Jobs)
Autoscaling
Rightsizing
Reserved Instances
Results
How do we Explain the Costs?
Tags
Reports
How do we Allocate the Charges to the Right Team?
Reports
How do we Save Money?
Volume Discounts
Reserved Instances
Elastic Load
Rightsizing
Whose Responsibility is it to Save Money?
Thank you!
What to do next
Next Steps
Set up a Cloud Competency Center
Bring in the right tools
Use metrics to reinforce behavior
Use partners to accelerate!
Remember
There is a lot of money to save
Thank you!
Remember to complete your evaluations!
Related Sessions
ISM206 - Modern IT Governance Through Transparency and Automation
ISM207 - The Lean Enterprise: How the Principles of Lean Are Transforming Corporate Innovation
ISM208 - The Science of Saving with AWS RIs
ARC307 - Infrastructure as Code