AWS Summit 2013 Tel Aviv Oct 16 – Tel Aviv, Israel
Cost Optimization, TCO and ROI
Steffen Krause
Technology Evangelist @AWS_Aktuell [email protected]
Agenda
1. TCO comparison between cloud and traditional IT
2. How to save money on your AWS bill
3. Customer Spotlight: Time to Know
When Calculating TCO…
AWS lets you pay for only the infrastructure you need… and only when you need it
On-Premise (or “Private Cloud”)
• Capital Expense Model
• High upfront capital cost, high cost of ongoing support
• Inflexible
AWS
• Metered, Pay As You Go Model
• Use only what you need, using on-demand, reserved, or spot
• Flexible
AWS includes everything in the price
Hardware Vendor Offering
In your TCO analysis: DOs
• 3 or 5 Year Amortization
• Use 3-Year Heavy RIs
• Use Volume RI Discounts
• Understand Usage Patterns
• Ratios (VM:Physical, Servers:Racks, People:Servers)
• Consider Tiered Pricing (Less expensive at every Tier)
• Cost Benefits of Automation (Auto scaling, APIs, TA, Optimization)
In your TCO analysis: DON’Ts
• Forget Power/Cooling (compute, storage, shared network)
• Forget Administration Costs (procurement, design, build, operations, network, security personnel)
• Forget Rent/Real Estate (building depreciation, taxes, shared services staff)
• Forget Virtualization licensing and Software Maintenance Costs
• Forget to mention Cost of “Redundancy”, Multi-AZ Facility
In your TCO analysis: BONUS
• Time from ordering to procurement (Releasing early = Increased Revenue)
• Cost of “capacity on shelf” (top of step)
• Incremental cost of adding an on-premises server when physical space is maxed out
• Real cost of resource shortfalls (bottom of step)
• Cost of disappointed or lost customers when unable to scale fast enough
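The "top of step" and "bottom of step" items can be made concrete with a small sketch. It assumes on-premises capacity is bought in fixed steps while demand grows smoothly; the monthly figures below are hypothetical and are not taken from the talk.

```python
def capacity_gaps(demand, capacity):
    """Return (idle, shortfall) totals in server-months.

    idle      = capacity sitting "on the shelf" (top of step)
    shortfall = demand that could not be served (bottom of step)
    """
    idle = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    shortfall = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return idle, shortfall


# Hypothetical six-month series: servers needed vs. servers installed.
demand = [40, 55, 70, 90, 120, 150]
capacity = [100, 100, 100, 100, 100, 200]  # one purchase step in month 6

idle, shortfall = capacity_gaps(demand, capacity)
print(f"idle server-months: {idle}, unmet server-months: {shortfall}")
```

Multiplying each total by a per-server monthly cost (and, for the shortfall, by an estimate of lost revenue) gives the two hidden TCO terms the slide refers to.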
Build Cost-aware architectures
Continuous optimization in your architecture results in recurring savings as early as your next month’s bill
USE ONLY WHAT YOU NEED And pay only for what you use!
When you turn off your cloud resources, you actually stop paying for them
Use only what you need: AWS cost savings opportunities
• Right-size
– Select appropriate resources
– Scale up and down as appropriate
– Turn off unused resources
• Payment models
– Flexibility vs. predictability
– Mixing payment models
• Measure and manage
– Monitor for saving options
Right-size: broad EC2 selection
• Standard: Most Apps, Low-cost, App Server / Web Server
• High-CPU: Scale-out Compute, Batch Processing
• High-Memory: Databases, Databases, Databases…
• Micro: For Starters, Low throughput, Websites
• Cluster Compute: Compute + Network Throughput
• Cluster GPU: Parallel Processing
• High I/O: NoSQL, Best for Random IOPS
• High Storage: OLAP, Hadoop, File Systems
• High Cluster Memory: In-memory Apps and DBs. Best $/RAM
Optimize your storage choice too: S3 & Glacier
• S3 and Glacier are both:
– Secure
– Flexible
– Low-cost
– Scalable: over 2 trillion customer objects
– Durable: 99.999999999% (11 “9”s)
Choosing between S3 and Glacier
• Amazon Simple Storage Service (S3)
– Designed to serve static content: high volumes, low latency, frequent access
– From 5.5¢/GB/Month: 11 9’s Durability
– From 3.7¢/GB/Month: 4 9’s Durability (reduced redundancy)
• Amazon Glacier
– Designed for long-term cold storage/archiving: infrequent access, long retrieval times (3-5 hrs)
– From 1¢/GB/Month
– But retrieving data is slower and more expensive than on S3
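As a rough illustration of the storage-only price difference, the sketch below applies the three "from" prices quoted on the slide to a hypothetical 10,000 GB data set. It deliberately ignores request, retrieval, and transfer charges as well as tiered pricing, so it is an approximation, not a bill estimate.

```python
# Per-GB monthly prices in USD, copied from the slide's "from" prices (2013).
PRICE_PER_GB_MONTH = {
    "s3_standard": 0.055,
    "s3_reduced_redundancy": 0.037,
    "glacier": 0.01,
}


def monthly_storage_cost(gigabytes, storage_class):
    """Storage-only monthly cost in USD; no request or retrieval fees."""
    return gigabytes * PRICE_PER_GB_MONTH[storage_class]


# Hypothetical data set size, chosen only for illustration.
for storage_class in PRICE_PER_GB_MONTH:
    cost = monthly_storage_cost(10_000, storage_class)
    print(f"{storage_class}: ${cost:,.2f}/month")
```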
S3 and Glacier tips
• Optimize access
– Reduce payload size
– Reduce # of accesses (e.g., consolidated logs)
• Monitor for unexpected access/growth patterns
– Misconfigured log archiving
• Set Lifecycle Policies
– Object expiration dates
– Auto-move S3 files to Glacier
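A lifecycle policy that combines both tips (move to Glacier, then expire) might look like the sketch below. It builds the rule in the request shape used by today's boto3 `put_bucket_lifecycle_configuration` call, which postdates this talk; the helper name, prefix, bucket name, and day counts are all hypothetical.

```python
def glacier_lifecycle(prefix, archive_after_days, expire_after_days):
    """Build a lifecycle configuration: archive to Glacier, then delete."""
    if expire_after_days <= archive_after_days:
        raise ValueError("objects must expire after they are archived")
    return {
        "Rules": [
            {
                "ID": f"archive-then-expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Auto-move S3 objects to Glacier after N days.
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                # Object expiration date, counted from object creation.
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = glacier_lifecycle("logs/", archive_after_days=30, expire_after_days=365)

# To apply it (requires boto3 and AWS credentials; bucket name is hypothetical):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```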
EC2 pricing plans
• On-Demand Instances
– Pay as you go for computing power
– Flat hourly rate, no up-front commitments
• Reserved Instances
– Pay an up-front fee for a capacity reservation and a lower hourly rate (up to 72% savings)
– 1-year or 3-year terms
– RI Marketplace: sell RIs you no longer need; buy RIs at a discount
• Spot Instances
– Pay what you want for spare EC2 capacity: your instances run if your bid exceeds the Spot price
– Potential for large scale at low cost: When they’re available, take advantage of 1,000s of Spot Instances at up to 90% savings
Use a spectrum of payment models
Frontend Applications on On-Demand/Reserved Instances
+
Backend Applications* on Spot Instances
* e.g., batch video transcoding
Sample Cash Flow Summary from RI Analysis, Aggregate of Light, Medium & Heavy RIs
The breakeven for RIs is surprisingly quick
• Buying a 1yr or 3yr RI doesn’t mean that you must keep it for the full 1 or 3 years
• In many cases, the RI pays for itself well before the term ends
Sample Cash Flow Summary from RI Analysis
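The breakeven point can be estimated with simple arithmetic: the up-front fee divided by the hourly saving over On-Demand. The sketch below assumes an instance running 24x7; the prices are hypothetical placeholders, not AWS list prices.

```python
HOURS_PER_MONTH = 730  # average hours in a month, assuming 24x7 usage


def breakeven_hours(upfront_fee, ri_hourly, on_demand_hourly):
    """Hours of use after which the RI is cheaper than On-Demand."""
    saving_per_hour = on_demand_hourly - ri_hourly
    if saving_per_hour <= 0:
        raise ValueError("RI hourly rate must be below the On-Demand rate")
    return upfront_fee / saving_per_hour


# Hypothetical prices in USD, for illustration only.
hours = breakeven_hours(upfront_fee=300.0, ri_hourly=0.05, on_demand_hourly=0.10)
print(f"breakeven after about {hours / HOURS_PER_MONTH:.1f} months of 24x7 use")
```

An instance that runs only part of the day reaches breakeven proportionally later, which is why understanding usage patterns comes before choosing an RI type.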
Other simple optimization tips
• Don’t forget to…
– Disassociate unused EIPs
– Delete unassociated Amazon EBS volumes
– Delete older Amazon EBS snapshots
– Leverage Amazon S3 Object Expiration
– Defer batch activity (e.g., Hadoop) to periods when your RIs are regularly underutilized
(For Enterprise-level support, Trusted Advisor can help with some of these.)
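The first two tips can be automated by filtering the responses of the EC2 `DescribeVolumes` and `DescribeAddresses` APIs. The sketch below works on plain dictionaries in the shape boto3 returns, so it runs without AWS access; the helper names and sample data are hypothetical.

```python
def unattached_volume_ids(describe_volumes_response):
    """EBS volumes in state 'available' are not attached to any instance."""
    return [
        volume["VolumeId"]
        for volume in describe_volumes_response["Volumes"]
        if volume["State"] == "available"
    ]


def unassociated_public_ips(describe_addresses_response):
    """Elastic IPs with neither an association nor an instance are unused."""
    return [
        address["PublicIp"]
        for address in describe_addresses_response["Addresses"]
        if not address.get("AssociationId") and not address.get("InstanceId")
    ]


# Hypothetical sample responses; in practice these would come from
# boto3.client("ec2").describe_volumes() and .describe_addresses().
volumes = {"Volumes": [
    {"VolumeId": "vol-1", "State": "in-use"},
    {"VolumeId": "vol-2", "State": "available"},
]}
addresses = {"Addresses": [
    {"PublicIp": "203.0.113.10", "AssociationId": "eipassoc-1"},
    {"PublicIp": "203.0.113.11"},
]}

print(unattached_volume_ids(volumes))
print(unassociated_public_ips(addresses))
```

Listing candidates first and deleting them only after review is the safer design, since an unattached volume may still hold data someone needs.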
MEASURE AND MANAGE
“If you cannot measure it, you cannot improve it.”
- Lord Kelvin
AWS Monitoring and Management Services
• Detailed cloud monitoring and management
– Consolidated Billing (in “Account Activity”)
– CloudWatch (in AWS Management Console)
– Billing Alerts (in “Account Activity”)
– Trusted Advisor (in “Support Center”)
– Other APIs: tags, programmatic access, etc.
• Third-party services are also available
Consolidated Billing
• One Bill for multiple accounts
• Easy Tracking of account charges (e.g., download CSV of cost data)
• Group Activities by Paying Account (e.g., Dev, Stage, Test, Prod)
• Volume Discounts can be reached faster with combined usage
• Reserved Instances are shared across accounts (including RDS Reserved DBs)
• AWS Credits are combined to minimize your bill
CloudWatch to monitor & manage usage
• Monitor your resource utilization
– Are you using the right instance type?
– Have you left instances idle?
– Is your instance usage level or bursty?
• Manage your resource utilization
– Move bursty workloads to other instances
– Rebalance your worker nodes
– Scale nodes automatically with Auto Scaling
Use CloudWatch to create Billing Alerts
• Alert when estimated charges reach threshold
• Track an individual developer, or your whole business
• Set up your billing alarm and actions
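A billing alarm is an ordinary CloudWatch alarm on the `EstimatedCharges` metric in the `AWS/Billing` namespace. The sketch below only builds the parameters for boto3's `put_metric_alarm`; the helper name, alarm name, threshold, evaluation period, and SNS topic ARN are hypothetical choices.

```python
def billing_alarm_params(threshold_usd, sns_topic_arn):
    """Parameters for a CloudWatch alarm on estimated monthly charges."""
    return {
        "AlarmName": f"estimated-charges-over-{threshold_usd}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # 6 hours, in seconds (an assumed choice)
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Action: notify an SNS topic when the threshold is reached.
        "AlarmActions": [sns_topic_arn],
    }


params = billing_alarm_params(
    500, "arn:aws:sns:us-east-1:123456789012:billing-alerts"  # hypothetical ARN
)

# To create the alarm (requires boto3, credentials, and billing alerts enabled;
# billing metrics are published in us-east-1):
#   import boto3
#   boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)
```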
Trusted Advisor: Enterprise Strength Monitoring/Optimization
• Monitors and recommends optimizations for:
– Cost
– Security
– Fault Tolerance
– Performance
• Available to customers with Business and Enterprise-level support
http://aws.amazon.com/premiumsupport/trustedadvisor/
Third-party services to optimize your AWS usage
This document and the information set forth herein are the property of Time To Know Limited and are to be held in confidence. Publication, duplication, disclosure or use for any purpose not expressly authorized in writing by Time To Know Limited is prohibited.
Copyright © 2013 Time To Know, Inc.
AWS Tel Aviv Summit Cost Optimization
Rami Levi October 2013
Time To Know
The T2K Digital Teaching Platform is designed to promote student acquisition of 21st century skills.
The platform enables the smart and effective generation of localized content using T2K’s smart Content Generation Studio.
Time To Know’s proven results show higher student achievement levels and stronger motivation for learning.
Challenges
• High hosting costs in traditional computing.
• Decentralized management of our environments.
• Difficulty predicting business growth and adjusting infrastructure and hardware accordingly.
• Reducing costs during off hours.
• Reducing hosting locations to a minimum.
Using AWS at T2K
[Chart: Cost (K) per year, 2010 to 2013, y-axis $0 to $100; data labels 60K, 90K, 35K, 19K]
Timeline annotations:
• No formal workflow with AWS
• Inefficient use of instances and storage
• Consolidated billing
• Reducing instances and forming work procedures
• Multiple zones for fewer clusters
• Internal tool development using C# SDK for elasticity
• Cloud operations optimization (Newvem)
• Reserved Instances
• Joining AWS
Operational Benefits
• Elasticity: using servers only when we need them cuts costs by more than 40%.
• Reserved Instances: cut our production costs by 38% compared to On-Demand.
• Reduced investment in, and space needed for, infrastructure and hardware.
• Central management for all our environments.
• A variety of APIs that allow us to develop internal tools for operational use.
• Easy scale-up.
• Multiple regions for a better user experience.
• Auto Scaling, using the Auto Scaling APIs.
Best Practices
• Use accounts and consolidated billing to track your bills easily.
• Use CloudWatch and SNS for billing monitoring.
• Use Virtual Private Cloud for elasticity and easy control of your environment.
• Use regions to cover all your potential customers.
• Use tools for optimizing operational use, such as Newvem.
• Use the Amazon SDKs and APIs to develop internal tools for optimizing operational use.
www.timetoknow.com
SCALE OPPORTUNISTICALLY Opportunity favors the prepared application
Time-to-Result Case 1: Value of result quickly diminishes
Example: Engineering simulation. Delay: loss of productivity, project slips
Time-to-Result Case 2: Result is valuable… until it’s not
Example: Weekend regression tests. Delay: minimal impact until 8:00AM Monday
Spot Instances for greater savings and scale
• Spot in a nutshell
– Spot instances run when Your Bid ≥ Spot Price
– Spot instances = Spare EC2 instances
– Spot instances might be interrupted at any time
• Benefits
– Savings: Up to 90% off On-Demand
– Scale: Access up to 1,000s of EC2 instances
• To use Spot
– Decide on a bid price
– Launch via Console, API, Auto Scaling
– Monitor Bid Statuses via Console/API
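The rule "instances run when Your Bid ≥ Spot Price" can be explored with a small simulation before deciding on a bid. The sketch below assumes hourly billing at the prevailing Spot price (not at the bid) and uses a hypothetical price history; real prices would come from the Spot price history in the Console or API.

```python
def simulate_spot(bid, hourly_spot_prices):
    """Return (running_flags, total_cost) for one instance at a given bid.

    The instance runs in every hour where bid >= Spot price and is assumed
    to be charged that hour's Spot price.
    """
    running = [price <= bid for price in hourly_spot_prices]
    cost = sum(price for price, up in zip(hourly_spot_prices, running) if up)
    return running, cost


# Hypothetical hourly Spot prices in USD.
prices = [0.03, 0.04, 0.12, 0.05]

running, cost = simulate_spot(bid=0.06, hourly_spot_prices=prices)
print(f"ran {sum(running)} of {len(prices)} hours for ${cost:.2f}")
```

A higher bid buys fewer interruptions rather than a lower price, which is why interruption-tolerant workloads are the natural fit.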
What applications work on Spot?
• Good Spot applications are:
– Delayable: to balance SLA/cost
– Scalable: “embarrassingly parallel”
– Fault-tolerant: can be terminated without losing all work
– Portable across regions, AZs, instance types
• Examples:
– MapReduce (Hadoop, Amazon EMR)
– Scientific Computing (Monte Carlo simulations)
– Batch Processing (video transcoding)
– Financial Computing (high-frequency trading algorithm backtesting)
– and many others…
Use Auto Scaling to dynamically scale your app
• Auto Scaling auto-sizes your cluster
– Based on preset triggers and schedules
• Integrates with CloudWatch metrics
• Use Auto Scaling to
– Improve customer experience, application performance
– Maximize CPU/IO/Memory utilization
– Optimize other metrics
Scale with Real-Time Demand
Auto-Scaling Example: Netflix
Follow the Money vs. Follow the Customer
• Optimize utilization
– Auto Scale on utilization metrics: CPU, memory, requests, connections, …
• Optimize price paid
– Scale with Spot instances when Spot prices are low
– e.g., Run batch processes off-peak (nights, weekends) when Spot prices are lower
Follow the Money vs. Follow the Customer
• Optimize customer experience with Auto Scaling
• Example 1: Scale resources to meet customer demand
– Video service Auto Scales instances to respond to customer web service requests
• Example 2: Scale resources to ensure fresh results
– A scientific paper search engine Auto Scales on queue depth (# of new docs to crawl)
– 10 instances steady state and up to 5,000+ to ensure minimum throughput time
• Example 3: Scale resources preemptively before large demand
– A TV show marketing site scales up before the show and back down after
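Example 2 (scaling on queue depth) reduces to a sizing rule: enough instances to drain the queue within a target time, bounded by a floor and a ceiling. The sketch below uses the slide's steady state of 10 instances as the floor and 5,000 as an assumed ceiling; the per-instance throughput and target time are hypothetical.

```python
import math


def desired_instances(queue_depth, docs_per_instance_hour, target_hours,
                      min_instances=10, max_instances=5000):
    """Instances needed to process the queue within target_hours, clamped."""
    needed = math.ceil(queue_depth / (docs_per_instance_hour * target_hours))
    return max(min_instances, min(needed, max_instances))


# Hypothetical throughput: 1,000 documents per instance per hour,
# with a target of clearing the queue within one hour.
for depth in (0, 250_000, 1_000_000_000):
    print(depth, desired_instances(depth, 1000, 1))
```

In a real deployment the result would feed an Auto Scaling group's desired capacity, with the queue depth published as a CloudWatch metric.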
Conclusion (Part I): Fit the cloud to your product and business model
• Use Only What You Need (and pay only for what you use!)
• Measure and Manage
• Scale Opportunistically
http://aws.amazon.com/architecture/
1. Pay Only for What You Use: Right-size your cloud resources
2. Monitor and Manage your system with CloudWatch, Billing Alerts, Trusted Advisor
3. Scale Opportunistically: Auto Scale worker nodes based on size of input queue
AWS Resources
• Whitepapers available at http://aws.amazon.com/whitepapers
• TCO Online Calculator http://aws.amazon.com/tco-calculator
• AWS Simple Calculator http://aws.amazon.com/calculator
Conclusion (Part II): Use the cloud to create new products & business models
On-Premises
• Failure is expensive
• Experiment infrequently
• Less Innovation
Optimized Cloud
• Failure is inexpensive
• Experiment early and often
• More Innovation
THANK YOU [email protected]