Post on 22-Dec-2015


Distributed Systems Meet Economics: Pricing In The Cloud

Authors: Hongyi Wang, Qingfeng Jing, Rishan Chen, Bingsheng He, Zhengping He, Lidong Zhou

Presenter: Sajala Rajendran

Abstract

Pricing scheme in cloud computing: the bridge that decouples users from cloud providers
The relationship between cloud computing and pricing has brought a significant change to system design and optimization
Studies conducted on Amazon EC2 and on a local cloud computing testbed

Introduction

Pay-as-you-go model:
  Cloud providers have a pricing scheme for their users
  Users utilize the cloud at a very low cost
  Profit for providers
Variety of applications: storage backup, e-commerce, high performance computing
"Two-party" computation with pricing as the bridge
Pricing depends on two factors
  System design and optimization
  Fairness and competitive pricing

Contd…

Pricing-induced interplay between systems and economics
  Cost as an explicit and measurable system metric
  Pricing fairness
  Evolving system dynamics
  Cost of failures
Experiments conducted on Amazon EC2 and Spring have the following results:
  Optimization for cost is hard for the user
  Pricing unfairness
  Different system configurations significantly impact cost and profit
  Failure occurrences

Background on Pricing

Pricing
Pay-as-you-go model

[Diagram: Pricing branches into Fairness and Competition; Fairness branches into Personal and Social]

Pay-as-you-go Model

Pricing helps to shape how systems are used
Amazon charges $0.095/virtual machine hour
Many pricing schemes are introduced
Several alternative pricing schemes have been proposed
  E.g. Gurmeet Singh and Carl Kesselman suggested dynamic pricing on resource consumption.

Workloads

Postmark
  I/O intensive benchmark
  Measures transaction rates for a workload approximating an Internet email server
  For experiment: file size is 5 GB and number of transactions is 1000
PARSEC (Princeton Application Repository for Shared Memory Computers)
  Benchmark suite for chip-multiprocessors
  Composed of multithreaded programs
  9 applications and 3 kernels
  Blackscholes: high performance computing
  Dedup: storage archival
  For experiment: 184 MB input data for Dedup and 10 million options for Blackscholes
Hadoop
  Hadoop 0.20.0 for large scale data processing
  WordCount and StreamSort
  For experiment: input data set is 16 GB

Methodologies

Amazon EC2
  Charged according to the pricing scheme of Amazon
  $$\text{Cost}_{\text{user}} = \text{Price} \times t$$
    t: total running time of the task (hours)
    Price: price per virtual machine hour
  Excluding storage and data transfer costs
Spring System
  Provides virtual machines to the users
  Consists of two modules: VMM (virtual machine monitor) and Auditor
  Provider profit = Payment from users - Total provider expense
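
The two accounting rules on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the authors' tooling: the function names `user_cost` and `provider_profit` are hypothetical, the running time and provider expense in the example are made-up inputs, and only the $0.095 per VM-hour price comes from the slides. Billing granularity (for example, rounding up to whole hours) is deliberately ignored.

```python
def user_cost(price_per_vm_hour: float, hours: float) -> float:
    """Cost_user = Price x t, excluding storage and data transfer costs."""
    if price_per_vm_hour < 0 or hours < 0:
        raise ValueError("price and running time must be non-negative")
    return price_per_vm_hour * hours


def provider_profit(payments: list[float], total_expense: float) -> float:
    """Provider profit = payment from users - total provider expense."""
    return sum(payments) - total_expense


if __name__ == "__main__":
    # Hypothetical task: 2.5 hours on a VM priced at $0.095 per hour.
    cost = user_cost(0.095, 2.5)
    print(f"user cost: ${cost:.4f}")
    # Hypothetical provider expense of $0.10 for serving that task.
    print(f"provider profit: ${provider_profit([cost], 0.10):.4f}")
```

Keeping the two functions separate mirrors the two-party view in the slides: the user only sees price and time, while the provider's side also depends on its own expenses.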

Hamilton's Estimations

Total cost of fully burdened power consumption
  $$\text{Cost}_{\text{full}} = p \times P_{\text{raw}} \times \text{PUE}$$
    p: electricity price (dollars/kWh)
    P_raw: total energy consumption of servers and routers
    PUE: PUE value of the data center
Total provider cost
  $$\text{Total provider cost} = (\text{Cost}_{\text{full}} + \text{Cost}_{\text{amortized}}) \times \text{Scale}$$
  $$\text{Scale} = \frac{\text{Estimated total cost}}{\text{Cost}_{\text{full}} + \text{Cost}_{\text{amortized}}}$$
  $$\text{Cost}_{\text{amortized}} = C_{\text{amortizedUnit}} \times t_{\text{server}}$$
    C_amortizedUnit: amortized cost per hour per server
    t_server: elapsed time on the server (hours)
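
As a rough illustration of how these formulas chain together, here is a minimal Python sketch. The function names are hypothetical. The electricity price, PUE, Scale, and amortized unit cost in the example are the values the slides report for the Spring setup; the 0.2 kWh energy figure is an assumption chosen only to make the arithmetic concrete.

```python
def full_power_cost(price_per_kwh: float, p_raw_kwh: float, pue: float) -> float:
    """Cost_full = p x P_raw x PUE (fully burdened power cost, in dollars)."""
    return price_per_kwh * p_raw_kwh * pue


def amortized_cost(unit_cost_per_hour: float, server_hours: float) -> float:
    """Cost_amortized = C_amortizedUnit x t_server."""
    return unit_cost_per_hour * server_hours


def total_provider_cost(cost_full: float, cost_amortized: float, scale: float) -> float:
    """Total provider cost = (Cost_full + Cost_amortized) x Scale."""
    return (cost_full + cost_amortized) * scale


if __name__ == "__main__":
    # Slide parameters: $0.07/kWh, PUE = 1.7, Scale = 2.24, $0.08 per server-hour.
    # Assumed input (not from the slides): 0.2 kWh consumed during one server-hour.
    power = full_power_cost(0.07, 0.2, 1.7)
    amortized = amortized_cost(0.08, 1.0)
    total = total_provider_cost(power, amortized, 2.24)
    print(f"power: ${power:.4f}, amortized: ${amortized:.4f}, total: ${total:.4f}")
```

With these inputs the amortized hardware cost is several times the power cost, so in this sketch the elapsed server time, rather than energy, drives most of the provider's bill.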

Estimation of P_raw

For a server, the energy consumption is calculated based on resource utilization
  $$P_{\text{server}} = P_{\text{idle}} + u_{\text{cpu}} \times c_0 + u_{\text{io}} \times c_1$$
    u_cpu: CPU utilization (%)
    u_io: I/O bandwidth (MB/sec)
    c_0 and c_1: coefficients in the model
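
The linear model above is straightforward to evaluate in code. The sketch below is illustrative only: it assumes the model yields power in watts, which the slide does not state, and the idle power and coefficients in the example are invented. The `energy_kwh` helper is also an assumption about how per-sample readings might be turned into an energy total; the slides do not describe that step.

```python
def server_power(p_idle: float, u_cpu: float, u_io: float, c0: float, c1: float) -> float:
    """P_server = P_idle + u_cpu x c0 + u_io x c1.

    u_cpu is CPU utilization in percent, u_io is I/O bandwidth in MB/sec.
    """
    return p_idle + u_cpu * c0 + u_io * c1


def energy_kwh(power_samples_watts: list[float], interval_seconds: float) -> float:
    """Sum fixed-interval power samples (watts) into energy (kWh).

    Assumption: each sample holds for the whole interval.
    """
    joules = sum(power_samples_watts) * interval_seconds
    return joules / 3.6e6  # 1 kWh = 3.6 million joules


if __name__ == "__main__":
    # Hypothetical coefficients: 150 W idle, 1.0 W per CPU percent, 0.5 W per MB/sec.
    watts = server_power(150.0, 50.0, 20.0, 1.0, 0.5)
    print(f"estimated power: {watts:.1f} W")
    # One hour of one-second samples at that constant power level.
    print(f"energy over one hour: {energy_kwh([watts] * 3600, 1.0):.3f} kWh")
```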

Experiment Setup: Amazon EC2

2 virtual machine types: Small and Medium instances

Experiment Setup: Spring

VirtualBox is used
Host OS is Windows Server 2003 and guest OS is Fedora 10
Eight-core machine for evaluating single-machine benchmarks
Cluster consisting of 32 four-core machines for evaluating Hadoop
Power meter used for measuring the power consumption of a server
Total dollar cost is calculated based on Hamilton's estimations for a data center of 50,000 servers (PUE = 1.7, Scale = 2.24, energy price = $0.07/kWh, C_amortizedUnit = $0.08/hr)

Contd..

An Intel 80 GB X25-M SSD is used to replace a SATA hard drive, which raises the amortized cost of the SSD-equipped machine to $0.09/hr.
Metrics: system throughput (number of tasks finished per hour), user costs, and provider profits
Efficiency of the provider's investment
  $$\text{ROI} = \frac{\text{Profit}}{\text{Cost}_{\text{provider}}} \times 100\%$$
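
A small Python sketch of the two provider-side metrics follows. It reads system throughput as the number of tasks finished per hour; the function names and the example measurements are hypothetical and do not come from the experiments.

```python
def system_throughput(tasks_finished: int, elapsed_hours: float) -> float:
    """Tasks finished per hour over the measured interval."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return tasks_finished / elapsed_hours


def roi_percent(profit: float, provider_cost: float) -> float:
    """ROI = Profit / Cost_provider x 100%."""
    if provider_cost <= 0:
        raise ValueError("provider_cost must be positive")
    return profit / provider_cost * 100.0


if __name__ == "__main__":
    # Hypothetical measurements, not taken from the slides.
    print(f"throughput: {system_throughput(4, 2.0):.2f} tasks/hr")
    print(f"ROI: {roi_percent(0.9, 0.5):.1f}%")
```

Because profit can be negative when expenses exceed payments, `roi_percent` can return a negative value, which is how a negative ROI arises.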

User Optimization on EC2

Choosing a suitable instance type is important for both performance and cost

Provider Optimization on Spring

Based on varying the number of concurrent VMs from one to four on the same physical machine

Observations

Consolidation reduces power consumption (P_raw) by 150% and 21% for Blackscholes and Postmark, respectively
The decrease in power cost and the increase in user cost significantly increase the provider's profit
ROI increases to 180% on Postmark and 340% on Blackscholes
A suitable consolidation strategy is necessary
Flaw: degradation of system throughput by up to 64%

Multi-machine Benchmarks on Hadoop

Increase in provider's profit of about 135% and 118% on ROI for WordCount and StreamSort, respectively
Degradation of system throughput, with a reduction of 12% and 350%

Pricing Fairness

Personal fairness
Social fairness
  Coefficient of variation
  $$c_v = \frac{\text{stdev}}{\text{mean}} \times 100\%$$
  Maximum difference
  $$\text{Maximum difference} = \frac{\text{Hi} - \text{Lo}}{\text{Lo}} \times 100\%$$
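
The two variation measures can be computed with Python's standard library, as in the sketch below. It assumes Hi and Lo are the highest and lowest observed values and uses the sample standard deviation (`statistics.stdev`); the slides do not say whether the sample or population form was used, and the cost values in the example are invented.

```python
import statistics


def coefficient_of_variation(values: list[float]) -> float:
    """cv = stdev / mean x 100%, using the sample standard deviation."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return statistics.stdev(values) / mean * 100.0


def maximum_difference(values: list[float]) -> float:
    """(Hi - Lo) / Lo x 100%, with Hi and Lo the largest and smallest values."""
    lo, hi = min(values), max(values)
    if lo == 0:
        raise ValueError("lowest value must be non-zero")
    return (hi - lo) / lo * 100.0


if __name__ == "__main__":
    # Hypothetical per-run costs in dollars for repeated runs of one benchmark.
    costs = [1.0, 1.2, 1.4]
    print(f"cv: {coefficient_of_variation(costs):.2f}%")
    print(f"maximum difference: {maximum_difference(costs):.2f}%")
```

Swapping in `statistics.pstdev` would give the population form instead; for a small number of runs the two can differ noticeably.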

Variations of different runs on the same instances in Amazon EC2

Each single-machine benchmark is run ten times
As more VMs are consolidated onto the same physical machine, users need to pay more money
Postmark incurs 40% more cost than its best case

Cost of running Postmark ten times on three different instances on EC2

Different Hardware Configurations

Elapsed times of Postmark are 180 and 400 seconds on SSD and hard disk, respectively
SSD reduces the user's cost by 120% and decreases the provider's ROI from 40% to -44%
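
The 120% figure is easier to follow with the arithmetic written out. The following is one plausible reading, assuming (this is not stated on the slide) that the user is billed in proportion to elapsed time at the same price on both machines and that the percentage is taken relative to the SSD run:

```latex
\[
\frac{\text{Cost}_{\text{HDD}} - \text{Cost}_{\text{SSD}}}{\text{Cost}_{\text{SSD}}}
= \frac{\text{Price} \times 400\,\text{s} - \text{Price} \times 180\,\text{s}}{\text{Price} \times 180\,\text{s}}
= \frac{400 - 180}{180} \approx 1.22 = 122\%
\]
```

Under that reading, the hard-disk run costs roughly 120% more than the SSD run. Measured relative to the hard-disk run instead, the saving would be (400 - 180)/400 = 55%.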

Failures

Executing Hadoop in Spring was successful, but on Amazon EC2 it resulted in one exception with the message "Address already in use"
Transient failures also occur: running StreamSort using Hadoop on eight VMs in Spring resulted in an eightfold increase in the total elapsed time
All of these could lead to higher user costs

Conclusion

Cloud computing bridges distributed systems and economics by using a pricing scheme that connects providers with users
Experiments conducted on Amazon EC2 and Spring have shown that cost variations on both result in social unfairness of the current pricing scheme
The setting that achieves the minimum cost differs from the one that achieves the best performance
Providers need to fine-tune their pricing structure to strike a balance between their profits and their users

Thank You !!!