
Dynamic and Decentralized Approaches for Optimal Allocation of Multiple Resources in Virtualized Data Centers

Wei Chen, Samuel Hargrove, Heh Miao, Liang Hong

College of Engineering, Technology & Computer Science

Tennessee State University

PDPTA 2011

Outline
• Virtualized Data Center and Computation Resources
• Problem Statement
• Market Model and Control Objective
• Decentralized Control Decisions
  - Localized Optimal Decisions
  - Heuristics and Global Optimization
• Resource Allocation Algorithm
• Simulation and Evaluation
• Summary and Future Work


Virtualized Data Center and Computation Resources

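Fig. 1 The architecture of the Virtualized Data Center. Request streams $A_1, A_2, \ldots, A_n$ pass through the router and are admitted into per-application buffers $Q_{ij}(t)$ on Server-1 through Server-m; server $j$ hosts the virtual machines $VM_{1j}, \ldots, VM_{nj}$, and the servers are connected through the data center network.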

Virtualized Data Center model
• $m$ servers
• $n$ applications: $A_1, A_2, \ldots, A_n$
• Server $j$ provides a virtual machine $VM_{ij}$ for each application $A_i$
• Each virtual machine has a set of resources: CPU, bandwidth, disk, memory, …
• The request stream from each application arrives according to a random arrival process
• Requests are admitted into buffers via the network router

Figure labels: request streams from $n$ applications; virtual machines; buffers for admitted application requests


Problem Statement

Challenges:
• Reduce the infrastructural and operational costs of data centers while simultaneously increasing resource utilization to meet service requirements.
• Resources are shared dynamically, and applications interact with one another unpredictably.

Proposed Approaches:
1. A market-based model that simplifies the control scheme and enables real-time control decision making based on each server's queue information.
2. A resource allocation scheme that combines local optimization and heuristics for global optimization.
3. To avoid high complexity, a simple reinforcement learning method is used to reach the unknown optimal resource utility level.


Market Model and Control Objective

Market Model
• Customers pay for the virtual machines: a specific set of resources at each virtual machine for the guaranteed throughput
• Profit of the data center = payments from customers – costs of used resources

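Profit at timeslot t in one server (hosting n applications):

$$\mathrm{profit}(t) = \sum_{i=1}^{n} \Big( \mathrm{pay}(\theta_i(t)) - \sum_{r \in R} r_i(t)\,\mathrm{cost}(r) \Big)$$

Each summand is the profit from application $i$, where
• $\mathrm{pay}(\theta_i(t))$: payment from application $i$ with the required performance $\theta_i(t)$
• $r_i(t)$: amount of resource $r$ used for application $i$
• $\mathrm{cost}(r)$: price per unit for resource $r$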

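Total profit at timeslot t in the whole data center (having m servers):

$$\mathrm{Tprofit}(t) = \sum_{j=1}^{m} \sum_{i=1}^{n} \Big( \mathrm{pay}(\theta_{ij}(t)) - \sum_{r \in R} r_{ij}(t)\,\mathrm{cost}(r) \Big)$$

The inner sum over $i$ is the profit from server $j$; $\theta_{ij}(t)$ is the required performance of application $i$ on server $j$, and $r_{ij}(t)$ is the amount of resource $r$ used for it.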

Control Objective: Maximize Tprofit(t) at each timeslot t.
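The profit bookkeeping above can be sketched in a few lines of Python. This is only an illustration of the two sums: the function names, the dictionary layout, and the sample prices are assumptions made for the example and are not taken from the slides.

```python
from typing import Dict, List, Tuple

# One application on one server: (payment received, {resource name: amount used}).
AppUsage = Tuple[float, Dict[str, float]]


def app_profit(payment: float, usage: Dict[str, float],
               unit_cost: Dict[str, float]) -> float:
    """Payment for one application minus the cost of the resources it used."""
    return payment - sum(amount * unit_cost[res] for res, amount in usage.items())


def server_profit(apps: List[AppUsage], unit_cost: Dict[str, float]) -> float:
    """Profit of one server: sum over the applications it hosts."""
    return sum(app_profit(pay, usage, unit_cost) for pay, usage in apps)


def total_profit(servers: List[List[AppUsage]],
                 unit_cost: Dict[str, float]) -> float:
    """Profit of the whole data center: sum over all servers."""
    return sum(server_profit(apps, unit_cost) for apps in servers)


# Hypothetical prices and usage, chosen only to make the arithmetic easy to follow.
unit_cost = {"cpu": 2.0, "bandwidth": 0.5}
server_1 = [(10.0, {"cpu": 3.0, "bandwidth": 2.0}), (5.0, {"cpu": 1.0})]
server_2 = [(4.0, {"bandwidth": 2.0})]
print(total_profit([server_1, server_2], unit_cost))
```

Because the total is a plain sum over servers and applications, each term can be evaluated locally, which is what makes a decentralized control decision possible.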


Decentralized Control Decisions (when should the resources be increased and decreased?)

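Localized Optimization

Optimize $\mathrm{profit}(t,i,r)$, the profit at timeslot $t$ from one application $i$ using resource $r$ at one server:

$$\mathrm{profit}(t,i,r) = \mathrm{pay}(\theta_i(t)) - r_i(t)\,\mathrm{cost}(r)$$

$$\mathrm{pay}(\theta_i(t)) = \begin{cases}
\mathrm{bid}(\theta_i(t)) - \mathrm{penalty}(t) & \text{penalty case: if real performance } \hat\theta_i(t) < \theta_i(t) \text{ \& assigned resource} < r_i(t) \\
\mathrm{bid}(\theta_i(t)) + \mathrm{award}(t) & \text{award (burst) case: if real performance } \hat\theta_i(t) > \theta_i(t) \text{ \& real performance during last } T \text{ timeslots} < \theta_i(t) \\
\mathrm{bid}(\theta_i(t)) & \text{others/standard case}
\end{cases}$$

where
• $\theta_i(t)$: required performance for application $i$ at time $t$ ($\hat\theta_i(t)$ denotes the real performance)
• $r_i(t)$: amount of resource $r$ that the customer leased for application $i$
• $\mathrm{bid}(\theta_i(t))$: bid from the customer for required performance $\theta_i(t)$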
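A minimal sketch of the three payment cases, written in Python. It assumes that the penalty is subtracted from the bid and the award is added to it, and that "real performance during the last T timeslots" is summarized as a single number (for instance an average) supplied by the caller. All parameter names are hypothetical.

```python
def pay(bid: float, required: float, real: float, recent_real: float,
        assigned: float, leased: float,
        penalty: float, award: float) -> float:
    """Payment received for one application in one timeslot.

    required    -- required performance for this timeslot
    real        -- measured performance in this timeslot
    recent_real -- measured performance over the last T timeslots
                   (assumed here to be an average computed by the caller)
    assigned    -- amount of the resource actually assigned
    leased      -- amount of the resource the customer leased
    """
    # Penalty case: performance fell short while less than the leased
    # amount of the resource was assigned.
    if real < required and assigned < leased:
        return bid - penalty
    # Award (burst) case: the current slot exceeds the requirement,
    # but the recent history stayed below it.
    if real > required and recent_real < required:
        return bid + award
    # Standard case.
    return bid
```

For example, with a bid of 100, a penalty of 20 and an award of 10, a slot that delivers 350 of the 400 required requests on an under-provisioned virtual machine pays 80, while a burst slot delivering 450 after a quiet period pays 110.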



Theorem 1 When the performance is proportional to the resource, the function profit(t,i,r) is monotone increasing in $r_i(t)$ in the penalty case and the award case, and monotone decreasing in $r_i(t)$ in the standard case.

Control Decision at timeslot t:

Calculate the real throughput from the number of requests in the buffer at the beginning and at the end of timeslot t.

(1) Penalty case: if the real throughput is smaller than required throughput & there are enough requests in the buffer, increase the amount of resource r.

(2) Award case (for supporting burst): if the real throughput is larger than the required throughput but it is smaller than the required throughput in the last T timeslots, increase the amount of resource r.

(3) Others: reduce the amount of resource r.
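The three-way decision can be sketched as follows. Two points are assumptions made for this example rather than statements from the slides: requests are taken to be admitted at the start of a timeslot, so throughput is the drop in buffer occupancy, and "enough requests in the buffer" is read as a non-empty backlog at the end of the slot.

```python
from enum import Enum


class Action(Enum):
    INCREASE = "increase"
    DECREASE = "decrease"


def real_throughput(buffer_start: int, buffer_end: int) -> int:
    """Requests processed in a slot, assuming admission happens at slot start."""
    return buffer_start - buffer_end


def decide(real: float, required: float, recent_real: float,
           backlog: int) -> Action:
    """Choose whether to grow or shrink resource r for one application."""
    # (1) Penalty case: below the requirement with work still waiting.
    if real < required and backlog > 0:
        return Action.INCREASE
    # (2) Award case: a burst now, after a below-requirement recent history.
    if real > required and recent_real < required:
        return Action.INCREASE
    # (3) Others: release the resource.
    return Action.DECREASE


processed = real_throughput(buffer_start=900, buffer_end=550)
print(processed, decide(processed, required=400, recent_real=380, backlog=550))
```

Each decision uses only the local buffer counts of one server, so no coordination with other servers is needed at this step.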



Heuristics and Global Optimization

Heuristic I To avoid wasting the resource: if increasing resource r for application i didn’t raise the performance, the amount of r should be reduced in the next timeslot so that the freed resource can be used by other applications on the same server, or even by other servers. Action:

At each timeslot $t$, if $r_i(t) > r_i(t-1)$ and $\hat\theta_i(t) \le \hat\theta_i(t-1)$, where $\hat\theta_i(t)$ is the real performance of application $i$, reduce resource $r$ at timeslot $t+1$.
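Heuristic I amounts to a single comparison between two consecutive timeslots. The sketch below assumes that "didn't raise the performance" means the measured performance stayed the same or dropped; the function and parameter names are hypothetical.

```python
def should_reduce_next(r_prev: float, r_now: float,
                       perf_prev: float, perf_now: float) -> bool:
    """True if the resource grew this slot but measured performance did not."""
    return r_now > r_prev and perf_now <= perf_prev
```

For instance, raising an allocation from 1.0 to 1.2 units while throughput stays at 400 requests flags the resource for reduction in the following timeslot.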

Heuristic II If the buffer for application i in server p has fewer requests left, it indicates that p processed more requests. Therefore, it should admit more requests in the next timeslot. On the other hand, to save resources, a server in sleep mode should stay asleep if possible. Action:

At every timeslot, the buffers for application i at the active servers admit requests fully, in increasing order of buffer size; the inactive servers admit the remaining requests, if any.
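One way to read the admission rule of Heuristic II is sketched below. It assumes that "buffer size" means the number of requests currently left in the buffer and that each buffer has a fixed capacity it can be filled up to; the `Buffer` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Buffer:
    server: str
    backlog: int    # requests currently left in the buffer
    capacity: int   # assumed maximum number of requests the buffer can hold
    active: bool    # False if the server is in sleep mode


def admit(arrivals: int, buffers: List[Buffer]) -> Dict[str, int]:
    """Split arriving requests for one application across server buffers."""
    # Active servers come first, emptiest buffer first; sleeping servers last.
    ordered = sorted(buffers, key=lambda b: (not b.active, b.backlog))
    admitted: Dict[str, int] = {}
    remaining = arrivals
    for buf in ordered:
        take = min(remaining, buf.capacity - buf.backlog)
        admitted[buf.server] = take
        remaining -= take
    return admitted


buffers = [
    Buffer("s1", backlog=50, capacity=100, active=True),
    Buffer("s2", backlog=10, capacity=100, active=True),
    Buffer("s3", backlog=0, capacity=100, active=False),
]
print(admit(120, buffers))
```

With 120 arrivals the two active servers absorb everything and the sleeping server receives nothing, so it can stay asleep; only a larger burst spills over to it.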


Resource Allocation Algorithm

Question: We know when the resources should be increased and decreased, but by how much?

Using reinforcement learning to adaptively achieve the unknown optimal utility level:

Resource r for application i at timeslot t = (resource r for application i at timeslot t – 1) × (estimated usage of resource r at timeslot t)

$$r_i(t) = r_i(t-1) \cdot \mu_i(r,t)$$

$\mu_i(r,t)$ is the estimated usage, adjusted as follows ($0 < \delta_r < 1$):

$$\mu_i(r,t) = \begin{cases}
(1+\delta_r)\,\mu_i(r,t-1) & \text{if resource needs to be increased} \\
(1-\delta_r)\,\mu_i(r,t-1) & \text{if the resource needs to be decreased} \\
\mu_i(r,t-1) & \text{otherwise}
\end{cases}$$
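The update can be sketched as two small functions. The sketch assumes a multiplicative step applied to the estimated usage (grow by a factor of one plus the step, shrink by one minus the step); the name `step` and the sample values are hypothetical.

```python
def update_usage(usage_prev: float, direction: int, step: float) -> float:
    """Adjust the estimated usage factor.

    direction > 0: the resource needs to be increased
    direction < 0: the resource needs to be decreased
    direction == 0: keep the previous estimate
    """
    if direction > 0:
        return (1.0 + step) * usage_prev
    if direction < 0:
        return (1.0 - step) * usage_prev
    return usage_prev


def next_allocation(r_prev: float, usage: float) -> float:
    """New amount of the resource: previous amount times the estimated usage."""
    return r_prev * usage


usage = update_usage(1.0, direction=+1, step=0.5)
print(next_allocation(100.0, usage))
```

A small step makes the allocation approach the unknown target level gradually, while a large step reacts faster at the cost of overshooting.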

Simulation and Evaluation


One of the experiments: power and bandwidth are both considered.
• 5 servers hosting 4 heterogeneous applications A1 through A4 with time-varying workloads.
  - A1: each request gets 10KB files from the database and does encryption; it uses 0.095% of the full power (240W).
  - A2: each request gets/sends 80KB files from/to the database without doing encryption; it uses 0.233% of the full power.
  - A3 and A4: Web applications. A3 is an auction-web tier; each request uses 0.53% of the full power and 10.3KB bandwidth. A4 is an auction-database tier; each request uses 0.2% of the full power and 5.2KB bandwidth.
• Required performance for each application is 400 requests per second.
• The bandwidths for the five application host servers, the web server and the database server are all set to 70MB.

The arriving requests at the data center are scheduled as in the following table.

Number of Arriving Requests for Each Application in the Experiment

Application   1–50 s   51–100 s   101–150 s   151–200 s   201–250 s   251–300 s
A1            300      700        500         250         500         750
A2            300      500        700         500         300         750
A3            300      300        450         600         500         750
A4            300      500        300         500         700         750
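For readers who want to replay the workload, the schedule can be written down as data. The sketch below transcribes the table and lists the windows in which arrivals exceed the required performance of 400; the helper names are hypothetical.

```python
from typing import List, Tuple

# Arriving requests per application in each 50-second window (from the table).
SCHEDULE = {
    "A1": [300, 700, 500, 250, 500, 750],
    "A2": [300, 500, 700, 500, 300, 750],
    "A3": [300, 300, 450, 600, 500, 750],
    "A4": [300, 500, 300, 500, 700, 750],
}
WINDOW = 50      # seconds per column of the table
REQUIRED = 400   # required performance for each application


def arrivals(app: str, second: int) -> int:
    """Arriving requests for `app` at a 1-based second in the range 1..300."""
    return SCHEDULE[app][(second - 1) // WINDOW]


def burst_windows(app: str) -> List[Tuple[int, int]]:
    """(start, end) seconds of the windows where arrivals exceed REQUIRED."""
    return [(k * WINDOW + 1, (k + 1) * WINDOW)
            for k, n in enumerate(SCHEDULE[app]) if n > REQUIRED]


print(arrivals("A1", 75), burst_windows("A1"))
```

Every application stays below the requirement in the first window and exceeds it in the last one, which matches the sleep-mode and overload behavior discussed in the evaluation.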

Simulation and Evaluation

[Left figure: Performance/Throughput of Each Application (Application1 through Application4); x-axis: timeslot in seconds; y-axis: number of requests (0 to 800)]

[Right figure: Total Power; x-axis: timeslot in seconds; y-axis: watts (0 to 1400)]

Explanation: According to the request table, the workload is unbalanced and there are bursts.
• The left figure shows that the performance of each application across all five servers matches the number of arriving requests. For example, during 51–100 seconds, the number of arriving requests for A1 (700) is larger than the required performance (400). However, since the average number of arriving requests is not larger than 400, all 700 requests are processed. In the last 50 seconds, the number of arriving requests for each application is so large that it exceeds the power and bandwidth that each server can handle. Therefore, only 675 requests on average are processed for each application instead of the 750 arriving requests.
• On the other hand, during the first 50 seconds, only 300 requests arrive for each application. Therefore, the right figure shows that two servers are in sleep mode during this period to save power.


Summary and Future Work

Summary
• Dynamic and decentralized approaches for allocating multiple resources in a virtualized data center that has time-varying workloads and heterogeneous applications.
• Tackled the problem with market-based approaches that simplified the control scheme and enabled real-time control decision making.
• The resource allocation scheme combines local optimization and heuristics for global optimization.

Future Work

More experiments in real data centers.
