overbooking.ppt

Resource Overbooking and Application Profiling in Shared Hosting Platforms
Bhuvan Urgaonkar, Prashant Shenoy, Timothy Roscoe
UMASS Amherst and Intel Research

Upload: webhostingguy

Post on 11-Jul-2015


Page 1: overbooking.ppt

Computer Science

Resource Overbooking and Application Profiling in Shared Hosting Platforms

Bhuvan Urgaonkar, Prashant Shenoy, Timothy Roscoe †

UMASS Amherst and Intel Research †

Page 2: overbooking.ppt


Motivation

❒ Proliferation of Internet applications
  ❍ Electronic commerce, streaming media, online games, online trading, …
❒ Commonly hosted on clusters of servers
  ❍ Cheaper alternative to large multiprocessors

[Figure: clients connect over the Internet to a cluster hosting streaming, games, and e-commerce applications]

Page 3: overbooking.ppt


Hosting Platforms

❒ Hosting platform: server cluster that runs third-party applications

❒ Application providers pay for server resources
  ❍ CPU, disk, network bandwidth, memory
❒ Platform provider guarantees resource availability
  ❍ Performance guarantees provided to applications

❒ Central challenge: Maximize revenue while providing resource guarantees

Page 4: overbooking.ppt


Design Challenges

❒ How to determine an application’s resource needs?

❒ How to provision resources to meet these needs?

❒ How to map applications to nodes in the platform?

❒ How to handle dynamic variations in the load?

Page 5: overbooking.ppt

Talk Outline

➾ Introduction

❒ Inferring Resource Requirements

❒ Provisioning Resources

❒ Handling Dynamic Load Variations

❒ Experimental Evaluation

❒ Related Work

Page 6: overbooking.ppt

Hosting Platform Model

❒ Hosting Platforms: Dedicated vs Shared
  ❍ Dedicated: Applications get an integral number of nodes
  ❍ Shared: Applications may get a fractional number of nodes
❒ Our focus: Shared Hosting Platforms
  ❍ Nodes may have competing applications
❒ Capsule: component of an application running on a node
  ❍ Example: e-commerce application: HTTP server, app server, database server

Page 7: overbooking.ppt

Provisioning By Overbooking

❒ How should the platform allocate resources?
  ❍ Provision resources based on worst-case needs
❒ Worst-case provisioning is wasteful
  ❍ Low platform utilization
❒ Applications may be tolerant to occasional violations
  ❍ E.g., CPU guarantees should be met 99% of the time
❒ Possible to provide useful guarantees even after provisioning less than worst-case needs

⇒ Idea: Improve utilization by overbooking resources

Page 8: overbooking.ppt


Application Profiling

❒ Use the Linux trace toolkit

[Figure: timeline of an ON-OFF process; an ON period runs from the beginning to the end of a CPU quantum and is followed by an OFF period]

❒ Profiling: process of determining resource usage
  ❍ Run the application on an isolated set of nodes
  ❍ Subject the application to a real workload
  ❍ Model CPU and network usage as ON-OFF processes
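
To make the ON-OFF model concrete, here is a minimal sketch of how a trace of ON periods could be turned into the fraction of each measurement interval that the resource was in use. The trace format (a list of `(start, end)` times in seconds) and the name `fractional_usage` are assumptions made for illustration; they are not the Linux trace toolkit's output format.

```python
def fractional_usage(on_periods, interval, duration):
    """Fraction of each measurement interval spent in the ON state.

    on_periods: list of (start, end) times in seconds (hypothetical trace format)
    interval:   length of one measurement interval in seconds
    duration:   total length of the profiling run in seconds
    """
    n = int(duration // interval)
    busy = [0.0] * n
    for start, end in on_periods:
        first = int(start // interval)
        last = min(int(end // interval), n - 1)
        for i in range(first, last + 1):
            # Clip the ON period to the boundaries of interval i.
            lo = max(start, i * interval)
            hi = min(end, (i + 1) * interval)
            if hi > lo:
                busy[i] += hi - lo
    return [b / interval for b in busy]


usage = fractional_usage([(0.0, 0.5), (1.25, 1.75), (2.0, 3.0)],
                         interval=1.0, duration=3.0)
print(usage)  # [0.5, 0.5, 1.0]
```

Each value in `usage` is one sample of fractional usage; collecting many such samples gives the usage distribution.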

Page 9: overbooking.ppt


Resource Usage Distribution

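[Figure: resource usage is sampled once per measurement interval along the time axis. One plot shows the probability distribution of fractional usage (0 to 1). The other shows cumulative probability versus fractional usage, marking r(99) at cumulative probability 0.99 and r(100) at cumulative probability 1.]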
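
As an illustration of the r(99) and r(100) points on the cumulative distribution, the following sketch computes a usage percentile from a list of fractional-usage samples. The function name `usage_percentile` and the sample values are hypothetical; the nearest-rank definition of a percentile is one reasonable choice, not necessarily the one the authors used.

```python
import math


def usage_percentile(samples, pct):
    """Smallest observed usage r such that at least pct% of samples are <= r."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical profile: mostly low usage, with a long tail.
samples = [0.1] * 90 + [0.3] * 9 + [0.9]
r99 = usage_percentile(samples, 99)
r100 = usage_percentile(samples, 100)
print(r99, r100)  # 0.3 0.9
```

In this made-up profile, provisioning for r(99) instead of the worst case r(100) reserves a third of the resource while covering 99% of the measurement intervals.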

Page 10: overbooking.ppt


Capturing Burstiness: Token Bucket

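❒ Token Bucket (σ, ρ)
  ❍ Resource usage over an interval of length t ≤ σ·t + ρ
  ❍ Algorithm by Tang et al.
❒ Additional parameter T
  ❍ Satisfy token bucket guarantees only for t ≥ T

[Figure: usage versus time, with two token-bucket envelopes σ1·t + ρ1 and σ2·t + ρ2 that meet the usage axis at ρ1 and ρ2]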
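
The effect of the extra parameter T can be seen in a small brute-force sketch. Given per-interval usage and a chosen rate σ, it finds the smallest burst ρ for which every window of length t ≥ T satisfies usage ≤ σ·t + ρ. This is not the algorithm by Tang et al. mentioned on the slide; the function `min_burst`, its arguments, and the sample data are assumptions for illustration.

```python
def min_burst(used, interval, sigma, min_intervals=1):
    """Smallest rho such that usage over every window of at least
    `min_intervals` measurement intervals is <= sigma * t + rho.

    used:          resource time consumed in each measurement interval
    interval:      length of one measurement interval
    sigma:         token-bucket rate
    min_intervals: T expressed as a number of measurement intervals
    """
    prefix = [0.0]
    for u in used:
        prefix.append(prefix[-1] + u)
    rho = 0.0
    n = len(used)
    for i in range(n):
        for j in range(i + min_intervals, n + 1):
            t = (j - i) * interval
            excess = (prefix[j] - prefix[i]) - sigma * t
            rho = max(rho, excess)
    return rho


# Hypothetical CPU seconds used in four 1-second intervals.
used = [0.25, 1.0, 1.0, 0.25]
print(min_burst(used, 1.0, sigma=0.5))                   # 1.0
print(min_burst(used, 1.0, sigma=0.5, min_intervals=3))  # 0.75
```

Requiring the guarantee only for longer windows (larger T) lets a smaller ρ describe the same trace, because short bursts are averaged out.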

Page 11: overbooking.ppt


Profiles of Server Applications

❒ Applications exhibit different degrees of burstiness
  ❍ May have a long tail
❒ Insight: Choose (σ, ρ) based on a high percentile

[Figure: three probability distributions of resource usage. "Postgres Server, 10 clients": probability versus fraction of CPU. "Apache Web Server, 50% cgi-bin": probability versus fraction of CPU. "Streaming Media Server, 20 clients": probability versus fraction of network bandwidth.]

Page 12: overbooking.ppt

Resource Overbooking

❒ Applications specify overbooking tolerance O
  ❍ Probability with which capsule needs may be violated
❒ Controlled overbooking via admission control:

$$\sum_{k=1}^{K} (\sigma_k \cdot T_{min} + \rho_k) \cdot (1 - O_k) \le C \cdot T_{min}$$

$$\Pr\left(\sum_{k=1}^{K} U_k > C\right) \le \min(O_1, \ldots, O_K)$$

❒ A node that has sufficient resources for a capsule is feasible for it
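
The first admission-control condition can be sketched in a few lines. The slide does not define Tmin; this sketch assumes it is the smallest T among the capsules on the node. The `Capsule` class, the `admits` function, and the numbers are hypothetical, and the second (probabilistic) condition is not checked here.

```python
from dataclasses import dataclass


@dataclass
class Capsule:
    sigma: float     # token-bucket rate
    rho: float       # token-bucket burst
    T: float         # period over which the guarantee must hold
    overbook: float  # overbooking tolerance O, e.g. 0.01 for 1%


def admits(capsules, capacity=1.0):
    """First admission test: sum_k (sigma_k*Tmin + rho_k)(1 - O_k) <= C*Tmin."""
    # Assumption: Tmin is the smallest T over the capsules on this node.
    t_min = min(c.T for c in capsules)
    demand = sum((c.sigma * t_min + c.rho) * (1.0 - c.overbook)
                 for c in capsules)
    return demand <= capacity * t_min


# Values chosen for easy arithmetic, not realistic tolerances.
strict = [Capsule(0.5, 0.0, 1.0, 0.0), Capsule(0.5, 0.0, 1.0, 0.0),
          Capsule(0.25, 0.0, 1.0, 0.0)]
tolerant = [Capsule(0.5, 0.0, 1.0, 0.25), Capsule(0.5, 0.0, 1.0, 0.25),
            Capsule(0.25, 0.0, 1.0, 0.25)]
print(admits(strict))    # False: demand 1.25 exceeds capacity 1.0
print(admits(tolerant))  # True: demand 0.9375 fits
```

The same three capsules are rejected with zero tolerance and accepted once each tolerates some overbooking, which is the utilization gain the slide describes.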

Page 13: overbooking.ppt


Mapping Capsules to Nodes

❒ A bipartite graph of capsules and feasible nodes
  ❍ Greedy mapping: consider capsules in non-decreasing order of degree: O(c · log c)
  ❍ Guaranteed to find a placement if one exists!
  ❍ Multiple feasible nodes => best fit, worst fit, random…

[Figure: bipartite graph linking capsules to their feasible nodes, and the final mapping chosen from it]
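
A simplified sketch of the greedy mapping follows. It orders capsules by degree (the number of nodes feasible for them) and then picks a node by best fit or worst fit. As a stand-in for the admission-control test, feasibility here is just a scalar capacity check; that simplification, the function `place`, and the example data are assumptions, and this sketch makes no claim to reproduce the guarantee stated on the slide.

```python
def place(capsules, free, policy="worst"):
    """Greedy placement sketch.

    capsules: dict capsule name -> demand (scalar, hypothetical)
    free:     dict node name -> free capacity
    policy:   "worst" picks the feasible node with the most free capacity,
              anything else picks the one with the least (best fit)
    Returns a dict capsule -> node, or None if some capsule cannot be placed.
    """
    free = dict(free)  # do not mutate the caller's dict

    def feasible(demand):
        return [n for n, cap in free.items() if cap >= demand]

    # Degree = number of feasible nodes, computed before any placement.
    order = sorted(capsules, key=lambda c: len(feasible(capsules[c])))
    mapping = {}
    for c in order:
        nodes = feasible(capsules[c])
        if not nodes:
            return None
        if policy == "worst":
            node = max(nodes, key=lambda n: free[n])
        else:
            node = min(nodes, key=lambda n: free[n])
        free[node] -= capsules[c]
        mapping[c] = node
    return mapping


mapping = place({"db": 0.75, "web": 0.5, "stream": 0.25},
                {"n1": 1.0, "n2": 0.5})
print(mapping)  # {'db': 'n1', 'web': 'n2', 'stream': 'n1'}
```

The `db` capsule has only one feasible node, so it is placed first; placing the more flexible capsules first could have left it without a node.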

Page 14: overbooking.ppt

Handling Flash Crowds

❒ Detect overloads by online profiling

[Figure: three probability distributions of the fraction of CPU used by the Apache web server: offline profile, expected workload, and overload]

❒ Reacting to overloads (ongoing work)
  ❍ Compute new allocations
  ❍ Change allocations, move capsules, add servers
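
One way overload detection by online profiling might look is sketched below: compare a high percentile of recent online usage samples with the same percentile of the offline profile. The slides do not give the detection rule, so the comparison, the `slack` threshold, and all names and data here are assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of usage samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def overloaded(offline, online, pct=95, slack=0.1):
    """Flag an overload when the online percentile exceeds the offline
    percentile by more than `slack` (a hypothetical threshold)."""
    return percentile(online, pct) > percentile(offline, pct) + slack


# Hypothetical fraction-of-CPU samples.
offline_profile = [0.25] * 19 + [0.5]
expected_load = [0.25] * 20
flash_crowd = [0.5] * 10 + [0.75] * 10
print(overloaded(offline_profile, expected_load))  # False
print(overloaded(offline_profile, flash_crowd))    # True
```

A detector of this kind would only trigger the reactions listed above; computing the new allocations is a separate step.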

Page 15: overbooking.ppt

Talk Outline

➾ Introduction

➾ Inferring Resource Requirements

➾ Provisioning Resources

➾ Handling Dynamic Load Variations

❒ Experimental Evaluation

❒ Related Work

Page 16: overbooking.ppt


The SHARC Prototype

❒ A Linux-based Shared Hosting Platform
  ❍ 6 Dell PowerEdge 1550 servers
  ❍ Gigabit Ethernet link
❒ Software Components
  ❍ Profiling: vanilla Linux + Linux trace toolkit
  ❍ Control plane: overbooking, placement
  ❍ QoS-enhanced Linux kernel: HSFQ schedulers

Page 17: overbooking.ppt


Experimental Setup

❒ Prototype running on a 5-node cluster
  ❍ Each server: 1 GHz PIII with 512 MB RAM and Gigabit Ethernet
  ❍ Control plane runs on a dedicated node
  ❍ Applications run on the other four nodes
❒ Workload: mix of server applications
  ❍ PostgreSQL database server with pgbench (TPC-B) benchmark
  ❍ Apache web server with SPECWeb99 (static & dynamic HTTP)
  ❍ MPEG streaming server with 1.5 Mb/s VBR MPEG-1 clients
  ❍ Quake I game server with “terminator” bots

Page 18: overbooking.ppt


Resource Overbooking Benefits

❒ Small amounts of overbooking can yield large gains
  ❍ Bursty applications yield larger benefits

[Figure: two plots of applications placed versus number of nodes (0 to 140), each with the series No Ovb, Ovb=1%, and Ovb=5%. "Placement of Streaming Media Servers": number of apps placed (0 to 350). "Placement of Apache Web Servers": web servers placed (0 to 1400).]

Page 19: overbooking.ppt


Capsule Placement Algorithms

❒ Diverse requirements: worst-fit outperforms others
❒ Similar requirements: all perform similarly

[Figure: two plots titled "Placement Algorithms, Ovb=5%", each showing the number of apps placed by Random, Best-fit, and Worst-fit on 16, 32, and 64 nodes; the vertical axes run from 0 to 100 and from 0 to 3500.]

Page 20: overbooking.ppt


Performance with Overbooking

❒ Performance degradation is within specified overbooking tolerance

Application   Metric           Isolated   100th   99th    95th    Avg
Apache        Tput (req/s)     67.9       67.51   66.91   64.81   39.8
PostgreSQL    Tput (trans/s)   22.8       22.46   22.21   21.78   9.04
Streaming     Viol (sec)       0          0       0.31    0.59    5.23

Page 21: overbooking.ppt

Related Work

❒ Single-node resource management
  ❍ Proportional-share schedulers: WFQ, SFQ, BVT, …
  ❍ Reservation-based schedulers: Nemesis, Rialto, …
❒ Cluster-based resource management
  ❍ Cluster Reserves [Aron00], Aron thesis [Aron00]
  ❍ MUSE [Chase01]: economic approach
  ❍ Oceano [IBM], Planetary computing [HP]
  ❍ Clusters for high availability: Porcupine [Saito99]
  ❍ Grid computing

Page 22: overbooking.ppt


Concluding Remarks

❒ Resource management in shared hosting platforms
  ❍ Application profiling to determine resource usage
  ❍ Revenue maximization using controlled overbooking
  ❍ Ability to handle dynamic workloads (ongoing work)

❒ URL: http://lass.cs.umass.edu