Dynamic Resource Allocation for Database Servers Running on Virtual Storage

Gokul Soundararajan, Daniel Lupei, Saeed Ghanbari, Adrian Daniel Popescu, Jin Chen, Cristiana Amza

University of Toronto

Multi-tier Resource Allocation

Consolidated Environment

Web Server

Application Server

Database Server

Composed of several tiers

Application-A Application-B

Share resources in each tier

Can lead to interference

Storage

Database

Our Focus: Storage Hierarchy

Application-A Application-B

Buffer Pool

Storage Cache

Disk Bandwidth

Want to use all resources efficiently

Disk is a bottleneck for Database Apps

Network

State of the Art

‣ Previous work studied resources in isolation

- Memory Partitioning: MRC [ASPLOS’04]

- Disk Bandwidth: Facade [FAST’03], Argon [FAST’07], etc.

- ... and many more

‣ Want to use the storage hierarchy efficiently

‣ However, performance depends on all layers

- Interdependency between resources

- E.g., increasing the buffer pool size reduces the number of storage accesses

Motivating Scenario

Small Large

Buffer Pool

Storage Cache

Disk Bandwidth

Cache Friendly: 1 Outstanding I/O

Cache Un-Friendly: 10 Outstanding I/O

Using Oracle ORION I/O tool

Motivating Scenario

Configurations: Shared, Cache, Disk, Cache & Disk

Small Large

Benefits cache-friendly workload

Avoids disk interference

Has best performance

Contributions

‣ Build performance models dynamically

- Account for interdependencies between resources

- Lightweight but still accurate

‣ Multi-level Resource Allocator

- Uses performance models to guide resource allocation

- Corrects model errors through runtime sampling

- Uses global utility (SLOs) to partition resources

- Minimize sum of application latencies

Approach

‣ Build performance models

- One per application

- Derive function to predict application latency given configuration

‣ Find resource partitioning setting

- Minimize sum of application latencies

- Find best setting using hill climbing

Lavg = f(ρc, ρs, ρd)
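A minimal sketch of the hill-climbing step, simplified to a single resource split into integer units between applications. The function name `hill_climb` and the toy latency functions in the usage note are hypothetical; the talk's allocator searches over all three resources (ρc, ρs, ρd).

```python
from itertools import permutations

def hill_climb(latency_fns, total_units):
    """Greedy hill climbing over integer allocations that sum to total_units.

    latency_fns[i](units) is a hypothetical per-application latency model.
    """
    n = len(latency_fns)
    alloc = [total_units // n] * n
    alloc[0] += total_units - sum(alloc)  # hand any remainder to the first app

    def cost(a):
        # Objective from the slides: sum of application latencies.
        return sum(f(x) for f, x in zip(latency_fns, a))

    best = cost(alloc)
    improved = True
    while improved:
        improved = False
        # Neighbours: move one unit from application i to application j.
        for i, j in permutations(range(n), 2):
            if alloc[i] <= 1:
                continue  # keep at least one unit per application
            cand = alloc[:]
            cand[i] -= 1
            cand[j] += 1
            c = cost(cand)
            if c < best:
                alloc, best, improved = cand, c, True
                break
    return alloc, best
```

For example, `hill_climb([lambda x: 100 / x, lambda x: 400 / x], 32)` starts from an even 16/16 split and shifts units toward the second, more sensitive application until no single move lowers the total latency.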

Outline

‣ Online Performance Models

- What are they?

- Why are they hard to build?

‣ Prototype Implementation

‣ Experimental Results

‣ Conclusions

One-Level Cache Model

Cache size: allocate in 32MB chunks (32, 64, ..., 512, ... MB)

m = CacheSize/ChunkSize = 1GB/32MB = 32 choices

E.g., choose 512MB

MRC Cache Model

Cache size: 32, 64, ..., 512, ..., 1G

Computes miss-ratio given an I/O trace

Multiplying by the I/O latency gives the avg. latency
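As an illustration of how a miss-ratio curve can be computed from a trace, the sketch below uses LRU stack distances to get the miss ratio for every cache size in one pass. It is a simple list-based version written for clarity, not the MRC implementation cited in the talk; the function names and the trace in the usage note are illustrative.

```python
def miss_ratio_curve(trace, max_size):
    """Return the LRU miss ratio for cache sizes 1..max_size (in blocks)."""
    stack = []                       # LRU stack, most recently used first
    hits_at = [0] * (max_size + 1)   # hits_at[d]: accesses with stack distance d
    for block in trace:
        if block in stack:
            depth = stack.index(block) + 1   # 1-based stack distance
            if depth <= max_size:
                hits_at[depth] += 1
            stack.remove(block)
        stack.insert(0, block)
    curve, hits = [], 0
    for size in range(1, max_size + 1):
        hits += hits_at[size]        # a cache of this size hits every distance <= size
        curve.append(1 - hits / len(trace))
    return curve

def average_latencies(curve, io_latency):
    """Scale each miss ratio by the I/O latency, as on the slide."""
    return [miss_ratio * io_latency for miss_ratio in curve]
```

On the trace `[8, 8, 1, 6, 8, 4, 5, 6, 3, 4]`, the curve flattens at 0.6 from size 4 upward: the six distinct blocks always cost six cold misses.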

Two-Level Cache Model

‣ Performance affected by

- DB Buffer Pool Size (m choices)

- Storage Cache (n choices)

‣ Performance model

- Needs to consider all parameters (m*n choices)

- 1GB caches allocated in 32MB chunks

- m = n = 1GB/32MB = 32 settings

- m*n = 1024 distinct settings

Buffer pool size changes the I/O trace seen at storage

Two-Level Cache Model

[Figure: latency surface over Buffer Pool Size (MB) and Storage Cache Size (MB), ranging from low latency to high latency]

2 caches create a 3D surface

32 data points per cache; 32x32=1024 data points in total!

Overall Performance Model

Application

Buffer Pool

Storage Cache

Disk Bandwidth

Needs 32x32x10=10240 samples

At 15 mins/sample, this takes over 3 months!
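The cost of exhaustive sampling can be checked with a few lines of arithmetic. The sketch assumes the factor of 10 is the number of disk bandwidth settings; the function name is illustrative.

```python
def exhaustive_sampling_days(pool_settings, cache_settings, disk_settings,
                             minutes_per_sample):
    """Return (number of samples, total days) for sampling every configuration."""
    samples = pool_settings * cache_settings * disk_settings
    days = samples * minutes_per_sample / (60 * 24)
    return samples, days

samples, days = exhaustive_sampling_days(32, 32, 10, 15)
print(samples, round(days, 1))
```

This gives 10240 samples and between 106 and 107 days of continuous measurement for a single application.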

Outline

‣ Online Performance Models

- Building performance models

- Allocating resources using models

‣ Prototype Implementation

‣ Experimental Results

‣ Conclusions

Key Observations

‣ Known cache replacement policies

- Most cache replacement algorithms are LRU

- Two caches are only as effective as the largest cache (cache inclusiveness)

‣ Disk is a closed loop system

- Rate of responses is same as rate of requests

- Performance proportional to the disk bandwidth fraction

Cache Inclusiveness

Access trace: 8 8 1 6 8 4 5 6 3 4

[Figure: the trace is filtered by the buffer pool and then by the storage cache; the larger cache ends up holding blocks 3 4 5 6 8]

Storage cache includes data in the buffer pool (I/Os: 6)

Buffer pool includes data in the storage cache (I/Os: 6)

Approximate Single Cache Model (LRU)

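Access trace: 8 8 1 6 8 4 5 6 3 4

[Figure: a single LRU cache as large as the bigger of the two caches, holding blocks 3 4 5 6 8]

I/Os: 6 (same number of I/Os as the two-level hierarchy)

Mc(max[ρc, ρs])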
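The approximation can be checked on the slide's access trace with a small simulation. This is a sketch under assumed cache sizes (2 and 5 blocks, chosen to reproduce the 6 I/Os shown); the class and function names are illustrative, and the equality is an approximation that need not hold exactly for every trace.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()   # least recently used block first

    def access(self, block):
        """Return True on a hit; on a miss, load the block and evict if needed."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        self.blocks[block] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return False

def disk_ios_two_level(trace, pool_size, storage_size):
    """Count disk I/Os when buffer pool misses are filtered by the storage cache."""
    pool, storage = LRUCache(pool_size), LRUCache(storage_size)
    ios = 0
    for block in trace:
        if pool.access(block):
            continue                  # buffer pool hit
        if not storage.access(block):
            ios += 1                  # missed in both caches: go to disk
    return ios

def disk_ios_single(trace, size):
    """Count misses of one LRU cache of the given size."""
    cache = LRUCache(size)
    return sum(not cache.access(block) for block in trace)

TRACE = [8, 8, 1, 6, 8, 4, 5, 6, 3, 4]
print(disk_ios_two_level(TRACE, 2, 5), disk_ios_two_level(TRACE, 5, 2),
      disk_ios_single(TRACE, 5))
```

Whichever level holds the 5-block cache, the hierarchy issues the same number of disk I/Os as a single 5-block LRU cache.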

Cache Model (DEMOTE)

‣ Maintain cache exclusiveness

- E.g., using DEMOTEs [USENIX’02]

- A block brought into the buffer pool is not cached below

- Only blocks evicted from the buffer pool are cached in the storage cache

‣ Approximate performance using single cache

- Mc(ρc + ρs)
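A minimal sketch of an exclusive two-level hierarchy in the spirit of DEMOTE, compared against a single LRU cache of the combined size. The function names, cache sizes, and trace are illustrative assumptions, and the simulation ignores the cost of the demotion transfers themselves.

```python
from collections import OrderedDict

def lru_misses(trace, size):
    """Misses of a single LRU cache holding `size` blocks."""
    cache, misses = OrderedDict(), 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)
            continue
        misses += 1
        cache[block] = True
        if len(cache) > size:
            cache.popitem(last=False)
    return misses

def demote_disk_ios(trace, pool_size, storage_size):
    """Disk I/Os when the storage cache only holds blocks demoted from the pool."""
    pool, storage = OrderedDict(), OrderedDict()
    ios = 0
    for block in trace:
        if block in pool:
            pool.move_to_end(block)
            continue
        if block in storage:
            del storage[block]        # block moves up; keep the levels exclusive
        else:
            ios += 1                  # read from disk, bypassing the storage cache
        pool[block] = True
        if len(pool) > pool_size:
            victim, _ = pool.popitem(last=False)
            storage[victim] = True    # DEMOTE the evicted block to the storage cache
            if len(storage) > storage_size:
                storage.popitem(last=False)
    return ios

TRACE = [8, 8, 1, 6, 8, 4, 5, 6, 3, 4]
print(demote_disk_ios(TRACE, 1, 2), lru_misses(TRACE, 3))
```

With a 1-block pool and a 2-block storage cache, the hierarchy behaves like one 3-block LRU cache on this trace, which is the Mc(ρc + ρs) approximation.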

Find Best Partitioning Setting

Find Best Resource Allocation Setting

[Figure: per-application latency surfaces over Buffer Pool Size and Storage Cache Size]

Minimize sum of application latencies

Disk Model

‣ Observation: Closed loop system

- Rate of responses same as rate of requests

- Use interactive response time law

‣ Performance proportional to disk bandwidth fraction

- Measure base disk latency: Ld(1)

- Predict latency for smaller bandwidth fractions

Ld(ρd) = Ld(1) / ρd
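A minimal sketch of the disk model, assuming that latency scales inversely with the bandwidth fraction ρd, which is how "performance proportional to disk bandwidth fraction" is read here. The function and parameter names are illustrative.

```python
def disk_latency(base_latency, fraction):
    """Predict disk latency at a bandwidth fraction from the latency at fraction 1.

    base_latency: measured latency with the whole disk bandwidth, Ld(1).
    fraction: share of disk bandwidth given to the application, in (0, 1].
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return base_latency / fraction
```

Under this assumption, an application that sees 5 ms with the full disk would see about 10 ms with half of the bandwidth.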

Putting it All Together

Application

Buffer Pool

Storage Cache

Mc(ρc) · Ms(ρc, ρs) · N = Mc(max[ρc, ρs]) · N

Approximate as a single-level cache; can now be solved using MRC

Putting it all Together

Application

Lavg = Hc(ρc) · Lc + Mc(ρc) · Hs(ρc, ρs) · Lnet + Mc(ρc) · Ms(ρc, ρs) · Ld(ρd)
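The three terms can be combined into one latency estimate. The sketch below assumes the terms are summed, that each hit ratio is one minus the corresponding miss ratio, and that disk latency scales inversely with ρd; the function and parameter names are illustrative.

```python
def average_latency(m_c, m_s, l_c, l_net, l_d_full, rho_d):
    """Estimate average latency from the three-term model.

    m_c: buffer pool miss ratio Mc(ρc), in [0, 1]
    m_s: storage cache miss ratio Ms(ρc, ρs), in [0, 1]
    l_c, l_net: buffer pool hit latency and network (storage cache hit) latency
    l_d_full: disk latency with the full bandwidth, Ld(1)
    rho_d: fraction of disk bandwidth, in (0, 1]
    """
    h_c, h_s = 1 - m_c, 1 - m_s      # assumed: hit ratio = 1 - miss ratio
    l_d = l_d_full / rho_d           # assumed inverse scaling of disk latency
    return h_c * l_c + m_c * h_s * l_net + m_c * m_s * l_d
```

For instance, with both miss ratios at 0.5, a negligible buffer pool latency, a network latency of 1, and Ld(1) = 8 at ρd = 0.5, the estimate is 0.25 + 4 = 4.25 in the same latency unit.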

Inaccuracies in the Model

‣ Cache Model

- Approximations to LRU, e.g., CLOCK

- Large fraction of writes in the workload

‣ Disk Model

- Using Quanta-based scheduler [Wachs et al., FAST’07]

- Interference due to disk seeks at small quanta

‣ Inaccuracies localized in known regions

- E.g., Small disk quanta

Iterative Refinement

‣ Build model

- Use trace collected at the database buffer pool

‣ Refine the model

- Use cross-validation to measure quality

- Selectively sample where error is high

- Interpolate computed and measured samples

- Using regression (SVM)
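A rough sketch of the "measure quality, then sample where error is high" loop along one resource dimension. It swaps in leave-one-out linear interpolation as a stand-in for the SVM regression named on the slide, and all function names and sample values are hypothetical.

```python
def loo_errors(samples):
    """Leave-one-out error of linear interpolation at each interior sample.

    samples: list of (setting, latency) pairs sorted by setting.
    """
    errors = {}
    for i in range(1, len(samples) - 1):
        (x0, y0), (x, y), (x1, y1) = samples[i - 1], samples[i], samples[i + 1]
        predicted = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        errors[x] = abs(predicted - y)
    return errors

def next_sample_points(samples, budget=1):
    """Suggest midpoints around the `budget` worst-predicted samples."""
    errors = loo_errors(samples)
    xs = [x for x, _ in samples]
    worst = sorted(errors, key=errors.get, reverse=True)[:budget]
    points = []
    for x in worst:
        i = xs.index(x)
        points.append((xs[i - 1] + x) / 2)
        points.append((x + xs[i + 1]) / 2)
    return points

# Hypothetical (cache size in MB, latency) samples with a sharp knee near 512MB.
samples = [(128, 40.0), (256, 38.0), (512, 10.0), (768, 8.0), (1024, 7.0)]
print(next_sample_points(samples))
```

The suggested points fall on either side of the knee, where neighbouring samples predict the held-out value worst.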

Virtual Storage Prototype

[Figure: prototype architecture]

CLIENT: MySQL, Buffer Pool, Block Layer

SERVER: Storage, Block Layer, Cache, Quanta, Disk, Disk

Experimental Setup

‣ Benchmarks

- UNIFORM (microbenchmark), TPC-W and TPC-C

‣ LAMP Architecture

- Linux, Apache 1.3, MySQL/InnoDB 5.0, and PHP 5.0

‣ Cache Configuration

- MySQL buffer pool = 1GB

- Storage cache = 1GB

- Using InnoDB cache replacement in MySQL, CLOCK in storage cache

Our Algorithms

‣ GLOBAL

- Gather trace at the buffer pool

- Measure base disk latency

- Compute performance using performance model

‣ GLOBAL+

- Run GLOBAL

- Evaluate model accuracy

- Refine model using runtime samples

Algorithms for Comparison

‣ MRC

- Partition cache (independently) using miss-ratio curves

‣ DISK

- Partition caches equally, determine best disk quanta

‣ MRC+DISK

- Run MRC then DISK

‣ IDEAL*

- Build model with SVM using 16*16*5=1280 sampled configurations

Roadmap of Results

‣ Multi-level cache allocator

- Using LRU and DEMOTE cache replacement policies

‣ Multi-level cache and disk

‣ Accuracy of computed models

Miss-Ratio Curves

[Figure: miss-ratio curves for TPC-W, TPC-C, and UNIFORM; x-axis from 0 to 1024]

Multi-Level Caching (LRU)

[Figure: heat map over Storage Cache Size (A), up to 1G, for 2 TPC-W instances; lighter: better, darker: worse; the optimal setting is marked]

Multi-Level Caching (DEMOTE)

[Figure: heat map over Storage Cache Size (A), up to 1G, for 2 TPC-W instances; the optimal setting is marked]

Roadmap of Results

- Using two identical applications

- Using different applications

UNIFORM/UNIFORM

[Figure: results for the two UNIFORM instances under GLOBAL, GLOBAL+, MRC, DISK, MRC+DISK, and IDEAL*]

Allocate caches to 50/50

Matches GLOBAL

TPC-W/UNIFORM

TPC-W UNIFORM

Allocate caches to 50/50

Not enough buffer pool to UNIFORM

Compensate for MRC settings

TPC-W/TPC-C

TPC-W TPC-C

Corrects model at runtime

Corrects imbalance in

TPC-C allocated more in both

Roadmap of Results

- Cache model

- Disk model

Cache Model Accuracy (TPC-W)

[Figure: cache model accuracy for TPC-W; both axes span 0 to 1024]

Inaccuracies localized in the middle

Disk Model Accuracy (TPC-W)

[Figure: measured vs. computed values for TPC-W over Disk Quota from 0 to 1]

Conclusions

‣ Problem

- Need to consider resources on multiple tiers

- Independent cache/disk allocators are not sufficient

‣ Dynamic allocation of cache hierarchy and disk

- Build performance models dynamically

- Iteratively refine (if necessary)

- Use models for global resource partitioning

‣ Performance up to 2.9x better than single resource allocators

Thank you.
