Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies Dong Lu* + Peter Dinda* Yi Qiao* Huanyuan Sheng* *Northwestern University + Ask Jeeves, Inc.


Page 1: Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies

Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies

Dong Lu*+, Peter Dinda*, Yi Qiao*, Huanyuan Sheng*

*Northwestern University, +Ask Jeeves, Inc.


Outline

• Quick review of size-based scheduling

• Motivation and approach

• Correlation between file size and service time: a measurement study

• Performance of SRPT scheduling under real workload

• Domain-based scheduling


Quick Review of Size-based Scheduling

• SRPT
  – Shortest Remaining Processing Time
  – Assuming perfect knowledge of service times

• FSP
  – Fair Sojourn Protocol
  – Assuming perfect knowledge of service times

• Typical non-size-based scheduling
  – Processor Sharing (PS)
  – First Come First Serve (FCFS)


SRPT

• Always serve the job with the minimum remaining processing time first; preemptive scheduling
  – Performance: minimum mean response time [Schrage, Operations Research, 1968]
  – Fairness: the performance gains of SRPT over PS do not usually come at the expense of large jobs; in other words, SRPT is fair for heavy-tailed job size distributions [Bansal and Harchol-Balter, Sigmetrics ‘01]
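
The SRPT rule above can be sketched as a small single-server simulation. This is a minimal illustration in Python, not the authors' C++ simulator. The function names `srpt_mean_response_time` and `fcfs_mean_response_time` and the (arrival time, service time) job format are assumptions made for this example, and service times are taken as perfectly known, as the slide assumes.

```python
import heapq


def srpt_mean_response_time(jobs):
    """Simulate preemptive SRPT on one server.

    jobs: non-empty list of (arrival_time, service_time) tuples.
    Returns the mean response time (completion time minus arrival time).
    """
    jobs = sorted(jobs)
    ready = []            # min-heap of (remaining_time, arrival_time)
    now = 0.0
    i = 0                 # index of the next job that has not arrived yet
    total_response = 0.0
    while i < len(jobs) or ready:
        if not ready:
            # Server is idle: jump ahead to the next arrival.
            now = max(now, jobs[i][0])
        # Admit every job that has arrived by the current time.
        while i < len(jobs) and jobs[i][0] <= now:
            heapq.heappush(ready, (jobs[i][1], jobs[i][0]))
            i += 1
        # Serve the job with the least remaining work until it finishes
        # or the next arrival gives a chance to preempt it.
        remaining, arrival = heapq.heappop(ready)
        next_arrival = jobs[i][0] if i < len(jobs) else float("inf")
        run = min(remaining, next_arrival - now)
        now += run
        if run < remaining:
            heapq.heappush(ready, (remaining - run, arrival))
        else:
            total_response += now - arrival
    return total_response / len(jobs)


def fcfs_mean_response_time(jobs):
    """Non-preemptive first-come-first-serve baseline for comparison."""
    now = 0.0
    total_response = 0.0
    for arrival, service in sorted(jobs):
        now = max(now, arrival) + service
        total_response += now - arrival
    return total_response / len(jobs)
```

For the three-job list `[(0, 10), (1, 1), (2, 1)]`, SRPT preempts the long job in favor of the two short ones and gives a mean response time of 14/3, against 10 under FCFS.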


FSP

• Combines SRPT with PS; preemptive scheduling [Friedman, et al, Sigmetrics ‘03]
  – SRPT, plus: the longer a job stays in the queue, the higher its priority
  – Performance: mean response time is close to that of SRPT
  – Fairness: fairer than PS




Motivation

• Current implementations of SRPT and FSP
  – Use file size as service time (sorting jobs by file size)

• Is file size a good estimator of service time?

• What is the performance of SRPT and FSP when file size is used as service time? And how can it be improved?

Service time: the time needed to send the requested data in the absence of other requests in the system


Trace-driven Simulation

• Simulator:
  – C++
  – Supports G/G/n/m queuing model
  – Driven by enhanced web server traces
  – Validation
    • Little’s law
    • Repeat the simulations in the FSP paper [Friedman, et al, Sigmetrics ‘03]
    • Compare with available theoretical results [Bansal and Harchol-Balter, Sigmetrics ‘01]
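
The slides list Little's law as a validation check but do not say how it was applied. One plausible check, sketched here in Python under that assumption, integrates the number of jobs in the system over time to obtain the mean number in system L, then compares it with λW computed from the per-job response times. The function name `littles_law_check` and the (arrival, departure) record format are hypothetical.

```python
def littles_law_check(records):
    """Compare the time-average number in system with lambda * W.

    records: list of (arrival_time, departure_time), one per completed job,
             covering a window of non-zero length.
    Returns (L, lambda_times_W); the two should agree when the system is
    empty at the start and at the end of the observed window.
    """
    events = []
    for arrival, departure in records:
        events.append((arrival, +1))    # job enters the system
        events.append((departure, -1))  # job leaves the system
    events.sort()

    start, end = events[0][0], events[-1][0]
    area = 0.0        # integral of the number of jobs in system over time
    in_system = 0
    prev = start
    for t, delta in events:
        area += in_system * (t - prev)
        in_system += delta
        prev = t

    horizon = end - start
    mean_in_system = area / horizon                        # L
    arrival_rate = len(records) / horizon                  # lambda
    mean_response = sum(d - a for a, d in records) / len(records)  # W
    return mean_in_system, arrival_rate * mean_response
```

If the two returned values differ by more than rounding error for a finished simulation run, the simulator's bookkeeping of arrivals, departures, or queue length is likely inconsistent.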


Scheduling Policies Studied

• SRPT: Ideal SRPT
• SRPT-FS: File size as service time
• SRPT-D: Domain-estimated service time

• FSP: Ideal FSP
• FSP-FS: File size as service time
• FSP-D: Domain-estimated service time

• PS: Processor sharing


Outline

• Quick review of size-based scheduling

• Our approach and questions answered

• Correlation between file size and service time: a measurement study

• Performance of SRPT-FS and FSP-FS scheduling under real workload

• Domain-based scheduling


Correlation is Weak on a Typical Web Server

• Measurement on a departmental web server: scatter plot of file size versus service time (log-log scale)

[Figure: scatter plot of file size (y-axis) versus service time (x-axis), log-log scale, for requests from the whole Internet; R ≈ 0.14]
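
The slides do not state which correlation coefficient R denotes. Assuming it is the usual Pearson coefficient, it could be computed from measured (file size, service time) pairs as in the following Python sketch; `pearson_r` is a name chosen for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 would mean that sorting requests by file size orders them almost as sorting by service time would; a value as low as the one reported on the slide means the two orderings often disagree.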


Correlation is Weak on Web Cache Servers

• Measurement on 10 Squid web cache servers
  – www.ircache.net

[Figure: complementary distribution P[R > x] (y-axis, 0 to 1.0) of the correlation coefficient R between file size and service time (x-axis, 0 to 0.7)]


Main reason for the weak correlation

• End-to-end path diversity

[Diagram: Web Server connected to Client 1, Client 2, Client 3, and Client 4]




Mean Response Time Much Worse Than Expected

Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load).

[Figure: mean response time (milliseconds, 100 to 900) versus load on the queue (0 to 2.0). Curves: PS, SRPT-FS, FSP-FS, and ideal SRPT and FSP]
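
The slide says only that arrivals are Pareto and that their rate is controlled to tune the load. One way this could be done, sketched in Python, is to draw Pareto inter-arrival gaps and rescale them so that load = arrival rate × mean service time hits the target. The shape parameter `alpha`, the function name `pareto_arrival_times`, and the rescaling method are all assumptions of this sketch, not details taken from the talk.

```python
import random


def pareto_arrival_times(n, mean_service, load, alpha=1.5, seed=0):
    """Generate n arrival times with Pareto-distributed gaps.

    The gaps are scaled so that their theoretical mean equals
    mean_service / load, which makes the offered load equal `load`.
    alpha must exceed 1 so that the Pareto mean is finite.
    """
    if alpha <= 1 or load <= 0 or mean_service <= 0:
        raise ValueError("need alpha > 1, load > 0 and mean_service > 0")
    rng = random.Random(seed)
    target_gap = mean_service / load
    # random.paretovariate(alpha) has minimum 1 and mean alpha / (alpha - 1).
    scale = target_gap * (alpha - 1) / alpha
    t = 0.0
    times = []
    for _ in range(n):
        t += scale * rng.paretovariate(alpha)
        times.append(t)
    return times
```

With a fixed seed, halving `load` doubles every gap, so the same burst pattern is replayed at a lower intensity; loads above 1.0, as on the plot's axis, simply compress the gaps further.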


Mean Queue Length Much Worse Than Expected

Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load).

[Figure: mean queue length (1000 to 5000) versus load on the queue (0 to 2.0). Curves: FSP-FS, SRPT-FS, PS, and ideal SRPT and FSP]




Requirements For A Better Service Time Estimator

• Low overhead
  – Passive measurement
  – Low computational complexity
  – Low / adjustable memory usage

• Effective
  – Approximates the correct ordering of the service times (high correlation)


Domain-based Estimator

• Divide the Internet into smaller “domains” by leveraging CIDR (Classless Inter-Domain Routing)

• Hosts in the same domain are likely to share the same or similar routes to the web server, and thus similar throughput

[Diagram label: Web Server]


Supporting Facts

• Statistical Internet stability and locality
  – Routing stability [Paxson, Sigcomm 1996]
  – TCP throughput locality and stability [Balakrishnan, et al, Sigmetrics 1997]; [Seshan, et al, USITS 1997]; [Myers, et al, Infocom 1999]

• Classless Inter-Domain Routing
  – Implies that routes from machines in the domain to a server outside the domain will share many hops


Algorithm

• Use the high-order k bits of the client IP address to classify clients into 2^k domains

• For each domain, calculate R = F/S
  – R: representative service rate
  – F: sum of file sizes delivered to the domain
  – S: sum of the corresponding service times

• For each request, first extract its domain; the service time can then be estimated as B/R
  – B: requested file size
  – R: representative service rate of that domain, obtained as above
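
The three steps above can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the class name `DomainEstimator`, its method names, the restriction to IPv4 addresses, and returning `None` for a domain with no history are assumptions. The slides do not say what the scheduler does before a domain has any measurements.

```python
import ipaddress
from collections import defaultdict


class DomainEstimator:
    """Estimate service time as B / R, with R = F / S kept per domain."""

    def __init__(self, k):
        if not 0 <= k <= 32:
            raise ValueError("k must be between 0 and 32 for IPv4")
        self.k = k
        self.bytes_sent = defaultdict(float)   # F: bytes delivered per domain
        self.time_spent = defaultdict(float)   # S: service time per domain

    def domain(self, client_ip):
        """Return the high-order k bits of an IPv4 address as an integer."""
        addr = int(ipaddress.IPv4Address(client_ip))
        return addr >> (32 - self.k)

    def record(self, client_ip, file_size, service_time):
        """Passively update F and S after a request has been served."""
        d = self.domain(client_ip)
        self.bytes_sent[d] += file_size
        self.time_spent[d] += service_time

    def estimate(self, client_ip, file_size):
        """Return B / R for the client's domain, or None without history."""
        d = self.domain(client_ip)
        sent = self.bytes_sent.get(d, 0.0)
        spent = self.time_spent.get(d, 0.0)
        if sent <= 0 or spent <= 0:
            return None  # assumption: the caller picks a fallback estimate
        rate = sent / spent            # R: representative service rate
        return file_size / rate        # B / R
```

Keeping only two running sums per domain makes the memory use proportional to the number of domains actually seen, at most 2^k, which is how k trades estimator accuracy against memory.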


Higher Correlation Can Be Achieved

[Figure: correlation coefficient R (0.1 to 0.7) versus bits used to define a domain (0 to 32)]


Much Lower Service Times Can Be Achieved

[Figure: mean response time (milliseconds, 100 to 900) versus bits used to define a domain (0 to 32). Curves: PS, FSP-D, SRPT-FS, FSP-FS, SRPT-D, and SRPT and FSP]


Much Lower Queue Lengths Can Be Achieved

[Figure: mean queue length (1000 to 3000) versus bits used to define a domain (0 to 35). Curves: FSP-D, FSP-FS, SRPT-FS, PS, SRPT-D, and SRPT and FSP]


Conclusions

• File size may not be a good estimator of service time for many regimes

• File size-based SRPT and FSP can perform worse than PS in these regimes

• Domain-based scheduling brings the benefits of size-based scheduling to these regimes


For more information

• Prescience Lab at Northwestern University
  – www.presciencelab.org


Jeeves’ Invitation …

• Have you ever seen the whole Web at once?

• Did you ever wonder how to harness the power of thousands of machines?

• We are hiring talent for Internet Search
  – Software Engineer
  – Development Manager

Send us your resume: [email protected]


Correlation is Weak on a Typical Web Server

• Measurement on a departmental web server: scatter plots of file size versus service time (log-log scale)

[Figure: two scatter plots of file size (y-axis) versus service time (x-axis). Requests from the whole Internet: R ≈ 0.14. Requests from a “/16” IP network: R ≈ 0.25]


Future Work

• The “back-filling” queuing model

[Diagram: web requests 1 to 7 drawn against bandwidth and time axes; labels: Web Server, Bottleneck]