QoPS: A QoS based Scheme for Parallel Job Scheduling
M. Islam, P. Balaji, P. Sadayappan and D. K. Panda
Computer and Information Science
The Ohio State University
Presented by Gerald Sabin
The Ohio State University, 06/24/2003
Job Schedulers Today
Independent Parallel Job Scheduling Model
– Dynamically arriving Independent Parallel Jobs
– Resource mapping: Submitted Jobs to Resources present
Number of Techniques studied over the years
– Backfilling (Ex: Conservative, EASY, No Guarantee)
– Priority based scheduling
    • Differentiated service to different classes of jobs
    • Soft Real-time or Best Effort guarantees to the completion time
Hard Real-Time or “Deadline-based” scheduling
– Allow Users to specify the deadline they desire
– Cost model based on Resources Used AND Deadline Specified
– Requires a deadline-based scheduling algorithm: LONG OVERDUE !
QoS for Job Scheduling
Two Components in providing QoS
– Cost Model Component
    • Based on Resources Used AND Deadline Specified
    • More urgent jobs are charged more
    • Guarantees the service requested
– Job Scheduling Component
    • Admission Control
        – Can we meet the specified deadline?
    • Once admitted, cannot miss the specified deadline
We only deal with the Job Scheduling Component
Overview
Related Work
The QoPS Algorithm
Simulation Approach
Experimental Results
Conclusions and Future Work
Related Work
Feitelson’s Slack-Based (SB) Scheduling [feit97]
– Focused on improving Utilization and Turnaround time
– Jobs have an associated slack, based on their priority
    • This determines how much they can be delayed
Ramamritham’s Real-Time (RT) Scheduling [krithi90]
– Deadline-based scheduling algorithm
– Non-periodic Single Processor Jobs
– Statically available at start time
[feit97]: “Supporting Priorities and Improving Utilization of the IBM SP2 Scheduler using Slack based Backfilling”, D. Talby, D. G. Feitelson, IPPS, Apr ’97
[krithi90]: “Efficient Scheduling Algorithms for Real-Time Multiprocessor Systems”, K. Ramamritham, J. A. Stankovic, P-F. Shiah, TPDS, Apr ’90
Slack-Based (SB) Scheduling Algorithm
When a job (JN+1) arrives
– Calculate its slack (based on its priority)
– If J1, J2, …, JN are already present and scheduled in that order
– Try placing the job (JN+1) in each possible position in this list
– For each of the N+1 candidate schedules that is feasible, calculate a cost function ‘f’
    • A schedule is feasible if no job exceeds the slack given to it
– Choose the schedule with the “best cost function value”
[Figure: the new job J7 is tried at each position in the existing queue J1 J2 J3 J4 J5 J6; a cost function evaluation gives f0 f1 f2 f3 f4 f5 f6, from which fbest is chosen]
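The insertion step above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the slides: jobs run one after another on a single resource, a job's slack is encoded as a hypothetical `latest_start` time, and the cost function ‘f’ is taken to be the sum of completion times. The names `Job`, `is_feasible`, `cost` and `sb_insert` are made up for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Job:
    name: str
    runtime: float
    latest_start: float  # assumed encoding of slack: promised start + slack


def start_times(order: List[Job]) -> List[float]:
    """Start time of each job when the jobs run back to back in this order."""
    now, starts = 0.0, []
    for job in order:
        starts.append(now)
        now += job.runtime
    return starts


def is_feasible(order: List[Job]) -> bool:
    """Feasible if no job is delayed beyond the slack given to it."""
    return all(s <= j.latest_start for j, s in zip(order, start_times(order)))


def cost(order: List[Job]) -> float:
    """Hypothetical cost function 'f': sum of completion times (lower is better)."""
    return sum(s + j.runtime for j, s in zip(order, start_times(order)))


def sb_insert(queue: List[Job], new_job: Job) -> Optional[List[Job]]:
    """Try the new job at each of the N+1 positions and keep the best feasible order."""
    best = None
    for pos in range(len(queue) + 1):
        candidate = queue[:pos] + [new_job] + queue[pos:]
        if is_feasible(candidate) and (best is None or cost(candidate) < cost(best)):
            best = candidate
    return best  # None if no position yields a feasible schedule


queue = [Job("J1", 4, 0), Job("J2", 2, 6)]
chosen = sb_insert(queue, Job("J3", 1, 10))
print([j.name for j in chosen])
```

Placing J3 first would delay J1 past its slack, so only the last two positions are feasible here, and the cheaper of the two is kept.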
Real-Time (RT) Scheduling Algorithm
Static Scheme, so there’s no concept of new jobs arriving
Sort jobs based on a heuristic function
Start from a NULL schedule
For each of the jobs
– If placing the job in the current schedule misses its deadline
    • Backtrack to the last known feasible schedule
– If (number of backtracks > p)
    • Discard the Schedule
If all jobs have been placed within their deadlines
– Accept the Schedule
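A minimal sketch of this bounded-backtracking search, assuming a single processor, all jobs available at time 0, and Earliest Deadline First as the default heuristic. The names `Job`, `rt_schedule`, `max_backtracks` (the ‘p’ above) and `heuristic` are illustrative, and the dead-end test is a simplification rather than the exact procedure of [krithi90].

```python
from typing import Callable, List, NamedTuple, Optional


class Job(NamedTuple):
    name: str
    runtime: float
    deadline: float


def rt_schedule(
    jobs: List[Job],
    max_backtracks: int,
    heuristic: Callable[[Job], float] = lambda j: j.deadline,  # EDF by default
) -> Optional[List[Job]]:
    """Build a schedule job by job; give up after more than max_backtracks backtracks."""
    ordered = sorted(jobs, key=heuristic)
    backtracks = 0

    def extend(schedule: List[Job], remaining: List[Job], now: float):
        nonlocal backtracks
        if not remaining:
            return schedule  # every job placed within its deadline: accept
        # Dead end: some pending job would already miss its deadline if run next.
        if any(now + j.runtime > j.deadline for j in remaining):
            return None
        for i, job in enumerate(remaining):
            rest = remaining[:i] + remaining[i + 1:]
            found = extend(schedule + [job], rest, now + job.runtime)
            if found is not None:
                return found
            backtracks += 1  # returned to the last known feasible schedule
            if backtracks > max_backtracks:
                return None  # discard the schedule
        return None

    return extend([], ordered, 0.0)


jobs = [Job("A", 1, 10), Job("B", 3, 3)]
# A deliberately poor heuristic (shortest job first) forces one backtrack.
by_runtime = lambda j: j.runtime
print(rt_schedule(jobs, max_backtracks=1, heuristic=by_runtime))
```

With the shortest-job-first heuristic the search first tries A, finds that B can no longer finish by its deadline, backtracks once and then succeeds with B before A. With a backtrack budget of 0 the same input is discarded.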
Working of the RT Algorithm
[Figure: jobs JN JN-1 JN-2 . . . J3 J2 J1, sorted by Earliest Deadline First (EDF); the schedule is built up from NULL, with nodes J1, J2, J3 and J4 shown]
Modified Slack-Based (MSB) Algorithm
– Motivation of SB: To improve Utilization and Response Time
– SB assigns slack to jobs based on job priorities
– MSB assigns slack to jobs based on the deadline specified
– Rest of the algorithm is unchanged
Modified Real-Time (MRT) Algorithm
– RT was designed for non-periodic uni-processor jobs
– All jobs are Statically available at the start of the execution
– MRT involves two modifications to RT
To allow dynamically arriving jobs
– Run the algorithm every time a job arrives
To allow scheduling of parallel jobs
– Allowing backfilling of jobs
Overview
Related Work
The QoPS Algorithm
Simulation Approach
Experimental Results
Conclusions and Future Work
The Basic QoPS Algorithm
Similar to the MSB algorithm, but…
– Provides more flexibility in reordering scheduled jobs
When a job (JN+1) arrives
– If J1, J2, …, JN are already present and scheduled in that order
– Place the job (JN+1) at the start of all jobs
    • Try scheduling the jobs in that order
– If all jobs are able to meet their deadlines, Great ! Admit it !
– If some job fails, we have two options:
– Option1:
    • Consider the failed job as a critical job
    • Push the failed job to the start of the schedule and retry
    • ‘k’ number of such re-orderings of existing jobs are allowed
    • If (number of re-orderings > k) switch to option 2
– Option2:
    • Back off exponentially in the position at which you try placing job (JN+1) and retry
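The admission test above might look roughly like the following sketch. It rests on several assumptions that the slides do not state: jobs run sequentially on a single resource (no parallelism or backfilling), a failed job is pushed to the very front of the order, and “backing off exponentially” means halving the distance between the insertion point and the end of the queue on each retry. `Job`, `first_violation`, `try_with_reordering` and `qops_admit` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Job:
    name: str
    runtime: float
    deadline: float


def first_violation(order: List[Job]) -> Optional[int]:
    """Index of the first job that misses its deadline, or None if all meet it."""
    now = 0.0
    for idx, job in enumerate(order):
        now += job.runtime
        if now > job.deadline:
            return idx
    return None


def try_with_reordering(order: List[Job], k: int) -> Optional[List[Job]]:
    """Option 1: push each failed (critical) job to the front, at most k times."""
    order = list(order)
    violations = 0
    while True:
        bad = first_violation(order)
        if bad is None:
            return order
        violations += 1
        if violations > k:
            return None  # too many re-orderings: caller falls back to option 2
        order.insert(0, order.pop(bad))


def qops_admit(schedule: List[Job], new_job: Job, k: int) -> Optional[List[Job]]:
    """Return the new job order if the job is admitted, or None if it is rejected."""
    n = len(schedule)
    step = n  # assumed back-off: distance of the insertion point from the end
    while True:
        pos = n - step  # 0 on the first try, then n - n//2, ..., finally n
        candidate = schedule[:pos] + [new_job] + schedule[pos:]
        result = try_with_reordering(candidate, k)
        if result is not None:
            return result
        if step == 0:
            return None  # even the last position failed: reject the job
        step //= 2  # option 2: back off and retry


current = [Job("J1", 2, 2), Job("J2", 2, 10)]
admitted = qops_admit(current, Job("J3", 1, 3), k=2)
print([j.name for j in admitted])
```

A rejected job leaves the existing schedule untouched, because every attempt works on a copy. That matches the requirement that an admitted job can never miss its deadline.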
Working of the QoPS Algorithm
[Figure: animation of the new job J13 being inserted into the existing schedule of jobs J1 to J12; Max. Violations Allowed = 2; the Current Violations counter steps through 0, 1, 2 and back to 0]
Overview
Related Work
The QoPS Algorithm
Simulation Approach
Experimental Results
Conclusions and Future Work
Simulation Approach
[Figure: simulation flow with boxes CTC/SDSC Trace; Load Variation (Duplication/Expansion); Deadline Calculator; Deadline-based Trace; QoPS Simulation; MSB Simulation; MRT Simulation; EASY Simulation]
Trace Generation
Many job logs available, but no associated deadlines
Synthetic Deadline Generation
– Generate a schedule for the job trace using EASY
– For any job J, if the Turnaround time in this schedule is T
– Deadline for J = Arrival Time + max (runtime, (1-SF) x T)
– SF is the “Stringency factor” (0 < SF < 1)
    • 0 would give the least stringent deadlines and 1 the most stringent
Some jobs might not come with deadlines
– Very lax deadlines to prevent starvation
– If ‘T’ is the current expected Turnaround time, Deadline = Arrival Time + max (24hrs, R x T)
– R is the “Relaxation Factor” of the schedule
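The two deadline formulas above translate directly into code. The sketch below assumes all times are in seconds; the function and parameter names are illustrative and not taken from the simulator.

```python
SECONDS_PER_DAY = 24 * 3600


def synthetic_deadline(arrival: float, runtime: float, turnaround: float, sf: float) -> float:
    """Deadline = Arrival Time + max(runtime, (1 - SF) x T), with 0 < SF < 1."""
    if not 0 < sf < 1:
        raise ValueError("stringency factor must satisfy 0 < SF < 1")
    return arrival + max(runtime, (1 - sf) * turnaround)


def lax_deadline(arrival: float, expected_turnaround: float, r: float) -> float:
    """Deadline for a job submitted without one: Arrival Time + max(24hrs, R x T)."""
    return arrival + max(SECONDS_PER_DAY, r * expected_turnaround)


# A job arriving at t=100 s that runs for 50 s and had a 200 s turnaround under EASY.
print(synthetic_deadline(100, 50, 200, sf=0.2))  # looser: dominated by (1 - SF) x T
print(synthetic_deadline(100, 50, 200, sf=0.9))  # tighter: clamped to the runtime
print(lax_deadline(0, 3600, r=2))                # the 24-hour floor applies
```

The `max` with the runtime keeps a deadline achievable in principle: however stringent SF is, a job is never asked to finish sooner than it could if it started the moment it arrived.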
Overview
Related Work
The QoPS Algorithm
Simulation Approach
Experimental Results
Conclusions and Future Work
Experimental Results
Two evaluation scenarios
– Scenario1: All jobs have deadlines
    • Pure comparison of the three algorithms
– Scenario2: Mixed jobs: Some have deadlines, others are assigned artificial deadlines
    • More realistic
Tests Conducted:
– Job Acceptance rate
– Impact on Non-deadline Jobs
– Utilization Variation, etc
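The acceptance-rate results are reported both as a percentage of unadmitted jobs and as a percentage of unadmitted processor seconds. A small sketch of how these two numbers could be computed from a simulation log, assuming that a job's processor seconds are its processor count multiplied by its runtime; the record layout and the name `unadmitted_percentages` are hypothetical.

```python
from typing import List, Tuple

# Assumed record layout: (processors, runtime in seconds, admitted?)
JobRecord = Tuple[int, float, bool]


def unadmitted_percentages(jobs: List[JobRecord]) -> Tuple[float, float]:
    """Return (% unadmitted jobs, % unadmitted processor seconds)."""
    if not jobs:
        raise ValueError("no jobs in the log")
    total_proc_secs = sum(procs * runtime for procs, runtime, _ in jobs)
    rejected = [(procs, runtime) for procs, runtime, admitted in jobs if not admitted]
    rejected_proc_secs = sum(procs * runtime for procs, runtime in rejected)
    pct_jobs = 100.0 * len(rejected) / len(jobs)
    pct_proc_secs = 100.0 * rejected_proc_secs / total_proc_secs
    return pct_jobs, pct_proc_secs


log = [(1, 100.0, True), (3, 100.0, False)]
print(unadmitted_percentages(log))
```

The two metrics can diverge: here only half of the jobs are rejected, but the rejected job is the larger one, so it accounts for three quarters of the requested processor seconds.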
Admittance Capacity Comparison
[Figure, two panels: “Unadmitted Jobs Vs Load” (y: % Unadmitted Jobs) and “Unadmitted Proc. Secs Vs Load” (y: % Unadmitted Proc. Secs.); x: Load (Duplication), 1 to 1.6; series: QoPS, MRT, MSB]
• All jobs have deadlines; Stringency Factor = 0.2; CTC Trace
• QoPS admits the most number of jobs (and Processor Seconds)
Utilization Comparison
[Figure, two panels: “Utilization Vs Load (Stringency Factor = 0.2)” and “Utilization Vs Load (Stringency Factor = 0.5)”; x: Load (Duplication), 1 to 1.6; y: Utilization; series: QoPS, MRT, MSB, EASY]
• All jobs have deadlines; CTC Trace
• Deadline-based schemes lose about 10% Utilization
Admittance Capacity Comparison (Mixed Jobs)
[Figure, two panels: “Unadmitted Jobs Vs Load” (y: % Unadmitted Jobs) and “Unadmitted Proc. Secs Vs Load” (y: % Unadmitted Proc. Secs.); x: Load (Duplication), 1 to 1.6; series: QoPS, MRT, MSB]
• 20% jobs have deadlines; Stringency Factor = 0.2; CTC Trace
• QoPS admits the most number of jobs (and Processor Seconds)
Response Time and Slow Down Vs Load
[Figure, two panels: “Response Time Vs Load” (y: Response Time (secs)) and “Slow Down Vs Load” (y: Slow Down); x: Load (Duplication), 1 to 1.6; series: QoPS, MRT, MSB, EASY]
• 20% jobs have deadlines; Stringency Factor = 0.2; CTC Trace
• QoPS gives the best slow-down in spite of accepting more jobs; Unfair to EASY
Utilization Vs Load (Mixed Jobs)
[Figure: “Utilization Vs Load”; x: Load, 1 to 1.6; y: Utilization; series: QoPS, MRT, MSB, EASY]
• EASY has a higher Utilization
• Accepts more (all) jobs; Unfair to the deadline-based schemes
Response Time and Slow Down Vs Utilization
• 20% jobs have deadlines; Stringency Factor = 0.2; CTC Trace
• Fairer Comparison; QoPS still performs better in most cases, especially Slow Down
Overview
Related Work
The QoPS Algorithm
Simulation Approach
Experimental Results
Conclusions and Future Work
Conclusions
“Deadline-based” scheduling is desirable
– No such scheme for parallel jobs
– Previous schemes can be extended, but…
    • Not proposed for this kind of scheduling
    • Might not fit in perfectly
– Proposed the QoPS algorithm
Allows jobs to specify required deadlines
– Admission control checks admissibility
– Job Scheduler schedules admitted jobs
Outperforms extended previous schemes (MSB and MRT)
– But, the main idea is not performance
– Deadline Scheduling is a necessity and QoPS is an effort to meet it
Future Work
Cost Metric component in QoS
Currently using a first fit mechanism
– Best fit is expected to do much better
Job Shedding Vs Non Job Shedding
– If deadline can’t be met
    • Should we reject the job (will the user try again?)
    • Should we give it the best available deadline?
Grid based extensions to QoPS
Thank You !
Backup Slides
Admittance Capacity for SDSC trace
[Figure, two panels: “Unadmitted Jobs Vs Load” (y: % Unadmitted Jobs) and “Unadmitted Proc. Secs Vs Load” (y: % Unadmitted Proc. Secs.); x: Load (Duplication), 1 to 1.6; series: QoPS, MRT, MSB]
• All jobs have deadlines; Stringency Factor = 0.2; CTC Trace
• QoPS admits the most number of jobs (and Processor Seconds)
Admittance Capacity with Job Expansion
[Figure, two panels: “Unadmitted Jobs Vs Load” (y: % Unadmitted Jobs) and “Unadmitted Proc. Secs Vs Load” (y: % Unadmitted Proc. Secs.); x: Load (Expansion), 1 to 1.6; series: QoPS, MRT, MSB]
• All jobs have deadlines; Stringency Factor = 0.2; CTC Trace
• QoPS admits the most number of jobs (and Processor Seconds)
Impact of Relaxation Factor
[Figure, two panels: “Average Response Time Vs Load” (y: Average Response Time (secs)) and “Average Slow Down Vs Load” (y: Average Slow Down); x: Load (Duplication), 1 to 1.6; series: Factor = 2, Factor = 5, Factor = 10]
• 80% jobs have deadlines; Stringency Factor = 0.2; CTC Trace
• With low “R”, Longer jobs perform better (reflects in Resp. Time)