An Efficient Algorithm for Scheduling Instructions with Deadline Constraints on ILP Machines Wu Hui Joxan Jaffar School of Computing National University of Singapore




What is an ILP machine?

• Multiple functional units of different types.

• Issue an instruction every machine cycle on each functional unit.

• Multiple instructions executed in parallel.

• Latencies exist between instructions.

• Two categories: Superscalar and VLIW (Very Long Instruction Word).

• Typical Example: Intel Itanium processor (http://developer.intel.com/design/ia64/microarch_ovw/index.htm)


What is the problem?

Given a problem instance P: a set of n UET (unit-execution-time) instructions in a basic block, subject to the following constraints:

• precedence-latency constraints: a DAG $G = (V, E, W)$, where each latency $l_{ij} \ge -1$,

• deadline constraints: individual pre-assigned deadlines, and

• m functional units of p different types,

compute a feasible schedule, i.e. one that satisfies all constraints, whenever one exists, or a valid schedule with minimum lateness if no feasible schedule exists.
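To make these constraints concrete, the following Python sketch checks a candidate schedule. It rests on assumptions the slide does not spell out: each instruction occupies one cycle, an edge (i, j) with latency l_ij requires start(j) >= start(i) + 1 + l_ij, and lateness is the maximum of completion time minus deadline. The name `check_schedule` and the dictionary encoding are illustrative choices, not part of the original work.

```python
from collections import Counter

def check_schedule(start, deadline, fu_type, units, edges):
    """Return (valid, max_lateness) for a UET schedule.

    start:    {instr: start cycle}
    deadline: {instr: pre-assigned deadline (latest completion time)}
    fu_type:  {instr: functional-unit type}
    units:    {type: number of units of that type}
    edges:    {(i, j): latency l_ij >= -1}
    """
    # Assumed precedence-latency rule: j starts at least l_ij cycles
    # after i completes (i completes at start[i] + 1).
    for (i, j), lat in edges.items():
        if start[j] < start[i] + 1 + lat:
            return False, None
    # Resource rule: at most units[t] instructions of type t per cycle.
    load = Counter((fu_type[v], s) for v, s in start.items())
    if any(n > units[t] for (t, _), n in load.items()):
        return False, None
    # Assumed lateness: maximum over instructions of completion - deadline.
    return True, max(s + 1 - deadline[v] for v, s in start.items())
```

Under this reading, a schedule is feasible when the function reports it valid with maximum lateness at most 0.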


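Example 1. A problem instance P with two functional units of different types.

[Figure: the precedence DAG of P on v1, ..., v12 with edge latencies 0 and 1; the edges are not recoverable from the transcript. Pre-assigned deadlines: v1 [4], v2 [4], v3 [4], v4 [5], v5 [5], v6 [5], v7 [5], v8 [6], v9 [6], v10 [6], v11 [6], v12 [6].]

Table 1. A feasible schedule for P.

| Time | 0-1 | 1-2 | 2-3 | 3-4 | 4-5 | 5-6 |
|---|---|---|---|---|---|---|
| FU1 | v1 | v2 | v7 | v6 | v10 | v11 |
| FU2 | v3 | v4 | v5 | v8 | v9 | v12 |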


What does our algorithm achieve?

Our scheduling algorithm computes a feasible schedule, whenever one exists, for any problem instance of the following special cases:

1) Arbitrary DAG, latencies of 0 and two functional units of different types.

2) Monotone interval graph, latencies $\ge -1$ and multiple functional units of different types.

3) In-forest, equal latencies and multiple functional units of different types.


In the case that there is no feasible schedule, our algorithm computes a schedule with minimum lateness for all the above special cases.

Furthermore, by setting all deadlines to a constant, our algorithm will compute a schedule with minimum completion time for

• any instance of the above special cases and

• any instance of the special case of out-forest, equal latencies and multiple functional units of different types.


[Figure: three example precedence graphs with edge latencies: an in-tree, an out-tree, and a monotone interval graph. The drawings are not recoverable from the transcript.]


What is the time complexity?

Given the transitive closure of the precedence graph, the algorithm runs in

• $O(ne + nd)$ time for the general model, where d is the maximum latency,

• $O(\min\{ne, de\} + nd)$ time if no latency of -1 exists, and

• $O(n^2)$ time if, for each instruction, the latencies between it and all its immediate successors are equal.

The transitive closure can be computed in $O(\min\{ne, n^{2.376}\})$ time.


What has been done in the past?

• Palem and Simons' algorithm on identical processors [ACM TOPLAS, 1993].

• Wu, Jaffar and Yap's algorithm on identical processors [PACT 2000].

• Bernstein, Rodeh and Gertner's work on two processors of different types [IEEE Transactions on Computers, 1989].


What are the contributions of our work?

• Propose an efficient polynomial-time algorithm that solves several special cases, for each of which no polynomial-time algorithm was previously known.

• Present the first approximation ratio: for any greedy algorithm, the length of any schedule computed never exceeds p+1 times the length of an optimal schedule, where p is the number of types of functional units.


What are the main ideas of our algorithm?

• Compute the $l_{\max}(v_i)$-successor-tree-consistent deadline for each instruction $v_i$, where $l_{\max}(v_i)$ is the maximum latency between $v_i$ and all its immediate successors.

• Compute a schedule by list scheduling, where the priority of each instruction is its successor-tree-consistent deadline and a smaller number implies higher priority.
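The second step can be sketched in Python as a cycle-by-cycle greedy list scheduler. This is an illustrative sketch, not the authors' implementation: the name `list_schedule`, the data encoding, and the rule that a successor may start once start(u) + 1 + latency has been reached are assumptions. The priority passed in stands for the successor-tree-consistent deadline.

```python
def list_schedule(instrs, edges, units):
    """Greedy list scheduling of UET instructions on typed functional units.

    instrs: {name: (fu_type, priority)}; a smaller priority value is served first
    edges:  {(u, v): latency}, assumed to form a DAG
    units:  {fu_type: number of units}, at least one unit per type used
    Returns {name: start cycle}.
    """
    preds = {v: [] for v in instrs}
    for (u, v), lat in edges.items():
        preds[v].append((u, lat))

    def is_ready(v, t):
        # Assumed rule: v may start once every predecessor u has completed
        # (at start[u] + 1) and the latency on the edge has elapsed.
        return all(u in start and start[u] + 1 + lat <= t for u, lat in preds[v])

    start, t = {}, 0
    while len(start) < len(instrs):
        free = dict(units)  # units still idle in cycle t
        while True:
            # Recompute the ready list after every pick so that a successor
            # reached through a latency of -1 can share the cycle.
            ready = sorted((v for v in instrs if v not in start and is_ready(v, t)),
                           key=lambda v: (instrs[v][1], v))
            pick = next((v for v in ready if free[instrs[v][0]] > 0), None)
            if pick is None:
                break
            start[pick] = t
            free[instrs[pick][0]] -= 1
        t += 1
    return start
```

Ties between equal priorities are broken by name here purely to keep the sketch deterministic; the slides do not prescribe a tie-breaking rule for this step.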


What is the $l_{\max}(v_i)$-successor-tree-consistent deadline?

• For each sink instruction $v_i$, its $l_{\max}(v_i)$-successor-tree-consistent deadline $d'_i$ is equal to its pre-assigned deadline.

• For a non-sink instruction $v_i$, $d'_i$ is an upper bound on its latest completion time in any feasible schedule for the relaxed problem instance P(i).


What is P(i)?

P(i) consists of the set $V(i) = \{v_i\} \cup \mathrm{Succ}(v_i)$ of instructions with the following new constraints.

• Precedence-latency constraints: the $l_{\max}(v_i)$-successor-tree of $v_i$.

• Deadline constraints: the deadline of each instruction $v_j$ in $\mathrm{Succ}(v_i)$ is its $l_{\max}(v_j)$-successor-tree-consistent deadline, and the deadline of $v_i$ is its pre-assigned deadline.


What is the k-successor-tree of $v_i$?

Given a weighted graph $G = (V, E, W)$, an integer k and $v_i \in V$, the k-successor-tree of $v_i$ is a subgraph $G' = (V', E', W')$, where

• $V' = \{v_i\} \cup \{v_j : v_j \in \mathrm{Succ}(v_i)\}$,

• $E' = \{(v_i, v_j) : v_j \in \mathrm{Succ}(v_i)\}$, and

• each edge weight $l'_{ij}$ in $W'$ is defined as follows.

1) In the case that $k = -1$: if $l^+_{ij} = -1$, then $l'_{ij} = -1$; otherwise $l'_{ij} = 0$.

2) In the case that $k \neq -1$: if $l^+_{ij} < k$, then $l'_{ij} = l^+_{ij}$; otherwise $l'_{ij} = k$.
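The weight rule can be sketched in Python. The slides do not define $l^+_{ij}$; the sketch assumes it is the largest value, over all paths from $v_i$ to $v_j$, of the sum of the edge latencies plus one cycle for every intermediate instruction. The function names and the edge set in the example are hypothetical; the edges were chosen so that the result agrees with the weights listed for the 4-successor-tree of v2 in Figure 2.

```python
def transitive_latencies(edges, root):
    """Assumed l+ values: longest 'latency distance' from root to each successor.

    edges: {(u, v): latency}, assumed to form a DAG.
    """
    succ = {}
    for (u, v), lat in edges.items():
        succ.setdefault(u, []).append((v, lat))

    best = {}

    def walk(u, acc):
        # acc is l+ from root to u (None while u is the root itself).
        for v, lat in succ.get(u, ()):
            cand = lat if acc is None else acc + 1 + lat
            if cand > best.get(v, float("-inf")):
                best[v] = cand
                walk(v, cand)  # re-propagate the improved value

    walk(root, None)
    return best


def k_successor_tree(edges, root, k):
    """Edge weights l' of the k-successor-tree of root, as {successor: weight}."""
    lplus = transitive_latencies(edges, root)
    if k == -1:
        return {v: (-1 if l == -1 else 0) for v, l in lplus.items()}
    return {v: min(l, k) for v, l in lplus.items()}  # l+ if l+ < k, else k


# Hypothetical edge set, consistent with the weights shown in Figure 2.
edges = {("v2", "v3"): 4, ("v2", "v4"): 1, ("v2", "v5"): -1,
         ("v3", "v6"): 1, ("v4", "v7"): 0, ("v5", "v8"): 1}
tree = k_successor_tree(edges, "v2", 4)
```

With these edges, `tree` maps v3 and v6 to 4, v4 to 1, v7 to 2, v5 to -1 and v8 to 1: the long path through v3 to v6 is capped at k = 4, while shorter transitive latencies are kept as they are.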


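Figure 1: The precedence-latency constraints. [Drawing not recoverable from the transcript: a DAG on v1, ..., v8 with edge latencies ranging from -1 to 4.]

Figure 2: The 4-successor-tree of v2. [Root v2; edge weights to its successors, paired in transcript order: v3: 4, v6: 4, v4: 1, v7: 2, v5: -1, v8: 1.]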


How to compute the $l_{\max}(v_i)$-successor-tree-consistent deadline for $v_i$?

Key idea: backward scheduling.

• At any time t, among all ready instructions, an instruction $v_k$ with the largest latency in P(i) is chosen and scheduled as late as possible on a functional unit of the same type. In case of ties, among all instructions with the same latency, an instruction with the latest deadline is chosen.

A schedule computed by backward scheduling is called a backward schedule.


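Example 2: A relaxed problem instance P(1). [Figure: root v1 [2]; successors with deadline and edge latency, paired in transcript order: v2 [5] 3, v3 [6] 3, v4 [5] 1, v5 [3] 2, v6 [4] -1, v7 [3] 1.]

Table 2. A backward schedule for P(1) on the time axis 0 to 6. [Slot positions are not recoverable from the transcript; instructions are listed in schedule order.]

FU1: v7, v4, v2, v3

FU2: v5, v6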


Scheduling Algorithm

repeat
    choose an instruction $v_i$ such that
        1) its $l_{\max}(v_i)$-successor-tree-consistent deadline $d'_i$ has not been computed; and
        2) either $v_i$ is a sink or the successor-tree-consistent deadlines of all its successors have been computed;
    if $v_i$ is a sink then $d'_i = d_i$;
    else {
        if $v_i$ has only one immediate successor $v_j$ and $l_{ij} \neq -1$ then $d'_i = \min\{d_i, d'_j - l_{ij} - 1\}$;
        else {
            compute a backward schedule b for P(i);
            $d'_i = \min\{d_i, \min\{b(v_j) - l_{ij} : v_j \in \mathrm{Succ}(v_i)\}\}$;
        }
    }
until the successor-tree-consistent deadlines of all instructions have been computed;

use list scheduling to compute a schedule for P, where the priority of each instruction $v_i$ is $d'_i$ and a smaller number implies higher priority;
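The single-successor shortcut in the algorithm can be read off directly, under the assumption (not stated on the slide) that an edge $(v_i, v_j)$ with latency $l_{ij}$ requires $v_j$ to start at least $l_{ij}$ cycles after $v_i$ completes. Here $s_j$ denotes the start time of $v_j$ and $c_i$ the completion time of $v_i$; both symbols are introduced only for this illustration, and the deadline of $v_j$ in P(i) is taken to be its successor-tree-consistent deadline $d'_j$.

```latex
\begin{align*}
s_j &\ge c_i + l_{ij} && \text{(assumed precedence-latency constraint)} \\
s_j + 1 &\le d'_j && \text{(unit execution time and deadline of } v_j\text{)} \\
\Rightarrow\quad c_i &\le d'_j - l_{ij} - 1 \\
\Rightarrow\quad d'_i &= \min\{d_i,\; d'_j - l_{ij} - 1\}
\end{align*}
```

In the general branch the same bound is applied to every successor at once, with $s_j$ given by the backward schedule value $b(v_j)$.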


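Example 1. A problem instance P with two functional units of different types.

[Figure: the precedence DAG of P, with each instruction labelled by its pre-assigned deadline followed, where already computed, by its successor-tree-consistent deadline: v1 [4, ?], v2 [4], v3 [4], v4 [5, 4], v5 [5, 5], v6 [5, 5], v7 [5, 5], v8 [6, 6], v9 [6, 6], v10 [6, 6], v11 [6, 6], v12 [6, 6]. The edges are not recoverable from the transcript.]

Figure 4: The relaxed problem P(1). [Root v1 [4]; successors with deadline and edge latency, paired in transcript order: v4 [4] 0, v5 [5] 0, v6 [5] 1, v8 [6] 1, v9 [6] 1, v10 [6] 1, v11 [6] 1.]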


Since $\min\{b(v_j) - l_{1j} : v_j \in \mathrm{Succ}(v_1)\} = 2$, the $l_{\max}(v_1)$-successor-tree-consistent deadline of $v_1$ is

$\min\{d_1, 2\} = \min\{4, 2\} = 2$.

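Table 3: A backward schedule b for Succ(v1). Empty cells are idle slots.

| Time | 0-1 | 1-2 | 2-3 | 3-4 | 4-5 | 5-6 |
|---|---|---|---|---|---|---|
| FU1 | | | | v6 | v10 | v11 |
| FU2 | | | v4 | v5 | v8 | v9 |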
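The value 2 computed on this slide can be reproduced with a small Python sketch of backward scheduling. It is one reading of the rule stated on the "Key idea" slide, not the authors' implementation: time runs downward from the largest deadline, an instruction counts as ready at time t when its deadline is at least t, and b(v) is taken to be the start time of v. The function names, the tuple encoding, and the pairing of latencies and functional-unit types with instructions (read from Figure 4 and Table 3 in transcript order) are assumptions.

```python
def backward_schedule(succs, units):
    """Schedule each successor as late as possible.

    succs: {name: (fu_type, latency, deadline)}
    units: {fu_type: number of units}, at least one unit per type used
    Returns {name: start time b(name)}.
    """
    start = {}
    t = max(d for _, _, d in succs.values())
    while len(start) < len(succs):
        for fu, count in units.items():
            for _ in range(count):
                # Ready at time t: unscheduled, right type, deadline >= t.
                ready = [v for v, (ty, _, d) in succs.items()
                         if v not in start and ty == fu and d >= t]
                if not ready:
                    break
                # Largest latency first; ties go to the latest deadline
                # (then to the name, only to keep the sketch deterministic).
                pick = max(ready, key=lambda v: (succs[v][1], succs[v][2], v))
                start[pick] = t - 1  # occupies the slot [t-1, t]
        t -= 1
    return start


def consistent_deadline(d_i, succs, units):
    """d'_i = min{d_i, min over successors of b(v_j) - l_ij}."""
    b = backward_schedule(succs, units)
    return min(d_i, min(b[v] - succs[v][1] for v in succs))


# Relaxed instance P(1) of Example 1, as read from Figure 4 and Table 3.
p1 = {"v4": ("FU2", 0, 4), "v5": ("FU2", 0, 5), "v6": ("FU1", 1, 5),
      "v8": ("FU2", 1, 6), "v9": ("FU2", 1, 6), "v10": ("FU1", 1, 6),
      "v11": ("FU1", 1, 6)}
units = {"FU1": 1, "FU2": 1}
d1 = consistent_deadline(4, p1, units)
```

Under these assumptions FU1 runs v6, v10, v11 and FU2 runs v4, v5, v8, v9, packed against time 6, and `d1` evaluates to 2, in agreement with the slide.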


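Example 1. A problem instance P with two functional units of different types.

[Figure: the precedence DAG of P, with each instruction labelled by its pre-assigned deadline followed by its successor-tree-consistent deadline: v1 [4, 2], v2 [4, 3], v3 [4, 3], v4 [5, 4], v5 [5, 5], v6 [5, 5], v7 [5, 5], v8 [6, 6], v9 [6, 6], v10 [6, 6], v11 [6, 6], v12 [6, 6]. The edges are not recoverable from the transcript.]

Table 4. A feasible schedule computed by list scheduling.

| Time | 0-1 | 1-2 | 2-3 | 3-4 | 4-5 | 5-6 |
|---|---|---|---|---|---|---|
| FU1 | v1 | v2 | v7 | v6 | v10 | v11 |
| FU2 | v3 | v4 | v5 | v8 | v9 | v12 |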


Conclusion

k-successor-tree consistency:

• A general technique for instruction scheduling problems.

• Approximates precedence-latency constraints by priorities that are k-successor-tree-consistent.

• Successfully used to solve several open instruction scheduling problems, such as two-processor scheduling with equal execution times and release time-deadline constraints.

Open problem:

• What is the tight worst-case approximation ratio of our algorithm? (Conjecture: $L_{\text{ours}} / L_{\text{opt}} = 4/3$.)