
Non-Preemptive Scheduling Policy Design for Tasks with Stochastic Execution Times*

Chris Gill
Associate Professor

Department of Computer Science and Engineering
Washington University, St. Louis, MO, USA

[email protected]

The University of Pennsylvania
Monday, November 23, 2009

*Research supported by NSF grants CNS-0716764 (Cybertrust) and CCF-0448562 (CAREER) and driven by numerous contributions from doctoral students Robert Glaubius and Terry Tidwell; undergraduates Braden Sidoti, David Pilla, Justin Meden, and Cameron Cross; and Prof. William D. Smart

2 - Gill et al. – 04/20/23

Washington University in St. Louis


Dept. of Computer Science and Engineering

24 faculty members and 71 Ph.D. students working in:

real-time and embedded systems, robotics, graphics, HCI, AI, bioinformatics, networking, high-performance architectures, chip multi-processors, mobile computing, sensor networks, distributed systems, optimization

PhD students are fully funded, and we emphasize individual mentorship and interdisciplinary work

Recent graduates are on faculty at U. Mass, UT-Austin, Rochester, RIT, CMU, Michigan St., and UNC-Charlotte

Graduate study application deadline for Fall 2010 is January 15: http://www.cse.wustl.edu


Motivation

Systems are increasingly being designed to interact with the physical world

This trend offers compelling new research challenges that motivate our work

Consider for example the domain of mobile robotics

[Figure: "my name is Lewis", Media and Machines Laboratory]



Motivation

As in many other systems, resources must be shared among competing tasks

Fail-safe modes may reduce consequences of resource-induced timing failures, but precise scheduling matters

The physical properties of some resources motivate new models and techniques


Motivation

For example, sharing a camera between navigation and surveying tasks

(1) in general doesn’t allow efficient preemption

(2) involves stochastically distributed durations

Other scenarios also raise scalability questions, e.g., multi-robot heterogeneous real-time data transmission




System Model Assumptions

To begin, time is modeled as being discrete
» E.g., some multiple of the Linux jiffy is the time quantum

Separate tasks require a shared resource
» Access is mutually exclusive (a task binds the resource)
» Binding durations are independent and non-preemptive
» Each task’s distribution of durations can be known
» Each task is always available to run

Goal: precise resource allocation among the tasks
» E.g., 2:1 utilization share targets for tasks A vs B
» Need a deterministic scheduling policy (decides which task gets the resource when) that best fits that goal


Towards Optimal Policies

A Markov decision process (MDP) is a 4-tuple (X,A,C,T) that matches our system model well:

X: a finite set of states (e.g., utilizations of 8 vs. 17 quanta)

A: the set of actions (giving resource to a particular task)

C: cost function for taking an action in a state

T: transition function (probability of moving from one state to another state based on the action chosen)

Solving the MDP gives a policy that maps each state to an action to minimize long term expected costs

However, to do that we need a finite set of states


Share Aware Scheduling

A system state: cumulative resource usage of each task

Dispatching a task moves the system stochastically through the state space according to that task’s duration

[Figure: state space, with example state (8,17)]
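The state update can be sketched in a few lines of Python. This is only an illustration: the `DURATIONS` table and the function names are hypothetical, and the distributions are made up rather than taken from the talk.

```python
import random

# Hypothetical duration distributions, one per task:
# {duration in quanta: probability}. Not from the talk.
DURATIONS = [
    {1: 0.5, 2: 0.5},    # task 0
    {2: 0.25, 3: 0.75},  # task 1
]

def successors(state, task):
    """Exact successor distribution {next_state: probability} for dispatching `task`."""
    out = {}
    for duration, prob in DURATIONS[task].items():
        nxt = list(state)
        nxt[task] += duration  # only the dispatched task accrues usage
        out[tuple(nxt)] = prob
    return out

def dispatch(state, task, rng):
    """Sample one non-preemptive run of `task` and return the successor state."""
    durations = list(DURATIONS[task])
    weights = [DURATIONS[task][d] for d in durations]
    duration = rng.choices(durations, weights=weights, k=1)[0]
    nxt = list(state)
    nxt[task] += duration
    return tuple(nxt)

if __name__ == "__main__":
    print(successors((8, 17), 0))
    print(dispatch((8, 17), 1, random.Random(0)))
```

Because only the dispatched task's coordinate changes, each action moves the state along one axis by a random number of quanta.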


Share Aware Scheduling

Utilization target u induces a ray {αu : α ≥ 0} through the state space

Encode each state’s “goodness” (relative to the share) as a cost

Require that costs grow with distance from utilization ray

[Figure: utilization ray for u=(1/3,2/3)]
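One cost function with this property can be sketched as follows. The choice of the L1 distance to the point on the ray with the same total usage is an assumption made here for illustration; it happens to reproduce the values c(1,0)=1 and c(1,0,0)=4/3 listed in the appendix for equal shares. The function name `cost` is hypothetical.

```python
from fractions import Fraction

def cost(state, share):
    """L1 distance from `state` to the ray point with the same total usage.

    Assumption: the talk only requires that cost grows with distance from
    the utilization ray; the L1 norm is one choice that satisfies this.
    """
    t = sum(state)  # total quanta consumed so far
    return sum(abs(x - t * u) for x, u in zip(state, share))

if __name__ == "__main__":
    share = (Fraction(1, 3), Fraction(2, 3))
    print(cost((8, 17), share))  # small: close to the ray
    print(cost((9, 18), share))  # zero: exactly on the ray
```

Exact rational arithmetic via `Fraction` avoids rounding when comparing costs of states that lie on the same line parallel to the ray.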


Transition Structure

Transitions are state-independent

I.e., relative distribution over successor states is the same in each state


Cost Structure

States along same line parallel to the utilization ray have equal cost


Equivalence Classes

Transition and cost structure thus induce equivalence classes

Equivalent states have the same optimal long-term cost and policy!


Periodicity

Periodic structure allows us to represent each equivalence class with a single exemplar


Wrapping the State Model

Remove all but one exemplar from each equivalence class

Actions and costs remain unchanged

Remap any dangling transitions (to removed states) to the corresponding exemplar

[Figure: wrapped state space, with state (0,0) labeled]
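A possible way to compute an exemplar, assuming a rational utilization target: subtract as many whole periods as possible while keeping every coordinate non-negative. The names `period_vector` and `wrap`, and this particular choice of exemplar, are assumptions for illustration; the talk does not say which member of each class is kept.

```python
from fractions import Fraction
from math import lcm

def period_vector(share):
    """Smallest integer vector that is a positive multiple of `share`."""
    q = lcm(*(u.denominator for u in share))
    return tuple(int(u * q) for u in share)

def wrap(state, share):
    """Map `state` to its exemplar by subtracting whole periods.

    Subtracting a multiple of the period moves the state parallel to the
    utilization ray, so its distance from the ray is unchanged.
    """
    period = period_vector(share)
    k = min(x // p for x, p in zip(state, period) if p > 0)
    return tuple(x - k * p for x, p in zip(state, period))

if __name__ == "__main__":
    share = (Fraction(1, 3), Fraction(2, 3))
    print(period_vector(share))   # (1, 2)
    print(wrap((8, 17), share))   # (0, 1)
```

With u=(1/3,2/3) the period is (1,2), so the state (8,17) from the earlier slide and its exemplar (0,1) lie on the same line parallel to the ray.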


Truncating the State Model

Inexpensive states are nearer the utilization target

Good policies should keep costs small

Can truncate the state space by bounding sizes of costs considered


Bounding the State Model

Map any dangling transitions produced by truncation to a high-cost absorbing state

This guarantees that we will be able to find bounded-cost policies if they exist

Bounded costs also guarantee bounded deviation from the resource share (precision)


A Scheduling Policy Design Approach

Iteratively increase the bounds and re-solve the resulting MDP

As the bounds increase, the bounded model solution converges towards the optimal wrapped model policy
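The loop of bounding and re-solving can be sketched as below. Several things here are assumptions rather than the authors' method: the duration distributions are invented, the cost is taken to be an L1 distance from the ray, costs are charged on arrival in a state, and the MDP is solved by discounted value iteration with an assumed discount factor `GAMMA`. All names are hypothetical.

```python
from fractions import Fraction
from math import lcm

SHARE = (Fraction(1, 3), Fraction(2, 3))           # utilization target
DURATIONS = [{1: 0.5, 2: 0.5}, {1: 0.5, 3: 0.5}]   # hypothetical: quanta -> probability
GAMMA = 0.9                                         # assumed discount factor
PERIOD = tuple(int(u * lcm(*(s.denominator for s in SHARE))) for u in SHARE)

def cost(x):
    """Assumed cost: L1 distance from x to the ray point with the same total usage."""
    t = sum(x)
    return float(sum(abs(xi - t * u) for xi, u in zip(x, SHARE)))

def wrap(x):
    """Map x to an exemplar by subtracting whole periods."""
    k = min(xi // pi for xi, pi in zip(x, PERIOD))
    return tuple(xi - k * pi for xi, pi in zip(x, PERIOD))

def successors(x, task):
    """Wrapped successor distribution {state: probability} for dispatching `task`."""
    out = {}
    for d, pr in DURATIONS[task].items():
        y = wrap(x[:task] + (x[task] + d,) + x[task + 1:])
        out[y] = out.get(y, 0.0) + pr
    return out

def solve_bounded(bound, sweeps=300):
    """Value iteration over wrapped states with cost <= bound; returns (policy, values)."""
    actions = range(len(SHARE))
    start = (0,) * len(SHARE)
    states, frontier = {start}, [start]
    while frontier:  # enumerate in-bound states reachable from the start state
        x = frontier.pop()
        for a in actions:
            for y in successors(x, a):
                if y not in states and cost(y) <= bound:
                    states.add(y)
                    frontier.append(y)
    # Absorbing state: pays bound + 1 on arrival and on every later step.
    v_absorb = (bound + 1) / (1 - GAMMA)
    values = {x: 0.0 for x in states}

    def q(x, a):
        return sum(pr * (cost(y) + GAMMA * values[y] if y in values else v_absorb)
                   for y, pr in successors(x, a).items())

    for _ in range(sweeps):
        for x in states:
            values[x] = min(q(x, a) for a in actions)
    policy = {x: min(actions, key=lambda a: q(x, a)) for x in states}
    return policy, values

if __name__ == "__main__":
    for bound in (2, 4, 8):  # iteratively increase the bound and re-solve
        policy, values = solve_bounded(bound)
        print(bound, len(policy), policy[(0, 0)], round(values[(0, 0)], 3))
```

Watching how the action and value at (0,0) settle as the bound grows gives a rough, empirical sense of the convergence described above.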


Automating Model Discovery

ESPI: Expanding State Policy Iteration

1. Start with a policy that only reaches finitely many states from (0,…,0). E.g., always run the most underutilized task.

2. Enumerate enough states to evaluate and improve that policy

3. If policy cannot be improved, stop

4. Otherwise, repeat from (2) with newly improved policy


Policy Evaluation Envelope

Enumerate states reachable from the initial state

Explore state space breadth-first under the current policy, starting from the initial state (0,0)
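The breadth-first enumeration can be sketched as follows for two tasks with equal shares. The duration supports, the wrapping rule, and all names are assumptions for illustration; the initial policy is the "most underutilized task" heuristic mentioned for ESPI.

```python
from collections import deque
from fractions import Fraction

SHARE = (Fraction(1, 2), Fraction(1, 2))  # equal shares (illustrative)
PERIOD = (1, 1)                           # integer period matching SHARE
DURATIONS = [(1, 2), (1, 3)]              # hypothetical duration supports per task

def wrap(x):
    """Map x to an exemplar by subtracting whole periods."""
    k = min(xi // pi for xi, pi in zip(x, PERIOD))
    return tuple(xi - k * pi for xi, pi in zip(x, PERIOD))

def most_underutilized(x):
    """Run the task furthest below its share; ties go to the lowest index."""
    t = sum(x)
    lag = [t * u - xi for xi, u in zip(x, SHARE)]
    return lag.index(max(lag))

def envelope(policy, start=(0, 0)):
    """Breadth-first closure of `start` under `policy` in the wrapped model."""
    seen, queue = {start}, deque([start])
    while queue:
        x = queue.popleft()
        task = policy(x)
        for d in DURATIONS[task]:
            y = wrap(x[:task] + (x[task] + d,) + x[task + 1:])
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

if __name__ == "__main__":
    print(sorted(envelope(most_underutilized)))
```

For these made-up durations the closure contains five states, which illustrates why a policy with finite closure gives a finite model to evaluate.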


Policy Improvement Envelope

Consider alternative actions

Close under the current policy using breadth-first expansion

Evaluate and improve the policy within this envelope


ESPI Termination

As long as the initial policy has finite closure, each ESPI iteration terminates (this is satisfied by starting with the heuristic policy that always runs the most underutilized task)

Policy strictly improves at each iteration

Anecdotally, ESPI terminates on all of the task scheduling MDPs to which we have applied it


Comparing Design Methods

Policy performance is shown normalized and centered on the ESPI solution data

Larger bounded state models yield the ESPI solution


What About Scalability?

MDP representation allows consistent approximation of the optimal scheduling policy

Empirically, bounded model and ESPI solutions appear to be near-optimal

However, the approach scales exponentially in the number of tasks, so while it may be good for (e.g.) sharing an actuator, it won’t apply directly to larger task sets


What our Policies Say about Scalability

To overcome limitations of MDP based approach, we focus attention on a restricted class of appropriate scheduling policies

Examining the policies produced by the MDP based approach gives insights into choosing (and into parameterizing) appropriate policies


Two-task MDP Policy

Scheduling policies induce a partition on a 2-D state space with boundary parallel to the share target

Establish a decision offset d to identify the partition boundary

Sufficient in 2-D, but what about in higher dimensions?


Time Horizons Suggest a Generalization

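Ht={x : x1+x2+…+xn=t}

[Figure: time horizons H0 through H4 in a 2-task state space starting at (0,0), and H0 through H2 in a 3-task state space with H2 vertices (2,0,0), (0,2,0), (0,0,2); utilization ray u shown in each]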


Three-task MDP Policy

Action partitions meet along a decision ray that is parallel to the utilization ray

Action partitions are roughly cone-shaped

[Figure panels: action partitions on time horizons t=10, t=20, t=30]


Parameterizing a Partition

Specify a decision offset at the intersection of partitions

Anchor action vectors at the decision offset to approximate partitions

A conic policy selects the action vector best aligned with the displacement between the query state and the decision offset

[Figure: action vectors a1, a2, a3 anchored at the decision offset, with query state x]


Conic Policy Parameters

Decision offset d

Action vectors a1,a2,…,an

Sufficient to partition each time horizon into n regions

Allows good policy parameters to be found through local search
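A conic policy lookup might look like the sketch below. Two things are assumptions made for illustration: "best aligned" is read as largest cosine similarity, and the decision point on time horizon t is taken to be t·u + d. The function name and parameter layout are hypothetical.

```python
import math

def conic_policy(x, share, offset, action_vectors):
    """Pick the action whose vector is best aligned with x minus the decision point.

    Assumptions: alignment means cosine similarity, and the decision point on
    the horizon containing x is t * share + offset, where t = sum(x).
    """
    t = sum(x)
    anchor = [t * u + d for u, d in zip(share, offset)]
    disp = [xi - ai for xi, ai in zip(x, anchor)]
    disp_norm = math.hypot(*disp)
    if disp_norm == 0:
        return 0  # exactly at the decision point: arbitrary tie-break

    def alignment(v):
        dot = sum(vi * di for vi, di in zip(v, disp))
        return dot / (math.hypot(*v) * disp_norm)

    scores = [alignment(v) for v in action_vectors]
    return scores.index(max(scores))

if __name__ == "__main__":
    share = (1 / 3, 1 / 3, 1 / 3)
    # Each action vector points away from its own task's vertex of the horizon.
    vectors = [(-2, 1, 1), (1, -2, 1), (1, 1, -2)]
    print(conic_policy((5, 1, 0), share, (0, 0, 0), vectors))
```

The policy has only the offset and n action vectors as parameters, which is what makes a local search over them practical.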


Comparing Policies

Policy found by ESPI (for small numbers of tasks)

πESPI(x) – chooses action at state x per solved MDP

Simple heuristics (for all numbers of tasks)

πunderused(x) – runs the most underutilized task

πgreedy(x) – minimizes immediate cost from state x

Conic approach (for all numbers of tasks)

πconic(x) – selects action with best aligned action vector


Policy Comparison on a 4 Task Problem

Task durations: random histograms over [2,32]

100 iterations of Monte Carlo conic parameter search

ESPI outperforms, conic eventually approximates well


Policy Comparison on a Ten Task Problem

Repeated the same experiment for 10 tasks

ESPI is omitted (intractable here)

Conic outperforms greedy & underutilized heuristics


Comparison with Varying #s of Tasks

100 independent problems for each # (avg, 95% conf)

ESPI only tractable through all 2 and 3 task cases

Conic approximates ESPI, then outperforms others


Conclusions

We have developed new techniques for designing non-preemptive scheduling policies for tasks with stochastic resource usage durations

MDP-based methods provide good approximations to optimal policies for 2 or 3 tasks

Conic policy performance is competitive with ESPI for smaller problems, and for larger problems improves on underutilized and greedy policies

Future work will focus on applying and evaluating our results in different cyber-physical systems, and on extending them further in design and verification


For Further Information

R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, “Scheduling Policy Design for Autonomic Systems”, International Journal on Autonomous and Adaptive Communications Systems, 2(3):276-296, 2009

R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, “Scheduling Design and Verification for Open Soft Real-Time Systems”, RTSS 2008

R. Glaubius, T. Tidwell, B. Sidoti, D. Pilla, J. Meden, C. Gill, and W.D. Smart, “Scalable Scheduling Policy Design for Open Soft Real-Time Systems”, Tech. Report WUCSE-2009-71, 2009 (Under Review for RTAS 2010)

R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, “Scheduling Design with Unknown Execution Time Distributions or Modes”, Tech. Report WUCSE-2009-15, 2009

T. Tidwell, R. Glaubius, C. Gill, and W.D. Smart, “Scheduling for Reliable Execution in Autonomic Systems”, ATC 2008

C. Gill, W.D. Smart, T. Tidwell, and R. Glaubius, “Scheduling as a Learned Art”, OSPERT 2008

Project web site: http://www.cse.wustl.edu/~cdgill/Cybertrust/

Thank you!

Chris Gill
Associate Professor of Computer Science and Engineering


Appendix: Comparison to EDF Scheduling

Earliest-Deadline-First (EDF) scheduling:
» Enforces timeliness by meeting task deadlines.
» Not share aware.

We introduce deadlines as a function of worst-case execution time.

Miss rate is a function of deadline tightness.


Appendix: Varying Temporal Resolution


Appendix: Stable Conic Policies

Stable conic policies are guaranteed to exist.

For example, set each action vector to point opposite its corresponding vertex.

Induces a vector field that stochastically orbits the decision ray.

[Figure: time horizon with vertices (t,0,0), (0,t,0), (0,0,t)]


Appendix: More Tasks Implies Higher Cost

Simple problem: Fair-share scheduling of n deterministic tasks with unit duration

Trajectories under round robin scheduling:

2 tasks: E{c(x)} = 1/2.
Trajectory: (0,0) → (1,0) → (1,1) ≡ (0,0)
Costs: c(0,0)=0; c(1,0)=1.

3 tasks: E{c(x)} = 8/9.
Trajectory: (0,0,0) → (1,0,0) → (1,1,0) → (1,1,1) ≡ (0,0,0)
Costs: c(0,0,0)=0; c(1,0,0)=4/3; c(1,1,0)=4/3

n tasks: E{c(x)} = (n+1)(n-1)/(3n)
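The closed form can be checked numerically with a short script. It assumes the cost is the L1 distance to the point on the fair-share ray with the same total usage, an assumption that reproduces the 2-task and 3-task values on this slide; the function names are hypothetical.

```python
from fractions import Fraction

def round_robin_expected_cost(n):
    """Average cost over one round-robin cycle of n unit-duration tasks.

    Assumes c(x) is the L1 distance from x to the fair-share ray point
    with the same total usage.
    """
    share = Fraction(1, n)
    total = Fraction(0)
    for k in range(n):  # state after k dispatches: k tasks have run once
        state = [1] * k + [0] * (n - k)
        total += sum(abs(x - k * share) for x in state)
    return total / n    # the n-th dispatch returns to the (wrapped) start state

def closed_form(n):
    """The expression given on the slide: (n+1)(n-1)/(3n)."""
    return Fraction((n + 1) * (n - 1), 3 * n)

if __name__ == "__main__":
    for n in (2, 3, 4, 10):
        print(n, round_robin_expected_cost(n), closed_form(n))
```

Under this cost, the state after k dispatches costs 2k(n-k)/n, and averaging over k = 0, …, n-1 gives the closed form, which grows roughly as n/3.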


Appendix: Share Complexity
