scheduling theory and real-time systems

SCHEDULING THEORY and REAL-TIME SYSTEMS

Sanjoy Baruah

The University of North Carolina at Chapel Hill

Scheduling theory is a foundational cornerstone of the discipline of real-time computing

OUTLINE of presentation

1.  The role of scheduling theory

2.  An example

3.  What constitutes “good” scheduling theory?

4.  Some recent research

A history lesson…

Safety-critical systems should be correct and have efficient implementations.

Real-time systems: origins in defense and aerospace

•  safety-critical: correctness matters

•  resource-constrained platforms: need efficient implementation

In the beginning…

•  safety-critical systems were simple

•  implemented as simple software, on predictable processors

Real-time systems became more complex

•  new abstractions were needed

- which details to hide?


The increasing prominence of safety-critical infrastructure

- automotive systems; avionics; medical devices

- cyber-physical systems: intelligent highways; next-generation ATC; smart grid

Correctness considerations remain paramount…

… but are more difficult to ensure

Efficiency considerations matter less

- computing capacity plentiful and inexpensive (Moore’s Law)

Most current design techniques focus more on correctness than on efficiency.

Design techniques to facilitate provable correctness

•  the synchrony hypothesis

•  models for timed execution: synchronous reactive; logical execution time; timed automata; etc.

•  model-based design

Difficult to implement efficiently on advanced platforms


Current practice: over-provision computing resources

The window of scarcity

[Figure: platform resources plotted against system specifications; the window of scarcity lies between the regions "impossible to implement" and "easy to implement"]

Safety-critical systems should be correct and have efficient implementations. Most current design techniques focus more on correctness than on efficiency. It is necessary to reintegrate efficiency considerations.

The window of scarcity

[Figure: the platform-resources axis again, with the labels "System model (with correctness proof)", "an implementation", and "easy to implement"]


Current practice

[Figure: the scheduling penalty, marked on the platform-resources axis (logarithmic scale)]

Scheduling penalty increases with system complexity

This penalty matters more: SWaP (Size, Weight, and Power) considerations

- Platform size and weight

- Energy considerations

- Environmental concerns

- Thermal considerations

- Limitations on mobility

- Greater dollar cost


The narrative thus far:

- In the beginning, efficiency mattered

- Over time, correctness became more important

- Today efficiency matters again

- Scheduling theory; formal methods

An example

The Synchronous Reactive (SR) programming paradigm

A program is composed of reactive blocks

Variables

    input   int x
    output  int y
    state   int s := 0      (initialization)

Reaction description

    y := x + s
    s := x

Execution proceeds in rounds or time-steps

- the initialization code is executed during round 0

- the reaction description is executed during each round

Actions (executions)

    round 0:  s := 0;  y := x[0];         s := x[0]
    round 1:           y := x[1] + x[0];  s := x[1]
    round 2:           y := x[2] + x[1];  s := x[2]

The synchrony assumption:

- these executions occur instantaneously – they take zero execution time
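The round-by-round behavior of this small SR program can be sketched in Python. This is only an illustration, not part of the original slides: the function name `run_sr` and the sample input stream are assumptions made for the example.

```python
def run_sr(inputs):
    """Simulate the SR program: state s := 0; each round y := x + s, then s := x."""
    s = 0                      # initialization, executed in round 0
    outputs = []
    for x in inputs:           # one loop iteration per round (time-step)
        y = x + s              # reaction: output = current input + previous input
        s = x                  # reaction: state remembers the current input
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    # Hypothetical input stream x[0], x[1], x[2], x[3]
    print(run_sr([5, 7, 2, 4]))   # prints [5, 12, 9, 6]
```

Each loop iteration stands for one round; under the synchrony assumption the body of the loop is treated as taking zero time.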

A program is composed of reactive blocks

[Figure: block diagram of reactive blocks f1–f6, with inputs I1, I2, I3 and outputs O1, O2]

Why Synchronous Reactive Programming?

* A higher level abstraction than the jobs-and-tasks models

- Analogous to the synchronous abstraction in (clocked) digital circuit design


state variables = storage elements (registers, flip-flops)

rounds = clock ticks

reaction description = combinatorial circuit

* Simplifications in abstraction facilitate tool development

- Easy-to-use design tools (e.g., Stateflow/Simulink)

- Models may be verified and validated

* Higher level abstraction enables the correct design of more complex systems


The SR design process yields a model of the system

- Based on the synchrony assumption: actions execute instantaneously

- Correctness proofs depend on the synchrony assumption

An implementation of the model will not satisfy the synchrony assumption

- Correctness proofs not valid for the implementation

Current practice: extensive testing of the implementation

- May fail to detect the worst case

- Add a safety margin: resource over-provisioning

Scheduling theory can help!


Implementation problem: Given the duration of a time-step, choose a platform

[Figure: block diagram of reactive blocks f1–f6 with inputs I1, I2, I3]

A single processor: easily solved…

Multiprocessor/multicore platform: related to directed acyclic graph (DAG) scheduling to minimize makespan, that is, determine the minimum time needed to execute a given DAG on a specified multiprocessor platform

•  Validate functional correctness

•  Timing analysis

-  upper bound on duration between rounds

•  Code implementing functional blocks

•  Determine worst-case execution times
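One way to explore the platform-selection question is a greedy list scheduler. The sketch below is a heuristic, not the minimum-makespan solution and not a method taken from the slides. The precedence edges and unit WCETs are hypothetical, since the block diagram did not survive extraction, and the names `list_schedule_makespan` and `min_processors` are made up for the example.

```python
import heapq


def list_schedule_makespan(wcet, preds, m):
    """Makespan of a greedy non-preemptive list schedule of a DAG on m processors.

    wcet:  dict block -> worst-case execution time
    preds: dict block -> set of blocks that must finish first (must be acyclic)
    """
    remaining = {b: set(preds.get(b, ())) for b in wcet}
    ready = sorted(b for b in wcet if not remaining[b])
    running = []                       # heap of (finish_time, block)
    now, done = 0, 0
    while done < len(wcet):
        # Start ready blocks while a processor is idle.
        while ready and len(running) < m:
            b = ready.pop(0)
            heapq.heappush(running, (now + wcet[b], b))
        # Advance time to the next completion.
        now, finished = heapq.heappop(running)
        done += 1
        # Release successors whose predecessors have all finished.
        for b in sorted(wcet):
            if finished in remaining[b]:
                remaining[b].discard(finished)
                if not remaining[b]:
                    ready.append(b)
    return now


def min_processors(wcet, preds, step):
    """Smallest m for which this heuristic fits one round into `step` time units."""
    for m in range(1, len(wcet) + 1):
        if list_schedule_makespan(wcet, preds, m) <= step:
            return m
    return None


# Hypothetical data: unit WCETs and assumed precedence edges.
WCET = {b: 1 for b in ("f1", "f2", "f3", "f4", "f5", "f6")}
PREDS = {"f4": {"f1"}, "f5": {"f1", "f2"}, "f6": {"f3", "f4", "f5"}}

if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, list_schedule_makespan(WCET, PREDS, m))   # 6, 4, 3
    print(min_processors(WCET, PREDS, step=4))             # 2
```

Because list scheduling is a heuristic, a `None` result only means that this particular scheduler found no fit, not that none exists.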


Periods: Input Ij changes every pj’th round

[Figure: block diagram of reactive blocks f1–f6 with inputs I1, I2, I3]

p1 = 2, p2 = 6, p3 = 3

Blocks executed in each round:

    round 0:  f1 f2 f3 f4 f5 f6
    round 1:  (none)
    round 2:  f1 f4 f5 f6
    round 3:  f3 f6
    round 4:  f1 f4 f5 f6
    round 5:  (none)
    round 6:  f1 f2 f3 f4 f5 f6

Number of executed blocks reduced by more than half (in rounds 0-5: 36 blocks to 16)

May assign offsets to schedule multiple such programs on a shared platform
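The block count can be reproduced with a short Python sketch. The sets of blocks affected by I1 and I3 are read off the executed-block rows of the slide. The set for I2 is an assumption, because the slide only shows I2 changing in rounds where everything executes. The names `PERIODS`, `AFFECTED`, and `blocks_in_round` are chosen for this example.

```python
# Periods from the slide: input Ij changes every pj-th round.
PERIODS = {"I1": 2, "I2": 6, "I3": 3}

# Blocks that re-execute when an input changes. I1 and I3 are read from the
# slide; the entry for I2 is an assumption (it does not affect the count for
# rounds 0-5, since I2 changes only in round 0, where every block runs anyway).
AFFECTED = {
    "I1": {"f1", "f4", "f5", "f6"},
    "I2": {"f2"},
    "I3": {"f3", "f6"},
}


def blocks_in_round(r):
    """Blocks executed in round r: those affected by an input that changed."""
    executed = set()
    for name, period in PERIODS.items():
        if r % period == 0:
            executed |= AFFECTED[name]
    return executed


if __name__ == "__main__":
    total = 0
    for r in range(6):
        blocks = sorted(blocks_in_round(r))
        total += len(blocks)
        print(r, " ".join(blocks) or "(none)")
    print("total:", total)   # 16, versus 6 * 6 = 36 without periods
```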



Deadlines: An acceptable delay between input change and output change

[Figure: block diagram of reactive blocks f1–f6 with inputs I1, I2, I3]

p1 = 2, p2 = 6, p3 = 3, deadline = 1

Blocks executed in each round:

    round 0:  f1 f2 f3
    round 1:  f4 f5 f6
    round 2:  f1 f4 f5
    round 3:  f3 f6
    round 4:  f1 f4
    round 5:  f5 f6

- 15 blocks are executed

- at most 3 blocks/time-step
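A quick Python check of this schedule, under stated assumptions: the per-round rows are as read from the slide (with f5 and f6 both placed in round 5), the blocks affected by I2 are assumed to be just f2, and data dependencies between blocks are ignored. The names `SCHEDULE`, `triggered`, and `meets_deadlines` are introduced only for this sketch.

```python
PERIODS = {"I1": 2, "I2": 6, "I3": 3}
AFFECTED = {                       # the entry for I2 is an assumption
    "I1": {"f1", "f4", "f5", "f6"},
    "I2": {"f2"},
    "I3": {"f3", "f6"},
}
DEADLINE = 1                       # allowed delay, in rounds

# Schedule as read from the slide (rounds 0-5).
SCHEDULE = {
    0: ["f1", "f2", "f3"],
    1: ["f4", "f5", "f6"],
    2: ["f1", "f4", "f5"],
    3: ["f3", "f6"],
    4: ["f1", "f4"],
    5: ["f5", "f6"],
}


def triggered(r):
    """Blocks that need to run because some input changed in round r."""
    blocks = set()
    for name, period in PERIODS.items():
        if r % period == 0:
            blocks |= AFFECTED[name]
    return blocks


def meets_deadlines(schedule, rounds, deadline):
    """True if every triggered block runs within `deadline` rounds of its trigger."""
    for r in range(rounds):
        for block in triggered(r):
            window = range(r, r + deadline + 1)
            if not any(block in schedule.get(t, []) for t in window):
                return False
    return True


if __name__ == "__main__":
    print(meets_deadlines(SCHEDULE, 6, DEADLINE))        # True
    print(sum(len(v) for v in SCHEDULE.values()))        # 15
    print(max(len(v) for v in SCHEDULE.values()))        # 3
```

The saving from 16 to 15 blocks comes from round 3, where a single execution of f6 serves both the input change in round 2 and the one in round 3.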




Summarizing this example:

- The SR model of computation deals with correctness of models

- Scheduling theory is concerned with efficiency of implementation

- Apply scheduling theory to SR models…

… to obtain efficient implementations of correct models

What constitutes “good” scheduling theory for real-time computing?

Should potentially be useful

The “pure” sciences legitimately take the discovery of facts and laws as an end in itself. A new fact, a new law is an accomplishment, worthy of publication. But in [computer science and engineering] novelty in itself has no merit. We test our artifacts by their usefulness and their costs, not their novelty.

- Frederick P Brooks, Jr. (UNC) The computer scientist as toolsmith. (CACM 1996)

Caveats

1. basic research has its place…

- but present it elsewhere (e.g., theory conferences)

2. need not be useful immediately

Should potentially be useful

At the appropriate level of abstraction

* Therefore, on abstract task/machine models

- highlights a few salient features or principles

- E.g., for hard-real-time schedulability analysis

* Described using appropriate abstractions

Obtaining appropriate abstractions is an important research area

* Computationally tractable abstractions

* Should address general principles and relational structures of resource-sharing

Mixed-criticality scheduling theory

Modern safety-critical systems

- are very complex

- are implemented upon non-deterministic platforms

- need rigorous correctness proofs

Need significant resource over-provisioning

Example: x := a + b on the Motorola PowerPC 755

- Best case: 3 cycles

- Worst case: 321 cycles

(In the 1990’s: the Motorola 64K had best case = worst case = 20 cycles)

Example: resource reservation in avionics applications

Mixed Criticalities


Resource over-provisioning leads to inefficient resource usage at run-time

Integrated Computing Environments: functionalities of different criticalities upon a shared platform

- use resource over-provisioning to validate critical functionalities at high levels of assurance

- design-time resource reclamation to validate non-critical functionalities at lower levels of assurance

Models of real-time systems

Recurrent tasks

- finite (a priori known) number of them

- generate the jobs

Jobs executing on a specified platform

- job J = (release time, execution requirement, deadline)


The Liu & Layland (LL) sporadic task model

What is a job?

- Where do the execution requirements come from?

What is a sporadic task?

- How do we determine the periods?

Worst-Case Execution Time (WCET)

Jobs model code

Execution requirement: execution duration for the code

Determining the execution durations: WCET-analysis tools

[Figure: executable code and the target architecture are inputs to a WCET-analysis tool, which outputs the WCET]

WCET-analysis tools determine upper bounds on maximum execution time

Different tools are more or less conservative

Conservative tools may be based on static analysis

- compute bounds at higher levels of assurance

Less conservative tools are often measurement-based

Non-critical functionalities need to be validated to lower levels of assurance


The multiple-WCET mixed-criticality scheduling problem

Given a collection of jobs, each job characterized as critical or not and by two WCET values

Determine a single scheduling strategy such that

- If each job executes for ≤ its smaller WCET, then all the jobs complete by their deadlines

- If each job executes for ≤ its larger WCET, then all the critical jobs complete by their deadlines

Steve Vestal. Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance. RTSS 2007

An additional factor: CPU speeds may change

Advanced hardware features

- detect if signals are late at the circuit level; recover by delaying next clock tick

GALS: Globally Asynchronous Locally Synchronous

- locally synchronous modules that communicate asynchronously

- local clocks may be paused, stretched, or data-driven

[Figure: clock frequency versus time, with the speed levels s and slb marked]

The variable processor-speed mixed-criticality scheduling problem

Given a collection of jobs, each job characterized as critical or not and by a single WCET value, and two processor speeds: s and slb

Determine a single scheduling strategy such that

- If processor speed remains ≥ s, then all the jobs complete by their deadlines

- If processor speed remains ≥ slb, then all the critical jobs complete by their deadlines

The Liu & Layland (LL) sporadic task model

Task τi = (Ci, Ti)

Models repeatedly executed code

Ci: maximum execution duration of the code

- determined using WCET tools

Ti (the “period” parameter)

- time-triggered periodic tasks: (fixed) duration between executions

- event-triggered sporadic tasks: minimum duration between executions

- must be estimated

- more pessimistic (smaller) periods for validating safety-critical functions

- less conservative (larger) periods for validating non-critical functions

The multiple-period mixed-criticality scheduling problem

Given a collection of sporadic tasks, each task characterized as critical or not and by a single WCET value and two period parameters

Determine a single scheduling strategy such that

- If successive jobs of each task arrive ≥ its larger period apart, then all jobs complete by their deadlines

- If successive jobs of each task arrive ≥ its smaller period apart, then all critical tasks’ jobs complete by their deadlines
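As a rough illustration of why two period parameters help, the Python sketch below checks two utilization conditions. The slides do not state task deadlines or the platform, so several things here are assumptions: deadlines equal to periods, a preemptive uniprocessor, the task set itself, and the names `utilization` and `passes_necessary_conditions`. Under those assumptions, a total utilization above 1 rules out any schedule, so both conditions are necessary; they are not claimed to be sufficient for the mixed-criticality problem.

```python
from fractions import Fraction

# Hypothetical task set: C is the WCET; T_small / T_large are the pessimistic
# and the less conservative period estimates.
TASKS = [
    {"name": "t1", "critical": True,  "C": 2, "T_small": 4, "T_large": 8},
    {"name": "t2", "critical": True,  "C": 1, "T_small": 5, "T_large": 10},
    {"name": "t3", "critical": False, "C": 3, "T_small": 6, "T_large": 6},
]


def utilization(tasks, period_key):
    """Sum of C/T over the given tasks, using the chosen period parameter."""
    return sum(Fraction(t["C"], t[period_key]) for t in tasks)


def passes_necessary_conditions(tasks):
    """Necessary (not sufficient) conditions for the multiple-period problem."""
    all_with_larger = utilization(tasks, "T_large")
    critical_with_smaller = utilization(
        [t for t in tasks if t["critical"]], "T_small")
    return all_with_larger <= 1 and critical_with_smaller <= 1


if __name__ == "__main__":
    print(utilization(TASKS, "T_large"))        # 17/20
    print(utilization(TASKS, "T_small"))        # 6/5: one pessimistic analysis fails
    print(passes_necessary_conditions(TASKS))   # True
```

In this made-up example, a single analysis that applies the pessimistic periods to every task would reject the system, while the two separate analyses do not.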

The big picture

Scheduling theory is applied to models…

… that are conservative

- upper bound on the execution time of code

- lower bound on the processor speed

- lower bound on duration between external interrupts

- …

Multiple dimensions to such modeling

Performance penalties due to being conservative are increasing

Critical and non-critical functions co-exist in mixed-criticality systems

Mixed-criticality scheduling: two independent analyses of a single system

1. Validate safety-critical functionalities using conservative approximations

2. Validate all functionalities using less pessimistic approximations

Example: Scheduling analysis of a mixed-criticality system

Collection of independent jobs, on a varying-speed preemptive processor

Jobs:

-  critical or not

-  release time

-  worst-case execution requirement

-  deadline

Processor:

-  minimum speed 1

-  maximum speed unbounded

Desired run-time behavior:

•  all critical jobs must meet deadlines

•  all remaining jobs should meet deadlines


SYSTEM DESIGNER

Testing-based validation

- EDF optimal for meeting all deadlines

- extensive testing of EDF: no deadline miss

(WCET estimations are not needed)

CERTIFICATION AUTHORITY

Prior proof of correctness

- WCET estimates are needed

Must prove that all critical jobs will complete on time, assuming speed-1 processor

Observation: EDF acceptable iff all the jobs, with (pessimistic) WCETs, are shown feasible on a speed-1 processor


    job   critical?   ai   ci   di
    J1    Y           0    3    4
    J2    Y           1    1    3
    J3    N           0    ?    ?

[Figure: timeline from 0 to 4 showing the scheduling windows of J1, J2, and J3]

Necessary: critical jobs must be feasible on a minimum speed processor
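The feasibility condition for the critical jobs can be checked with a small EDF simulation. This Python sketch assumes integer job parameters, so that unit time slots suffice. The function name `edf_feasible` is chosen for the example, and the parameters given to J3 in the second call are hypothetical, because the slide leaves them open.

```python
def edf_feasible(jobs):
    """Simulate preemptive EDF on a speed-1 processor in unit time slots.

    jobs: list of (name, release, wcet, deadline) with integer parameters.
    Returns True if every job completes by its deadline.
    """
    remaining = {name: c for name, _, c, _ in jobs}
    horizon = max(d for _, _, _, d in jobs)
    for t in range(horizon):
        # Jobs released by time t that still need execution.
        active = [(d, name) for name, a, _, d in jobs
                  if a <= t and remaining[name] > 0]
        if not active:
            continue
        d, name = min(active)          # earliest deadline first
        if d <= t:
            return False               # this job has already missed its deadline
        remaining[name] -= 1           # run it for the slot [t, t+1)
    return all(r == 0 for r in remaining.values())


CRITICAL = [("J1", 0, 3, 4), ("J2", 1, 1, 3)]   # from the slide

if __name__ == "__main__":
    print(edf_feasible(CRITICAL))                         # True
    # Hypothetical J3 with c3 = 1, d3 = 1: total demand 5 in [0, 4].
    print(edf_feasible(CRITICAL + [("J3", 0, 1, 1)]))     # False
```

The two critical jobs alone use every slot in [0, 4], so at speed 1 there is no room left for any non-critical work.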



[Figure: processor speed (minimum 1) plotted over the timeline 0 to 4, with J1 and J2 as in the table above]

On a faster processor (with unknown speed variation)

Would like to also complete the non-critical job[s]

- if d3 ≥ 4, EDF is fine [figure labels: (> 1 unit), (< 2 units)]

- if d3 = 1: [figure labels: (1 unit), (2 units)]

- if d3 = 2: [figure labels: (1 unit), (1 unit), (2 units)]

Algorithm for scheduling a mixed-criticality collection of independent jobs on a varying-speed preemptive processor

PRIOR TO RUN-TIME

1. Construct a scheduling table of the critical jobs on a minimum-speed processor, with all execution occurring as late as possible

2. Partition the time-line into disjoint intervals, according to (all) jobs’ release dates and deadlines

DURING RUN-TIME, within each interval

1. Execute all critical jobs’ executions assigned to that interval in the scheduling table

2. Execute non-critical jobs according to EDF

3. Execute critical jobs according to EDF



Run-time complexity:

-  O(n²) prior to run-time

-  Same as EDF during run-time

OPTIMALITY property:

-  always completes all critical jobs (on a minimum-speed processor)

-  if any on-line algorithm that always completes all critical jobs also completes all non-critical jobs, this algorithm does so, too
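The two steps performed prior to run-time can be sketched in Python for the J1/J2 example. This is one possible construction rather than the author's implementation: it assumes integer job parameters and builds the as-late-as-possible table by running EDF backwards in time, and the names `alap_table` and `per_interval` are made up for the sketch. The run-time steps are not modeled.

```python
def alap_table(critical_jobs):
    """Latest-possible schedule of critical jobs on a speed-1 processor.

    critical_jobs: list of (name, release, wcet, deadline), integer parameters.
    Returns a dict slot -> job name, where slot t stands for [t, t+1).
    Slots are scanned from the last deadline backwards; each slot goes to the
    eligible job with the latest release (EDF in reversed time).
    """
    remaining = {name: c for name, _, c, _ in critical_jobs}
    horizon = max(d for _, _, _, d in critical_jobs)
    table = {}
    for t in range(horizon - 1, -1, -1):
        eligible = [(a, name) for name, a, _, d in critical_jobs
                    if a <= t < d and remaining[name] > 0]
        if eligible:
            _, name = max(eligible)    # latest release is most urgent in reverse
            table[t] = name
            remaining[name] -= 1
    if any(remaining.values()):
        raise ValueError("critical jobs are infeasible on a speed-1 processor")
    return table


def per_interval(table, boundaries):
    """Amount of each critical job's execution assigned to each interval."""
    result = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        amounts = {}
        for t in range(lo, hi):
            if t in table:
                amounts[table[t]] = amounts.get(table[t], 0) + 1
        result.append(((lo, hi), amounts))
    return result


CRITICAL = [("J1", 0, 3, 4), ("J2", 1, 1, 3)]   # from the slide

if __name__ == "__main__":
    table = alap_table(CRITICAL)
    # Interval boundaries: release dates and deadlines of the jobs considered.
    bounds = sorted({a for _, a, _, _ in CRITICAL} | {d for _, _, _, d in CRITICAL})
    for interval, amounts in per_interval(table, bounds):
        print(interval, amounts)
```

For these two jobs the table assigns one unit of J1 to [0, 1), one unit each of J1 and J2 to [1, 3), and one unit of J1 to [3, 4). A non-critical job such as J3 would add its own release date and deadline to the interval boundaries.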

The summing up

1. Safety-critical systems must

- be correct: rigorous design methodologies

- have efficient implementations: scheduling theory

2. Example: the Synchronous Reactive (SR) model of computation

3. Identifying relevant results from scheduling theory

4. Some current relevant research: mixed-criticality scheduling
