scheduling theory and real-time systems
SCHEDULING THEORY and REAL-TIME SYSTEMS
Sanjoy Baruah
The University of North Carolina at Chapel Hill
Scheduling theory is a foundational cornerstone of the discipline of real-time computing
Outline of presentation
1. The role of scheduling theory
2. An example
3. What constitutes “good” scheduling theory?
4. Some recent research
A history lesson…
Safety-critical systems should be correct and have efficient implementations.
Real-time systems: origins in defense and aerospace
• safety-critical: correctness matters
• resource-constrained platforms: need efficient implementation
In the beginning…
• safety-critical systems were simple
• implemented as simple software, on predictable processors
Real-time systems became more complex
• new abstractions were needed
- which details to hide?
The increasing prominence of safety-critical infrastructure
- automotive systems; avionics; medical devices
- cyber-physical systems: intelligent highways; next-generation ATC; smart grid
Correctness considerations remain paramount… but are more difficult to ensure
Efficiency considerations matter less
- computing capacity plentiful and inexpensive (Moore’s Law)
Most current design techniques focus more on correctness than on efficiency.
Design techniques to facilitate provable correctness
• the synchrony hypothesis
• models for timed execution: synchronous reactive; logical execution time; timed automata; etc.
• model-based design
Difficult to implement efficiently on advanced platforms
Current practice: over-provision computing resources
The window of scarcity
[Figure: platform resources versus system specifications, with regions labeled "impossible to implement" and "easy to implement"]
Safety-critical systems should be correct and have efficient implementations. Most current design techniques focus more on correctness than on efficiency. It is necessary to reintegrate efficiency considerations.
[Figure: the window of scarcity; labels: platform resources, system model (with correctness proof), an implementation, easy to implement]
Current practice
[Figure: platform resources and scheduling penalty, on a logarithmic scale]
Scheduling penalty increases with system complexity
This penalty matters more: SWaP (Size, Weight, and Power) considerations
- Platform size and weight
- Energy considerations
- Environmental concerns
- Thermal considerations
- Limitations on mobility
- Greater dollar cost
The narrative thus far:
- In the beginning, efficiency mattered
- Over time, correctness became more important
- Today efficiency matters again
- Scheduling theory; formal methods
An example
The Synchronous Reactive (SR) programming paradigm
A program is composed of reactive blocks
Variables
- input: int x
- output: int y
- state: int s := 0
initialization: s := 0
reaction description: y := x + s; s := x
Execution proceeds in rounds or time-steps
- the initialization code is executed during round 0
- the reaction description is executed during each round
Actions (executions)
- round 0: s := 0; y := x[0]; s := x[0]
- round 1: y := x[1] + x[0]; s := x[1]
- round 2: y := x[2] + x[1]; s := x[2]
The synchrony assumption:
- these executions occur instantaneously: they take zero execution time
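The round-by-round behavior of the small program above can be traced with a minimal Python sketch. The function name `run_sr_program` and the sample input values are illustrative assumptions, not part of the talk; the sketch models only the logical order of rounds and says nothing about timing.

```python
def run_sr_program(inputs):
    """Trace the reaction 'y := x + s; s := x' over successive rounds.

    inputs[k] plays the role of x[k], the input value seen in round k.
    """
    s = 0                      # initialization code, executed during round 0
    outputs = []
    for x in inputs:           # one loop iteration per round
        y = x + s              # output: current input plus the stored state
        s = x                  # state: remember the current input
        outputs.append(y)
    return outputs


# Hypothetical input sequence x[0], x[1], x[2]
print(run_sr_program([5, 7, 2]))   # [5, 12, 9]
```

Each output after round 0 is the sum of the current and the previous input, which matches the per-round actions listed on the slide.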
[Figure: an SR program drawn as a network of reactive blocks f1 to f6, with inputs I1, I2, I3 and outputs O1, O2]
Why Synchronous Reactive Programming?
* A higher level abstraction than the jobs-and-tasks models
- Analogous to the synchronous abstraction in (clocked) digital circuit design
  - state variables = storage elements (registers, flip-flops)
  - rounds = clock ticks
  - reaction description = combinational circuit
* Simplifications in abstraction facilitate tool development
- Easy-to-use design tools (e.g., Stateflow/Simulink)
- Models may be verified and validated
* Higher level abstraction enables the correct design of more complex systems
The SR design process yields a model of the system
- Based on the synchrony assumption: actions execute instantaneously
- Correctness proofs depend on the synchrony assumption
An implementation of the model will not satisfy the synchrony assumption
- Correctness proofs not valid for the implementation
Current practice: extensive testing of the implementation
- May fail to detect the worst case
- Add a safety margin: resource over-provisioning
Scheduling theory can help!
Implementation problem: Given the duration of a time-step, choose a platform
• Validate functional correctness
• Timing analysis
- upper bound on duration between rounds
• Code implementing functional blocks
• Determine worst-case execution times
A single processor: easily solved…
Multiprocessor/multicore platform: related to directed acyclic graph (DAG) scheduling to minimize makespan, that is, determine the minimum time needed to execute a given DAG on a specified multiprocessor platform
[Figure: the block network f1 to f6 with inputs I1, I2, I3]
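As a rough illustration of the platform-selection question, the sketch below checks whether one processor suffices and computes a simple lower bound on the makespan for m processors. The precedence edges and the WCET values are hypothetical, since the slide gives neither, and the bound is only a necessary condition: it can rule a platform out, but it does not construct a schedule.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical data: predecessors of each block and per-block WCETs.
PREDS = {"f1": [], "f2": [], "f3": [],
         "f4": ["f1"], "f5": ["f1"],
         "f6": ["f2", "f3", "f4", "f5"]}
WCET = {"f1": 2, "f2": 1, "f3": 1, "f4": 3, "f5": 2, "f6": 1}


def fits_on_one_processor(wcet, step_duration):
    """On a single processor the blocks simply run back to back."""
    return sum(wcet.values()) <= step_duration


def critical_path_length(wcet, preds):
    """Length of the longest precedence chain, measured in execution time."""
    finish = {}
    for block in TopologicalSorter(preds).static_order():
        start = max((finish[p] for p in preds.get(block, ())), default=0)
        finish[block] = start + wcet[block]
    return max(finish.values())


def makespan_lower_bound(wcet, preds, m):
    """No schedule on m processors can finish earlier than this."""
    return max(critical_path_length(wcet, preds), sum(wcet.values()) / m)


print(fits_on_one_processor(WCET, 10))       # True
print(makespan_lower_bound(WCET, PREDS, 2))  # 6
```

If the time-step is shorter than the bound for a given m, that platform cannot work; if it is longer, an actual multiprocessor DAG scheduling algorithm is still needed to confirm feasibility.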
Periods: Input Ij changes every pj’th round
p1 = 2, p2 = 6, p3 = 3
[Figure: the block network f1 to f6 with inputs I1, I2, I3]
Blocks executed in each round:
round 0: f1 f2 f3 f4 f5 f6
round 1: (none)
round 2: f1 f4 f5 f6
round 3: f3 f6
round 4: f1 f4 f5 f6
round 5: (none)
round 6: f1 f2 f3 f4 f5 f6
Number of executed blocks reduced by more than half (in rounds 0-5: 36 blocks to 16)
May assign offsets to schedule multiple such programs on a shared platform
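The saving from exploiting input periods can be reproduced with a short sketch. The periods are those on the slide; the `DEPENDS_ON` map (which inputs each block ultimately depends on) is an assumption chosen to be consistent with the block lists shown, because the edges of the block diagram did not survive in the transcript.

```python
PERIODS = {"I1": 2, "I2": 6, "I3": 3}   # from the slide

# Assumption: the inputs that each block (directly or indirectly) reads.
DEPENDS_ON = {
    "f1": {"I1"}, "f2": {"I2"}, "f3": {"I3"},
    "f4": {"I1"}, "f5": {"I1"},
    "f6": {"I1", "I2", "I3"},
}


def blocks_to_execute(round_no):
    """Blocks whose inputs changed in this round; the rest keep old outputs."""
    changed = {i for i, p in PERIODS.items() if round_no % p == 0}
    return sorted(b for b, deps in DEPENDS_ON.items() if deps & changed)


for r in range(7):
    print(r, " ".join(blocks_to_execute(r)) or "(none)")

total = sum(len(blocks_to_execute(r)) for r in range(6))
print(total)   # 16, versus 6 blocks x 6 rounds = 36 without periods
```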
Deadlines: An acceptable delay between input change and output change
p1 = 2, p2 = 6, p3 = 3; deadline = 1
Blocks executed in each round:
round 0: f1 f2 f3
round 1: f4 f5 f6
round 2: f1 f4 f5
round 3: f3 f6
round 4: f1 f4
round 5: f5 f6
- 15 blocks are executed
- at most 3 blocks/time-step
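A simplified check of the deadline-1 schedule is sketched below. Both the per-round schedule and the `DEPENDS_ON` map are reconstructions or assumptions rather than data stated explicitly in the talk, and the check deliberately ignores the execution order among blocks within the allowed window.

```python
PERIODS = {"I1": 2, "I2": 6, "I3": 3}
DEADLINE = 1   # a triggered block may run in its round or the next one

# Assumption: the inputs that each block (directly or indirectly) reads.
DEPENDS_ON = {
    "f1": {"I1"}, "f2": {"I2"}, "f3": {"I3"},
    "f4": {"I1"}, "f5": {"I1"},
    "f6": {"I1", "I2", "I3"},
}

# Reconstructed per-round schedule for rounds 0 to 5.
SCHEDULE = {
    0: ["f1", "f2", "f3"], 1: ["f4", "f5", "f6"], 2: ["f1", "f4", "f5"],
    3: ["f3", "f6"], 4: ["f1", "f4"], 5: ["f5", "f6"],
}


def triggered(round_no):
    """Blocks that must re-execute because an input changed in this round."""
    changed = {i for i, p in PERIODS.items() if round_no % p == 0}
    return {b for b, deps in DEPENDS_ON.items() if deps & changed}


def meets_deadlines(schedule, horizon):
    """Every triggered block appears within DEADLINE rounds of its trigger."""
    for r in range(horizon):
        for block in triggered(r):
            window = range(r, r + DEADLINE + 1)
            if not any(block in schedule.get(t, []) for t in window):
                return False
    return True


print(meets_deadlines(SCHEDULE, 6))               # True
print(sum(len(v) for v in SCHEDULE.values()))     # 15
print(max(len(v) for v in SCHEDULE.values()))     # 3
```

The saving of one block relative to the 16 executed without deadlines comes from letting the f6 execution triggered in round 2 coincide with the one triggered in round 3.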
Summarizing this example:
- The SR model of computation deals with correctness of models
- Scheduling theory is concerned with efficiency of implementation
- Apply scheduling theory to SR models to obtain efficient implementations of correct models
What constitutes “good” scheduling theory for real-time computing?
Should potentially be useful
The “pure” sciences legitimately take the discovery of facts and laws as an end in itself. A new fact, a new law is an accomplishment, worthy of publication. But in [computer science and engineering] novelty in itself has no merit. We test our artifacts by their usefulness and their costs, not their novelty.
- Frederick P Brooks, Jr. (UNC) The computer scientist as toolsmith. (CACM 1996)
Caveats
1. basic research has its place… but present it elsewhere (e.g., theory conferences)
2. need not be useful immediately
At the appropriate level of abstraction
* Should address general principles and relational structures of resource-sharing
* Therefore, on abstract task/machine models
- highlights a few salient features or principles
- E.g., for hard-real-time schedulability analysis
* Described using appropriate abstractions
* Computationally tractable abstractions
Obtaining appropriate abstractions is an important research area
Mixed-Criticality scheduling theory
Modern safety-critical systems
- are very complex
- are implemented upon non-deterministic platforms
- need rigorous correctness proofs
Need significant resource over-provisioning
Example: x := a + b on the Motorola PowerPC 755
- Best case: 3 cycles
- Worst case: 321 cycles
(In the 1990s: the Motorola 68K had best case = worst case = 20 cycles)
Example: resource reservation in avionics applications
Mixed Criticalities
Resource over-provisioning leads to inefficient resource usage at run-time
Integrated Computing Environments: functionalities of different criticalities upon a shared platform
- use resource over-provisioning to validate critical functionalities at high levels of assurance
- design-time resource reclamation to validate non-critical functionalities at lower levels of assurance
Models of real-time systems
Recurrent tasks
- finite (a priori known) number of them
- generate the jobs
Jobs executing on a specified platform
- job J = (release time, execution requirement, deadline)
The Liu & Layland (LL) sporadic task model
What is a job?
- Where do the execution requirements come from?
What is a sporadic task?
- How do we determine the periods?
Worst-Case Execution Time (WCET)
Jobs model code
Execution requirement: execution duration for the code
Determining the execution durations: WCET-analysis tools
[Figure: WCET-analysis tool; labels: Executable Code, Target architecture, WCET]
WCET-analysis tools determine upper bounds on maximum execution time
Different tools are more or less conservative
Conservative tools may be based on static analysis
- compute bounds at higher levels of assurance
Less conservative tools are often measurement based
Non-critical functionalities need to be validated to lower levels of assurance
The multiple-WCET mixed-criticality scheduling problem
Given a collection of jobs, each job characterized as critical or not and by two WCET values
Determine a single scheduling strategy
- If each job executes for ≤ its smaller WCET, then all the jobs complete by their deadlines
- If each job executes for ≤ its larger WCET, then all the critical jobs complete by their deadlines
Steve Vestal. Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance. RTSS 2007
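The two-part correctness requirement above can be phrased as a check on a single observed run. The sketch below is only an illustration of the definition: the `MCJob` type, the function name, and all numeric values are hypothetical, and the code checks an outcome after the fact rather than producing a scheduling strategy.

```python
from dataclasses import dataclass


@dataclass
class MCJob:
    name: str
    critical: bool
    deadline: float
    wcet_small: float   # less conservative WCET estimate
    wcet_large: float   # more conservative WCET estimate


def outcome_is_acceptable(jobs, actual_exec, finish):
    """Check one run against the mixed-criticality requirement.

    actual_exec[name]: execution time the job actually needed.
    finish[name]: completion time (absent if the job never completed).
    """
    if all(actual_exec[j.name] <= j.wcet_small for j in jobs):
        required = list(jobs)                       # everyone must finish
    elif all(actual_exec[j.name] <= j.wcet_large for j in jobs):
        required = [j for j in jobs if j.critical]  # only critical jobs
    else:
        required = []                               # outside both models
    return all(finish.get(j.name, float("inf")) <= j.deadline
               for j in required)


# Hypothetical job set and runs
jobs = [MCJob("A", True, 10, 2, 4), MCJob("B", False, 10, 3, 5)]
print(outcome_is_acceptable(jobs, {"A": 2, "B": 3}, {"A": 5, "B": 9}))   # True
print(outcome_is_acceptable(jobs, {"A": 4, "B": 3}, {"A": 8, "B": 12}))  # True
print(outcome_is_acceptable(jobs, {"A": 2, "B": 3}, {"A": 5, "B": 12}))  # False
```

In the second run job A exceeds its smaller WCET, so only the critical job's deadline is checked; in the third run every job stays within its smaller WCET, so the late non-critical job makes the outcome unacceptable.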
An additional factor: CPU speeds may change
Advanced hardware features
- detect if signals are late at the circuit level; recover by delaying next clock tick
GALS: Globally Asynchronous Locally Synchronous
- locally synchronous modules that communicate asynchronously
- local clocks may be paused, stretched, or data-driven
[Figure: clock frequency versus time, with speed levels s and s_lb marked]
The variable processor-speed mixed-criticality scheduling problem
Given a collection of jobs, each job characterized as critical or not and by a single WCET value, and two processor speeds: s and s_lb
Determine a single scheduling strategy
- If processor speed remains ≥ s, then all the jobs complete by their deadlines
- If processor speed remains ≥ s_lb, then all the critical jobs complete by their deadlines
The Liu & Layland (LL) sporadic task model
Task τi = (Ci, Ti)
Models repeatedly executed code
Ci: maximum execution duration of the code
- determined using WCET tools
Ti (the “period” parameter)
- time-triggered periodic tasks: (fixed) duration between executions
- event-triggered sporadic tasks: minimum duration between executions; must be estimated
  - more pessimistic (smaller) periods for validating safety-critical functions
  - less conservative (larger) periods for validating non-critical functions
The multiple-period mixed-criticality scheduling problem
Given a collection of sporadic tasks, each task characterized as critical or not and by a single WCET value and two period parameters
Determine a single scheduling strategy
- If successive jobs of each task arrive ≥ its larger period apart, then all jobs complete by their deadlines
- If successive jobs of each task arrive ≥ its smaller period apart, then all critical tasks’ jobs complete by their deadlines
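A quick sanity check for such a task set follows from utilization alone. The sketch assumes a unit-speed preemptive uniprocessor; the `MCTask` type and the sample values are hypothetical. The test is necessary but not sufficient: failing it rules out every scheduling strategy, while passing it proves nothing about a particular one.

```python
from dataclasses import dataclass


@dataclass
class MCTask:
    wcet: float           # Ci
    period_small: float   # pessimistic period, used for critical validation
    period_large: float   # less conservative period, used for all tasks
    critical: bool


def passes_necessary_utilization_test(tasks):
    """Each of the two behaviours must fit on a unit-speed processor."""
    u_all = sum(t.wcet / t.period_large for t in tasks)
    u_critical = sum(t.wcet / t.period_small for t in tasks if t.critical)
    return u_all <= 1.0 and u_critical <= 1.0


# Hypothetical task set
tasks = [MCTask(2, 4, 8, True), MCTask(3, 5, 6, False)]
print(passes_necessary_utilization_test(tasks))   # True
```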
The big picture
Scheduling theory is applied to models… that are conservative
- upper bound on the execution time of code
- lower bound on the processor speed
- lower bound on duration between external interrupts
Multiple dimensions to such modeling
- performance penalties due to being conservative are increasing
Critical and non-critical functions co-exist in mixed-criticality systems
Mixed-criticality scheduling
MC scheduling: two independent analyses of a single system
1. Validate safety-critical functionalities using conservative approximations
2. Validate all functionalities using less pessimistic approximations
Example: Scheduling analysis of a mixed-criticality system
Collection of independent jobs, on a varying-speed preemptive processor
Jobs:
- critical or not
- release time
- worst-case execution requirement
- deadline
Processor:
- minimum speed 1
- maximum speed unbounded
Desired run-time behavior:
• all critical jobs must meet deadlines
• all remaining jobs should meet deadlines
SYSTEM DESIGNER
Testing-based validation
- EDF optimal for meeting all deadlines
- extensive testing of EDF: no deadline miss
(WCET estimations are not needed)
CERTIFICATION AUTHORITY
Prior proof of correctness
- WCET estimates are needed
Must prove that all critical jobs will complete on time, assuming speed-1 processor
Observation: EDF acceptable iff all the jobs, with (pessimistic) WCET’s, are shown feasible on a speed-1 processor
job  critical?  ai  ci  di
J1   Y          0   3   4
J2   Y          1   1   3
J3   N          0   ?   ?
[Figure: timeline from 0 to 4 showing J1, J2, and J3 against processor speed (minimum speed 1)]
Necessary: critical jobs must be feasible on a minimum speed processor
On a faster processor (with unknown speed variation)
Would like to also complete the non-critical job[s]
- if d3 ≥ 4, EDF is fine (figure labels: > 1 unit, < 2 units)
- if d3 = 1 (figure labels: 1 unit, 2 units)
- if d3 = 2 (figure labels: 1 unit, 1 unit, 2 units)
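The necessary condition for the two critical jobs J1 and J2 can be checked by simulating EDF on a speed-1 processor. The sketch below assumes integer job parameters, so that preemption only needs to be considered at unit-slot boundaries; the function name and the extra job used in the second call are illustrative, since the talk leaves J3's parameters unspecified.

```python
def edf_meets_deadlines(jobs):
    """Simulate preemptive EDF in unit slots on a speed-1 processor.

    jobs: name -> (release, wcet, deadline), all integers.
    Returns (all_deadlines_met, timeline); timeline[t] is the job run
    in slot [t, t+1), or None if the processor idles.
    """
    remaining = {name: c for name, (_, c, _) in jobs.items()}
    horizon = max(d for (_, _, d) in jobs.values())
    timeline = []
    for t in range(horizon):
        ready = [n for n, (a, _, _) in jobs.items()
                 if a <= t and remaining[n] > 0]
        run = min(ready, key=lambda n: jobs[n][2]) if ready else None
        if run is not None:
            remaining[run] -= 1
        timeline.append(run)
        # Any job whose deadline is t+1 must be finished by now.
        if any(remaining[n] > 0 and jobs[n][2] <= t + 1 for n in jobs):
            return False, timeline
    return True, timeline


critical = {"J1": (0, 3, 4), "J2": (1, 1, 3)}
print(edf_meets_deadlines(critical))
# (True, ['J1', 'J2', 'J1', 'J1'])

# Hypothetical non-critical job with c3 = 1, d3 = 1 added at speed 1
print(edf_meets_deadlines({**critical, "J3": (0, 1, 1)})[0])   # False
```

The critical jobs alone occupy the whole interval [0, 4) at speed 1, which is why any non-critical job can complete only if the processor actually runs faster than its minimum speed.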
Algorithm for scheduling a mixed-criticality collection of independent jobs on a varying-speed preemptive processor
PRIOR TO RUN-TIME
1. Construct a scheduling table of the critical jobs on a minimum-speed processor, with all execution occurring as late as possible
2. Partition the time-line into disjoint intervals, according to (all) jobs’ release dates and deadlines
DURING RUN-TIME, within each interval
1. Execute all critical jobs’ executions assigned to that interval in the scheduling table
2. Execute non-critical jobs according to EDF
3. Execute critical jobs according to EDF
Run-time complexity:
- O(n^2) prior to run-time
- Same as EDF during run-time
OPTIMALITY property:
- always completes all critical jobs (on a minimum-speed processor)
- If any such on-line algorithm completes all non-critical jobs, this algorithm does so, too
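The two pre-run-time steps can be sketched as follows. This is one possible construction, not necessarily the one used in the talk: it assumes integer job parameters and builds the as-late-as-possible table by running EDF backwards in time (scanning slots from the end and preferring the job with the latest release). The function names are illustrative, and the run-time part with varying speed is not modeled.

```python
def alap_table(critical_jobs):
    """As-late-as-possible table on a speed-1 processor, in unit slots.

    critical_jobs: name -> (release, wcet, deadline), all integers.
    Returns {slot_start: job_name}, or None if the jobs are infeasible.
    """
    remaining = {n: c for n, (_, c, _) in critical_jobs.items()}
    horizon = max(d for (_, _, d) in critical_jobs.values())
    table = {}
    for t in range(horizon - 1, -1, -1):        # slot [t, t+1), latest first
        usable = [n for n, (a, _, d) in critical_jobs.items()
                  if a <= t and t + 1 <= d and remaining[n] > 0]
        if usable:
            run = max(usable, key=lambda n: critical_jobs[n][0])
            remaining[run] -= 1
            table[t] = run
    return None if any(remaining.values()) else table


def partition_timeline(all_jobs):
    """Disjoint intervals delimited by every release time and deadline."""
    points = sorted({p for (a, _, d) in all_jobs.values() for p in (a, d)})
    return list(zip(points, points[1:]))


critical = {"J1": (0, 3, 4), "J2": (1, 1, 3)}
print(alap_table(critical))          # {3: 'J1', 2: 'J2', 1: 'J1', 0: 'J1'}
print(partition_timeline(critical))  # [(0, 1), (1, 3), (3, 4)]
```

For the example's critical jobs the table has no idle slot, so at minimum speed nothing is left over for non-critical work; slots freed by a faster processor are what the run-time rules hand to non-critical jobs.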
The Summing Up
1. Safety-critical systems must
- be correct: rigorous design methodologies
- have efficient implementations: scheduling theory
2. Example: the Synchronous Reactive (SR) model of computation
3. Identifying relevant results from scheduling theory
4. Some current relevant research: mixed-criticality scheduling