a predictable rtos - university of victoria · · 2009-12-22a predictable rtos mantis cheng...

A Predictable RTOS

Mantis Cheng

Department of Computer Science

University of Victoria

Outline

I. Analysis of Timeliness Requirements

II. Analysis of IO Requirements

III. Time in Scheduling

IV. IO in Scheduling

V. A Predictable Scheduler

What is an RTOS?

• Linux, VxWorks and QNX all support a pre-emptive, prioritized scheduler, and threads.

• They all claim to have low interrupt latency.

• So, what makes QNX real time, and Linux not?

• “Real time” must have something to do with time. Does it mean “really fast”?

Definitions

• A system is real time if its correctness depends also on its timely responses.

• A real time system is predictable if it guarantees the timeliness of all its stimuli/responses.

• A real time system is determinate if its predictability is processor speed independent.

Analysis of Timeliness Requirements

Timeliness

• To a control engineer, it means decision cycle time of a control loop.

• To a hardware engineer, it means sampling rates (A-D or D-A) and precise clocking (synchronization).

• To an embedded system programmer, it means fast interrupt response time.

Time in Practice

• Timeliness is not the same as priority/urgency. (“Sample every 5 msec ...” and “When an alarm is raised …” are not the same thing.)

• A task with a higher priority does not imply that it runs more accurately on time. (A lower rate task is no less timely than a higher rate task.)

• “Sleep” is typically the only mechanism to introduce timing delay, periodic or one-time.

Periodic Sleeping

• Typical periodic “polling/sampling” loop:

while (not done) {
    do something;   // running
    sleep( 10 );    // waiting
}

• It is expected that this loop is executed once every 10 milliseconds. Is it?

“Sleep” Jitter

• To introduce periodic delays, a task sleeps repeatedly.

• A sleep queue is a general mechanism for all timing purposes.

• For periodic delays, “sleep” is not reliable because timing stops when a task is ready or running (i.e., waiting and execution times are excluded).
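The drift described above can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the 1 msec wait and 2 msec run times used in the usage note are hypothetical values, not measurements from any real system. It models a loop that calls sleep() with a relative delay and shows how the waiting and running times, which the sleep timer does not count, accumulate.

```c
#include <assert.h>

/* Time (ms) at which the k-th iteration of a relative-sleep loop is released.
 * Each pass costs wait_ms in the ready queue plus run_ms of execution
 * before sleep(sleep_ms) starts counting, so none of that time is absorbed. */
long relative_sleep_release(int k, long wait_ms, long run_ms, long sleep_ms)
{
    assert(k >= 0);
    long t = 0;
    for (int i = 0; i < k; i++)
        t += wait_ms + run_ms + sleep_ms;
    return t;
}

/* Accumulated lateness against an ideal period of exactly sleep_ms. */
long drift_ms(int k, long wait_ms, long run_ms, long sleep_ms)
{
    long ideal = (long)k * sleep_ms;
    return relative_sleep_release(k, wait_ms, run_ms, sleep_ms) - ideal;
}
```

With the assumed 1 msec of waiting and 2 msec of running per pass, drift_ms(100, 1, 2, 10) evaluates to 300: after 100 iterations the loop is 300 msec behind a true 10 msec schedule.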

Process States

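[Figure: state diagram with states Ready, Running and Block. Transitions: dispatch (Ready → Running), sleep (Running → Block), timeout (Block → Ready). The timer runs only in Block; it is stopped in Ready and Running.]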

Difficulties in Practice

• Typical schedulers are priority-based, where priority implies urgency.

• To deal with timeliness, timer interrupts, which are not schedulable, are used frequently.

• Too many timer interrupts may lead to unpredictable latency.

• Priority inversion problems could destroy timeliness completely.

Priority Inversion

[Figure: tasks P1, P2 and P3 arranged by priority level around a shared resource; one task holds the resource while another waits on it. Task states shown: running, ready, blocked.]

Soft Time vs Real Time

• Real time means human time, which is continuous and concurrent.

• Soft(ware) time is discrete and sequential.

• A scheduler “bridges” soft time with real time, i.e., all IOs must eventually be done at the right time in real time.

• But, most schedulers do not guarantee timeliness.

Analysis of IO Requirements

IO in Practice

• IO bandwidth, latency or jitter typically are not guaranteed by a scheduler.

• As a result, large IO buffers are needed to deal with unpredictable latency or jitter.

• Without bandwidth control, tasks can interfere with each other’s progress; for example, one task (such as BitTorrent) may consume more than 80% of the network bandwidth.

Scheduling Requirements

• Precise timing and jitter control are becoming necessary in many embedded systems (e.g., tele-robotics, multimedia devices).

• Today, applications have insatiable demands for IO bandwidth.

• QoS parameters, such as bandwidth, jitter, and latency, should be supported by our schedulers.

Performance Bottleneck

• For IO-intensive applications, DMA is essential.

• Memory protection and user-level buffers could introduce delays due to unnecessary memory copying.

• Shared buffer synchronization could destroy all timeliness requirements.

• Without control, buffer over-runs or under-runs are difficult to prevent.

Summary

Challenges

• Priority is simple; it is all relative.

• Timeliness requires analysis; it is absolute.

• They do not seem to mix well together. Could we combine them in a predictable way?

• QoS is traditionally a “network” issue. But, without end-to-end control, we cannot achieve smooth integrated solutions.

Design Goals

• We want to combine timeliness, QoS and priority into a single scheduler.

• Complex scheduling decisions should be avoided, i.e., minimal scheduling overhead.

• The API must be consistent, i.e., features do not interfere with each other.

• Finally, the solution must be implementable efficiently.

Time in Scheduling

Time in Scheduling

• In most preemptive prioritized schedulers, a timer is used to maintain fairness.

• Time is typically sub-divided into quanta, which are then allocated to tasks.

• Equal priority tasks share a processor fairly.

• Sleeping tasks are sorted in a sleep queue; upon time-outs, they are scheduled based on their priorities.

Timeliness vs Priority

• Which one to run: (a) “a task that must be run now” or (b) “a task that has the highest priority”?

• A task is periodic if it must run at a fixed rate (e.g., once every 15 msec.)

• When a task wakes up from “sleeping”, it is not clear whether it is periodic or just ready.

• Hence, most schedulers will choose (b).

Wake Up Waiting

• A periodic task that wakes up from “sleeping” may not be the highest-priority ready task.

• While waiting, “its time to wake up next” (soft time) and “its time to next period” (real time) are drifting.

T = running + sleeping (real time)

T’ = waiting + running + sleeping (soft time)

Periodic Timing Requirement

• A periodic task must be scheduled based on real time, not soft time.

T = waiting + running + delaying (real time)

• Periodicity should be independent of priority.

• For real time software, timeliness must be guaranteed.
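One common way to schedule against real time rather than soft time is to keep an absolute release time and advance it by exactly one period per cycle, so that waiting and running time are absorbed into the period instead of being added to it. The sketch below shows only that arithmetic; the name next_release is hypothetical and this is not the RTOS's own `next` primitive. On a POSIX system, the resulting value could be passed to clock_nanosleep() with the TIMER_ABSTIME flag to sleep until that instant.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute release time by one period.
 * The next release depends only on the previous release, never on how long
 * the task waited or ran, so T = waiting + running + delaying stays constant. */
struct timespec next_release(struct timespec release, long period_ns)
{
    assert(period_ns > 0);
    release.tv_sec  += period_ns / NSEC_PER_SEC;
    release.tv_nsec += period_ns % NSEC_PER_SEC;
    if (release.tv_nsec >= NSEC_PER_SEC) {   /* carry into the seconds field */
        release.tv_sec  += 1;
        release.tv_nsec -= NSEC_PER_SEC;
    }
    return release;
}
```

Because each release is derived from the previous release rather than from "now", a late dispatch in one cycle shortens that cycle's delay instead of shifting every later cycle.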

IPC and Scheduling

• For coordination and communication, there are many blocking and non-blocking IPC primitives.

• For example, a task that waits on a semaphore may be blocked.

• Synchronization is necessary to prevent race conditions.

• Blocked tasks are typically served fairly.

Timely Tasks and IPC

• What to do with timely tasks, those with periodic timing requirements?

• If not careful, they could miss all timeliness requirements; blocking time is hard to predict.

T = waiting + running + blocking (real time)

• Periodic tasks should not use blocking IPCs; they should use only non-blocking operations (e.g., signaling a semaphore/event, asynchronous read/write on FIFOs).

IO in Scheduling

IO in Scheduling

• Other than synchronization (e.g., reading an empty buffer), IO bandwidth, latency and jitter are typically not scheduling parameters.

• A task’s IO performance depends critically on its buffers.

• The scheduler does not know how much a task reads, how often it reads, or when it reads.

Bandwidth and Buffering

• The larger the buffer, the more a task can process without waiting on IO.

• Without any bandwidth control, one can easily run out of shared buffers.

• The maximum allowable bandwidth is always bounded at some level.

• Bandwidth control is essentially task-level buffer management.

Latency and Buffering

• A bandwidth of 64Kbps (8KB/s), without latency control, could mean 8KB any time within a second.

• By breaking into four 2KB segments, we can control latency without increasing bandwidth.


Constant vs Variable Rate

• A latency of 250 msec for 64Kbps means 2KB every 1/4 of a second.

• Two buffers of 2KB are all we need if we guarantee reading/writing at a fixed rate.

• We may need an 8KB buffer if reading/writing is more bursty every second.

• To accommodate both situations, an IO task must specify constant or variable rate.
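The buffer arithmetic on the last two slides can be written out as follows. This is a sketch under stated assumptions: the function names are hypothetical, units are decimal (64 Kbps taken as 64000 bits per second, so "2KB" becomes 2000 bytes), and the sizing rules are simply the ones stated above (two segment-sized buffers for constant rate, one second's worth for variable rate).

```c
#include <assert.h>

/* Bytes that must be transferred in each latency window. */
long long bytes_per_window(long long bandwidth_bps, long long latency_ms)
{
    assert(bandwidth_bps > 0 && latency_ms > 0);
    return bandwidth_bps / 8 * latency_ms / 1000;
}

/* Buffer space for one IO task, following the rules on the slides:
 * constant rate -> two buffers of one window each;
 * variable rate -> enough for a full second of bursty traffic. */
long long buffer_bytes(long long bandwidth_bps, long long latency_ms,
                       int constant_rate)
{
    if (constant_rate)
        return 2 * bytes_per_window(bandwidth_bps, latency_ms);
    return bandwidth_bps / 8;
}
```

For 64 Kbps with a 250 msec latency, bytes_per_window(64000, 250) is 2000, a constant-rate task needs 4000 bytes of buffer, and a variable-rate task needs 8000.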

IO-based Scheduling

• Knowing the bandwidth requirements for each IO task, we simplify our buffer management.

• Buffer sizes could be set to limit IO bandwidth consumption; buffer full/empty conditions trigger scheduling decisions.

• By specifying additional latency requirements, IO tasks have a soft, a priori “deadline”.

A Predictable Scheduler

Main Features

• It is a preemptive, prioritized, time-based and bandwidth-based scheduler.

• There are 3 scheduling levels: 1) PERIODIC, 2) IO, and 3) SPORADIC.

• Top 2 levels have precise execution rates; they are used for timely activities.

• The 3rd level executes whenever processing time is available.

Main Features (2)

• Timely tasks (levels 1 and 2) are always ready to run, but may be delayed because it is “not the right time yet”. (Note: They can never block, i.e., wait for something to occur.)

• IO tasks (level 2) have IO latency (how often) and bandwidth (how much) requirements.

• Sporadic tasks (level 3) have no timeliness requirements but are ordered by their urgency.
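A three-level selection rule of this kind might look like the sketch below. It is not the RTOS's actual dispatcher: the struct layout, the name pick_next, strict precedence between levels, the earliest-release tie-break among timely tasks, and "larger urgency wins" among sporadic tasks are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum level { PERIODIC = 1, IO = 2, SPORADIC = 3 };

struct task {
    enum level level;
    long next_release_ms; /* timely tasks: earliest time they may run */
    int  urgency;         /* sporadic tasks: larger is more urgent (assumed) */
    int  ready;           /* sporadic tasks: 0 while blocked */
};

/* Index of the task to run at now_ms, or -1 if nothing is eligible. */
int pick_next(const struct task *tasks, size_t n, long now_ms)
{
    assert(tasks != NULL || n == 0);
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        const struct task *t = &tasks[i];
        /* Timely tasks never block; they are only delayed until release. */
        int eligible = (t->level == SPORADIC) ? t->ready
                                              : (t->next_release_ms <= now_ms);
        if (!eligible)
            continue;
        if (best < 0) { best = (int)i; continue; }
        const struct task *b = &tasks[best];
        if (t->level < b->level) {
            best = (int)i;                       /* lower level number wins */
        } else if (t->level == b->level) {
            if (t->level == SPORADIC) {
                if (t->urgency > b->urgency) best = (int)i;
            } else if (t->next_release_ms < b->next_release_ms) {
                best = (int)i;                   /* earliest release first */
            }
        }
    }
    return best;
}
```

The loop is a single linear pass with no feasibility analysis at dispatch time, which is in the spirit of the design goal of minimal scheduling overhead.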

Thread Types

Threads
    Timely
        PERIODIC: period and jitter
        IO: bandwidth and latency
    SPORADIC: urgency

A Timely Thread

thread P() {
    while (not done) {
        do something;   // running
        next;           // N.B. no timing specification
    }
}

main() {
    create( P, PERIODIC, 10 msec, 5 % );
}

Timely Thread States

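[Figure: state diagram with states Ready, Running and Delayed. Transitions: dispatch (Ready → Running), next (Running → Delayed), time’s up (Delayed → Ready). The timer keeps running in all three states.]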

An IO Thread

thread Q() {
    while (not done) {
        n = read( buffer, size );
        if (n > 0) consume buffer;
    }
}

main() {
    create( Q, IO, 64 Kbps, 125 msec );
}

IO Thread States

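[Figure: state diagram with states Ready, Running and Delayed. Transitions: dispatch (Ready → Running), bandwidth consumed (Running → Delayed), latency time’s up (Delayed → Ready). The timer keeps running in all three states.]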
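The "bandwidth consumed" and "latency time's up" transitions can be modeled as a per-window byte budget. The sketch below is an assumption about how such accounting could work, not the kernel's code; struct io_budget and io_grant are hypothetical names. For the 64 Kbps, 125 msec thread above, and taking 64 Kbps as 64000 bits per second, the budget is 1000 bytes per window.

```c
#include <assert.h>

struct io_budget {
    long window_bytes;     /* bytes allowed per latency window */
    long window_ms;        /* length of the latency window */
    long used_bytes;       /* bytes consumed in the current window */
    long window_start_ms;  /* start of the current window */
};

/* Start a fresh window if the latency time is up. Window boundaries stay
 * on multiples of window_ms so that late calls do not shift the schedule. */
static void io_refresh(struct io_budget *b, long now_ms)
{
    long elapsed = now_ms - b->window_start_ms;
    if (elapsed >= b->window_ms) {
        b->window_start_ms += (elapsed / b->window_ms) * b->window_ms;
        b->used_bytes = 0;
    }
}

/* Grant up to `want` bytes at now_ms. A return of 0 means the budget is
 * spent and the task stays delayed until the next window. */
long io_grant(struct io_budget *b, long want, long now_ms)
{
    assert(b != 0 && want >= 0 && b->window_ms > 0);
    io_refresh(b, now_ms);
    long left = b->window_bytes - b->used_bytes;
    long n = want < left ? want : left;
    b->used_bytes += n;
    return n;
}
```

A task that asks for 600 bytes at 0 msec and again at 50 msec receives 600 and then 400; a third request at 100 msec receives nothing, and the budget is restored at 125 msec.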

IPC

• We support both synchronous and asynchronous IPC.

• All timely tasks (levels 1 and 2) cannot block; hence, they must use asynchronous IPC only.

• SPORADIC tasks may use any IPC, such as Counting Semaphores, Mutexes, Condition queues, RW Semaphores, Events, FIFOs, etc.

• All tasks may be suspended/resumed at will.

Implementations

• We have versions of our RTOS for ARM-based and TI DSP-based processors.

• The kernel is about 4000 lines of C, and the IPC library is about 1000 lines of C.

• We have an LCD driver, a Bluetooth asynchronous packet driver and an Ethernet packet driver.

Periodic Threads Demo

IO Threads Demo

Performance Monitoring

• CPU utilization of periodic tasks and bandwidth consumption of IO tasks are critical in making performance predictions.

• Our RTOS provides online real time collection of vital statistics.

• User-adjustable limits may trigger “violating” tasks to be reported and then optionally aborted.

• Infeasible timing constraints will be detected and reported.
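The slides do not say how infeasible constraints are detected. One simple necessary condition on a single processor is that the total utilization of the periodic tasks, the sum of run time divided by period, must not exceed 1. The sketch below checks only that condition; the struct and function names are hypothetical, and the per-cycle run times would have to come from measurement or declaration.

```c
#include <assert.h>
#include <stddef.h>

struct periodic_req {
    long run_us;     /* processing time needed in each period */
    long period_us;  /* length of the period */
};

/* Fraction of the processor demanded by the given periodic tasks. */
double utilization(const struct periodic_req *r, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(r[i].period_us > 0 && r[i].run_us >= 0);
        u += (double)r[i].run_us / (double)r[i].period_us;
    }
    return u;
}

/* Necessary (not sufficient) condition: demand must fit in one processor. */
int may_be_feasible(const struct periodic_req *r, size_t n)
{
    return utilization(r, n) <= 1.0;
}
```

A task set that fails this test can be rejected immediately; one that passes may still miss its timing for other reasons, such as jitter limits or IO bandwidth.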

Concluding Remarks

• Engineering real time systems is about predictability, reliability and performance.

• Timing support in a typical RTOS is insufficient and inefficient; IO-based scheduling has been mostly overlooked.

• We proposed a simple way of combining both in a scheduler; thus, designing embedded applications will become simpler and more predictable.

The End
