Real-time Programming in RTCore

FSMLabs, Inc. Copyright Finite State Machine Labs Inc. 2001-2005. All rights reserved.

27th December 2005

Contents

1 Introduction
    1.1 Some background
    1.2 How the book works

I RTCore Basics

2 Introductory Examples
    2.1 Introduction
    2.2 Using RTCore
        2.2.1 Hello world
        2.2.2 Multithreading
        2.2.3 Basic communication
        2.2.4 Signaling and multithreading
    2.3 Perspective

3 Real-time Concepts and RTCore
    3.1 RTOS kingdom/phylum/order
        3.1.1 Non-real-time systems
        3.1.2 Soft real-time
        3.1.3 Hard real-time
    3.2 The RTOS design dilemma
        3.2.1 Expand an RTOS
        3.2.2 Make a general purpose OS real-time capable
        3.2.3 The RTCore approach to the problem
    3.3 Interrupt emulation
        3.3.1 Flow of control on interrupt
        3.3.2 Limits of interrupt emulation
    3.4 Services Available to Real-Time Code
        3.4.1 Memory management
        3.4.2 Memory protection
        3.4.3 VxWorks compliance
        3.4.4 Networking - Ethernet and FireWire
        3.4.5 Integration with other services
        3.4.6 What’s next

4 The RTCore API
    4.1 POSIX compliance
        4.1.1 The POSIX PSE 51 standard
        4.1.2 Roadmap to future API development
    4.2 POSIX threading functions
        4.2.1 Thread creation
        4.2.2 Thread joining
        4.2.3 Thread destruction
        4.2.4 Thread management
        4.2.5 Thread attribute functions
    4.3 Synchronization
        4.3.1 POSIX spinlocks
        4.3.2 Comments on SMP safe/unsafe functions
        4.3.3 Asynchronously unsafe functions
        4.3.4 Cancel handlers
    4.4 Mutexes
        4.4.1 Locking and unlocking mutexes
        4.4.2 Mutex creation and destruction
        4.4.3 Mutex attributes
    4.5 Condition variables
        4.5.1 Creation and destruction
        4.5.2 Condition waiting and signalling
        4.5.3 Condition variable attribute calls
    4.6 Semaphores
        4.6.1 Creation and destruction
        4.6.2 Semaphore usage calls
        4.6.3 Semaphores and Priority
    4.7 Clock management
    4.8 Extensions to POSIX (*_np())
        4.8.1 Advance timer
        4.8.2 Processor selection
        4.8.3 CPU reservation
        4.8.4 Enabling FPU access
        4.8.5 Concept of the extensions
    4.9 “Pure POSIX” - writing code without the extensions
    4.10 The RTCore API and communication models

5 More concepts
    5.1 Copying synchronization objects
    5.2 API Namespace
    5.3 Resource cleanup
    5.4 Deadlocks
    5.5 Synchronization-induced priority inversion
    5.6 Memory management
    5.7 Synchronization
        5.7.1 Methods and safety
        5.7.2 One-way queues
        5.7.3 Atomic operations

6 Communication between RTCore and the GPOS
    6.1 printf()
    6.2 rtl_printf()
    6.3 Real-time FIFOs
        6.3.1 Using FIFOs from within RTCore
        6.3.2 Using FIFOs from the GPOS
        6.3.3 A simple example
        6.3.4 FIFO allocation
        6.3.5 Limitations
    6.4 Shared memory
        6.4.1 mmap()
        6.4.2 An Example
        6.4.3 Limitations
    6.5 Soft interrupts
        6.5.1 The API
        6.5.2 An Example

7 Debugging in RTCore
    7.1 Enabling the debugger
    7.2 An example
    7.3 Notes
        7.3.1 Overhead
        7.3.2 Intercepting unsafe FPU use
        7.3.3 Remote debugging
        7.3.4 Safely stopping faulted applications
        7.3.5 GDB notes

8 Tracing in RTCore
    8.1 Introduction
    8.2 Basic Usage of the Tracer
    8.3 POSIX Events

9 IRQ Control
    9.1 Interrupt handler control
        9.1.1 Requesting an IRQ
        9.1.2 Releasing an IRQ
        9.1.3 Pending an IRQ
        9.1.4 A basic example
        9.1.5 Specifics when on NetBSD
    9.2 IRQ state control
        9.2.1 Disabling and enabling all interrupts
    9.3 Spinlocks

10 Writing Device Drivers
    10.1 Real-time FIFOs
    10.2 POSIX files
        10.2.1 Error values
        10.2.2 File operations
    10.3 Reference counting
        10.3.1 Reference counting and userspace

II RTLinuxPro Technologies

11 Real-time Networking
    11.1 Introduction

12 PSDD
    12.1 Introduction
    12.2 Hello world with PSDD
    12.3 Building and running PSDD programs
    12.4 Programming with PSDD
    12.5 Preallocated Memory Support
    12.6 Standard Initialization and Cleanup
    12.7 Input and Output
    12.8 Example: User-space PC speaker driver
    12.9 Safety Considerations
    12.10 PSDD API
    12.11 Frame Scheduler
        12.11.1 Introduction
        12.11.2 Command-line interface to the scheduler
        12.11.3 Building Frame Scheduler Programs
        12.11.4 Running Frame Scheduler Programs
    12.12 Conclusion

13 VxIT
    13.1 Introduction

14 Controls Kit (CKit)
    14.1 Introduction
    14.2 Operation of the CKit
    14.3 PD Controller Example
        14.3.1 Entity Registration
        14.3.2 Program Execution
    14.4 XML-RPC API
    14.5 Conclusion

15 RTLinuxPro Optimizations
    15.1 General optimizations
    15.2 RTCore-internal optimizations
    15.3 CPU management
        15.3.1 Targeting specific CPUs
        15.3.2 Reserving CPUs
        15.3.3 Interrupt focus
        15.3.4 Example

III Appendices

A List of abbreviations

B Terminology

C Familiarizing with RTLinuxPro
    C.1 Layout
        C.1.1 Self and cross-hosted development
    C.2 Loading and unloading RTCore
        C.2.1 Running the examples
    C.3 Using the root filesystem
    C.4 Summary

D In-kernel C++ Support
    D.1 Building C++ applications
    D.2 Running C++ applications
    D.3 Caveats

E Important system commands

F Things to Consider
        F.0.1 System Management Interrupts (SMIs)
        F.0.2 Drivers that have hard coded cli/sti
        F.0.3 Power management (APM)
        F.0.4 Spurious IRQ7
        F.0.5 Hardware platforms
        F.0.6 Floppy drives
        F.0.7 ISA devices
        F.0.8 DAQ cards

G RTCore Drivers
    G.1 Digital IO Device Common API
    G.2 PPS driver
        G.2.1 Input
        G.2.2 Timing and how it works
        G.2.3 Using the driver
        G.2.4 Caveats
    G.3 RTLinux PCI Driver
    G.4 IEEE-1284 - Parallel Port Digital IO Driver
    G.5 VME driver
        G.5.1 Interrupts
        G.5.2 Slave memory regions
        G.5.3 Master memory regions
        G.5.4 DMA transfers
        G.5.5 Performance
        G.5.6 Chip specific notes
    G.6 Serial driver
    G.7 Video Framebuffer Driver
        G.7.1 Calling Contexts
        G.7.2 Operations on the Framebuffer
        G.7.3 Examples
    G.8 Intel 82C55 Digital IO
        G.8.1 Driver specifics
    G.9 Marvell GT64260 and GT64360 Digital IO Driver
        G.9.1 Driver specifics
    G.10 Power Management Driver
    G.11 Frequency changing
    G.12 CPU Idle calls
    G.13 Additional Uses

H “New” RTCore Networking
    H.1 E1000 Driver
        H.1.1 Tested Revisions

I The RTCore POSIX namespace
    I.1 Clean applications
    I.2 Polluted applications
    I.3 PSDD users
    I.4 Include hierarchies and rules
        I.4.1 app/
        I.4.2 rtcore/
        I.4.3 gpos_bridge/
    I.5 Including GPOS files
    I.6 Quick rules
        I.6.1 Older apps that must be polluted
        I.6.2 Older users that want to avoid pollution
        I.6.3 Exceptions

J System Testing
    J.1 Running the regression test
        J.1.1 Stress testing
    J.2 Jitter measurement

K Sample programs
    K.1 Hello world
    K.2 Multithreading
    K.3 FIFOs
        K.3.1 Real-time component
        K.3.2 Userspace component
    K.4 Semaphores
    K.5 Shared Memory
        K.5.1 Real-time component
        K.5.2 Userspace application
    K.6 Cancel Handlers
    K.7 Thread API
    K.8 One Way queues
    K.9 Processor reserve/optimization
    K.10 Soft IRQs
    K.11 PSDD sound speaker driver

Chapter 1

Introduction

Real-time software is needed to run cell phones, power grids, telescopes, machine tools, robots, factory floors, and many other kinds of systems. The RTCore hard real-time operating system has been used to control the mechanical animals in the movie Dr. Dolittle, perform jet engine testing for the Joint Strike Fighter, aim the telescope at Kitt Peak, run flight simulators, collect weather data for NASA, balance magnetic bearings, milk cows, control Fujitsu’s humanoid robot, win robotic football tournaments, and more. The system is flexible enough that for one customer it can control an engine, while for another it just as easily mimics the human hand playing a violin.

RTCore is designed to make real-time programming more convenient and less mysterious. Real-time programming is still pretty challenging, but once the basic ideas are understood, and provided the programmer has some C and UNIX background, programming RTCore applications should be, if not simple, at least feasible. This book will cover the basic principles of the operating system (OS) and general real-time programming, stressing examples and practical methods as much as possible.

RTCore follows the UNIX philosophy of making it convenient to build complex applications by connecting existing pieces of software. One way to think of RTCore is as a small operating system that runs a second operating system as its lowest priority task. All the non-time-critical applications can be put in the second operating system. For most programmers it is probably more useful to think of RTCore as a special real-time process that runs within a non-real-time operating system. It schedules itself and can always pre-empt both the operating system and any applications. The non-real-time operating system is usually Linux (RTLinux), although it can also be BSD UNIX (RTCoreBSD) or even a Java VM. [1]

The real-time system is multithreaded using the POSIX threads API. Real-time applications are written as threads and signal handlers that can be installed in the real-time process. These threads and signal handlers can be scheduled with great precision and can respond to interrupts with very low latency.

On a 1.2GHz Athlon (lower end hardware these days), an RTCore interrupt handler runs within 9 microseconds of the assertion of a hardware interrupt, under very heavy load. If a device on this system generates an interrupt when the temperature gets too high, the signal handler connected to that device will start running at most 9 microseconds after the interrupt is generated. On the same system, a periodic thread scheduled to run every millisecond will run at most 13 microseconds late once interrupt latency, scheduling overhead, and context switch time are all accounted for. Likewise, if a data acquisition device is attached to this system and the system polls it at a regular rate, the polling thread starts within 13 microseconds of its scheduled time. [2] RTCore is a hard real-time system, so all numbers shown are absolute worst-case times, not average or “typical” times. Be wary of tests offered in support of other approaches. Many of these tests are run against quiescent systems, for short periods of time, or quote numbers that have no relevance to real situations and are in no way indicative of real-world results. Later examples will discuss what to look for to weed out these irrelevant tests.
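The gap between a worst-case figure and an average can be made concrete with a small sketch. The helper below is hypothetical (it is not part of the RTCore API) and assumes that lateness samples, in nanoseconds, have already been collected by comparing each wakeup time with its scheduled time.

```c
#include <stddef.h>

/* Hypothetical helper, not an RTCore call: summarizes measured lateness. */
struct lateness_summary {
    long long worst_ns;   /* largest sample: what a hard real-time claim must quote */
    long long mean_ns;    /* average: can look fine even when the worst case is bad */
};

struct lateness_summary summarize_lateness(const long long *samples_ns, size_t n)
{
    struct lateness_summary s = {0, 0};
    long long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += samples_ns[i];
        if (samples_ns[i] > s.worst_ns)
            s.worst_ns = samples_ns[i];
    }
    if (n > 0)
        s.mean_ns = total / (long long)n;
    return s;
}
```

For samples of 2, 3, 2 and 13 microseconds, the mean is 5 microseconds while the worst case is 13. Only the worst case bounds what a periodic thread will actually experience, which is why a short run on an idle machine says little.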

Of course, all this speed does no good if programming the system is too complicated. As a result, RTCore has been designed to meet two goals. First, the time-critical software can be written in the familiar and well documented POSIX threads/signals API. Second, it is fairly simple to put non-time-critical software into the application operating system. A favorite example of the RTLinux developers is the ease of writing a data logging program that makes use of a single shell script line on the UNIX side:

    ./rtcore_app > mylogfile

This code runs the real-time application and logs output to a non-real-time Linux file.

[1] Generally, the term “GPOS”, or General Purpose Operating System, is used to refer generically to the non-real-time system. The RTCore API and behavior remain the same regardless of which GPOS is being used.

[2] Later chapters will show how to reduce this time to 0 microseconds, bypassing hardware jitter.

This book begins with some background on RTLinux followed by some simple examples. It then takes a detour for an in-depth introduction to the basic concepts of RTCore and an overview of the RTLinux API. Next, the available communication models for exchanging data between real-time threads and the non-real-time domain are presented. The sample programs provided then use these mechanisms to show how they apply as solutions to simple problems. The following chapters step through these example programs, making every step as clear as possible and requiring little prior knowledge. Following that, several chapters are dedicated to the more advanced features of RTLinux Professional, or RTLinuxPro. After having covered the basic concepts, this book wraps up with a basic model for writing real-time drivers.

1.1 Some background

RTLinux began as a research project in 1995 to investigate a simple method of providing hard real-time services within the context of a general purpose operating system. Soon thereafter, it began to be used in a variety of domains. FSMLabs was formed to commit a dedicated effort to improving the technology and to provide top tier support for commercial users of the product.

RTLinuxPro was developed out of this effort and is licensed for commercial use. FSMLabs continues to move the technology forward via RTLinuxPro and the RTCore OS, having dedicated many man years to providing a solid and integrated hard real-time component for commercial customers. The RTLinuxFree project, based on the GPL-released project, is community supported and developed. FSMLabs continues to provide the necessary resources to support the RTLinuxFree community.

1.2 How the book works

The main body of each chapter discusses the principles of how the software examples work. In each chapter, side notes describe how to implement the examples in RTLinuxPro. The appendix emphasizes key points and includes full source to all examples found in this book. As mentioned, the RTCore OS can run different non-real-time operating systems, but Linux and BSD UNIX will generally be referred to by default. Ports to other operating systems are in development.

The target audience for this book is the engineer who is interested in learning how to write real-time applications using the RTCore OS. The book focuses on getting the engineer spun up on each facet so they can become productive quickly. Experience developing real-time applications is helpful but not necessary because RTCore uses the standard POSIX API. Users with some knowledge of POSIX and UNIX should feel comfortable with RTLinux.

The full source of the programs referenced in the book can be found in Appendix K and is provided with the RTLinuxPro development kit.

Part I

RTCore Basics


Chapter 2

Introductory Examples

2.1 Introduction

The RTCore OS is a small, hard real-time operating system that can run Linux or BSD UNIX as an application server. This allows a standard operating system to be used as a component of a real-time application. This chapter will provide an overview of RTCore’s capabilities, introducing basic concepts, the API, and some of the add-on components. This chapter assumes the developer has already installed RTLinuxPro or RTCoreBSD. Please refer to the devkit_manual.pdf provided with the development kit for details. This chapter will assume an RTLinuxPro environment, but the procedures apply equally to a BSD host.

2.2 Using RTCore

RTCore extends the UNIX “design with components” philosophy to real-time. A typical RTCore application consists of one or more real-time components that run under the direct control of the real-time kernel and a set of non-real-time components that run as user-space programs. This chapter will begin with a couple of simple programs.

Only very basic familiarity with RTCore concepts is assumed so far. Readers who would like help getting up to speed before continuing should refer to the RTLinuxPro guide found in Appendix C. Readers who need more grounding while working through the examples can skip ahead to Chapter 3 for more background information.


For this example, the core RTCore OS needs to be loaded as described in Appendix C. It is assumed that the current working directory is doc/pdf/rtl_book_code and the user is root. More examples can be found in rtlinuxpro/examples.

2.2.1 Hello world

As with most systems, it makes sense to start things off with a simple “hello world” application. This is no exception.

    #include <stdio.h>

    int main(void)
    {
            printf("Hello from the RTL base system\n");
            return 0;
    }

Surprised? This is all that is involved - nothing more than what one would see in a normal C introduction. Running the example (./hello.rtl) causes the RTCore OS to load the application and enter the main() context. There it prints a message through standard I/O for the user to see and exits.

A simple makefile is needed to build this. One can be taken from the examples provided with RTLinuxPro, or the example below can be used:

    include /opt/rtldk-x.y/rtlinuxpro/rtl.mk

    all: hello.rtl

    include $(RTL_DIR)/Rules.make

This is the makefile needed to build the hello.rtl program from hello.c. Including rtl.mk sets up the build environment - compilers, CFLAGS, and so on. Including Rules.make provides the build rules needed to transform C source to an RTCore application. The path to rtl.mk must be hardcoded, because including rtl.mk is what sets the variable $(RTL_DIR). For more detailed configurations, please refer to the examples provided with RTLinuxPro.

Note - do not build applications within the RTLinuxPro filesystem. Instead, build them externally in a private build tree. This prevents accidental changes being made to RTLinuxPro by executing make in the rtlinuxpro/ directory. In a binary kit, this may remove pieces that cannot be repaired without a reinstall.

Those familiar with older RTLinux versions (2.0 and prior) are used to printf() messages silently appearing in the kernel’s ring buffer, but now they print through stdout just like any other application. Also, there is a standard printf(), rather than the rtl_printf() some users have seen. This printf() is fully capable and can handle any format that a normal printf() can handle.

Once the message has been printed, the program exits, and RTCore unloads the application.

2.2.2 Multithreading

If the reader is familiar with POSIX threading, they will feel at home with RTCore. For those who are not as familiar, there are many solid references on the subject, such as the O’Reilly book on Pthreads Programming. The next example shows an RTLinux pthread model with a task that operates on a 1 millisecond interval.

    #include <stdio.h>
    #include <pthread.h>
    #include <unistd.h>

    pthread_t thread;

    void *thread_code(void *t)
    {
            struct timespec next;
            int count = 0;

            /* Sample the current time once; every wakeup is derived from it. */
            clock_gettime(CLOCK_REALTIME, &next);

            while (1) {
                    /* Advance the absolute wakeup time by 1 ms and sleep until then. */
                    timespec_add_ns(&next, 1000*1000);
                    clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                                    &next, NULL);
                    count++;
                    /* Report once per 1000 wakeups. */
                    if (!(count % 1000))
                            printf("woke %d times\n", count);
            }

            return NULL;
    }

    int main(void)
    {
            pthread_create(&thread, NULL, thread_code, (void *)0);

            /* Block until the application is shut down. */
            rtl_main_wait();

            pthread_cancel(thread);
            pthread_join(thread, NULL);

            return 0;
    }

Again, everything starts with a normal main() function. A standard thread is spawned right away (pthread attributes will be covered later) and the code calls rtl_main_wait(). This is a blocking function that keeps the application suspended until it is shut down. Programmers who have written graphical applications with a main event loop will recognize the concept.

If the application is killed (via CTRL-C or otherwise), the waiting call will complete. The rest of the function cancels the thread, joins it to reclaim its resources, and returns.

The thread itself is a hard real-time thread running under RTCore that executes on an exact 1 millisecond period. It samples the current time once before entering the loop; on each iteration it adds 1 millisecond to that value and sleeps until that absolute time arrives. It counts the number of wakeups and prints the count every 1000 iterations. (1000 printf() calls per second clutters the terminal pretty quickly.) This thread will execute indefinitely, until the application is actively unloaded.
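The reason for sleeping until an absolute time, rather than for a relative interval, can be seen in a small simulation. The sketch below is plain C with hypothetical names and an assumed fixed amount of work per cycle; it is not RTCore code. It computes when the n-th wakeup is scheduled under each scheme.

```c
/* Hypothetical simulation, not RTCore code. All times are in nanoseconds. */

/* Relative sleeps: each cycle sleeps period_ns after finishing its work,
 * so the work time of every cycle is added to the schedule. */
long long nth_wakeup_relative(long long start_ns, long long period_ns,
                              long long overhead_ns, int n)
{
    long long t = start_ns;
    int i;

    for (i = 0; i < n; i++)
        t += overhead_ns + period_ns;
    return t;
}

/* Absolute deadlines (the TIMER_ABSTIME pattern): the next wakeup is the
 * previous deadline plus the period, so work time does not accumulate as
 * long as it is shorter than the period. */
long long nth_wakeup_absolute(long long start_ns, long long period_ns,
                              long long overhead_ns, int n)
{
    long long next = start_ns;
    int i;

    (void)overhead_ns;   /* absorbed by the sleep, never added to the deadline */
    for (i = 0; i < n; i++)
        next += period_ns;
    return next;
}
```

With a 1 millisecond period and an assumed 10 microseconds of work per cycle, the relative scheme is 10 milliseconds behind after 1000 cycles, while the absolute scheme is still on its one-second mark.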


It is important to note that code in the main() routine is inherently non-real-time. Any potentially non-real-time activity should be done here, such as memory allocation and other initialization tasks. Memory allocation and its potentially non-real-time behavior will be discussed further in a later chapter.

2.2.3 Basic communication

There needs to be some communication from one real-time thread to another and also between real-time threads and non-real-time threads, such as Linux processes. Later chapters will discuss this in more detail, but this example looks only at the simplest of mechanisms, the FIFO.

Real-time FIFOs are just like any other FIFO device. A producer, whether a real-time thread or a userspace application, pushes data in and a consumer receives it in the order it was submitted. Real-time FIFOs are constructed such that real-time threads will never block on data submission. They will always perform the write() and move on as quickly as possible. This way real-time applications can never be stalled because of the FIFO’s state.
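The idea of a write that never waits can be illustrated with a small byte ring buffer. This is a minimal sketch under stated assumptions, not the RTCore FIFO implementation: the names and the 8-byte capacity are hypothetical, the code is not thread-safe, and returning a short count when the buffer is full is a choice made for the sketch.

```c
#include <stddef.h>

#define RING_CAPACITY 8   /* assumed depth for the sketch */

struct ring {
    unsigned char data[RING_CAPACITY];
    size_t head;    /* next slot to read */
    size_t count;   /* bytes currently stored */
};

/* Store as many bytes as fit and return that number. Never waits:
 * when the buffer is full the call returns 0 immediately. */
size_t ring_put(struct ring *r, const void *src, size_t len)
{
    const unsigned char *p = src;
    size_t written = 0;

    while (written < len && r->count < RING_CAPACITY) {
        r->data[(r->head + r->count) % RING_CAPACITY] = p[written++];
        r->count++;
    }
    return written;
}

/* Remove up to len bytes in the order they were stored. */
size_t ring_get(struct ring *r, void *dst, size_t len)
{
    unsigned char *p = dst;
    size_t taken = 0;

    while (taken < len && r->count > 0) {
        p[taken++] = r->data[r->head];
        r->head = (r->head + 1) % RING_CAPACITY;
        r->count--;
    }
    return taken;
}
```

The producer's cost is bounded by the length of its own data, whatever the consumer is doing. That bound is the property a real-time writer needs.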

First, here is the real-time component:

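    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <pthread.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    pthread_t thread;
    int fd1;

    void *thread_code(void *t)
    {
            struct timespec next;

            clock_gettime(CLOCK_REALTIME, &next);

            while (1) {
                    /* Wake once per second and push one message into the FIFO. */
                    timespec_add_ns(&next, 1000*1000*1000);

                    clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                                    &next, NULL);

                    write(fd1, "a message\n", strlen("a message\n"));
            }

            return NULL;
    }

    int main(void)
    {
            /* Create the FIFO; it appears in the GPOS filesystem. */
            mkfifo("/communicator", 0666);

            fd1 = open("/communicator", O_RDWR | O_NONBLOCK);

            /* Size the FIFO: 16<<10 is 16384 bytes. */
            ftruncate(fd1, 16<<10);

            pthread_create(&thread, NULL, thread_code, (void *)0);

            rtl_main_wait();

            pthread_cancel(thread);
            pthread_join(thread, NULL);

            close(fd1);
            unlink("/communicator");

            return 0;
    }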

This code begins by creating the FIFO with standard POSIX calls. mkfifo() creates the FIFO with permissions such that a device will appear in the GPOS filesystem dynamically. The file is opened normally and ftruncate() is called to size it. This sets the “depth” of the FIFO.

A thread is then spawned, and main() blocks in rtl_main_wait() until the application is killed. Once rtl_main_wait() completes, the FIFO is closed and unlinked, just like any normal file, in addition to the regular thread cleanup. RTCore will catch dangling devices and clean them up for a user, but good programming practice is to do the work right in the first place.

The thread in this instance sleeps on a one second interval and writes to the FIFO every time it wakes up. As before, it will do this indefinitely. Below is the userspace code:

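    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
            int fd;
            ssize_t n;
            char buf[255];

            fd = open("/communicator", O_RDONLY);
            if (fd < 0) {
                    perror("open");
                    return 1;
            }

            while (1) {
                    /* Leave room for the terminator; read() does not add one. */
                    n = read(fd, buf, sizeof(buf) - 1);
                    if (n > 0) {
                            buf[n] = '\0';
                            printf("%s", buf);
                    }
                    sleep(1);
            }
    }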

This is a normal non real-time, userspace application. It opens the other end of the FIFO and reads periodically, getting the message from the other end. The same mechanism could carry a device data protocol from the RTOS to userspace, let a userspace application write data up to the RTOS to direct thread execution, or connect real-time threads to each other. In every case, FIFOs provide a simple file-based means of exchanging data.
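The "never block on submission" behaviour described above can be approximated on an ordinary POSIX system with a pipe whose write end is set to O_NONBLOCK. The following is only a sketch under that assumption: it uses a plain pipe() rather than an RTCore FIFO, and the helper names set_nonblocking(), try_submit() and fill_until_full() are invented for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Set O_NONBLOCK on an already-open descriptor. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Submit one message without ever waiting for the consumer.
 * Returns 1 if queued, 0 if the FIFO was full (message dropped),
 * -1 on any other error.
 */
int try_submit(int fd, const void *msg, size_t len)
{
    ssize_t n = write(fd, msg, len);

    if (n == (ssize_t)len)
        return 1;
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}

/*
 * Write to a pipe that nobody reads until it reports "full".
 * Returns the number of messages accepted, or -1 on error.
 */
long fill_until_full(void)
{
    static const char msg[] = "a message\n";
    int fds[2];
    long accepted = 0;
    int rc;

    if (pipe(fds) == -1)
        return -1;
    if (set_nonblocking(fds[1]) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    while ((rc = try_submit(fds[1], msg, sizeof(msg) - 1)) == 1)
        accepted++;
    close(fds[0]);
    close(fds[1]);
    return rc == 0 ? accepted : -1;
}
```

The producer learns that the FIFO is full from the return value instead of being put to sleep, so the choice between dropping data and retrying later stays with the application.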

2.2.4 Signaling and multithreading

Communication between threads is also done via standard POSIX mechanisms. Again, all of the different means are covered later, but here is a look at semaphores, which are a very convenient method of signaling between threads.

#include <stdio.h>
#include <time.h>
#include <pthread.h>
#include <unistd.h>
#include <semaphore.h>

pthread_t wait_thread;
pthread_t post_thread;
sem_t sema;

/* Waiter: block on the semaphore and report every wakeup. */
void *wait_code(void *t)
{
    while (1) {
        sem_wait(&sema);
        printf("Waiter woke on a post\n");
    }
}

/* Poster: sleep until the next absolute deadline, then post. */
void *post_code(void *t)
{
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000*1000*1000);

        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);

        printf("Posting to the semaphore\n");
        sem_post(&sema);
    }

    return NULL;
}

int main(void)
{
    sem_init(&sema, 1, 0);

    pthread_create(&wait_thread, NULL, wait_code, (void *)0);
    pthread_create(&post_thread, NULL, post_code, (void *)0);

    rtl_main_wait();

    pthread_cancel(post_thread);
    pthread_join(post_thread, NULL);

    pthread_cancel(wait_thread);
    pthread_join(wait_thread, NULL);

    sem_destroy(&sema);

    return 0;
}

Instead of a single thread, two are spun up once the semaphore is initialized. One thread waits on the semaphore, while the other sleeps and periodically performs the sem_post() operation. Before the post occurs and after the waiter wakes, a message is printed to indicate the sequence of events. The timespec_add_ns() function is shorthand provided by RTCore to add nanoseconds to a timespec structure.

Semaphores really are that easy. Later chapters will show how semaphores easily handle synchronization problems.
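The listings in this chapter rely on timespec_add_ns() to advance an absolute deadline. The sketch below shows what a helper of that kind might look like in portable C and how it combines with clock_nanosleep() to form a drift-free periodic loop. It is not RTCore’s implementation: ts_add_ns() and run_periodic() are names invented here, and CLOCK_MONOTONIC is used instead of CLOCK_REALTIME so that the example is unaffected by wall-clock adjustments on a general purpose system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Advance *ts by ns nanoseconds (ns >= 0), keeping 0 <= tv_nsec < 1e9. */
void ts_add_ns(struct timespec *ts, long long ns)
{
    ts->tv_sec += (time_t)(ns / NSEC_PER_SEC);
    ts->tv_nsec += (long)(ns % NSEC_PER_SEC);
    if (ts->tv_nsec >= NSEC_PER_SEC) {
        ts->tv_nsec -= NSEC_PER_SEC;
        ts->tv_sec += 1;
    }
}

/*
 * Sleep until `periods` absolute deadlines spaced period_ns apart.
 * Returns 0 on success, -1 if the clock cannot be read, or the error
 * number reported by clock_nanosleep().
 */
int run_periodic(int periods, long long period_ns)
{
    struct timespec next;
    int i, rc;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;
    for (i = 0; i < periods; i++) {
        /* each deadline is derived from the previous deadline,
           not from the moment the thread happened to wake up */
        ts_add_ns(&next, period_ns);
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                 &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return rc;
    }
    return 0;
}
```

Because every deadline is computed from the previous one, a late wakeup in one period does not push all later periods back, which is the reason the listings use TIMER_ABSTIME rather than a relative sleep.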

2.3 Perspective

At this point, take a step back and look at what was just covered. In a short review, code that performs standard output, POSIX threads, communication through real-time devices, and synchronization through standard POSIX semaphores was shown. None of it required much experience beyond basic knowledge of C and POSIX and minimal UNIX background. In fact, these applications are no different than what is seen in a normal C environment under another UNIX. The only difference is that hard real-time response is obtained in the threads.

The point of this was to get the reader handling useful code as quickly as possible, easing the stigma surrounding real-time programming. The next chapter will take a step back to look at a broader view of RTCore, the API and some of the grounding principles of real-time programming.

Chapter 3

Real-time Concepts and RTCore

By now the reader has seen some basic RTCore code and can see that real-time programming is not as mystifying as it sounds. However, before diving into detailed coverage of the API, some basic concepts need to be demonstrated. For those familiar with RTOS concepts, most of this chapter should be review, but skimming is recommended because this chapter explains how RTCore handles real-time problems.

3.1 RTOS kingdom/phylum/order

The definition of real-time varies greatly based on its use. Anything from stock quotes to stepper motors can be said to be real-time. Within the computing industry, real-time has many different meanings depending on the requisite service level. Here is a simple breakdown of operating systems in relation to real-time applicability.

3.1.1 Non-real-time systems

"Non real-time" systems are the operating systems most often used. These systems have no hard guarantees and are able to utilize optimization strategies contradictory to real-time requirements, such as caching and buffering. Non real-time systems have the following characteristics:


• No guaranteed worst-case scheduling jitter. Under heavy system load, the system may defer scheduling of a task as long as it deems necessary.

• No theoretical limit on interrupt response times. System load may result in delayed interrupt response. Also, running with interrupts disabled for considerable periods, while considered to be bad form, is not catastrophic.

• No guarantee that an event will be handled. Varying the system load affects the number of events it intercepts, such as interrupts.

• System response is strongly load-dependent. Tasks that take x amount of time under one system load will take y amount of time under a different load. Response prediction with any surety is generally impossible.

• System timing is an unmanaged resource. This means that timing data is not considered important to system execution and is not tracked with precision.

Non real-time systems are unpredictable even at a statistical level because system reaction is highly dependent on system load. Rough predictions can be made if the error window is opened widely, but the results cannot be proven to fall inside the predicted range.

3.1.2 Soft real-time

In cases where missing an event is not critical, such as in a video application where a missed frame or two is not fatal, a "soft real-time" system may do. Such a system is characterized by the following criteria:

• The system can guarantee a rough worst-case average jitter, but not an absolute worst case scenario.

• Events may still be missed occasionally. This is better than the non real-time system, as there is more control over response, but as the absolute worst case is unknown, events such as interrupts may still be lost. This may occur even when not in a worst case situation.

Soft real-time systems are statistically predictable for the average case, but a single event cannot be predicted reliably. Soft real-time systems are generally not suited for handling mission-critical events.


3.1.3 Hard real-time

The following list of requirements defines a hard real-time system. The absence of predictability for any one of these items disqualifies a system from being a hard real-time system.

• System time is a managed resource. Timing resources are managed with the highest possible level of precision.

• Guaranteed worst-case scheduling jitter. If a task needs to happen within a certain deviation, it is guaranteed to occur.

• Guaranteed maximum interrupt response time. As with scheduling latency, interrupts are guaranteed to be acknowledged and handled within a certain window.

• No real-time event is ever missed. This is important. Under no circumstances will a scheduled task fail to run on time, or an interrupt or any other event the real-time code is interested in be missed.

• System response is load-independent. Execution of real-time tasks is guaranteed to fall within the worst case value range, regardless of the system load factor. A thrashing database process or a runaway web server will not delay movement of a robotic arm.

A system that can fulfill these criteria is fully deterministic and considered to be "hard" real-time. Of course, there are varying levels of service, as some hard real-time systems might have a worst case jitter of 2 seconds, while others provide 25 microseconds. Both qualify according to the definition, but only one is usable for a wide range of applications. The RTCore approach qualifies on all of these counts, as response time is near the limits of the underlying hardware.

Hard real-time systems will generally have slightly lower average performance than soft real-time systems, which in turn are generally not as efficient with resources as non real-time systems. This is because non real-time systems are concerned with throughput - if an Ethernet transfer is delayed a little in order to burst out several disk transfers, this results in higher system output, and has no significant repercussions in a non real-time environment. In a hard real-time system, not performing this optimization results in lower overall throughput, but it maintains determinism. This determinism is what makes the difference between getting a task done without fail and doing a "best effort" based on available system resources.
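Scheduling jitter, the quantity that separates the three classes above, can be observed directly by sleeping until an absolute deadline and recording how late the wakeup was. The following is a minimal measurement sketch for an ordinary POSIX system, not an RTCore test tool; the function name max_wakeup_lateness_ns() is an assumption made for this example. On a general purpose OS the result varies with load, and no finite run can establish a true worst case.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <time.h>

/*
 * Sleep until `samples` absolute deadlines spaced period_ns apart
 * (0 <= period_ns < 1e9) and return the largest observed lateness in
 * nanoseconds, or -1 on error.
 */
long long max_wakeup_lateness_ns(int samples, long period_ns)
{
    struct timespec next, now;
    long long worst = 0;
    long long late;
    int i, rc;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;
    for (i = 0; i < samples; i++) {
        /* compute the next absolute deadline */
        next.tv_nsec += period_ns;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec += 1;
        }
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                 &next, NULL);
        } while (rc == EINTR);
        if (rc != 0 || clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;
        /* lateness = actual wakeup time minus requested deadline */
        late = (long long)(now.tv_sec - next.tv_sec) * 1000000000LL
             + (now.tv_nsec - next.tv_nsec);
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

Running a loop like this while the machine is idle and again under heavy disk or network load gives a feel for how strongly a non real-time system’s response depends on load.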

3.2 The RTOS design dilemma

The fundamental problem of an RTOS is that users have conflicting demands with respect to system design. On one hand, an RTOS should obviously be capable of real-time operations. On the other hand, users want access to the same rich feature sets found in general-purpose operating systems which run on desktop PCs and workstations. To resolve this dilemma, two general concepts have traditionally been used.

3.2.1 Expand an RTOS

Design guidelines for an RTOS include the following: It needs to be compact, predictable and efficient. It should not need to manage an excessive number of resources and it should not be dependent on any dynamically allocated resources. If one expands a compact RTOS to incorporate the features of typical desktop systems, it is hard (if not impossible) to fulfill the demands of the core RTOS. Problems that arise from this approach include:

• The OS becomes very complex. This makes it difficult to ensure determinism, since ALL core capabilities must be fully preemptive. This means that all developers must now take into account every possible real-time demand in addition to solving problems in their specific domain.

• Drivers for hardware become very complex. Since priority inversion must not occur, drivers must be able to handle situations in which they are not being serviced. Again, this forces all developers to deal with additional possibilities outside their domain.

• Since the core system is an RTOS, the vast amount of available software cannot (in most cases) be used without modification or at least significant analysis with respect to real-time demands. It is almost impossible to determine interactions between the software and the RTOS.


• Many mechanisms for efficiency, like caching and queuing, become problematic. This prohibits usage of many typical optimization strategies for the non real-time applications in the system.

• Maintenance costs of such a system are considerable for both developers and customers. Since every component of the system can influence the entire system’s behavior, it is very hard to evaluate updates and modifications with respect to how they will influence real-time behavior of the rest of the system. Engineering costs skyrocket as reliability becomes questionable.

3.2.2 Make a general purpose OS real-time capable

The seemingly natural alternative strategy would be to add real-time capabilities to a general purpose OS. In practice, this approach meets constraints similar to those noted above, as both are converging on the same idea from different directions. Problems that arise with such an approach include:

• General purpose operating systems are event-driven, not time-triggered.

• General purpose OSs are (generally) not fully preemptive systems. Making them fully preemptive requires modifications to all hardware drivers and to all resource handling code. For a constantly evolving system, tracking these modifications is prohibitive from a manpower perspective, and it becomes even more difficult as the OS is patched and modified in the field. In addition, preemption has been found to reduce throughput and response characteristics in many common scenarios.

• Lack of built-in high-resolution timing functions entails substantial system modification.

• Modifying existing applications to be preemptive is very costly and error-prone.

• The use of modified applications would also greatly increase maintenance costs.

• Optimization strategies used in general purpose OSs can contradict the real-time requirements. For example, removing all caching and queueing from an OS would substantially degrade performance in areas where there are no real-time demands.

• Because such systems are very complex (and often not well-documented), it is extremely difficult to reliably achieve full preemption, especially without performance degradation in many usage scenarios. Add in the fact that the system is constantly developing, and the problem worsens.

General purpose operating systems are efficient with resources. Because they do not manage time as an explicit resource, trying to modify the system to do so violates many of its design goals, and causes components to be used in ways they were never designed for. This is in principle a bad strategy, especially when there are many developers, all with different visions of what the exact behavior of the machine should be.

3.2.3 The RTCore approach to the problem

To resolve these conflicting demands, a simple solution has been developed. RTCore splits the OS entirely, so that one kernel (Linux or BSD UNIX) runs as a general purpose system (GPOS) with no hard real-time capabilities but with a large capability set, and a second kernel (RTCore), designed around real-time capabilities, efficiently handles real-time work. The real-time kernel allows the GPOS to run when there are no real-time demands. This approach allows the non real-time side of the OS to provide all the capabilities that desktop users are used to, while the real-time side can be kept small, fast, deterministic, and verifiable.

Three major attributes make RTCore work:

• It disables all hardware interrupts in the GPOS.

• It provides interrupts via interrupt emulation.

• It runs full featured non real-time Linux (or BSD) as the lowest priority task. It is the "idle task" of the RTOS, meaning that it is run whenever the real-time system has nothing else to execute.
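The "idle task" arrangement in the last point can be pictured as a fixed-priority selection in which the GPOS is an ordinary entry that is always ready but has the lowest priority. The toy model below is only an illustration of that idea; struct task and pick_next() are invented here and say nothing about how the RTCore scheduler is implemented.

```c
#include <stddef.h>

struct task {
    const char *name;
    int priority;   /* larger value = more urgent */
    int ready;      /* nonzero if the task has work to do */
};

/*
 * Return the ready task with the highest priority, or NULL if none
 * is ready. If the GPOS entry is always ready and has the lowest
 * priority, it runs exactly when no real-time task wants the CPU.
 */
const struct task *pick_next(const struct task *tasks, size_t n)
{
    const struct task *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!tasks[i].ready)
            continue;
        if (best == NULL || tasks[i].priority > best->priority)
            best = &tasks[i];
    }
    return best;
}
```

With a table holding a GPOS entry at priority 0 and one real-time task at a higher priority, pick_next() returns the GPOS entry until the real-time task becomes ready, and the real-time task from then on.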

3.3 Interrupt emulation

The main problem in adding hard real-time capabilities to a general purpose operating system is that the disabling of interrupts is widely used in the kernel for synchronization purposes. The strategy of disabling interrupts in critical code sequences (as opposed to using synchronization mechanisms like semaphores or mutexes) is quite efficient. It also makes code simpler, since it does not need to be designed to be reentrant. But disabling of interrupts for long periods results in lost events.

To maintain the structure of the GPOS kernel while providing real-time capabilities, one must provide an "interrupt interface" that gives full control over interrupts, but at the same time appears to the rest of the system like regular hardware interrupts. This interrupt interface is essentially an interrupt emulation layer, and is one of the core concepts in RTCore. Interrupt emulation is achieved by replacing all occurrences of sti and cli with emulation code. This introduces a software layer between the hardware interrupt controller and the GPOS kernel, allowing the real-time kernel to handle interrupts as needed by real-time code, but still allowing the general purpose OS to handle them if there is a need.

Interrupts that are not destined for a real-time task must be passed on to the GPOS kernel for proper handling when there is time to deal with them. In other words, RTCore has full control over the hardware, and the non real-time GPOS sees soft interrupts, not the "real" interrupts. Hardware interrupt interaction is simply emulated in the GPOS. This means that there is no need to recode GPOS drivers, provided there are no hard-coded instructions in binary-only drivers that bypass the emulation. (See F.0.2 for details.)

3.3.1 Flow of control on interrupt

What happens when an interrupt occurs in RTCore? The following pseudocode shows how RTCore handles such an event.

if (there is an RT-handler for the interrupt) {
    call the RT-handler
}
if (there is a GPOS-handler for the interrupt) {
    call the GPOS handler for this interrupt
} else {
    mark the interrupt as pending
}

This pseudocode represents the priority introduced by the emulation layer between hardware and the GPOS kernel. If there is a real-time handler available, it is called. After this handler is processed, the GPOS handler is called. This calling of the GPOS handler is done indirectly: it runs as the idle task of the RTCore kernel, so the GPOS handler will be called as soon as there is time to do so, but a GPOS interrupt handler cannot block RTCore. That is, the interrupt handler for the GPOS is called from within the GPOS, not from RTCore. If the interrupt is deferred to the GPOS and its interrupt handler is executing when a real-time task must run, the real-time kernel will suspend its execution, and the real-time code will execute as needed.
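The flow described in this section can be modelled in a few lines of C. The sketch below is a user-space toy, not RTCore source: the structure struct irq_emu, the functions emu_cli(), emu_sti() and hw_interrupt(), and the counting handlers are all hypothetical. It assumes, as a simplification, that a GPOS handler is delivered immediately when the GPOS has its emulated interrupts enabled and is otherwise recorded as pending until the next emulated sti.

```c
#include <stdint.h>

#define NUM_IRQS 32

typedef void (*irq_handler_t)(int irq);

struct irq_emu {
    irq_handler_t rt_handler[NUM_IRQS];    /* real-time handlers */
    irq_handler_t gpos_handler[NUM_IRQS];  /* GPOS (soft) handlers */
    uint32_t pending;        /* recorded for the GPOS, not yet delivered */
    int gpos_irqs_enabled;   /* software flag that stands in for cli/sti */
};

/* Deliver every pending interrupt that has a GPOS handler. */
static void deliver_pending(struct irq_emu *e)
{
    int irq;

    for (irq = 0; irq < NUM_IRQS; irq++) {
        uint32_t bit = (uint32_t)1 << irq;

        if ((e->pending & bit) && e->gpos_handler[irq]) {
            e->pending &= ~bit;
            e->gpos_handler[irq](irq);
        }
    }
}

/* Emulated cli: the hardware stays enabled, only the flag changes. */
void emu_cli(struct irq_emu *e)
{
    e->gpos_irqs_enabled = 0;
}

/* Emulated sti: re-enable and replay whatever arrived in the meantime. */
void emu_sti(struct irq_emu *e)
{
    e->gpos_irqs_enabled = 1;
    deliver_pending(e);
}

/* Entry point for a hardware interrupt. */
void hw_interrupt(struct irq_emu *e, int irq)
{
    if (irq < 0 || irq >= NUM_IRQS)
        return;
    if (e->rt_handler[irq])
        e->rt_handler[irq](irq);          /* real-time work always runs first */
    if (e->gpos_handler[irq]) {
        e->pending |= (uint32_t)1 << irq; /* record it for the GPOS */
        if (e->gpos_irqs_enabled)
            deliver_pending(e);
    }
}

/* Counting handlers, used only to observe the model. */
int rt_calls, gpos_calls;
void count_rt(int irq)   { (void)irq; rt_calls++; }
void count_gpos(int irq) { (void)irq; gpos_calls++; }
```

In this model an interrupt that arrives while the GPOS believes interrupts are off still reaches the real-time handler at once, while the GPOS handler runs only after emu_sti(). The real-time side is never held up by the general purpose kernel’s critical sections.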

3.3.2 Limits of interrupt emulation

Interrupt emulation does have limits. Even for non real-time interrupts, the system must take the time to acknowledge the interrupt controller and record the fact that the interrupt has happened. The hardware interrupts have priority over the real-time tasks, and so a GPOS hardware interrupt may disturb the real-time scheduling. Fortunately, the actual code is well optimized and has very little impact even on older platforms. Also, the system works in such a way that a particular GPOS interrupt cannot preempt the real-time system more than once per period of real-time activity. Therefore, the worst case scheduling jitter that can be attributed to the non real-time hardware interrupts is bounded by the number of such interrupts that can be received by the current CPU times the maximum time to acknowledge an interrupt and record the interrupt occurrence. The CPU reservation facility (see Section 4.8.3) can eliminate this and minimize other sources of scheduling jitter, making for excellent real-time performance on SMP systems. The RTCore advance timer option (Section 4.8.1) may be used to improve performance on both uniprocessor and multiprocessor systems.
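The bound stated above can be written compactly. The symbols are labels introduced here for illustration: N_irq is the number of distinct non real-time interrupt sources that can be delivered to the current CPU, t_ack is the maximum time needed to acknowledge one interrupt and record its occurrence, and J_irq is the jitter attributable to those interrupts within one period of real-time activity.

```latex
J_{\mathrm{irq}} \;\le\; N_{\mathrm{irq}} \cdot t_{\mathrm{ack}}
```

As a purely hypothetical example, with 8 such interrupt sources and an acknowledge-and-record time of 2 microseconds, the contribution would be at most 16 microseconds per period. Real values depend on the platform and have to be measured.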

Since non real-time activity may have an effect on worst-case timings in the system (e.g. ping flooding a system while running a critical real-time task may shift its timings), the worst possible conditions should be used to test a system’s worst-case scheduling jitter and interrupt response time.

Later chapters and Appendix J cover some basic testing environments for stressing hardware.


3.4 Services Available to Real-Time Code

Code run in the RTCore real-time kernel does not exist in a vacuum. Services are available to real-time code, although applicability may vary depending on system configuration, real-time demands, and RTLinuxPro components available. Later chapters will cover these components in detail, but a few notes are in order to aid in the understanding of the examples.

3.4.1 Memory management

Strictly speaking, there is no memory management from within RTCore before version 2.2. The reason for this is that memory allocation is difficult to manage deterministically with respect to real-time demands. As of version 2.2, FSMLabs has added a fast real-time safe memory allocator that can work with a variety of allocation usage patterns. This is documented in more detail in section ??.

Aside from using this allocator, there are also other alternatives for real-time code:

• Allocate memory in initialization code, during execution of main(). As this is in the startup context, not in a real-time thread, interrupt handler, or otherwise, it is perfectly safe. Also, the memory could be declared as static to the module.

• Soft-IRQs: Soft-IRQs are discussed in detail later, but this approach involves creating a virtual IRQ that is visible to the GPOS. Using this IRQ, real-time code can signal to the non real-time system that it needs memory, and a handler on the other side safely takes care of the possible blocking when allocating the memory. When this operation completes, a signal is sent back to the RTCore code. There may be any amount of delay before this handler gets to do the work, though.

This point is important, and should be considered carefully. Many users need to do work involving memory allocation and do not understand when it is safe to do what. Whenever possible, perform allocations within main(), along with any potentially blocking kernel calls, such as PCI device initialization. The provided allocator can help, but the user should still know how much memory should be prefilled in each pool to be safe.

This also includes RTCore calls such as ftruncate(fd, size) on FIFOs and shared memory, which involve a kernel memory allocation to create space for device data. Put simply, RTLinux applications cannot safely perform allocations from within real-time code, including threads and interrupt handlers. Calls chaining directly from main() are safe to allocate from.

Chapter 4 will demonstrate how to use preallocated memory in order to spawn new threads from within real-time code. This involves exactly what is described here: allocating a block of memory before entering the real-time system, and then using it safely later on, when in the context of real-time code.
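The "allocate during initialization, use later" pattern can be sketched as a fixed-size block pool. This is a generic illustration in standard C, not the allocator shipped with RTCore; struct block_pool and the pool_*() functions are names made up for this example. It also assumes a single user at a time, so a real-time design would add whatever locking or per-thread ownership it needs.

```c
#include <stddef.h>
#include <stdlib.h>

struct block_pool {
    void *storage;      /* one large allocation made at initialization */
    void **free_list;   /* stack of pointers to currently free blocks */
    size_t free_count;
    size_t capacity;
};

/*
 * Initialization context only: this is where blocking in malloc() is
 * acceptable. block_size should be a multiple of the strictest
 * alignment the stored objects need. Returns 0 on success.
 */
int pool_init(struct block_pool *p, size_t block_size, size_t count)
{
    size_t i;

    p->storage = malloc(block_size * count);
    p->free_list = malloc(count * sizeof(void *));
    if (!p->storage || !p->free_list) {
        free(p->storage);
        free(p->free_list);
        return -1;
    }
    for (i = 0; i < count; i++)
        p->free_list[i] = (char *)p->storage + i * block_size;
    p->free_count = count;
    p->capacity = count;
    return 0;
}

/* Constant time and allocation-free. Returns NULL when the pool is empty. */
void *pool_get(struct block_pool *p)
{
    return p->free_count ? p->free_list[--p->free_count] : NULL;
}

/* Constant time and allocation-free. Ignores returns beyond capacity. */
void pool_put(struct block_pool *p, void *block)
{
    if (p->free_count < p->capacity)
        p->free_list[p->free_count++] = block;
}

/* Cleanup context only. */
void pool_destroy(struct block_pool *p)
{
    free(p->free_list);
    free(p->storage);
}
```

The time-critical path never calls the system allocator. The price is that the pool size has to be chosen in advance, which is the same sizing question raised above for prefilled pools.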

3.4.2 Memory protection

FSMLabs provides PSDD, which is a means of exporting the RTCore hard real-time API to userspace. This allows real-time threads to have the benefit of working safely in different address spaces. More details on PSDD can be found in section 12.

3.4.3 VxWorks compliance

VxIT provides hard real-time compliance with a broad selection of VxWorks APIs. Users who are moving from legacy systems to a modern hard real-time OS can continue to use the familiar VxWorks calls within the RTCore environment. Details on this can be found in section 13.

3.4.4 Networking - Ethernet and FireWire

RTLinuxPro offers a component called LNet that allows real-time networking, from raw packets up to UDP, over Ethernet or FireWire. This allows one to easily create and send raw packets destined for the network, for hard real-time communication with other machines. Both transport mediums are in heavy field use by FSMLabs’ customers.

Of course, RTLinux applications can still interact with other machines without LNet, but the networking stacks are all dependent on the GPOS. The traditional path is for data to be collected by real-time code, pushed over a FIFO or shared memory to userspace, which then does any packaging work and pushes it through the network stack via a socket. With LNet, the real-time data can be collected and dumped to the hardware through a zero-copy interface immediately, allowing deterministic network transfers between machines, and saving the trouble of going to userspace and back through kernelspace. Later chapters will cover this in detail, but for now begin to consider the idea that individual real-time systems do not have to operate without real-time assistance from other processing nodes.

3.4.5 Integration with other services

Covered in a later chapter, the Controls Kit offers a means of integrating low-level real-time systems with the rest of the organization. Components that use the Controls Kit can be directed through web interfaces, Java GUIs, Excel spreadsheets, and other systems. This simplifies integration with existing infrastructure. For example, now it is easy to let a custom Oracle database retain statistical information on how a company’s machine floor devices are operating.

3.4.6 What’s next

The next chapter will cover the RTCore programming API. As it is POSIX-based, it provides few surprises, but it needs some coverage before diving into more advanced topics and techniques. At least skimming the API sections is recommended even if the reader is familiar with POSIX, because there are some areas that RTCore’s API covers but POSIX does not handle. After this chapter, there will be many more examples and techniques for more advanced work.

Chapter 4

The RTCore API

The RTCore API is POSIX-based with some extensions. The development of the API continues to evolve to reflect new needs in the industry, but compatibility with previous releases is provided. Current efforts include continued POSIX compliance, along with some extensions to cover needs either not mentioned by POSIX, or not sufficiently addressed in current standards specifications.

The functions listed here are not the entire API provided by RTCore. This section is intended to be a broad overview of the system, but is not a full listing of all interfaces to the system. Additionally, keep in mind that other components add additional APIs - LNet provides sockets, Controls Kit adds XML integration functions, VxIT adds VxWorks compliance, and so on.

4.1 POSIX compliance

To ease the real-time learning curve, FSMLabs long ago moved RTLinux (and thus RTCore) to a POSIX-compliant API. Most developers learning a real-time system have a solid programming background, and only need to adjust to the specific API set provided by the RTOS. With RTCore, this adjustment comes easily, since code under RTCore looks familiar to just about anyone who has used a UNIX based OS.

It should be noted that the POSIX standard has evolved and will continue to do so, but in a controlled manner. FSMLabs will continue to maintain POSIX compliance in light of new developments. Existing POSIX-based systems are easily moved to RTCore, although source-code compatibility with other RTOSs should not be expected. Source compatibility is provided when moving between RTLinuxPro and RTCoreBSD, as both use the same POSIX API.

RTCore provides POSIX extensions when needed (indicated by an _np in the name) to implement features that fall outside of the POSIX domain. These are mainly relegated to performance improvements in areas such as SMP where POSIX does not provide full guidance. Some of these may not be an option for those in strict development environments, but it is up to the programmer to determine the best approach.

4.1.1 The POSIX PSE 51 standard

The guiding standard for the RTCore API was POSIX PSE 51, a minimum set of POSIX threading functions for real-time and embedded systems. Programmers that have learned the various pthread_*() calls for normal threading and synchronization will have the same function set they are used to. The major shift involved will be to keep in mind the constraints of timing-specific real-time code, such as scheduling, minimalism, and other real-time-specific demands, but the programmer will not be burdened with learning a new API.

RTCore’s API is designed to be used from within real-time code, so as a user, calling a POSIX function means that the user is entering a hard real-time function that was designed to be used in this fashion. However, due to some interactions with the GPOS, a few calls can only be used from an initialization context. Please refer to the RTCore man pages for specifics.

4.1.2 Roadmap to future API development

The RTCore OS will continue to follow the POSIX standard in order to maintain a proper model for the developer community. At the same time, there will continue to be a need for extensions, as POSIX does not cover all of the possible industry needs. Sometimes these ideas are moved into later versions of the POSIX standard. Some are specific to a certain system configuration, such as SMP systems, where CPU affinity calls are needed for performance reasons. In some cases, extensions are added in order to simplify work that could be done with standard calls. These extensions are presented as an option that may facilitate development, but most work revolves around the POSIX calls.


4.2 POSIX threading functions

Presented here are the POSIX functions available from RTCore, a brief description of what they do, and some notes with respect to real-time usage. These calls are used throughout the examples in the book and should be able to give a good practical grasp of their usage to the developer. For specific notes, refer to the man pages provided in various forms with RTLinuxPro. [1]

4.2.1 Thread creation

int pthread_create(pthread_t *thread, pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);

This will create a thread whose handle is stored in *thread. The thread’s execution will begin in the start_routine() function with the argument arg. Attributes controlling the thread are specified by attr and will use the default values and create a stack internally if this value is NULL.

Note that pthread_create() calls are generally limited to being within the initialization context of main(). If the call is needed during normal real-time operation, threads can be created with preallocated stack space. Otherwise, calling pthread_create() from another real-time thread would at worst cause deadlock and at best delay the first real-time thread for an unknown amount of time while memory is allocated for the stack.

There is an attribute function (pthread_attr_setstackaddr()) that allows a thread to be prepared with a preallocated stack for operation. The following example shows use of this function.

#include <time.h>
#include <pthread.h>
#include <stdio.h>

pthread_t thread1, thread2;
void *thread_stack;

void *handler(void *arg)
{
    printf("Thread %d started\n", (int)(long)arg);

    if (arg == 0) { /* first thread spawns the second */
        pthread_attr_t attr;

        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 32768);
        /* use the stack preallocated in main(); nothing is allocated here */
        pthread_attr_setstackaddr(&attr, thread_stack);
        pthread_create(&thread2, &attr, handler, (void *)1);
    }

    return 0;
}

int main(int argc, char **argv)
{
    /* allocate the second thread's stack while blocking is still safe */
    thread_stack = rtl_gpos_malloc(32768);
    if (!thread_stack)
        return -1;

    pthread_create(&thread1, NULL, handler, (void *)0);

    rtl_main_wait();

    pthread_cancel(thread1);
    pthread_join(thread1, NULL);
    pthread_cancel(thread2);
    pthread_join(thread2, NULL);
    rtl_gpos_free(thread_stack);
    return 0;
}

[1] For a full description of the POSIX threading API concepts and usage, refer to the O’Reilly book on PThreads Programming or the POSIX standard directly.

This again demonstrates the point that anything outside of the main() call cannot directly allocate memory, at least without using the real-time allocator provided with version 2.2, also documented in this book. Instead, allocation of memory is done with rtl_gpos_malloc() [2] in main(), where it is safe to block while the system handles any work associated with the allocation, such as defragmentation. Note that on some architectures a global static value may not be a safe place to store the stack of a running thread.

[2] rtl_gpos_malloc() uses the correct malloc() available on the host GPOS.

Next, a real-time thread is spawned. Within the handler function, it initializes an attribute and configures it to use our preallocated area for the stack. Finally, the thread is spawned and execution occurs just as POSIX calls are expected to behave, with the exception being that the stack is already present. Note: A thread created with pthread_create() is not guaranteed to be started when the call returns. It is just slated for initial scheduling.

Note that thread stacks in RTCore are static and will not grow as needed depending on call sequence. Users need to make sure that they create enough stack space for the thread and prevent too many large structures from being placed on the stack. In a system that allows for dynamic memory management and the possible delays incurred by doing so, stacks can dynamically grow as the application needs space. Under RTCore, growing the stack would require the program to wait while proper memory is found, possibly destroying real-time performance. Instead, the stack is allocated at thread creation and does not grow.

This stack is generally only a couple dozen kilobytes in size, but users with large data structures in function contexts need to understand that these structures can soak up available stack space very quickly, causing an overflow. If a thread has a 20K stack and calls a function 3 times recursively, with a local structure of 7K per invocation, an overflow will occur. Smaller structures should be used, or large structures should be kept off the stack. Another option is to enlarge the thread’s stack to compensate.
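The arithmetic behind the overflow example is easy to check. The helper below is a rough estimate only, written for this illustration (the names stack_needed() and stack_fits() are not part of any API). It counts one frame per nested call plus a caller-supplied allowance for everything else, and it takes "K" to mean 1024 bytes. Real frames also hold return addresses, saved registers and padding, so the true requirement is higher than this estimate.

```c
#include <stddef.h>

/* Estimated stack use: one frame per nested call plus a fixed allowance. */
size_t stack_needed(size_t depth, size_t frame_bytes, size_t overhead_bytes)
{
    return depth * frame_bytes + overhead_bytes;
}

/* Nonzero if the estimate fits into a stack of stack_bytes bytes. */
int stack_fits(size_t stack_bytes, size_t depth, size_t frame_bytes,
               size_t overhead_bytes)
{
    return stack_needed(depth, frame_bytes, overhead_bytes) <= stack_bytes;
}
```

For the case in the text, stack_needed(3, 7 * 1024, 0) is 21504 bytes, which already exceeds a 20K (20480 byte) stack before any overhead is counted. The 32768 byte stack used in the earlier listing would hold the same three frames with room to spare.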

4.2.2 Thread joining

int pthread_join(pthread_t thread, void **arg);

This function joins a running thread and stores the return value into arg. It has no restriction on the length of time it takes to complete. If the thread has already completed, this call returns immediately; otherwise it blocks until the intended thread exits. As expected, this frees resources associated with the thread, such as the stack, if it was not configured by hand. Looking at the previous example, it can be seen that this call was used to join both a preallocated stack thread and a normal thread. It cleans up the resources for both, except for the stack on the second thread, which was explicitly freed.

int pthread_detach(pthread_t thread);


The pthread_detach() call will 'unhook' a running thread whose status was previously joinable. After the thread is detached, it is no longer joinable and needs no further management. Its resources will be cleaned up on thread completion.
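As an illustration of the two calls, the sketch below joins one thread to collect its return value and detaches another. It uses only standard POSIX calls; the function names compute(), join_result() and spawn_detached() are hypothetical helpers, not RTCore API.

```c
#include <pthread.h>
#include <stdint.h>

/* Thread body: doubles its argument and hands the result back as the
 * thread's return value. */
void *compute(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * 2);
}

/* Thread body that does nothing; used for the detached case. */
void *idle_body(void *arg)
{
    (void)arg;
    return NULL;
}

/* Spawns a joinable thread and returns the value collected by
 * pthread_join(), or -1 if creation or joining failed. */
long join_result(long input)
{
    pthread_t tid;
    void *ret = NULL;

    if (pthread_create(&tid, NULL, compute, (void *)(intptr_t)input) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0)
        return -1;
    return (long)(intptr_t)ret;
}

/* Spawns a thread and detaches it. After a successful detach the thread
 * must not be joined; its resources are reclaimed when it finishes.
 * Returns 0 on success, otherwise an error code (or -1). */
int spawn_detached(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, idle_body, NULL) != 0)
        return -1;
    return pthread_detach(tid);
}
```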

4.2.3 Thread destruction

int pthread_cancel(pthread_t thread);

This will cancel the thread specified by the given parameter. There are many caveats to this, as specified in the full man page. A cancelled thread works through its cancel handlers, but it is not required to release any mutex locks it holds at the point of cancellation (though releasing these locks is a good idea for a stable system). Also, it may not cancel immediately, depending on the state the thread is in at the point of the call. The target thread will continue to execute until it enters a cancellation point, at which time it will begin to unwind itself through its registered cancel handlers. For most users, pthread_cancel() followed by a pthread_join() is the most effective means of shutting down real-time code from within the tail end of main().

4.2.4 Thread management

pthread_t pthread_self(void);

This is a very simple function, generally used by threads to get their own thread handle for further calls.

int pthread_setcancelstate(int state, int *oldstate);

int pthread_setcanceltype(int type, int *oldtype);

Threads may use pthread_setcancelstate() to disable cancellation for themselves. The previous state is stored in the oldstate variable. Likewise, the pthread_setcanceltype() call is used to select the type of cancellation used, either PTHREAD_CANCEL_DEFERRED or PTHREAD_CANCEL_ASYNCHRONOUS. However, in real-time environments most systems have a minimal set of simple, continuous threads and do not make heavy use of cancellation calls.
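A minimal sketch of how a thread might shield a short critical section from cancellation is shown here. It relies only on standard POSIX behavior; guarded_worker() and cancel_after_critical() are illustrative names, and the 1 ms nanosleep() merely stands in for real work.

```c
#include <pthread.h>
#include <time.h>

static int critical_done;   /* written by the worker, read after the join */

void *guarded_worker(void *arg)
{
    int old;
    struct timespec nap = { 0, 1000000 };   /* 1 ms */

    (void)arg;

    /* Hold back any cancellation request while the critical work runs. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
    nanosleep(&nap, NULL);      /* stands in for the critical section */
    critical_done = 1;
    pthread_setcancelstate(old, &old);

    /* Cancellation is possible again: a pending request is acted on at
     * the next cancellation point. */
    for (;;) {
        pthread_testcancel();
        nanosleep(&nap, NULL);
    }
}

/* Cancels the worker immediately after creating it. Returns 1 if the
 * thread was cancelled only after finishing its critical section. */
int cancel_after_critical(void)
{
    pthread_t tid;
    void *ret = NULL;

    critical_done = 0;
    if (pthread_create(&tid, NULL, guarded_worker, NULL) != 0)
        return -1;
    pthread_cancel(tid);
    pthread_join(tid, &ret);
    return ret == PTHREAD_CANCELED && critical_done == 1;
}
```

Even though the cancel request is issued right away, the request stays pending until the worker re-enables cancellation, so the critical work always completes.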


void pthread_testcancel(void);

This call ensures that any pending cancellation requests are delivered to the thread. It has little use in real-time applications because cancellation requests must be a deterministic call in the first place. If there are ambiguities present in the code, it may be better to remove them rather than being forced to check if the real-time thread should continue.

int pthread_kill(pthread_t thread, int signo);

pthread_kill() sends the signal specified by signo to the thread specified. This function is fast and deterministic if called on a thread running on the local CPU, but there can be a delay when signalling a thread on a remote CPU.

4.2.5 Thread attribute functions

In addition to the normal thread calls, RTCore also includes the pthread_attr_*() functions which control attributes of a thread. These functions behave as they would in any other situation. Please refer to the standard documentation for more detail.

int pthread_attr_init(pthread_attr_t *attr);

int pthread_attr_destroy(pthread_attr_t *attr);

These two functions initialize and destroy attribute objects, respectively. Attribute objects should be created and destroyed with these calls.

int pthread_attr_setstacksize(pthread_attr_t *attr,

size_t stacksize);

int pthread_attr_getstacksize(pthread_attr_t *attr,

size_t *stacksize);

Programmers can use these calls to manipulate the stack size of the thread the attribute is tied to. Note that all manipulation of stack sizes needs to be done within main() unless memory for that thread has already been preallocated within main(). Preallocated memory for a thread is shown in the example in section 4.2.1. If these attributes are not set, the RTCore OS will handle the stack manipulation internally (see footnote 3).

Again, note that thread stacks under RTCore are static and will not grow as needed based on what functions are called. Users need to ensure that they have enough stack space for their thread from the start. Section 4.2.1 has more details.

int pthread_attr_setschedparam(pthread_attr_t *attr,

const struct sched_param *param);

int pthread_attr_getschedparam(pthread_attr_t *attr,

struct sched_param *param);

As with normal POSIX threads, these two routines determine scheduling parameters as driven by the contents of the param parameter. Also, as usual, use the sched_get_priority_min() and sched_get_priority_max() calls with the appropriate scheduling policy to get the priority ranges. SCHED_FIFO is the default scheduling mechanism and while it does not have to be specified, it is helpful to ensure forward compatibility.
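The sketch below shows one way the priority range query and the attribute calls might be combined. The helper init_fifo_attr() is a hypothetical name, and pthread_attr_setschedpolicy() is a standard POSIX call that is not discussed in this section; it is used here to name SCHED_FIFO explicitly, as the text recommends.

```c
#include <pthread.h>
#include <sched.h>

/* Initializes `attr` for SCHED_FIFO with a priority in the middle of the
 * valid range. Returns the chosen priority, or -1 on error. On success
 * the caller owns `attr` and must destroy it. */
int init_fifo_attr(pthread_attr_t *attr)
{
    struct sched_param param = { 0 };
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo == -1 || hi == -1)
        return -1;
    param.sched_priority = lo + (hi - lo) / 2;

    if (pthread_attr_init(attr) != 0)
        return -1;

    /* Set the policy first so the priority is validated against it. */
    if (pthread_attr_setschedpolicy(attr, SCHED_FIFO) != 0 ||
        pthread_attr_setschedparam(attr, &param) != 0) {
        pthread_attr_destroy(attr);
        return -1;
    }
    return param.sched_priority;
}
```

Querying the range at run time avoids hard-coding priority numbers, which differ between systems.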

int pthread_attr_setstackaddr(pthread_attr_t *attr,

void *stackaddr);

int pthread_attr_getstackaddr(pthread_attr_t *attr,

void **stackaddr);

The above calls are important when creating threads from within the real-time kernel. By using these calls to manage the stack address, one can create threads from inside the real-time kernel. These calls were used in the previous example in Section 4.2.1.

int pthread_attr_setdetachstate(pthread_attr_t *attr,

int detachstate);

int pthread_attr_getdetachstate(const pthread_attr_t *attr,

int *detachstate);

Use these two calls to set or query a thread's detach state, which is either PTHREAD_CREATE_JOINABLE or PTHREAD_CREATE_DETACHED. Alternatively, the pthread_detach() call can be used to alter a running thread's state.

3. Version 2.2 provides a real-time safe memory allocator, which can be used to satisfy allocation requirements such as these if needed.


4.3 Synchronization

4.3.1 POSIX spinlocks

RTCore provides support for the POSIX spinlock functions too. The API is much like other POSIX objects. The initialization/destruction set is as follows:

int pthread_spin_init(pthread_spinlock_t *lock, int pshared);

int pthread_spin_destroy(pthread_spinlock_t *lock);

As with other similar calls, these initialize or destroy a given spinlock. The following calls are also supported:

int pthread_spin_lock(pthread_spinlock_t *lock);

int pthread_spin_trylock(pthread_spinlock_t *lock);

int pthread_spin_unlock(pthread_spinlock_t *lock);

These calls allow the programmer to take a lock, try to take a lock but return if the lock is already held, and unlock a given spinlock, respectively. These behave like other spinlocks in that they will spin a given thread in a busy loop waiting for the resource, rather than putting it on a wait queue to be woken up later.

As a result, the same spinlock caveats apply. They are generally preferable to other synchronization methods when the given thread will spin a shorter amount of time waiting than the sum of the work involved in putting it on a queue, any associated locking, and waking up appropriately when the resource becomes available. In a real-time system, it is also of course important that the resource is available quickly so the thread does not lose determinism due to a faulty locking chain in other subsystems.
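To make the usage pattern concrete, here is a small sketch in which two threads protect a shared counter with a spinlock and keep the held section as short as possible. It uses standard POSIX calls only; bump(), spin_count() and trylock_while_held() are illustrative names.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes the spinlock API on strict builds */
#include <errno.h>
#include <pthread.h>

static pthread_spinlock_t counter_lock;
static long counter;

/* Increments the shared counter *arg times, one short lock per step. */
void *bump(void *arg)
{
    long i, n = *(long *)arg;

    for (i = 0; i < n; i++) {
        pthread_spin_lock(&counter_lock);
        counter++;                        /* keep the held section tiny */
        pthread_spin_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two incrementing threads and returns the final counter value. */
long spin_count(long per_thread)
{
    pthread_t a, b;

    counter = 0;
    pthread_spin_init(&counter_lock, PTHREAD_PROCESS_PRIVATE);
    pthread_create(&a, NULL, bump, &per_thread);
    pthread_create(&b, NULL, bump, &per_thread);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_spin_destroy(&counter_lock);
    return counter;
}

/* Shows that trylock returns at once with EBUSY when the lock is held. */
int trylock_while_held(void)
{
    pthread_spinlock_t l;
    int rc;

    pthread_spin_init(&l, PTHREAD_PROCESS_PRIVATE);
    pthread_spin_lock(&l);
    rc = pthread_spin_trylock(&l);
    pthread_spin_unlock(&l);
    pthread_spin_destroy(&l);
    return rc;
}
```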

4.3.2 Comments on SMP safe/unsafe functions

The functions described here are inherently safe in SMP situations, although there are real-time considerations. For calls that target threads running on other CPUs, there may be a delay in getting the signal to the running code. pthread_cancel() and pthread_kill() are two examples of this. When sending a signal to code on the current CPU the code is fast and deterministic, but may delay slightly when targeting a 'remote' thread. While in normal situations this is unimportant, the incurred delay may have repercussions for real-time code. Keep these factors in mind when writing the real-time component of an application. It may help to reconfigure which CPUs run which threads.

4.3.3 Asynchronously unsafe functions

Some functions are not asynchronously safe, at least in a real-time environment. To ensure correct behavior, pthread_cancel() is not recommended for threads that use any of these functions. What is meant by 'asynchronously unsafe' is that these calls may leave the system in an unknown state if the call is interrupted in the middle of execution. An example would be a function that locks several mutexes in order to do work, and installs no cleanup handlers. If the call is halfway through and is cancelled by a remote pthread_cancel() call, that thread will exit while holding some mutexes, potentially blocking other threads indefinitely.

It is possible to handle mutex cleanups in a safe manner if one pushes cleanup handlers for all shared resources, but this is complicated. Extreme care must be taken to ensure that held resources are freed in a manner that doesn't incur locking, and that everything is cleaned properly for every possible means of failure. Failing to get this correct will leave all waiting threads blocked forever, as the cancelled thread will terminate with locked resources left behind.

4.3.4 Cancel handlers

These calls are difficult to get right in all cases, and many developers do not come into contact with them too often. In the interest of sidestepping future confusion and grounding the discussion, a short example will be shown next.

Put simply, cancel handlers are functions attached to a running thread as hooks, and they are executed in the case that a thread is cancelled while a resource is held. The handlers are pushed on as a stack, so that if the thread is cancelled, the handlers are executed in the reverse of the order in which they were pushed (last in, first out).

Also, a cancelled thread does not execute cleanup functions at the time the cancel is received. Rather, it continues execution until it enters a 'cancellation point', which is generally a blocking function. Refer to the POSIX specification for specific cancellation points, but this generally means that code will continue to execute until it hits a blocking call like pthread_cond_wait(). Below is an example:

#include <time.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>

pthread_t thread;
pthread_mutex_t mutex;

/* Runs if the thread is cancelled while the handler is still pushed. */
void cleanup_handler(void *mutex)
{
    pthread_mutex_unlock((pthread_mutex_t *)mutex);
}

void *thread_handler(void *arg)
{
    /* Push the handler before taking the lock. */
    pthread_cleanup_push(cleanup_handler, &mutex);
    pthread_mutex_lock(&mutex);
    while (1) { usleep(1000000); }
    pthread_cleanup_pop(0);
    pthread_mutex_unlock(&mutex);
    return 0;
}

int main(void)
{
    pthread_mutex_init(&mutex, NULL);
    pthread_create(&thread, NULL, thread_handler, 0);

    /* Block until the application is asked to shut down. */
    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    pthread_mutex_destroy(&mutex);
    return 0;
}

This code correctly handles the cancellation problem. The initialization code creates a mutex and spawns a thread. This thread correctly pushes a cleanup handler on the stack before it locks the mutex and then enters a while loop. Now the mutex is locked indefinitely and any cancellation must cause the mutex to be unlocked. If the application is cancelled with CTRL-C at the command line, it induces the cancel and cleanup handler, causing a proper exit.

Note again when cancellation points occur. If the code pushes the cancel handler on, but the thread is cancelled asynchronously before it actually locks the mutex, the thread will continue to run until it enters a cancellation point. It will continue to execute, running through the cleanup handler push and the mutex lock. Once it locks the mutex, the thread is cancellable, the signal will be delivered and the handler will be called from a known point. Think of cancellation points as being places where the system checks to see if it should stop and clean up.

Consider the case without the cleanup handler. Reviewing the previous example, once the thread locks the mutex and another process asynchronously cancels the thread, the thread will still wait for a cancellation point, but without the handler, it will exit with the mutex held and any other code that depends on it will be blocked indefinitely. Now imagine what happens if multiple resources were held at various times, depending on the call chain. Any lockable resource that is not attached to a cleanup handler properly can cause a deadlock if the holding thread is cancelled.

As can be seen, while there are mechanisms to avoid cancellation problems, care must be taken to make sure that everything is handled properly. Failure to do so in every possible cancel situation will result in system deadlock. With a real-time system, this can be disastrous.

4.4 Mutexes

The POSIX-style mutexes are also available to real-time programmers as a means of controlling access to shared resources. As timing is critical, it is important that mutexes are handled in such a way that blocking will not impede correct operation of the real-time application.


4.4.1 Locking and unlocking mutexes

int pthread_mutex_lock(pthread_mutex_t *mutex);

As with the standard POSIX call, this locks a mutex, allowing the caller to know that it is safe to work on whatever resources the mutex protects. In a real-time context, the time a mutex is held must be short because long hold times could cause serious delays in other waiting threads.

int pthread_mutex_trylock(pthread_mutex_t *mutex);

The pthread_mutex_trylock() call will attempt to lock a mutex and will return immediately, whether it gets the lock or not. Based on the return value, one can tell whether the lock is held and take appropriate action. For applications that may not be able to wait for a lock indefinitely, this is a way to avoid long delays.

int pthread_mutex_timedlock(pthread_mutex_t *mutex,

const struct timespec *abstime);

Similar to the above pthread_mutex_trylock() function, pthread_mutex_timedlock() provides a way to attempt to grab a lock with an upper bound on the length of the wait. The bound is given as an absolute time in abstime. If the mutex becomes available before that time has passed, it is locked by the caller and the call succeeds. If the time passes and the mutex cannot be locked by the caller, the function returns with an error so that the caller can recover appropriately.
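The following sketch illustrates the bounded wait. It assumes standard POSIX semantics, in which the deadline is an absolute CLOCK_REALTIME time and a timeout is reported as ETIMEDOUT; the names busy, try_for() and timedlock_result() are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

static pthread_mutex_t busy = PTHREAD_MUTEX_INITIALIZER;

/* Tries to take `busy`, giving up *arg milliseconds from now.
 * The pthread_mutex_timedlock() result is returned as the thread value. */
void *try_for(void *arg)
{
    long timeout_ms = *(long *)arg;
    struct timespec deadline;
    int rc;

    /* Build an absolute deadline: now + timeout. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    deadline.tv_sec += timeout_ms / 1000 + deadline.tv_nsec / 1000000000L;
    deadline.tv_nsec %= 1000000000L;

    rc = pthread_mutex_timedlock(&busy, &deadline);
    if (rc == 0)
        pthread_mutex_unlock(&busy);
    return (void *)(intptr_t)rc;
}

/* If `hold` is non-zero the caller keeps the mutex locked while the
 * helper thread waits, which forces the timeout path. */
int timedlock_result(int hold, long timeout_ms)
{
    pthread_t tid;
    void *rc = NULL;

    if (hold)
        pthread_mutex_lock(&busy);
    pthread_create(&tid, NULL, try_for, &timeout_ms);
    pthread_join(tid, &rc);
    if (hold)
        pthread_mutex_unlock(&busy);
    return (int)(intptr_t)rc;
}
```

A caller that gets ETIMEDOUT knows the protected resource was not acquired and can skip the cycle or take a fallback path instead of stalling.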

int pthread_mutex_unlock(pthread_mutex_t *mutex);

This unlocks a held mutex. It signals a wakeup to those threads that are blocked on the mutex.

4.4.2 Mutex creation and destruction

int pthread_mutex_init(pthread_mutex_t *mutex,
                       const pthread_mutexattr_t *attr);

int pthread_mutex_destroy(pthread_mutex_t *mutex);

As with normal POSIX calls, pthread_mutex_init() initializes a given mutex. If a pthread_mutexattr_t is provided, it will use it; otherwise a default attribute set will be created and attached. The second call destroys an existing mutex, assuming that it is in a proper state and not already locked. Destroying a mutex that is in use will result in an error to the caller.


4.4.3 Mutex attributes

int pthread_mutexattr_init(pthread_mutexattr_t *attr);

int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);

pthread_mutexattr_init() is used to initialize a given mutex attribute with the normal values and pthread_mutexattr_destroy() is used to destroy an already existing attribute.

int pthread_mutexattr_settype(

pthread_mutexattr_t *attr,

int type);

int pthread_mutexattr_gettype(

pthread_mutexattr_t *attr,

int *type);

The first call allows the programmer to set the type of mutex used. For example, the type can be either PTHREAD_MUTEX_NORMAL, which implies normal mutex blocking, or PTHREAD_MUTEX_SPINLOCK_NP, which will force the mutex to use spinlock semantics when attempting to grab a lock. The second call, pthread_mutexattr_gettype(), will return the type previously set or the default value.

int pthread_mutex_setprioceiling(

pthread_mutex_t *mutex,

int prioceiling,

int *old_ceiling);

int pthread_mutex_getprioceiling(

const pthread_mutex_t *mutex,

int *prioceiling);

pthread_mutex_setprioceiling() sets the priority ceiling for the given mutex, returning the old value in old_ceiling. This call blocks until the mutex can be locked for modification. pthread_mutex_getprioceiling() returns the current ceiling. More detail on priority ceilings will follow later.

4.5 Condition variables

The condition variable calls throughout the RTLinux API are the same as those used in normal POSIX environments. Keep in mind that if a thread waiting on a condition variable is cancelled while blocked in either pthread_cond_wait() or pthread_cond_timedwait(), the associated mutex is reacquired by the cancelled thread. To prevent deadlocks, a cleanup handler that will unlock all acquired mutexes must be installed. Reacquiring the associated mutex will take place before the cleanup handlers are called.

4.5.1 Creation and destruction

int pthread_cond_init(pthread_cond_t *cond,

const pthread_condattr_t *attr);

int pthread_cond_destroy(pthread_cond_t *cond);

A condition variable must be created and destroyed just like any other object. Note that there is an attribute object that is specific to condition variables and can be used to drive the behavior of the variable.

4.5.2 Condition waiting and signalling

int pthread_cond_wait(pthread_cond_t *cond,

pthread_mutex_t *mutex);

The caller of pthread_cond_wait() waits for the condition specified by cond to occur and coordinates usage with the mutex parameter. The mutex must be held at the time of the call, at which point it is released so that other threads can cause the condition to occur, also using the mutex. When the call returns, signalling that the condition has occurred, the mutex is again held by the caller. The associated mutex must be released after the critical section is complete.

int pthread_cond_timedwait(pthread_cond_t *cond,

pthread_mutex_t *mutex,

const struct timespec *abstime);

As with pthread_cond_wait(), this call waits for a condition to happen, protected by a mutex. In this version, it will only wait until the absolute time specified by abstime. Based on the return value, the caller can determine whether the call succeeded and the condition occurred or if time ran out.

int pthread_cond_broadcast(pthread_cond_t *cond);

int pthread_cond_signal(pthread_cond_t *cond);


The pthread_cond_broadcast() function broadcasts a condition signal to all waiting threads, and the pthread_cond_signal() function signals a single thread waiting on a condition variable. Note that the caller of these functions does not need to hold the mutex that waiting threads have associated with the condition variable.
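A minimal sketch of the wait and signal handshake follows. It uses standard POSIX calls; the names cv_lock, ready_cv, ready, payload, producer() and wait_for_payload() are assumptions made for this example. The waiter re-checks a predicate in a loop because POSIX permits pthread_cond_wait() to return without a matching signal.

```c
#include <pthread.h>

static pthread_mutex_t cv_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static int ready;      /* predicate protected by cv_lock */
static int payload;    /* data protected by cv_lock */

/* Publishes a value, then wakes one waiter. */
void *producer(void *arg)
{
    pthread_mutex_lock(&cv_lock);
    payload = *(int *)arg;
    ready = 1;
    pthread_mutex_unlock(&cv_lock);

    /* The mutex does not need to be held while signalling. */
    pthread_cond_signal(&ready_cv);
    return NULL;
}

/* Starts a producer and blocks until its value is available. */
int wait_for_payload(int value)
{
    pthread_t tid;
    int got;

    ready = 0;   /* no other thread exists yet, so no lock is needed */
    pthread_create(&tid, NULL, producer, &value);

    pthread_mutex_lock(&cv_lock);
    while (!ready)   /* the mutex is released while waiting */
        pthread_cond_wait(&ready_cv, &cv_lock);
    got = payload;   /* the mutex is held again here */
    pthread_mutex_unlock(&cv_lock);

    pthread_join(tid, NULL);
    return got;
}
```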

4.5.3 Condition variable attribute calls

int pthread_condattr_init(pthread_condattr_t *attr);

int pthread_condattr_destroy(pthread_condattr_t *attr);

The attribute object calls for condition variables are no different than any other attribute calls. The same object creation and destruction methods apply.

int pthread_condattr_getpshared(

const pthread_condattr_t *attr,

int *pshared);

int pthread_condattr_setpshared(

pthread_condattr_t *attr,

int pshared);

Relative to thread and other object types, there is not much that can be modified for condition variable attributes. These calls get and set a condition variable's process-shared status. No other methods apply to this type.

4.6 Semaphores

RTCore semaphores look just like POSIX semaphores. As with condition variables, if a thread is cancelled while it holds a process-shared semaphore, this semaphore will never be released and consequently, a deadlock situation can occur. It is the programmer's responsibility to ensure that semaphores are handled properly in cleanup handlers.

Signals that interrupt sem_wait() and sem_post() will terminate these functions so that neither acquiring nor releasing the semaphore is accomplished. The function call interrupted by a signal will return with value EINTR.


4.6.1 Creation and destruction

int sem_init(sem_t *sem, int pshared,

unsigned int value);

int sem_destroy(sem_t *sem);

As with the mutex functions, these functions will detect in-use semaphores and other problems that could cause unpredictable behavior. Refer to the examples in rtlinuxpro/examples and full documentation for more details on their use. In general, they behave as they would in any other POSIX environment.

4.6.2 Semaphore usage calls

int sem_getvalue(sem_t *sem, int *sval);

This function will store the current value of the semaphore in the sval variable.

int sem_post(sem_t *sem);

sem_post() increases the count of the semaphore. It never blocks, although it may induce an immediate switch if posting to a semaphore that a higher priority thread is waiting for, even if it is called from an interrupt handler.

int sem_wait(sem_t *sem);

int sem_trywait(sem_t *sem);

int sem_timedwait(sem_t *sem,

const struct timespec *abs_timeout);

These are the calls used to force a wait until the semaphore reaches a non-zero count and operate in the same way the mutex wait calls do. The sem_wait() call blocks the caller until a non-zero count is reached and the sem_trywait() call does the same without blocking, returning EAGAIN if the count was 0. The sem_timedwait() call blocks at most until the absolute time specified by abs_timeout.
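The non-blocking path can be illustrated with a short sketch. It follows standard POSIX conventions, under which sem_trywait() returns -1 and reports EAGAIN through errno; whether RTCore returns the error value directly, as the wording above suggests, is not verified here. The function name sem_demo() is illustrative.

```c
#include <errno.h>
#include <semaphore.h>

/* Returns 1 if sem_trywait() fails with EAGAIN on an empty semaphore
 * and succeeds after a sem_post(); returns -1 if setup fails. */
int sem_demo(void)
{
    sem_t sem;
    int value = -1;
    int empty_rc, empty_errno, full_rc;

    if (sem_init(&sem, 0, 0) != 0)    /* not process-shared, count 0 */
        return -1;

    empty_rc = sem_trywait(&sem);     /* count is 0: fails at once */
    empty_errno = errno;

    sem_post(&sem);                   /* count becomes 1 */
    sem_getvalue(&sem, &value);
    full_rc = sem_trywait(&sem);      /* count is 1: succeeds */

    sem_destroy(&sem);
    return empty_rc == -1 && empty_errno == EAGAIN &&
           value == 1 && full_rc == 0;
}
```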


4.6.3 Semaphores and Priority

Semaphores must be handled with care in the context of real-time code. If low priority code does a sem_post(), the programmer must keep in mind that if a higher priority thread was waiting on that semaphore, the post will induce an immediate transfer of control to the higher priority thread.

This comes as a surprise to some users, but it must be kept in mind that in real-time systems, speed is of course the most important factor. If this means that a real-time thread suspends the moment it does the post, that is all right. The alternative is to further block the high priority thread that needs the semaphore.

Aside from ensuring the best possible performance, semaphores are also used in this way to simplify driver development. Interrupt handlers can be kept very simple and succinct, with semaphore posts after the minimal amount of work is done. This will cause a switch to the handling thread, which can perform the rest of the work. Since threads are more capable than interrupt handlers (being able to use the FPU, the debugger, etc), the data can be handled in a simple thread context rather than building complex interrupt handlers.

4.7 Clock management

RTCore provides standard POSIX mechanisms for managing the clock, thread sleeps, delays, and similar tasks. Examples include clock_nanosleep(), clock_gettime(), and so on. For detailed information on these functions, refer to the Single UNIX Specification provided with RTLinuxPro found in /doc/html/susv2.

One additional piece of information worth noting here is the addition of an advance timer to clock_nanosleep(). Virtually every system has an inherent amount of jitter, depending on hardware load. Some applications require determinism below the threshold of this jitter. For these applications, RTCore provides the advance timer. Threads generally sleep with:

struct timespec t;

clock_gettime(CLOCK_REALTIME, &t);

timespec_add_ns(&t, 500000);

clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &t, NULL);


This way, the thread will be woken at the absolute time specified in t, which in this case is the current time plus 500us. If there is inherent hardware jitter, the thread may be delayed by a couple of microseconds. Please refer to section 4.8.1 for details on reducing this jitter to 0.
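The absolute-time sleep is typically used inside a periodic loop, where each deadline is derived from the previous one so that errors do not accumulate. The sketch below uses standard POSIX calls only: add_ns() is a stand-in written for this example in place of the timespec_add_ns() helper used above, run_periodic() is an illustrative name, and CLOCK_MONOTONIC is chosen here (an assumption, not taken from the text) so that wall-clock adjustments cannot disturb the measurement.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Adds `ns` nanoseconds to `t`, keeping tv_nsec normalized. */
void add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Runs `cycles` periods of `period_ns` each and returns the total
 * elapsed time in nanoseconds. */
long long run_periodic(int cycles, long period_ns)
{
    struct timespec start, next, end;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    next = start;

    for (i = 0; i < cycles; i++) {
        /* Next deadline = previous deadline + period (no drift). */
        add_ns(&next, period_ns);

        /* Sleep to the absolute deadline; retry if a signal interrupts. */
        while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                               &next, NULL) == EINTR)
            ;

        /* Periodic work would go here. */
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    return (long long)(end.tv_sec - start.tv_sec) * NSEC_PER_SEC +
           (end.tv_nsec - start.tv_nsec);
}
```

Because every wakeup is tied to an absolute deadline, time spent on the periodic work does not push later wakeups back, as it would with a relative sleep.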

4.8 Extensions to POSIX (*_np())

There are some calls available to developers that are specific to RTCore. These calls are not part of the POSIX specification, but do fill in some of the gaps left by it and may make their way in some form into future revisions of the standard. They are listed here as an option to developers. In order to properly handle some situations, such as SMP environments, these calls may make life much easier.

4.8.1 Advance timer

Sometimes the processor, processor board, and I/O combination cannot provide the required level of determinism. Even some very fast processors are handicapped by poorly designed I/O architectures that can block the processor for many microseconds. Suppose an application can tolerate a worst case jitter of 13 microseconds, meaning that a scheduled thread X is to run at a certain period and the execution of thread X cannot deviate from the schedule by more than 13 microseconds no matter what the system load. If there are unavoidable hardware delays of 10 microseconds, RTCore scheduling takes 3 microseconds, and context switch time takes 2 microseconds, the application is already outside of the allowable range (15 microseconds in total). If the programmer cannot change hardware, RTCore offers the advance timer to trade compute time for determinism. The advance timer asks the scheduler to automatically schedule a thread to run early and then to hold the thread in a busy-waiting stage until the actual start time. This option wastes some compute time, but in many cases the loss of compute time is bearable.

For example, if a developer has a Pentium-4 processor on a mediocre board, they might find they have jitter of 30 microseconds. To reduce that jitter to 10 microseconds, they can request a timer advance of 20 microseconds. If the thread period is 500 microseconds, then the developer is "wasting" 20 microseconds a period, or 4% of the processor. For a fast processor, 4% is often invisible. Depending on the application, the trade-off may be worth it even for a slow processor. On a mediocre Arm9 board, a programmer might get jitter of up to 100 microseconds. If the same programmer has a GSM phone with packets arriving at a GSM 5 millisecond period, that programmer can get precise scheduling with no more than 2 microseconds error for a 5 millisecond thread at the cost of 98 microseconds "wasted" per period, less than 2%.

Timer advance is requested through a flag in the arguments of clock_nanosleep(). Instead of the usual:

clock_nanosleep( CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);

the programmer can write:

clock_nanosleep( CLOCK_REALTIME, TIMER_ABSTIME|TIMER_ADVANCE,
                 &next, &ts_advance);

The ts_advance structure is a normal struct timespec with the tv_nsec field used to indicate how much deviation the programmer would like to account for, in nanoseconds. So for example, if the worst case latency was 9 microseconds on the hardware, but the application demanded 4 microseconds, tv_nsec would be set to at least 5000.
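The arithmetic behind these figures is simple enough to capture in two helper functions. This is only a sketch of the calculation as this section describes it; advance_ns() and advance_cost_bp() are hypothetical names and not part of RTCore.

```c
/* Advance to request, in nanoseconds, so that a worst case latency of
 * `worst_case_us` is reduced to `tolerated_us`. Returns 0 when the
 * hardware already meets the requirement. */
long advance_ns(long worst_case_us, long tolerated_us)
{
    long diff_us = worst_case_us - tolerated_us;

    return diff_us > 0 ? diff_us * 1000L : 0;
}

/* Share of each period spent busy-waiting, in hundredths of a percent
 * (basis points), for an advance of `advance_us` within a period of
 * `period_us`. Integer math keeps the result exact for these inputs. */
long advance_cost_bp(long advance_us, long period_us)
{
    return advance_us * 10000L / period_us;
}
```

For the cases quoted in this section, advance_ns(9, 4) gives 5000, advance_cost_bp(20, 500) gives 400 (4%), and advance_cost_bp(98, 5000) gives 196 (1.96%).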

RTCore will then switch the thread in early and spin it in a busy wait until the exact scheduling moment occurs. One caution: by default, real-time threads are run with interrupts enabled so even if the thread starts exactly on time, it may immediately be pre-empted by an interrupt. To avoid that delay, the developer can disable interrupts just before the call to clock_nanosleep() and then enable them again after the critical section.

rtl_stop_interrupts();

clock_nanosleep(...);

/* critical code that cannot be delayed */

rtl_allow_interrupts();

4.8.2 Processor selection

On a multi-processor, real-time threads can be assigned to run on specific processors. By default, a thread runs on the processor that pthread_create() ran on when the thread was made. But if the programmer wants to distribute the load or to force threads to run on the same processor or on a processor that has specific hardware bound to it (see section 15.3.3 on interrupt focus), then they can set a special scheduling attribute that RTCore adds to POSIX.


int pthread_attr_setcpu_np(pthread_attr_t *attr, int cpu);

int pthread_attr_getcpu_np(pthread_attr_t *attr, int *cpu);

One example of how the programmer could use this feature is to optimize two threads working together on two processors in a pipeline so that neither can delay the other, even if their periods overlap. It is also possible to reserve a processor for only real-time use (see section 4.8.3 below).

4.8.3 CPU reservation

int pthread_attr_setreserve_np(&attr, 1);

int pthread_attr_getreserve_np(&attr);

In SMP applications, especially very high speed systems, it is beneficial to reserve a CPU for only real-time applications, whether they are threads or interrupt handlers. By using this thread attribute along with the pthread_attr_setcpu_np() call, one can spawn a real-time thread on a CPU such that the GPOS cannot run on that CPU. The benefit of this is that the real-time code can then in many cases live entirely in cache and achieve more deterministic results at high speeds, as the GPOS cannot run on that CPU and disturb the cache usage.

Tests on larger scale systems with significant bus traffic indicate that reserve CPU capabilities can reduce jitter by an order of magnitude.

4.8.4 Enabling FPU access

int pthread_setfp_np(pthread_t thread, int flag);

By default, real-time threads do not have access to the CPU's floating point unit, since the system's context switch times are faster if it does not have to restore floating point registers. This call will enable or disable that access. For threads running on another CPU, pthread_attr_setfp_np() is the proper way of enabling FPU support for the thread.

4.8.5 Concept of the extensions

The idea behind the extensions is to provide easy means of handling aspects of real-time programming not covered (or not covered well) by the POSIX standard. In some situations, they provide an easy way to get something done that could be done with standard calls, but would be much more work and result in convoluted code. In other cases, the standard does not specify certain aspects of real-time operations in detail, and the extra calls are there to work around the ambiguities. Most of these situations relate to how certain operations are carried out in SMP mode and handle ambiguities associated with targeting code on another CPU in real-time. The RTCore extensions take what could be a non-deterministic situation and remove execution ambiguities.

4.9 "Pure POSIX" - writing code without the extensions

There are users who do not want to use the non-POSIX extensions. In these cases, there is usually some need for all of the code to be POSIX-compliant, usually based on an internal coding standard. It is possible to write code without the extensions, although there may be performance issues as a result. RTCore does not force the programmer to deviate from the standard. It simply offers some solutions to improve performance.

4.10 The RTCore API and communication models

Thus far, this book has focused on demonstrating the API used in RTCore to communicate between threads and other code living in the real-time kernel. A later chapter will focus more on the auxiliary communication models, such as FIFOs and shared memory, when the developer needs to communicate with the GPOS.

Chapter 5

More concepts

So far, some basic real-time concepts, introductory examples and basics of the API have been reviewed. Next, general programming practices, concepts and how they work in RTCore will be illustrated.

5.1 Copying synchronization objects

To begin, it is recommended that the programmer not copy any objects of type mutex, condition variable, or semaphore. Operations on a copied synchronization object can result in unpredictable behavior. All of the synchronization objects should be initialized and destroyed with the appropriate function for the data type.

The same holds true for attributes associated with synchronization objects. They should never be copied; instead, initialize them with the appropriate calls. The following example shows what not to do:

pthread_mutex_t mutex1, mutex2;

pthread_mutex_init(&mutex1,0);

memcpy(&mutex2, &mutex1, sizeof(pthread_mutex_t));

Instead, this should be used:

pthread_mutex_t mutex1, mutex2;

pthread_mutex_init(&mutex1,0);

pthread_mutex_init(&mutex2,0);




5.2 API Namespace

As reiterated before, the RTCore API is POSIX. However, the RTCore API is also available using an rtl_ prefix. This means that pthread_create() can also be referenced as rtl_pthread_create().

This is an added feature so that users can explicitly reference RTCore functions when needed, if there is any ambiguity. In PSDD, as shown later, real-time applications exist inside of normal GPOS applications. In these situations, an ambiguity exists. pthread_create() will by default refer to the normal userspace GPOS function, rather than the RTCore pthread_create(). In these situations, the rtl_ prefix is needed.

5.3 Resource cleanup

RTCore will clean up some unfreed resources if the application does not explicitly release everything on cleanup. If the program exits and forgets to call close(), RTCore will detect this and make the call. This allows file usage counts to remain in proper order. The same holds true for POSIX I/O-based devices that the code may have registered: RTCore will do the proper deregistration.

However, some resources are not handled at this time. Threads are not cleaned up automatically. It is up to the programmer to make sure that each thread belonging to an application is cancelled and joined properly. The same goes for memory allocated through rtl_gpos_malloc(). The caller must free these areas with rtl_gpos_free() to prevent memory leaks.

5.4 Deadlocks

When using synchronization primitives, it is the programmer's responsibility to ensure either that all shared resources are correctly freed if asynchronous signals are enabled, or that these signals are blocked. Make sure to use thread cleanup handlers to safely free resources if the thread is cancelled while holding a resource.
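The cleanup-handler advice above can be sketched with the standard POSIX calls pthread_cleanup_push() and pthread_cleanup_pop(). This is a minimal illustration written against plain POSIX threads, not RTCore-specific code; the names resource_lock, unlock_on_cancel, worker and cancel_while_holding are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <unistd.h>

static pthread_mutex_t resource_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t holding;   /* posted once the worker owns resource_lock */

/* Cleanup handler: runs in the worker if it is cancelled while holding the lock. */
static void unlock_on_cancel(void *arg)
{
    assert(arg != NULL);
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&resource_lock);
    pthread_cleanup_push(unlock_on_cancel, &resource_lock);
    sem_post(&holding);
    for (;;)
        sleep(1);               /* sleep() is a cancellation point */
    pthread_cleanup_pop(1);     /* never reached; push and pop must pair lexically */
    return NULL;
}

/* Cancels the worker while it holds the lock.
 * Returns 0 if the handler released the lock, nonzero otherwise. */
int cancel_while_holding(void)
{
    pthread_t t;
    int rc;

    if (sem_init(&holding, 0, 0) != 0)
        return -1;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    while (sem_wait(&holding) != 0)
        ;                       /* retry if interrupted by a signal */
    pthread_cancel(t);
    pthread_join(t, NULL);

    rc = pthread_mutex_trylock(&resource_lock);   /* 0 means the lock is free */
    if (rc == 0)
        pthread_mutex_unlock(&resource_lock);
    sem_destroy(&holding);
    return rc;
}
```

Without the handler, the cancelled thread would exit while still owning the mutex, and every later attempt to lock it would block forever.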


5.5 Synchronization-induced priority inversion

If a high priority thread blocks on a mutex (or any other synchronization object) that was previously locked by a low priority thread, this will lead to priority inversion. This means the lower priority thread must gain a higher priority in order to guarantee execution time. Otherwise, another thread with a priority above that of the low priority thread may come along and keep it from running, preventing mutex release and stalling both the low and high priority threads. The high priority thread is waiting for the low priority thread to release the mutex, and the low priority thread is waiting for execution time. As long as this continues, the mutex will never be unlocked.

Any scenario that allows a lower priority task to block a high priority task is an implicit priority inversion. Theories abound on what the correct mechanism is to handle this. FSMLabs has found that analysis of code is the best means of avoiding it. Based on internal and external experience, it follows that if the programmer does not know what resources the code might or might not hold at a given point, the chances of there being potentially dangerous situations are very high.

Protocols such as priority inheritance exist to solve this problem, but in turn induce potentially unbounded priority inversion. Inheritance involves lower priority tasks being promoted to higher priority levels, such as when a higher priority task is waiting on the lower. This approach can lead to unbounded suspension. Consider the example of a high priority thread that is waiting on a lower priority thread that holds a lock. The lower priority thread is promoted so it can execute and release the lock. However, this thread now needs a lock held by an even lower priority thread. This third thread is then raised so that it can execute, and so on. In the meantime, the high priority thread may not be considered 'real-time' anymore because it can easily lose its deterministic characteristic.

RTCore provides optional support for the 'priority ceiling protocol', in which resources are given a ceiling priority they cannot exceed. This still requires analysis and is not perfect, but does provide a middle ground for users.


5.6 Memory management

General purpose memory management is not strictly available to real-time threads. However, as of version 2.2, FSMLabs provides a real-time memory allocator for use by both real-time and non-real-time threads. This means that the programmer does not always have to preallocate and govern their own memory usage in an application.

The provided allocator is fast, but is not bounded time. In practice, worst case times are found to be very small, but the current release does not support a bounded time exit path, that is, one in which the call aborts if the allocation is not complete within X microseconds. This may be added in future releases.

The RTCore allocator works on a number of blocks of memory of a given size. By default, it prefills an initial pool so that users can transparently call malloc() and free() (rtl_rt_malloc() and rtl_rt_free()) without preconfiguring the pool. The allocator must be loaded in order for these calls to succeed. This is done with:

rtlinuxpro/utilities/allocator/allocator.rtl &

The allocator's first pool is started with 15 blocks of 30000 bytes each. As users allocate memory, these blocks are pulled off into specific areas according to the size of the request. For example, a request for 23000 bytes will cause a block to be pulled off and marked as full. A request for 10 bytes will cause a block to be pulled off into an area, but with free space so that other allocations of 10 bytes, or of a similar size, can also come out of that block. Multiple simultaneous requests will cause multiple blocks to be pulled off, but they will be filled in parallel as more requests come in.

The default values can be overridden with two parameters: bs and nblocks. The following command starts the allocator and fills the initial pool with 25 blocks of 10000 bytes each:

rtlinuxpro/utilities/allocator/allocator.rtl bs=10000 nblocks=25 &

The pool is made up of vmalloced memory by default. If other types of memory are needed, other requests can be made to fill in other pools with different types. The allocator is built to manage multiple pools, whether for different memory types or to avoid contention in specific scenarios. To allocate memory in a second pool, the user should call the following function:


rtl_init_mem_pool(pool_num, RTL_O_VIRT, blocksize, nblocks);

All parameters are integer types. pool_num indicates the pool number. Usually this will be 1 or higher, to a maximum of 4. The second parameter defines the type of memory. Virtually contiguous is the default. The blocksize and nblocks parameters inform the allocator how to carve up the allocated space. Please contact FSMLabs support for working with different types of memory.

To allocate and free out of these alternative pools, the user needs to replace calls to malloc() (rtl_rt_malloc()) with calls to rtl_rt_malloc_pool(), which specifies which pool should be used. Similarly, free() (rtl_rt_free()) becomes rtl_rt_free_pool(). Users then have to clean up that pool on completion with:

rtl_cleanup_mem_pool(pool_num);
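To make the idea of a pool of fixed-size blocks concrete, the following is a simplified sketch in portable C. It is not the RTCore allocator: every request consumes a whole block (there is no packing of several small requests into one block), it is not thread-safe, and all names (block_pool, pool_init, pool_alloc, pool_free, POOL_BLOCKS, POOL_BLOCK_SIZE) are hypothetical. It does show why such a pool is attractive for real-time code: allocation and release are a few pointer operations on a free list.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCKS     4    /* hypothetical: number of blocks in the pool */
#define POOL_BLOCK_SIZE 64   /* hypothetical: usable bytes per block */

/* A block is either on the free list (next is valid) or handed out (bytes). */
typedef union pool_block {
    union pool_block *next;
    unsigned char bytes[POOL_BLOCK_SIZE];
} pool_block;

typedef struct {
    pool_block storage[POOL_BLOCKS];   /* preallocated backing memory */
    pool_block *free_list;             /* singly linked list of free blocks */
} block_pool;

/* Thread every block onto the free list, lowest address first. */
void pool_init(block_pool *p)
{
    p->free_list = NULL;
    for (int i = POOL_BLOCKS - 1; i >= 0; i--) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
}

/* Hand out one block, or NULL if the request is too large or the pool is empty. */
void *pool_alloc(block_pool *p, size_t size)
{
    pool_block *b;

    if (size > POOL_BLOCK_SIZE || p->free_list == NULL)
        return NULL;
    b = p->free_list;
    p->free_list = b->next;
    return b->bytes;
}

/* Return a block obtained from pool_alloc() to the same pool. */
void pool_free(block_pool *p, void *ptr)
{
    pool_block *b = (pool_block *)ptr;

    if (ptr == NULL)
        return;
    assert(b >= p->storage && b < p->storage + POOL_BLOCKS);
    b->next = p->free_list;
    p->free_list = b;
}
```

A caller would run pool_init() once during initialization and then use pool_alloc() and pool_free() in place of a general-purpose allocator, checking for NULL when the pool is exhausted.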

5.7 Synchronization

This is possibly the most important concept in real-time systems engineering. While synchronization is important as a protection mechanism in normal, non-real-time threaded applications, it can make or break a real-time system. In a normal application, a waiting thread will do just that: wait. In a hard real-time system, a waiting thread might mean that a fuel pump is not being properly regulated, as it is waiting on a mutex that another thread has held too long.

5.7.1 Methods and safety

Safe synchronization relies on several things: judicious use of it, code analysis, and above all, understanding of the code at hand. No amount of software protection will save the system from a programmer who does not understand or care what locks are held in a real-time system. In fact, the presence of this software may result in carelessness on the part of the developer.

RTCore offers the standard POSIX synchronization methods, such as semaphores, mutexes, and spinlocks, but also focuses on other, higher performance synchronization methods. Much of RTCore is designed in such a way that synchronization is not necessary or is very lightweight. Heavy synchronization methods such as spinlocks can disable interrupts and interfere with other activity in the system. Lighter mechanisms such as atomic operations create very little bus traffic and have a minimal impact. Of course, an entirely lock-free mechanism is even better, if possible.

An example of this is the RTCore POSIX I/O subsystem. The original RTLinuxFree versions were very fast but had no locking mechanisms whatsoever. While the performance was good, it did not hold up to industrial use. It needed proper locking in order to traverse the layers properly. The layer also needed to stay as fast as it was before. Users want it to be fast and safe. A simple and effective method would be to put mutexes around each contended piece, locking and unlocking as needed.

While simple, this would severely slow the system down, as mutexes involve waiting on queues, switching threads while others complete, and so on. Instead, FSMLabs added a light algorithm based on atomic operations (please refer to section 5.7.3). As requests come in to add a device name to the pool, atomic operations such as rtl_xchg are used to grab pieces of the pool. This avoids disabling interrupts and allows other threads to use other areas of the pool at the same time.

Some other restructuring was also done, resulting in a more flexible architecture that is just as fast as it had always been, except that it is now safe. Other systems require different approaches, from heavier synchronization to none at all, but it is very important that the correct method is chosen, not just one that works.

5.7.2 One-way queues

As has been said, POSIX provides several synchronization methods, but other approaches are sometimes called for, usually when a very light and quick method is needed. The one-way queues provided by RTCore handle many of these situations.

The Basic Idea

Many usage patterns require that one thread sends messages to another, in real-time. In one form or another, this results in a queue. Because queues are simple, many users write their own, and rather than leaving the queue unprotected they guard queue operations with a lock. While the lock will rarely be contended, the act of grabbing and releasing the lock may interfere with other system activity.

In light of this, RTCore provides a 'one-way queue' implementation. This allows a user to declare specific message queues, shared between a single consumer and a single producer thread. Each queue declaration implicitly defines a set of functions to operate specifically on that distinct queue, so that all code using the queue interacts with it using a specific function name.

These queues are lock-free, meaning that there is no locking on sends and receives. The API can handle concurrent enqueue and dequeue operations, but the user must wrap the calls with a lock if there are multiple consumers or multiple producers. (A locking version also exists that offers a built-in lock.) The result is a very fast mechanism for exchanging data that needs very little, if any, management overhead. The next example depicts a one-way queue.

#include <time.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <onewayq.h>

pthread_t thread1, thread2;

/* Queue type and functions: 32 int slots */
DEFINE_OWQTYPE(our_queue,32,int,0,-1);
DEFINE_OWQFUNC(our_queue,32,int,0,-1);

our_queue Q;

/* Producer: enqueue an increasing counter once per second */
void *queue_thread(void *arg)
{
    int count = 1;
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);
    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        if (our_queue_enq(&Q,count)) {
            printf("warning: queue full\n");
        }
        count++;
    }
}

/* Consumer: poll the queue every half second */
void *dequeue_thread(void *arg)
{
    int read_count;
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);
    while (1) {
        timespec_add_ns(&next, 500000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        read_count = our_queue_deq(&Q);
        if (read_count) {
            printf("dequeued %d\n",read_count);
        } else {
            printf("queue empty\n");
        }
    }
}

int main(int argc, char **argv)
{
    our_queue_init(&Q);
    pthread_create(&thread1, NULL,
                   queue_thread, 0);
    pthread_create(&thread2, NULL,
                   dequeue_thread, 0);

    rtl_main_wait();

    pthread_cancel(thread1);
    pthread_join(thread1, NULL);
    pthread_cancel(thread2);
    pthread_join(thread2, NULL);
    return 0;
}

This requires some explanation, as the syntax hides much of the work. There are two threads, spawned as normal, where one enqueues data and the other dequeues. Both are periodic, and as a quick method of preventing the queue from overflowing, the dequeueing thread defines a period half as long as the enqueueing thread. Half of the dequeue calls result in an empty queue being found, but this is acceptable for example purposes.

The next section will break down the interesting parts of the example into discrete steps, starting with the initial declarations.

Declarations

First, the queue needs to be defined for data to flow between the threads. The syntax defines two steps. The first step is:

DEFINE_OWQTYPE(our_queue,32,int,0,-1);

This first step creates a data type for the queue. [1] Think of this as the backing for the queue operations. It defines the queue, its properties and structure. Parameter 1 is the name of the queue type, which is later used to instantiate the queue, and parameter 2 defines the length of the queue. Parameter 3 defines the type of unit the queue is made of. An integer is used as the base element, but pointers or anything else could have been used. As the queue operations copy data into the queue, light units such as pointers are favored over large structures. Parameters 4 and 5 are not used at the moment.

The queue structure is named our_queue and contains 32 elements, each the size of an int. If the programmer were passing characters or structures through the queue, they would use char or struct x as parameter 3.

Now onto step 2:

DEFINE_OWQFUNC(our_queue,32,int,0,-1);

[1] Using DEFINE_OWQFUNC_LOCKED will define an automatically locking version of the queue.


This defines functions to be used explicitly on the queue type defined in step 1. [2] The parameters work in a similar fashion. Parameter 1 defines both the prefix of the names of the new queue operations and the type of queue structure that the functions will work on. Parameter 2 again defines the length of the queue, and 3 determines the type.

The last two parameters are used in this case. One defines the return value for a dequeue call on an empty queue and the other is the return value for an enqueue call on a full queue. In the example, the thread code treats a 0 from the dequeue call as 'queue empty' and a nonzero result from the enqueue call as 'queue full'. Values such as 0 and -1 are generally safe, but are configurable in light of situations where 0 is a valid value to be pushing through the queue. If 0 is enqueued and the other end dequeues it, there must be some means of determining that the value of 0 was intended and not the result of a call on an empty queue. Select a value that is known to be distinct from the valid queue values.

Lastly, an instance of the queue structure is defined to be used in the threads with the line:

our_queue Q;

Usage

Looking at the thread code, one can see that the actual usage of the queue is simple. One thread calls our_queue_enq(&Q,count), which is the enqueue function created in step 2 above, using the defined structure Q and pushing a value of count into it. The other end calls our_queue_deq(&Q), which returns a value of the correct type from the queue for use in the other thread. Note that step 2 also defines a few other simple calls for the queue. These include our_queue_full() to see if the queue is full, our_queue_init() to initialize a queue structure, and finally our_queue_top(), which will return the current head of the queue without removing it. our_queue_top() also serves as an isempty() function.

Queue interaction, as can be seen, is very easy. It is also extremely fast, and does not require locking for most cases. The code is safe when one thread enqueues while another dequeues at the same time, which is the common case. The user needs to add an external lock when two or more threads are enqueueing data at the same time or two or more are dequeueing data at the same time. Otherwise, no additional locking is needed.

[2] Again, specifying DEFINE_OWQFUNC_LOCKED will set up an automatically locking queue.


This is only one example of a light synchronization method. RTCore provides this for the user's convenience, and the user is encouraged to closely analyze their synchronization needs to ensure that the right approach is chosen.
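For readers curious how a single-producer, single-consumer queue can avoid locks at all, here is a minimal sketch using C11 atomics. It is not the onewayq.h implementation, whose internals are not described here; the names spsc_queue, spsc_init, spsc_enq, spsc_deq and SPSC_LEN are hypothetical. Unlike the RTCore queue, it reports empty and full through a boolean return value rather than through configurable sentinel values.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SPSC_LEN 32   /* hypothetical slot count; one slot always stays unused */
static_assert(SPSC_LEN > 1, "queue needs at least two slots");

typedef struct {
    int slots[SPSC_LEN];
    atomic_size_t head;   /* next slot to read; written only by the consumer */
    atomic_size_t tail;   /* next slot to write; written only by the producer */
} spsc_queue;

void spsc_init(spsc_queue *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer side. Returns false if the queue is full. */
bool spsc_enq(spsc_queue *q, int value)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t next = (tail + 1) % SPSC_LEN;

    if (next == atomic_load_explicit(&q->head, memory_order_acquire))
        return false;
    q->slots[tail] = value;
    /* Release: the slot contents become visible before the new tail does. */
    atomic_store_explicit(&q->tail, next, memory_order_release);
    return true;
}

/* Consumer side. Returns false if the queue is empty. */
bool spsc_deq(spsc_queue *q, int *out)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);

    if (head == atomic_load_explicit(&q->tail, memory_order_acquire))
        return false;
    *out = q->slots[head];
    /* Release: the slot is handed back only after it has been read. */
    atomic_store_explicit(&q->head, (head + 1) % SPSC_LEN, memory_order_release);
    return true;
}
```

Because each index is written by exactly one thread, no lock is needed as long as there is one producer and one consumer. As with the RTCore queue, additional producers or consumers would require external locking.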

5.7.3 Atomic operations

In general, atomic operations include any type of operation that cannot be further subdivided and can be viewed as a single distinct operation. For the purposes of this document, atomic bit operations will be looked at in more detail, specifically work done on a specific memory location in a single step.

Depending on the system at hand, simple steps like setting a variable to a specific value may appear to be atomic but can be very far from it. Code that writes to a generic value may end up with that value left in cache but not synchronized with main memory, which on an SMP system can wreak havoc. Consider writing to a simple integer that another thread on a different CPU is waiting for. The first write may only make it to the cache but not to main memory, allowing the first thread to continue on with other work, while the second is working from flawed data. Even worse, both threads could update the value at the same time.

Atomic operations allow the programmer to say 'There is a value at this address. Set bit 3 of this without ambiguity.' The operation will be carried out in a specific atomic fashion and will report an error if someone else tried to do the same thing at the same time, signaling to the developer that they have to try again or take another route.

RTCore provides some simple API calls to handle these problems. They are custom, as POSIX does not define functions related to this problem, but they are meant to be easy to understand. As each architecture handles atomic operations differently, these functions were designed to do the right thing depending on the architecture at hand. Below are those functions.

rtl_a_set(int bit, volatile unsigned long *word);

rtl_a_clear(int bit, volatile unsigned long *word);

These two atomically set or clear a bit within a word address, respectively. The first parameter specifies which bit should be set or cleared, and the second specifies the address base to be used for the operation.


rtl_a_test_and_set(int bit, volatile unsigned long *word);

rtl_a_test_and_clear(int bit, volatile unsigned long *word);

Implementations of the standard test and set/clear operations are provided here. The call will atomically set (or clear) a specific bit within a word and return the previous value.

rtl_a_incr(volatile unsigned long *w);

rtl_a_decr(volatile unsigned long *w);

Operating on a single long, these two will simply increment or decrement the current value safely.

These are simple operations with a simple interface and can be used to build very elegant and high performance synchronization methods. While other, more common mechanisms abound, most of them involve locks, queues, and other structures. With atomic operations, synchronization can be as simple as a single bit operation on an address.
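As an illustration of how little is needed, the sketch below builds a one-bit 'claim' flag from an atomic test-and-set. It uses the portable C11 <stdatomic.h> interface rather than the rtl_a_* calls, so that it can be compiled anywhere; the function names bit_test_and_set, bit_test_and_clear, try_claim_slot and release_slot are hypothetical. The assumption is that the RTCore calls have comparable semantics (operate on one bit, return its previous value), as described above.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Atomically set bit `bit` of *word; return the bit's previous value (0 or 1). */
int bit_test_and_set(int bit, atomic_ulong *word)
{
    unsigned long mask, old;

    assert(bit >= 0 && bit < (int)(sizeof(unsigned long) * CHAR_BIT));
    mask = 1UL << bit;
    old = atomic_fetch_or(word, mask);
    return (old & mask) != 0;
}

/* Atomically clear bit `bit` of *word; return the bit's previous value (0 or 1). */
int bit_test_and_clear(int bit, atomic_ulong *word)
{
    unsigned long mask, old;

    assert(bit >= 0 && bit < (int)(sizeof(unsigned long) * CHAR_BIT));
    mask = 1UL << bit;
    old = atomic_fetch_and(word, ~mask);
    return (old & mask) != 0;
}

/* Try to claim slot `bit`. Returns 1 on success, 0 if it was already claimed. */
int try_claim_slot(atomic_ulong *flags, int bit)
{
    return bit_test_and_set(bit, flags) == 0;
}

/* Give slot `bit` back so that another thread can claim it. */
void release_slot(atomic_ulong *flags, int bit)
{
    bit_test_and_clear(bit, flags);
}
```

A thread that fails to claim a slot does not block. It can retry later or move on to a different slot, which is the 'try again or take another route' behavior described above.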

Chapter 6

Communication between RTCore and the GPOS

The two components of a complete RTCore system, the real-time component and the user-space component, generally run in two separate, protected address spaces. The real-time component lives in the RTCore kernel, while the rest of the code lives as a normal process within the GPOS. In order to manage each side, there has to be some kind of communication between the two. RTCore offers several mechanisms to facilitate this.

6.1 printf()

printf() is probably the simplest means of communicating from a real-time thread down to non-real-time applications. When an RTCore application starts up, it creates a 'stdout' device to communicate with the calling environment, usually a terminal device of some kind. Calls to printf() in the real-time application appear in the calling terminal the same way a printf() call would in a normal application. This allows the user to log real-time output the following way:

./rtcore_app > log_file

The printf() implementation is fully capable and can handle any normal data type and format. It is also a lightly synchronized method compared to some others that will be presented here, and as a result it is very fast and does not impact other core activity.


6.2 rtl_printf()

This can be thought of as a simple method of dropping information in the GPOS's kernel ring buffer. rtl_printf() is a normal printf() call that exists within the real-time kernel and works the exact same way as printf() or printk(), but is safe to call from a real-time process.

For simplicity and speed in the kernel, this call does not support all format types that a standard printf() call does. Most notably, it does not handle formatting of floating point types.

While the overhead of rtl_printf() is minimal, it is important to note that there are implications. In order to safely synchronize with the GPOS, interrupts must be briefly disabled. This means that the programmer should avoid heavy use of it, especially in a tight loop. Any operation that affects timing must be carefully considered with respect to real-time goals, so make sure that the debug output is not causing more problems than it is helping to solve.

This call is a very useful method of logging via the kernel buffer, but most users will probably find the normal printf() call to be more convenient and flexible.

6.3 Real-time FIFOs

Generally, there is a need for bidirectional communication between the real-time module and the user-space code. The most straightforward mechanism for this is the real-time FIFO. Applications can instruct RTCore to create FIFO devices at runtime via POSIX calls. The real-time module reads or writes data to this device in a non-blocking manner. On the Linux side, a process can open a FIFO and make read()/write() calls on it to exchange data with the real-time kernel (non-real-time applications can be blocking or non-blocking).

6.3.1 Using FIFOs from within RTCore

For every FIFO that is used, initialization code must do:

mkfifo("/mydevice", 0777);

fd = open("/mydevice", O_NONBLOCK);

ftruncate(fd, 8192);


An important factor to remember is that the FIFO creation calls involve memory management in the ftruncate() operation, which is not available from within real-time threads. As such, these calls must be made from within the main() context in order to be safe. This is unlikely to be a problem, as in nearly all cases the developer needs to set up the FIFOs before starting real-time operations. This is only for calls performing initialization, though. Real-time threads that call open("/mydevice", O_NONBLOCK) do not invoke memory management. Instead they just attach to the previously created device, so these calls are safe from within real-time threads.

In the example above, 0777 was used in the mkfifo() call. This indicates to RTCore that the device should also be present in the GPOS filesystem. In the process of the call, a device of that name and permissions will be created. To create FIFOs that are to be used strictly between real-time threads, specify 0 for the mask. This will register the device so that real-time threads can use it, but it will not be visible to the GPOS. More documentation on this can be found in the Arbitrary FIFO device article provided in PDF form with RTCore.

6.3.2 Using FIFOs from the GPOS

On the user-space side, the FIFO appears to be a normal file. As such, any normal file operation is usable on the FIFO. For example, the user-space code could be a Perl script or maybe just a logging utility consisting of:

cat /mydevice > logfile

6.3.3 A simple example

It might be helpful to see some of the calls described here in a single application:

#include <time.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <fcntl.h>

pthread_t thread;
int fd0, fd1, fd2;

void *start_routine(void *arg)
{
    int ret, status = 1;
    int read_int;

    while (status) {
        usleep(1000000);
        /* Echo anything received on /mydev1 back out on /mydev2 */
        ret = read(fd1,&read_int,sizeof(int));
        if (ret > 0) {
            printf("/mydev1: %d (%d)\n",read_int,ret);
            write(fd2,&read_int,ret);
        }
        /* Any data arriving on /mydev0 is the shutdown signal */
        ret = read(fd0,&read_int,sizeof(int));
        if (ret > 0) status = 0;
    }
    return 0;
}

int main(int argc, char **argv)
{
    /* Create, open and size each FIFO before starting the thread */
    unlink("/mydev0");
    mkfifo("/mydev0", 0777);
    fd0 = open("/mydev0",O_NONBLOCK);
    ftruncate(fd0, 4096);

    unlink("/mydev1");
    mkfifo("/mydev1", 0777);
    fd1 = open("/mydev1",O_NONBLOCK);
    ftruncate(fd1, 4096);

    unlink("/mydev2");
    mkfifo("/mydev2", 0777);
    fd2 = open("/mydev2",O_NONBLOCK);
    ftruncate(fd2, 4096);

    pthread_create(&thread, NULL, start_routine, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);

    close(fd0);
    close(fd1);
    close(fd2);

    unlink("/mydev0");
    unlink("/mydev1");
    unlink("/mydev2");
    return 0;
}

Most of this code has already been used in other examples, but this succinctly shows the reader how to use POSIX I/O from within RTCore. As usual, a real-time thread is spawned from within main(), but first it has to explicitly create, open, and size the FIFOs with the proper amount of preallocated space. [1]

In the thread, read() is called on fd1, and any data received is written to fd2. A second read() on fd0 checks whether it is time to shut down. These calls are non-blocking for a reason. If the real-time thread ended up waiting for a GPOS application that rarely got scheduling time, it would not be deterministic. So in this case, the program just sleeps and attempts to read from the devices.

There is not much to look at on the user-space side, but for the sake of completeness, here it is:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main (int argc, char **argv)
{
    int fd0, fd1, fd2, i, read_int;

    fd0 = open("/mydev0",O_WRONLY);
    fd1 = open("/mydev1",O_RDWR);
    fd2 = open("/mydev2",O_RDWR);

    for (i = 0; i < 10; i++) {
        write(fd1,&i,sizeof(int));
        read(fd2,&read_int,sizeof(int));
        printf("Received %d from RTCore\n",
               read_int);
    }

    write(fd0,&i,sizeof(int));
    close(fd2);
    close(fd1);
    close(fd0);
    return 0;
}

[1] Refer to the next section for details on determining the correct amount to preallocate.

The userspace application opens the FIFOs, dumps data over them and reads it back. After it is done, it writes to a third FIFO to signal the real-time thread that it is time to shut down, and then it closes the files. One minor difference is that on this end, the program did not open the devices as non-blocking, although it can easily be done that way.

6.3.4 FIFO allocation

There are some rules as to how to handle FIFO allocation. When using the POSIX interface, it is possible to do a normal

ftruncate(fd,32768);


style of call, but only when it is running in the main() context, and not in a thread. This way, if the call determines that there is no preallocated space to use for the device, it is safe to block while the memory allocation work is handled.

6.3.5 Limitations

The real-time kernel is not bound to operate synchronously with the normal operating system thread. If the real-time kernel is under heavy load, it may not be able to schedule time for the GPOS to pull the data from the FIFO. Since the FIFO is implemented as a buffer, it is feasible that the buffer might fill from the real-time side before the user-space thread gets a chance to catch up. In this case, it is advisable to increase the size of the buffer (with ftruncate()) or to flush the buffer from the real-time code to prevent the user-space application from receiving invalid data.

The inverse of this problem is that the FIFO cannot be a deterministic means of getting command data to the real-time module. The real-time kernel is not forced to run the GPOS thread with any regularity, as it may have more important things to do. A command input from a graphical interface on the OS side through the FIFO may not get across immediately, and determinism should never be assumed.

A subtler problem that must be overcome by the programmer is that the data passed through the FIFO is completely unstructured. This means that if the real-time code pushes a structure into the FIFO with something like write(fd,&x,sizeof(struct x));, the user-space code should pull it out on the other side by reading the same amount of data into an identical structure. There has to be some kind of coordination between the two in order to determine a protocol for the data, as otherwise it will appear to be a random stream of bits. For many applications, a simple structure will suffice, possibly with a timestamp in order to determine when the data was sampled and placed in the FIFO.
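One way to coordinate the two sides is to put a single structure definition in a header shared by the real-time and user-space code. The sketch below is only an illustration of that idea: the structure sample_msg, its fields, the SAMPLE_MAGIC constant, and the helpers send_sample() and recv_sample() are all hypothetical, and the code uses ordinary POSIX read() and write() on any file descriptor, so it can be exercised with a regular pipe as well as a FIFO.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define SAMPLE_MAGIC 0x52544653u   /* arbitrary marker for the start of a record */

/* Hypothetical record layout shared by both ends of the FIFO.
 * Fixed-width fields keep the layout the same on both sides. */
struct sample_msg {
    uint32_t magic;          /* always SAMPLE_MAGIC */
    uint32_t sequence;       /* lets the reader detect dropped records */
    int64_t  timestamp_ns;   /* when the value was sampled */
    int32_t  value;          /* the sampled data itself */
    uint32_t reserved;       /* explicit padding up to a multiple of 8 bytes */
};
static_assert(sizeof(struct sample_msg) == 24, "unexpected padding in sample_msg");

/* Write one complete record. Returns 0 on success, -1 otherwise. */
int send_sample(int fd, uint32_t sequence, int64_t timestamp_ns, int32_t value)
{
    struct sample_msg m;

    memset(&m, 0, sizeof m);
    m.magic = SAMPLE_MAGIC;
    m.sequence = sequence;
    m.timestamp_ns = timestamp_ns;
    m.value = value;
    return write(fd, &m, sizeof m) == (ssize_t)sizeof m ? 0 : -1;
}

/* Read one complete record and check its marker. Returns 0 on success, -1 otherwise. */
int recv_sample(int fd, struct sample_msg *out)
{
    ssize_t n = read(fd, out, sizeof *out);

    if (n != (ssize_t)sizeof *out)
        return -1;
    return out->magic == SAMPLE_MAGIC ? 0 : -1;
}
```

The magic field gives the reader a cheap check that it is still aligned on record boundaries, and the sequence number makes gaps visible if the real-time side had to flush the FIFO.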

6.4 Shared memory

FIFOs provide serialized access to data, which is appropriate for applications that operate with data in a queued manner. However, many applications require both userspace and real-time code to work with large chunks of data, and this is not always convenient to stream in and out of a FIFO. RTCore provides an option for these workloads: shared memory with mmap().

6.4.1 mmap()

If the reader is not familiar with mmap(), please refer to the RTCore or standard man page for full details. The basic idea is that the user opens a file descriptor, calls mmap() on it with a given range, and it returns a pointer to an area in this file or device. Under RTCore, this is used with a device. As will be seen, the real-time module and the user-space application both open the same device, call mmap(), and can subsequently access the same area of memory.

The shared memory devices themselves are created with the POSIX shm_open(), destroyed with shm_unlink(), and sized with ftruncate(). Please refer to the man pages for specific details. Only an overview will be given here.

First, the device must be created. This is done with shm_open(), which takes the name of the device, open flags and optionally a set of permission bits. If the programmer is the first user and is creating the device, use RTL_O_CREAT. Furthermore, if the programmer wants this device to be automatically visible in the GPOS filesystem, specify a non-zero value for the permission bits. For example, the following call creates a node named /dev/rtl_shm_region that is visible to the GPOS with permission 0600, and returns a usable file descriptor attached to the device:

int shm_fd = shm_open("/dev/rtl_shm_region",

RTL_O_CREAT, 0600);

Now the programmer has a handle to a shared region. However, it does not have a default size. This must be set via a call to ftruncate(), as in:

ftruncate(shm_fd,400000);

Note that this will round up the size of the shared region in order to align it on a page boundary (page size is dependent on architecture but is generally 4096 bytes). Also, as it does perform memory allocation, it must occur in the initialization segment. Now the developer can use mmap() from either real-time code or user-space code, as in:


addr = (char*)mmap(0,MMAP_SIZE,PROT_READ|PROT_WRITE,

MAP_SHARED,shm_fd,0);

The resulting addr can be used to address anything in that region up to the size specified by the value passed to ftruncate().

Once the code is done with the area, it can call close() on the file descriptor. The last user calls shm_unlink() on the name of the device to destroy the area and unlink it from the GPOS filesystem:

close(shm_fd);

shm_unlink("/dev/rtl_shm_region");

It is worth noting that these need not be in order: if a thread is still using the area and another calls shm_unlink(), the region will remain valid until the last user calls close() on the file descriptor. RTCore does reference counting on devices like shared memory and FIFOs in order to allow this behavior.

6.4.2 An Example

The theory and practice are very simple, so without further discussion, below is an example. First, the real-time application:

#include <time.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
#include <errno.h>

#define MMAP_SIZE 5003

pthread_t rthread, wthread;
int rfd, wfd;
unsigned char *raddr, *waddr;

/* Writer: map the region and bump the first four bytes once per second */
void *writer(void *arg)
{
    struct timespec next;
    struct sched_param p;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    waddr = (unsigned char *)mmap(0,MMAP_SIZE,PROT_READ|PROT_WRITE,
                                  MAP_SHARED,wfd,0);
    if (waddr == MAP_FAILED) {
        printf("mmap failed for writer\n");
        return (void *)-1;
    }

    clock_gettime(CLOCK_REALTIME, &next);
    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        waddr[0]++;
        waddr[1]++;
        waddr[2]++;
        waddr[3]++;
    }
}

/* Reader: map the same region and print the first four bytes once per second */
void *reader(void *arg)
{
    struct timespec next;
    struct sched_param p;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    raddr = (unsigned char *)mmap(0,MMAP_SIZE,PROT_READ|PROT_WRITE,
                                  MAP_SHARED,rfd,0);
    if (raddr == MAP_FAILED) {
        printf("failed mmap for reader\n");
        return (void *)-1;
    }

    clock_gettime(CLOCK_REALTIME, &next);
    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        printf("rtl_reader thread sees "
               "0x%x, 0x%x, 0x%x, 0x%x\n",
               raddr[0], raddr[1], raddr[2], raddr[3]);
    }
}

int main(int argc, char **argv)
{
    /* Create the device, then open it a second time for the reader */
    wfd = shm_open("/dev/rtl_mmap_test", RTL_O_CREAT, 0600);
    if (wfd == -1) {
        printf("open failed for write on "
               "/dev/rtl_mmap_test (%d)\n",errno);
        return -1;
    }

    rfd = shm_open("/dev/rtl_mmap_test", 0, 0);
    if (rfd == -1) {
        printf("open failed for read on "
               "/dev/rtl_mmap_test (%d)\n",errno);
        return -1;
    }

    /* Sizing allocates memory, so it must happen in main() */
    ftruncate(wfd,MMAP_SIZE);

    pthread_create(&wthread, NULL, writer, 0);
    pthread_create(&rthread, NULL, reader, 0);

    rtl_main_wait();

    pthread_cancel(wthread);
    pthread_join(wthread, NULL);
    pthread_cancel(rthread);
    pthread_join(rthread, NULL);
    munmap(waddr, MMAP_SIZE);
    munmap(raddr, MMAP_SIZE);

    close(wfd);
    close(rfd);

    shm_unlink("/dev/rtl_mmap_test");
    return 0;
}

First, a device is created and opened twice, once for a reader thread and once for a writer. A thread is spawned for each task, and each thread performs its own mmap(). Note that the ftruncate() call is in the main() context because it needs to perform memory allocation to back the shared area. Further calls such as mmap() that do not cause allocations can happen anywhere.

The result of the mmap() call is a reference to the shared area, so once the programmer has the handles needed, they can reference the area freely. One thread updates the area every second, and the other reads it. The programmer now has an area that is shared between real-time threads, but what about userspace? The same mechanism applies, as will be seen here:

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>

#define MMAP_SIZE 5003

int main(void)
{
    int fd;
    unsigned char *addr;

    if ((fd = open("/dev/rtl_mmap_test", O_RDWR)) < 0) {
        perror("open");
        exit(-1);
    }

    addr = mmap(0, MMAP_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        printf("return was %d\n", errno);
        perror("mmap");
        exit(-1);
    }

    while (1) {
        printf("userspace: the rtl shared area contains"
               " : 0x%x, 0x%x, 0x%x, 0x%x\n",
               addr[0], addr[1], addr[2], addr[3]);
        sleep(1);
    }

    munmap(addr, MMAP_SIZE);

    close(fd);
    return 0;
}

There is not much work involved here. The code opens the device as a normal file and calls mmap() on it just as before. This piece of code performs the same action as the reader in the real-time space, dumping the values of the first few bytes of data every second or so. As the writer updates the area, both the real-time reader and the user-space program see the same changes.

As with other RTCore mechanisms, it is assumed that the real-time side does the initial work of creating the shared area. This ensures that the real-time code has a handle on what exists, and does not have to wait for some user-space application to get around to doing the work first. If the user attempts to start the user-space code first, it will fail for multiple reasons: the device is not there to be opened until shm_open() is called from real-time code, and even if it is there, there are no registered hooks for the device.

6.4.3 Limitations

With shared memory, there is no inherent coordination between userspace and real-time, as can be seen from the example. Any rules governing usage of the area must be added to the code. At any point, user code can overwrite an area that a real-time thread needed to retain data in. In addition, one cannot write to the area from real-time and then wait for it to be read and cleared when Linux gets time to schedule the user-space process. This would delay the real-time code indefinitely.

A little bit of synchronization can solve this type of problem. For example, if the developer is using the area to get frames of data over to user-space, the real-time thread could write the blocks at a given interval across the shared space, and prepend each segment with a status byte indicating the state of the data. The user-space program can update that status byte as it reads or finishes analyzing each segment, showing whether the segment is still in use. This way the real-time side can easily tell what areas are safe to overwrite.
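One possible shape for such a status-byte scheme is sketched below in plain C. The layout, the constants, and the frame_write()/frame_read() names are all assumptions made for illustration; nothing here is an RTCore API. The real-time side never waits: if a segment is still owned by the reader, the write simply fails and the caller decides what to drop.

```c
#include <assert.h>
#include <string.h>

#define FRAME_PAYLOAD 64                    /* assumed payload bytes per segment */

enum { FRAME_FREE = 0, FRAME_FULL = 1 };    /* assumed status byte values */

struct frame {
    volatile unsigned char status;          /* prepended status byte */
    unsigned char data[FRAME_PAYLOAD];
};

/* Real-time side: never waits. Returns 0 on success, -1 if the segment
 * is still owned by the reader. */
int frame_write(struct frame *f, const void *src, size_t len)
{
    assert(len <= FRAME_PAYLOAD);
    if (f->status != FRAME_FREE)
        return -1;
    memcpy(f->data, src, len);
    f->status = FRAME_FULL;                 /* publish after the data */
    return 0;
}

/* User-space side: copies the payload out, then hands the segment back. */
int frame_read(struct frame *f, void *dst, size_t len)
{
    assert(len <= FRAME_PAYLOAD);
    if (f->status != FRAME_FULL)
        return -1;
    memcpy(dst, f->data, len);
    f->status = FRAME_FREE;
    return 0;
}
```

An array of such segments could be laid over the pointer returned by mmap(). Note that volatile alone does not order the payload writes against the status update on SMP hardware; a production version would likely need memory barriers or atomic operations appropriate to the platform.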

This by-hand coordination also easily allows the programmer to direct real-time code from user-space. One simple use is to allow control of real-time threads. If both ends know that a certain area is meant to direct the actions of a real-time thread, userspace code can easily flip a bit and indicate that a certain thread should be suspended, resumed, or even spawned. This can be used to (non-deterministically) direct nearly anything that the real-time code is doing, or vice-versa.
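A control area of this kind might be as small as one byte. The sketch below is a hypothetical illustration in plain C; the bit assignments and the rt_poll_control() name are assumptions, not part of RTCore. The real-time thread polls the byte once per period and never blocks on user space.

```c
#include <assert.h>

/* Assumed bit assignments within one shared control byte. */
#define CTL_SUSPEND 0x01u   /* user space asks the thread to pause */
#define CTL_STOP    0x02u   /* user space asks the thread to exit */

enum rt_action { RT_RUN, RT_PAUSE, RT_EXIT };

/* Called by the real-time thread once per period. It only reads the
 * byte, so it cannot be delayed by user space. */
enum rt_action rt_poll_control(const volatile unsigned char *ctl)
{
    unsigned char bits;

    assert(ctl != 0);
    bits = *ctl;                 /* one read gives a consistent snapshot */
    if (bits & CTL_STOP)
        return RT_EXIT;
    if (bits & CTL_SUSPEND)
        return RT_PAUSE;
    return RT_RUN;
}
```

User space would set or clear these bits through its own mapping of the same area; the real-time thread reacts at its next period, which is why the control is non-deterministic from the user's point of view.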

6.5 Soft interrupts

On x86 platforms running Linux, the user will normally only find interrupts numbered from 0 to 15, plus NMI, as in the following:

           CPU0
  0:   75636868   XT-PIC  timer
  1:          6   XT-PIC  keyboard
  2:          0   XT-PIC  cascade
  4:        106   XT-PIC  serial
  5:  157842206   XT-PIC  eth0
  8:          1   XT-PIC  rtc
 13:          1   XT-PIC  fpu
 14:   13637083   XT-PIC  ide0
 15:      12966   XT-PIC  ide1
NMI:          0

On systems running RTCore, additional high interrupt numbers, ranging from 16 to 223, show up in /proc/interrupts. (RTCoreBSD systems have a limit of 32 soft IRQs.)

           CPU0
  0:    1398262   RTLinux virtual irq  timer
  1:          4   RTLinux virtual irq  keyboard
  2:          0   RTLinux virtual irq  cascade
 11:    4902708   RTLinux virtual irq  usb-uhci, eth0
 12:          0   RTLinux virtual irq  PS/2 Mouse
 14:      29546   RTLinux virtual irq  ide0
 15:          5   RTLinux virtual irq  ide1
219:      12178   RTLinux virtual irq  sofirq jitter test
220:          0   RTLinux virtual irq  RTLinux Scheduler
221:         26   RTLinux virtual irq  RTLinux FIFO
222:    1293626   RTLinux virtual irq  RTLinux CLOCK_GPOS
223:       5124   RTLinux virtual irq  RTLinux printf
NMI:          0
ERR:          0

The interrupts above IRQ 15 are the software interrupts as provided by RTCore, although they still appear to be real hardware interrupts as far as Linux is concerned. The handler for these interrupts is executed in the GPOS's kernel context, permitting a real-time thread to indirectly call functions within the GPOS kernel safely.

This demands a little explanation. Programmers cannot safely call GPOS kernel functions from within the real-time kernel because many of those calls will block as the Linux kernel performs various tasks. This generally leads to deadlock, and has obvious implications for code that is supposed to be executing deterministically. A safe way around this is to register a software interrupt handler in the Linux kernel that waits for a certain interrupt. When the real-time code requires a service to be done asynchronously in the GPOS space, it signals an interrupt for this handler. The handler will not execute in real-time, so the real-time code is not blocked in any way, but there is no guaranteed worst case delay between pending the soft interrupt and actual execution of the handler. This is for the same reason as before: the real-time kernel may prevent the GPOS kernel from running for some time, depending on the current set of demands. However, for soft real-time tasks, this is generally a sufficient approach.

Again, it must be stressed that the GPOS is only seeing RTCore virtual IRQs. The handlers the GPOS had installed before RTCore was loaded are not affected but are now managed by the interrupt emulation layer and thus have become soft interrupts. This process of insertion is handled transparently to GPOS drivers.

This can be used to handle many inter-kernel communication mechanisms. As previously discussed, rtl_printf() uses this mechanism to pass data to the kernel ring buffer. It could also serve as a way for real-time code to allocate memory, by signalling a GPOS handler to safely perform the memory management asynchronously.

6.5.1 The API

#include <stdlib.h>

To set up a software interrupt, only a few functions are needed. Interrupts are registered and deregistered with rtl_get_soft_irq() and rtl_free_soft_irq():

int rtl_get_soft_irq(void (*handler)(int, void *, struct rtl_frame *),
                     const char *devname);

void rtl_free_soft_irq(unsigned int irq);

The string passed as the second argument to rtl_get_soft_irq() is the name that will be associated with the IRQ, which on Linux will be displayed in /proc/interrupts. It is a good idea to make this something meaningful, especially if the developer is making heavy use of the soft IRQ handlers.

The interrupt number assigned is the first free interrupt number from the top down. As such, there is little risk this will ever collide with a real hardware interrupt. rtl_get_soft_irq() will return -1 if there is a failure, but should otherwise return the value of the IRQ registered.
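The "first free number from the top down" policy can be modeled in a few lines of plain C. This is only an illustrative sketch, not RTCore's allocator: the model_* names are hypothetical, and the 16 to 223 range is taken from the Linux listing earlier in this section.

```c
#include <assert.h>
#include <stdbool.h>

#define SOFT_IRQ_MIN 16     /* first number above the hardware IRQs */
#define SOFT_IRQ_MAX 223    /* highest number seen in the listing */

static bool soft_irq_used[SOFT_IRQ_MAX + 1];

/* Model of "first free number from the top down"; -1 when none is left. */
int model_get_soft_irq(void)
{
    int irq;

    for (irq = SOFT_IRQ_MAX; irq >= SOFT_IRQ_MIN; irq--) {
        if (!soft_irq_used[irq]) {
            soft_irq_used[irq] = true;
            return irq;
        }
    }
    return -1;
}

void model_free_soft_irq(int irq)
{
    assert(irq >= SOFT_IRQ_MIN && irq <= SOFT_IRQ_MAX);
    soft_irq_used[irq] = false;
}
```

Under this policy the earliest registrations receive the highest numbers, which is consistent with the core RTLinux services appearing at 220 through 223 in the listing while an application's soft IRQ shows up just below them.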

void rtl_global_pend_irq(int irq);

To actually signal the interrupt to Linux, the function rtl_global_pend_irq() is given the soft interrupt number. When the Linux kernel runs, it will see this interrupt as pending and execute the developer's Linux handler.

The interrupt handler declaration is just like the one used for a regular Linux interrupt handler:

static void my_handler(int irq, void *ignore,
                       struct rtl_frame *ignore_frame);

The same restrictions that apply to Linux-based hardware interrupt handlers apply to soft interrupt handlers, with respect to things like synchronization with Linux kernel resources from within an interrupt handler, etc.

6.5.2 An Example

This section would not be complete without a simple example. The soft IRQ API is fairly small, so here is a piece of code that uses all of the calls:

#include <time.h>
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>

pthread_t thread;
static int our_soft_irq;

void *start_routine(void *arg)
{
    struct sched_param p;
    struct timespec next;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(),
                          SCHED_FIFO, &p);

    clock_gettime(CLOCK_REALTIME, &next);
    while (1) {
        /* wake on half-second intervals and pend the soft IRQ */
        timespec_add_ns(&next, 500000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        rtl_global_pend_irq(our_soft_irq);
    }
    return 0;
}

static int soft_irq_count;

/* Runs in the GPOS kernel context, not in real-time. */
void soft_irq_handler(int irq, void *ignore,
                      struct rtl_frame *ignore_frame)
{
    soft_irq_count++;
    printf("Received soft IRQ #%d\n", soft_irq_count);
}

int main(void)
{
    soft_irq_count = 0;
    our_soft_irq = rtl_get_soft_irq(soft_irq_handler,
                                    "Simple SoftIRQ\n");
    if (our_soft_irq == -1)
        return -1;

    pthread_create(&thread, NULL, start_routine, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    rtl_free_soft_irq(our_soft_irq);
    return 0;
}

On initialization, the user gets a soft IRQ, providing the function that should act as the handler, and a short name. If this call is successful, the code spawns a thread.

From this point on, soft_irq_handler() is registered in the Linux kernel as an interrupt handler and the user has a real-time thread in an infinite loop. In this loop, it activates on half-second intervals, pending the soft IRQ each time. These interrupts are caught by Linux, which executes soft_irq_handler(), which in turn dumps the current interrupt count via printk(). On exit, the tail end of main() destroys the real-time thread as usual, and then deregisters the soft IRQ handler.

As the reader can see, it is not very hard to interact with the Linux kernel in this fashion. By simply pending interrupts, the user can trigger their own handlers to do some dirty work in the GPOS kernel, without sacrificing determinism in the real-time code.
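The periodic threads in this chapter's examples advance an absolute deadline with timespec_add_ns() and then sleep until it. For readers who want to try the same loop structure on a plain POSIX system, a stand-in can be sketched as follows. The name add_ns() is hypothetical and this is not RTCore's implementation; it only assumes a normalized struct timespec and a non-negative increment.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in: advance *t by ns nanoseconds (ns >= 0),
 * keeping tv_nsec within [0, NSEC_PER_SEC). */
void add_ns(struct timespec *t, long long ns)
{
    long long total;

    assert(ns >= 0);
    total = (long long)t->tv_nsec + ns;
    t->tv_sec += (time_t)(total / NSEC_PER_SEC);
    t->tv_nsec = (long)(total % NSEC_PER_SEC);
}
```

Because the deadline is advanced from its previous value rather than from the current time, a sleep that uses TIMER_ABSTIME does not accumulate drift from the time spent in the loop body.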

Chapter 7

Debugging in RTCore

No one likes to admit it, but most developers spend a large chunk of time debugging code, rather than writing it. Bugs in RTCore can be even more difficult to track down. Inserting debug traces or other mechanisms changes the system, and all of a sudden the bug will not trigger. (Timing-dependent bugs are of course possible in other systems, but are more prevalent in real-time development.)

Additionally, all real-time code, if it is running inside the RTCore kernel, has the potential to halt the machine (PSDD threads live in external address spaces). Debugging userspace applications is simpler, as a failure will simply result in the death of the process, not the kernel. Tackling the bug is usually just a matter of cleaning up and trying the program again. These luxuries are harder to come by in the kernel.

Fortunately, RTCore provides a debugger that can often prevent programming errors from bringing the system down. Loaded with the rest of RTCore (it can be disabled through recompilation with source), the RTCore debugger watches for exceptions of any kind and stops the thread that caused the problem before the system goes down.

7.1 Enabling the debugger

The debugger is enabled during configuration of RTCore, under selective component building options.


7.2 An example

There are some important things to know about the debugger, but before getting into the details, it helps to walk through a simple example. As with anything else, the first step is a hello world application. It is assumed to be named 'debug-hello.c' for the purposes of this exercise:

#include <time.h>
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>

pthread_t thread;
pthread_t thread2;

void *start_routine(void *arg)
{
    int i;
    struct sched_param p;

    volatile pthread_t self;
    self = pthread_self();

    p.sched_priority = 1;
    pthread_setschedparam (pthread_self(), SCHED_FIFO, &p);

    if (((long) arg) == 1) {
        /* cause a memory access error */
        *(unsigned long *)0 = 0x9;
    }

    for (i = 0; i < 20; i ++) {
        usleep(500000);
        printf("I'm here; my arg is %ld\n", (long) arg);
    }
    return 0;
}

int main(void)
{
    pthread_create (&thread, NULL, start_routine, (void *) 1);
    pthread_create (&thread2, NULL, start_routine, (void *) 2);

    rtl_main_wait();

    pthread_cancel (thread);
    pthread_join (thread, NULL);
    pthread_cancel (thread2);
    pthread_join (thread2, NULL);
    return 0;
}

As with other examples, there is an initialization context and a cleanup context, with real-time code that is run in between. In this initialization, two real-time threads are spawned running the same function, which contains an error (an access to illegal memory) that the first thread will hit, as its argument is 1.

When this module is loaded, the first thread is spawned and causes a memory access error. The fault needs to happen first so that the debugger can attach to it. The example can be run with:

./debug-hello.rtl &

The debugger catches the fault and halts all real-time threads. This means the second thread is also halted, so there is no stream of "I'm here" messages from the second thread, even though it does not have a problem. This is to allow for a completely known system state that the developer can step through at will.

The debugger prints a notice of the exception to the console, so that running 'dmesg' will produce a line detailing which thread caused the exception, where it was, and how to begin debugging. Now the debugger can be started to analyze the running code.

RTLinuxPro provides the real-time debugger module, and GDB to be used from userspace. Other debuggers are also usable, such as DDD, but for this example GDB is assumed. Now that the real-time code has hit an exception, GDB can be run on the object file that was saved during compilation for debugging:


# gdb debug-hello.o.debug

(gdb)

The next step is to connect GDB to the real-time system. This is accomplished using the remote debugging facility of GDB. The real-time system provides a real-time FIFO for debugger communication:

(gdb) target remote /dev/rtf10

Remote debugging using /dev/rtf10

The RTCore debugger uses three consecutive real-time devices: /dev/rtf10, /dev/rtf11, and /dev/rtf12. The starting FIFO can be changed with the source version of the kit. Future versions of RTCore may use the named FIFO capability of RTCore rather than the older /dev/rtf devices.

Now, in this case, the user expects to see a memory access violation. Once the target remote /dev/rtf10 command is entered, the user should see GDB display the following:

Remote debugging using /dev/rtf10
[New Thread 1123450880]
start_routine (arg=0x1) at debug-hello.c:25
25          *(unsigned long *)0 = 0x9;
(gdb)

The above message tells the user that they are indeed debugging through /dev/rtf10, that the ID of the thread that faulted is 1123450880, and that the fault was in the function start_routine, which was passed one argument named arg with value 0x1. This is all contained in source file debug-hello.c on source line 25. GDB also displays the actual source line in question. The error was generated exactly where the example placed it.

Next, examine the function call history. This may be necessary in complex applications in order to determine the source of an error. Typing bt will cause GDB to print the stack backtrace that led to this point.

(gdb) bt
#0  start_routine (arg=0x1) at debug-hello.c:25
#1  0xd1153227 in ?? ()

Perhaps it is not clear what type of variables are being operated on. To examine the type and value of a variable, use the following commands:

(gdb) whatis arg
type = void *
(gdb) print arg
$1 = (void *) 0x1

To get a better idea of what other operations are being performed in this function, one can list source code for any function name or any set of line numbers with:

(gdb) list start_routine
16
17          volatile pthread_t self;
18          self = pthread_self();
19
20          p . sched_priority = 1;
21          pthread_setschedparam (pthread_self(), SCHED_FIFO, &p);
22
23          if (((long) arg) == 1) {
24                  /* cause a memory access error */
25                  *(unsigned long *)0 = 0x9;

It is also possible to disassemble the executable code in any region of memory. For example, to view the start_routine function:

(gdb) disassemble start_routine
Dump of assembler code for function start_routine:
0xd1137060 <start_routine>:     push   %ebp
0xd1137061 <start_routine+1>:   mov    %esp,%ebp
0xd1137063 <start_routine+3>:   sub    $0x10,%esp
0xd1137066 <start_routine+6>:   lea    0xfffffff8(%ebp),%eax
0xd1137069 <start_routine+9>:   push   %edi
0xd113706a <start_routine+10>:  push   %esi
0xd113706b <start_routine+11>:  push   %ebx
0xd113706c <start_routine+12>:  mov    0xd116f8a0,%edx
0xd1137072 <start_routine+18>:  mov    %edx,0xfffffffc(%ebp)
0xd1137075 <start_routine+21>:  movl   $0x1,0xfffffff8(%ebp)
0xd113707c <start_routine+28>:  push   %eax
0xd113707d <start_routine+29>:  push   $0x1
0xd113707f <start_routine+31>:  push   %edx
0xd1137080 <start_routine+32>:  call   0xd11540e4
0xd1137085 <start_routine+37>:  add    $0xc,%esp
0xd1137088 <start_routine+40>:  cmpl   $0x1,0x8(%ebp)
0xd113708c <start_routine+44>:  jne    0xd1137098 <start_routine+56>
0xd113708e <start_routine+46>:  movl   $0x9,0x0
0xd1137098 <start_routine+56>:  lea    0xfffffff0(%ebp),%ebx
0xd113709b <start_routine+59>:  push   %ebx
0xd113709c <start_routine+60>:  mov    0xd1161e8c,%eax
0xd11370a1 <start_routine+65>:  push   %eax

Once the programmer is done debugging, they may exit the debugger and stop execution of the process being debugged.

(gdb) quit

The program is running. Exit anyway? (y or n) y

RTCore will resume execution of all threads but will leave the application that was being debugged stopped. To actually remove the application or module, the developer must stop it, either by sending it a signal (perhaps by typing control-c in the window) or by removing the application module.

7.3 Notes

There are a few items to keep in mind when using the RTCore debugger. Most of these items are short but important, and they make debugging sessions more effective.

7.3.1 Overhead

The debugger module, when loaded, catches all exceptions raised, regardless of whether they are related to real-time code, the GPOS, or otherwise. This incurs some overhead. Consider, for example, the case where a userspace program causes several page faults as it is working through some data. These page faults cause the debugger to do at least some minor work to see if the fault is real-time related. This may lead to a slight degradation of the GPOS performance, so if the GPOS really needs some extra processing, the debugger module may be removed. In practice, however, the benefits of having protection against misbehaving RT programs usually outweigh the overhead incurred by the debugger.

For those who wish to avoid this overhead, the source version of RTCore allows the programmer to reconfigure the OS without the debugger for production use.

7.3.2 Intercepting unsafe FPU use

Real-time threads that use the FPU must enable floating point operations with pthread_attr_setfpu_np(). If they do not do this, they cannot safely use the floating point unit on the CPU, as the context will not be maintained for them.

On PPC systems, the debugger will detect threads that use the FPU without enabling it, and cause a fault.

7.3.3 Remote debugging

Sometimes it is helpful to debug code remotely. This usually occurs when the remote machine is a different architecture, and the developer does not want to run GDB on the target machine itself. (RTLinuxPro provides GDB, but there may not be enough room on the target device or the developer may need some additional tools, etc.) In this case, netcat is the preferred option and is provided with the RTLinuxPro development kit.

Netcat provides the ability to pipe file data over a given port. In the context of the RTCore debugger, this means that netcat can be started on the target such that it essentially exports /dev/rtf10 over the network. Here is an example of how to start netcat on the target machine:

nc -l -p 5000 >/dev/rtf10 </dev/rtf10 &

This starts netcat on the device, listening on port 5000, feeding data from the network listener into the FIFO, and also pushing data coming out of the FIFO out onto that same listener. In GDB running on the development host, the programmer can connect to the remote real-time system with target remote targethost:5000, where targethost is the target machine name.

Netcat will exit when the user detaches from the socket, so if the programmer is going to do many debugging runs, it will have to be restarted per session.

7.3.4 Safely stopping faulted applications

Once the programmer is done analyzing the state of the system, the faulty application must be stopped. This can be done with the following series of commands:

(gdb) CTRL-Z

[1]+ Stopped gdb

# killall app_name

# kill %1

Make sure not to trigger any GDB commands that would cause the real-time code to continue, because it would just execute the faulty code again.

7.3.5 GDB notes

GDB has a problem with examining data in the bss section, so any variables that were not explicitly initialized are not viewable from GDB. This may be fixed in a later release, but in the meantime, it is simplest to initialize any variables that will be analyzed with GDB.

The RTCore debugger can be used to debug the user-space (PSDD) RT threads (at present, only with Linux). Debugging threads running under the userspace frame scheduler is also supported.

Under NetBSD, the RTCore symbols must be explicitly loaded. This can be done with:

gdb hello.o.debug
(gdb) symbol-file /var/run/rtlmod/ksyms
(gdb) target remote /dev/rtf10

Chapter 8

Tracing in RTCore

8.1 Introduction

Real-time programs can be challenging to debug because traditional debugging techniques such as setting breakpoints and step-by-step execution are not always appropriate. This is mainly due to two reasons:

• Some errors are in the timing of the system. Stopping the program changes the timing, so the system cannot be analyzed without modifying its behavior.

• If the real-time program controls certain hardware, suspending the program for analysis may cause the hardware to malfunction or even break.

RTCore implements a subset of the POSIX trace facilities. Using them, it is possible to analyze and evaluate real-time performance while a real-time program is running. An introduction to POSIX tracing, as well as the API definitions, can be found in the Single UNIX Specification (in the development kit, doc/html/susv2).

The tracer aims to follow the POSIX Tracing API reasonably closely. One notable difference is that most functions and constants have the RTL_TRACE_ or rtl_trace_ prefix rather than posix_trace_. The API functions are declared in the include/rtl_trace.h file. To use the tracer, the CONFIG_RTL_TRACER option ("RTLinux tracer support") must be enabled during configuration of the system.

rtlinuxpro/examples/tracer/turnon.o is a module that creates a trace stream for each CPU and starts the tracing.

To see a quick demonstration, the user will need the source to RTLinux. Recompile the system with CONFIG_RTL_TRACER on; load RTCore and the rtlinuxpro/examples/tracer/turnon.o and rtlinuxpro/examples/tracer/testapp.o modules; then run rtlinuxpro/examples/tracer/log 0, where 0 can be replaced with the desired CPU number. The user should see a dump of the stream of events on the target CPU.

8.2 Basic Usage of the Tracer

There are two parties involved in tracing: the program being analyzed and the analyzer process. When the program to be analyzed is instrumented for tracing, it records the information about events encountered during execution. For each event, information about the current CPU, current thread id, a timestamp, and optional user data is recorded into an in-memory buffer. The RTCore tracer provides built-in trace points for certain system events such as context switches. The list of currently supported system events is provided in the next section. In addition, an RTCore program can trace user-defined events by invoking the rtl_trace_event function with RTL_TRACE_UNNAMED_USEREVENT as the event id.

Before the tracing can be started, a POSIX trace stream must be created. For an example of creating a trace stream, please see the rtlinuxpro/examples/tracer/turnon.o module.

The analyzer process is a GPOS (userspace) process that reads the event records made by the trace subsystem. This is done with the functions rtl_trace_trygetnext_event, rtl_trace_getnext_event, and rtl_trace_timedgetnext_event. An example of a trace analyzer process can be found in examples/tracer/log.c.

8.3 POSIX Events

For every event, the following members of the struct rtl_trace_event_info are filled:

• posix_event_id is the event identifier.

• posix_timestamp is a struct timespec that represents the time of the event; the clock used does not necessarily correspond to any of the system clocks.

• posix_thread_id is the thread id for the current thread.

The list of currently supported events includes:

• RTL_TRACE_OVERFLOW – The system detected an overflow. Some events have been lost. It is necessary to reset the profiling in progress to avoid getting incorrect results.

• RTL_TRACE_RESUME – The system has recovered from an overflow condition.

• RTL_TRACE_SCHED_CTX_SWITCH – Context switch. The accompanying data is a void * pointer of the new thread.

• RTL_TRACE_CLOCK_NANOSLEEP – The thread invoked the clock_nanosleep call.

• RTL_TRACE_BLOCK – The thread voluntarily blocks itself (e.g., as a result of a clock_nanosleep call).

• RTL_TRACE_UNNAMED_USEREVENT – This is a user-defined event. The data can be arbitrary.

The events may be selectively enabled for tracing with the rtl_trace_set_filter function. For best performance, it is advisable to disable unneeded event types.

It is possible to perform function call tracing with the help of the tracer. To do this, the program to be analyzed must be compiled with the -finstrument-functions option to gcc. For an example, please see rtlinuxpro/examples/tracer/testmod.c in the RTCore distribution. For modules compiled with -finstrument-functions, two special events are generated:

• RTL_TRACE_FUNC_ENTER – Function entry. event->posix_prog_address represents the address in the program from which the function call has been made. The data that accompanies this event is a void * pointer to the function that has been called.

• RTL_TRACE_FUNC_EXIT – Function exit. event->posix_prog_address is a pointer to the function that has exited. The data that accompanies this event is a void * pointer to the place from which the function call has been made.
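For background, -finstrument-functions makes GCC insert calls to two hooks, __cyg_profile_func_enter() and __cyg_profile_func_exit(), around every instrumented function. How the RTCore tracer implements these hooks is not shown in this text; the sketch below is a hypothetical stand-alone version that simply counts calls, to show what information the compiler supplies. The counter and last_* variable names are assumptions made for the illustration.

```c
#include <assert.h>

static unsigned long enter_count, exit_count;
static void *last_fn, *last_call_site;

/* GCC calls this on entry to each instrumented function. The hooks
 * themselves must not be instrumented, or they would recurse. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    enter_count++;
    last_fn = this_fn;              /* address of the function entered */
    last_call_site = call_site;     /* address it was called from */
}

/* GCC calls this just before each instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    assert(exit_count < enter_count);   /* exits never outnumber entries */
    exit_count++;
}
```

The two hook arguments carry the same pair of addresses that the events above describe: the function itself and the place the call was made from.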

Chapter 9

IRQ Control

Once RTCore is loaded, the GPOS does not have any direct control over hardware IRQs. Manipulation is handled through RTCore when there are no real-time demands. However, RTCore applications can manipulate IRQs for real-time control. The next section will cover the basic usage of the IRQ control routines.

9.1 Interrupt handler control

First, this section reviews the calls needed to manage interrupt handlers. Unless otherwise specified, only the original GPOS interrupt handlers will handle incoming interrupts, once there are no real-time demands. Setting up real-time interrupt handlers is covered next.

9.1.1 Requesting an IRQ

An RTCore application can install an IRQ handler with the call rtl_request_irq(irq_num, irq_handler), where the irq_handler parameter is a function of type:

unsigned int (*handler)(unsigned int irq, struct rtl_frame *regs);

This will hook the function passed as the second argument to rtl_request_irq() to be called when IRQ irq_num occurs, much like any other IRQ handler. When that function is invoked, it will run in interrupt context. This means that some functions may not be callable from the handler and all interrupts will be disabled. This handler is not debuggable directly, but threads are, and it is safe to post to a semaphore that a thread is waiting on. The thread will be switched to immediately so that operations can be performed in a real-time thread context. When that thread performs any operation that causes a thread switch, control will return to the interrupt handler.

9.1.2 Releasing an IRQ

An IRQ can be released with rtl_free_irq(irq_num). This will unhook the handler given to rtl_request_irq() and it will not be called again. However, it is possible that this interrupt handler is still executing on the current or another CPU, so care should be taken by the application programmer to ensure this is not the case.

9.1.3 Pending an IRQ

Many applications require that the GPOS interrupt handler also receive the IRQ, once the handler installed by rtl_request_irq() has finished its own work. The RTCore application might just be interested in keeping track of when IRQs are coming in or some simple statistic, before allowing the GPOS to proceed and handle the work.

In these cases, the rtl_global_pend_irq(irq_num) function should be used. This will pend the IRQ for the GPOS, and once the RTOS is finished the GPOS will process this as a pending IRQ.

9.1.4 A basic example

The following code is a basic example of an application that tracks incoming IRQs for the GPOS. It grabs an IRQ with rtl_request_irq(), pends it during operation with rtl_global_pend_irq(), and releases it with rtl_free_irq().

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#include <semaphore.h>

pthread_t thread;
sem_t irqsem;
int irq = -1;

/* Thread context: does the real work for each interrupt. */
void *thread_code(void *t)
{
    static int count = 0;

    while (1) {
        sem_wait(&irqsem);
        count++;
        printf("IRQ %d has occurred %d times\n",
               irq, count);
    }
    return NULL;
}

/* Interrupt context: pass the IRQ on to the GPOS and wake the thread. */
unsigned int intr_handler(unsigned int irq,
                          struct rtl_frame *regs)
{
    rtl_global_pend_irq(irq);
    sem_post(&irqsem);
    return 0;
}

int main(int argc, char **argv)
{
    int ret;

    if ((argc != 2) || strncmp(argv[1], "irq=", 4)) {
        printf("Usage: %s: irq=#\n", argv[0]);
        return -1;
    }

    irq = atoi(&argv[1][4]);

    sem_init(&irqsem, 1, 0);
    pthread_create(&thread, NULL, thread_code, (void *)0);

    if ((ret = rtl_request_irq(irq, intr_handler)) != 0) {
        printf("failed to get irq %d\n", irq);
        ret = -1;
        goto out;
    }

    rtl_main_wait();

    rtl_free_irq(irq);

out:
    pthread_cancel(thread);
    pthread_join(thread, NULL);

    return ret;
}

This code initializes itself and reads the IRQ to track from the passed-in arguments. It then spawns a thread that waits on a semaphore and prints the IRQ count as interrupts occur. As mentioned, the handler is invoked in interrupt context and as such is fairly limited in what it can do. The handler is therefore hooked up but does no real work except for calling rtl_global_pend_irq() for the GPOS and sem_post() for the thread.

As with the other examples, this can continue indefinitely. If it is hooked to the interrupt for a hard drive, it will print a message with a count for each interrupt triggered by the device. When the application is stopped with CTRL-C, it will release the IRQ handler, kill the thread, and unload as usual. The GPOS IRQ handler will then be the only handler for the device.

9.1.5 Specifics when on NetBSD

RTCore on BSD UNIX also requires the programmer to call rtl_map_gpos_irq(bsd_irq) to obtain the IRQ identifier prior to using any RTCore interrupt control functions. This function transforms the NetBSD IRQ identifier into an RTCore IRQ.

NetBSD’s interrupt scheme changed considerably with the addition of SMP support. This call maintains compatibility with RTCore’s interrupt handling and must be called before functions like rtl_request_irq().

The IRQ can be an ISA IRQ number (e.g., IRQ7 for LPT) or the return value from the PCI interrupt lookup function pci_intr_map(). On success, the function returns a value that can be used for rtl_request_irq() or other RTCore IRQ functions. On error, a negative value is returned.

9.2 IRQ state control

Besides interrupt handlers, applications commonly need to control interrupt states, specifically whether interrupts are enabled or disabled. This is a common means of synchronization for some tasks, although less intrusive means of mutual exclusion are generally possible. The following sections cover how to enable and disable interrupts, save state, and perform similar tasks.

9.2.1 Disabling and enabling all interrupts

Generally, interrupts are disabled and enabled with “hard” disable and “hard” enable calls. When RTCore is running, any enable and disable calls made by the GPOS are virtualized so that they do not actually disable real interrupts. RTCore applications can directly disable hardware interrupts with rtl_stop_interrupts and enable them again with rtl_allow_interrupts. These function calls disable and enable interrupts unconditionally. Sometimes it is preferable to save the current state, disable interrupts, perform some critical work, and then restore the saved state. This can be done with the sequence below:

#include <rtl_sync.h>

void function(void)
{
    rtl_irqstate_t flags;

    /* save state and disable interrupts */
    rtl_no_interrupts(flags);

    /* perform some critical operation... */

    /* restore the previous interrupt state */
    rtl_restore_interrupts(flags);
}

These calls do disable the real-time interrupts, so they must be used with care. Interrupts should never be disabled longer than absolutely necessary because events may be missed. The system may also run out of control if the application never re-enables the interrupts. However, some applications cannot handle any kind of jitter during certain operations, even the minimal overhead of receiving an Ethernet IRQ, and must disable all interrupts for short periods.

While this is a simple mechanism for synchronization, it cannot be stressed enough that lighter mechanisms that do not disable interrupts are almost always preferable. Even if the programmer thinks the code protected with disabled interrupts is not on an important path, it may be running on the same hardware as another application that cannot tolerate that kind of behavior. Please see section 5.7.3 for more details.

Specific IRQs can be enabled or disabled with rtl_hard_enable_irq(irq_num) and rtl_hard_disable_irq(irq_num) respectively. This allows the user to target a specific IRQ rather than the entire set.

9.3 Spinlocks

pthread_spin_lock includes an implicit save of the interrupt state followed by an interrupt disable, and pthread_spin_unlock includes an implicit restore of the interrupt state saved at the time of the corresponding pthread_spin_lock call.

This can be a problem in cases where the locks are not released in the reverse of the order in which they were taken. For example:

#include <pthread.h>

void function(void)
{
    pthread_spinlock_t lock1, lock2;

    /* initialize the locks */
    pthread_spin_init(&lock1, 0);
    pthread_spin_init(&lock2, 0);

    /* ...assume interrupts are enabled here... */

    /* acquire lock 1 */
    pthread_spin_lock(&lock1);
    /* ...interrupts are now disabled here... */

    /* acquire lock 2 */
    pthread_spin_lock(&lock2);
    /* the state saved in lock2 is interrupts "disabled" */

    /* release lock 1 */
    pthread_spin_unlock(&lock1);
    /* interrupts are now enabled since the state saved in lock1 was "enabled" */

    /* release lock 2 */
    pthread_spin_unlock(&lock2);
    /* restores to an interrupt-disabled state */
}

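Note that the state restored when releasing lock1 and lock2 is incorrect, since the locks were not released in the reverse of the order in which they were acquired. Interrupts are re-enabled while lock2 is still held, and they are left disabled after both locks have been released.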
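The effect can be reproduced with a toy model in plain C, where a single flag stands in for the CPU interrupt state and each lock remembers the state it saw when it was taken. This is only an illustrative sketch; the model_ names are hypothetical and do not correspond to RTCore internals.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one flag stands in for the CPU interrupt state. */
static bool irq_enabled = true;

struct model_lock {
    bool saved_state;   /* interrupt state captured at lock time */
};

/* Models pthread_spin_lock: save the current state, then disable. */
static void model_spin_lock(struct model_lock *l)
{
    l->saved_state = irq_enabled;
    irq_enabled = false;
}

/* Models pthread_spin_unlock: restore the state saved by this lock. */
static void model_spin_unlock(struct model_lock *l)
{
    irq_enabled = l->saved_state;
}

/* Release in acquisition order (the problematic case). Returns the
 * interrupt state observed while lock2 is still held. */
static bool release_in_acquire_order(void)
{
    struct model_lock l1, l2;
    bool while_holding_l2;

    irq_enabled = true;
    model_spin_lock(&l1);
    model_spin_lock(&l2);
    model_spin_unlock(&l1);
    while_holding_l2 = irq_enabled;   /* enabled too early */
    model_spin_unlock(&l2);           /* leaves interrupts disabled */
    return while_holding_l2;
}

/* Release in reverse order. Returns the interrupt state observed
 * while lock1 is still held. */
static bool release_in_reverse_order(void)
{
    struct model_lock l1, l2;
    bool while_holding_l1;

    irq_enabled = true;
    model_spin_lock(&l1);
    model_spin_lock(&l2);
    model_spin_unlock(&l2);
    while_holding_l1 = irq_enabled;   /* still disabled */
    model_spin_unlock(&l1);           /* back to the initial state */
    return while_holding_l1;
}
```

In this model, releasing in reverse order keeps interrupts disabled for as long as any lock is held and returns to the initial state afterwards, which is why nested locks are normally released in the opposite order from acquisition.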

Chapter 10

Writing Device Drivers

This chapter presents examples of several classes of RTCore drivers and how they interact with user-level programs and other RTCore applications.

Writing RTCore device drivers is very similar to writing normal RTCore applications. Since all memory, including device memory, is accessible to RTCore applications, every RTCore program can potentially function as a driver. Where drivers and normal RTCore applications differ is in how they communicate with user-space (GPOS) applications and other RTCore programs.

10.1 Real-time FIFOs

The simplest way of communicating with a driver is through a real-time FIFO. This is the simplest type of driver and is best used when only one-way communication with the driver is needed, since FIFOs only perform read() or write() operations. An example is a motor controller that only receives commands (such as motor speed) or a simple data acquisition device that sends information (such as the temperature of a probe).

FIFO operations and how to use them in RTCore applications are covered in previous chapters, so they are not covered again here.

10.2 POSIX files

A more advanced and full-featured interface is through POSIX file operations. Drivers can advertise their services to other RTCore applications, and only RTCore applications, through files, just as with a standard UNIX system. These files are managed by RTCore and are not directly accessible from the GPOS environment. For example, a Linux application that opens /dev/lpt0 is communicating with the Linux (non-real-time) parallel port driver and not the RTCore driver. Conversely, an RTCore application that opens /dev/lpt0 is communicating with the RTCore driver and not with the Linux driver.

The example driver below provides a /dev/lpt0 file that can be used through POSIX open(), read(), write(), ioctl(), mmap() and close() calls from RTCore applications. Two files, /dev/lpt0 and /dev/lpt1, are created. When an RTCore application performs any operation on these files, the driver prints a message.

#include <stdio.h>
#include <sys/types.h>
#include <rtl_posixio.h>

static rtl_ssize_t rtl_par_read(struct rtl_file *filp, char *buf,
                                rtl_size_t count, rtl_off_t *ppos)
{
    printf("read() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static rtl_ssize_t rtl_par_write(struct rtl_file *filp, const char *buf,
                                 rtl_size_t count, rtl_off_t *ppos)
{
    printf("write() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static int rtl_par_ioctl(struct rtl_file *filp,
                         unsigned int request, unsigned long l)
{
    printf("ioctl() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static int rtl_par_open(struct rtl_file *filp)
{
    printf("open() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static int rtl_par_release(struct rtl_file *filp)
{
    printf("close() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static rtl_off_t rtl_par_llseek(struct rtl_file *filp,
                                rtl_off_t off, int flag)
{
    printf("lseek() called on file /dev/lpt%d, offset %d and flag %d\n",
           filp->f_priv, off, flag);
    return 0;
}

int rtl_par_mmap(struct rtl_file *filp, void *a, rtl_size_t b,
                 int c, int d, rtl_off_t e, rtl_caddr_t *f)
{
    return 0;
}

int rtl_par_munmap(struct rtl_file *filp, void *a, rtl_size_t length)
{
    return 0;
}

int rtl_par_unlink(const char *filename, unsigned long i)
{
    printf("unlink() called on %s, should be /dev/lpt%d\n",
           filename, i);
    return 0;
}

int rtl_par_poll_handler(const struct rtl_sigaction *sigact)
{
    printf("sigaction() with SIGPOLL called\n");
    return 0;
}

int rtl_par_ftruncate(struct rtl_file *filp, rtl_off_t off)
{
    printf("ftruncate() called on file /dev/lpt%d\n",
           filp->f_priv);
    return 0;
}

void rtl_par_destroy(int minor)
{
    printf("destroy() called on minor %d, last user done\n",
           minor);
}

static struct rtl_file_operations rtl_par_fops = {
    open: rtl_par_open,
    release: rtl_par_release,
    read: rtl_par_read,
    write: rtl_par_write,
    ioctl: rtl_par_ioctl,
    munmap: rtl_par_munmap,
    mmap: rtl_par_mmap,
    unlink: rtl_par_unlink,
    install_poll_handler: rtl_par_poll_handler,
    ftruncate: rtl_par_ftruncate,
    destroy: rtl_par_destroy
};

int main(int argc, char **argv)
{
    rtl_register_dev("/dev/lpt0", &rtl_par_fops, 0);
    rtl_register_dev("/dev/lpt1", &rtl_par_fops, 1);

    rtl_main_wait();

    rtl_unregister_dev("/dev/lpt0");
    rtl_unregister_dev("/dev/lpt1");

    return 0;
}

10.2.1 Error values

Drivers should report errors to the caller through the handler return value for each operation. For example, a driver that wishes to report a failure during a write() when there is no space remaining should return -RTL_ENOSPC. The POSIX file layer of RTCore treats any return value less than 0 as an error and sets errno appropriately. An RTCore application making this write() call will therefore receive a -1 return value, and rtl_errno will contain RTL_ENOSPC. The application can print the errno value through rtl_perror.

A complete list of errno values can be found in include/rtl_errno.h.

10.2.2 File operations

Any file operation that a driver does not wish to handle can be safely set to NULL. The RTCore POSIX file layer checks for NULL handlers and reports the appropriate error to the caller.
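The return-value convention from section 10.2.1 and the NULL-handler check can be pictured together with a miniature dispatch layer. The sketch below is not the RTCore implementation: the demo_ names are hypothetical, the standard errno constants stand in for the RTL_ values, and the choice of ENOSYS for a missing handler is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical miniature of a file-operations table; not the RTCore types. */
struct demo_fops {
    ssize_t (*write)(const char *buf, size_t count);
};

static int demo_errno;  /* stands in for rtl_errno */

/* Layer between the caller and the driver's write handler. */
static ssize_t demo_write(const struct demo_fops *fops,
                          const char *buf, size_t count)
{
    ssize_t ret;

    assert(fops != NULL);
    if (fops->write == NULL) {      /* driver left the slot empty */
        demo_errno = ENOSYS;        /* assumed error code for this sketch */
        return -1;
    }
    ret = fops->write(buf, count);
    if (ret < 0) {                  /* handler returned a negated error code */
        demo_errno = (int)-ret;
        return -1;
    }
    return ret;
}

/* A driver handler for a device that is always out of space. */
static ssize_t full_device_write(const char *buf, size_t count)
{
    (void)buf;
    (void)count;
    return -ENOSPC;
}
```

A caller of demo_write() on the full device sees -1 with demo_errno set to ENOSPC, mirroring what the text describes for write() and rtl_errno.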

A list of ioctl() flags is in include/rtl_ioctl.h. There are many flags for specific devices and for general use. It is recommended that the programmer not create new flags unless none of those found in rtl_ioctl.h fits their needs.

Just as there is a list of ioctl() flags, there is also a list of mmap() flags in sys/mman.h. Only create new flags if they are absolutely needed and none of the pre-existing flags will fit.


10.3 Reference counting

Devices registered with the RTCore kernel are reference counted. A look at include/app/rtl/rtl_posixio.h shows an extra callback named destroy(). As of version 1.2, RTLinuxPro can internally handle reference counts for all devices.

From a developer’s standpoint, this does not require any extra work in most situations, but it is worth stepping through the rules of how RTCore handles these operations. The previous example will be used as a reference point.

First, when the device is registered with rtl_register_dev(), RTCore registers the name and sets the usage count to one. The matching rtl_unregister_dev() call drops the usage count back to 0. In addition, any open() call increments the device’s usage count, while close() decrements it again.

For devices that allocate and destroy areas, it is important that when the last user detaches from the device, any resources associated with that device are destroyed. Consider a device driver that maintains a pointer to a shared region of memory, initialized to NULL, for the user. When the first user calls open(), memory is allocated for use by the threads. When the last user detaches from this device through close(), it is important that the area is deallocated.

This is the reason for the destroy() callback in include/app/rtl/rtl_posixio.h. If the device has work that needs to be done when the last user exits, this hook is called. For the shared memory example, a destroy callback would be added. It is defined as:

void example_destroy(int minor) {

rtl_gpos_free(array_ptr);

}

This would be passed in the fops structure with everything else. When the last user exits, RTCore calls this function so that memory is deallocated safely, and not while other threads may be using it. Otherwise, if some code were using the area when another called rtl_unregister_dev(), the memory would be freed out from under active code.
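The counting rules can be traced with a small stand-alone model. This is a sketch of the behavior as described in this section, not RTCore code: the demo_ names are hypothetical, and the model assumes the destroy callback fires exactly when the count reaches zero.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a reference-counted device; not the RTCore implementation. */
struct demo_dev {
    int usage;                    /* current usage count */
    int minor;
    void (*destroy)(int minor);   /* optional last-user callback */
};

static int destroy_calls;         /* counts callback invocations */

static void demo_destroy(int minor)
{
    (void)minor;
    destroy_calls++;              /* a real driver would free resources here */
}

/* Drop one reference; run the callback when the last one goes away. */
static void demo_put(struct demo_dev *d)
{
    assert(d->usage > 0);
    if (--d->usage == 0 && d->destroy != NULL)
        d->destroy(d->minor);
}

/* Registration holds one reference of its own. */
static void demo_register(struct demo_dev *d, int minor,
                          void (*destroy)(int))
{
    d->usage = 1;
    d->minor = minor;
    d->destroy = destroy;
}

static void demo_open(struct demo_dev *d)       { d->usage++; }
static void demo_close(struct demo_dev *d)      { demo_put(d); }
static void demo_unregister(struct demo_dev *d) { demo_put(d); }
```

With one user still holding the device open, demo_unregister() only lowers the count to 1; the callback runs when that user finally calls demo_close().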

RTCore provides a couple of routines that allow the programmer to control these counts by hand if needed: incr_dev_usage(int minor) and decr_dev_usage(int minor). These are helpful if the developer needs to work with device resources and wants to make sure that the last user does not exit and cause the destruction of all device resources while this work is occurring. An alternative is to perform a normal open() on the device, do the work, and then close(). This is the simplest method, but some drivers may still derive some use from the incr/decr routines.

There is one more factor to keep in mind when using these calls: the rtl_namei() call performs an implicit incr_dev_usage(). This is done in order to simplify the process of safely allocating a device. Functions that use rtl_namei() must make a symmetric decr_dev_usage() call to prevent an artificially raised usage count.

10.3.1 Reference counting and userspace

This reference count concept extends to devices available to userspace processes. Consider the API call rtl_gpos_register_dev(), which allows RTCore code to create devices visible in the GPOS filesystem. If the developer creates a real-time device and a userspace-visible counterpart, there may be userspace processes bound to the area as well. With respect to reference counting, these processes are treated the same way. Each GPOS open() raises the device count and each GPOS close() decrements it. Even if all of the real-time threads close and exit while one userspace process maintains a handle, the destruction of the resource waits until the last user closes. When the userspace code exits, the callbacks will find that it was the last user and will free any resources, just as if the device had been opened by a real-time thread.

If the previous example had added an rtl_gpos_register_dev() call to the creation of a shared memory device (and an rtl_gpos_unlink() to the cleanup), let a userspace application also access the area, and then shut down the real-time threads, the userspace application would still be able to access the area. Once it exits, the close() would occur and bring the usage count to 0, causing the destroy callback to execute and clean up.

Of course, this is a fairly simple example, but it does not get much more complicated in a real-world system. One difference is that most drivers encapsulate information on a per-device basis, so the destroy() logic needs to use the minor parameter in order to determine what should be cleaned up. However, all of the basic concepts apply, and RTCore does all of the work for the developer internally. This allows for greater flexibility and simplicity in the common driver.

Part II

RTLinuxPro Technologies


Chapter 11

Real-time Networking

11.1 Introduction

For many applications, a simple machine running real-time code will solve a problem sufficiently. Common problems are generally self-contained, and there usually is no need to refer to external sources in real-time for information. The configuration data comes from a user application in the general purpose OS that is interacting either with the user or with some normal data source.

However, more complex systems are appearing on the market that need to access real-time data that may not be contained on the local system. An example would be a robot with multiple embedded boards connected by an internal network, where each machine needs to transfer processed information between components in real-time. Visual information that has been processed and converted into motion commands needs to get to the board driving the robot’s legs quickly, or the robot may stumble on an obstacle ahead.

RTLinuxPro offers zero copy, hard real-time networking over both Ethernet and FireWire, through a set of common UNIX network APIs. This allows users to communicate over FireWire links or Ethernet segments with the same calls one would use anywhere else.

For more information on this package, please refer to the LNet documentation or email [email protected].


Chapter 12

PSDD

12.1 Introduction

The standard RTLinuxPro (RTCore) execution model may be described as running multiple hard real-time threads in the context of a general purpose OS kernel. This model is very simple and efficient. However, it also implies no memory protection boundaries between real-time tasks and the OS kernel. For some applications, the single name space for all processes may also be a problem. This is where the Process Space Development Domain (PSDD) comes into play.

In PSDD, real-time threads execute in the context of an ordinary userspace process and thus have the benefits of memory protection, extended libc support, and easier development and debugging. It is also possible to use PSDD for prototyping ordinary in-kernel RTCore modules.

12.2 Hello world with PSDD

The next example looks at a PSDD "hello world" application. The main() function locks all of the process’s pages in RAM, creates an RTCore thread, and sleeps. The real-time thread prints a message to the system log every second. This periodic mode of execution is accomplished by obtaining the current time and using it as a base for the absolute timeout value passed to rtl_clock_nanosleep(3).

#include <rtl_pthread.h>
#include <rtl_time.h>
#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>

rtl_pthread_t thread;

void *thread_code(void *param)
{
    int i = 0;
    struct rtl_timespec next;

    rtl_clock_gettime(RTL_CLOCK_REALTIME, &next);
    next.tv_sec++;
    while (1) {
        rtl_clock_nanosleep(RTL_CLOCK_REALTIME,
                            RTL_TIMER_ABSTIME, &next, NULL);
        rtl_printf("hello world %d\n", i++);
        next.tv_sec++;
    }

    return NULL;
}

int main(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE)) {
        perror("mlockall");
        return -1;
    }

    rtl_pthread_create(&thread, NULL, &thread_code, NULL);

    while (1)
        sleep(1);

    return 0;
}
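The example above advances the wakeup time in whole seconds, so tv_sec can simply be incremented. For shorter periods the nanosecond field has to be carried into the seconds field. The helper below is a minimal sketch of that arithmetic; it uses the standard struct timespec as a stand-in for struct rtl_timespec, and the name timespec_add_ns is an illustrative choice, not an RTCore function.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute wakeup time by period_ns (>= 0), keeping
 * tv_nsec within the range 0..999999999. */
static void timespec_add_ns(struct timespec *t, long period_ns)
{
    assert(period_ns >= 0);
    t->tv_sec  += period_ns / NSEC_PER_SEC;
    t->tv_nsec += period_ns % NSEC_PER_SEC;
    if (t->tv_nsec >= NSEC_PER_SEC) {   /* carry into seconds */
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}
```

Because each deadline is computed from the previous deadline rather than from the time the thread actually woke up, scheduling latency in one cycle does not accumulate into later cycles.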


There are a couple of interesting things about this program. First of all, it needs to use mlockall(2) to make sure it does not get a page fault while in real-time mode. Second, rtl_/RTL_ prefixes are added to the names of all RTCore POSIX functions and constants to distinguish them from other userspace POSIX threads implementations, e.g. LinuxThreads/glibc.

12.3 Building and running PSDD programs

The above example program can be built using the Makefile shown below. rtl.mk is a small makefile fragment found in the top-level directory of the RTCore distribution. It contains assignments of variables that are useful in building RTCore applications. This provides the necessary link to libpsdd.a in the PSDD library to correctly build the program.

all: psddhello

include rtl.mk

psddhello: psddhello.c
	$(USER_CC) $(USER_CFLAGS) -opsddhello psddhello.c \
	-L$(RTL_LIBS_DIR) -lpsdd -N -static

To run the program, execute ./psddhello as root. Messages generated by this program can be viewed using dmesg(8).

12.4 Programming with PSDD

The RTCore paradigm of strict separation between real-time and non-real-time application components still holds with PSDD. Typically, the main() program performs application-specific initialization, locks down process pages in memory, creates some real-time threads using rtl_pthread_create(), and then proceeds to interact with them or just sleeps. Note that real-time threads execute in the same address space as the process, so shared memory is automatically available.

As with kernel-level RTCore, the programmer is restricted in what can be done in the real-time threads. First of all, no GPOS system calls are allowed in real-time threads. If a function that results in a system call, for example sleep(3), is called from a real-time thread, RTCore issues a warning message of the following form to the syslog:

Attempt to execute syscall NN from an RT-thread!

You can use reentrant functions from libc and other libraries, for example sprintf(3), and RTCore API functions.

In GPOS context (meaning non-real-time threads in userspace, as opposed to the hard real-time threads in userspace controlled by PSDD), RTCore API functions are also allowed, as long as they are non-blocking. For example, rtl_clock_nanosleep is not allowed in the GPOS context, while rtl_sem_post is OK.

Running hard real-time threads in user process context requires the process memory map to be fixed while real-time threads are running. RTCore enforces this by making all attempts to change the memory mappings fail after the first real-time thread has been created in a process. The following list considers the ways in which a userspace process memory map may potentially change.

• Automatic stack growth. Ordinarily, the GPOS will attempt to automatically map new pages to the process stack as it grows. For a PSDD program, a fixed amount of stack[1] is allocated for the main() routine at the time the first real-time thread is created. An attempt to use more stack than the allocated amount will cause a segmentation fault.

• Dynamic memory allocation routines, e.g. malloc(), free(), rtl_gpos_malloc() etc., can only be used before the first real-time thread is started[2].

• Memory remapping calls: mmap(), shmat() etc. The same restrictions apply as for malloc().

• fork(), exec(), and the calls based on them, such as popen() and system(), should not be used in PSDD processes, neither before nor after the first real-time thread is started.

• Some libc calls that may appear safe will internally modify the process’ memory map. Because of this, libc interaction after thread creation should be kept to a minimum.

[1] The default is 20480 bytes; this can be changed with rtl_growstack(int stacksize) before the first real-time thread is created.

[2] Starting from RTCore version 2.2, a simple preallocated memory support mechanism is provided for use with PSDD. Please refer to Section 12.5.

An implication of the above concerns PSDD real-time thread stacks. There is an implicit malloc() call in rtl_pthread_create() if the real-time thread’s userspace stack has not been provided with rtl_pthread_attr_setstackaddr(). Therefore, rtl_pthread_attr_setstackaddr() must be used to provide stack space for all real-time threads (with a possible exception for the first thread).

Given the above, the correct initialization sequence of a PSDD application is as follows.

1. Make an mlockall(MCL_CURRENT|MCL_FUTURE)[3] call to lock down the process memory.

2. Allocate all needed memory (including memory for the real-time thread stacks), and establish shared memory and other mappings.

3. Optionally call the rtl_growstack(stacksize) function to specify the amount of stack in bytes for the main() function.

4. Possibly perform additional application initialization.

5. Create application real-time threads. At the time of the creation of the first real-time thread in a program, the main() stack will be allocated and then the process memory map will be fixed. Subsequent malloc(), free(), mmap() etc. calls will fail.
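Steps 1, 2 and 5 of this sequence can be sketched with the standard POSIX calls that the rtl_-prefixed functions mirror. The code below is an illustration under that assumption, not PSDD code: it uses pthread_create() and pthread_attr_setstack() as stand-ins for rtl_pthread_create() and rtl_pthread_attr_setstackaddr(), and RT_STACK_SIZE, lock_memory and start_thread_on_own_stack are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>

#define RT_STACK_SIZE (64 * 1024)   /* illustrative stack size */

/* Placeholder for real-time work: just records that it ran. */
static void *rt_thread(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/* Step 1: lock current and future pages. Returns 0 on success. */
static int lock_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/* Steps 2 and 5: allocate the thread stack up front, then create the
 * thread on it so that thread creation does not allocate memory.
 * On success returns 0 and stores the stack in *stack_out so the
 * caller can free it after joining the thread. */
static int start_thread_on_own_stack(pthread_t *tid, void **stack_out,
                                     void *arg)
{
    pthread_attr_t attr;
    void *stack = NULL;
    int err;

    assert(stack_out != NULL);
    if (posix_memalign(&stack, 4096, RT_STACK_SIZE) != 0)
        return -1;
    if (pthread_attr_init(&attr) != 0) {
        free(stack);
        return -1;
    }
    err = pthread_attr_setstack(&attr, stack, RT_STACK_SIZE);
    if (err == 0)
        err = pthread_create(tid, &attr, rt_thread, arg);
    pthread_attr_destroy(&attr);
    if (err != 0) {
        free(stack);
        return -1;
    }
    *stack_out = stack;
    return 0;
}
```

A PSDD program would treat a failure of the locking step as fatal, as the hello world example does, and would perform every allocation it needs before the first real-time thread is created.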

RTCore API functions have an rtl_ prefix added to their names to avoid ambiguity. This may still cause confusion. For example, both nanosleep() and rtl_nanosleep() are available in the PSDD environment. nanosleep() should only be used in GPOS context (functions called from main()). On the other hand, rtl_nanosleep() should only be called from real-time threads and never from the GPOS. A single program may use both functions in different contexts.

[3] FreeBSD does not have a working mlockall() implementation. The RTCore system uses mlock() internally to emulate the effect of mlockall() for code, data and stack pages. This emulation only works if the program is built statically (-static option to gcc). In addition, mlock() must be used for other mapped memory.


12.5 Preallocated Memory Support

RTCore provides the same in-kernel real-time allocator to PSDD threads. This allocator can be used to make calls to malloc() and free() succeed even after real-time threads have been created. Among other things, this feature allows libc functions that internally call memory allocation routines to continue working.

In order to take advantage of the preallocated memory support in a program, additional flags must be used when the program is built. If the programmer uses gcc to compile and link the program at the same time, add the contents of the make variable PSDD_ADD_CFLAGS to the gcc command line, e.g.

include rtl.mk

program: program.c

$(USER_CC) $(USER_CFLAGS) -oprogram program.c \

$(PSDD_ADD_LDFLAGS) -L$(RTL_LIBS_DIR) -lpsdd

With ld, use PSDD_ADD_LDFLAGS, e.g.

include rtl.mk

program: program.o

$(LD) -oprogram program.o -L$(RTL_LIBS_DIR) -lpsdd \

$(PSDD_ADD_LDFLAGS)

By default, 100 blocks of 70000 bytes are preallocated[4]. This means that up to 100 malloc() requests of at most 70000 bytes each will be satisfied, or many more requests for smaller chunks, depending on the allocation pattern. For example, a 4 byte allocation will cause one of those blocks to be allocated as a collection of 4 byte chunks, allowing for many more than 100 successful calls to malloc(). After a call to free(), a buffer can be reused by later malloc() calls. The limits can be changed on a per-process basis by setting the environment variables PSDD_PREALLOC_BLOCKSIZE and PSDD_PREALLOC_NBLOCKS. All allocations done in PSDD applications use this allocator, for both real-time and non-real-time threads.

[4] This is a large amount by some measures, but if a lower number is needed, the environment variables can be used to override it.
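The sizing arithmetic can be worked through in a few lines of C. This is only a back-of-the-envelope sketch: it assumes each block is carved into equal chunks with no bookkeeping overhead, which the real allocator may not do, and the function names are hypothetical.

```c
#include <assert.h>

/* Defaults quoted in the text. */
#define DEFAULT_NBLOCKS   100UL
#define DEFAULT_BLOCKSIZE 70000UL

/* Total bytes set aside by the preallocator. */
static unsigned long prealloc_total_bytes(unsigned long nblocks,
                                          unsigned long blocksize)
{
    return nblocks * blocksize;
}

/* Upper bound on the number of chunk_size allocations that fit,
 * assuming zero per-chunk overhead (an idealization). */
static unsigned long prealloc_max_chunks(unsigned long nblocks,
                                         unsigned long blocksize,
                                         unsigned long chunk_size)
{
    assert(blocksize > 0);
    if (chunk_size == 0 || chunk_size > blocksize)
        return 0;   /* request cannot be served from a single block */
    return nblocks * (blocksize / chunk_size);
}
```

With the defaults this gives 7,000,000 bytes in total, at most 100 allocations of 70000 bytes, and an idealized ceiling of 1,750,000 allocations of 4 bytes.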


12.6 Standard Initialization and Cleanup

PSDD can be used for prototyping in-kernel RTCore modules. With PSDD, it is often possible to enjoy the convenience and safety of userspace development and then simply recompile the code for inclusion into the kernel for improved performance. Because the in-kernel interface for RTCore applications is similar to userspace, and the in-kernel POSIX names are visible using their rtl_-prefixed counterparts, applications can move from userspace to kernel with relative ease. The build options for the application need to be changed to use the in-kernel flags and to not link against the PSDD libraries.

12.7 Input and Output

PSDD programs have access to all of the available real-time devices. This is accomplished with the standard POSIX IO functions. The PSDD versions of those are rtl_open, rtl_read, rtl_write, rtl_ioctl, rtl_mmap, rtl_ftruncate, and rtl_close. Most devices currently do not implement blocking IO and thus require the O_NONBLOCK flag to open them. The notable exception is /dev/irq. Commonly available devices include:

• /dev/rtfN real-time FIFOs

These are FIFO channels that can be used for communication between real-time and non-real-time components of the system. To create a FIFO, use rtl_open.

rt_fd=rtl_open("/dev/rtf0",O_WRONLY|O_CREAT|O_NONBLOCK);

To set the size of a real-time FIFO to 4000, use:

rtl_ioctl(rt_fd, RTF_SETSIZE, 4000);

After that, the rtl_write call can be used to put data into the real-time FIFO. The user side will use ordinary userspace open/read/write functions to access the FIFO.

• /dev/irqN interrupt devices


These are intended for handling real-time interrupts in userspace context. A blocking read from /dev/irqN blocks execution of the calling thread until the next interrupt number N is received. rtl_ioctl(fd, RTL_IRQ_ENABLE, 1) must be called on the irq file descriptor to enable receiving of further interrupts.

• /dev/ttySN RTCore serial driver

• /dev/lptN RTCore parallel driver

RTCore also provides rtl_inb() and rtl_outb() functions for accessing x86 IO space.

As of RTLinuxPro 2.1, PSDD applications also have access to named FIFOs (rtl_mkfifo(), rtl_unlink()), shared memory (rtl_shm_open(), rtl_shm_unlink()), and named semaphores (rtl_sem_open(), rtl_sem_unlink(), rtl_sem_close()). PSDD applications have access to many of the RTCore API calls in userspace.

12.8 Example: User-space PC speaker driver

The next example shows code from a PC speaker driver written with PSDD. It demonstrates interrupt handling in a userspace process and x86-style IO.


    char ctemp;
    char devname[30];

    sprintf(devname, "/dev/rtf%d", FIFO_NO);
    fd_fifo = rtl_open(devname,
                       RTL_O_WRONLY | RTL_O_CREAT | RTL_O_NONBLOCK);
    if (fd_fifo < 0) {
        rtl_printf("open of %s returned %d; errno = %d\n",
                   devname, fd_fifo, rtl_errno);
        return -1;
    }
    rtl_ioctl(fd_fifo, RTF_SETSIZE, 4000);
    fd_irq = rtl_open("/dev/irq8", RTL_O_RDONLY);
    if (fd_irq < 0) {
        rtl_printf("open of /dev/irq8 returned %d; errno = %d\n",
                   fd_irq, rtl_errno);
        rtl_close(fd_fifo);
        return -1;
    }
    rtl_pthread_create(&thread, NULL, sound_thread, NULL);

    /* program the RTC to interrupt at 8192 Hz */
    save_cmos_A = RTL_RTC_READ(RTL_RTC_A);
    save_cmos_B = RTL_RTC_READ(RTL_RTC_B);
    /* 32kHz Time Base, 8192 Hz interrupt frequency */
    RTL_RTC_WRITE(0x23, RTL_RTC_A);
    ctemp = RTL_RTC_READ(RTL_RTC_B);
    ctemp &= 0x8f; /* Clear */
    ctemp |= 0x40; /* Periodic interrupt enable */
    RTL_RTC_WRITE(ctemp, RTL_RTC_B);
    (void) RTL_RTC_READ(RTL_RTC_C);
}

IBM PC compatible computers have a speaker that can be turned on and off by switching a bit in IO port 0x61. The idea is to convert the incoming audio stream to a series of 1-bit samples that turn this bit on and off to make the speaker produce the sound. Appendix K contains the full source of a userspace PC speaker driver.

The input for the sound driver is a stream of 1-byte logarithmically encoded (ulaw-encoded) sound samples. The most common sampling rate for such files is 8000 Hz. Rather than using a periodic thread, the driver drives the speaker using interrupts from the so-called Real-Time Clock (RTC) available on x86 PCs. The code programs the RTC to interrupt the CPU at 8192 Hz, which is a close enough match for the sampling frequency. The example uses real-time FIFO 3 to buffer samples.

The user module initialization function creates and opens real-time FIFO 3, sets the FIFO size to 4000, and opens the /dev/irq8 device. Interrupt 8 is the RTC interrupt. Then it starts up the thread that is going to do all data processing and programs the RTC to interrupt at the needed frequency.
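The value 0x23 written to RTC register A and the "close enough" match between 8192 Hz and 8000 Hz can both be checked with a little arithmetic. The sketch below relies on the usual behavior of the PC RTC, where the low four bits of register A select a periodic rate of 32768 >> (rate - 1) Hz for rate values 3 through 15 on the 32.768 kHz time base; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Periodic interrupt frequency for the 32.768 kHz time base, given the
 * value written to RTC register A. Only rate-select values 3..15 are
 * handled; 0 is returned for anything else in this sketch. */
static unsigned rtc_periodic_hz(uint8_t reg_a)
{
    unsigned rate = reg_a & 0x0f;

    if (rate < 3)
        return 0;
    return 32768u >> (rate - 1);
}

/* Playback speed-up, in parts per thousand, when samples recorded at
 * sample_hz are clocked out at irq_hz (irq_hz >= sample_hz > 0). */
static unsigned speedup_permille(unsigned irq_hz, unsigned sample_hz)
{
    assert(sample_hz > 0 && irq_hz >= sample_hz);
    return (irq_hz * 1000u) / sample_hz - 1000u;
}
```

For 0x23 the rate-select bits are 3, giving 8192 Hz, so audio sampled at 8000 Hz plays about 2.4 percent fast.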

The real-time thread, shown below, enters an infinite loop. First it calls rtl_read, which blocks on the /dev/irq8 file descriptor until the next interrupt from the RTC is received. Once this happens, it attempts to get a sample from the real-time FIFO. If successful, the data is converted from the logarithmic encoding and the speaker bit is flipped accordingly.

An important point here is that the interrupt processing code has to signal the device that it can generate more interrupts ("clear device irq"). This code is device specific. In addition, the interrupt line needs to be re-enabled in the interrupt controller. The latter is accomplished by using the RTL_IRQ_ENABLE ioctl in the driver.

void *sound_thread(void *param)
{
    char data;
    char temp;
    struct rtl_siginfo info;

    while (1) {
        rtl_read(fd_irq, &info, sizeof(info));
        (void) RTL_RTC_READ(RTL_RTC_C); /* clear IRQ */
        rtl_ioctl(fd_irq, RTL_IRQ_ENABLE);

        if (rtl_read(fd_fifo, &data, 1) > 0) {
            data = filter(data);
            temp = rtl_inb(0x61);
            temp &= 0xfc;
            if (data) {
                temp |= 3;
            }
            rtl_outb(temp, 0x61);
        }
    }

    return 0;
}

The cleanup routine (please see the listing in the Appendix) cancels and joins the thread and closes file descriptors to deallocate interrupt and FIFO resources.
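The filter() call in the thread is defined in the Appendix K listing and is not reproduced in this section. As a rough sketch of what such a conversion can look like, the code below expands a ulaw byte to a linear sample using the standard G.711 expansion and then reduces it to one bit by its sign. The function names and the simple sign threshold are assumptions made for illustration; the actual driver may use a different 1-bit conversion.

```c
#include <stdint.h>

/*
 * Expand one ulaw-encoded byte to a signed 16-bit linear sample
 * (standard G.711 mu-law expansion). Hypothetical helper name.
 */
int16_t ulaw_to_linear(uint8_t u)
{
    int sign, exponent, mantissa, sample;

    u = (uint8_t)~u;                 /* ulaw bytes are stored inverted */
    sign = u & 0x80;
    exponent = (u >> 4) & 0x07;
    mantissa = u & 0x0f;
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84;

    return (int16_t)(sign ? -sample : sample);
}

/*
 * Reduce a ulaw byte to a 1-bit speaker state: 1 for a positive
 * sample, 0 otherwise. Assumption: a plain sign threshold.
 */
int speaker_bit(uint8_t u)
{
    return ulaw_to_linear(u) > 0;
}
```

A result of 1 corresponds to the branch in the thread that sets the low bits of port 0x61, and 0 to leaving them cleared.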


12.9 Safety Considerations

The PSDD environment provides a safe execution environment for hard real-time programs. All arguments of the RTCore API functions are checked for validity and memory protection is enforced. A hard real-time program, however, can potentially bring the system down simply by consuming all available CPU time. To ensure that this does not happen, RTCore provides an optional software watchdog that stops all real-time tasks in case of such an event. The watchdog is enabled by default during the configuration of the system.

Normally, root privilege is required to use PSDD facilities. Memory locking functions in the GPOS also require root privilege. It is possible to allow non-root users to run PSDD applications by reconfiguring the RTCore kernel; however, this is potentially insecure and is not recommended.

12.10 PSDD API

For details on which functions are available from RTCore to PSDD applications, please refer to the RTCore man pages. Each function is detailed with respect to which calling context it can be used from: in-kernel real-time, non real-time, PSDD, and so forth.

12.11 Frame Scheduler

12.11.1 Introduction

Many real-time tasks contain periodic loops that do not require the sophisticated scheduling that RTCore is capable of providing. It is also often convenient to separate scheduling details from program logic. This allows the real-time systems developer to experiment with different scheduling parameters without recompiling application programs. For such cases, PSDD provides a userspace frame scheduler (see footnote 5).

The frame scheduler supports hard real-time scheduling of user space tasks in terms of frames and minor cycles. There is a fixed number of minor cycles per frame. Minor cycles can be either time-driven or interrupt-driven.

Footnote 5: The frame scheduler is only available for Linux systems.


For each task, it is possible to specify the task priority, the CPU to schedule the task on, the starting minor cycle number within the frame, and the run frequency in terms of minor cycles. (For example, if there are 10 minor cycles in a frame, the starting minor cycle is 2 and the run frequency is 3, the task will run at the following minor cycles: 2, 5, 8, 2, 5, 8, ...). If there are multiple tasks ready at the start of a minor cycle, the task with the higher priority is run first.
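The rule in the example above can be written as a small predicate. This is only a sketch of the arithmetic, not code from the frame scheduler; the function name is hypothetical, and it assumes that minor cycles are numbered from 0 to cycles_per_frame - 1 and that the pattern restarts at the starting cycle in every frame.

```c
/*
 * Return 1 if a task with the given starting minor cycle and run
 * frequency runs in minor cycle `cycle` of a frame, 0 otherwise.
 * Assumption: cycles are numbered 0 .. cycles_per_frame - 1 and the
 * pattern restarts at `start` in every frame.
 */
int task_runs_in_cycle(int cycle, int cycles_per_frame, int start,
                       int run_freq)
{
    if (run_freq <= 0 || cycle < start || cycle >= cycles_per_frame)
        return 0;
    return (cycle - start) % run_freq == 0;
}
```

With 10 minor cycles per frame, a starting cycle of 2 and a run frequency of 3, the predicate is true for cycles 2, 5 and 8 only, matching the sequence given in the text.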

The tasks running under a frame scheduler are UNIX processes of the following structure:

void rt_thread(void *arg) {
    /* thread executed in hard real-time */
    while (1) {
        /* block the execution until the next run */
        fsched_block();
        user_code();
    }
}

int main(int argc, char **argv) {
    struct fsched_task_struct task_desc;

    application_init();
    // initialize RT subsystem
    fsched_init(argc, argv, &task_desc, NULL);
    // start real-time thread
    fsched_run(rt_thread, &task_desc);

    /* main thread sleeps forever; hard RT thread is running */
    while (1) {
        sleep(1);
    }
}

The hard real-time part of the user process is a PSDD real-time thread and therefore subject to the same restrictions, e.g., it cannot use UNIX system calls or non-reentrant library functions.

The task code itself does not contain any scheduling information. This information is supplied when attaching a new task to the scheduler via the command line interface. This approach allows the user to change schedules without recompiling.

The power of PSDD can be seen from the fact that the frame scheduler itself is implemented using the hard real-time user facilities of PSDD. Thus, quite complicated real-time applications can be developed using the framework.

12.11.2 Command-line interface to the scheduler

The user manipulates the frame scheduler via the "fsched" command. The supported forms and their meanings are described below.

fsched create

- create and initialize the frame scheduler subsystem. This command has to be issued before any other commands can be used. In the current implementation, this starts the userspace scheduler process, rtl_fsched, and the directory containing rtl_fsched must be present in the user’s PATH variable.

fsched delete

- destroy the frame scheduler subsystem.

fsched config -mpf minor_cycles_per_frame -dt dt_per_minor_cycle [ -s sched_id ] [ -i interrupt_source ]

- configure a scheduler. Must be issued before the scheduler can be used.

fsched [ -s sched_id ] start

- starts the frame scheduler.

fsched [ -s sched_id ] stop

- stops the frame scheduler. If there are any user tasks attached to the scheduler, they are detached and killed.

fsched [ -s sched_id ] pause|resume

- pauses and resumes the execution at the next minor cycle.

fsched attach [ -s sched_id ] -n program -p priority -rf run_freq -smc starting -cpu cpu_number -args "arguments passed to user process"

- attach a program to the frame scheduler. "program" is the name or path of the executable to start. "priority" can lie between 1 (min) and 255 (max). If the CPU is not specified, the default CPU is used. The task starts execution at the "starting" minor cycle number of the next frame with the "run_freq" frequency.

fsched info [ -s sched_id ] [ -n average_runs ]

- display information about the schedulers and tasks. More information is provided in Section 12.11.4.

fsched reset [ -s sched_id ]

- reset scheduler statistics.

fsched debug -p pid

- break in the user process "pid". The break happens at the next minor cycle, and all scheduling activity stops. After that, it is possible to attach to the process with GDB and perform source-level debugging. Please refer to the GDB example in the distribution and to Chapter 7 for more information.

If sched_id is omitted, the default scheduler id of 1 is used. There may be several slot schedulers running concurrently on the same machine. It is up to the user to ensure that there are no conflicts between the schedules.

12.11.3 Building Frame Scheduler Programs

A typical Makefile structure for frame scheduler user programs is as follows:

all: engine

include fsched.mk

engine: engine.c
	$(USER_CC) $(FSCHED_CFLAGS) -o engine engine.c $(FSCHED_LIBS)

fsched.mk is a small makefile fragment that is provided with the slot scheduler. It contains assignments of various variables that encapsulate include paths, compiler switches and libraries.


12.11.4 Running Frame Scheduler Programs

First, it is necessary to make sure the fsched directory is in the PATH:

export PATH=$PATH:/directory_that_contains_fsched

Typically, running frame scheduler programs is accomplished with a shell script like the following:

fsched create

sleep 1

fsched config -mpf 10 -dt 50

fsched attach -n user1 -rf 3 -smc 1 -p 1

fsched start

Here the frame scheduler is created and configured with 10 minor cycles per frame and a 50 millisecond period per minor cycle. Then the user program is attached to execute starting at minor cycle 1 of each frame with a frequency of 3 minor cycles. The task runs at priority 1. Finally, the whole system is started with the fsched start command.

It is often useful to keep a continuously updating window with the scheduler status display. This can be accomplished with the following command:

watch -n 1 fsched info -s 1

This will run the fsched info command every second and display its output full screen. An example of such a screen is displayed below.

Every 1s: ./fsched info -s 1                    Wed Aug 28 19:54:54 2002

FS: 1 baraban IRQ=0 MPF=10 DT=50ms started
CPU0 load 1%

PID   CPU  PRI  FREQ  LAST  MIN   MAX   AVG(us)  TOTAL  OVR  CMD
3474  0    1    3     43.7  30.9  43.7  37.8     15     75   ./user1
3477  0    2    2     17.1  15.0  39.6  18.4     120    0    ./user2

For each task, fsched info displays execution statistics: last, running average, min and max execution times in microseconds, total number of execution cycles, and number of overruns. The percentage of the current CPU time used by the real-time tasks is also displayed.
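To make the meaning of these columns concrete, the sketch below keeps the same kind of per-task statistics in plain C. It is not the frame scheduler's implementation: the structure and function names are hypothetical, the average is a cumulative mean rather than an average over the last few runs, and an overrun is assumed to mean that a run took longer than a caller-supplied budget.

```c
/* Per-task execution statistics, times in microseconds (hypothetical). */
struct task_stats {
    double last;            /* most recent execution time */
    double min;             /* shortest execution time seen */
    double max;             /* longest execution time seen */
    double avg;             /* cumulative mean execution time */
    unsigned long total;    /* number of completed runs */
    unsigned long overruns; /* runs that exceeded the budget */
};

/*
 * Record one run of exec_us microseconds. The structure must be
 * zero-initialized before the first call. Assumption: a run counts
 * as an overrun when it exceeds budget_us.
 */
void stats_update(struct task_stats *s, double exec_us, double budget_us)
{
    s->last = exec_us;
    if (s->total == 0 || exec_us < s->min)
        s->min = exec_us;
    if (s->total == 0 || exec_us > s->max)
        s->max = exec_us;
    s->total++;
    /* incremental mean: avoids storing every sample */
    s->avg += (exec_us - s->avg) / (double)s->total;
    if (exec_us > budget_us)
        s->overruns++;
}
```

The incremental mean keeps the update constant-time and allocation-free, which matters if statistics are gathered from a hard real-time context.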


12.12 Conclusion

PSDD offers a simple means of writing complex real-time code in user space, while still allowing for the normal RTCore approach of splitting real-time logic from management code. Users with no knowledge of GPOS kernel programming can use it for rapid prototyping and deployment of real-time applications. Others may use it as a testbed for code that will eventually run in kernel mode.

Chapter 13

VxIT

13.1 Introduction

Many VxWorks users are looking to move to a modern, capable UNIX that is flexible enough to satisfy integration requirements while at the same time providing hard real-time response. VxIT provides this level of integration with ease, allowing access to much of the native VxWorks API through RTCore and the VxIT Layer.

With hundreds of API calls supported, many VxWorks users can move directly over with very minimal changes to existing codebases, and can also get better performance under RTCore than in the native VxWorks environment.

VxIT also comes with a porting guide, found in rtlinuxpro/vxit/doc in the VxIT build environment, and details on every VxWorks API call. Some components of VxWorks are best handled by native Linux infrastructure, such as the process of getting an IP address for an Ethernet address with DHCP, while others of course are best handled in the RTOS environment. The porting guide provides details on how to handle the migration.

An extensive regression suite comes with VxIT, allowing users to demonstrate the layer on their target hardware directly. Example code details how VxWorks code fits in the RTCore environment. FSMLabs can also provide examples of code that simply moves over without a change.

For more information on VxIT, contact [email protected] or consult the VxIT documentation, if provided with the development kit.

Chapter 14

Controls Kit (CKit)

This chapter provides an overview of CKit by working through a simple Proportional and Derivative (PD) analog controller example. For more in-depth documentation, please refer to the CKit manual.

14.1 Introduction

During the implementation of controllers and control algorithms, one finds oneself needing to handle parameter updates and alarms in a well behaved, controlled manner. Moreover, these may sometimes be handled in the context of a distributed application, as would be the case in dangerous environments. For example, a fully automated assembly plant may need to be centrally monitored and tuned from a remote location.

FSMLabs has addressed this problem by introducing the FSMLabs Controls Kit (CKit). It is a collection of utilities and libraries for building control systems and control interfaces using XML to describe control objects to third party applications. CKit provides software for exporting RTLinux control variables, including methods for defining composite objects, setting alarms and triggers, and updating and exporting control information to either a local or remote machine. CKit makes it easy to develop both localized and distributed applications via a set of API interfaces and libraries as well as the highly portable XML document standard.

The FSMLabs Controls Kit (CKit) is the subcomponent of RTCore which gives developers a mechanism to manipulate both parameters and alarms from the Linux command line and through the network. Additional tools and libraries from both FSMLabs and FSMLabs partners interface to CKit to allow for:

• distributed control and logging

• control of legacy hardware

• controller algorithms

• graphical user interface creation and manipulation

• asynchronous alarm messaging

The core CKit subsystem is divided into the following main subcomponents. Please refer to Figure 14.1 for a visual description:

1. the hard real time component, ckit_module.rtl: this component provides the communications interface and services that connect the RTCore programs to the user space programs. All RTCore applications which use the CKit services must use the CKit hard real time API as described in the CKit Manual to:

• register the parameters/entities of interest

• assign description information to each parameter

• assign attributes and limits to each parameter

• attach each parameter to a global logic tree. This is especially useful to differentiate between similarly named parameters belonging to different controllers. For example, a given RTCore program may have two PID algorithms (PID1 and PID2), both of which may have a parameter named Kp.

• from RTCore programs, request that shell commands be executed on the GPOS shell

• send messages to the GPOS from the RTCore programs

• send alarms of varying degrees of criticality to the GPOS from the RTCore programs

• write third party hard real time libraries that enhance the functionality of RTCore. For example, third party hard real time control libraries may include control algorithms, hardware drivers, networking algorithms, and legacy hardware interfaces.


Figure 14.1: CKit Design


2. the user space CKit Daemon, ckitd: this is the main user space server which monitors the real time side and performs all types of actions on behalf of both the real time component (above) and the user space utilities.

3. the user space real time utilities: these are a collection of user space programs that interface to the CKit Daemon. These tools are used not only to interpret all messages and shell commands generated within the RTCore programs, but also to set and read all parameter information of all registered parameters. Please refer to the CKit Manual for a complete listing of the CKit utilities. The user will use these utilities to interface to the CKit Daemon and query parameters and alarms. The remainder of this chapter will use some of these utilities.

4. the user space C++ libraries: these libraries can be linked against the user’s C++ programs. They provide all the functionality of the user utilities, both locally and remotely via the XML-RPC server. In addition, they can be used to parse the XML responses from the CKit Daemon. Again, please refer to the CKit Manual for a more in-depth description of these libraries.

14.2 Operation of the CKit

To use the CKit, the user must do the following:

• execute ckit_module.rtl: This is only needed once while the computer continues running, and it assumes that rtcore is already running. This enables the hard real time infrastructure of the CKit.

• execute ckitd: This is only needed once while the computer continues running. This enables the non-real time infrastructure of the CKit and monitors the hard real time module, above.

• write CKit capable RT programs: For this, use the RT API to both register critical parameters with CKit and identify alarm conditions.

• execute the user’s RTCore program: see Figure 14.3.1 for a description of one such hard real time program.

• use the CKit non-real time utilities: These are a set of user space utilities designed to do the following from within user space:

– set parameter values in the hard real time module,

– read parameter values from the hard real time module,

– view alarms generated on the user side, the real time side, or both,

– subscribe to asynchronous alarms of varying levels,

– execute shell commands whenever a subscribed asynchronous alarm occurs.

Please refer to the CKit Manual for a more thorough description of the CKit user space utilities.

• use the CKit C++ Library: Write your own user space applications which can mimic all of the aforementioned user space utilities. Please refer to the CKit Manual for a full description.

Optionally, you can also use (or write your own) hard real time and non real time libraries. For example, see the appropriate chapters in the CKit Manual for a complete description of how to use and write your own libraries, which you can share with your co-workers or clients in binary form.

The next sections will demonstrate a simple hard real time programming example and its execution. This example will use the core CKit to register entities which can be used to implement a simple Proportional and Derivative (PD) controller.

14.3 PD Controller Implementation Using Core CKit Entities

We now present a simple example which registers the parameters for a simple Proportional & Derivative (PD) controller, as well as the set point variable for the PD controller.


14.3.1 Entity Registration

The entities for this project will be housed within a toplevel group entity, “Toplevel” (registered under the name “Controller” in the listing). Please refer to Figure 14.3.1 for the source code listing.

To the “Toplevel” entity, we are going to link, as child entities, two additional entities:

• a group entity, PD, which is going to group two subentities:

– a float entity, Kp, which is to act as the proportional gain and in this case models the stiffness of the valve controller.

– a float entity, Kd, which is to act as the derivative gain and in this case models the damping provided by the valve controller.

• an integer entity, Coolant, which is to act as the setpoint to the PD controller. In this case, this variable denotes the desired valve opening, specified as a percentage of the total gap.

For this example, all non-group entities will be updateable in real time from the user space side.

Optionally, we choose to provide, for our entities, a set of attributes and “suggestions” for the graphical utility. The GUI has the option of either ignoring these suggestions or acting on them. The settable attributes or suggestions are, among others, any of the following:

• type of widget to use when displaying the entity

• units to use for the widget

• display string to use

• is the minimum value locked?

• is the maximum value locked?

• is the current value locked?

• is the minimum value auto-ranged?

• is the maximum value auto-ranged?


In this case, when the GUI displays the Kd and Kp entities, we will request that it use a dial widget. We are also going to request that the GUI display the value of Kp and Kd using the C format strings “%.1e” and “%.3f”, respectively. Last but not least, we are going to request that the GUI use the units “N/m” to display the Kp entity, and the units “Ns/m” to display the Kd entity.

Finally, during cleanup (after rtl_main_wait()), we simply need to destroy the toplevel entity “Toplevel”. This call will not only unlink the “Toplevel” entity, but will also recursively unlink all offspring of “Toplevel”.

/**********************************************************************
 * Include our CKit header, declare our entities
 **********************************************************************/
#include <stdio.h>
#include <pthread.h>
#include <rtl_posixio.h>
#include <ckit/rtmodule.h>

CK_entity Toplevel, PD, Kp, Kd, Coolant;

/**********************************************************************
 * All float entities should be initialized within the thread!!
 **********************************************************************/
void *myThread(void *arg)
{
    /*
     * PD GROUP, CHILD OF TOPLEVEL
     */
    CK_group_init(&PD, "PD",
                  "Proportional + Derivative Controller",
                  &Toplevel);

    /*
     * CHILDREN OF PD GROUP
     */

    /* proportional gain and attributes. Don't let users change the
     * limits, only the current value. */
    CK_scalar_float_init(&Kp, "Kp",
                         "This sets the loop gain for the controller",
                         &PD,
                         0.1, 10.0, 4.2);
    CK_entity_set_sugg_str(ckWidget, &Kp, "ckit::dial");
    CK_entity_set_sugg_str(ckRepresentation, &Kp, "%.1e");
    CK_entity_set_sugg_str(ckUnits, &Kp, "N/m");
    CK_entity_set_sugg_opt(isMinLocked, &Kp, true);
    CK_entity_set_sugg_opt(isMaxLocked, &Kp, true);
    CK_entity_set_sugg_opt(isCurrLocked, &Kp, false);

    /* derivative gain and attributes. Don't let users change the
     * limits, only the current value. */
    CK_scalar_float_init(&Kd, "Kd",
                         "This sets the derivative gain for the controller",
                         &PD,
                         1.0, 3.0, 2.2);
    CK_entity_set_sugg_str(ckWidget, &Kd, "ckit::dial");
    CK_entity_set_sugg_str(ckRepresentation, &Kd, "%.3f");
    CK_entity_set_sugg_str(ckUnits, &Kd, "Ns/m");
    CK_entity_set_sugg_opt(isMinLocked, &Kd, true);
    CK_entity_set_sugg_opt(isMaxLocked, &Kd, true);
    CK_entity_set_sugg_opt(isCurrLocked, &Kd, false);

    /* valve controller (setpoint for PD controller) and attributes. In
     * this case, give users the ability to tune the limits for this
     * specific entity. */
    CK_scalar_int_init(&Coolant,
                       "Coolant",
                       "Set the desired opening (percent) of valve",
                       &Toplevel,
                       20, 80, 30);
    CK_entity_set_sugg_str(ckWidget, &Coolant, "ckit::dial");
    CK_entity_set_sugg_str(ckUnits, &Coolant, "%");
    CK_entity_set_sugg_str(ckRepresentation, &Coolant, "%.1d");
    CK_entity_set_sugg_opt(isMinLocked, &Coolant, false);
    CK_entity_set_sugg_opt(isMaxLocked, &Coolant, false);
    CK_entity_set_sugg_opt(isCurrLocked, &Coolant, false);

    /* Now, do something useful with these entities: your controller
     * code goes here. For example, you can have your code sample analog
     * devices, calculate the PD algorithm, and then write out to an
     * analog device. You would extract the actual entity values as
     * follows:
     *
     * float KpGain, KdGain;
     * int SetPnt;
     *
     * KpGain = CK_scalar_get_float(&Kp);
     * KdGain = CK_scalar_get_float(&Kd);
     * SetPnt = CK_scalar_get_int(&Coolant);
     */

    return NULL;
}

/**********************************************************************
 * Main routine. Note: all float entities should be initialized within
 * a thread that has floating point permissions!! Initializing float
 * entities within the main routine will only work in most x86s, but
 * will fail on PPCs.
 **********************************************************************/
int main(void)
{
    pthread_t MyThread;
    pthread_attr_t attr;

    /*
     * TOPLEVEL GROUP
     */
    CK_group_init(&Toplevel,                  /* Pointer to entity */
                  "Controller",               /* registered name */
                  "Conveyor belt controller", /* tooltip */
                  NULL);                      /* parent */

    /* Create our PD thread, and be sure to set floating point
     * permissions. The attribute object must be passed to
     * pthread_create for the setting to take effect. */
    pthread_attr_init(&attr);
    pthread_attr_setfp_np(&attr, true);
    pthread_create(&MyThread, &attr, myThread, 0);

    /*
     * WAIT UNTIL WE ARE SHUT DOWN
     */
    rtl_main_wait();

    /* Shut down our thread */
    pthread_cancel(MyThread);
    pthread_join(MyThread, NULL);

    /*
     * RECURSIVELY DESTROY ENTITIES, START AT THE TOPLEVEL ENTITY
     */
    CK_entity_destroy(&Toplevel);

    return 0;
}

Fig. 14.3.1 Programming example which demonstrates the initialization and utility of CKit entities.
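The listing leaves the control computation itself as a comment. As a minimal sketch of what "calculate the PD algorithm" could mean once the gains and setpoint have been read from the entities, the function below computes one step of a textbook PD law, u = Kp*e + Kd*de/dt. The function name, its parameters and the fixed sampling interval are assumptions for illustration; they are not part of the CKit API.

```c
/*
 * One step of a textbook PD control law (hypothetical helper):
 *
 *     error  = setpoint - measured
 *     output = kp * error + kd * (error - prev_error) / dt
 *
 * dt is the sampling interval in seconds and must be positive. The
 * current error is stored through error_out (if not NULL) so that the
 * caller can pass it as prev_error on the next step.
 */
double pd_step(double kp, double kd, double setpoint, double measured,
               double prev_error, double dt, double *error_out)
{
    double error = setpoint - measured;
    double derivative = (error - prev_error) / dt;

    if (error_out)
        *error_out = error;
    return kp * error + kd * derivative;
}
```

In the real-time thread, kp, kd and setpoint would come from the values read out of the Kp, Kd and Coolant entities on each iteration, so that changes made from user space take effect on the next control step.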

14.3.2 Program Execution

To execute this program, we do the following:

1. if you haven’t done so already, start up rtcore

2. if you haven’t done so already, start up ckit_module.rtl

3. if you haven’t done so already, start up ckitd

4. execute our RT program. In this case, the name of the source file is “mycontroller.c”. Using our Makefile, we obtain the executable “mycontroller.rtl”. To execute it, we simply type:

./mycontroller.rtl

Now, we are ready to begin querying and setting parameters.

5. query the parameter tree by typing:

ck_hrt_op -L

where the -L option denotes that we want to print out a listing of the full tree. This should give you:

+-#> Controller # group #
| +-#> Coolant # integer # 30
| +-#> PD # group #
| | +-#> Kd # float # 2.200
| | +-#> Kp # float # 4.2e+00
| |
|

In this case, each entity is displayed along with its type and current value. Additional verbosity can be obtained by specifying the “-n#” option as follows. Note that the larger the number, the greater the degree of verbosity used to display the tree:

ck_hrt_op -L -n3

In this case, a verbosity level of 3 will add not only the current values of the entities, but also the minimum/maximum bracket for each entity, if applicable:

+-#> Controller # group # #
| +-#> Coolant # integer # 30 # [0,100]
| +-#> PD # group # #
| | +-#> Kd # float # 2.200 # [1.000,3.000]
| | +-#> Kp # float # 4.2e+00 # [1.0e-01,1.0e+01]
| |
|

You can also obtain the XML version of the same by providing the “-x” option in the command line as follows:

ck_hrt_op -L -x

which will return the entire tree and associated information in XML format. Please refer to the CKit Manual for a more complete description of the ck_hrt_op utility.

6. set the value of Kp by typing:


ck_hrt_op -p Controller::PD::Kp -v -s 2.5

which states that for the parameter entity Kp (“-p Controller::PD::Kp”), we want to set the current value (“-v”) to the value of 2.5 (“-s 2.5”). You should see a synchronous alarm appear on the screen which states whether the command was successful.

7. obtain the maximum allowable value of Kp by typing:

ck_hrt_op -p Controller::PD::Kp -g -u

which states that we want to query (“-g”) the upper limit (“-u”) of the parameter entity Kp (“-p Controller::PD::Kp”). The response from the CKit is:

1.0e+01

Note that the string format is consistent with the “%.1e” syntax used in the source file.

8. query the description and type of a given entity by using the -d and -t options as follows:

athena% ck_hrt_op -d -p Controller::PD::Kp

This sets the loop gain for the controller

where in this case we show both the shell command prompt (athena%) prior to the command itself and the subsequent output. Similarly, to query the entity type:

athena% ck_hrt_op -t -p Controller::PD::Kp

float

9. query parameters on a remote target machine. To do so, first, on your target machine, type the following:

ck_xmlrpc_server

Then, on your host machine, type any of the aforementioned commands but add the “-X” option to specify the remote target machine. For example, assuming that the target machine name is “coyote.hilton.net”, then on your host machine you would type:

ck_hrt_op -X http://coyote.hilton.net:3134/RPC2 -L

This should give you:

+-#> Controller # group #
| +-#> Coolant # integer # 30
| +-#> PD # group #
| | +-#> Kd # float # 2.200
| | +-#> Kp # float # 4.2e+00
| |
|

which is exactly the same output as before. The URL used is of the form:

http://machineName:3134/RPC2

where port number 3134 is the default port number as configured in the CKit configuration file.

10. subscribe to asynchronous alarms of all levels using ck_alarm:

ck_alarm -s all

Note that in our example, we did not explicitly specify any alarm messages, although alarms can be generated for many reasons within the CKit infrastructure.

11. subscribe to asynchronous alarms of levels 2 and 3 using ck_alarm. Also, when an alarm occurs, execute the script “myAction.sh”, which accepts a single argument denoting the alarm level:


ck_alarm -s 2,3 -e "/home/efhilton/myAction.sh %L"

Note that in this example, the %L is a key token which will automatically be replaced by the actual alarm level each time that myAction.sh is executed. Please refer to the CKit Manual for a more thorough description of this command.

12. view the parameter trees on the local machine using the GtkPerl graphical user interface. Note that this executable is written in Perl and can be easily edited by the user to match the user’s criteria:

ck_hrt_op_GUI

Note that for this last command to work, you must make sure that you have both Gtk and GtkPerl installed on your machine. If not, you can usually obtain them directly from your Linux distribution CDs or from CPAN, the central Perl repository. If any of the needed packages are missing, then the program will complain accordingly.

13. view and manipulate the parameter trees on a remote machine (assume once again coyote.hilton.net) using the graphical user interface:

ck_hrt_op_GUI http://coyote.hilton.net:3134/RPC2

Note that for this last command to work, you must make sure that you have both Gtk and GtkPerl installed on your local machine (not the target machine). If not, you can usually obtain them directly from your Linux distribution CDs or from CPAN, the central Perl repository. Also, you need to make sure that ck_hrt_op is in your path. Please refer to Figure 14.2.

14. view the parameter trees in the Java graphical user interface. Note that for this command to work, you need to have Sun’s Java fully installed on your machine. Then, type:

java -jar /opt/rtldk-2.2/bin/ckitjava.jar


Figure 14.2: CKit GTK+ Perl GUI


Figure 14.3: Screenshot of Java GUI demonstrating several of the availablewidgets


Figure 14.4: CKit Java GUI showing a robotic project


This GUI can also be embedded into the web browser of your choice. Please refer to Figures 14.3 and 14.4. Additionally, you can position the widgets in any way that makes the most sense for your control project, and embed background images to help you best document your project.
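The entity paths passed with -p in the steps above, such as Controller::PD::Kp, name a node in the parameter tree by listing its ancestors separated by "::". The sketch below shows one way a program of your own might split such a path into its components. It is purely illustrative: the function name and the fixed component size are assumptions, and this is not how the CKit utilities are known to parse paths.

```c
#include <string.h>

#define PART_MAX 32   /* assumed maximum component length, incl. NUL */

/*
 * Split `path` on "::" into at most max_parts components, copying
 * each into parts[i]. Returns the number of components, or -1 if a
 * component is empty or too long, or if there are too many of them.
 * Hypothetical helper, for illustration only.
 */
int split_entity_path(const char *path, char parts[][PART_MAX],
                      int max_parts)
{
    int count = 0;
    const char *start = path;

    for (;;) {
        const char *sep = strstr(start, "::");
        size_t len = sep ? (size_t)(sep - start) : strlen(start);

        if (len == 0 || len >= PART_MAX || count >= max_parts)
            return -1;
        memcpy(parts[count], start, len);
        parts[count][len] = '\0';
        count++;

        if (!sep)
            return count;
        start = sep + 2;   /* skip the "::" separator */
    }
}
```

Applied to Controller::PD::Kp, this yields the three components Controller, PD and Kp, which mirror the nesting shown by the tree listing.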

You can also write your own C++ programs which take advantage of the CKit user space library for the sake of querying parameters and alarms both locally and on remote machines. Within your C++ programs, you can easily query parameters, set parameters, subscribe to alarms, etc. Please refer to the CKit Manual for a more thorough description of this library. Alternatively, you can also write XML-RPC programs in any language which will query the XML-RPC server over the network. The following section describes one such example.

14.4 XML-RPC API

It is possible to make XML-RPC queries through the network using any language that is XML-RPC capable and which can further interpret the resulting XML. For example, users have created interfaces to Microsoft Excel which they can then use to query RTLinux boxes from within their Microsoft Windows boxes. For this, Visual Basic was used not only to perform the XML-RPC calls, but also to interpret the resulting XML.

We now present a simple example which will query the parameter tree running on a remote target machine. In this case, we’ll reuse the machine used in the previous section, coyote.hilton.net. It is assumed at this point that ck_xmlrpc_server is already running on the target machine.

The example code is presented in Figure 14.4. In this example, several modules are first included. Then, we initialize some constants which we later use during the call to the remote server.

#!/usr/bin/perl -w
use Frontier::Client;
use MIME::Base64;
use strict;

# Let's initialize some constants
use constant TRUE=>1;
use constant FALSE=>0;
use constant ROOTPATH=>"hrt";
use constant NODEPATH=>"Controller";
use constant TREEDEPTH=>1024;
use constant SHOWHIDDEN=>FALSE;

# Set the target address:
my $target = "http://coyote.hilton.net:3134/RPC2";

# initialize the client
my $rpc = new Frontier::Client ( url => $target )
    || die "Unable to connect for whatever reason";

# Do query
my $response = $rpc->call('fsmlabs.ckit.getTree',
                          NODEPATH, ROOTPATH, TREEDEPTH, SHOWHIDDEN);

# print out the response
printf("%s\n\n", $response);

Fig. 14.4 Perl program that queries the parameter tree on a remote target machine.

Notice the simplicity of the code. It is then the responsibility of the user to parse (if necessary) the XML response from the server. Still, the example shows that it is quite easy to query the remote target machine.

14.5 Conclusion

You are now on your way to understanding the utility of the Controls Kit. Its strength lies in helping you manipulate parameter trees in real time, while at the same time monitoring asynchronous alarms. In addition, please note that there are now two graphical front ends for the CKit. The first is a Perl based interface, and the second one is a Java based interface. Please refer to the CKit documentation for more detail. Both of these are designed to help greatly in the development of interfaces for use in industrial control environments. In short, the Controls Kit is designed to help in your deployment and creation of control algorithms. Happy Controlling!

Chapter 15

RTLinuxPro Optimizations

Optimizations are of course very important in real-time systems. However,many are detrimental to the development process. With RTLinuxPro, itis very easy to enable proper optimizations during development and disablethem on deployment. This chapter will cover general optimizations and tech-niques useful in developing RTCore applications.

15.1 General optimizations

First off, this section will cover the basic optimizations that can make the difference between having real-time response or a non real-time system. Primarily, these can be grouped into the following categories, mainly targeted at the x86 architecture:

• Power management

• System Management Interrupts (SMIs)

• Interrupt controllers

Power management is generally the simplest of these to solve. It should be disabled in the BIOS for nearly all systems. Contact FSMLabs by email ([email protected]) for information on systems that absolutely require both power management and real-time response.

The reason power management is a problem is that it will dynamically shift the CPU's clock speed, so that an operation that took a given amount of time at one point may take a different amount of time later on, if the clock speed has been changed in the meantime. Disabling the feature will ensure a constant clock speed for the hardware and a reliable execution time for real-time code.

For SMIs, the hardware responds to certain events by essentially taking the CPU offline while it manages internal work. This appears to software as a long-delayed execution of whatever was happening at the time of the SMI. If a real-time thread is executing when an SMI occurs, the code may be delayed by tens of milliseconds or more.

RTLinuxPro provides a tool to disable SMIs for some hardware, in the utilities directory. This does not cover all possible SMI sources, but it will help on many configurations. The application will disable SMIs while it is running and reenable them on exit. The result is that while it is running, a system that would not ordinarily have acceptable real-time performance will be capable of standard response times.

Interrupt management is also important - when using Linux as the GPOS, it is important that APIC support is enabled if the hardware is capable. This allows for a much higher performance interrupt controller, which results in better real-time performance.

15.2 RTCore-internal optimizations

Users that have purchased the source of the RTCore OS have several build options available for optimization and error checking. These are made available through the RTCore build system, usually entered with:

make menuconfig

Under the internal debugging section, users can enable or disable several options. First is a paranoid mode - this enables more extensive internal checks within RTCore. While helpful for debugging, it does add overhead, and should be disabled on deployment for users that need every last cycle from their hardware. There is also a similar error and sanity check mode that checks for valid file descriptors and so on - the impact is minimal, but may be important in very tight cases.

Additionally, under 'Selective building of RTLinux modules' is another means of improving performance. By default, the debugger is built into the system, which will catch faults and other problems. However, this does add overhead to the normal exception handling of faults in the GPOS, as the RTCore debugger is always run first on any exception. This overhead can be slightly detrimental to GPOS throughput. If the best possible performance is needed from the GPOS on deployment and there is no need for the debugger, RTCore can be rebuilt without the module.

15.3 CPU management

Proper management of the processor is essential to getting the best performance out of the system. RTCore offers several means of managing this resource. After each of the aspects has been covered, they will be combined into a single example of how to effectively manage CPU resources. This section is primarily interested in SMP systems.

15.3.1 Targeting specific CPUs

By default, real-time threads are spawned on the same processor that the loading program is run on. On an SMP system, the GPOS may be running the loading program on any CPU at any given time, so real-time threads may be started on CPU 0, CPU 1, or any other, depending on the current scheduling considerations for the GPOS.

If the application at hand has 2 real-time threads and each of them needs more than 50% of the CPU's bandwidth, together they need more than one full CPU, so they must be directed to different CPUs in order to be able to handle the workload. (By default, they would both attempt to start on the processor they were created on.)
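The arithmetic behind this can be sketched in a few lines of C. This is only an illustration: cpu_can_host() and the utilization figures are hypothetical and are not part of RTCore. The helper sums the fraction of one CPU that each thread needs and reports whether the total fits on a single processor.

```c
#include <stddef.h>

/* Hypothetical helper, not an RTCore call: returns 1 if the threads whose
 * CPU demands are listed in util[] (each a fraction of one processor,
 * 0.0 to 1.0) can share a single CPU, and 0 if they cannot. */
int cpu_can_host(const double *util, size_t count)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < count; i++)
        total += util[i];

    return total <= 1.0;
}
```

Two threads that need 60% and 55% of a CPU fail this check and have to be placed on different processors, while threads needing 30% and 40% could share one.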

This is handled easily with a pthread attribute and a call to pthread_attr_setcpu_np(), which takes the attribute for the thread and the CPU number the thread should run on. The example at the end of this section will demonstrate the entire sequence in action.

15.3.2 Reserving CPUs

When there is no real-time activity on a specific CPU (all threads are waiting, no interrupts waiting, etc), RTCore allows the GPOS to execute non real-time code. For example, this means that Linux can then let cron run or let some scripts execute on that processor.

This allows GPOS throughput to increase, but it does so at the expense of the cache. As the GPOS runs tasks on the processor, it dirties the cache, pushing real-time thread code and data out. When it comes time to run a thread again, it may be necessary to get some of its data back into the cache, resulting in a slight delay. The delay is small, but it reduces the effective bandwidth of the CPU because it has to wait while the cache refills.

RTCore allows the user to reserve a CPU such that the GPOS is not allowed to execute code on that specific CPU. This is done again with a pthread attribute and a call to pthread_attr_setreserve_np(), providing the attribute object and the CPU to be reserved. This call has the effect of refocusing interrupts away from that processor. Please refer to the documentation for this call for full details.

Once the given CPU is reserved, real-time threads can generally stay mostly in cache, allowing for performance at the hardware limit.

15.3.3 Interrupt focus

Interrupts can be a source of some latency, if there are a large number of them coming into a processor while there is a real-time thread working. The overhead is minimal, but it can cause slight disturbances. In these cases, it is generally best to refocus the non-real-time interrupts to another processor.

Focus is accomplished by using rtl_irq_set_affinity(int irq, unsigned long *mask, unsigned long *oldmask). Callers provide the IRQ number to be refocused and a mask of the CPUs that should be allowed to handle that IRQ. The call returns the previously used mask to the user through the third argument.
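As a minimal sketch of how such a mask might be built, the helpers below turn CPU numbers into a bit mask. The function names are hypothetical and not part of RTCore, and the layout (bit n stands for CPU n) is an assumption inferred from the example later in this section, where a mask of 0x2 selects CPU 1. Check the rtl_irq_set_affinity() documentation before relying on it.

```c
/* Hypothetical helpers, not RTCore calls. Assumption: bit n of the
 * affinity mask stands for CPU n (so 0x2 selects CPU 1). The caller must
 * keep cpu below the number of bits in an unsigned long. */
unsigned long cpu_to_mask(unsigned int cpu)
{
    return 1UL << cpu;
}

/* Combine several CPU numbers into one mask. */
unsigned long cpus_to_mask(const unsigned int *cpus, unsigned int count)
{
    unsigned long mask = 0;
    unsigned int i;

    for (i = 0; i < count; i++)
        mask |= cpu_to_mask(cpus[i]);

    return mask;
}
```

Under that assumption, the result of cpu_to_mask(1) could be stored in a variable and its address passed as the second argument of rtl_irq_set_affinity().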

By setting a thread on a specific processor, disallowing the GPOS from running on that processor and focusing non real-time IRQs away from that processor, real-time threads can execute at the very limit of the hardware. The thread and data can generally live entirely within cache and the only interrupts seen will be those related to any real-time interrupts that are still focused to that specific CPU.

The act of reserving a processor as described above will automatically refocus interrupts away from the targeted processor. After this thread is started, any real-time interrupts can be refocused back to the reserved CPU.


15.3.4 Example

Now that the three management techniques have been introduced, the following example shows them in an application. Here, the program starts a thread on CPU 1, disallows the GPOS from running on that CPU, and focuses an interrupt to that CPU. Since the act of reserving the processor refocuses interrupts away from it, the program needs to refocus the interrupt back.

    #include <pthread.h>
    #include <stdio.h>
    #include <semaphore.h>

    pthread_t thread;
    unsigned long mask = 0x2, oldmask;   /* 0x2 selects CPU 1 */
    sem_t irq_sem;

    /* Real-time handler for IRQ 12: pend the interrupt for the GPOS
     * and wake the waiting thread. */
    unsigned int irq_handler(unsigned int irq, struct rtl_frame *regs)
    {
        rtl_global_pend_irq(irq);
        sem_post(&irq_sem);
        return 0;
    }

    void *thread_code(void *t)
    {
        /* Reserving the CPU moved IRQ 12 away; focus it back here
         * and save the previous mask for later. */
        rtl_irq_set_affinity(12, &mask, &oldmask);

        while (1) {
            sem_wait(&irq_sem);
            printf("Got IRQ 12\n");
        }
        return NULL;
    }

    int main(void)
    {
        pthread_attr_t attr;

        sem_init(&irq_sem, 1, 0);
        pthread_attr_init(&attr);
        pthread_attr_setcpu_np(&attr, 1);       /* run the thread on CPU 1 */
        pthread_attr_setreserve_np(&attr, 1);   /* reserve that CPU */

        rtl_request_irq(12, irq_handler);

        pthread_create(&thread, &attr, thread_code, 0);

        rtl_main_wait();

        pthread_cancel(thread);
        pthread_join(thread, NULL);

        rtl_free_irq(12);

        /* Restore the affinity mask that was in place before. */
        rtl_irq_set_affinity(12, &oldmask, &mask);
        sem_destroy(&irq_sem);

        return 0;
    }

That is all that needs to be done for all 3 optimizations. Stepping through the program, first it sets up the semaphore between the interrupt handler and the thread and initializes the pthread attribute. The attribute is used for two of the three steps: the thread is targeted to CPU 1, and that CPU gets reserved. (The reservation does not actually take place until the thread has started.)

Next, the IRQ handler is installed, and the thread is started. All that is left to do is refocus the interrupt back to the CPU the real-time application is on. This is done in the thread, saving the old mask so the program can restore it when it is done. That's all there is to the third step.

As can be seen, it's not difficult. In roughly 50 lines of code, all three factors have been integrated into the system. For more details, please refer to the RTLinuxPro examples and documentation.

Part III

Appendices


Appendix A

List of abbreviations

• AGP: Accelerated Graphics Port

• API: Application Programming Interface

• APIC: Advanced Programmable Interrupt Controller

• APM: Advanced Power Management

• BIOS: Basic Input Output System

• CLI: CLear Interrupt flag

• CPU: Central Processing Unit

• DA/AD: Digital to Analog / Analog to Digital conversion

• DAQ: Data AcQuisition

• DMA: Direct Memory Access

• DRAM: Dynamic RAM

• EDF: Earliest Deadline First

• FAQ: Frequently Asked Questions

• FIFO: First In First Out

• FP: Floating Point


• GNU: GNU’s Not Unix (a recursive acronym)

• GPOS: General Purpose Operating System

• GUI: Graphical User Interface

• IDE: Integrated Drive Electronics / Integrated Development Environment

• IP: Internet Protocol

• IPC: Inter Process Communication

• IRQ: Interrupt ReQuest

• ISA: Industry Standard Architecture / Instruction Set Architecture

• ISR: Interrupt Service Routine

• NVRAM: Non-Volatile RAM

• OS: Operating System

• PCI: Peripheral Component Interconnect

• PCB: Printed Circuit Board

• PIC: Programmable Interrupt Controller

• PLIP: Parallel Line Internet Protocol

• POSIX: Portable Operating System Interface

• RAM: Random Access Memory

• RFC: Request For Comments

• RMS: Rate Monotonic Scheduler

• ROM: Read Only Memory

• RPM: RedHat Package Manager

• RT: Real-Time


• RTOS: Real-Time Operating System

• SCSI: Small Computer System Interface

• SHM: SHared Memory

• SLIP: Serial Line Internet Protocol

• SMI: System Management Interrupt

• SMM: System Management Mode

• SMP: Symmetric Multi Processor

• SRAM: Static RAM

• STI: SeT Interrupt flag

• TCP: Transmission Control Protocol

• TCP/IP: Transmission Control Protocol / Internet Protocol

• TLB: Translation Lookaside Buffer

• UDP: User Datagram Protocol

• UP: Uni Processor

• XT-PIC: Old XT (Intel 8086) Programmable Interrupt Controller


Appendix B

Terminology

• GPOS : General Purpose Operating System - The non real-time operating system that RTCore runs as its lowest priority thread.

• RTCore : The core technology that powers RTLinuxPro and RTCoreBSD. Viewing the system as running two operating systems, RTCore is the RTOS that provides the deterministic control needed for real-time applications.

• EDF-scheduler : In this scheduling strategy, rather than using the priority of a task to direct scheduling, the scheduler selects the task with the closest deadline. In other words, it selects the task with the least time left until it should be run. This scheduling strategy has a "flat" priority and is optimal for systems that handle asynchronous events and non-periodic real-time tasks.
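The selection rule can be sketched in C as follows. This is an illustration only, not the RTCore scheduler: the struct edf_task layout and the edf_pick() name are assumptions made for this sketch.

```c
#include <stddef.h>

/* Illustrative task record; the field names are assumptions. */
struct edf_task {
    int id;
    unsigned long long deadline_ns;  /* absolute deadline */
    int ready;                       /* nonzero if the task is runnable */
};

/* Returns the index of the ready task with the earliest deadline, or -1
 * if no task is ready. On equal deadlines the lower index wins. */
int edf_pick(const struct edf_task *tasks, size_t count)
{
    int best = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (!tasks[i].ready)
            continue;
        if (best < 0 || tasks[i].deadline_ns < tasks[best].deadline_ns)
            best = (int)i;
    }
    return best;
}
```

Note that no priority field is consulted at all, which is what the "flat" priority in the definition refers to.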

• FIFO Scheduler (SCHED_FIFO): A First In First Out Scheduler is one in which all processes/threads at the same priority level are scheduled in the order they arrived on the queue. When the scheduler is called, the queue for jobs of the highest priority is checked first. If there is no thread runnable in the highest priority level the next level is checked and so forth. A job scheduled with a policy of SCHED_FIFO can monopolize the CPU if it is always ready to run and if there is no mechanism to preempt it.

• Frontside Bus : This is the high speed bus that exists between the CPU and memory.


• Host Bridge : The host bridge acts as a hub between most major subsystems in a PC. It acts as an interface between CPUs, memory, video, and other busses, such as PCI.

• North Bridge : The north bridge of a machine is the controller responsible for high speed operations. Bus components that require high speed access, such as CPU to memory, PCI interaction, etc., are considered part of the north bridge.

• PCI-Bridge: A logic chip (controller) connecting PCI busses. Access to PCI devices runs over the PCI-Bridge from another subsystem to the PCI bus where the peripheral device is located.

• PCI-ISA-Bridge: To support legacy ISA devices most PCs have an ISA bus available via the PCI bus. The connecting controller is referred to as the PCI-ISA-Bridge.

• Rate Monotonic Scheduler (RMS) : An optimized scheduling policy that is applicable if all tasks have a common periodicity. The criterion is that all tasks fit the requirement

$$\sum_{i=1}^{n} \frac{C_i}{T_i} < n\left(2^{1/n} - 1\right)$$

with C being the worst case execution time and T the period of each task. As the task number n increases, the utilization bound converges to about 69%, which is not as efficient as other schedulers, but is preferable in situations requiring static scheduling.
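The requirement above can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, execution times and periods are taken to be in the same unit, and the n-th root of 2 is computed with a few Newton iterations so that the code does not depend on the math library. Passing the test is sufficient for schedulability under this bound; failing it does not by itself prove that a task set cannot be scheduled.

```c
#include <stddef.h>

/* n-th root of 2 by Newton's method on f(x) = x^n - 2. The start value
 * 1 + 1/n lies at or above the root, so the iteration converges from
 * above without overflow. n must be at least 1. */
static double root_of_two(unsigned int n)
{
    double x = 1.0 + 1.0 / (double)n;
    int iter;
    unsigned int k;

    for (iter = 0; iter < 20; iter++) {
        double p = 1.0;                 /* p = x^(n-1) */
        for (k = 1; k < n; k++)
            p *= x;
        x -= (p * x - 2.0) / ((double)n * p);
    }
    return x;
}

/* Right-hand side of the requirement: n * (2^(1/n) - 1). */
double rms_bound(unsigned int n)
{
    return (double)n * (root_of_two(n) - 1.0);
}

/* Returns 1 if the sum of C[i]/T[i] is below the bound, else 0. */
int rms_schedulable(const double *c, const double *t, unsigned int n)
{
    double u = 0.0;
    unsigned int i;

    if (n == 0)
        return 1;
    for (i = 0; i < n; i++)
        u += c[i] / t[i];
    return u < rms_bound(n);
}
```

For two tasks the bound is roughly 0.83, so a pair of tasks that each use a quarter of the CPU passes, while a total utilization above 1 can never pass.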

• South Bridge : The south bridge is the collection of controllers that deal with slower component systems, such as serial controllers, floppy, PCI-ISA bridges, etc.

• Asynchronous Signals: All signals that reach a thread from an external source, meaning that a different thread of execution is posting a signal via pthread_kill(). Not all thread functions are async safe, as signals may come at any time, even when the thread is not ready to be interrupted. An asynchronous signal is delivered to the process and not to a specific thread within a multithreaded process.


• Async Safe : Thread functions that can handle asynchronous signals without leading to race conditions or synchronisation problems (like blocking other threads indefinitely, leading to inconsistency in global variables, etc.) are considered to be async safe functions. Functions that are not async safe should be used with these possible side effects in mind, meaning that the points at which they are safe to call should be set appropriately. If a thread has a cancellation state of PTHREAD_CANCEL_ENABLE and the cancellation type set to PTHREAD_CANCEL_ASYNCHRONOUS then only async-safe functions should be used, or signal handlers must be installed.

• Atomic Operation : An execution operation during which a context switch can occur but state is preserved. During atomic operations it is legal to assume that condition variables, mutexes, etc. will be unchanged, as proper locking has taken place. An atomic operation behaves as if it were completed as a single instruction.

• Barrier : A thread synchronisation primitive based on condition variables. A barrier is a point in the execution stream at which a set of threads will wait until all threads requiring synchronisation have reached it. After all threads have reached the barrier the condition predicate is set TRUE and execution of all threads can continue.

• Busy wait loop : This is the act of waiting for an event in a running process, using the CPU during the wait. Rather than being put to sleep and rescheduled when the event occurs, the process spins doing useless activity during the wait. This saves the overhead of scheduling another process in and then having to reschedule the first.

• Cache flush : A cache flush involves writing the content of the cache to memory or to whatever media is appropriate. This is only necessary on hardware that does not support write through caching or on SMP systems when a task moves between CPUs. Generally cache flushes have a noticeable influence on performance, especially for real-time operations, because the flushed data must be refetched. The resulting delay from a flush may result in jitter.

• Conditional Variable : A condition variable is a complex synchronisation mechanism comprised of the condition variable itself and its predicate, as well as an associated mutex. A thread acquires the mutex and then waits until the condition is signaled, then performs the task depending on the condition, releasing the mutex afterward.

• Context Switch : Removing the currently running thread from the processor and starting a different thread on this CPU. A context switch in RTCore will only save the state of the integer registers unless floating point is enabled. (See pthread_attr_setfp_np.)

• Deadlock : Deadlock occurs if synchronisation primitives are used inconsistently such that different threads are waiting for each other to release resources. An example of such a setup is two threads that each hold a mutex that the other is waiting for. Since both threads are blocked neither will free the mutex it holds and thus both are blocked indefinitely.

• Detached thread : When creating a thread with pthread_create, the attributes passed will by default make the thread joinable, meaning that another thread can call join on it. This is commonly done to catch return status and to finish the cleanup of the joinable thread. If the thread's state is set to PTHREAD_CREATE_DETACHED, then all resources of this thread will be released when the thread exits, so there is no return status and no further synchronization needed.

• Embedded system : Operating systems and software for systems that perform non-traditional tasks are referred to as embedded systems. These systems span a wide range, but in general, embedded systems are low memory systems and have restrictions with respect to available mass-storage devices as well as minimal power.

• Global Variables : Global variables are those that are visible throughout the application, rather than being restricted to a specific thread. The variables themselves are not protected against concurrent activity and usually require some kind of synchronization primitives to ensure safe handling.

• Handler : If an event should be handled in a specific way, a function or thread will be programmed that can respond to this event (e.g. update the pixels on the screen if the mouse moves). The association of this function or thread with a specific event makes it the handler of this event. The handler must be explicitly registered with whatever will detect the event, which is generally the operating system.

• Hard Real-time : Systems capable of guaranteeing worst case jitter and a worst case response time, regardless of system load, qualify as being hard real-time systems.

• InterProcess Communication : Commonly referred to as IPC, this refers to any mechanism by which multiple processes can coordinate their action. These mechanisms range from files to shared memory, semaphores, and other shared resources.

• Interrupt : All processors have the capability to receive external signals via dedicated interrupt lines. If an interrupt line is set the processor will halt execution and jump to an interrupt handling routine. Interrupts are electric signals caused by some hardware (peripherals like network cards or IDE disk controllers) and have a software counterpart that is part of the operating system.

• Interrupt Interception : In RTCore, no interrupt will directly reach Linux's interrupt handlers as every interrupt is handled by the interrupt interception code first. If there is a real-time handler available this handler will be called, otherwise it will be passed on to Linux for handling when there is time.

• Interrupt Handler : The action that should be taken when an interrupt occurs is defined in a kernel thread that is called upon receiving that interrupt. The mapping of interrupt service routines to an interrupt handler is done by the GPOS kernel as well as by a real-time thread. This means that there can be two handlers for the same interrupt in RTCore. In this case the real-time handler is called first, and only if the task is not destined for the real-time handler will it be passed to the GPOS interrupt handler for execution.

• Interrupt Mask : An interrupt mask determines which interrupts actually can reach the system. A bit mask is used to enable/disable interrupts.

• Interrupt Response Time: On asserting a hardware interrupt the system will call the associated interrupt service routine. The time from the assertion of the interrupt (the electric signal being active on the interrupt pin) to the point where this interrupt service routine is called is defined as the interrupt response time. In practice the interrupt response time is the time from asserting the interrupt until the system acknowledges it or responds with a noticeable action. This time is therefore a little longer than the "theoretical" interrupt response time.

• Instruction Set: To communicate with specific hardware, a set of operations is used that act on the hardware directly (i.e. manipulate register content). This instruction set is hardware specific and directly maps to machine code.

• Jitter : Jitter values represent the time variance in completion of an event. This can represent anything from task completion variance to real-time scheduling variance.
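Jitter can be summarized in several ways. As one minimal sketch, the hypothetical helper below (not an RTCore function) reports the peak-to-peak spread of a set of measured intervals, that is, the longest sample minus the shortest. The choice of nanoseconds as the unit is an assumption of this example.

```c
#include <stddef.h>

/* Hypothetical helper: peak-to-peak jitter of measured intervals, in the
 * same unit as the samples (nanoseconds here). Returns 0 when fewer than
 * two samples are given. */
long long jitter_peak_to_peak(const long long *samples_ns, size_t count)
{
    long long min, max;
    size_t i;

    if (count < 2)
        return 0;

    min = max = samples_ns[0];
    for (i = 1; i < count; i++) {
        if (samples_ns[i] < min)
            min = samples_ns[i];
        if (samples_ns[i] > max)
            max = samples_ns[i];
    }
    return max - min;
}
```

For a nominally 1 ms periodic thread whose measured periods range from 999,700 ns to 1,000,400 ns, this reports 700 ns.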

• Kernel : The kernel is the core of an operating system, providing the basic resources and controlling access to these resources.

• Kernel Module : Modules are dynamically loaded capabilities, represented as object code that is linked into the kernel as needed. Once a kernel module is loaded it is no different from a statically compiled-in kernel function.

• Kernel Thread : A kernel thread is similar to a normal thread in that it represents a specific execution path, although in this case it runs within the kernel. Kernel threads can have more restrictions than normal threads, such as stack space, but offer the advantage of access to kernel structures and subsystems.

• Latency : The time between requesting an action and the actual occurrence.

• Local Variables : As opposed to global variables, local variables are only visible to a single thread or single execution scope.

• Multithreaded : A process that has more than one flow of control. (In general, there are also shared resources between these control paths.)

• Mutex (Mutual Exclusion Object): A mutex is an object that allows multiple threads to synchronize access to shared resources. A mutex has two states: locked and unlocked. Once a mutex has been locked by a thread all other threads that try to lock it will block until the thread that acquired the mutex unlocks it. After this one of the blocked threads will acquire it.

• Polling : Polling is the strategy of checking a condition or a condition change while in a loop. Generally polling is an expensive strategy to test conditions/condition changes.

• Priority Inversion : If a high priority thread blocks on a mutex (or any other synchronisation object) that was previously locked by a low priority task, this will lead to priority inversion: The lower priority thread must gain a higher priority in order to guarantee execution time. Otherwise another high priority thread may come along and keep the lower priority task from running, so that the mutex is never unlocked and both the low and high priority threads stall. This scenario leads to a lower priority task blocking a high priority task, which is an implicit priority inversion.

• Process : An entity composed of at least one thread of execution and a set of resources managed by the operating system that are assigned to this entity.

• Race Condition : If two executing entities compete for a resource and there is no control ensuring safe access of the resource, unpredictable behavior can occur. Race conditions can occur with any shared resources if appropriate synchronization is not done by all entities that require access to this resource.

• Re-entrant Function : A reentrant function will behave in a predictable way even if multiple threads are using it at the same time. Any synchronisation or access of global data is handled in a way that it is safe to call these functions multiple times without fear of data corruption.

• RR Scheduler : In Round Robin Scheduling, there are different priority levels available, and the ordering of threads/processes is the same as in SCHED_FIFO. The difference is that each scheduling entity has a defined time-slice. If it does not exit or block before the time-slice expires it will be preempted by the kernel and the next runnable thread will be scheduled.

• Scheduler : The thread that handles the task-queue of the system. It decides which process is to be run next after a process gives up the CPU (either by exiting or blocking). The order in which the scheduler will grant control of the CPU is described by the scheduling policy and the priority assigned to each task.

• Scheduling Jitter : The variance of time between the point at which a process requested scheduling and the time at which it actually runs. In the common literature, scheduling jitter will sometimes refer to the absolute deviation from the requested timing.

• Semaphore : The simplest form of a semaphore is equivalent to a mutex, the binary semaphore. Associated with a semaphore is a counter that defines the number of threads that can access the protected resource via the semaphore. On access of the protected resource a thread acquires the semaphore by decrementing the counter. If the counter reaches 0 no other threads can access the protected resource. When a thread releases the protected resource it increments the semaphore again. The underlying mechanism is a condition variable with the condition counter > 0.
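The counter behavior can be illustrated with a toy, non-blocking semaphore built on C11 atomics. This is a sketch of the counting rule only, with hypothetical toy_sem names. It is not how RTCore implements semaphores, and real code would use the POSIX sem_wait() and sem_post() calls, which block the caller instead of reporting failure.

```c
#include <stdatomic.h>

/* Toy counting semaphore: the counter holds the number of threads that
 * may still enter. */
struct toy_sem {
    atomic_int count;
};

void toy_sem_init(struct toy_sem *s, int initial)
{
    atomic_init(&s->count, initial);
}

/* Try to take one unit. Returns 1 on success, 0 if the counter is 0. */
int toy_sem_try_acquire(struct toy_sem *s)
{
    int c = atomic_load(&s->count);

    while (c > 0) {
        /* Succeeds only if nobody changed the counter in between;
         * on failure c is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&s->count, &c, c - 1))
            return 1;
    }
    return 0;
}

/* Give one unit back. */
void toy_sem_release(struct toy_sem *s)
{
    atomic_fetch_add(&s->count, 1);
}
```

Initialized to 1, this behaves like the binary semaphore mentioned above: the first acquire succeeds and every later one fails until a release.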

• Shared Memory : Memory accessed by more than one process. Shared memory can be accessed from a real-time process as well as from non real-time (GPOS) processes for data exchange or for process synchronisation. RTCore offers this mechanism, although there are many types of shared memory systems.

• Signal : A numeric value delivered to a process via system call, describing an action to be taken by the process. The process may accept a signal or mask it. If a process has a signal handler installed for the signal number sent, this handler will be executed on arrival of the signal. Signals issued from a thread within a process can be posted to a specific thread (via the thread id), while signals sent between processes are received at the process level and are not directed to a specific thread.

• Signal Handler : To manage asynchronous signals at a process level signal handlers are installed. These can then be called by the thread that received the signal. Note that signal handlers are installed at the process level and not at the thread level, so if an asynchronous signal is received, it cannot be directed at a specific thread. Only signals issued from within the process can be sent to specific thread IDs that exist within that specific process.

• Sigaction : The sigaction call controls the actions taken upon reception of a given set of signals. It sets up signal handlers for the action, among other things.
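A minimal sketch of installing a handler with the standard POSIX sigaction() call follows. The choice of SIGUSR1 and the helper names are assumptions made for illustration. Which signals are available to real-time threads is a question for the RTCore documentation.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int signo)
{
    (void)signo;
    got_usr1 = 1;
}

/* Install on_usr1() as the handler for SIGUSR1.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
int install_usr1_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);   /* block no extra signals in the handler */
    sa.sa_flags = 0;

    return sigaction(SIGUSR1, &sa, NULL);
}

/* Report whether the handler has run. */
int usr1_was_received(void)
{
    return got_usr1 != 0;
}
```

After install_usr1_handler() succeeds, a call such as raise(SIGUSR1) runs the handler, and usr1_was_received() then reports 1.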

• Soft Interrupts : All GPOS interrupts in RTCore are soft interrupts. These interrupts are not directly related to hardware events but are the hardware events that the real-time kernel has passed on to the GPOS for management if there was no real-time interrupt handler associated with the interrupt.

• Soft Realtime : Systems that can provide guaranteed average response times to a class of events, but cannot provide a guaranteed maximum scheduling variance.

• Spinlock : Waiting on a mutex can be done in an infinite loop, probing for the mutex on every iteration. Spinlocks block the CPU and are thus "expensive" operations if it is not ensured that the thread will only spin for a very short time. A spinlock is efficient only if it is not active longer than the amount of time it would take to perform a context switch.
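The probing loop can be sketched with the C11 atomic_flag type. The toy_spin names are hypothetical and this is not the RTCore spinlock implementation. It only shows the test-and-set loop that the definition describes.

```c
#include <stdatomic.h>

struct toy_spinlock {
    atomic_flag held;
};

void toy_spin_init(struct toy_spinlock *l)
{
    atomic_flag_clear(&l->held);
}

/* One probe: returns 1 if the lock was taken, 0 if it was already held. */
int toy_spin_trylock(struct toy_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Spin, using the CPU, until the lock is free. */
void toy_spin_lock(struct toy_spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held,
                                             memory_order_acquire))
        ;   /* busy wait: probe again on every iteration */
}

void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The empty loop body in toy_spin_lock() is where the cost comes from: the waiting thread does no useful work, so the critical section protected by such a lock should be very short.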

• Spurious Wakeup : If a thread is waiting on a condition variable and receives a signal it can be woken up and could return from the wait, even if the condition the code was waiting for has not really occurred. To prevent race conditions due to spurious wakeups, evaluation for condition variables and condition wait is done in a loop. This way, a spurious wakeup will be caught and the code will continue to wait for the condition.
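The loop can be sketched with standard POSIX condition variable calls. The variable and function names are hypothetical. The point is the while loop around pthread_cond_wait(), which re-checks the predicate after every wakeup.

```c
#include <pthread.h>

static pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  data_cond = PTHREAD_COND_INITIALIZER;
static int data_ready = 0;   /* the predicate, protected by data_lock */

/* Producer side: set the predicate, then signal the condition. */
void announce_data(void)
{
    pthread_mutex_lock(&data_lock);
    data_ready = 1;
    pthread_cond_signal(&data_cond);
    pthread_mutex_unlock(&data_lock);
}

/* Consumer side: wait until the predicate really holds. */
void wait_for_data(void)
{
    pthread_mutex_lock(&data_lock);
    while (!data_ready)                      /* loop, not a plain if */
        pthread_cond_wait(&data_cond, &data_lock);
    data_ready = 0;                          /* consume the event */
    pthread_mutex_unlock(&data_lock);
}

/* Helper for inspection: returns the current predicate value. */
int data_is_pending(void)
{
    int pending;

    pthread_mutex_lock(&data_lock);
    pending = data_ready;
    pthread_mutex_unlock(&data_lock);
    return pending;
}
```

Had the wait been written with an if instead of a while, a spurious wakeup would let wait_for_data() continue with data_ready still 0.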

• Stack : To pass arguments and context information for function calls each process and thread has a stack associated with it. This stack is private to each process or thread.


• Symbol Table : A symbol table exists to map a mnemonic symbol to an address or location where the contents are stored. In the context of this book, this generally refers to the kernel symbol table, which maps the addresses of kernel structures. In the general case, this can be used anywhere, such as in a custom application.

• Synchronous Signal : Any signal that is the result of the thread's action, and occurs in direct reaction to that action. This is opposed to asynchronous signals, which may arrive at any time and may not be related to a thread action. An example of a synchronous signal would be a thread that does a division by zero causing an FPE_INTDIV. Synchronous signals are delivered to the process that caused them and not to the specific thread.

• Task : In the process model a task represents a process, while in the thread model a task can be a process (single threaded process) or a thread of execution within a process. The task concept is used in the V1 API of RTCore and was replaced by the thread based POSIX API. Usage of the V1 API is not recommended.

• Task Priority : Every task will be called by the scheduler to execute in an order specified by its priority level. POSIX specifies a minimum of 32 priority levels. Besides the priority of a task its scheduling policy (SCHED_FIFO, SCHED_RR, etc) will influence when it is run as well. A priority level only specifies its rank within the scheduling policy and is in relation to other tasks in that same scheduling class.

• Thread : Each independent flow of control within a process that has an execution context associated with an instruction sequence that can be executed. A thread is fully described by its context, instruction sequence, and state.

• Thread Context : Each thread exists within the context of a process, and this context is comprised of a set of resources, such as register context, stack, private storage area, attributes and the instructions to execute. It also includes the structures through which the thread is accessible (thread structures and other management constructs).

• Ticks : Each cycle executed by a processor counts as a tick. Processors such as the Pentium maintain a count of the number of these ticks that have occurred since boot time. This value is useful as an indicator of the length of time of a task, among other things.
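To use a tick count as a measure of time, the difference between two counter readings is divided by the tick frequency. The helper below is a sketch with a hypothetical name. It assumes the tick frequency in Hz is known, is nonzero, stays constant during the measurement, and is below roughly 18 GHz so that the intermediate product fits in 64 bits.

```c
#include <stdint.h>

/* Hypothetical helper: convert a tick delta to nanoseconds for a counter
 * running at hz ticks per second. Whole seconds and the remainder are
 * handled separately to avoid overflowing 64 bits on large tick counts. */
uint64_t ticks_to_ns(uint64_t ticks, uint64_t hz)
{
    uint64_t secs = ticks / hz;
    uint64_t rem  = ticks % hz;

    return secs * 1000000000ULL + (rem * 1000000000ULL) / hz;
}
```

On a 2 GHz processor, for example, a task that spans 3,000,000,000 ticks took 1.5 seconds. The constant frequency assumption is one more reason to disable power management, as Chapter 15 recommends.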

• Timers: Hardware components on the motherboard or integrated into the CPU that measure time and can be the source of a periodic trigger.

• User-space Thread : These are threads created, synchronized, and terminated using the threads API running in user space. They are associated with a kernel-scheduled entity (a process) and are not visible to the kernel, but rather, are scheduled by a separate scheduling entity that lives within the process. The kernel only sees a single process and will not distinguish between the different threads. Note that this differs from userspace POSIX threads, in which each thread appears to the kernel as a schedulable process.

• User Mode : A mode of operation in which access is restricted to the subset of functions available to normal user-space processes. Kernel-level subsystems and special processor modes are not available to userspace code. A thread executing application code runs in user mode until it issues a system call, which the kernel then executes on behalf of the process. Once the system call completes and returns, user mode is reentered.

• User-space : The memory space a user process exists in. Execution of user code and all resources associated with a user mode operation reside in user space. User space is left when a privileged operation is executed (syscall).
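As a companion to the Task Priority entry above, the priority range a host actually offers for each policy can be queried through the standard POSIX calls sched_get_priority_min() and sched_get_priority_max(). The following is a minimal sketch for a stock POSIX system; the helper names priority_levels() and print_priority_ranges() are made up for this illustration, and the numbers reported by an RTCore installation may differ from those of the host.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Number of distinct priority levels the host offers for a policy,
 * or -1 if the policy is not supported.
 * (Hypothetical helper name, not part of any RTCore API.) */
int priority_levels(int policy)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;
    assert(hi >= lo);
    return hi - lo + 1;
}

/* Print the range for the two real-time policies named in the glossary. */
void print_priority_ranges(void)
{
    printf("SCHED_FIFO: %d levels\n", priority_levels(SCHED_FIFO));
    printf("SCHED_RR:   %d levels\n", priority_levels(SCHED_RR));
}
```

Calling print_priority_ranges() from main() is enough to see whether the host meets the 32-level minimum mentioned in the entry.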

Appendix C

Familiarizing with RTLinuxPro

This chapter is intended to provide a simple overview of how to interact with RTLinuxPro. It is assumed that the kit is already installed as defined by the instructions provided with the CD or download.

RTLinuxPro installs into a root directory determined by the version, and cannot be safely moved from that location. This is because all of the tools are built against a known location and depend on the existence of that location for configuration, libraries, and other information. This allows the kit to be installed on any distribution, regardless of host glibc version, installed utilities, etc.

This installation directory is different in every version, so that the programmer can keep multiple installations on the same machine. The root path is /opt/rtldk-x.y, where x is the major release number and y is the minor release number. The remainder of this chapter will assume this path as the root location of all commands.

C.1 Layout

The installation guide should walk the user through the specifics of each directory, so this section will focus only on the important ones:

1. bin - This is where all of the tool binaries exist. The developer must make sure the full path to this directory is first in their $PATH variable, so that these tools are used before any others in the path. (Running gcc -v should report information including the /opt/rtldk-x.y path if this is configured properly.)


2. rtlinux_kernel_x_y - This is the prepatched kernel to be loaded on the real-time system, whether that is the development host or a target board. The developer will need to take the precompiled image and install it like any other kernel, or rebuild it to suit their environment. (The x_y value will correspond to the major and minor kernel version numbers provided with the release.)

3. rtlinuxpro - All of the RTCore components (and optionally, code), scripts, examples, drivers, and other RTCore-specific tools are contained here.

There are many other directories, but these are the ones central to this chapter. Also of note is the docs directory, which contains API documentation and more information on getting started with RTCore.

C.1.1 Self and cross-hosted development

There is a large divergence in ways that the kit is used because each embedded system has its own set of specific requirements. In many cases, the development kit will be installed on an x86 machine, but built with compilers and real-time code targeting a different architecture. In this case, the programmer will need to find the correct way of getting the kernel image, real-time modules, and filesystem to the embedded device, whether it is a flash procedure, a BOOTP configuration with an NFS root, or whatever is appropriate. For simplicity's sake, this document will assume that the installation of the development kit is on the machine that will be used for the actual real-time execution. (Although this is not always an optimal solution.)

If the user is doing cross-hosted development, refer to the installation instructions provided with RTLinuxPro. Included with the installation manual are some example procedures for getting a kernel built and transferred to the target board. Since each board varies slightly, this document will not cover the specifics of this procedure.

C.2 Loading and unloading RTCore

RTCore must be loaded in order for any real-time services to be available. The process is simple once the patched kernel has been compiled and installed. Building and installing the kernel is the same as with any other kernel, and as that procedure is beyond the scope of this book, refer to the normal Kernel-HOWTO for details. Essentially, it involves changing to the rtlinux_kernel_x_y directory, building a kernel image suited to the device's needs (or using the provided stock image), and installing that image. This may be a local boot loader (GRUB) update or it might be a matter of making the image available for TFTP by an embedded board. Again, this example assumes a self-hosted development environment.

Once the system is running the correct kernel, RTCore can be loaded with the following commands:

cd /opt/rtldk-x.y/rtlinuxpro

./modules/rtcore &

This will load the RTCore OS found in the installation, which will vary based on any additional components installed. Unloading the OS consists of:

killall rtcore

C.2.1 Running the examples

Now some of the examples can be run. The regression test found in scripts/regression.sh in the rtlinuxpro directory can also be run to provide a robust test on the user's particular equipment. In order to get a feel for the steps needed to load and run real-time code, it is worth stepping through the examples provided. Each of the examples is built to be self-explanatory, and can be run by just executing the local binary.

Once the application code is running, the test will generally continue indefinitely. After the user is done running it, the application can be stopped with a CTRL-C.

C.3 Using the root filesystem

Included with the development kit is a root filesystem, built for the intended target of the kit. This means that if the programmer is using the generic PowerPC version of the kit, there is a root filesystem containing a set of binaries built for generic PowerPC. This will provide a solid Linux installation for use by the development system. For a generic PowerPC version, there is a ppc6xx_root directory inside the development kit, but this name will vary by architecture.

For a generic x86 system as described in the installation section, the programmer will likely use the host filesystem already present. However, if they intend to use separate systems for development and testing (as advised) or are targeting a different architecture completely, this option should help speed the development process. For many embedded systems, it is much simpler to NFS root mount a remote filesystem, at least for testing, rather than rebuilding an image every time the developer generates new binary code for the target.

For most distributions, exporting this directory is a very simple exercise and is no different than exporting any other NFS mount point. [1] Edit the file /etc/exports, run exportfs -a and the tree will be available to the embedded system. In many environments, it is also advisable to simply have the device retrieve its kernel image from DHCP and build the image such that it automatically mounts the root filesystem from the development machine. If this is useful for the programmer's environment, the kernel build offers an option to build the boot parameters in as automatic arguments to the bootstrap process. For example, in a PowerPC build, under the kernel's 'General setup' option, the user can set the boot options to be (as one variable):

root=/dev/nfs nfsroot=10.0.0.2:${RTLDK_ROOT}/ppc6xx_root ip=bootp

This setting makes the root filesystem an NFS mount served by the machine at 10.0.0.2, under ${RTLDK_ROOT}/ppc6xx_root. Be sure to replace ${RTLDK_ROOT} with the correct location of the development kit. The ip setting configures the device to use BOOTP to configure itself, although there are many other options for configuring the interface. These arguments are built into the kernel image and are passed as normal parameters to the boot process during a TFTP-based boot, just as if they were typed in at a LILO prompt. For more information on these options, refer to Documentation/nfsroot.txt inside the Linux kernel tree. Many users need read/write access to the root filesystem, at least for testing. Add a rw after the root=/dev/nfs to use this or remount the NFS root as read/write on the target, with:

[1] Linux 2.6 users may need to compile NFS support as a module. For example, Fedora Core 3 init scripts assume that NFS-related kernel components are built as modules and may report failures if they cannot rmmod/insmod a module.


mount -o remount,rw 10.0.0.2:${RTLDK_ROOT}/ppc6xx_root /

Once these options are configured and the remote device is using the NFS mount as its root filesystem, the developer can do all development on the host machine with the development kit and move the resulting images under the NFS mount point. For simplicity, it is often useful to simply copy the rtlinuxpro directory from the kit under the NFS root mount. While some of these pieces should be removed for the final system, this simple copy will allow access to all of the targeted real-time code needed for the embedded device.
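When the server address or kit location changes between boards, it can help to see the NFS-root boot argument string from this section as three fixed keywords with two variable parts. The following is a minimal sketch in plain C that assembles the same string; the function name build_nfsroot_args() is invented for this illustration and is not part of the development kit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "root=/dev/nfs nfsroot=<server>:<export_dir> ip=bootp" into buf.
 * Returns 0 on success, -1 if an argument is missing or buf is too small.
 * (Hypothetical helper, shown only to make the string's layout explicit.) */
int build_nfsroot_args(char *buf, size_t size,
                       const char *server, const char *export_dir)
{
    int n;

    assert(buf != NULL);
    if (server == NULL || export_dir == NULL || size == 0)
        return -1;
    if (strlen(server) == 0 || strlen(export_dir) == 0)
        return -1;
    n = snprintf(buf, size, "root=/dev/nfs nfsroot=%s:%s ip=bootp",
                 server, export_dir);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Passing "10.0.0.2" and the expanded kit path ending in ppc6xx_root yields the string given earlier in this section; the result still has to be entered into the kernel configuration by hand.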

C.4 Summary

This chapter on development kit usage might come across as being rather light, and there is a reason for that. The development kit is intended to be simple to use and to allow a programmer to install a stable build environment for producing real-time code. As such, this involves installing the kit, placing the tools in the user's path and then using the various components (such as the root filesystem and modules) as needed. The intent is that configuration and use are as simple as possible, allowing the programmer to concentrate on the task at hand and not be distracted by development tool problems in the build environment. Specific details such as board configuration for network boot are described in more detail in the devkit manual.pdf document provided with RTLinuxPro.

Appendix D

In-kernel C++ Support

As of RTLinuxPro 2.2.1, FSMLabs has added support for in-kernel hard real-time C++. This adds to the C++ method already provided in userspace applications through PSDD.

Existing support covers a wide range of C++ capabilities, including transparent real-time memory allocation, STL coverage and of course many of the language features, such as classes, inheritance, etc. This coverage is provided through integration with the compiler provided with the development kit. As of this writing, C++ support is limited to hardware that has hardware floating point capabilities.

D.1 Building C++ applications

Building in-kernel C++ real-time applications is straightforward and very similar to building normal RTCore applications. An example Makefile is provided in rtlinuxpro/examples/cpp. The largest difference between a normal C application and C++ is that the toolchain's standard C++ library must also be linked into the application.

Some linking scenarios require the additional $(LD) flag to link multiple .o files into a single .o object file for the user's .rtl application, as in the following:

$(LD) -r -o multi_obj.o cpp1.o cpp2.o --allow-multiple-definition

In this case, a cpp1.cpp and a cpp2.cpp file were compiled to cpp1.o and cpp2.o. To build multi_obj.rtl, they must be linked into multi_obj.o. C applications do not need the additional linker flag in most situations, but if it is not used on C++ object files, the link will fail with object redefinitions.

D.2 Running C++ applications

Running a C++ application is the same as a normal RTCore application - just build the .rtl application and run it like a normal user binary. However, there are a couple of additional applications that must be run to provide some services to support C++. The first is a real-time memory allocator to provide a backing allocator for STL. The second is a C++ wrapper module that transparently handles a lot of the work that the toolchain's C++ library will need. These applications must be run before the real-time application using the following commands:

rtlinuxpro/utilities/allocator/allocator.rtl &

sleep 1

rtlinuxpro/utilities/cpp/cpp.rtl &

Once these are loaded, the programmer can run their application. If they are not present, an application built with C++ will fail to load and the missing symbols from the attempt will show up in the dmesg log.

D.3 Caveats

As of this writing, many C++ features are supported, but RTCore does not provide full coverage for the language. Specifically, the following items should be considered when using C++ in RTCore applications:

• Exception handling for real-time applications is not fully supported.

• Due to assumptions about stdout and userspace applications, C++ apps should not use printf() or cout. Instead, rtl_printf() is the preferred solution. This will be addressed in future versions.

• Users should use the POSIX names for POSIX functions, as opposed to the rtl_ prefixed names.

• Users should always:


#include <iostream>

as a minimum. This ensures that the minimal RTCore headers are pulled in and new/delete are overloaded. Just changing the compiler to be g++ instead of gcc is not enough.

Also, extensions to POSIX are not directly visible through the C++ namespace in all situations. For example, a user may want to use pthread_attr_init() and pthread_attr_setcpu_np(). The first is a POSIX function and will be directly usable as is. The second one may not be visible to the user as the included pthread.h will not pull in POSIX extensions for C++. In this case, the user should declare this as extern, wrapped safely for C++:

#ifdef __cplusplus

extern "C" {

extern int pthread_attr_setcpu_np(pthread_attr_t *attr, int);

}

#endif

Future versions of RTLinuxPro may remove this requirement.


Appendix E

Important system commands

This is an overview with some usage examples that might be helpful when working with RTCore and most UNIXes in general.

bunzip2

The bzip2 and bunzip2 commands are for compressing and decompressing .bz2 files. Bzip2 offers better compression rates than gzip, and is becoming more popular on FTP sites and other distribution locations.

bunzip2 linux-2.4.0.tar.bz2

Decompress the compressed archive.

bzip2recover file.bz2

Recover data from a damaged archive.

bunzip2 -t file.bz2

Test if the file could be decompressed, but don't do it.

dmesg

The kernel logs important messages in a ring buffer. To view the contents of this buffer the developer can use the dmesg command.

dmesg

Dump the entire ring-buffer content to the terminal.

dmesg -c

Dump it to the terminal and then clear it.

dmesg -n level

Set the level at which the kernel will print a message to the console. Setting dmesg -n 1 will only allow panic messages through to the console, but all messages are logged via syslog.

find

find can be used to find a specific set of files in a directory hierarchy, and optionally execute a command on these files.

find .

List all files in the current directory and below.

find . -name "*.[ch]"

List all files in the directory and below that end in .c or .h (C sources and header files).

find . -type f -exec ls -l {} \;

Find all regular files and display a long listing (ls -l) of them.

find . -name "*.[ch]" -exec grep -le min_ipl {} \;

List all files in the directory hierarchy that contain the string "min_ipl" in them.

find /usr/src/ -type f -exec grep -lie MONOTONIC {} \;

List all files below /usr/src/ that contain the string MONOTONIC, using a case-insensitive search. (MONoToniC will also match.)

grep

grep is for searching strings using regular expressions. Regular expressions are comprised of characters, wildcards, and modifiers. Refer to the grep man page or a book on regular expression syntax for details.

grep -e STRING *.c

Display all lines in all files ending with .c that contain STRING.

grep -ie STRING *

Display all lines in all files of the local directory that contain STRING in upper or lower case. (e.g. StrInG)

grep -ie "void pthread" *.c

Find the string "void pthread" in any .c files. The quotation marks are required to enclose the blank in the string.

grep -e "char \*msg" *.c

Find the declaration of "char *msg" in the .c files of the local directory. The "*" must be escaped so that it is not interpreted as a wild card.


gunzip

The gunzip command will decompress .gz files. No options are needed for decompression. For compressing files use gzip.

gunzip FILE.gz

Decompress FILE.gz, which will rename it to FILE in the process.

gunzip -c FILE.gz

Decompress FILE.gz and send the decompressed output to standard output, and not to a file.

gzip FILE

Compress FILE, renaming it to FILE.gz, with the default compression speed.

gzip -9 FILE

Compress FILE with the best compression ratio. (This will be slow.)

init

Init is the master resource controller of a SysV-type Unix system. While testing an RTCore system it is advisable to do so in runlevel 1 (init 1), which is a single user mode without networking and with a minimum set of system resources.

init 1

Put the computer into runlevel 1. (No networking, single user mode)

init 2

After the tests in runlevel 1 have run successfully, bring the box back up to a multiuser networking system. This need not be runlevel 2 and will vary depending on which UNIX the user is running. Check /etc/inittab to see which runlevel is the system default runlevel. It should be safe to bring the system back up to the runlevel set as default.

init 6

Reboot the system.

init 0

Halt the system.

locate

Many but not all Linux systems have the locate database available, which caches all filenames on the system and makes it easier to locate a specific file.

locate irq.c

List all files on the system that have irq.c in their names. (alpha_irq.c and irq.c.rej will also match.)

locate rtlinux | more

If the search is too general, output will be more than a screen. By piping the output into the "more" program a paged listing is displayed.

make

GNU make is one of the primary tools of any development under modern UNIXes. Given a makefile, Makefile, or GNUmakefile, which are the default names make will look for, make will build a source tree, resolving dependencies based on the information and macros given in the makefile.

make -f GNUmakefile

This will run make with the named makefile. The -f option is needed when the file's name isn't one of the default names that GNU make will search for.

make -n

This will instruct make only to report what it would do, but will not actually process any source files.

make -k

Normally make will terminate on the first fatal error it encounters. With the -k flag make can be forced to continue. This makes sense if within a source tree multiple independent executables are to be built, and one wants to build the rest even if the first fails.

make -p -f /dev/null

Show the database settings that make will apply by default without actually compiling anything. This will list all implied rules and variable settings.

objdump

objdump allows the user to view symbol information in object files, such as kernel modules. It also allows the user to disassemble object files. This is helpful when trying to locate what could be causing system hangs with a module. The output is not very user friendly, but if short functions were used it should not be too hard to read. If long functions with many flow control statements were used, it can be close to unreadable.

tar

Archives ending in .tar (compressed tar files will end in .tar.gz, .tar.bz2 or .tgz) can be unpacked with tar. To make this operation safe, check what is in the archive and where it will be unpacked to first!

tar -tf rtlpro_cd.tar

List the files contained in the archive.

tar -tvf rtlpro_cd.tar

This command gives more details on the files than the above command.

tar -xvf rtlpro_cd.tar

Unpack the rtlpro_cd.tar archive in verbose mode. This will list every file as it is handled.

tar -cvf mycode.tar mycode

This will pack up the content of the directory "mycode" into the archive mycode.tar, naming every file as it is processed.

uname

To get the exact system name of the running kernel, use the uname command. A common problem is that one has the wrong kernel running and runs into "funny" problems this way, such as symbol problems on module load. Running uname should clear up any question of what kernel is active.

uname

Print the system type. (e.g. "Linux")

uname -m

Print the system hardware type. (e.g. "i586")

uname -r

Print the kernel release name of the running system. (e.g. 2.4.16-rtl)

uname -a

Print the full system string, dumping all known information about the running kernel.

Appendix F

Things to Consider

There are limits to RTCore introduced by underlying hardware that in principle cannot be bypassed in software. These limits need to be considered during a project's planning stage or at the latest, when selecting a hardware platform. Example code provided in the RTLinuxPro package will perform some basic tests on the system in order to judge its appropriateness for real-time work, but here are some common stumbling blocks that developers run into.

F.0.1 System Management Interrupts (SMIs)

Essentially all Pentium-class systems have the capability to use SMIs, but it has only rarely been done. Some platforms, though, make heavy usage of SMIs to control peripheral devices like sound cards or VGA controllers. SMIs are interrupts that cannot be intercepted from software. Consequently, RTCore will be prevented from operating correctly during SMI execution. Preventing SMIs from controlling hardware is generally not a problem: Simply select peripheral devices that do not require SMIs. This is a simple choice for almost all ISA/PCI/AGP cards, although it is not necessarily true for onboard controllers. In rare cases, SMIs have been "used" to correct design bugs in the hardware, so make sure to keep away from such hardware when selecting components for a real-time system. Check with the vendor for details.


F.0.2 Drivers that have hard coded cli/sti

There are drivers available for Linux which may have hard coded cli/sti (clear interrupt flag/set interrupt flag) instructions that will cause problems in conjunction with RTCore. To make sure a driver is not using cli/sti, use the objdump command to check for cli instructions. Good candidates for such hard-coded cli/sti instructions are binary released drivers for Linux. Vendors of such drivers most likely did not take real-time requirements into account when designing their drivers. It is very important to perform this check on binary drivers. If the programmer does not see delays during normal execution, it is not safe to assume that they are not there, as that code path may not have been triggered yet.

F.0.3 Power management (APM)

Most laptops and some desktop PCs now have power management hardware included, which optimizes power consumption by reducing system clock frequency, memory timings and bus frequencies (and probably other things as well). This has clear implications for real-time systems; if timers change their behavior during operations, consequences are at best hard to predict. In general, a system that is using power management will not be very good for real-time operations, unless these effects have been explicitly addressed by drivers and the core real-time system. If this is not the case, power management must be disabled.

F.0.4 Spurious IRQ7

This hardware-induced latency can occur because of PCB quality and possible interference on the board's electrical lines. A spurious IRQ arises when the interrupt controller recognizes an interrupt line being asserted, but the line is de-asserted before the processor has received the interrupt vector via an interrupt acknowledge and data read cycle. At that point the interrupt controller notifies the processor that IRQ7 is asserted. This whole process can introduce huge latencies in dispatching the timer interrupt, and subsequent delays in processing scheduler deadlines. If possible, enable the local APIC as a workaround.


F.0.5 Hardware platforms

RTCore is dependent on certain hardware behavior for successful operation. This might be most obvious for peripheral devices like data acquisition boards or stepper motor controller boards, but "standard" hardware dependencies are often overlooked.

Depending on application demands, hardware platform selection can make or break the project. It is important to find a platform that can provide the performance and accuracy needed for the application. With RTLinuxPro, a targeted evaluation is recommended to ensure that the machine can provide appropriate accuracy, followed by a strict analysis of program demands to see if the specifications of both hardware and software can be met.

There may be a lot of flexibility here, depending on need. For a very high performance application, the range of possible architectures may be limited to a small handful of target systems. But for others, such as a few low frequency sampling threads, a much slower system will likely be cheaper and still provide ample resources.

A prime example of this is the Geode processor. While it is x86-compatible, many operations are virtualized on the chip, meaning that performance may degrade during certain time windows. (Video and audio are two known problem areas.) While the chip goes into System Management Mode (SMM) to handle this activity, hardware-induced jitter may spike as high as 5 milliseconds. For many applications, this is the kiss of death, but others may be fine with this level of jitter. For these lower bandwidth applications, the Geode is a cheap x86-compatible solution for the field and the jitter is within specification.

It is because of these situations that FSMLabs recommends evaluation and testing with the RTLinuxPro test suite, followed by hard analysis of application demands. If the target hardware will suit the application, it may not matter if there is potentially 5 millisecond jitter. The important part is that requirements are built and understood so that the proper hardware and software configuration can be selected.
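The analysis step described above amounts to comparing a measured worst case against the application's budget. The following is a minimal sketch of that comparison, assuming the wakeup times of a periodic thread have already been recorded in nanoseconds; the function names and the array layout are assumptions made for illustration and are not part of the RTLinuxPro test suite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Worst-case deviation, in nanoseconds, between each recorded wakeup time
 * and its ideal release time start_ns + i * period_ns.
 * (Hypothetical helper; the test suite may measure jitter differently.) */
int64_t worst_case_jitter_ns(const int64_t *wakeup_ns, size_t n,
                             int64_t start_ns, int64_t period_ns)
{
    int64_t worst = 0;
    size_t i;

    assert(wakeup_ns != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int64_t ideal = start_ns + (int64_t)i * period_ns;
        int64_t dev = wakeup_ns[i] - ideal;

        if (dev < 0)
            dev = -dev;
        if (dev > worst)
            worst = dev;
    }
    return worst;
}

/* Nonzero if the measured worst case fits within the application's budget. */
int jitter_within_budget(int64_t worst_ns, int64_t budget_ns)
{
    return worst_ns <= budget_ns;
}
```

With a 1 millisecond period, a single wakeup that arrives 5 milliseconds late dominates the result, which is the Geode situation described above: acceptable for a 10 millisecond budget, fatal for a 1 millisecond one.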

F.0.6 Floppy drives

Typical PCs include a floppy drive. For historic reasons, the floppy drive is able to change the bus speeds and floppy drivers do CMOS calls to select the floppy type. The consequence for RTCore is that scheduling jitter can substantially increase if the floppy is accessed. The simplest solution is not to have a floppy drive on a real-time system. If a floppy drive is absolutely necessary, these effects must be taken into account. That is, the programmer must test the real-time threads while accessing the floppy drive, to ensure that it is not disturbing real-time operation in an unacceptable manner.

F.0.7 ISA devices

In a PC-based system, compatibility with older hardware is available only at a relatively high performance penalty. A typical example of this is the PCI-ISA bridge that can be the dominating cause of worst-case system jitter in a system. When making the decision of which hardware to select for a real-time system, careful consideration should be made concerning the ISA bus. If a system can be designed without an ISA bus, it is the preferable choice.

RTCore will not be able to compensate for slow hardware in all cases: If the bus is controlled by an ISA device, RTCore will have to wait. When an ISA DMA request occurs, everything is clocked down to the speed of the ISA bus and waits until the transfer finishes. Thankfully, the ISA bus is being removed entirely from many modern designs, so unless the user has specific hardware that is ISA-only, the entire issue can be avoided.

F.0.8 DAQ cards

Data acquisition is one of the more common tasks where one would use an RTCore-based PC system. When designing such a system, it is important to carefully consider which data acquisition peripherals should be used. Depending on project demands, there are a variety of cards offering varying levels of capabilities.

Depending on the included hardware, some cards will sample data autonomously, buffering into their own internal storage, and will only notify the system when a large amount of data has been collected. For acquisition rates that outpace the timing capabilities of the host machine, this can be beneficial, but it usually comes with some kind of cost.

On the other hand, some cards operate in a polling manner, allowing the developer to set up the sampling rate purely from real-time code. This has advantages in that the developer can use simpler hardware without internal buffering, but it requires the host machine to be capable of performing the requested sampling rate. For most applications, the best choice is somewhere in the middle, allowing some work to be done on the board, and some in RTCore, without raising costs too much.
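Whether a host is "capable of performing the requested sampling rate" can be estimated with simple arithmetic before any card is purchased. The sketch below assumes a deliberately simple model: polling is considered feasible when the worst-case scheduling jitter plus the time spent handling one sample fits inside one sampling period. Both inputs have to be measured on the target hardware; the function names and the model itself are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Sampling period in nanoseconds for a rate given in samples per second. */
int64_t sample_period_ns(int64_t rate_hz)
{
    assert(rate_hz > 0);
    return INT64_C(1000000000) / rate_hz;
}

/* Nonzero if one sample's handling time plus the worst-case scheduling
 * jitter still fits inside a single sampling period.
 * (Assumed feasibility model, not a rule stated by the kit.) */
int polling_is_feasible(int64_t rate_hz, int64_t worst_jitter_ns,
                        int64_t per_sample_cost_ns)
{
    return worst_jitter_ns + per_sample_cost_ns < sample_period_ns(rate_hz);
}
```

Under this model a host with 50 microseconds of worst-case jitter can poll comfortably at 1 kHz but not at 100 kHz, where a card with on-board buffering would be the better fit.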

Appendix G

RTCore Drivers

This appendix covers specific details on drivers provided with RTCore.

G.1 Digital IO Device Common API

Digital IO devices advertise their services through files that can be operated on with open, read, write, ioctl and close.

IO devices must first be opened with a call to open using the appropriate flags. The read/write mode of the returned file descriptor only affects write and read calls. ioctl calls completely ignore the read/write status of the file descriptor and allow reading or writing as requested. Once operations on the device are complete a normal call to close is necessary.

Most operations are done through ioctl. The ioctl calls that operate on devices are listed below.

Setting or clearing a single bit is done with:

int fd;

/* set bit #3 */
ioctl( fd, RTL_SETBIT, 3 );

/* clear bit #10 */
ioctl( fd, RTL_CLEARBIT, 10 );
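The fragment above leaves out opening and closing the device. The following is a minimal sketch of the complete open, ioctl, close sequence with error checking. The request code is taken as a parameter so that the sketch builds without the RTCore headers; on an RTCore system it would be RTL_SETBIT or RTL_CLEARBIT. The function name dio_bit_op(), the choice of O_RDWR, and any device path passed to it are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open a digital IO device, issue one single-bit ioctl, and close it again.
 * 'request' is the driver's request code (RTL_SETBIT or RTL_CLEARBIT on an
 * RTCore system). Returns 0 on success, -1 on any failure.
 * (Hypothetical helper; O_RDWR is an assumed choice of open flags.) */
int dio_bit_op(const char *device, unsigned long request, int bit)
{
    int fd, rc;

    assert(device != NULL);
    fd = open(device, O_RDWR);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, request, bit);
    if (close(fd) < 0)
        return -1;
    return rc < 0 ? -1 : 0;
}
```

Because ioctl calls ignore the read/write mode of the descriptor, the open flags matter here only if the same descriptor is later used with read or write.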

One can clear specific bits and set other bits atomically. That is, clear some and set some in a single operation.

int fd;
unsigned long mask[2];

/* clear bits #4 and #8 */
mask[0] = (1<<4) | (1<<8);

/* set bit #2 */
mask[1] = 1<<2;

ioctl( fd, RTL_CLEARSETBITMASK, &mask );

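The effect of a combined clear-and-set can be reasoned about without hardware. The sketch below models it on a plain unsigned long standing in for the device's output register. It assumes that the bits named in the clear mask are forced low, the bits named in the set mask are forced high, and every other bit is left alone; the function name apply_clear_set() and this exact semantics are assumptions made for illustration, not taken from the driver source.

```c
#include <assert.h>

/* Model of a combined clear-and-set: bits in clear_mask are forced low,
 * then bits in set_mask are forced high; all other bits keep their value.
 * (Assumed semantics; 'state' stands in for the device's output register.) */
unsigned long apply_clear_set(unsigned long state,
                              unsigned long clear_mask,
                              unsigned long set_mask)
{
    /* Assumed for this sketch: no bit is asked to be both cleared and set. */
    assert((clear_mask & set_mask) == 0);
    return (state & ~clear_mask) | set_mask;
}
```

Starting from a register with bits 0, 4 and 8 high, clearing bits 4 and 8 while setting bit 2 leaves bits 0 and 2 high, and no intermediate state is ever visible on the outputs.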

Sometimes one wishes to write a specific set of bits without changing the input or output state of any other bits. This is done by passing a mask that selects the bits to write, as shown below.

int fd;
unsigned long mask[2];

/* wish to write bits 4 and 8 */
mask[0] = (1<<4) | (1<<8);

/* set bit 4 to a 0 (low) and bit 8 to a 1 (high) */
mask[1] = (1<<8);

ioctl( fd, RTL_WRITEBITMASK, &mask );

The code below shows reading from certain bits without changing any output of other bits.

int fd;
unsigned long mask[2];

/* wish to read bits 4 and 8 */
mask[0] = (1<<4) | (1<<8);

ioctl( fd, RTL_READBITMASK, &mask );

/* mask[1] contains the value of bits 4 and 8 */
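The masked write and masked read operations can be modeled in the same way on an ordinary integer. The sketch below assumes that the first mask word selects which bits take part and that the second word carries their values; the function names and this interpretation are assumptions for illustration, and the real driver acts on hardware registers rather than on a variable.

```c
/* Masked write: only bits selected by 'select' take their value from
 * 'value'; every other bit of 'state' is left as it was.
 * (Assumed semantics for a write with a select mask.) */
unsigned long apply_write_mask(unsigned long state,
                               unsigned long select,
                               unsigned long value)
{
    return (state & ~select) | (value & select);
}

/* Masked read: report only the selected bits, zeroing the rest. */
unsigned long apply_read_mask(unsigned long state, unsigned long select)
{
    return state & select;
}
```

Writing bit 4 low and bit 8 high to a register that has bits 0 and 4 high gives a register with bits 0 and 8 high; reading bits 4 and 8 back then reports only bit 8.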


G.2 PPS driver

The pulse-per-second (PPS) driver included with RTCore creates a clock (CLOCK_PPS) that is synchronized with an external time source.

G.2.1 Input

The PPS driver takes as input a signal from a digital IO device that transitions from a low to a high state once per second, on the boundary of every second.

A second input signal can be used for cross-checking. For example, the driver is currently set up so that a #define change in the source code allows the primary PPS signal (for example, a rubidium clock) and a secondary PPS signal (for example, a GPS) to be checked against one another. If the two differ by more than 20 microseconds, it is reported.
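The cross-check reduces to comparing the arrival times of the two pulses against the 20 microsecond limit. The following is a minimal sketch of that comparison, assuming both arrival times are available as nanosecond timestamps taken from the same clock; the names used here are invented for illustration and do not come from the driver source.

```c
#include <stdint.h>

#define PPS_MAX_SKEW_NS INT64_C(20000) /* 20 microseconds, as stated above */

/* Nonzero if the two pulse arrival times disagree by more than the limit.
 * (Hypothetical helper; the driver's own check may be structured differently.) */
int pps_sources_disagree(int64_t primary_ns, int64_t secondary_ns)
{
    int64_t skew = primary_ns - secondary_ns;

    if (skew < 0)
        skew = -skew;
    return skew > PPS_MAX_SKEW_NS;
}
```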

G.2.2 Timing and how it works

The PPS driver re-calculates how many processor timer ticks elapse between each PPS signal. This gives a calibration of how "off" the on-chip timer is from the PPS source. The PPS driver then adjusts its estimate of how long each timer tick takes. It then hides all this by presenting the user with a CLOCK_PPS abstraction. When an application requests an operation on CLOCK_PPS the calculation of "corrected" time takes place transparently. For example, when making a call to read the current time with:


clock_gettime( CLOCK_PPS, &next );

The call returns the nearest estimated time that is synchronized with the PPS clock. At each PPS pulse the driver is able to recalibrate by comparing its current estimate of time with the moment the PPS transition occurs. This deviation from the PPS time is then removed by slowing or speeding the estimate of actual time during the next second by no more than 1/4 second per second and no less than 1/4 of the difference between the actual PPS transition and the estimated transition.

In practice the estimated clock differs by no more than 6 microseconds with typical hardware and a stable PPS source (rubidium clock or GPS).
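The tick calibration described in this section can be illustrated with a little integer arithmetic. This is a sketch only and does not reproduce the driver's code: the function name is hypothetical, and it assumes the driver has counted how many timer ticks elapsed between the last two PPS pulses.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: convert the ticks elapsed since the last PPS pulse
 * into nanoseconds, using the tick count measured over the previous
 * one-second PPS interval as the calibration.
 */
static uint64_t ticks_to_ns(uint64_t elapsed_ticks, uint64_t ticks_per_pps)
{
    assert(ticks_per_pps > 0);
    /* Safe from overflow while elapsed_ticks stays below roughly 1.8e10. */
    return elapsed_ticks * NSEC_PER_SEC / ticks_per_pps;
}
```

If the timer nominally runs at 2,000,000 ticks per second but 2,000,200 ticks were counted between pulses, then 1,000,100 elapsed ticks correspond to exactly half a second of PPS time rather than slightly more.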


G.2.3 Using the driver

The PPS driver can be used anywhere by replacing CLOCK_REALTIME with CLOCK_PPS.

Starting the driver

To start the driver, run it as any other application. The compiled-in default IO device to poll for the PPS signal can be overridden with a command-line argument. For example: pps.rtl /dev/dio1024_0.

Applications

It is often the case that an application will wish to set the absolute time before using the PPS driver, since the PPS driver only keeps CLOCK_PPS in sync with the PPS but is not initialized with any particular absolute time. Initially, CLOCK_PPS starts with the value returned by CLOCK_REALTIME. If an external time source is available with the PPS (a GPS date/time string, for example), an application can parse it and then set the absolute CLOCK_PPS time with:

struct timespec ts;

/* ...set ts from some source... */

clock_settime( CLOCK_PPS, &ts );

At the next PPS signal the PPS driver will set CLOCK_PPS to the time passed in by ts. Once the call to clock_settime() has been made, the PPS driver will report CLOCK_PPS as being invalid until that operation completes.

To check the state of CLOCK_PPS and wait until it is valid and ready for use (the set date operation has completed, the PPS signal has been acquired or setup is complete):

while ( !rb_pps_is_valid ) {

printf("Rb clock is not valid yet, waiting 10 milliseconds.\n");

usleep(10*1000);

}

The same can be done with gps_pps_is_valid to check the secondary PPS signal source.


Apart from that, one can use CLOCK_PPS just as one would use CLOCK_REALTIME. In order to set up a period of 625 microseconds that is aligned with an even PPS boundary one can use:

struct timespec next, period = {0, 625000};

/* get the current time and setup so the first wakeup is on a PPS transition */

clock_gettime( CLOCK_PPS, &next );

next.tv_nsec = 0;

next.tv_sec += 1;

while ( 1 ) {

/* sleep */

clock_nanosleep( CLOCK_PPS, TIMER_ABSTIME,

&next, NULL);

if ( !rb_pps_is_valid ) {

printf("Lost Rb pulse, shutting down.\n");

return NULL;

}

/* setup for the next cycle */

timespec_add( &next, &period );

}
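The loop above depends on adding the period to the absolute wakeup time without letting the nanosecond field overflow. The sketch below shows one way such an addition can be written. The name sketch_ts_add is hypothetical and it returns its result by value, so it is not the timespec_add() used above; it also assumes both inputs are already normalized.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: add two normalized timespec values. */
static struct timespec sketch_ts_add(struct timespec a, struct timespec b)
{
    assert(a.tv_nsec >= 0 && a.tv_nsec < NSEC_PER_SEC);
    assert(b.tv_nsec >= 0 && b.tv_nsec < NSEC_PER_SEC);

    a.tv_sec += b.tv_sec;
    a.tv_nsec += b.tv_nsec;

    /* The sum of two normalized fields needs at most one carry. */
    if (a.tv_nsec >= NSEC_PER_SEC) {
        a.tv_nsec -= NSEC_PER_SEC;
        a.tv_sec += 1;
    }
    return a;
}
```

Because 1600 periods of 625 microseconds make exactly one second, a wakeup time that starts on a PPS boundary lands on a boundary again every 1600 cycles.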

G.2.4 Caveats

Jitter value

The PPS driver polls the digital IO device that the PPS signal comes in on. Since missing a pulse would result in inaccurate time calculation, the thread that polls the device runs at the highest priority and does the polling with interrupts disabled. This means no other thread can run while the driver is waiting for the PPS transition. This also means that the time spent polling must be kept to a minimum to allow other threads to run normally.

The driver source includes an estimated jitter value that can be adjusted. The polling thread will estimate when the next PPS signal will arrive and schedule itself to wake up and start polling at 1/2 this jitter value before it expects the PPS. This ensures that scheduling jitter will not prevent the thread from catching the PPS signal. The polling thread will wait for the full estimated jitter time for the PPS signal. If it does not see the signal in that time it will either abort time synchronization or estimate when it expected the PPS and continue on, hoping to catch the next PPS signal (depending on which is configured in the source of the driver).

This jitter value is critical to the performance of the system and varies greatly with different hardware. It is suggested that the end-user adjust this to reflect their own system.
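The polling window described above amounts to two small calculations. The sketch below is illustrative only. The function names are hypothetical, and times are assumed to be held as signed 64-bit nanosecond counts rather than in whatever form the driver actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Polling begins half the jitter estimate before the expected pulse. */
static int64_t poll_start_ns(int64_t expected_pps_ns, int64_t jitter_ns)
{
    assert(jitter_ns >= 0);
    return expected_pps_ns - jitter_ns / 2;
}

/* Polling gives up one full jitter estimate after it began. */
static int64_t poll_deadline_ns(int64_t expected_pps_ns, int64_t jitter_ns)
{
    return poll_start_ns(expected_pps_ns, jitter_ns) + jitter_ns;
}
```

With a jitter estimate of 40 microseconds, interrupts stay disabled for at most 40 microseconds around each expected pulse, which is why a needlessly large estimate directly costs other threads run time.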

SMP systems

The PPS driver works just as well on multi-processor systems as it does on single-processor systems as long as the time on the processors is synchronized. This is the case with most modern systems.

The polling thread (mentioned above) only needs to run on a single CPU and can synchronize time for all processors, assuming that every processor timer runs at the same rate. Even minor differences in timer speed among processors can cause CLOCK_PPS estimates to become inaccurate for some processors over a long time. It is suggested that end-users make sure that the processor clocks run at the same rate, are phase-locked, or that some other measure is taken to make sure that the time remains in sync.

G.3 RTLinux PCI Driver

RTCore supports a driver that can dump information about the PCI devices present on the system through a POSIX interface. This includes the normal I/O functions open(), close(), lseek(), ioctl(), munmap(), and mmap().

Applications that use the PCI driver must include at least the following headers:

#include <unistd.h>

#include "rtl_pci_ioctl.h"

Once the user has an open file descriptor for one of the PCI buses present on the system, lseek() can be used to select one of the PCI devices attached to this PCI bus. Configuration space can be read using an ioctl() call, passing a buffer to it. The following is example setup code for a user:

struct rtl_pci_config config;

int fd, ret, pci_dev = -1;

// open PCI bus0’s device file.

if((fd = open("/dev/pci_0", O_RDONLY)) < 0) {

perror(" open(/dev/pci_0): ");

return -1;

}

// select last device attached to PCI bus0.

if((ret = lseek(fd, pci_dev, SEEK_SET)) < 0) {

perror(" lseek(pci_dev): ");

close(fd);

return -1;

}

pci_dev = ret;

// read config space.

if((ioctl(fd, RTL_PCI_READ_CONFIG, &config)) < 0) {

perror(" ioctl(): ");

close(fd);

return -1;

}

The configuration space information is kept in struct rtl_pci_config, which looks like:

struct rtl_pci_config {

int vendor_id, rev_id, dev_id;

int class_dev, hdr_type;

int irq, pin, devfunc, slot;

unsigned long bar[PCI_NUM_BAR_RESOURCES];

unsigned long bar_size[PCI_NUM_BAR_RESOURCES];

};


Each PCI device has by default six resources, described by the Base Address Registers (BARs). These resources can be mapped into a user application through mmap() and unmapped using munmap(). When the user is done with the PCI bus device file, the file should be closed with a normal close() call.

For example code, please refer to the drivers/examples/pci directory provided with RTCore. This contains three examples: “simple_example.c”, “all_info.c”, and “pcidump.c”. “simple_example” opens PCI bus0’s device file, if it exists, selects the last PCI device attached to this bus, prints info from the device’s configuration space, and tries to map and unmap the resource described by BAR4 of the device. “all_info” opens the PCI bus device node passed to it as a command line argument and iterates over all the PCI devices attached to this bus, dumping info from the configuration space of each device. “pcidump” iterates over all the PCI buses present on the system, dumping info about each and every PCI device attached to the buses.

G.4 IEEE-1284 – Parallel Port Digital IO Driver

This driver supports most parallel port devices for digital IO only. It does not support parallel communication. The driver follows the standard API conventions for digital IO in RTCore. Please see the section that describes this for details.

G.5 VME driver

Provided with RTCore is a VME driver for the Tundra Universe II PCI/VME bridge and the SBS Bit3 PCI/VME bridges. This provides hard real-time support for VME activity.

This driver has been tested on the Universe, Universe II and the SBS Bit3 704 boards. Others should work but have not been fully tested.

As of this release the driver supports:

• VME interrupts

• Access to A16, A24, and A32 address spaces, using the access methods defined by the VME specification - D8, D16, D32 and D64.

• Master and slave configurations for the above


• Supervisor and user mode accesses

• DMA transfers

• BLT transfers and, for D64 operations, MBLT transfers

Examples are provided in drivers/examples/vme. These demonstrate how to handle VME interrupts and set up master and slave windows to the various address spaces, including how to perform DMA transfers in the A32 address space.

The driver’s operation is very simple - all access is done with the device /dev/vme_0 with open(), close(), mmap(), munmap(), and ioctl() calls. Users open the device with open() in order to get a file descriptor that can be used to set up interrupt handlers (via ioctl()) or get access to an address space (via mmap()).

If users require multiple windows onto the VME address space, such as one A16 window and one A32 window, separate open() calls are required. Each window is accessed via a separate mmap() call. Please see drivers/examples/vme/ for specific examples of how to use the VME interface.

To open the VME device:

int fd;

fd = open("/dev/vme_0", O_RDWR);

G.5.1 Interrupts

To register an interrupt handler for VME interrupts, once the device has been opened:

void int_handler(int level)

{

}

ioctl(fd, RTL_VME_REG_INT, (unsigned long)int_handler);

To generate the VME interrupt whose number is held in the variable num:


int num = 1;

ioctl( fd, RTL_VME_TRIGGER_INT, num );

VME supports interrupt numbers 1 through 7. An attempt to send an interrupt outside of this range will cause the above ioctl() call to return an error.

When sending or generating an interrupt, the interrupt will be sent to the entire VME bus. This means that if there is a VME interrupt handler registered on the board that generates the interrupt, it will also receive the interrupt that it just sent. This is important to note since user handlers often have to be constructed such that they discard interrupts that have been sent by the local system to avoid confusion.
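One possible way to discard locally generated interrupts is to record each level just before it is sent and have the handler drop the first matching interrupt it sees. The sketch below shows only that bookkeeping. All names are hypothetical and none of it is part of the driver API. It also assumes the locally sent interrupt is the next one to arrive at that level, and a real application would need to protect the shared state against concurrent access from the handler and the sending thread.

```c
#include <assert.h>

/* Bit n set: this board sent level n and has not yet seen it come back. */
static unsigned int self_sent;

/* VME interrupt levels run from 1 through 7. */
static int vme_level_valid(int level)
{
    return level >= 1 && level <= 7;
}

/* Call just before triggering an interrupt; returns -1 for a bad level. */
static int note_local_send(int level)
{
    if (!vme_level_valid(level))
        return -1;
    self_sent |= 1u << level;
    return 0;
}

/* Call from the handler; returns 1 if the interrupt should be processed. */
static int should_handle(int level)
{
    if (!vme_level_valid(level))
        return 0;
    if (self_sent & (1u << level)) {
        self_sent &= ~(1u << level);   /* our own echo: discard it once */
        return 0;
    }
    return 1;
}
```

On Bit3 cards the handler receives a vector rather than a level, so this particular bookkeeping would not apply there without changes.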

G.5.2 Slave memory regions

One must allocate local host memory before advertising it on the VME bus. This can be done through rtl_gpos_malloc(), or a shared memory region can be used that can then be shared with user processes or other RTCore threads. Below is an example with shared memory:

int shm_fd;

void *raddr;

unsigned long size = 4<<10;

shm_fd = shm_open("/dev/vme_super_shm", O_CREAT | O_DMA, 0777);

ftruncate(shm_fd, size);

raddr = mmap( 0, size, PROT_READ|PROT_WRITE, MAP_SHARED, shm_fd, 0 );

Once the memory is allocated, it can be advertised as a slave region on the VME bus. The example below shows the memory being advertised on the VME bus in the A24 space allowing 8-bit (D8) access. It is possible to use any combination of A32, A24, A16 with D64, D32, D16, D8 and in supervisor/user mode.

char *vme_ptr;

vme_ptr = mmap( (void *)raddr, size, 0,

MMAP_VME_A24|MMAP_VME_D8|MMAP_VME_SUPER|MMAP_VME_SLAVE|MMAP_VME_DATA,

fd, vme_addr );


Any VME device on the bus may access this memory at the address represented by the variable vme_addr. However, the local host cannot access this memory through the VME bus pointer. It must be accessed as local memory (variable raddr in the example above).

G.5.3 Master memory regions

The code below shows how to get a pointer to a master region of VME memory. The VME bus address is represented by the variable vme_addr and the size of the window by size. The flags passed into mmap() can be chosen to allocate any combination of 64/32/16/8 bit access and supervisor/user.

char *vme_ptr;

vme_ptr = mmap( 0, size, 0,

RTL_MMAP_VME_A32|RTL_MMAP_VME_D32|RTL_MMAP_VME_SUPER|RTL_MMAP_VME_DATA,

fd, vme_addr );

It is important to note that only 1 mmap() call for a master region can be made for each open() call. That is, only 1 master mmap() region may be created for each open file descriptor. If the programmer wishes to access multiple master VME regions then they must open() the VME device multiple times.

There are further limitations on A32 memory. The underlying hardware only provides a limited number of windows into the entire A32 address space. For the Universe and Universe II chips there are 4 such windows. So, one can only open() and then mmap() a maximum of 4 master windows and 4 slave windows.

G.5.4 DMA transfers

DMA transfer into local memory from remote VME memory and to a remote VME address from local memory is supported. The example below shows a transfer from local memory into a memory address on the VME bus. The variable buffer points to a local region of memory that is DMA transferable.

#include <vme.h>

struct vme_dma_desc_s desc;


char *buffer;

unsigned long vme_addr;

desc.vme_addr = (void *)vme_addr;

desc.local_addr = (void *)buffer;

desc.count = size;

desc.flags = RTL_MMAP_VME_A32|RTL_MMAP_VME_D32;

ioctl( fd, RTL_VME_DMA_TOVME, &desc );

The following example shows a transfer from a remote region of memory to a local buffer.

struct vme_dma_desc_s {

void *vme_addr, *local_addr;

unsigned long count;

} desc;

char *buffer;

unsigned long vme_addr;

desc.vme_addr = (void *)vme_addr;

desc.local_addr = (void *)buffer;

desc.count = size;

ioctl( fd, RTL_VME_DMA_FROMVME, &desc );

G.5.5 Performance

Every effort has been made in this driver to take advantage of the hardware to provide the fastest and lowest latency transfers and interrupts. However, optimizing for every case and configuration in a general driver is difficult. Small changes in the driver or the user’s application may cause performance differences.

If you have any questions about optimizing for your application or if you see performance problems (or less than what you would expect) please contact us via email at [email protected].

For the fastest transfers possible one should try to use DMA mode wherever possible. In addition, it is better to use D64 when the remote device supports it. If that is not possible, the largest data operation size is preferable to smaller ones since this can make a huge performance difference.


G.5.6 Chip specific notes

Bit3

The SBS Bit3 version does not yet support DMA. Not all Bit3 chips support DMA, so please contact FSMLabs, since adding support to the driver for those chips that do have DMA should not be difficult.

The Bit3 driver does not yet support releasing or munmap() of previously mapped memory. The resources on the chip will not be freed, which will eventually result in fragmentation. Reloading the driver will fix this.

The Bit3 cards require a great deal of configuration and specific jumper settings based on how the software wishes to use the device. Not all features or memory operations can be supported in software without the correct jumper settings to allow these operations. Please check the manual for the specific card being used. Specifically:

• Advertising a slave window on the VME bus is done through software configuration and jumper configuration. The low-order bits set by software are used to locate the window within a 16M range and the remaining bits are used from jumper settings.

• Advertising a slave window in A16, A24 or A32 space in software may not do what is expected. The software-configured values of which address space to use are overridden by the jumper settings on the card. A32 and A24 (or both) are supported by most Bit3 cards but A16 is not.

• The jumper settings configure whether the card is system controller or not. These settings must be correct since the card will not auto-configure.

• Sending or requesting VME interrupt levels may not work as expected since each interrupt must be specifically configured for send/receive by jumpers on the card. Software cannot override these values.

• Due to the configuration of the Bit3 card it is not possible to send and receive the same VME interrupt level. Each interrupt must be configured on the card for receive, send or neither.


• Sending VME interrupt levels 3-7 does not require special jumper settings but VME interrupt levels 1 and 2 must be configured properly by the jumpers on the card.

• When VME interrupt handlers are called they will not be passed a valid VME interrupt level. The Bit3 card does not provide this value to software. Instead, the Bit3 cards provide a “vector” that identifies which board generated the interrupt. This is the value that is passed to the interrupt handler.

• Some Bit3 cards have a bug that does not permit sending interrupts to the VME bus. The 810 is one of these cards. The 6xx series cards do not show this problem. There is a workaround that allows the developer to send some interrupts (IRQ1 and IRQ2) that has not been implemented as yet. If this feature is needed, please contact FSMLabs.

Universe

The Universe II chips provide 8 A32 windows but the Universe I only supports 4 windows. The driver provides access to only 4 for compatibility. If all 8 are needed, please contact FSMLabs.

G.6 Serial driver

RTCore supports a driver that controls real-time serial hardware through a POSIX interface. This includes tcgetattr(), tcsetattr(), and the normal I/O functions open(), close(), read(), and write().

Applications that use the serial driver must include at least the following headers:

#include <time.h>

#include <unistd.h>

This provides knowledge about struct termios, which is needed to set up the serial port. Once the user has an open file descriptor for the serial device, various flags should be set to configure the port. The following is example setup code for a user:


struct rtl_termios term;

int fd;

fd = open("/dev/ttyS0", O_NONBLOCK);

if (fd < 0) {

printf("Unable to open serial device\n");

return -1;

}

tcgetattr (fd, &term);

term.c_cflag = (CS8 | CREAD | CLOCAL);

term.c_use_fifo = 1;

term.c_fifo_depth = 4;

cfsetospeed (&term, B38400);

tcsetattr (fd, TCSANOW, &term);

First, the user opens the device with open() to get a valid file descriptor. With this, they can now call tcgetattr() to get a filled-out termios structure with details on the port configuration.

Using this, the user must then set the correct settings for their hardware. Most settings can be left as above. Most users can leave the hardware FIFOs enabled at the given depth. Any speed supported by the hardware can be specified, and the specific settings can be found in the termios.h header.

Speed values can also be set explicitly with cfsetospeed(), which takes a struct termios pointer and a speed value, such as B115200.

Once the settings are configured, they are applied to the port with tcsetattr(). This enables/disables any settings specified and prepares the port for work. Now normal read()/write() calls can be used to read and write data to and from the port. Calls to read() will return any data already buffered by the driver, and calls to write() will write what is possible at the moment and buffer the rest so the calling thread can continue on with other work. The size of the write buffer can be modified if needed with RTL_SERIAL_BUFSIZE.

When the user is done with the device, the file should be closed with a normal close() call.

Note that the serial driver is intended to be as simple as possible. It does cover a wide range of hardware on multiple architectures, but may require minimal changes to work on certain hardware. As serial hardware rarely varies, modifications are rarely needed.


For example code, please refer to the drivers/serial directory provided with RTCore. This contains an example that will route data from point to point between two real-time threads, using FIFOs to interact with non-real-time data providers and receivers on both ends.

G.7 Video Framebuffer Driver

This section describes the video framebuffer driver that allows real-time video display.

The framebuffer driver exports an interface through the device files /dev/fb0, /dev/fb1 and so on.

G.7.1 Calling Contexts

The /dev/fb* devices allow access to the framebuffer device inside of RTCore threads and GPOS routines (inside of main() functions in RTCore applications). Operations in interrupt handlers are not allowed due to the normal limits of ioctl(), open() and close().

These devices are only accessible from kernel space as the device only exists in RTCore, not in the standard Linux environment. Linux applications that operate on /dev/fb* will instead be acting on the Linux framebuffer device.

G.7.2 Operations on the Framebuffer

Applications may open /dev/fb* devices and then pass that file descriptor to any of the RTCore graphics functions.

The Linux kernel must be configured with a working framebuffer device for this interface to work. The kernel distributed by FSMLabs already includes support for this.

Framebuffer devices support these ioctl() calls:

• RTL_FB_GET_XRES - returns the X resolution

• RTL_FB_GET_YRES - returns the Y resolution

• RTL_FB_GET_VIRTXRES - returns the virtual X resolution

• RTL_FB_GET_VIRTYRES - returns the virtual Y resolution


• RTL_FB_GET_SCREEN_BASE - takes a pointer to an unsigned long and assigns to it the base of the framebuffer, to allow direct modification by applications

• RTL_FB_GET_SCREEN_BASE_SIZE - returns the size (in bytes) of the framebuffer

• RTL_FB_GET_SCREEN_OFFSET - returns the X & Y offsets of the screen from the base of the screen

• RTL_FB_SET_SCREEN_OFFSET - takes a pointer to 2 unsigned longs. These represent the X and Y offsets to set. Used to pan the display or re-orient it.

See the man page for rtl_put_pixel for further operations on framebuffer devices.

G.7.3 Examples

A few examples are provided for the user to experiment with the framebuffer driver. Before trying to run these examples, make sure that RTCore and fb.rtl are loaded.

• Sweep: This example must be invoked from the console. To run it, invoke sweep.rtl without any parameters. Sweep uses the framebuffer driver to draw a circle on the console.

• Window: This example is available in the drivers/example/fb directory. It works in tandem with a user application, which is available in the drivers/example/fb/user directory. This is a graphical user interface application and must be run from the X window system. Follow these simple steps:

load rtcore

cd drivers/fb

./fb.rtl

cd ../example/fb

./window.rtl

cd user

./user_app.rtl


user_app creates a window and window draws circles in this window. You can move the window anywhere on the screen and the window tracks the mouse movement. At this time you cannot resize the window, but you can move it anywhere.

Before operating on these devices the framebuffer device must be opened and the screen must be configured for the proper resolution and bit-depth. These interfaces do not allow changing these parameters at this time. To do this, one should use the fbset utility from the command line before opening the device.

It is generally recommended that framebuffers be configured to 16-bit depth since palette management routines are not supported under this interface yet. The interface is also geared towards dealing with colors in R/G/B triples since all calls take them as arguments.
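For applications that modify the framebuffer directly through the base pointer, an R/G/B triple has to be packed into a 16-bit pixel by hand. The sketch below assumes the common RGB565 layout (5 bits red, 6 bits green, 5 bits blue) and a linear framebuffer whose line length equals the virtual X resolution. Both are assumptions about the video mode, not something this interface guarantees, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 8-bit R/G/B components into one RGB565 pixel (assumed layout). */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Byte offset of pixel (x, y) from the framebuffer base at 16-bit depth. */
static size_t pixel_offset_16(size_t x, size_t y, size_t virt_xres)
{
    assert(x < virt_xres);
    return (y * virt_xres + x) * sizeof(uint16_t);
}
```

The values returned by fbset for the current mode should be checked before relying on either assumption.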

G.8 Intel 82C55 Digital IO

This driver supports the Measurement Computing PCI-24 and most boards that use the Intel 82C55 chip.

This driver allows control of the Intel 82C55 GPIO digital lines. The driver follows the standard API conventions for digital IO in RTCore. Please see the section that describes this for details.

G.8.1 Driver specifics

The Intel 82C55 provides 24 input and output lines that are configurable as input or output in 4 different banks. The banks are described below:

• /dev/dio1024_0 — 8 bits




This driver only supports mode 0 for basic input and output, as described in the Intel 82C55 manual. It does not allow data strobing or parallel data communication.


G.9 Marvell GT64260 and GT64360 Digital IO Driver

This driver allows control of the Marvell GT64260 and GT64360 GPIO digital lines. The driver follows the standard API conventions for digital IO in RTCore. Please see the section that describes this for details.

G.9.1 Driver specifics

The Marvell chipset allows each individual bit to be configured as either input or output regardless of the state of other bits. Up to 32 lines are available but many board configurations do not actually run all the lines out from the chip to connectors on the outside. It is often the case that the Marvell chip bit 0 does not correspond to the output pin on the board. Read the documentation on your board to be certain of which pins are run out from the Marvell and to where.

G.10 Power Management Driver

The RTCore power management driver provides the ability to change the processor frequency (when the hardware supports this) on a per-thread basis or immediately, and allows power saving by putting the CPU into an idle state when the system is not active.

G.11 Frequency changing

This driver allows per-thread and immediate changing of CPU frequency. Please see the man page for gpos_freq_list for details on this feature.

G.12 CPU Idle calls

When it starts, the power management driver creates an RTCore thread that is lower priority than all other RTCore tasks and lower priority than the GPOS itself. This low priority thread enters the processor “sleep” or power-saving mode when it executes. How it does this is configured via the driver.


When the RTCore system is idle (no realtime threads need to execute) the GPOS finishes any processing that it must do and then notifies the power management driver that it is now idle. The driver then changes the priority of the power-saving thread so that it executes at higher priority than the GPOS. Once executing, the power-saving thread enters the CPU-specific power saving (idle) mode and waits for an interrupt. If a realtime interrupt or realtime thread becomes active and needs to execute during this time, it is immediately scheduled with no delay. The power-saving thread only affects execution of the GPOS.

When the GPOS is allowed to execute again is configured in the power management driver. It can be much more efficient not to allow the GPOS to execute again for a given period of time rather than switching between the GPOS and the power-saving mode rapidly. The driver has a configurable parameter to set a minimum amount of time that the system will spend in power-saving mode once it enters it before allowing the GPOS to run again.
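The minimum idle time can be thought of as a simple gate on when the GPOS is handed the CPU again. The sketch below shows only that decision. The function and parameter names are hypothetical, times are assumed to be nanosecond counts from one monotonic clock, and the real driver may make this decision differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical check: returns 1 once the system has spent at least
 * min_idle_ns in power-saving mode, 0 while the GPOS should stay held off.
 * Real-time threads and interrupts are not subject to this gate.
 */
static int gpos_may_resume(int64_t now_ns, int64_t idle_entered_ns, int64_t min_idle_ns)
{
    assert(now_ns >= idle_entered_ns);
    return (now_ns - idle_entered_ns) >= min_idle_ns;
}
```

A larger minimum reduces rapid switching between the GPOS and the idle state, at the cost of delaying non-real-time work by up to that amount.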

G.13 Additional Uses

For information on additional features or functionality please email FSMLabs at: [email protected]. The power management driver is designed to be a power management infrastructure that is highly flexible to allow the developer to meet performance and power consumption needs. To achieve this, FSMLabs has provided the power management tools but not a complete and generic solution to every problem “out of the box”. FSMLabs is happy to assist in configuring and setting up any power management system.

Appendix H

“New” RTCore Networking

This subsystem of RTCore is a replacement for LNet. The new version is not yet complete but it is functional and usable for many applications, so it is being included in current releases in a “beta” form.

This new networking system has the following goals:

• Simpler driver interface to speed driver development

• Speed/latency improvements through simpler communication between protocol layers

• More complete processor and platform coverage (instead of x86 and PowerPC only)

• Standard socket interface for UDP/IP and similar protocols instead of the “nearly sockets” interface of the old version

H.1 E1000 Driver

There are many variants of the E1000 device from Intel. They differ enough that support for one does not imply support for another. The cards listed below have been tested. Any devices not listed should be considered untested.


H.1.1 Tested Revisions

• 82546EB - Quad-port copper board. Tested as a PMC board from SBE on PowerPC and x86 32-bit machines. Known to work with some caveats. These boards are known to trigger bugs present on the Marvell system controller. If your computer contains a Marvell system controller, please verify that this bug is not present by testing before using this device.

• 82541GI/PI - Single port copper board. Known to not work with the current version. There are EEPROM changes on this board that are not yet supported.

Appendix I

The RTCore POSIX namespace

RTCore now provides a fully decoupled and clean POSIX namespace for real-time applications. Historically, it has provided POSIX names to users, in addition to names from the Linux or BSD namespace. This means that users also brought Linux and BSD kernel structures and functions into their application. As of RTLinuxPro 2.1, this behavior is deprecated by default and users do not get GPOS headers unless explicitly requested.

Users are encouraged to read this section to fully understand the system, but can refer to section I.6 for quick details.

This appendix details the usage of these new headers and implications for users porting applications from versions before RTLinuxPro 2.1. The impact for existing users has been minimized as much as possible while providing a clean POSIX environment. It is recommended that both new and existing users review this chapter to ensure familiarity with how to handle clean and ‘polluted’ applications.

I.1 Clean applications

Clean applications are ones that use only RTCore and the provided POSIX environment, and do not depend on any names, functions, etc., from the GPOS. These applications can build with the usual CFLAGS and include paths provided by the rtl.mk file. For example, consider the following simple app:


#include <stdio.h>

#include <sys/types.h>

#include <sys/stat.h>

#include <unistd.h>

int main(int argc, char **argv) {

int ret;

ret = mkfifo("/test", 0777);

if (ret == ENOSPC) {

printf("Error, no space for FIFO\n");

return -1;

}

rtl_main_wait();

unlink("/test");

return 0;

}

This creates a FIFO and waits, then unlinks it on exit. It is entirely POSIX and does not require services from the GPOS, whether functions, defined values, etc. It can be built with a simple Makefile:

include path_to/rtl.mk

all: test.rtl

clean:

rm -f *.rtl

include $(RTL_DIR)/Rules.make

Clean applications can use pure POSIX names as above, or rtl_ prefixes for all POSIX functions, and RTL_ for all POSIX defined values. For example, it can use RTL_ENOSPC instead of ENOSPC, or rtl_mkfifo() instead of mkfifo(). These can be used interchangeably in a clean application.

I.2 Polluted applications

Polluted applications are those that use the POSIX environment for real-time applications, but also need to use supporting services provided by the non-real-time system. Here is the previous clean program under Linux, using Linux function calls to set up a non-real-time interrupt handler.

#include <gpos_bridge/sys/gpos.h>

#include <linux/sched.h>

#include <rtl_stdio.h>

#include <sys/rtl_types.h>

#include <sys/rtl_stat.h>

#include <rtl_unistd.h>

void *dev_id = "test";

void gpos_handler(int irq, void *dev, struct pt_regs *regs) {

return;

}

int main(int argc, char **argv) {

int ret;

ret = rtl_mkfifo("/test", 0777);

if (ret == RTL_ENOSPC) {

rtl_printf("Error, no space for FIFO\n");

return -1;

}

request_irq(4, gpos_handler, SA_SHIRQ, "test", dev_id);

rtl_main_wait();

free_irq(4, dev_id);

rtl_unlink("/test");

return 0;

}

The program now requires some Linux headers. RTCore provides a main file that users can include to get most of the GPOS namespace by default - gpos_bridge/sys/gpos.h. This does not include all of the namespace, but it does get a large portion of it. For this example, the gpos_bridge/sys/gpos.h file is included, along with Linux’s sched.h for the request_irq() and free_irq() function prototypes and everything else they need. If gpos.h is included first, it will define __KERNEL__ for the programmer, but if they are only including a few Linux headers by hand, they will need to add a #define __KERNEL__ before any Linux headers.


The other major difference is that, since there is known pollution, any code dealing with RTCore is changed to use an RTL_ or rtl_ prefix (including POSIX include files). This ensures that the application uses the RTCore function and avoids any overlap with Linux-provided names. In this example, it was not strictly necessary: the files included for Linux support do not overlap with anything used in the real-time space, so the names could have remained unchanged. However, this is not always the case, and using the rtl_ prefix ensures that the developer always gets the right name, regardless of what the GPOS has defined, what other patches may have added, and so on. This ambiguity only arises in polluted applications.

This program can be built the same way as the previous example and does not require any extra build logic. By default, though, it will generate a warning; see section I.4.3 for details on how to suppress it.

For some applications, it may be better to avoid this pollution and split the application into two components: one that is a GPOS kernel module, entirely polluted, and one that is RTCore-based only, non-polluted. This prevents any possible confusion about naming ambiguities and provides a clean separation between real-time kernel components and non-real-time kernel components. The build system does not enforce this, though, as it may not be desirable for some users.

I.3 PSDD users

PSDD users are by definition polluted, as they share their POSIX namespace with userspace applications. However, this does not mean that there is ambiguity; the same simple rules apply. As with previous releases of RTLinuxPro, PSDD users build applications with USER_CFLAGS, which provides the correct include information. The difference now is that all pieces of code that use RTCore-provided services must use the rtl_ or RTL_ prefix for the function.

For example, a normal userspace application uses stdio.h:

#include <stdio.h>

An application with a PSDD component must now include the rtl_-prefixed version of the file for RTCore services:

#include <rtl_stdio.h>


Other standard POSIX includes also follow the same formula:

#include <sys/types.h>

#include <sys/rtl_types.h>

In each case, the rtl_-prefixed file provides the rtl_-prefixed POSIX functions and names. When writing code, the same rules apply. Here are two lines that open two files, one GPOS file and one real-time device:

int fd, rtl_fd;

fd = open("/gpos_file", O_NONBLOCK);

rtl_fd = rtl_open("/rtcore_device", RTL_O_NONBLOCK);

The same mechanism applies to read(), write(), pthread_create(), and so on, for the rest of the POSIX functions provided to PSDD by RTCore. Note that a file descriptor associated with a GPOS file (fd in this example) is not interchangeable with an RTCore file descriptor: each system has its own set of file descriptors, thread identifiers, etc.

I.4 Include hierarchies and rules

RTCore provides 3 main hierarchies of headers:

• app

• rtcore

• gpos_bridge

Each one provides a specific set of information:

I.4.1 app/

This provides the pure POSIX environment for users. In that directory is a set of POSIX-compliant files that provide standard names and functions. Below that is an rtl directory that provides those same names with the rtl_ prefix. In-kernel users get this directory and the subdirectory in their include path by default; PSDD users get only the subdirectory (with the rtl_ prefixes).

A user that does this:


#include <string.h>

gets standard functions like strcmp(). This also includes the rtl_string.h file, which provides the real rtl_strcmp() function. This allows the two to be interchangeable in clean code, while polluted code can include rtl_string.h and get the RTCore name only. It also allows users to include rtl_errno.h and get RTCore's errno set, without conflicting with any other provided set of values.

Users get this directory in their include path by default, along with the rtl subdirectory, so no include path additions are needed.

I.4.2 rtcore/

This directory is internal to RTCore and will not be visible to most users. It contains internal header information specific to RTCore and does not provide supported interfaces to RTCore. Users who get this directory tree and do need files in it can include them unambiguously with:

#include <rtcore/file_x.h>

I.4.3 gpos_bridge/

Users who need to use GPOS-provided names can use the facilities provided in this directory. For example, a user who simply wants to get as much as possible from the GPOS namespace can do:

#include <gpos_bridge/sys/gpos.h>

As was seen earlier, this gets a lot of names from the GPOS without having to explicitly name them. The user will see a warning that GPOS headers are being used, but this can be suppressed by adding this to the top of the file:

#define __RTCORE_POLLUTED_APP__

This must be done before including any files, so that the entire compilation knows that GPOS pollution is intended. When the programmer includes GPOS files, the build will detect this but will not warn the programmer that pollution has been detected.


I.5 Including GPOS files

GPOS files can be included directly, in addition to the gpos.h file discussed above. However, the GPOS generally does not expect the namespace to be populated. Because of this, any GPOS-specific files (such as sched.h in the earlier example) must be included first, before any RTCore files. The RTCore header system will determine whether the GPOS has created any names and handle them.

Users who explicitly include GPOS kernel files without including gpos.h must define __KERNEL__ before including the file, as most GPOS kernel headers expect this to be defined. (gpos.h defines it for the developer.)

Users can also include rtcore_app.h. This identifies the user as an application and enables build checks to detect GPOS pollution. However, the RTCore-provided POSIX files in app/ include this by default, so the check is generally done transparently.

I.6 Quick rules

I.6.1 Older apps that must be polluted

Users with older applications that would like to remain polluted should follow these steps:

• Use #include <rtl_stdio.h> and similarly prefixed names for POSIX includes, instead of #include <stdio.h>, for example.

• Use rtl_ and RTL_ prefixes on RTCore-based function calls, constants, etc.

• Include GPOS-specific files before RTCore include files. If the user is not going to include gpos_bridge/sys/gpos.h, they will need to add a #define __KERNEL__ before including GPOS headers.

• Add #define __RTCORE_POLLUTED_APP__ to the front of all polluted .c files, before includes, to suppress any build warnings.

• #include <rtl.h> can be used to easily get much of the pollution that was previously present.


I.6.2 Older users that want to avoid pollution

Users that want to easily avoid pollution can simply split the pieces that depend on GPOS-specific headers into a separate .c file, compile it in a separate step, and link it to the application.

I.6.3 Exceptions

The RTCore header system can handle most of what a GPOS may define for it, but there are some rare cases where the GPOS may provide a very corrupted set of names. These usually result in POSIX names being redefined to version-specific internal names from the GPOS, which may cause problems during a build.

An example of this is signal.h, which may redefine POSIX names to internal structures and can corrupt the namespace in applications that need to be polluted. In PSDD applications, the non-real-time GPOS thread may need to use signal.h and the functions it provides, so for this rare case, it is recommended that users do the following:

#include <stdio.h>
#include <fcntl.h>
...
#include <signal.h>

/* Handle any GPOS specific signalling functions here */
void cleanup(void (*clean)(int)) {
    struct sigaction sigact;
    sigset_t set;

    sigemptyset(&set);
    sigact.sa_mask = set;
    sigact.sa_handler = clean;
    sigact.sa_flags = 0;
    /* Install the handler: the new action is the second argument */
    sigaction(SIGCONT, &sigact, NULL);
}

/* Now include RTCore headers */
#include <rtl_signal.h>
...

It should be stated that this is a very rare case and will not appear in most applications. However, if it does and pollution is a requirement, such as in a userspace real-time application, the best method is to include GPOS headers, perform work with those headers, and then include RTCore headers.

Appendix J

System Testing

When selecting a platform for RTCore, the only way to know whether it will really do the job is to test it on the actual hardware. While the test environment does not have to exactly mirror the target environment, the closer the test system is to the final system, the more reliable the results will be. In general, the outcome of these tests answers three essential questions:

1. Can I run RTCore on this system at all or is it simply not suited?

2. What is the worst-case scheduling jitter to be expected in this system setup?

3. What interrupt response may I expect from my peripheral devices?

This will not eliminate the requirement to evaluate the final system setup the developer wishes to deploy, but it will minimize the risk of running into hardware-related problems during project development.

J.1 Running the regression test

The regression test will let the programmer know whether RTCore will operate properly on the selected system. If the regression test fails, the system is either not installed correctly or is simply not suitable for real-time work. If the regression test fails, please contact FSMLabs support.

After compiling and loading the updated kernel, change to the rtlinuxpro directory and issue the following command:


bash scripts/regression.sh

This will run a set of tests, all of which MUST return the status [ OK ]. If any of the tests fail, contact FSMLabs support. If the first run passes without any errors, it is generally helpful to keep running the regression test for a while. To run the test in an infinite loop, issue the following command, again from the rtlinuxpro directory:

bash scripts/long_regression.sh

(This is the normal regression script run in an infinite loop, printing the number of runs completed as it goes.)

J.1.1 Stress testing

The importance of testing under heavy load cannot be stressed enough. It is important to see how the real-time system behaves in terms of scheduling when placed under varying loads. Some jitter shift will occur due to hardware load, but this should be minimal.

Running the jitter test is easy. Change directories to the rtlinuxpro/measurement directory. There the user will find a jitter.rtl binary. Run it, and it will print the worst-case timings seen so far on each CPU. It only prints a message when a new worst-case value is seen.

At this point, the real-time threads are running and it is time to place the machine under load. This can vary greatly by hardware, but here is a basic start. It is important to put the machine under heavy interrupt, memory, and CPU load. First, change to the kernel directory and run:

make dep

make clean

make -j 60

And/or on another console, log in and run several instances of:

find / > /dev/null 2>&1 &

This will add to the thrashing of the GPOS VM. Increase the number of find processes running, preferably staggered in time so that the buffer cache is cycled through. Add other applications until swapping is induced and the system is under heavy load. For SMP machines, it helps to have more instances running, as each CPU thrashes over the PCI bus. For some embedded boards, running make on the kernel is not feasible, but a high number of finds is a good approximation when done in conjunction with the next step.

Finally, run a ping flood (ping -f machine) from another machine on the network, over at least a 100 Mbit link. This, in addition to the disk work, will put the machine under heavy interrupt load. Feel free to add more work, as RTCore will handle the load. It is important that the developer determine what the hardware is capable of doing with respect to real-time demands.

Many test applications from other vendors do very short tests, either in time or number of interrupts (some as short as a minute). Due to potential cache interactions and other factors, it is important that a test machine be placed under load for a long time, preferably days. FSMLabs performs all testing under heavy load for a period of at least 48 hours before releasing any kind of performance numbers.

J.2 Jitter measurement

Scheduling jitter is defined as the difference between the time that code was scheduled to run and the actual point at which it executes. Scheduling overhead and hardware latencies contribute to this value, and while some jitter will nearly always happen, it is important to get a worst-case value for the hardware.

Note that most companies provide worst-case numbers in terms of context switch times. This number is in most cases useless except from a marketing standpoint. Consider an absolute worst case in the real world, where a thread needs to execute at time x. The context switch is only a small part of the work that must happen here. First, the timer interrupt needs to occur, indicating that it is time to work. Then the scheduler needs to be woken in order to determine which thread gets executed next. Finally, there has to be a context switch into the context of the thread that should be run.

RTCore is well optimized for these situations. When FSMLabs quotes worst-case numbers, the figure is the sum of not just the context switch, but all three factors:

interrupt latency + scheduling overhead + context switch = worst case


The previously run test involves a real-time thread scheduled on each CPU to run every 1000 microseconds. At each scheduling point, it calculates the delta between the actual and the expected scheduling point. The code takes 1000 samples per second and pushes the results to a handler that may dump them to the controlling terminal.

In general, the load on the machine will not affect the running of the real-time code, although high interrupt rates will cause a shift in the worst-case value. As covered in J.1.1, the machine should be placed under heavy load in order to get an accurate worst-case value. Once this value has been gathered, kill the userspace application and unload the real-time module.

Appendix K

Sample programs

Here is a collection of the source code for all of the examples used in the book. They are also provided in the RTLinuxPro distribution in doc/pdf/rtl_book_code.

K.1 Hello world

#include <stdio.h>

int main(void)
{
    printf("Hello from the RTL base system\n");
    return 0;
}

K.2 Multithreading

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

pthread_t thread;

...
    struct timespec next;
    int count = 0;

    ...
    while (1) {
        timespec_add_ns(&next, 1000*1000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        count++;
        if (!(count % 1000))
            printf("woke %d times\n",count);
    }

    return NULL;
}

int main(void)
{
    ...
    rtl_main_wait();
    ...
    return 0;
}

K.3 FIFOs

K.3.1 Real-time component

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>

pthread_t thread;
int fd1;

...
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);

        write(fd1, "a message\n", strlen("a message\n"));
    }

    return NULL;
}

int main(void)
{
    mkfifo("/communicator", 0666);

    fd1 = open("/communicator", O_RDWR | O_NONBLOCK);

    ftruncate(fd1, 16<<10);

    ...
    rtl_main_wait();
    ...

    close(fd1);
    unlink("/communicator");

    return 0;
}

K.3.2 Userspace component

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int fd;
    char buf[255];

    fd = open("/communicator", O_RDONLY);
    while (1) {
        read(fd,buf,255);
        printf("%s",buf);
        sleep(1);
    }
}

K.4 Semaphores

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <semaphore.h>

pthread_t wait_thread;
pthread_t post_thread;
sem_t sema;

void *wait_code(void *t)
{
    while (1) {
        sem_wait(&sema);
        printf("Waiter woke on a post\n");
    }
}

void *post_code(void *t)
{
    struct timespec next;

    ...
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);

        printf("Posting to the semaphore\n");

        sem_post(&sema);
    }

    return NULL;
}

int main(void)
{
    sem_init(&sema, 1, 0);

    pthread_create(&wait_thread, NULL, wait_code, (void *)0);
    pthread_create(&post_thread, NULL, post_code, (void *)0);

    rtl_main_wait();

    pthread_cancel(post_thread);
    pthread_join(post_thread, NULL);

    pthread_cancel(wait_thread);
    pthread_join(wait_thread, NULL);

    sem_destroy(&sema);

    return 0;
}

K.5 Shared Memory

K.5.1 Real-time component

#include <time.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
#include <errno.h>

#define MMAP_SIZE 5003

pthread_t rthread, wthread;
int rfd, wfd;
unsigned char *raddr, *waddr;

void *writer(void *arg)
{
    ...
    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    waddr = (char*)mmap(0,MMAP_SIZE,PROT_READ|PROT_WRITE,MAP_SHARED,wfd,0);
    if (waddr == MAP_FAILED) {
        printf("mmap failed for writer\n");
        return (void *)-1;
    }

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        waddr[0]++;
        waddr[1]++;
        waddr[2]++;
        waddr[3]++;
    }
}

void *reader(void *arg)
{
    ...
    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    raddr = (char*)mmap(0,MMAP_SIZE,PROT_READ|PROT_WRITE,
                        MAP_SHARED,rfd,0);
    if (raddr == MAP_FAILED) {
        printf("failed mmap for reader\n");
        return (void *)-1;
    }

    ...
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        printf("rtl_reader thread sees "
               "0x%x, 0x%x, 0x%x, 0x%x\n",
               raddr[0], raddr[1], raddr[2], raddr[3]);
    }
}

...
    wfd = shm_open("/dev/rtl_mmap_test", RTL_O_CREAT, 0600);
    if (wfd == -1) {
        printf("open failed for write on "
        ...
    }

    rfd = shm_open("/dev/rtl_mmap_test", 0, 0);
    if (rfd == -1) {
        printf("open failed for read on "
        ...
    }

    ftruncate(wfd,MMAP_SIZE);

    pthread_create(&wthread, NULL, writer, 0);
    pthread_create(&rthread, NULL, reader, 0);

    rtl_main_wait();

    pthread_cancel(wthread);
    pthread_join(wthread, NULL);
    pthread_cancel(rthread);
    pthread_join(rthread, NULL);
    munmap(waddr, MMAP_SIZE);
    munmap(raddr, MMAP_SIZE);

    close(wfd);
    close(rfd);

    shm_unlink("/dev/rtl_mmap_test");
    return 0;
}

K.5.2 Userspace application

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>

#define MMAP_SIZE 5003

int main(void)
{
    int fd;
    unsigned char *addr;

    if ((fd=open("/dev/rtl_mmap_test", O_RDWR))<0) {
        perror("open");
        exit(-1);
    }

    addr = mmap(0, MMAP_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        printf("return was %d\n",errno);
        perror("mmap");
        exit(-1);
    }

    while (1) {
        printf("userspace: the rtl shared area contains"
               " : 0x%x, 0x%x, 0x%x, 0x%x\n",
               addr[0], addr[1], addr[2], addr[3]);
        sleep(1);
    }

    munmap(addr, MMAP_SIZE);

    close(fd);
    return 0;
}

K.6 Cancel Handlers

#include <time.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>

pthread_t thread;
pthread_mutex_t mutex;

void cleanup_handler(void *mutex)
{
    pthread_mutex_unlock((pthread_mutex_t *)mutex);
}

void *thread_handler(void *arg)
{
    pthread_cleanup_push(cleanup_handler,&mutex);
    pthread_mutex_lock(&mutex);
    while (1) { usleep(1000000); }
    pthread_cleanup_pop(0);
    pthread_mutex_unlock(&mutex);
    return 0;
}

...
    pthread_mutex_init(&mutex, NULL);
    pthread_create(&thread, NULL, thread_handler, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    pthread_mutex_destroy(&mutex);
    return 0;
}

K.7 Thread API

#include <time.h>
#include <pthread.h>
#include <stdio.h>

pthread_t thread1, thread2;
void *thread_stack;

void *handler(void *arg)
{
    printf("Thread %d started\n",arg);

    if (arg == 0) { //first thread spawns the second
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 32768);
        pthread_attr_setstackaddr(&attr,thread_stack);
        pthread_create(&thread2,&attr,handler,(void*)1);
    }

    return 0;
}

...
    thread_stack = rtl_gpos_malloc(32768);
    if (!thread_stack)
        return -1;

    pthread_create(&thread1, NULL, handler, (void*)0);

    rtl_main_wait();

    pthread_cancel(thread1);
    pthread_join(thread1, NULL);
    pthread_cancel(thread2);
    pthread_join(thread2, NULL);
    rtl_gpos_free(thread_stack);
    return 0;
}

K.8 One Way queues

#include <time.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <onewayq.h>

pthread_t thread1, thread2;

DEFINE_OWQTYPE(our_queue,32,int,0,-1);
DEFINE_OWQFUNC(our_queue,32,int,0,-1);

our_queue Q;

void *queue_thread(void *arg)
{
    int count = 1;
    struct timespec next;

    ...
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        if (our_queue_enq(&Q,count)) {
            printf("warning: queue full\n");
        }
        count++;
    }
}

void *dequeue_thread(void *arg)
{
    int read_count;
    struct timespec next;

    ...
        timespec_add_ns(&next, 500000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        read_count = our_queue_deq(&Q);
        if (read_count) {
            printf("dequeued %d\n",read_count);
        } else {
            printf("queue empty\n");
        }
    }
}

...
    our_queue_init(&Q);
    pthread_create(&thread1, NULL,
                   queue_thread, 0);
    pthread_create(&thread2, NULL,
                   dequeue_thread, 0);

    rtl_main_wait();

    pthread_cancel(thread1);
    pthread_join(thread1, NULL);
    pthread_cancel(thread2);
    pthread_join(thread2, NULL);
    return 0;
}

K.9 Processor reserve/optimization

#include <pthread.h>
#include <stdio.h>
#include <semaphore.h>

pthread_t thread;
unsigned long mask = 0x2, oldmask;
sem_t irq_sem;

unsigned int irq_handler(unsigned int irq, struct rtl_frame *regs)
{
    rtl_global_pend_irq(irq);
    sem_post(&irq_sem);
    return 0;
}

...
    rtl_irq_set_affinity(12, &mask, &oldmask);

    while (1) {
        sem_wait(&irq_sem);
        printf("Got IRQ 12\n");
    }
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;

    sem_init(&irq_sem, 1, 0);
    pthread_attr_init(&attr);
    pthread_attr_setcpu_np(&attr, 1);
    pthread_attr_setreserve_np(&attr, 1);

    rtl_request_irq(12, irq_handler);

    pthread_create(&thread, &attr, thread_code, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);

    rtl_free_irq(12);

    rtl_irq_set_affinity(12, &oldmask, &mask);
    sem_destroy(&irq_sem);

    return 0;
}

K.10 Soft IRQs

#include <time.h>
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>

pthread_t thread;
static int our_soft_irq;

...
    struct sched_param p;
    struct timespec next;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(),
                          SCHED_FIFO, &p);

    ...
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        rtl_global_pend_irq(our_soft_irq);
    }
    return 0;
}

static int soft_irq_count;

void soft_irq_handler(int irq, void *ignore,
                      struct rtl_frame *ignore_frame) {
    soft_irq_count++;
    printf("Recieved soft IRQ #%d\n",soft_irq_count);
}

...
    soft_irq_count = 0;
    our_soft_irq = rtl_get_soft_irq(soft_irq_handler,
                                    "Simple SoftIRQ\n");
    if (our_soft_irq == -1)
        return -1;

    pthread_create(&thread, NULL, start_routine, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    rtl_free_soft_irq(our_soft_irq);
    return 0;
}

K.11 PSDD sound speaker driver

#define RTL_RTC_A 10
#define RTL_RTC_B 11
#define RTL_RTC_C 12
#define RTL_RTC_D 13

#define RTL_RTC_PORT(x) (0x70 + (x))
#define RTL_RTC_WRITE(val, port) do { rtl_outb_p((port),RTL_RTC_PORT(0)); \
    rtl_outb_p((val),RTL_RTC_PORT(1)); } while(0)
#define RTL_RTC_READ(port) ({ rtl_outb_p((port),RTL_RTC_PORT(0)); \
    rtl_inb_p(RTL_RTC_PORT(1)); })

#include <rtl_pthread.h>
#include <rtl_unistd.h>
#include <rtl_time.h>
#include <rtl_signal.h>
#include <rtl_errno.h>
#include <rtl_stdio.h>
#include <sys/rtl_io.h>
#include <sys/rtl_types.h>
#include <sys/rtl_stat.h>
#include <rtl_fcntl.h>
#include <sys/rtl_ioctl.h>
#include <unistd.h>

#define FIFO_NO 3

#define RTC_IRQ 8

int fd_fifo;
int fd_irq;
rtl_pthread_t thread;

char save_cmos_A;
char save_cmos_B;

static int filter(int x)
{
    static int oldx;
    int ret;

    if (x & 0x80) {
        x = 382 - x;
    }
    ret = x > oldx;
    oldx = x;
    return ret;
}

void *sound_thread(void *param)
{
    char data;
    char temp;
    struct rtl_siginfo info;
    while (1) {
        rtl_read(fd_irq, &info, sizeof(info));
        (void) RTL_RTC_READ(RTL_RTC_C); /* clear IRQ */
        rtl_ioctl(fd_irq, RTL_IRQ_ENABLE);

        if (rtl_read(fd_fifo, &data, 1) > 0) {
            data = filter(data);
            temp = rtl_inb(0x61);
            temp &= 0xfc;
            if (data) {
                temp |= 3;
            }
            rtl_outb(temp,0x61);
        }
    }
    return 0;
}

...
    char ctemp;

    char devname[30];
    sprintf(devname, "/dev/rtf%d", FIFO_NO);
    fd_fifo = rtl_open(devname, RTL_O_WRONLY|
                       RTL_O_CREAT|RTL_O_NONBLOCK);
    if (fd_fifo < 0) {
        rtl_printf("open of %s returned %d; errno = %d\n",
                   devname, fd_fifo, rtl_errno);
        return -1;
    }
    rtl_ioctl(fd_fifo, RTF_SETSIZE, 4000);

    fd_irq = rtl_open("/dev/irq8", RTL_O_RDONLY);
    if (fd_irq < 0) {
        rtl_printf("open of /dev/irq8 returned %d; errno = %d\n",
                   fd_irq, rtl_errno);
        rtl_close(fd_fifo);
        return -1;
    }

    rtl_pthread_create(&thread, NULL, sound_thread, NULL);

    /* program the RTC to interrupt at 8192 Hz */
    save_cmos_A = RTL_RTC_READ(RTL_RTC_A);
    save_cmos_B = RTL_RTC_READ(RTL_RTC_B);

    /* 32kHz Time Base, 8192 Hz interrupt frequency */
    RTL_RTC_WRITE(0x23, RTL_RTC_A);

    ctemp = RTL_RTC_READ(RTL_RTC_B);
    ctemp &= 0x8f; /* Clear */
    ctemp |= 0x40; /* Periodic interrupt enable */
    RTL_RTC_WRITE(ctemp, RTL_RTC_B);

    (void) RTL_RTC_READ(RTL_RTC_C);
    rtl_ioctl(fd_irq, RTL_IRQ_ENABLE);

    while (1) {
        sleep(1000);
    }

    rtl_pthread_cancel(thread);
    rtl_pthread_join(thread, NULL);

    RTL_RTC_WRITE(save_cmos_A, RTL_RTC_A);
    RTL_RTC_WRITE(save_cmos_B, RTL_RTC_B);
    rtl_close(fd_irq);
    rtl_close(fd_fifo);

    return 0;
}
