The 27 Year Old Microkernel
Sebastien Marineau-Mes & Colin Burgess
All content copyright QNX Software Systems
Agenda
Background and timeline
Hybrid Software Model
Anatomy of the microkernel
How things work – system calls, process manager
Q&A
A History of Software Innovation
1980: First commercially available microkernel OS
1982: First RTOS to support a hard disk on a PC
1985: First memory-protected RTOS
1990: First POSIX-certified RTOS
1992: First RTOS to offer built-in fault-tolerant networking
1994: US patent for scalable microkernel windowing system
1997: First RTOS to support symmetric multiprocessing (SMP)
2002: First RTOS vendor to deliver Eclipse-based IDE
2005: First to offer “bound” multi-processing
2007: Introduces hybrid software model and opens source code
OS generations shown on the timeline: QNX2, QNX4, QNX6
Hybrid Software Model
Developer Enablement
> Published source code – runtime components
> Transparent development
  QNX development teams working in the open
  Live check-ins for features and bugfixes
Community Enablement
> Foundry 27 developer portal
> Initial projects: OS, Tools, BSPs, Bazaar
Business Enablement
> Free access to development tools for non-commercial and partners
> Free access to source for development
  Standard business model & pricing for commercial projects
> Ability to create and distribute derivative works
> Flexible contribution model
Microkernel Architecture
[Diagram: Application, File System, Networking, Multi-media, Windowing and Process Manager processes plugged into the microkernel (µK) message bus]
Microkernel + Process Manager are the only trusted components
Microkernel: ARM, MIPS, SH4, PowerPC, XScale, x86
Applications and Drivers
Are processes which plug into a message bus
• Reside in their own memory-protected address space
• Have a well defined message interface
• Cannot corrupt other software components
• Can be started, stopped and upgraded on the fly
Separation of Duties – Process Manager vs. MicroKernel
Microkernel: Messages, Threads, Synchronization, Scheduling, Signals, Channels, Connections, Interrupts, Timers
Process Manager: Pathname, Process, Virtual Memory, procfs, Debug, Loader, Named Sems, imagefs, Resources
Both are part of procnto
Microkernel Services
Services: Messages, Threads, Synchronization, Scheduling, Signals, Channels, Connections, Interrupts, Timers
Simple pre-emptable operations
Provides basic system services
> Implements much of the POSIX thread and realtime standard
> Interrupt and exception redirection
> IPC primitives
Most of the microkernel is hardware independent
> CPU-dependent layer for low-level CPU interfaces
> CPU-specific optimized routines
Only piece of code that runs with full system privilege
Microkernel does not run “on its own”
> Only reacts to external events: system calls, interrupts, exceptions
Process Manager Services
Implements long, complex operations and services
> Ex: Process creation and memory management
Is a multi-threaded process that is scheduled at normal priority
> Competes for CPU with all other threads in the system
Message driven server
  More on this later
Services: Pathname, Process, Virtual Memory, procfs, Debug, Loader, Named Sem, imagefs, Resources
Procnto Source Layout
/services/system/
  ker       Microkernel services
  memmgr    Virtual Memory Management
  procmgr   Process lifecycle management
  pathmgr   Pathname management
  proc      Support Functions
CPU-specific subdirectories (arm, mips, ppc, sh, x86) appear beneath some of these
Looking for source code? Go to www.foundry27.com -> Projects -> Core Operating System -> Source Guide
Kernel call operations sequence
[Diagram: the sequence from kernel entry to kernel exit. Labels shown: Entry (interrupts off); Unlocked (pre-emptable); Kernel Operation, which may include message pass; Locked (no pre-emption, interrupts on); Unlocked (pre-emptable); Exit (interrupts off)]
Anatomy of a Kernel Call
A user-mode thread makes a call to a system call stub located in libc
> Ex: MsgSendvnc()
The system call stub executes a TRAP instruction (or whatever instruction is appropriate for the particular hardware).

    MsgSendvnc:
        lw      $8,16($29)
        addiu   $2,$0,12
        syscall
        jr      $31
        nop

The processor changes privilege state, interrupts are disabled
Execution resumes at the appropriate vector
> services/system/ker/<cpu>/kernel.s
> One of the few pieces of code that is all assembly – see __ker_entry, __ker_sysenter

    /*
     * r4k_syscall_handler()
     *  Streamlined path for our most common operation--kernel calls
     */
    FRAME(r4k_syscall_handler,sp,0,ra)
        .set noat
        /*
         * Coming from user mode. Save user registers, and get
         * a fresh kernel stack. Move GP to our own short data
         * area.
         */
        LD_ACTIVE_AND_KERSTACK(k0,k1)
        addiu   k0,k0,REG_OFF
        SAVE_REGS(1)
Kernel Entry
    /*
     * r4k_syscall_handler()
     *  Streamlined path for our most common operation--kernel calls
     */
    FRAME(r4k_syscall_handler,sp,0,ra)
        /*
         * Coming from user mode. Save user registers, and get
         * a fresh kernel stack. Move GP to our own short data
         * area.
         */
        LD_ACTIVE_AND_KERSTACK(k0,k1)
        SAVE_REGS(1)
        ACQUIRE_KERNEL(INKERNEL_NOW,zero,1)

        /*
         * Interrupts are now OK again
         */
        STI
        /*
         * Kernel call number should still be intact in v0.
         * Save the kernel call number.
         */
        sw      v0,SYSCALL(s0)

    #if defined(VARIANT_instr)
        la      t1,_trace_call_table
    #else
        la      t1,ker_call_table
    #endif

        /*
         * Index the call table and run the C code
         */

Callouts on the code, in execution order:
System call entry
Load kernel stack
Save thread context (register set)
Acquire the kernel lock
• On uniprocessor, atomically set the INKERNEL_NOW bit
• On SMP systems, spinlock on INKERNEL_NOW
Enable interrupts
Transfer to kernel call implementation
• services/system/ker_call_table.c
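The last three entry steps (acquire the kernel lock, then index a call table and run C code) can be sketched in portable C. This is only an illustration under stated assumptions: every `demo_` name is hypothetical, the lock is a C11 atomic rather than the real ACQUIRE_KERNEL macro, and releasing the bit right after the call is a simplification of what the real exit path does.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_INKERNEL_NOW 0x1u   /* hypothetical stand-in for the INKERNEL_NOW bit */

typedef struct { int tid; } demo_thread_t;
typedef int (*demo_kercall_fn)(demo_thread_t *act, void *args);

static atomic_uint demo_inkernel;   /* lock word, zero when nobody is in the kernel */

/* Atomically set the bit; keep spinning while another CPU already holds it. */
static void demo_acquire_kernel(void) {
    while (atomic_fetch_or(&demo_inkernel, DEMO_INKERNEL_NOW) & DEMO_INKERNEL_NOW) {
        /* spin */
    }
}

static void demo_release_kernel(void) {
    atomic_fetch_and(&demo_inkernel, ~DEMO_INKERNEL_NOW);
}

/* Two toy kernel calls sharing one handler signature. */
static int demo_ker_nop(demo_thread_t *act, void *args) {
    (void)act; (void)args;
    return 0;
}

static int demo_ker_get_tid(demo_thread_t *act, void *args) {
    *(int *)args = act->tid;
    return 0;
}

static const demo_kercall_fn demo_call_table[] = { demo_ker_nop, demo_ker_get_tid };

/* Entry path: take the lock, bounds-check the call number, index the table. */
int demo_kernel_call(demo_thread_t *act, unsigned callno, void *args) {
    int status;

    demo_acquire_kernel();
    if (callno < sizeof demo_call_table / sizeof demo_call_table[0])
        status = demo_call_table[callno](act, args);
    else
        status = ENOSYS;
    demo_release_kernel();
    return status;
}
```

Calling `demo_kernel_call(&t, 1, &out)` stores the caller's tid in `out`; an out-of-range call number is rejected before the table is touched.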
Kernel Function Implementation
    int kdecl ker_timer_create(THREAD *act, struct kerargs_timer_create *kap) {
        VALID_CLOCKID(kap);

        if(kap->event) {
            RD_VERIFY_PTR(act, kap->event, sizeof(*kap->event));
            RD_PROBE_INT(act, kap->event, sizeof(*kap->event) / …
        }

        prp = act->process;
        …

Callouts on the code:
Entry from kercall table
Validate parameters
All up-front work done, ready to do the real work

Verify pointers referenced by kernel are valid
• RD_PROBE/WR_PROBE functions
• RD_VERIFY_*/WR_VERIFY_* functions
• If addresses are not accessible, a fault will be generated and the kernel call will return with EFAULT
It’s very important to get the validation right, as a fault (due to an invalid or malicious parameter passed in to the call) could be catastrophic
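A minimal sketch of the validate-before-use idea, assuming a single contiguous user region described by a hypothetical `demo_aspace` structure. The real RD_VERIFY_*/RD_PROBE_* macros are not reproduced here; this only shows a range check that fails with EFAULT before the kernel dereferences a caller-supplied pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the memory a caller is allowed to pass in. */
typedef struct {
    uintptr_t base;
    size_t    len;
} demo_aspace;

/* Return 0 if [ptr, ptr + size) lies inside the caller's region, else EFAULT.
 * The comparisons are arranged so that no addition can overflow. */
int demo_verify_ptr(const demo_aspace *as, const void *ptr, size_t size) {
    uintptr_t p = (uintptr_t)ptr;
    size_t off;

    if (p < as->base)
        return EFAULT;
    off = (size_t)(p - as->base);
    if (off > as->len || size > as->len - off)
        return EFAULT;
    return 0;
}

/* Toy kernel call: validate first, and only then read through the pointer. */
int demo_ker_get_value(const demo_aspace *as, const int *user_ptr, int *out) {
    int err = demo_verify_ptr(as, user_ptr, sizeof *user_ptr);

    if (err != 0)
        return err;
    *out = *user_ptr;
    return 0;
}
```

A pointer one element past the end of the region, or a NULL pointer, is rejected with EFAULT and never dereferenced.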
Microkernel Pre-emption
[Diagram: kernel call operations sequence, repeated from the earlier slide]
Kernel call preemption is important
Interrupt activity may READY a higher priority thread to run while a kernel call is in progress
We want to immediately schedule this higher priority THREAD (minimize scheduling latency)
QNX does this in a novel way – pre-emptable kernel
> Defer changing global kernel state
> Implementation of kernel ops is 2 stages: do the work followed by a “commit”
On preemption, the active thread’s IP is backed up to re-execute the SYSENTER instruction
Allows us to only have one kernel stack – not one per thread
Any memory references/calculations done before locking kernel must be re-startable
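The two-stage "do the work, then commit" pattern can be illustrated with a small simulation. This is a sketch under assumptions, not the kernel's code: the `demo_kernel` structure, the `preempt_pending` flag and the restart loop are hypothetical stand-ins for an interrupt readying a higher priority thread and for backing up the IP to re-execute the call.

```c
#include <stdbool.h>

enum { DEMO_DONE = 0, DEMO_RESTART = 1 };

typedef struct {
    int  committed_value;   /* global state, written only in the commit stage */
    bool preempt_pending;   /* set by a simulated interrupt */
    bool locked;            /* true only while the commit stage runs */
} demo_kernel;

/* One attempt at a kernel operation. */
int demo_kernel_op(demo_kernel *k, int input) {
    /* Stage 1: pre-emptable work that touches local data only,
     * so it can be thrown away and repeated safely. */
    int result = input * 2;

    if (k->preempt_pending) {
        /* Abandon the call before any global state changed. Clearing the
         * flag stands in for running the higher priority thread. */
        k->preempt_pending = false;
        return DEMO_RESTART;
    }

    /* Stage 2: lock, commit the result, unlock. */
    k->locked = true;
    k->committed_value = result;
    k->locked = false;
    return DEMO_DONE;
}

/* Re-issue the call until it commits; returns the number of attempts. */
int demo_call_with_restart(demo_kernel *k, int input) {
    int attempts = 0;

    do {
        attempts++;
    } while (demo_kernel_op(k, input) == DEMO_RESTART);
    return attempts;
}
```

Because nothing global is written before the commit, the abandoned first attempt leaves no trace and no per-thread kernel stack is needed to resume it.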
Lock Kernel
Callouts on the code:
Argument Verification
User Pointer Verification
Lock kernel to change status

    int kdecl ker_sched_get(THREAD *act, struct kerargs_sched_get *kap) {
        PROCESS *prp;
        THREAD  *thp;

        // Verify the target process exists.
        if((prp = (kap->pid ? lookup_pid(kap->pid) : act->process)) == NULL)
            return ESRCH;
        // Verify the target thread exists.
        if((thp = (kap->tid ? vector_lookup(&prp->threads, kap->tid-1) : act)) == NULL)
            return ESRCH;
        // Verify we have the right to examine the target process
        if(!kerisusr(act, prp))
            return ENOERROR;

        if(kap->param) {
            WR_VERIFY_PTR(act, kap->param, sizeof(*kap->param));
            WR_PROBE_OPT(act, kap->param, sizeof(*kap->param) / sizeof(int));
            kap->param->sched_curpriority = thp->priority;
            kap->param->sched_priority = thp->real_priority;
        }

        lock_kernel();
        SETKSTATUS(act,thp->policy);
        return ENOERROR;
    }

Most of the preparatory work is done before locking the kernel, if possible.
Exit Kernel
Exit kernel to run user-space thread
> Note that currently scheduled thread may have changed
  Ex: entered kernel due to HW interrupt, interrupt readied high-prio thread (that high-prio thread becomes RUNNING)
  Ex: Blocking kernel call causes current thread to be blocked, another to be made RUNNING
> __ker_exit implements this
Adjust the address space if needed
> memmgr.aspace()
Do special return processing
> Deliver signals, pulses etc
> This may cause a reschedule which could cause another loop through __ker_exit
Restore the context of the (newly) active thread
Call SYSEXIT
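The exit loop described above (pick the thread to run, do special return processing, loop again if that caused a reschedule) can be sketched as a toy scheduler. All types and names are hypothetical, and the `wake_on_signal` parameter is an assumption used only to simulate signal delivery readying another thread.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int  priority;
    bool ready;
    int  pending_signals;
} demo_thread;

/* Pick the highest priority READY thread, or NULL if none is ready. */
static demo_thread *demo_select(demo_thread *t, size_t n) {
    demo_thread *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (t[i].ready && (best == NULL || t[i].priority > best->priority))
            best = &t[i];
    }
    return best;
}

/* Simplified exit path: returns the thread whose context would be restored. */
demo_thread *demo_ker_exit(demo_thread *t, size_t n, demo_thread *wake_on_signal) {
    for (;;) {
        demo_thread *act = demo_select(t, n);

        if (act == NULL || act->pending_signals == 0)
            return act;   /* nothing special left to do: restore and leave */

        /* Special return processing: deliver one signal. Delivery may ready
         * another thread, so go round the loop and select again. */
        act->pending_signals--;
        if (wake_on_signal != NULL)
            wake_on_signal->ready = true;
    }
}
```

With a priority 10 thread that has one pending signal and a blocked priority 20 thread that the signal readies, the loop runs twice and returns the priority 20 thread.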
What about “Non Kernel” System Calls?
In many cases, traditional UNIX system calls are not implemented by the micro-kernel on QNX.
> They are implemented in the process manager or in external servers that extend procnto
In general, many of the lengthier core POSIX operations are done by the process manager
Process Manager
First process in system
> Created by kernel (init_objects)
Provides core services to other processes
Multi-threaded process
> First <ncpus> threads are IDLE threads
> Additional threads are threadpool worker threads
Message driven server
Actually a collection of (almost) independent servers
4 message handlers
11(!) resource managers
> These resource managers are actually mini filesystems.
Process Manager
Connection: SYSMGR_COID
Message Handlers: proc/rsrcdbmgr_*, proc/sysmgr_*, procmgr/*, memmgr/*
Resource Managers (pathmgr/*): /, /dev/mem, /dev/null, /dev/text, /dev/tty, /dev/zero, /dev/shmem, /dev/tymem, /dev/sem, /proc/boot, /proc
Process Manager
Normal process… but it has certain privileges
> Executes at higher processor privilege level
  This varies depending on processor architecture
> Executes in kernel address space
  Not quite true
  Because proc’s address space and user address spaces don’t overlap, it may adopt a user’s address space. This makes for faster message passes between proc and user applications.
> Has permission to use __Ring0() kernel call
Process Manager
__Ring0 Kernel Call
> Used by proc when it needs to execute code in the kernel context
  Mostly used when manipulation of kernel structures is required
  Provide atomicity of kernel state modifications to ensure consistency
> Occasionally used when processor privilege is required
  Ex: manipulate privileged CPU registers
> Arguments are a function pointer and a data pointer
  Remember process manager shares address space with kernel
> _NTO_PF_RING0 flag needed to use this kernel call
> Only process manager has this flag set
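The shape of such a call (a function pointer plus a data pointer, gated by a per-process flag) can be sketched as follows. The `demo_` names and the EPERM result are assumptions made for illustration; the slides do not say what the real call returns to a process without the flag, and no privilege switch happens in this sketch.

```c
#include <errno.h>

#define DEMO_PF_RING0 0x1u   /* hypothetical stand-in for _NTO_PF_RING0 */

typedef struct { unsigned flags; } demo_process;
typedef void (*demo_ring0_fn)(void *data);

/* Run fn(data) on behalf of the caller, but only if the caller holds the flag. */
int demo_ring0(const demo_process *caller, demo_ring0_fn fn, void *data) {
    if (!(caller->flags & DEMO_PF_RING0))
        return EPERM;   /* assumed error code */
    fn(data);
    return 0;
}

/* Example callback: bump a counter that stands in for a kernel structure. */
void demo_bump_counter(void *data) {
    ++*(int *)data;
}
```

Passing a plain function pointer only works because caller and callee share an address space, which is the point the slide makes about the process manager and the kernel.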
Process Manager Example
The process manager implements many services which would actually be “kernel calls” in traditional UNIX
Example – mmap()
> mmap() is the API through which all mappings are set up by user processes
> malloc() uses mmap() to allocate heap memory, also known as “anonymous” memory, since it is not a mapping of a named object.
> Not implemented as a kernel call, but rather a message that is sent to the process manager
mmap()
    void *_mmap(void *addr, size_t len, int prot, int flags, int fd, off64_t off,
                unsigned align, unsigned preload, void **base, size_t *size) {
        mem_map_t msg;

        msg.i.type = _MEM_MAP;
        msg.i.zero = 0;
        msg.i.addr = (uintptr_t)addr;
        msg.i.len = len;
        msg.i.prot = prot;
        msg.i.flags = flags;
        msg.i.fd = fd;
        msg.i.offset = off;
        msg.i.align = align;
        msg.i.preload = preload;
        msg.i.reserved1 = 0;
        if(MsgSendnc(MEMMGR_COID, &msg.i, sizeof msg.i, &msg.o, sizeof msg.o) == -1) {
            return MAP_FAILED;
        }
        …

Callouts on the code:
Type of operation: msg.i.type = _MEM_MAP
Parameters: the remaining msg.i fields
Send message to procnto requesting operation be done: MsgSendnc()
memmgr_handler()
The _MEM_MAP message type is picked up and passed to the memmgr message handler
    switch(msg->type) {
    …
    case _MEM_MAP:
        proc_wlock_adp(prp);
        status = memmgr_map(ctp, prp, &msg->map);
        break;

    case _MEM_CTRL:
        proc_wlock_adp(prp);
        status = memmgr_ctrl(prp, &msg->ctrl);
        break;
Process Manager
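[Diagram: mmap() round trip between a user process and the process manager]
User Process: malloc() -> mmap() -> MsgSendv() carrying _MEM_MAP
Process Manager: MsgReceivev() -> memmgr_map(), with vmm_mmap(), map_create(), pa_alloc() and pte_manipulate() shown beneath it -> MsgReplyv()
User Process: return msg.o.addr;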
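The round trip above can be imitated in portable C to show the message shape: a client stub fills in a request, a handler switches on the message type, and the reply overwrites the same buffer. This is a sketch under assumptions. The "send" is a direct function call rather than MsgSendv(), the `demo_` message layout is invented, and a bump allocator over a static arena stands in for the real mapping code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_MEM_MAP = 1, DEMO_MEM_CTRL = 2 };

/* Request and reply share one buffer, like the i/o members of a message union. */
typedef union {
    struct { uint16_t type; size_t len; } i;   /* request */
    struct { uintptr_t addr; } o;              /* reply   */
} demo_mem_msg;

static unsigned char demo_arena[4096];   /* pretend physical memory */
static size_t demo_arena_used;

/* Server side: carve len bytes out of the arena (alignment is ignored). */
static int demo_memmgr_map(demo_mem_msg *msg) {
    size_t len = msg->i.len;   /* read the request before the reply overwrites it */

    if (len == 0 || len > sizeof demo_arena - demo_arena_used)
        return ENOMEM;
    msg->o.addr = (uintptr_t)&demo_arena[demo_arena_used];
    demo_arena_used += len;
    return 0;
}

/* Server side: dispatch on the message type. */
static int demo_memmgr_handler(demo_mem_msg *msg) {
    switch (msg->i.type) {
    case DEMO_MEM_MAP:
        return demo_memmgr_map(msg);
    default:
        return ENOSYS;
    }
}

/* Client stub: build the message, "send" it, return the address from the reply. */
void *demo_mmap(size_t len) {
    demo_mem_msg msg;

    msg.i.type = DEMO_MEM_MAP;
    msg.i.len = len;
    if (demo_memmgr_handler(&msg) != 0)
        return NULL;
    return (void *)msg.o.addr;
}
```

Two successive calls return adjacent regions of the arena, and a request larger than the remaining space fails with NULL.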
Other Process Manager Services
Creating processes!
The spawn() call sends a _PROC_SPAWN message to create a new process
The exec() ‘system call’ is actually a spawn message with a SPAWN_EXEC flag set!
The fork() ‘system call’ is a _PROC_FORK message
Procfs debug filesystem
> Similar to unix procfs
> Used by debugger/pidin/ps
Ongoing kernel development
The kernel team is currently working on our next release
> Codename “trinity 2”
Features include:
> Memory management enhancements such as variable page support (~15% improvement in system performance)
> POSIX PSE52 certification
> PPC 9xx processor support
> ARMv6 support
> Cross-endian QNET capabilities
Trinity 2 is currently feature complete – bugfixing/release process underway
> Builds available on foundry27: http://community.qnx.com/sf/wiki/do/viewPage/projects.core_os/wiki/Trinity2
Roadmap – QNX Source Postings
Source Bundle         Release Date   Description
Networking            Nov 2007       Next Generation Networking stack, protocols, drivers (io-pkt)
Block Filesystems     March 2008     Block Filesystems and Utilities
Flash Filesystems     March 2008     Flash (NOR/NAND) Filesystems and Utilities
Network Filesystems   March 2008     Block/Flash/Network Filesystems and Utilities
Devices and Drivers   June 2008      Serial, Audio, USB, PCI frameworks and drivers
System Services       June 2008      Additional system service managers
Window Systems        Sept 2008      High level Photon server and services
Graphics System       Sept 2008      Lower level graphics libraries and drivers
Multimedia            Nov 2008       Full multimedia stack
Brainteasers
Need something to chew on?
> Try to figure out the questions below and post the answer to the OS_Tech forum on the OS project
> Prize for the first to answer each question
> QNX employees not eligible
1. What does STI expand to in the MIPS kernel?
2. NEED_PREEMPT(act) checks queued_event_priority. What sets queued_event_priority?
3. In the memmgr message handler, what is the purpose of “proc_wlock_adp(prp);”?
Want to learn more?
Check out the projects on www.foundry27.com
Download the QNX Momentics suite on www.qnx.com
Download the microkernel source from the QNX operating system project
Read the tech articles and wiki pages (linked off the project)
Participate in the forums on the QNX operating system project
Questions?