the 27 year old microkernel sebastien marineau-mes & colin burgess

31
The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

Upload: emmeline-mason

Post on 11-Jan-2016

227 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

The 27 Year Old MicrokernelSebastien Marineau-Mes & Colin Burgess

Page 2: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

2All content copyright QNX Software Systems

Agenda

Background and timeline Hybrid Software Model Anatomy of the microkernel How things work – system calls, process manager Q&A

Page 3: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

3All content copyright QNX Software Systems

1985: First memory-protected RTOS

1997: First RTOS to support symmetric multiprocessing (SMP)

A History of Software Innovation

1980 1985 1990 1995 2000 2005

1980: First commercially available microkernel OS

1982: First RTOS to support a hard disk on a PC

1992: First RTOS to offer built-in fault-tolerant networking

1994: US patent for scalable microkernel windowing system

2002: First RTOS vendor to deliver Eclipse-based IDE

2005: First to offer “bound” multi-processing

QNX2 QNX4 QNX6

1990: First POSIX-certified RTOS

2007: Introduces hybrid software model and opens source code

Page 4: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

4All content copyright QNX Software Systems

Hybrid Software Model

Developer Enablement> Published source code – runtime components> Transparent development

QNX development teams working in the open Live check-ins for features and bugfixes

Community Enablement> Foundry 27 developer portal> Initial projects: OS, Tools, BSPs, Bazaar

Business Enablement> Free access to development tools for non-commercial and partners> Free access to source for development

Standard business model & pricing for commercial projects> Ability to create and distribute derivative works > Flexible contribution model

Page 5: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

5All content copyright QNX Software Systems

Microkernel Architecture

FileSystem

Networking Multi-mediaWindowingProcessManager

Application

Microkernel+

Process Managerare the only trusted

components

MicrokernelArm, Mips, SH4

PowerPC, Xscale, X86

µKMessage Bus

Applications and Drivers Are processes which plug into a message bus• Reside in their own memory-protected address space• Have a well defined message interface• Cannot corrupt other software components• Can be started, stopped and upgraded on the fly

Page 6: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

6All content copyright QNX Software Systems

Separation of Duties – Process Manager vs. MicroKernel

Messages

Threads

Synchronization

Scheduling

Signals

Channels

Connections

Interrupts

Timers

Pathname

Process

Virtual Memory

procfs

Debug

Loader

Named Sems

imagefs

Resources

Microkernel Process Manager

procnto

Page 7: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

7All content copyright QNX Software Systems

Microkernel Services

Messages

Threads

Synchronization

Scheduling

Signals

Channels

Connections

Interrupts

Simple pre-emptable operations Provides basic system services

> Implements much of the POSIX thread and realtime standard

> Interrupt and exception redirection

> IPC primitives

Most of the microkernel is hardware independent> CPU-dependant layer for low-level cpu interfaces

> CPU-specific optimized routines

Only pieces of code that runs with full system privilege

Microkernel does not run “on its own”> Only reacts to external events: system calls,

interrupts, exceptions

Timers

Page 8: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

8All content copyright QNX Software Systems

Process Manager Services

Implements long, complex operations and services> Ex: Process creation and memory

management

Is a multi-threaded process that is scheduled at normal priority> Competes for CPU with all other threads

in the system

Message driven server More on this later

Pathname

Process

Virtual Memory

procfs

Debug

Loader

Named Sem

imagefs

Resources

Process Manager

Page 9: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

9All content copyright QNX Software Systems

Procnto Source Layout

procker pathmgr

arm mips ppc sh x86

/services/system/

procmgr memmgr

arm mips …

Mic

roke

rnel

ser

vice

s

Virtua

l Mem

ory

Man

agem

ent

Pro

cess

li

fecy

le

man

agem

ent

Pat

hn

ame

man

agem

ent

Su

pp

ort

F

un

ctio

ns

Looking for source code? Go to www.foundry27.com -> Projects -> Core Operating System -> Source Guide

Page 10: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

10All content copyright QNX Software Systems

Kernel entry

Kernel exit

Entry Interrupts off

Unlocked

KernelOperation

whichmay

includemessage

pass

Pre-emptable

Exit

Locked No pre-emptionInterrupts on

Unlocked Pre-emptable

Kernel call operations sequence

Interrupts off

Page 11: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

11All content copyright QNX Software Systems

Anatomy of a Kernel Call

A user-mode thread makes a call to a system call stub located in libc> Ex: MsgSendvnc()

The system call stub executes a TRAP instruction (or whatever instruction is appropriate for the particular hardware).

MsgSendvnc:lw $8,16($29)addiu $2,$0,12syscalljr $31

nop

The processor changes privilege state, interrupts are disabled Execution resumes at the appropriate vector services/system/ker/<cpu>/kernel.s

> One of the few pieces of code that is all assembly – see __ker_entry, __ker_sysenter/* * r4k_syscall_handler() * Streamlined path for our most common operation--kernel calls */FRAME(r4k_syscall_handler,sp,0,ra) .set noat /* * Coming from user mode. Save user registers, and get * a fresh kernel stack. Move GP to our own short data * area. */ LD_ACTIVE_AND_KERSTACK(k0,k1) addiu k0,k0,REG_OFF

SAVE_REGS(1)

Page 12: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

12All content copyright QNX Software Systems

Kernel Entry

/* * r4k_syscall_handler() * Streamlined path for our most common operation--kernel calls */FRAME(r4k_syscall_handler,sp,0,ra) /* * Coming from user mode. Save user registers, and get * a fresh kernel stack. Move GP to our own short data * area. */ LD_ACTIVE_AND_KERSTACK(k0,k1)

SAVE_REGS(1)

ACQUIRE_KERNEL(INKERNEL_NOW,zero,1)

/* * Interrupts are now OK again */ STI /* * Kernel call number should still be intact in v0. * Save the kernel call number. */ sw v0,SYSCALL(s0)

#if defined(VARIANT_instr) la t1,_trace_call_table#else la t1,ker_call_table#endif

/* * Index the call table and run the C code */

Load kernel stack

System call entry

Save thread context (register set)

Acquire the kernel lock• On uni processor, atomically set the INKERNEL_NOW bit• On SMP systems, spinlock on INKERNEL_NOW

Enable Interrupts

Transfer to kernel call implementation• services/system/ker_call_table.c

Page 13: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

13All content copyright QNX Software Systems

Kernel Function Implementation

int kdeclker_timer_create(THREAD *act, struct kerargs_timer_create *kap) { VALID_CLOCKID(kap);

if(kap->event) { RD_VERIFY_PTR(act, kap->event, sizeof(*kap->event)); RD_PROBE_INT(act, kap->event, sizeof(*kap->event) / }

prp = act->process; …All done up-front work, ready to do the real

work

Verify pointers referenced by kernel are valid• RD_PROBE/WR_PROBE functions• RD_VERIFY_*/WR_VERIFY_* functions• If addresses are no accessible, a fault will be generated and kernel call will return with EFAULT

It’s very important to get the validation right, as a fault (due to invalid or malicious parameter passed in to call) could be catastrophic

Entry from kercall table

Validate parameters

Page 14: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

14All content copyright QNX Software Systems

Kernel entry

Kernel exit

Entry Interrupts off

Unlocked

KernelOperation

whichmay

includemessage

pass

Pre-emptable

Exit

Locked No pre-emptionInterrupts on

Unlocked Pre-emptable

Microkernel Pre-emption

Kernel call preemption is important Interrupt activity may READY a

higher priority thread to run while a kernel call is in progress

We want to immediately schedule this higher priority THREAD (minimize scheduling latency)

QNX does this in a novel way – pre-emptable kernel

> Defer changing global kernel state> Implementation of kernel ops is 2 stages:

do the work followed by a “commit” On preemption, the active thread’s

IP is backed up to re-execute the SYSENTER instruction

Allows us to only have one kernel stack – not one per thread

Any memory references/calculations done before locking kernel must be re-startable

Interrupts off

Page 15: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

15All content copyright QNX Software Systems

Lock Kernel

Argument Verification

User Pointer Verification

Lock kernel to change status

int kdecl ker_sched_get(THREAD *act, struct kerargs_sched_get *kap) { PROCESS *prp; THREAD *thp;

// Verify the target process exists. if((prp = (kap->pid ? lookup_pid(kap->pid) : act->process)) == NULL) return ESRCH; // Verify the target thread exists. if((thp = (kap->tid ? vector_lookup(&prp->threads, kap->tid-1) : act)) == NULL) return ESRCH; // Verify we have the right to examine the target process if(!kerisusr(act, prp)) return ENOERROR;

if(kap->param) { WR_VERIFY_PTR(act, kap->param, sizeof(*kap->param)); WR_PROBE_OPT(act, kap->param, sizeof(*kap->param) / sizeof(int)); kap->param->sched_curpriority = thp->priority; kap->param->sched_priority = thp->real_priority; }

lock_kernel(); SETKSTATUS(act,thp->policy); return ENOERROR;}

Most of the preperatory work is done before locking kernel, if possible.

Page 16: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

16All content copyright QNX Software Systems

Exit Kernel

Exit kernel to run user-space thread> Note that currently scheduled thread may have changed

Ex: entered kernel due to HW interrupt, interrupt readied high-prio thread (that high-prio thread becomes RUNNING)

Ex: Blocking kernel causes current thread to be blocked, another to be made RUNNING

> __ker_exit implements this

Adjust the address space if needed> memmgr.aspace()

Do special return processing> Deliver signals, pulses etc> This may cause a reschedule which could cause another loop

through __ker_exit

Restore the context of the (newly) active thread Call SYSEXIT

Page 17: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

17All content copyright QNX Software Systems

What about “Non Kernel” System Calls?

In many cases, traditional UNIX system calls are not implemented by the micro-kernel on QNX. > They are implemented in the process manager or in external

servers that extend procnto

In general, many of the lengthier core POSIX operations are done by the process manager

Page 18: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

18All content copyright QNX Software Systems

Process Manager

First process in system> Created by kernel (init_objects)

Provides core services to other processes Multi-threaded Process

> First <ncpus> threads are IDLE threads> Additional threads are threadpool worker threads

Message driven server Actually a collection of (almost) independent servers 4 message handlers 11(!) resource managers

> These resource managers are actually mini filesystems.

Page 19: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

19All content copyright QNX Software Systems

Process Manager

/

proc/rsrcdbmgr_*

proc/sysmgr_*

procmgr/*

/dev/mem

/dev/null

/dev/text

/dev/tty

/dev/zero

/dev/shmem

/dev/tymem

/dev/semmemmgr/*

SYSMGR_COID

Message Handlers Resource Managers (pathmgr/*)

/proc/boot

/proc

Page 20: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

20All content copyright QNX Software Systems

Process Manager

Normal process… but It has certain privileges

> Executes at higher processor privilege level This varies depending on processor architecture

> Executes in kernel address space Not quite true Because proc’s address space and user address spaces don’t

overlap, it may adopt a users address space. This makes for faster message passes between proc and user applications.

> Has permission to use __Ring0() kernel call

Page 21: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

21All content copyright QNX Software Systems

Process Manager

__Ring0 Kernel Call> Used by proc when it needs to execute code in the kernel context

Mostly used when manipulation of kernel structures is required Provide atomicity of kernel state modifications to ensure

consistency

> Occasionally used when processor privilege is required Ex: manipulate privileged CPU registers

> Arguments are a function pointer and a data pointer Remember process manager shares address space with kernel

> _NTO_PF_RING0 flag needed to use this kernel call

> Only process manager has this flag set

Page 22: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

22All content copyright QNX Software Systems

Process Manager Example

The process manager implements many services which would actually be “kernel calls” in traditional UNIX

Example – mmap()> mmap() is the API through which all mappings are setup by user

processes

> Malloc uses mmap() to allocate heap memory, also known as “anonymous” memory, since it is not a mapping of a named object.

> Not implemented as a kernel call, but rather a message that is sent to the process manager

Page 23: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

23All content copyright QNX Software Systems

mmap()

void *_mmap(void *addr, size_t len, int prot, int flags, int fd, off64_t off, unsigned align, unsigned preload, void **base, size_t *size) {

mem_map_t msg;

msg.i.type = _MEM_MAP; msg.i.zero = 0; msg.i.addr = (uintptr_t)addr; msg.i.len = len; msg.i.prot = prot; msg.i.flags = flags; msg.i.fd = fd; msg.i.offset = off; msg.i.align = align; msg.i.preload = preload; msg.i.reserved1 = 0; if(MsgSendnc(MEMMGR_COID, &msg.i, sizeof

msg.i, &msg.o, sizeof msg.o) == -1) { return MAP_FAILED; }

Type of operation

Parameters

Send message to procnto requesting operation be done

Page 24: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

24All content copyright QNX Software Systems

memmgr_handler()

The _MEM_MAP message type is picked up and passed to the memmgr message handler

switch(msg->type) { …

case _MEM_MAP: proc_wlock_adp(prp); status = memmgr_map(ctp, prp, &msg->map); break;

case _MEM_CTRL: proc_wlock_adp(prp); status = memmgr_ctrl(prp, &msg->ctrl); break;

Page 25: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

25All content copyright QNX Software Systems

Process Manager

MsgSendv()

vmm_mmap() map_create() pa_alloc() pte_manipulate()

MsgReplyv()

malloc()

mmap()

memmgr_map()MsgReceivev()

_MEM_MAP

User ProcessProcess Manager

return msg.o.addr;

Page 26: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

26All content copyright QNX Software Systems

Other Process Manager Services

Creating processes! The spawn() send a _PROC_SPAWN message to

create a new process The exec() ‘system call’ is actually a spawn message

with an SPAWN_EXEC flag set! The fork() ‘system call’ is a _PROC_FORK message Procfs debug filesystem

> Similar to unix procfs

> Used by debugger/pidin/ps

Page 27: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

27All content copyright QNX Software Systems

Ongoing kernel development

The kernel team is currently working on our next release> Codename “trinity 2”

Features include:> Memory management enhancements such as variable page support

(~15% improvement in system performance)> POSIX PSE52 certification> PPC 9xx processor support> ARMv6 support> Cross-endian QNET capabilities

Trinity 2 is currently feature complete – bugfixing/release process underway> Builds available on foundry27:

http://community.qnx.com/sf/wiki/do/viewPage/projects.core_os/wiki/Trinity2

Page 28: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

28All content copyright QNX Software Systems

Roadmap – QNX Source Postings

Source Bundle Release Date Description

Networking Nov 2007 Next Generation Networking stack, protocols, drivers (io-pkt)

Block Filesystems March 2008 Block Filesystems and Utilities

Flash Filesystems March 2008 Flash (NOR/NAND) Filesystems and Utilities

Network Filesystems March 2008 Block/Flash/Network Filesystems and Utilities

Devices and Drivers June 2008 Serial, Audio, USB, PCI frameworks and drivers

System Services June /2008 Additional system service managers

Window Systems Sept 2008 High level Photon server and services

Graphics System Sept 2008 Lower level graphics libraries and drivers

Multimedia Nov 2008 Full multimedia stack

Page 29: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

29All content copyright QNX Software Systems

Brainteasers

Need something to chew on? > Try to figure out the questions below and post the answer to the

OS_Tech forum on the OS project

> Prize for the first to answer each question

> QNX employees not eligible

1. What does STI expand to in the MIPS kernel?

2. NEED_PREEMPT(act) checks queued_event_priorityWhat sets queued_event_priority?

3. In the memmgr message handler, what is the purpose of “proc_wlock_adp(prp);”?

Page 30: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

30All content copyright QNX Software Systems

Want to learn more?

Check out the projects on www.foundry27.com Download the QNX Momentics suite on

www.qnx.com Download the microkernel source from the QNX

operating system project Read the tech articles and wiki pages (linked off

the project) Participate in the forums on the QNX operating

system project

Page 31: The 27 Year Old Microkernel Sebastien Marineau-Mes & Colin Burgess

Questions?