Improving IPC by Kernel Design Jochen Liedtke Shane Matthews Portland State University




3/12/2004 Portland State University

Summary

• Review

• Performance improvements

– Architectural Level

– Algorithmic Level

– Interface Level

– Coding Level


Micro-kernels

• Minimal OS, providing a set of primitives used to implement thread/address space management and IPC [1]

• Everything else is moved to user-space (servers)


Terminology (L3)

• Dataspace

– Memory object, mapped into address space

• Task

– Composed of threads, dataspaces, and an address space

• Message

– String/memory object


L3 Architecture & IPC

• Active components communicate via messages

• Applies to:

– Device drivers

• Implemented as user-level tasks

– Hardware interrupts

• Interrupt message from micro-kernel to thread


L3 Redesign Principles

• IPC performance is the master

– Security and performance must not be affected

• Synergetic effects taken into consideration

– (Think combined effects)

– May lead to reinforcement or diminution

• Design must aim at a performance goal

– Per short message transfer

– 350 cycles (7 microseconds)
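The cycle budget converts to time as follows, assuming the 50 MHz clock of the Intel 486 DX-50 listed as the test system in the results:

```latex
t = \frac{350\ \text{cycles}}{50 \times 10^{6}\ \text{cycles/s}} = 7 \times 10^{-6}\ \text{s} = 7\ \mu\text{s}
```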


Architectural Level

• Messages

• Process Structure

• Control Blocks


Compound Messages

• Multiple send/receive -> 1 send/receive

• Messages consist of direct/indirect strings and memory objects
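A compound message can be pictured as one descriptor that bundles all parts, so a single send replaces several. The sketch below is illustrative only: the struct layout, the field names, and the `MAX_INDIRECT` / `MAX_OBJECTS` limits are assumptions, not L3's actual message format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INDIRECT 4   /* hypothetical limit on indirect strings */
#define MAX_OBJECTS  2   /* hypothetical limit on memory objects   */

struct indirect_string { const void *addr; size_t len; };
struct memory_object   { uintptr_t base; size_t pages; };  /* hypothetical descriptor */

/* One descriptor covering every part of the message. */
struct compound_msg {
    const void *direct;                       /* direct string, stored inline with the message */
    size_t direct_len;
    struct indirect_string indirect[MAX_INDIRECT];
    size_t n_indirect;
    struct memory_object objects[MAX_OBJECTS];
    size_t n_objects;
};

/* Append an indirect string; returns 0 on success, -1 if the descriptor is full. */
static int msg_add_indirect(struct compound_msg *m, const void *addr, size_t len)
{
    assert(m != NULL);
    if (m->n_indirect == MAX_INDIRECT)
        return -1;
    m->indirect[m->n_indirect].addr = addr;
    m->indirect[m->n_indirect].len = len;
    m->n_indirect++;
    return 0;
}

/* Total string bytes (direct + indirect) that one send would transfer. */
static size_t msg_payload_bytes(const struct compound_msg *m)
{
    assert(m != NULL);
    size_t total = m->direct_len;
    for (size_t i = 0; i < m->n_indirect; i++)
        total += m->indirect[i].len;
    return total;
}
```

A sender would fill one `compound_msg` and issue one IPC instead of one per string.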


Twofold message copy

• [A space] -> [kernel] -> [B space]

• About 20 + 0.75n cycles, n := message length in bytes

• Good for small messages

• Need something better as n grows
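To see how the linear term takes over as messages grow, the estimate on this slide can be evaluated directly. This is a small calculator, not a measurement; the function names are made up here, and converting cycles to microseconds assumes a clock rate supplied by the caller (for example 50 MHz for a 486 DX-50).

```c
#include <assert.h>

/* Slide's estimate for the twofold copy of an n-byte message, in cycles. */
static double twofold_copy_cycles(unsigned n)
{
    return 20.0 + 0.75 * (double)n;
}

/* Convert cycles to microseconds for a clock given in MHz (cycles per microsecond). */
static double cycles_to_us(double cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return cycles / clock_mhz;
}
```

For an 8-byte message the fixed 20 cycles dominate; for a 4096-byte message almost all of the cost is the 0.75n term.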


LRPC and SRC RPC

• Client/server share user-level memory

– sender -> shared buffer

• Problems

– When server to client is 1 to many, shared regions of address space become critical resources

– Shared regions require explicit opens (unlike L3)

– Message may change during/after checking


Direct Message Copy Via Windows

• L3's method

– Destination mapped into window

– Message copied to window

• Window

– Per address space

– Accessed exclusively by kernel


Communication Windows

• Problems

– Must be fast

– Different threads coexisting within address space

• L3 Implementation

– Copy one word (a page directory entry) from B to A
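The window idea can be simulated in user space: mapping the destination costs one pointer-sized store into the sender's directory, after which the message is copied once. Everything here is a stand-in. `struct address_space`, `WINDOW_SLOT`, and the tiny `REGION_SIZE` are hypothetical, and a plain pointer plays the role of a page directory entry (which on a real 486 maps 4 MB). Clearing the window after the copy is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REGION_SIZE 64   /* stand-in for the region one directory entry maps */
#define NUM_ENTRIES 4
#define WINDOW_SLOT 3    /* hypothetical entry reserved for the communication window */

struct address_space {
    uint8_t *dir[NUM_ENTRIES];   /* simulated page directory: one pointer per region */
};

/* Map one region of dst into the sender's window: a single word is copied. */
static void map_window(struct address_space *src,
                       const struct address_space *dst, int dst_entry)
{
    assert(dst_entry >= 0 && dst_entry < NUM_ENTRIES && dst_entry != WINDOW_SLOT);
    assert(dst->dir[dst_entry] != NULL);
    src->dir[WINDOW_SLOT] = dst->dir[dst_entry];
}

/* Copy the message once, straight into the destination, through the window. */
static void send_via_window(struct address_space *src,
                            const struct address_space *dst, int dst_entry,
                            size_t offset, const void *msg, size_t len)
{
    assert(offset + len <= REGION_SIZE);
    map_window(src, dst, dst_entry);
    memcpy(src->dir[WINDOW_SLOT] + offset, msg, len);
    src->dir[WINDOW_SLOT] = NULL;   /* drop the temporary mapping (sketch-only choice) */
}
```

Compared with the twofold copy, the payload moves once; the price is the temporary mapping, which must be cheap and safe when several threads share the sender's address space.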


Process Structure

• Threads running in kernel mode have 1 kernel stack per thread

– Efficient since interrupts, page faults, and IPC already save state on the kernel stack

• Continuations

– Pro:

• Reduce kernel stack space

– Cons:

• Require additional copies between kernel and continuation

• Interfere with other optimizations


Thread Control Blocks

• Implemented as large array in kernel

– fast tcb access

• Array base + tcb # × tcb size

– Saves TLB misses (IPC)

• Kernel stacks of sender and receiver located in TCB page

– Locking done by unmapping the TCB
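The array layout makes TCB lookup pure address arithmetic, with no table walk or search. The sketch below assumes a hypothetical 1 KB TCB that also holds the thread's kernel stack; the sizes, field names, and `MAX_THREADS` are illustrative, not L3's real values.

```c
#include <assert.h>
#include <stdint.h>

#define TCB_SIZE    1024u   /* hypothetical size of one TCB, kernel stack included */
#define MAX_THREADS 16u     /* hypothetical size of the array */

struct tcb {
    uint32_t state;                                     /* thread state word */
    uint8_t  kernel_stack[TCB_SIZE - sizeof(uint32_t)]; /* stack lives next to the control data */
};

static struct tcb tcb_array[MAX_THREADS];

/* Address of a TCB = array base + thread number * TCB size. */
static struct tcb *tcb_of(uint32_t thread_no)
{
    assert(thread_no < MAX_THREADS);
    return (struct tcb *)((uintptr_t)tcb_array + (uintptr_t)thread_no * TCB_SIZE);
}
```

Because the control data and the kernel stack share one block, touching a thread's TCB and its stack during IPC hits the same page.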


Algorithmic Level

• Thread Identifier

• Lazy Scheduling

• Short Messages Via Registers


Thread Identifier

• Thread addressed by 64-bit UID in user mode

• Thread number in lower 32 bits of UID

– AND with bit mask, add to TCB array base
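The mask-and-add lookup can be sketched as follows. The bit layout is an assumption made for illustration: the thread number is stored pre-shifted in the lower 32 bits, so masking yields a byte offset directly. `UID_TCB_SHIFT`, `UID_MAX_THREADS`, and the meaning of the upper half are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define UID_TCB_SHIFT   10u   /* hypothetical: 1 KB per TCB */
#define UID_MAX_THREADS 16u   /* hypothetical, must be a power of two */
#define UID_TCB_MASK    ((uint64_t)(UID_MAX_THREADS - 1u) << UID_TCB_SHIFT)

/* Simulated TCB area; a real kernel would use a fixed virtual address. */
static uint8_t tcb_area[UID_MAX_THREADS << UID_TCB_SHIFT];

/* Build a UID: upper 32 bits carry other fields, lower 32 bits the shifted thread number. */
static uint64_t make_uid(uint32_t upper, uint32_t thread_no)
{
    assert(thread_no < UID_MAX_THREADS);
    return ((uint64_t)upper << 32) | ((uint64_t)thread_no << UID_TCB_SHIFT);
}

/* One AND and one ADD turn a UID into a TCB address. */
static uint8_t *uid_to_tcb(uint64_t uid)
{
    return tcb_area + (uid & UID_TCB_MASK);
}
```

The upper half of the UID never influences the address, so it is free to carry whatever else the kernel needs to validate the identifier.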


Lazy Scheduling

• IPC operation (call, or reply & receive next) normally requires:

– Delete sending thread from ready queue

– Insert into waiting queue

– Delete receiving thread from waiting queue

– Insert into ready queue

• Too many queue operations!


Lazy Scheduling cont.

• L3 queue invariants

– Ready queue contains at least all ready threads

– Waiting queue contains at least all waiting threads

• TCB contains the thread's state (ready/waiting)

• Scheduler removes all threads not belonging to a queue while parsing that queue
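The invariants above allow the IPC path to change only the state field in the TCB and leave queue cleanup to the scheduler. The following is a simplified single-queue sketch of that idea, not L3's actual scheduler; the names (`ipc_block`, `ipc_wake`, `schedule`, `queue_ops`) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

enum thread_state { READY, WAITING };

struct tcb {
    enum thread_state state;
    int queued;            /* 1 while linked into the ready queue */
    struct tcb *next;
};

static struct tcb *ready_head = NULL;
static int queue_ops = 0;  /* counts real queue insertions and removals */

/* Link a thread into the ready queue unless it is still there. */
static void enqueue(struct tcb *t)
{
    assert(t != NULL);
    if (t->queued)
        return;
    t->next = ready_head;
    ready_head = t;
    t->queued = 1;
    queue_ops++;
}

/* IPC path: blocking only flips the state; the stale queue entry stays. */
static void ipc_block(struct tcb *t) { t->state = WAITING; }

/* IPC path: waking is free when the thread was never unlinked. */
static void ipc_wake(struct tcb *t) { t->state = READY; enqueue(t); }

/* Scheduler: unlink entries that do not belong while scanning for a ready thread. */
static struct tcb *schedule(void)
{
    struct tcb **pp = &ready_head;
    while (*pp != NULL) {
        struct tcb *t = *pp;
        if (t->state == READY)
            return t;
        *pp = t->next;     /* lazy removal of a stale entry */
        t->next = NULL;
        t->queued = 0;
        queue_ops++;
    }
    return NULL;
}
```

In a tight call/reply loop between two threads, `queue_ops` stays constant: the four queue operations per IPC listed on the previous slide disappear from the fast path.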


Short Messages Via Registers

• High proportion of messages are short

– E.g., driver ack/error messages, hardware interrupts

• 486

– 7 general registers

– 3 needed: sender ID, result code

– 4 available

• 8-byte messages passed in registers using a coding scheme
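As a rough picture of a register-only transfer, an 8-byte message fits into two 32-bit words. The sketch below only shows that packing; it is not the coding scheme the slide refers to, and `struct msg_regs` with its two fields is a hypothetical stand-in for two general registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two 32-bit values standing in for two general registers. */
struct msg_regs { uint32_t r0, r1; };

/* Pack an 8-byte message into the two register values. */
static struct msg_regs pack_short_msg(const uint8_t bytes[8])
{
    assert(bytes != NULL);
    struct msg_regs r;
    memcpy(&r.r0, bytes, 4);
    memcpy(&r.r1, bytes + 4, 4);
    return r;
}

/* Receiver side: recover the 8 bytes from the register values. */
static void unpack_short_msg(struct msg_regs r, uint8_t out[8])
{
    assert(out != NULL);
    memcpy(out, &r.r0, 4);
    memcpy(out + 4, &r.r1, 4);
}
```

With the payload in registers, the kernel never touches a message buffer in memory for such messages.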


Interface Level

• Simple RPC stubs

– Load registers, system call, check success

– Compiler generates stubs inline

• Parameter Passing

– Use registers when possible
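The three stub steps (load registers, system call, check success) can be sketched as a small inline function. Everything named here is hypothetical: `struct ipc_regs` models the registers, and `fake_ipc_call` is a stand-in for the kernel trap plus a toy server that adds two words, so the example runs in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ipc_regs {
    uint32_t dest;     /* destination thread */
    uint32_t w0, w1;   /* message words carried in registers */
    uint32_t result;   /* result code, 0 = success */
};

/* Stand-in for the system call: a toy server that returns w0 + w1 in w0. */
static void fake_ipc_call(struct ipc_regs *r)
{
    if (r->dest == 0) {        /* no such thread in this toy model */
        r->result = 1;
        return;
    }
    r->w0 = r->w0 + r->w1;
    r->result = 0;
}

/* Inline stub: load registers, trap, check the result code. */
static inline int rpc_add(uint32_t server, uint32_t a, uint32_t b, uint32_t *sum)
{
    assert(sum != NULL);
    struct ipc_regs r = { server, a, b, 0 };
    fake_ipc_call(&r);
    if (r.result != 0)
        return -1;
    *sum = r.w0;
    return 0;
}
```

Keeping the stub this small is what makes inlining by the compiler worthwhile: no marshalling buffer, no library call between the caller and the trap.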


Coding Level

• Reduce cache and TLB misses

– Short kernel code

• Short jumps, use registers, short address displacements

– IPC kernel code in one page

– Handle save/restore of coprocessor lazily

• Delayed until a different thread needs to use it
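Lazy coprocessor handling can be simulated as below: a thread switch does no coprocessor work, and state is swapped only when a thread other than the current owner first uses the unit. This is a user-space model with invented names; `use_fpu` plays the role of the trap a real kernel would take on first use (on x86, typically the device-not-available exception raised while the task-switched flag is set).

```c
#include <assert.h>
#include <stddef.h>

struct fpu_state { double regs[8]; };

struct thread { struct fpu_state saved; };   /* per-thread save area */

static struct fpu_state hw_fpu;      /* simulated coprocessor registers */
static struct thread *fpu_owner;     /* thread whose state is live in hw_fpu */
static struct thread *current;       /* running thread */
static int fpu_switches;             /* number of actual save/restore operations */

/* Thread switch: deliberately does no coprocessor work. */
static void switch_to(struct thread *t)
{
    assert(t != NULL);
    current = t;
}

/* First use after a switch: save the old owner's state, load the new owner's. */
static struct fpu_state *use_fpu(void)
{
    assert(current != NULL);
    if (fpu_owner != current) {
        if (fpu_owner != NULL)
            fpu_owner->saved = hw_fpu;
        hw_fpu = current->saved;
        fpu_owner = current;
        fpu_switches++;
    }
    return &hw_fpu;
}
```

Threads that never touch the coprocessor, which is the common case on the IPC path, pay nothing for it.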


Results

• An increase of 100% means the IPC time doubles

• Removing all optimizations increases IPC time by 134% for an 8-byte message


Results

• L3 VS Mach

• System

– Intel 486 DX-50

– 256 KB external cache

– 16 MB memory



Conclusions

• IPC improved by applying

– Performance-based reasoning

– Synergetic effects

– All levels, architecture -> coding


References

• [1] http://en.wikipedia.org/wiki/Micro_kernel

• [2] Improving IPC by Kernel Design - Jochen Liedtke