![Page 1: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/1.jpg)
Improving IPC by Kernel Design
By Jochen Liedtke
German National Research Center for Computer Science
Presented By Srinivas Sundaravaradan
![Page 2: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/2.jpg)
MACH µ-Kernel system based on message passing
Over 5000 cycles to transfer a short message
Buffering IPC
L3 Similar to MACH
Hardware Interrupts delivered through messages
No Ports
![Page 3: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/3.jpg)
Design Philosophy
Focus on IPC:
Any feature that will increase cost must be closely evaluated.
When in doubt, design in favor of IPC.
Design for Performance:
A poorly performing technique is unacceptable.
Evaluate feature cost compared to a concrete baseline.
Aim for a concrete performance goal.
Comprehensive Design:
Consider synergistic effects of all methods and techniques.
Cover all levels of implementation, from design to code.
![Page 4: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/4.jpg)
Making IPC faster
Fewer: Call / Reply & Receive Next, combining messages
Faster: 15 other optimizations
Architectural level: use redesign of L3 as opportunity to change kernel design
![Page 5: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/5.jpg)
Methodology
Theoretical minimum:
Null message between address spaces; receiver is ready to receive it
107 cycles to enter & leave kernel
45 cycles for TLB misses
172 cycles
Goal: 350 cycles
Achieved: 250 cycles = T
![Page 6: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/6.jpg)
Minimize system calls
Why minimize system calls?
60% of T
Traditional IPC: 4 system calls
Solution: Call, Reply & Receive next
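The saving can be sketched by counting kernel entries for one client/server round trip. This is a toy model under stated assumptions, not L3 code: the primitive names mirror the slide, and `syscalls_per_round_trip` is a hypothetical helper that charges one kernel entry and exit per primitive.

```python
# Toy model: count kernel entries for one client/server round trip.
# Primitive names follow the slide; this is not L3's actual API.

TRADITIONAL = {
    "client": ["send", "receive"],         # send request, then wait for the reply
    "server": ["send", "receive"],         # send reply, then wait for the next request
}

COMBINED = {
    "client": ["call"],                    # send + receive reply in one kernel entry
    "server": ["reply_and_receive_next"],  # reply + wait for next in one kernel entry
}


def syscalls_per_round_trip(scheme):
    """Each listed primitive is assumed to cost one kernel entry and exit."""
    return sum(len(ops) for ops in scheme.values())


print(syscalls_per_round_trip(TRADITIONAL))  # 4
print(syscalls_per_round_trip(COMBINED))     # 2
```

Under this model, combining the primitives halves the number of kernel entries per round trip.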
![Page 7: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/7.jpg)
Minimize system calls
[Figure: client and server timelines with Blocked / Unblocked states. Labels: Send, Receive (reply), Send (reply), Receive (next), Receive, Call, Reply and receive next.]
![Page 8: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/8.jpg)
Complex Message
Direct String: data to be transferred directly from send buffer to receive buffer
Indirect String: location and size of data to be transferred by reference
Memory Object: description of a region of memory to be mapped in receiver address space (shared memory)
A Complex Message
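One way to picture the three parts of a complex message is as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, and `bytes_to_copy` assumes that both direct and indirect string data are copied by the kernel while memory objects are mapped instead of copied.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndirectString:
    address: int  # location of the data in the sender's address space
    size: int     # number of bytes to transfer


@dataclass
class MemoryObject:
    base: int     # start of the region to map into the receiver
    size: int     # length of the region in bytes


@dataclass
class ComplexMessage:
    direct_string: bytes = b""
    indirect_strings: List[IndirectString] = field(default_factory=list)
    memory_objects: List[MemoryObject] = field(default_factory=list)

    def bytes_to_copy(self) -> int:
        # ASSUMPTION: string data is copied, memory objects are only mapped.
        return len(self.direct_string) + sum(s.size for s in self.indirect_strings)


msg = ComplexMessage(
    direct_string=b"abcd",
    indirect_strings=[IndirectString(address=0x1000, size=100)],
    memory_objects=[MemoryObject(base=0x2000, size=4096)],
)
print(msg.bytes_to_copy())  # 104
```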
![Page 9: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/9.jpg)
Ways of Message Transfer
Twofold Message Copy:
user space A -> kernel space -> user space B
LRPC mechanism:
share user-level memory
secure?
does not support variable-to-variable transfer
![Page 10: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/10.jpg)
Temporary Mapping…
Two-copy message transfer costs 20 + 0.75n cycles
L3 copies data once to a special communication window in kernel space
Window is mapped to the receiver for the duration of the call (page directory entry)
[Figure: temporary mapping. Labels: kernel, copy, mapped with kernel-only permission, add mapping to space B.]
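The difference between the two transfer styles can be imitated in user-level Python. This is a rough analogy under stated assumptions, not how L3 manipulates page directory entries: `AddressSpace` and both functions are hypothetical, and a `memoryview` slice stands in for the communication window because it aliases the receiver's memory instead of holding a private copy.

```python
class AddressSpace:
    """Hypothetical stand-in for a user address space: a flat byte array."""

    def __init__(self, name, size):
        self.name = name
        self.memory = bytearray(size)


def two_copy_transfer(src, src_off, dst, dst_off, n):
    """user space A -> kernel buffer -> user space B. Returns the copy count."""
    kernel_buffer = bytes(src.memory[src_off:src_off + n])  # copy 1
    dst.memory[dst_off:dst_off + n] = kernel_buffer         # copy 2
    return 2


def temporary_mapping_transfer(src, src_off, dst, dst_off, n):
    """Copy once into a window that aliases the receiver's buffer."""
    # The window plays the role of the temporarily mapped region:
    # writing to it writes directly into dst.memory.
    window = memoryview(dst.memory)[dst_off:dst_off + n]
    window[:] = memoryview(src.memory)[src_off:src_off + n]  # the only copy
    return 1


a = AddressSpace("A", 64)
b = AddressSpace("B", 64)
a.memory[0:5] = b"hello"
copies = temporary_mapping_transfer(a, 0, b, 8, 5)
print(copies, bytes(b.memory[8:13]))
```

Both functions leave the same bytes in the receiver; only the number of copies differs.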
![Page 11: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/11.jpg)
Temporary Mapping…
[Figure: two-level page table. Labels: top-level page table, 2nd-level tables, frames in memory.]
![Page 12: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/12.jpg)
Temporary Mapping
![Page 13: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/13.jpg)
Lazy Scheduling
Scheduler overhead is a significant component of IPC cost.
Threads doing IPC are often moved to the wait queue only to be inserted back onto the ready queue.
Lazy scheduling:
avoids locking of queues
avoids queue manipulation (instruction execution, TLB misses)
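The idea can be sketched with a ready queue that is only cleaned up when the scheduler scans it. This is a minimal model under stated assumptions, not L3's implementation: `Thread`, `LazyScheduler`, and the `queue_ops` counter are hypothetical names, and the only invariant kept is that every ready thread is somewhere in the queue.

```python
from collections import deque


class Thread:
    def __init__(self, name):
        self.name = name
        self.ready = True    # state flag, as it might live in a thread control block
        self.queued = True   # whether the thread currently sits in the ready queue


class LazyScheduler:
    def __init__(self, threads):
        self.ready_queue = deque(threads)
        self.queue_ops = 0   # counts insertions into and removals from the queue

    def block(self, thread):
        # Lazy: only flip the state flag; the stale queue entry stays put.
        thread.ready = False

    def unblock(self, thread):
        thread.ready = True
        if not thread.queued:          # re-insert only if it was really removed
            self.ready_queue.append(thread)
            thread.queued = True
            self.queue_ops += 1

    def pick_next(self):
        # Stale entries are removed only when the scheduler meets them.
        while self.ready_queue:
            head = self.ready_queue[0]
            if head.ready:
                self.ready_queue.rotate(-1)  # round-robin: move head to the back
                return head
            self.ready_queue.popleft()
            head.queued = False
            self.queue_ops += 1
        return None


a, b = Thread("a"), Thread("b")
sched = LazyScheduler([a, b])
for _ in range(1000):      # a thread that blocks and unblocks quickly during IPC
    sched.block(a)
    sched.unblock(a)
print(sched.queue_ops)     # 0
```

A thread that blocks and is unblocked again before the scheduler scans the queue causes no queue manipulation at all, which is the common case for a short IPC round trip.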
![Page 14: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/14.jpg)
Use registers for short messages
Messages are usually short!
ack/error replies from drivers
hardware interrupt messages
Intel 486 processor: 7 general purpose registers
sender info, data
May not work for CPUs with fewer registers
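A short-message fast path amounts to checking whether the payload fits in the registers left over after the sender information. The sketch below is hypothetical: the count of 7 general purpose registers comes from the slide, but the split of 2 registers for sender info and 5 for data is an assumption made purely for illustration, as are the function names.

```python
from typing import List

REGISTER_BYTES = 4         # 486 general purpose registers are 32 bits wide
GENERAL_REGISTERS = 7      # from the slide
SENDER_INFO_REGISTERS = 2  # ASSUMPTION: illustrative split, not taken from the slides
DATA_REGISTERS = GENERAL_REGISTERS - SENDER_INFO_REGISTERS


def fits_in_registers(payload: bytes) -> bool:
    """True if the payload can travel in the data registers alone."""
    return len(payload) <= DATA_REGISTERS * REGISTER_BYTES


def pack_into_registers(payload: bytes) -> List[int]:
    """Split the payload into 32-bit little-endian words, zero-padding the last one."""
    if not fits_in_registers(payload):
        raise ValueError("message too long for register transfer")
    padded = payload + b"\x00" * (-len(payload) % REGISTER_BYTES)
    return [
        int.from_bytes(padded[i:i + REGISTER_BYTES], "little")
        for i in range(0, len(padded), REGISTER_BYTES)
    ]


print(pack_into_registers(b"\x01\x00\x00\x00\x02"))  # [1, 2]
```

Longer messages would fall back to the memory-based transfer path; on a CPU with fewer registers `DATA_REGISTERS` shrinks and the fast path applies to fewer messages.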
![Page 15: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/15.jpg)
Summary of Optimizations
Architectural: System Calls, Messages, Direct Transfer, Strict Process Orientation, Thread Control Blocks
Algorithmic: Thread Identifier, Virtual Queues, Timeouts/Wakeups, Lazy Scheduling, Direct Process Switch, Short messages
Interface: Unnecessary Copies, Parameter passing
Coding: Cache misses, TLB misses, Segment registers, General registers, Jumps and Checks, Process Switch
![Page 16: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/16.jpg)
Results…
![Page 17: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/17.jpg)
Results
![Page 18: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/18.jpg)
Conclusions
L3’s message passing was 22 times faster than that of MACH
Kernel redesign focused mainly on IPC
Caveats:
Ports and Buffering
Specific to the architecture
![Page 19: Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar](https://reader031.vdocument.in/reader031/viewer/2022013112/56649d2c5503460f94a0295c/html5/thumbnails/19.jpg)
Thank You !