Automatic Compaction of OS Kernel Code via On-Demand Code Loading Haifeng He, Saumya Debray, Gregory Andrews The University of Arizona

Post on 21-Dec-2015



Background

General-purpose operating systems run on both desktops and embedded devices.

Embedded devices:
• Resource constraints
• Limited amount of memory

Reduce the memory footprint of OS kernel code as much as possible.

General OS with Embedded Apps.

A Linux kernel with minimal configuration, profiled with the MiBench suite:
• Executed: 32%
• Not executed: about 68% of kernel code
  • Statically proved as unnecessary by prior work: 18%-24%
  • Unexecuted but still can't be discarded
    • Needed (exception handling)
    • Not needed but missed by existing analysis

Our Approach

Memory hierarchy:
• Limited amount of main memory
• Greater amount of secondary storage

Kernel code:
• Hot code lives in memory
• Cold code lives in secondary storage

On-demand code loading moves cold code from secondary storage into memory when it is needed.

A Big Picture

Main memory:
• Memory-resident kernel code
  • Core code: scheduler, memory management, interrupt handling
  • Hot code
• Code buffer: accommodates one cluster at a time

Secondary storage:
• Remaining kernel code, partitioned by code clustering
• size(cluster) ≤ size(code buffer)

Memory Requirement for Kernel Code

Main memory holds:
• Core code: size is predetermined
• Hot code: select the most frequently executed code
  • How much hot code should stay in memory?
  • The total size of memory-resident code ≤ size(core code) × (1 + r), where the growth bound r is specified by the user (e.g. 0%, 10%)
• Code buffer: size specified by user

Together these give an upper bound on memory usage for kernel code.
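The bound on this slide can be worked through with a few lines of arithmetic. The sketch below assumes that the upper bound is the memory-resident limit, size(core code) × (1 + growth bound), plus the code buffer size. The function name and the 200KB core size are hypothetical, chosen only for illustration; the growth bound is passed as a whole percentage to keep the arithmetic exact.

```python
def kernel_code_memory_bound(core_size: int, growth_percent: int, buffer_size: int) -> int:
    """Upper bound, in bytes, on memory used for kernel code.

    Assumption: bound = size(core code) * (1 + growth bound) + code buffer size.
    """
    if core_size < 0 or growth_percent < 0 or buffer_size < 0:
        raise ValueError("sizes and growth bound must be non-negative")
    # Limit for core code plus the hot code allowed to stay resident.
    resident_limit = core_size * (100 + growth_percent) // 100
    return resident_limit + buffer_size


if __name__ == "__main__":
    core = 200 * 1024  # hypothetical core code size, not a figure from the slides
    for percent in (0, 10):
        bound = kernel_code_memory_bound(core, percent, 2 * 1024)
        print(f"growth bound {percent}%: at most {bound} bytes of kernel code in memory")
```

With a growth bound of 0% no hot code beyond the core stays resident, so every other function has to be reached through the code buffer.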

Our Approach

• Reminiscent of the old idea of overlays
• Purely software-based approach
  • Does not require an MMU or OS support for VM
• Main steps
  • Apply clustering to the whole-program control flow graph
    • Group "related" code together
    • Reduce cost of code loading
  • Transform kernel code to support overlays
    • Modify control flow edges

Code Clustering

Objective: minimize the number of code loads.

Given:
• An edge-weighted whole-program control flow graph
• A list of functions marked as core code
• A growth bound r for memory-resident code
• Code buffer size BufSz

Apply a greedy node-coalescing algorithm until no coalescing can be carried out without violating:
• Size of memory-resident code ≤ size(core code) × (1 + r)
• Size of each cluster ≤ BufSz
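A minimal sketch of one possible greedy node-coalescing loop follows. It is not the authors' implementation: it works at function granularity, always merges across the heaviest remaining edge whose merge stays within the two size limits, and treats core code plus absorbed hot code as a single group labelled `<resident>`. The function names, sizes, and edge weights in the example are hypothetical.

```python
from typing import Dict, List, Set, Tuple

RESIDENT = "<resident>"  # label for the memory-resident group (core + hot code)


def greedy_cluster(
    sizes: Dict[str, int],
    edges: Dict[Tuple[str, str], int],
    core: Set[str],
    growth_percent: int,
    buf_sz: int,
) -> Dict[str, List[str]]:
    # Core functions start merged into the resident group; every other
    # function starts as its own single-function cluster.
    group_of = {f: (RESIDENT if f in core else f) for f in sizes}
    group_size: Dict[str, int] = {}
    for f, g in group_of.items():
        group_size[g] = group_size.get(g, 0) + sizes[f]
    resident_limit = group_size.get(RESIDENT, 0) * (100 + growth_percent) // 100

    while True:
        # Sum edge weights between each pair of distinct groups.
        weight: Dict[Tuple[str, str], int] = {}
        for (u, v), w in edges.items():
            a, b = sorted((group_of[u], group_of[v]))
            if a != b:
                weight[(a, b)] = weight.get((a, b), 0) + w
        # Keep only merges that respect the size constraints.
        legal = []
        for (a, b), w in weight.items():
            limit = resident_limit if RESIDENT in (a, b) else buf_sz
            if group_size[a] + group_size[b] <= limit:
                legal.append((w, a, b))
        if not legal:
            break
        _, a, b = max(legal)  # heaviest edge; ties broken by group name
        keep, drop = (b, a) if b == RESIDENT else (a, b)
        for f in group_of:
            if group_of[f] == drop:
                group_of[f] = keep
        group_size[keep] += group_size.pop(drop)

    clusters: Dict[str, List[str]] = {}
    for f in sorted(group_of):
        clusters.setdefault(group_of[f], []).append(f)
    return clusters


if __name__ == "__main__":
    sizes = {"sched": 1000, "read": 80, "parse": 500,
             "rare_a": 900, "rare_b": 1000, "rare_c": 1500}
    edges = {("sched", "read"): 100, ("read", "parse"): 50,
             ("rare_a", "rare_b"): 5, ("rare_b", "rare_c"): 3,
             ("parse", "rare_a"): 1}
    print(greedy_cluster(sizes, edges, {"sched"}, growth_percent=10, buf_sz=2048))
```

In this example the small, frequently called `read` is absorbed into the resident group, `rare_a` and `rare_b` share one cluster, and `rare_c` stays alone because adding it would exceed the buffer size.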

Code Transformation

Apply code transformation on:
• Inter-cluster control flow edges
• Control flow edges from memory-resident code to clusters (but not needed the other way)
• All indirect control flow edges (targets only known at runtime)
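The three rules above can be written as a single predicate over a control flow edge. This is only a sketch of the decision: the function name is hypothetical, and `None` is used here as a stand-in for "memory-resident code".

```python
from typing import Optional


def needs_rewrite(src_cluster: Optional[str],
                  dst_cluster: Optional[str],
                  indirect: bool = False) -> bool:
    """Decide whether a control flow edge must go through the runtime loader.

    A cluster of None means the code is memory-resident (an assumption of
    this sketch, not notation from the slides).
    """
    if indirect:
        return True   # target only known at runtime
    if dst_cluster is None:
        return False  # target is always in memory, from any source
    # Direct edge into a cluster: rewrite unless it stays inside that cluster.
    return src_cluster != dst_cluster


if __name__ == "__main__":
    print(needs_rewrite("A", "B"))    # inter-cluster edge
    print(needs_rewrite(None, "A"))   # memory-resident code to cluster
    print(needs_rewrite("A", None))   # cluster to memory-resident code
    print(needs_rewrite("A", "A"))    # edge within one cluster
```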

Code Transformation

After clustering:
• Cluster A: call F
• Cluster B (starts at 0x200): 0x220 F:

Rewritten code:
• Cluster A: push &F; call dyn_loader
• Cluster B (in code buffer, which starts at 0x500): 0x520 F:

dyn_loader (runtime library):
1. Address lookup for &F
2. Load B into code buffer
3. Translate target addr &F into relative addr in code buffer
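The three loader steps can be simulated with the addresses from the slide (cluster B at 0x200, F at 0x220, code buffer at 0x500). This is a toy model rather than kernel code: the class name, the cluster end addresses, and the load counter are assumptions made for illustration, and "loading" just records which cluster occupies the buffer.

```python
from typing import Dict, Optional, Tuple


class CodeBuffer:
    """Toy model of a code buffer that holds one cluster at a time."""

    def __init__(self, start: int, clusters: Dict[str, Tuple[int, int]]) -> None:
        self.start = start
        self.clusters = clusters  # name -> (first address, one past the last address)
        self.loaded: Optional[str] = None
        self.loads = 0            # number of loads from secondary storage

    def lookup(self, target: int) -> str:
        # Step 1: find the cluster that contains the target address.
        for name, (lo, hi) in self.clusters.items():
            if lo <= target < hi:
                return name
        raise KeyError(hex(target))

    def dyn_loader(self, target: int) -> int:
        name = self.lookup(target)
        # Step 2: load the cluster unless it already occupies the buffer.
        if self.loaded != name:
            self.loaded = name
            self.loads += 1
        # Step 3: translate the target into an address inside the buffer.
        return self.start + (target - self.clusters[name][0])


if __name__ == "__main__":
    # End addresses 0x200 and 0x300 are hypothetical.
    buf = CodeBuffer(0x500, {"A": (0x100, 0x200), "B": (0x200, 0x300)})
    print(hex(buf.dyn_loader(0x220)))  # F's address inside the buffer: 0x520
```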

Issue: Call Return in Code Buffer

Code:
  Cluster A: 0x100 ... push &F; 0x130 call dyn_loader; 0x140
  Cluster B: 0x200 ...; 0x220 F: ...; 0x250 ret

Runtime (code buffer: start at 0x500), holding Cluster A:
  0x500
         push &F
  0x530  call dyn_loader   <- pc
  0x540

return address = 0x540

Call Return in Code Buffer

Code:
  Cluster A: 0x100 ... push &F; 0x130 call dyn_loader; 0x140
  Cluster B: 0x200 ...; 0x220 F: ...; 0x250 ret

Load B into code buffer.

Runtime (code buffer: start at 0x500), now holding Cluster B:
  0x500
  0x520  F:
  0x540
  0x550  ret

The pc moves to F and runs on to ret, where return address = 0x540.

A has been overwritten by B!


Call Return in Code Buffer: Fix

Code:
  Cluster A: 0x100 ... push &F; 0x130 call dyn_loader; 0x140
  Cluster B: 0x200 ...; 0x220 F: ...; 0x250 ret

Runtime (code buffer: start at 0x500), holding Cluster A:
  0x500
         push &F
  0x530  call dyn_loader   <- pc
  0x540

Fix: the return address 0x540 is replaced by &dyn_restore_A.

dyn_restore_A: actual ret_addr = 0x140

Call Return in Code Buffer

Code:
  Cluster A: 0x100 ... push &F; 0x130 call dyn_loader; 0x140
  Cluster B: 0x200 ...; 0x220 F: ...; 0x250 ret

Load B into code buffer.

Runtime (code buffer: start at 0x500), now holding Cluster B:
  0x500
  0x520  F:
  0x540
  0x550  ret   <- pc

return address = &dyn_restore_A, so the ret moves the pc to dyn_restore_A.

dyn_restore_A: actual ret_addr = 0x140

Call Return in Code Buffer

Code:
  Cluster A: 0x100 ... push &F; 0x130 call dyn_loader; 0x140
  Cluster B: 0x200 ...; 0x220 F: ...; 0x250 ret

return address = &dyn_restore_A

dyn_restore_A (actual ret_addr = 0x140) restores Cluster A into the code buffer.

Runtime (code buffer: start at 0x500), holding Cluster A again:
  0x500
         push &F
  0x530  call dyn_loader
  0x540                    <- pc
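The call and return sequence on these slides can be traced with a small simulation. It is a sketch under simplifying assumptions: the restore stub dyn_restore_A is modelled as a record holding the caller's cluster and the actual return address (0x140), and "loading" only tracks which cluster occupies the buffer. The class and method names are hypothetical.

```python
from typing import Dict, List, Tuple

BUF_START = 0x500
CLUSTER_START: Dict[str, int] = {"A": 0x100, "B": 0x200}  # addresses from the slides


class Runtime:
    def __init__(self, loaded: str) -> None:
        self.loaded = loaded
        # Each entry stands in for a restore stub such as dyn_restore_A:
        # (caller cluster, actual return address in the original code).
        self.stubs: List[Tuple[str, int]] = []

    def to_buffer(self, cluster: str, addr: int) -> int:
        return BUF_START + (addr - CLUSTER_START[cluster])

    def call(self, caller: str, ret_addr: int, callee: str, target: int) -> int:
        # Record where to return instead of keeping a raw buffer address,
        # which would be stale once the callee overwrites the buffer.
        self.stubs.append((caller, ret_addr))
        self.loaded = callee
        return self.to_buffer(callee, target)

    def ret(self) -> int:
        # The stub reloads the caller's cluster, then resumes at the
        # buffer address that corresponds to the actual return address.
        caller, ret_addr = self.stubs.pop()
        self.loaded = caller
        return self.to_buffer(caller, ret_addr)


if __name__ == "__main__":
    rt = Runtime(loaded="A")
    pc = rt.call("A", 0x140, "B", 0x220)
    print(rt.loaded, hex(pc))  # B 0x520
    pc = rt.ret()
    print(rt.loaded, hex(pc))  # A 0x540
```

Without the stub, the ret in F would jump to 0x540 while the buffer still holds B, which is exactly the failure shown on the "Issue" slide.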

Context Switches and Interrupts

Context switches (over time):
1. Thread 1: execute cluster A in code buffer
2. Context switch: remember A in Thread 1's task_struct
3. Thread 2: execute; may change code buffer
4. Context switch: reload A into code buffer
5. Thread 1: continue executing in code buffer

Interrupts:
• Currently keep interrupt handlers in main memory
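The context-switch timeline can be sketched as follows. The `Task` class stands in for the per-thread record (task_struct on the slide), and the field and method names are hypothetical. The sketch reloads a cluster only when the buffer contents actually changed, which is an assumption rather than something the slide states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    saved_cluster: Optional[str] = None  # cluster that was in the buffer when switched out


class Kernel:
    def __init__(self) -> None:
        self.buffer_cluster: Optional[str] = None
        self.reloads = 0

    def load(self, cluster: str) -> None:
        self.buffer_cluster = cluster

    def switch_out(self, task: Task) -> None:
        # Remember which cluster the outgoing thread was executing.
        task.saved_cluster = self.buffer_cluster

    def switch_in(self, task: Task) -> None:
        # Reload the thread's cluster if another thread replaced it.
        if task.saved_cluster is not None and task.saved_cluster != self.buffer_cluster:
            self.load(task.saved_cluster)
            self.reloads += 1


if __name__ == "__main__":
    kernel = Kernel()
    t1, t2 = Task("Thread 1"), Task("Thread 2")
    kernel.load("A")        # Thread 1 executes cluster A
    kernel.switch_out(t1)   # remember A in Thread 1's record
    kernel.switch_in(t2)
    kernel.load("B")        # Thread 2 changes the code buffer
    kernel.switch_out(t2)
    kernel.switch_in(t1)    # A is reloaded before Thread 1 continues
    print(kernel.buffer_cluster, kernel.reloads)
```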

Experimental Setup

• Start with a minimally configured kernel (Linux 2.4.31)
• Compile the kernel with optimization for code size (gcc -Os)
  • Original code size: 590KB
• Implemented using binary rewriting tool PLTO
• Benchmarks: MiBench, MediaBench, httpd

Memory Usage Reduction for Kernel Code

[Figure: memory reduction ratio (y-axis, 20% to 80%) versus memory-resident code growth bound (x-axis, 0.00 to 0.10) for MiBench, MediaBench, and httpd. Code buffer size = 2KB.]

Reduction decreases because the amount of memory-resident code increases.

Estimated Cost of Code Loading

• All experiments were run in a desktop environment
• We estimated the cost of code loading as follows:
  • Choose Micron NAND flash memory as an example (2KB page, takes 130.9 μs to read a page)
  • Est. Cost = Σ_i access(i) × (size(i) / 2KB) × 130.9 μs
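A small worked example of the estimate follows. The formula on the slide is damaged in this transcript, so the sketch assumes the cost is a sum over items i of access(i) × size(i) / 2KB × 130.9 μs, reading access(i) as the number of times item i is loaded and size(i) as its size in bytes. Both readings, and the sample numbers, are assumptions. Rounding size(i) / 2KB up to whole pages would be a natural variation for page-granular flash reads.

```python
from typing import Iterable, Tuple

PAGE_SIZE_BYTES = 2 * 1024  # 2KB flash page, from the slide
PAGE_READ_US = 130.9        # time to read one page, from the slide


def estimated_cost_us(loads: Iterable[Tuple[int, int]]) -> float:
    """Estimated code loading cost in microseconds.

    `loads` yields (access_count, size_in_bytes) pairs. The formula is an
    assumed reading of the slide: sum of access * size / 2KB * 130.9 us.
    """
    return sum(access * (size / PAGE_SIZE_BYTES) * PAGE_READ_US
               for access, size in loads)


if __name__ == "__main__":
    # Hypothetical profile: one full-page item loaded 1000 times and one
    # half-page item loaded 200 times.
    cost = estimated_cost_us([(1000, 2048), (200, 1024)])
    print(f"estimated loading cost: {cost / 1e6:.4f} s")
```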

Overhead of Code Loading

[Figure: code loading overhead (y-axis, 0.0 to 7.0) versus memory-resident code growth bound (x-axis, 0.00 to 0.10) for MiBench, MediaBench, and httpd, with the unmodified kernel as the baseline. Annotated points: 57%, 56%, and 55% memory reduction.]

Related Work

• Code compaction of OS kernel
  • D. Chanet et al. LCTES 05
  • H. He et al. CGO 07
• Reduce memory requirement in embedded system
  • C. Park et al. EMSOFT 04
  • H. Park et al. DATE 06
  • B. Egger et al. CASE 06, EMSOFT 06
• Binary rewriting of OS kernel
  • Flower et al. FDDO-4

Conclusions

• Embedded devices typically have a limited amount of memory
• General-purpose OS kernels contain a lot of code that is not executed in an embedded context
• Reduce the memory requirement of the OS kernel by using an on-demand code overlay mechanism
• Memory requirements reduced significantly with little degradation in performance

Estimated Cost of Code Loading

[Figure: three panels (MiBench, MediaBench, Httpd), each plotting runtime (sec) against growth bound r (0.00 to 0.10) for the Overlay and Original kernels.]

A Big Picture

Main memory:
• Memory-resident kernel code
  • Core code: scheduler, memory management, interrupt handling
  • Hot code
• Code buffer: accommodates one cluster at a time and is reused

Cold code:
• Partitioned by code clustering

Memory Requirement for Kernel Code

• Core code
  • Needs to be in memory
  • Size is predetermined
• Hot code
  • Select the most frequently executed code
  • How much hot code should stay in memory?
  • Keep the total size of memory-resident code ≤ size(core code) × (1 + r), where the growth bound r is specified by the user (0%, 10%)
• Code buffer
  • Size specified by user (we chose 2KB)

Together these give an upper bound on memory usage for kernel code.