
Post on 15-Dec-2015


Garbage Collection
Introduction and Overview

Excerpted from presentation by Christian Schulte
Programming Systems Lab
Universität des Saarlandes, Germany
schulte@ps.uni-sb.de

Garbage Collection…

…is concerned with the automatic reclamation of dynamically allocated memory after its last use by a program

Garbage collection…

Dynamically allocated memory
Last use by a program
Examples for automatic reclamation

Kinds of Memory Allocation

static int i;                     /* static allocation */

void foo(void) {
    int j;                        /* automatic allocation */
    int* p = (int*) malloc(…);    /* dynamic allocation */
}

Static Allocation

By compiler (in static data area)
Available through entire runtime
Fixed size


Automatic Allocation

Upon procedure call (on stack)
Available during execution of call
Fixed size


Dynamic Allocation

Dynamically allocated at runtime (on heap)
Available until explicitly deallocated
Dynamically varying size

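The three lifetimes can be observed side by side in a small sketch. The names `calls` and `count_call` are hypothetical and not part of the original slides; the block only illustrates that the static variable persists across calls, the automatic variable is fresh on each call, and the heap block survives until `free`.

```c
#include <assert.h>
#include <stdlib.h>

static int calls;                 /* static: one instance for the whole run */

/* Returns a heap-allocated copy of the running call count, or NULL if
   malloc fails. The caller is responsible for freeing the result. */
int *count_call(void) {
    int local = 0;                /* automatic: fresh on every call, gone at return */
    local++;                      /* always 1 at this point */
    calls += local;               /* accumulates across calls */
    int *result = malloc(sizeof *result);   /* dynamic: lives until free() */
    if (result != NULL)
        *result = calls;
    return result;
}
```

Calling `count_call()` twice yields blocks holding 1 and 2: the static counter remembered the first call, while `local` did not.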

Dynamically Allocated Memory

Also: heap-allocated memory
Allocation: malloc, new, …
  – before first usage
Deallocation: free, delete, dispose, …
  – after last usage
Needed for
  – C++, Java: objects
  – SML: datatypes, procedures
  – anything that outlives a procedure call
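As a minimal sketch of "allocate before first usage, deallocate after last usage", consider a linked list whose cells outlive the call that creates them. The type `cell` and the functions `cons`, `sum`, and `free_list` are illustrative names chosen here, not taken from the slides.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct cell {
    int value;
    struct cell *next;
} cell;

/* Allocation: must happen before the first use of the cell. */
cell *cons(int value, cell *next) {
    cell *c = malloc(sizeof *c);
    if (c == NULL)
        abort();                  /* out of memory: give up in this sketch */
    c->value = value;
    c->next = next;
    return c;                     /* the cell outlives this call */
}

/* Uses the cells without changing their lifetime. */
int sum(const cell *list) {
    int total = 0;
    for (; list != NULL; list = list->next)
        total += list->value;
    return total;
}

/* Deallocation: only valid after the last use of every cell. */
void free_list(cell *list) {
    while (list != NULL) {
        cell *next = list->next;  /* read the link before freeing the cell */
        free(list);
        list = next;
    }
}
```

A caller would build a list with `cons(1, cons(2, cons(3, NULL)))`, use it, and call `free_list` exactly once when no other code still needs the cells.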

Getting it Wrong

Forget to free (memory leak)
  – program eventually runs out of memory
  – long-running programs: OSs, servers, …
Free too early (dangling pointer)
  – lucky: illegal access detected by OS
  – horror: memory reused, in simultaneous use
    • programs can behave arbitrarily
    • crashes might happen much later
Estimates of effort spent on manual memory management
  – Up to 40% of development time! [Rovner, 1985]
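A leak can be made visible with a small bookkeeping layer around `malloc` and `free`. The sketch below is an assumption-laden illustration, not a technique from the slides: `tracked_alloc`, `tracked_free`, `reclaim_leaks`, and the fixed table size `MAX_TRACKED` are all hypothetical. The dangling-pointer case is only described in a comment, because actually performing such an access is undefined behavior.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TRACKED 64            /* assumption: small fixed-size table */

static void *tracked[MAX_TRACKED];   /* blocks allocated and not yet freed */

void *tracked_alloc(size_t size) {
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i] == NULL) {
            tracked[i] = malloc(size);   /* stays NULL if malloc fails */
            return tracked[i];
        }
    }
    return NULL;                  /* table full */
}

void tracked_free(void *p) {
    for (int i = 0; p != NULL && i < MAX_TRACKED; i++) {
        if (tracked[i] == p) {
            tracked[i] = NULL;
            free(p);
            return;
        }
    }
}

/* Frees every outstanding block and returns how many had been leaked. */
int reclaim_leaks(void) {
    int leaked = 0;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i] != NULL) {
            free(tracked[i]);
            tracked[i] = NULL;
            leaked++;
        }
    }
    return leaked;
}

/* Forgets to free: the only pointer to the block is lost on return. */
void leaky(void) {
    int *p = tracked_alloc(sizeof *p);
    if (p != NULL)
        *p = 42;
}

/* Frees after the last use. Reading *p after tracked_free(p) would be
   a dangling-pointer access. */
void tidy(void) {
    int *p = tracked_alloc(sizeof *p);
    if (p != NULL)
        *p = 42;
    tracked_free(p);
}
```

After any number of `tidy()` calls `reclaim_leaks()` reports 0, while each `leaky()` call adds one outstanding block to the count.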

Nodes and Pointers

Node n
  – Memory block, cell
Pointer p
  – Link to node
  – Node access: *p
Children children(n)
  – set of pointers to nodes referred to by n

[Figure: pointer p referring to node n]
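One possible C rendering of these notions is sketched below. The fixed fan-out `MAX_CHILDREN`, the `payload` field, and the array-filling signature of `children` are assumptions made for the example; the slides leave the node layout open.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 2            /* assumption: fixed fan-out per node */

typedef struct node {
    int payload;
    struct node *child[MAX_CHILDREN];   /* NULL marks an unused slot */
} node;

/* Plays the role of children(n): copies the non-NULL child pointers of n
   into out and returns how many there are. */
size_t children(const node *n, node *out[MAX_CHILDREN]) {
    size_t count = 0;
    for (size_t i = 0; i < MAX_CHILDREN; i++)
        if (n->child[i] != NULL)
            out[count++] = n->child[i];
    return count;
}
```

With a pointer `p` to some node, `*p` is the node itself and `children(p, kids)` collects the pointers stored in it.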

Mutator

Abstraction of program
  – introduces new nodes with pointers
  – redirects pointers, creating garbage

Shared Nodes

Nodes referred to by several pointers
Makes manual deallocation hard
  – local decision impossible
  – respect other pointers to node
Cycles are an instance of sharing

Last Use by a ProgramLast Use by a Program

Question: When is node Question: When is node MM not any longer not any longer used by program?used by program?– Let P be any program not using M– New program sketch:

Execute P; Use M;– Hence:

M used P terminates– We are doomed: halting problem!

So “last use” undecidable!So “last use” undecidable!

Safe ApproximationSafe Approximation

Decidable and also simpleDecidable and also simpleWhat means safe?What means safe?

– only unused nodes freedWhat means approximation?What means approximation?

– some unused nodes might not be freed IdeaIdea

– nodes that can be accessed by mutator

Reachable NodesReachable Nodes

Reachable from Reachable from root setroot set– processor registers– static variables– automatic variables (stack)

Reachable from reachable nodesReachable from reachable nodes

roo

t

Summary: Reachable NodesSummary: Reachable Nodes

A node A node nn is reachable, iff is reachable, iff– n is element of the root set, or– n is element of children(m) and m is

reachable

Reachable node also called “live”Reachable node also called “live”
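The inductive definition can be evaluated directly with a worklist. The following sketch assumes a tiny hard-coded graph in which nodes are identified by indices; `child`, `NONE`, and `compute_live` are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODES 6
#define MAX_CHILDREN 2
#define NONE (-1)

/* child[n] lists the nodes that n points to; NONE marks an empty slot. */
static const int child[NODES][MAX_CHILDREN] = {
    {1, 2},        /* 0 -> 1, 2 */
    {3, NONE},     /* 1 -> 3 */
    {0, NONE},     /* 2 -> 0 (cycle through a reachable node) */
    {NONE, NONE},  /* 3 has no children */
    {5, NONE},     /* 4 -> 5 */
    {4, NONE},     /* 5 -> 4 (cycle that no root reaches) */
};

/* Sets live[n] to true exactly for the nodes reachable from the roots. */
void compute_live(const int *roots, size_t root_count, bool live[NODES]) {
    int worklist[NODES];          /* each node is pushed at most once */
    size_t top = 0;
    for (size_t n = 0; n < NODES; n++)
        live[n] = false;
    for (size_t i = 0; i < root_count; i++) {
        if (!live[roots[i]]) {    /* rule 1: elements of the root set */
            live[roots[i]] = true;
            worklist[top++] = roots[i];
        }
    }
    while (top > 0) {
        int m = worklist[--top];  /* m is already known to be reachable */
        for (size_t k = 0; k < MAX_CHILDREN; k++) {
            int n = child[m][k];
            if (n != NONE && !live[n]) {   /* rule 2: child of a reachable node */
                live[n] = true;
                worklist[top++] = n;
            }
        }
    }
}
```

With node 0 as the only root, nodes 0 to 3 come out live, while nodes 4 and 5 stay unreachable even though they point at each other.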

Mark and Sweep

Compute set of reachable nodes
Free nodes known to be not reachable

Reachability: Safe Approximation

Safe
  – access to not reachable node impossible
  – depends on language semantics
  – but C/C++? later…
Approximation
  – reachable node might never be accessed
  – programmer must know about this!
  – have you been aware of this?

Example Garbage Collectors

Mark-Sweep
Others
  – Mark-Compact
  – Reference Counting
  – Copying
  – see Chapters 1 & 2 of [Lins&Jones,96]

The Mark-Sweep Collector

Compute reachable nodes: Mark
  – tracing garbage collector
Free not reachable nodes: Sweep
Run when out of memory: Allocation
First used with LISP [McCarthy, 1960]

Allocation

node* new() {
    if (free_pool is empty)
        mark_sweep();
    return allocate();
}

The Garbage Collector

void mark_sweep() {
    for (r in roots)
        mark(r);
    /* all live nodes marked */
    …
}

Recursive Marking

void mark(node* n) {
    if (!is_marked(n)) {
        set_mark(n);
        for (m in children(n))
            mark(m);    /* i-th recursion: nodes on path with length i marked */
    }
}   /* nodes reachable from n marked */

The Garbage Collector

void mark_sweep() {
    for (r in roots)
        mark(r);
    sweep();
    /* all nodes on heap live and not marked */
    …
}

Eager Sweep

void sweep() {
    node* n = heap_bottom;
    while (n < heap_top) {
        if (is_marked(n)) clear_mark(n);   /* live: reset mark for next collection */
        else free(n);                      /* garbage: return node to free pool */
        n += 1;   /* next node; pointer arithmetic already scales by sizeof(*n) */
    }
}

The Garbage Collector

void mark_sweep() {
    for (r in roots)
        mark(r);
    sweep();
    if (free_pool is empty)
        abort("Memory exhausted");
}
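Putting allocation, marking, and sweeping together, a toy version of the collector might look like the sketch below. It is one possible concretization under loudly stated assumptions, not the presenter's implementation: the heap is a static array of `HEAP_SIZE` fixed-size nodes, every node has exactly two child fields, the root set is an explicit table filled through the hypothetical `gc_set_root`, and the free pool is threaded through the `left` field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define HEAP_SIZE 8               /* assumption: tiny heap of fixed-size nodes */
#define MAX_ROOTS 4               /* assumption: explicit root table */

typedef struct node {
    bool marked;                  /* mark bit */
    bool in_use;                  /* false while the node is in the free pool */
    struct node *left, *right;    /* child fields; left also links the free pool */
} node;

static node heap[HEAP_SIZE];
static node *free_pool;
static node *roots[MAX_ROOTS];

static void release(node *n) {    /* put n at the head of the free pool */
    n->in_use = false;
    n->right = NULL;
    n->left = free_pool;
    free_pool = n;
}

void gc_init(void) {
    free_pool = NULL;
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        heap[i].marked = false;
        release(&heap[i]);
    }
    for (size_t i = 0; i < MAX_ROOTS; i++)
        roots[i] = NULL;
}

void gc_set_root(size_t slot, node *n) { roots[slot] = n; }

static void mark(node *n) {
    if (n != NULL && !n->marked) {
        n->marked = true;
        mark(n->left);
        mark(n->right);
    }
}

static void sweep(void) {
    for (node *n = heap; n < heap + HEAP_SIZE; n++) {
        if (n->marked)
            n->marked = false;    /* live: clear mark for the next collection */
        else if (n->in_use)
            release(n);           /* garbage: back to the free pool */
    }
}

void mark_sweep(void) {
    for (size_t i = 0; i < MAX_ROOTS; i++)
        mark(roots[i]);
    sweep();
    if (free_pool == NULL)
        abort();                  /* memory exhausted */
}

node *gc_new(void) {
    if (free_pool == NULL)
        mark_sweep();
    node *n = free_pool;
    free_pool = n->left;
    n->in_use = true;
    n->left = n->right = NULL;
    return n;
}

size_t gc_free_count(void) {
    size_t count = 0;
    for (node *n = free_pool; n != NULL; n = n->left)
        count++;
    return count;
}
```

After `gc_init()`, a program registers its roots with `gc_set_root` and allocates with `gc_new()`; a collection runs only when the free pool is empty, and cyclic structures reachable from a root survive it.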

Assumptions

Nodes can be marked
Size of nodes known
Heap contiguous
Memory for recursion available
Child fields known!

Assumptions: Realistic

Nodes can be marked
Size of nodes known
Heap contiguous
Memory for recursion available
Child fields known

Assumptions: Conservative

Nodes can be marked
Size of nodes known
Heap contiguous
Memory for recursion available
Child fields known
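The "memory for recursion available" assumption can be made explicit by replacing the call stack with a bounded array. This is a swapped-in variant of the recursive `mark`, not something shown on the slides; `mark_iterative`, `MAX_NODES`, and the two-child node layout are assumptions for the sketch. It moves the memory requirement into a visible buffer rather than removing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 16              /* assumption: upper bound on nodes in the heap */

typedef struct node {
    bool marked;
    struct node *left, *right;
} node;

/* Marks everything reachable from root without recursion. A node is marked
   at the moment it is pushed, so it is pushed at most once and MAX_NODES
   stack slots suffice as long as the heap holds at most MAX_NODES nodes. */
void mark_iterative(node *root) {
    node *stack[MAX_NODES];
    size_t top = 0;
    if (root != NULL && !root->marked) {
        root->marked = true;
        stack[top++] = root;
    }
    while (top > 0) {
        node *n = stack[--top];
        node *kids[2] = { n->left, n->right };
        for (size_t i = 0; i < 2; i++) {
            node *m = kids[i];
            if (m != NULL && !m->marked) {
                m->marked = true;
                stack[top++] = m;
            }
        }
    }
}
```

The result is the same set of marked nodes as with the recursive version, but the worst-case space is fixed up front instead of depending on the depth of the call stack.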

Mark-Sweep Properties

Covers cycles and sharing
Time depends on
  – live nodes (mark)
  – live and garbage nodes (sweep)
Computation must be stopped
  – non-interruptible stop/start collector
  – long pause
Nodes remain unchanged (as not moved)
Heap remains fragmented

Software Engineering Issues

Design goal in SE:
  • decompose systems
  • in orthogonal components
Clashes with letting each component do its memory management
  • liveness is global property
  • leads to “local leaks”
  • lacking power of modern gc methods

Typical Cost

Early systems (LISP)
  up to 40% of runtime [Steele,75] [Gabriel,85]
  • “garbage collection is expensive” myth
Well-engineered systems of today
  10% of entire runtime [Wilson, 94]

Areas of Usage

Programming languages and systems
  – Java, C#, Smalltalk, …
  – SML, Lisp, Scheme, Prolog, …
  – Perl, Python, PHP, JavaScript
  – Modula 3, Microsoft .NET
Extensions
  – C, C++ (Conservative)
Other systems
  – Adobe Photoshop
  – Unix filesystem
  – Many others in [Wilson, 1996]

Understanding Garbage Collection: Benefits

Programming garbage collection
  – programming systems
  – operating systems
Understand systems with garbage collection (e.g. Java)
  – memory requirements of programs
  – performance aspects of programs
  – interfacing with garbage collection (finalization)

References

Garbage Collection. Richard Jones and Rafael Lins, John Wiley & Sons, 1996.
Uniprocessor garbage collection techniques. Paul R. Wilson, ACM Computing Surveys. To appear.
  • Extended version of IWMM 92, St. Malo.
