automatic storage management patrick earl simon leonard jack newton
TRANSCRIPT
Automatic Storage Automatic Storage ManagementManagement
Patrick Earl
Simon Leonard
Jack Newton
CMPUT 425/525: Automatic Storage Management 2
OverviewOverview
Terminology Why use Automatic Storage Management? Comparing garbage collection algorithms The “Classic” algorithms Copying garbage collection Incremental Tracing garbage collection Generational garbage collection Conclusions
CMPUT 425/525: Automatic Storage Management 3
TerminologyTerminology
Stack: a memory area where activation records or frames are pushed onto when a procedure is called and popped off when it returns
Heap: a memory area where data structures can be allocated and deallocated in any order.
CMPUT 425/525: Automatic Storage Management 4
TerminologyTerminology(Continued)(Continued)
Roots: values that a program can manipulate directly (i.e. values held in registers, on the program stack, and global variables.)
Node/Cell/Object: an individually allocated piece of data in the heap.
Children Nodes: the list of pointers that a given node contains.
Live Node: a node whose address is held in a root or is the child of a live node.
CMPUT 425/525: Automatic Storage Management 5
TerminologyTerminology(Continued)(Continued)
Garbage: nodes that are not live, but are not free either.
Garbage collection: the task of recovering (freeing) garbage nodes.
Mutator: The program running alongside the garbage collection system.
CMPUT 425/525: Automatic Storage Management 6
Why Garbage Collect?Why Garbage Collect?
Language requirements– In some situations it may be impossible to
know when a shared data structure is no longer in use.
CMPUT 425/525: Automatic Storage Management 7
Why Garbage Collect?Why Garbage Collect?(Continued)(Continued)
Software Engineering– Garbage collection increases abstraction level
of software development.– Simplified interfaces and decreases coupling of
modules.– Studies have shown a significant amount of
development time is spent on memory management bugs [Rovner, 1985].
CMPUT 425/525: Automatic Storage Management 8
Comparing Garbage Comparing Garbage Collection AlgorithmsCollection Algorithms
Directly comparing garbage collection algorithms is difficult – there are many factors to consider.
Some factors to consider:– Cost of reclaiming cells– Cost of allocating cells– Storage overhead– How does the algorithm scale with residency?– Will user program be suspended during garbage collection?– Does an upper bound exist on the pause time?– Is locality of data structures maintained (or maybe even
improved?)
CMPUT 425/525: Automatic Storage Management 9
Classes of Garbage Collection Classes of Garbage Collection AlgorithmsAlgorithms
Direct Garbage Collectors: a record is associated with each node in the heap. The record for node N indicates how many other nodes or roots point to N.
Indirect/Tracing Garbage Collectors: usually invoked when a user’s request for memory fails because the free list is exhausted. The garbage collector visits all live nodes, and returns all other memory to the free list. If sufficient memory has been recovered from this process, the user’s request for memory is satisfied.
CMPUT 425/525: Automatic Storage Management 10
Quick Review: Quick Review: Reference CountingReference Counting
Every cell has an additional field: the reference count. This field represents the number of pointers to that cell from roots or heap cells.
Initially, all cells in the heap are placed in a pool of free cells, the free list.
CMPUT 425/525: Automatic Storage Management 11
Reference CountingReference Counting(Continued)(Continued)
When a cell is allocated from the free list, its reference count is set to one.
When a pointer is set to reference a cell, the cell’s reference count is incremented by 1; if a pointer is to the cell is deleted, its reference count is decremented by 1.
When a cell’s reference count reaches 0, its pointers to its children are deleted and it is returned to the free list.
CMPUT 425/525: Automatic Storage Management 12
01
0
0
0
Reference Counting ExampleReference Counting Example
1
2
1
1
CMPUT 425/525: Automatic Storage Management 13
2
1
1
1
Reference Counting Example Reference Counting Example (Continued)(Continued)
0
1
CMPUT 425/525: Automatic Storage Management 14
2
1
1
1
Reference Counting Example Reference Counting Example (Continued)(Continued)
0
0
1
1
CMPUT 425/525: Automatic Storage Management 15
2
1
1
1
Reference Counting Example Reference Counting Example (Continued)(Continued)
0
0
0
1
1
Returned to free list
CMPUT 425/525: Automatic Storage Management 16
Reference Counting: Reference Counting: Advantages and DisadvantagesAdvantages and Disadvantages
Advantages:– Garbage collection overhead is distributed.– Locality of reference is no worse than mutator.– Free memory is returned to free list quickly.
CMPUT 425/525: Automatic Storage Management 17
Reference Counting: Reference Counting: Advantages and DisadvantagesAdvantages and Disadvantages
(Continued)(Continued)
Disadvantages:– High time cost (every time a pointer is changed,
reference counts must be updated).– Storage overhead for reference counter can be
high.– Unable to reclaim cyclic data structures.– If the reference counter overflows, the object
becomes permanent.
CMPUT 425/525: Automatic Storage Management 18
Reference Counting: Reference Counting: Cyclic Data Structure - Cyclic Data Structure - BeforeBefore
02
0
0
1
2
1
CMPUT 425/525: Automatic Storage Management 19
Reference Counting: Reference Counting: Cyclic Data Structure – Cyclic Data Structure – AfterAfter
01
0
0
1
2
1
CMPUT 425/525: Automatic Storage Management 20
Deferred Reference CountingDeferred Reference Counting
Optimisation– Cost can be improved by special treatment of
local variables.– Only update reference counters of objects on the
stack at fixed intervals.– Reference counts are still affected from pointers
from one heap object to another.
CMPUT 425/525: Automatic Storage Management 21
Quick Review: Mark-SweepQuick Review: Mark-Sweep
The first tracing garbage collection algorithm Garbage cells are allowed to build up until heap
space is exhausted (i.e. a user program requests a memory allocation, but there is insufficient free space on the heap to satisfy the request.)
At this point, the mark-sweep algorithm is invoked, and garbage cells are returned to the free list.
CMPUT 425/525: Automatic Storage Management 22
Mark-SweepMark-Sweep(Continued)(Continued)
Performed in two phases:– Mark phase: identifies all live cells by setting
a mark bit. Live cells are cells reachable from a root.
– Sweep phase: returns garbage cells to the free list.
CMPUT 425/525: Automatic Storage Management 23
Mark-Sweep ExampleMark-Sweep Example
Returned to free list
CMPUT 425/525: Automatic Storage Management 24
Mark-Sweep: Mark-Sweep: Advantages and DisadvantagesAdvantages and Disadvantages
Advantages:– Cyclic data structures can be recovered.– Tends to be faster than reference counting.
CMPUT 425/525: Automatic Storage Management 25
Mark-Sweep: Mark-Sweep: Advantages and DisadvantagesAdvantages and Disadvantages
(Continued)(Continued)
Disadvantages:– Computation must be halted while garbage
collection is being performed– Every live cell must be visited in the mark
phase, and every cell in the heap must be visited in the sweep phase.
– Garbage collection becomes more frequent as residency of a program increases.
– May fragment memory.
CMPUT 425/525: Automatic Storage Management 26
Mark-Sweep: Mark-Sweep: Advantages and DisadvantagesAdvantages and Disadvantages
(Continued)(Continued)
Disadvantages:– Has negative implications for locality of
reference. Old objects get surrounded by new ones (not suited for virtual memory applications).
However, if objects tend to survive in clusters in memory, as they apparently often do, this can greatly reduce the cost of the sweep phase.
CMPUT 425/525: Automatic Storage Management 27
Mark-Compact CollectionMark-Compact Collection
Remedy the fragmentation and allocation problems of mark-sweep collectors.
Two phases:– Mark phase: identical to mark sweep.– Compaction phase: marked objects are
compacted, moving most of the live objects until all the live objects are contiguous.
CMPUT 425/525: Automatic Storage Management 28
Mark-Compact: Mark-Compact: Advantages and DisadvantagesAdvantages and Disadvantages
(Continued)(Continued)
Advantages:– The contiguous free area eliminates
fragmentation problem. Allocating objects of various sizes is simple.
– The garbage space is "squeezed out", without disturbing the original ordering of objects. This ameliorate locality.
CMPUT 425/525: Automatic Storage Management 29
Mark-Compact: Mark-Compact: Advantages and DisadvantagesAdvantages and Disadvantages
(Continued)(Continued)
Disadvantages:– Requires several passes over the data are
required. "Sliding compactors" takes two, three or more passes over the live objects.
One pass computes the new location Subsequent passes update the pointers to refer to new
locations, and actually move the objects
CMPUT 425/525: Automatic Storage Management 30
Copying Garbage CollectionCopying Garbage Collection
Like mark-compact, copying garbage collection does not really "collect" garbage.
Rather it moves all the live objects into one area and the rest of the heap is know to be available.
Copying collectors integrate the traversal and the copying process, so that objects need only be traversed once.
The work needed is proportional to the amount of live date (all of which must be copied).
CMPUT 425/525: Automatic Storage Management 31
Semispace Collector Using the Semispace Collector Using the Cheney AlgorithmCheney Algorithm
The heap is subdivided into two contiguous subspaces (FromSpace and ToSpace).
During normal program execution, only one of these semispaces is in use.
When the garbage collector is called, all the live data are copied from the current semispace (FromSpace) to the other semispace (ToSpace).
CMPUT 425/525: Automatic Storage Management 32
Semispace Collector Using Semispace Collector Using the Cheney Algorithmthe Cheney Algorithm
A
B C
D
FromSpace ToSpace
CMPUT 425/525: Automatic Storage Management 33
Semispace Collector Using Semispace Collector Using the Cheney Algorithmthe Cheney Algorithm
FromSpace ToSpace
A B C D
A
B C
D
CMPUT 425/525: Automatic Storage Management 34
Semispace Collector Using the Semispace Collector Using the Cheney AlgorithmCheney Algorithm
(Continued)(Continued)
Once the copying is completed, the ToSpace is made the "current" semispace.
A simple form of copying traversal is the Cheney algorithm.
The immediately reachable objects from the initial queue of objects for a breadth-first traversal.
A scan pointer is advanced through the first object location by location.
Each time a pointer into FromSpace is encountered, the referred-to-object is transported to the end of the queue and the pointer to the object is updated.
CMPUT 425/525: Automatic Storage Management 35
Cheney Algorithm: ExampleCheney Algorithm: Example
Root Nodes
AB F
EDC
A
A
A
B
B
B
C
C
C
D
D
E
A B C D E F
BAscan
scan
scan
scan
scan
free
free
free
free
free
CMPUT 425/525: Automatic Storage Management 36
Semispace Collector Using the Semispace Collector Using the Cheney Algorithm Cheney Algorithm
(Continued)(Continued)
Multiple paths must not be copied to tospace multiple times.
When an object is transported to tospace, a forwarding pointer is installed in the old version of the object.
The forwarding pointer signifies that the old object is obsolete and indicates where to find the new copy.
CMPUT 425/525: Automatic Storage Management 37
Copying Garbage Collection: Copying Garbage Collection: Advantages and DisadvantagesAdvantages and Disadvantages
Advantages:– Allocation is extremely cheap.– Excellent asymptotic complexity.– Fragmentation is eliminated.– Only one pass through the data is required.
CMPUT 425/525: Automatic Storage Management 38
Copying Garbage Collection: Copying Garbage Collection: Advantages and DisadvantagesAdvantages and Disadvantages
(Continued)(Continued)
Disadvantages:– The use of two semi-spaces doubles memory
requirement needs– Poor locality. Using virtual memory will cause
excessive paging.
CMPUT 425/525: Automatic Storage Management 39
Problems with Simple Tracing Problems with Simple Tracing CollectorsCollectors
Difficult to achieve high efficiency in a simple garbage collector, because large amounts of memory are expensive.
If virtual memory is used, the poor locality of the allocation/reclamation cycle will cause excessive paging.
Even as main memory becomes steadily cheaper, locality within cache memory becomes increasingly important.
CMPUT 425/525: Automatic Storage Management 40
Problems with Simple Tracing Problems with Simple Tracing CollectorsCollectors
(Continued)(Continued)
With a simple semispace copy collector, locality is likely to be worse than mark-sweep.
The memory issue is not unique to copying collectors.
Any efficient garbage collection involves a trade-off between space and time.
The problem of locality is an indirect result of the use of garbage collection.
CMPUT 425/525: Automatic Storage Management 41
Incremental Tracing Collectors Incremental Tracing Collectors OverviewOverview
Introduction to Incremental CollectorsCoherence and ConservatismTricolor MarkingWrite Barrier AlgorithmsBaker’s Read Barrier Algorithm
CMPUT 425/525: Automatic Storage Management 42
Incremental Tracing CollectorsIncremental Tracing Collectors
Program (Mutator) and Garbage Collector run concurrently.– Can think of system as similar to two threads.
One performs collection, and the other represents the regular program in execution.
Can be used in systems with real-time requirements. For example, process control systems.
CMPUT 425/525: Automatic Storage Management 43
Coherence & ConservatismCoherence & Conservatism
Coherence: A proper state must be maintained between the mutator and the collector.
Conservatism: How aggressive the garbage collector is at finding objects to be deallocated.
CMPUT 425/525: Automatic Storage Management 44
TricoloringTricoloring
White – Not yet traversed. A candidate for collection.
Black – Already traversed and found to be live. Will not be reclaimed.
Grey – In traversal process. Defining characteristic is that it’s children have not necessarily been explored.
CMPUT 425/525: Automatic Storage Management 45
The Tricolor AbstractionThe Tricolor Abstraction
CMPUT 425/525: Automatic Storage Management 46
Tricoloring InvariantTricoloring Invariant
There must not be a pointer from a black object to a white object.
CMPUT 425/525: Automatic Storage Management 47
Violation of Coloring InvariantViolation of Coloring Invariant
Before After
A
B C
D
A
B C
D
CMPUT 425/525: Automatic Storage Management 48
Steps in ViolationSteps in Violation
Read a pointer to a white objectAssign that pointer to a black objectOriginal pointer must be destroyed without
collection system noticing.
CMPUT 425/525: Automatic Storage Management 49
Read BarrierRead Barrier
Barriers are essentially memory access detection systems.
We detect when any pointers to any white objects are read.
If a read to the pointer occurs, we conceptually color that object grey.
CMPUT 425/525: Automatic Storage Management 50
Write BarrierWrite Barrier
When a pointer is written to an object, we record the write somehow.
The recorded write is dealt with at a later point.
Read vs. Write efficiency considerations.
CMPUT 425/525: Automatic Storage Management 51
Write Barrier AlgorithmsWrite Barrier Algorithms
Snapshot-at-beginningIncremental update
CMPUT 425/525: Automatic Storage Management 52
Snapshot-at-beginningSnapshot-at-beginning
Conceptually makes a copy-on-write duplication of the pointer graph.
Can be implemented with a simple write barrier that records pointer writes and adds the old addresses to a stack to be traversed later.
CMPUT 425/525: Automatic Storage Management 53
Snapshot-at-beginning Snapshot-at-beginning ExampleExample
Before After
A
B C
D
A
B C
D
Stack
Pointer toD is nowOn stack
CMPUT 425/525: Automatic Storage Management 54
Comments on Snapshot-at-Comments on Snapshot-at-beginningbeginning
Very conservative.All overwritten pointer values are saved and
traversed.No objects can be freed while collection
process is occurring.
CMPUT 425/525: Automatic Storage Management 55
Incremental Update Write-Incremental Update Write-Barrier AlgorithmBarrier Algorithm
No copy of tree is made.Catches overwrites of pointers that have
been copied.– If a pointer is not copied before being written, it
will be freed.The object with the overwritten pointer is
colored grey and the algorithm must search that node again at the end.
CMPUT 425/525: Automatic Storage Management 56
Incremental Update ExampleIncremental Update Example
Before After
A
B C
D
A
B C
D
CMPUT 425/525: Automatic Storage Management 57
Comments on Incremental Comments on Incremental UpdateUpdate
Things that are freed during collection are far more likely to be collected than with the snapshot algorithm. (Less conservative)
Although the collector restarts the traversal in some places, it is guaranteed to do a full search and will eventually terminate.
CMPUT 425/525: Automatic Storage Management 58
Baker’s Read Barrier Baker’s Read Barrier AlgorithmsAlgorithms
Incremental CopyingNon-copying Algorithm (The Treadmill)
CMPUT 425/525: Automatic Storage Management 59
Incremental CopyingIncremental Copying
Variation of Copying Collector“Garbage collection cycle begins with an
atomic flip.”All objects directly pointed to by the root
are copied into tospace.
CMPUT 425/525: Automatic Storage Management 60
Read Barrier in Incremental Read Barrier in Incremental CopyingCopying
Whenever an object is read that is not already in ToSpace, the read barrier catches that and copies the object over to ToSpace at that point.
Normal “background scavenging” occurs simultaneously to ensure that all objects are traversed and reclamation can occur.
CMPUT 425/525: Automatic Storage Management 61
Incremental Copying ExampleIncremental Copying Example
A
BC
D
FromSpace ToSpace
Atomic Flip, then a read to D occurs…
E
DA B C
CMPUT 425/525: Automatic Storage Management 62
Comments on Read BarrierComments on Read Barrier
If implemented in software can be quite slow due to numerous reads to heap.
Specialized hardware is available on some unique machines that allow this type of tracing to be done quickly.
CMPUT 425/525: Automatic Storage Management 63
Baker’s Incremental Non-Baker’s Incremental Non-Copying AlgorithmCopying Algorithm
Doubly Linked Lists
New area for allocations since started collection
To/From spacesFree list
New
Free
FromTo
Allocation
Scanning
CMPUT 425/525: Automatic Storage Management 64
Example - AllocationExample - Allocation
Take an object from the free list and move it to the new list.
New
Free
FromTo
Allocation
Scanning
CMPUT 425/525: Automatic Storage Management 65
Example - ScanningExample - Scanning
Searching nodes in ToSpace for references to objects in FromSpace.
When found, object is unlinked in FromSpace and is linked in ToSpace.
New
Free
FromTo
Allocation
Scanning
CMPUT 425/525: Automatic Storage Management 66
Treadmill WorkingsTreadmill Workings
When starting collection cycle:– New list is empty– From list contains all New and To objects from
last cycle.Collection proceeds and scanning and
allocation are performed.When finished:
– From list is merged with Free list.
CMPUT 425/525: Automatic Storage Management 67
Comments on TreadmillComments on Treadmill
As in Incremental Copying, the garbage found in the FromSpace is reclaimed in constant time.
Conservative with new objectsConservative also in that reached objects
will not be removed even if they become garbage before scan ends.
CMPUT 425/525: Automatic Storage Management 68
Incremental Collectors Incremental Collectors SummarySummary
Incremental Tracing CollectorsTricolor Marking and InvariantRead and Write BarriersSnapshot-at-beginningIncremental UpdateBaker’s Incremental CopyingBaker’s Non-copying (Treadmill)
CMPUT 425/525: Automatic Storage Management 69
Generational Garbage Generational Garbage CollectionCollection
Attempts to address weaknesses of simple tracing collectors such as mark-sweep and copying collectors:– All active data must be marked or copied.– For copying collectors, each page of the heap is
touched every two collection cycles, even though the user program is only using half the heap, leading to poor cache behavior and page faults.
– Long-lived objects are handled inefficiently.
CMPUT 425/525: Automatic Storage Management 70
Generational Garbage Generational Garbage CollectionCollection
(Continued)(Continued)
Generational garbage collection is based on the generational hypothesis:
Most objects die young.As such, concentrate garbage collection
efforts on objects likely to be garbage: young objects.
CMPUT 425/525: Automatic Storage Management 71
Generational Garbage Generational Garbage Collection: Object LifetimesCollection: Object Lifetimes
When we discuss object lifetimes, the amount of heap allocation that occurs between the object’s birth and death is used rather than the wall time.
For example, an object created when 1Kb of heap was allocated and was no longer referenced when 4 Kb of heap data was allocated would have lived for 3Kb.
CMPUT 425/525: Automatic Storage Management 72
Generational Garbage Generational Garbage Collection: Object LifetimesCollection: Object Lifetimes
(Continued)(Continued)
Typically, between 80 and 98 percent of all newly-allocated heap objects die before another megabyte has been allocated.
CMPUT 425/525: Automatic Storage Management 73
Generational Garbage Generational Garbage CollectionCollection
(Continued)(Continued)
Objects are segregated into different areas of memory based on their age.
Areas containing newer objects are garbage collected more frequently.
After an object has survived a given number of collections, it is promoted to a less frequently collected area.
CMPUT 425/525: Automatic Storage Management 74
Generational Garbage Generational Garbage Collection: ExampleCollection: Example
Old Generation New Generation
Root SetS
A
B
C
Memory Usage Memory Usage
CMPUT 425/525: Automatic Storage Management 75
Generational Garbage Generational Garbage Collection: ExampleCollection: Example
(Continued)(Continued)
Old Generation New Generation
Root SetS
A
B
C
Memory Usage Memory Usage
R
CMPUT 425/525: Automatic Storage Management 76
Generational Garbage Generational Garbage Collection: ExampleCollection: Example
(Continued)(Continued)
Old Generation New Generation
Root SetS
A
B
C
Memory Usage Memory Usage
R D
CMPUT 425/525: Automatic Storage Management 77
Generational Garbage Generational Garbage Collection: ExampleCollection: Example
(Continued)(Continued)
This example demonstrates several interesting characteristics of generational garbage collection:– The young generation can be collected independently of
the older generations (resulting in shorter pause times).– An intergenerational pointer was created from R to D.
These pointers must be treated as part of the root set of the New Generation.
– Garbage collection in the new generation result in S becoming unreachable, and thus garbage. Garbage in older generations (sometimes called tenured garbage) can not be reclaimed via garbage collections in younger generations.
CMPUT 425/525: Automatic Storage Management 78
Generational Garbage Generational Garbage Collection: ImplementationCollection: Implementation
Usually implemented as a copying collector, where each generation has its own semispace:
Old Generation New Generation
FromSpace FromSpace
ToSpace ToSpace
CMPUT 425/525: Automatic Storage Management 79
Generational Garbage Generational Garbage Collection: IssuesCollection: Issues
Choosing an appropriate number of generations:– If we benefit from dividing the heap into two
generations, can we further benefit by using more than two generations?
Choosing a promotion policy:– How many garbage collections should an object
survive before being moved to an older generation?
CMPUT 425/525: Automatic Storage Management 80
Generational Garbage Generational Garbage Collection: IssuesCollection: Issues
(Continued)(Continued)
Tracking intergenerational pointers:– Inter-generational pointers need to be tracked,
since they form part of the root set for younger generations.
Collection Scheduling– Can we attempt to schedule garbage collection
in such a way that we minimize disruptive pauses?
CMPUT 425/525: Automatic Storage Management 81
Generational Garbage Collection: Generational Garbage Collection: Multiple GenerationsMultiple Generations
Generation 1Generation 2Generation 3Generation 4
CMPUT 425/525: Automatic Storage Management 82
Generational Garbage Collection: Generational Garbage Collection: Multiple GenerationsMultiple Generations
(Continued)(Continued)
Advantages:– Keeps youngest generation’s size small.– Helps address mistakes made by the promotion policy by creating
more intermediate generations that still get garbage collected fairly frequently.
Disadvantages:– Collections for intermediate generations may be disruptive.– Tends to increase number of inter-generational pointers, increasing
the size of the root set for younger generations.
Most generational collectors are limited to just two or three generations.
CMPUT 425/525: Automatic Storage Management 83
Generational Garbage Collection: Generational Garbage Collection: Promotion PoliciesPromotion Policies
A promotion policy determines how many garbage collections cycles (the cycle count) an object must survive before being advanced to the next generation.
If the cycle count is too low, objects may be advanced too fast; if too high, the benefits of generational garbage collection are not realized.
CMPUT 425/525: Automatic Storage Management 84
Generational Garbage Collection: Generational Garbage Collection: Promotion PoliciesPromotion Policies
(Continued)(Continued)
With a cycle count of just one, objects created just before the garbage collection will be advanced, even though the generational hypothesis states they are likely to die soon.
Increasing the cycle count to two denies advancement to recently created objects.
Under most conditions, it increasing the cycle count beyond two does not significantly reduce the amount of data advanced.
CMPUT 425/525: Automatic Storage Management 85
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
Inter-generational pointers can be created in two ways:– When an object containing pointers is promoted to an
older generation.– When a pointer to an object in a newer generation is
stored in an object.
The garbage collector can easily detect promotion-caused inter-generational pointers, but handling pointer stores is a more complicated task.
CMPUT 425/525: Automatic Storage Management 86
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
Pointer stores can be tracked via the use of a write barrier:– Pointer stores must be accompanied by extra
bookkeeping instructions that let the garbage collector know of pointers that have been updated.
Often implemented at the compiler level.
CMPUT 425/525: Automatic Storage Management 87
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
(Continued)(Continued)
However, write barriers only provide a conservative estimation of live intergenerational pointers:
Old Generation New Generation
Root Set
CMPUT 425/525: Automatic Storage Management 88
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
(Continued)(Continued)
Tracking inter-generational pointers are often the largest cost of generational garbage collection.
1 percent of a typical Lisp program’s total instruction count are pointer stores. If a write barrier adds 10 instructions to a pointer store, overall performance will drop by 10 percent.
CMPUT 425/525: Automatic Storage Management 89
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
(Continued)(Continued)
Entry Tables– Pointers from older generations point indirectly
to younger generations via an entry table:Generation 2 Generation 1Generation 3
Entry Table Entry Table
CMPUT 425/525: Automatic Storage Management 90
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
(Continued)(Continued)
Entry Table: Advantages– When a younger generation is collected, only the entry
table for that generation needs to be scanned.
Entry Table: Disadvantages– Entry table may contain several entries to the same
object, making scans of the object table proportional to the number of pointer stores rather than to the number of inter-generational pointers.
– High overhead because of extra level of indirection.
CMPUT 425/525: Automatic Storage Management 91
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
(Continued)(Continued)
Remembered Sets– The write barrier checks to see if a pointer
being stored in an old objects points to an object in a newer generation. If so, the address of the old object is added to the remembered set (if that object is not already in the set).
CMPUT 425/525: Automatic Storage Management 92
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
(Continued)(Continued)
Remembered Sets (Continued)New GenerationOld Generation
Remembered Set
CMPUT 425/525: Automatic Storage Management 93
Generational Garbage Collection: Generational Garbage Collection: Inter-generational PointersInter-generational Pointers
(Continued)(Continued)
Remembered Sets: Advantages– Scanning is proportional to the number of
stored-into objects, not the number of store operations.
Remembered Sets: Disadvantages– Pointer store checking can be expensive.
CMPUT 425/525: Automatic Storage Management 94
Generational Garbage Collection: Generational Garbage Collection: Collection SchedulingCollection Scheduling
Generational garbage collection aims to reduce pause times. When should these (hopefully short) pause times occur?
Two strategies exist:– Hide collections when the user is least likely to
notice a pause, or– Trigger efficient collections when there is likely
to be lots of garbage to collect.
CMPUT 425/525: Automatic Storage Management 95
Generational Garbage Collection: Generational Garbage Collection: AdvantagesAdvantages
In practice it has proven to be an effective garbage collection technique.
Minor garbage collections are performed quickly.
Good cache and virtual memory behavior.
CMPUT 425/525: Automatic Storage Management 96
Generational Garbage Collection: Generational Garbage Collection: DisadvantagesDisadvantages
Performs poorly if any of the main assumptions are false:– That objects tend die young.– That there are relatively few pointers from old objects
to young ones.
Frequent pointer writes to older generations will increase the cost of the write barrier, and possibly increase the size of the root set for younger generations.
CMPUT 425/525: Automatic Storage Management 97
Garbage Collection: SummaryGarbage Collection: Summary
Method Conservatism Space Time Fragmentation Locality
Mark Sweep Major Basic 1 traversal + heap scan
Yes Fair
Mark Compact Major Basic Many passes of heap
No Good
Copying Major Two Semispaces
1 traversal No Poor
Reference Counting
No Reference count field
Constant per Assignment
Yes Very Good
Deferred Reference Counting
Only for stack variables
Reference Count Field
Constant per Assignment
Yes Very Good
Incremental Varies depending on algorithm
Varies Can be Guaranteed Real-Time
Varies Varies
Generational Variable Segregated Areas
Varies with number of live objects in new generation
Yes (Non-Copying)
No (Copying)
Good
Tracin
gIn
cremen
tal
CMPUT 425/525: Automatic Storage Management 98
Garbage Collection: Garbage Collection: ConclusionsConclusions
Relieves the burden of explicit memory allocation and deallocation.
Software module coupling related to memory management issues is eliminated.
An extremely dangerous class of bugs is eliminated.
CMPUT 425/525: Automatic Storage Management 99
Garbage Collection: ConclusionsGarbage Collection: Conclusions(Continued)(Continued)
Zorn’s study in 1989/93 compared garbage collection to explicit deallocation:
– Non-generational Between 0% and 36% more CPU time. Between 40% and 280% more memory.
– Generational garbage collection Between 5% to 20% more CPU time. Between 30 and 150% more memory.
Wilson feels these numbers can be improved, and they are also out of date.
A well implemented garbage collector will slow a program down by approximately 10 percent relative to explicit heap deallocation.
CMPUT 425/525: Automatic Storage Management 100
Garbage Collection: ConclusionsGarbage Collection: Conclusions(Continued)(Continued)
Despite this cost, garbage collection a feature in many widely used languages:– Lisp (1959)– Perl (1987)– Java (1995)– C# (2001)– Microsoft’s Common Language Runtime (2002)
CMPUT 425/525: Automatic Storage Management 101
Garbage Collection: PointersGarbage Collection: Pointers
Heap of fish applet (Mark and Sweep garbage collection example)http://www.artima.com/insidejvm/applets/HeapOfFish.html
Java HotSpot Garbage Collection Strategieshttp://developer.java.sun.com/developer/technicalArticles/Networking/HotSpot/
The Memory Management Referencehttp://www.memorymanagement.org/
Uniprocessor Garbage Collection Techniques (Wilson)
http://www.cs.ualberta.ca/~duane/courses/425-525/WilsonACMDraft.pdf
Garbage Collection: Algorithms for Automatic Dynamic Memory Management
(Richard Jones and Rafael Lins)
CMPUT 425/525: Automatic Storage Management 102
Questions?Questions?
If you have any questions, please feel free to
e-mail one of us:Patrick Earl [email protected]
Simon Leonard [email protected]
Jack Newton [email protected]