![Page 1: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/1.jpg)
Garbage Collection in the Next C++ Standard
Hans-J. Boehm,
Mike Spertus, Symantec
![Page 2: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/2.jpg)
The Context (1)
• Conservative garbage collection for C and C++ has been used for 20+ years.
  – Usually works, possibly with a small amount of tweaking.
  – Especially for 64-bit applications.
• More attractive with multi-core processors.
  – Explicit memory management gets harder with threads.
  – Some parallel programming techniques are much more difficult/expensive without GC.
  – GC parallelizes better than malloc/free.
• GC-based leak detectors are also common.
• One major limiting factor:
  – C and C++ standards don’t fully sanction garbage collecting implementations.
  – Programmers are hesitant to use nonstandard tools.
![Page 3: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/3.jpg)
The Context (2)
• C++ standard is undergoing revision.
• “C++0x” expected somewhere near 2010 or 2011.
  – Initial committee draft was put out for review.
• Many other new features:
  – “Concepts” (templates type-checked in isolation).
  – Threads support (threads API, memory model, atomics).
• Still struggling with object lifetime issues:
  – Library-based classic reference counting (shared_ptr).
  – R-value references (references to otherwise inaccessible values) support low-cost shared_ptr moves.
• Microsoft’s C++/CLI provides a separate garbage-collected heap.
![Page 4: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/4.jpg)
Our Goal
• “Transparent” garbage collection.
  – Ordinary pointers; works with existing library code.
  – Supports
    • Code designed for GC
    • Leak detection
    • “Litter collection”
  – Supports atomic pointers with cheap assignment.
![Page 5: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/5.jpg)
Our Proposal, version 1
• GC support in the implementation “mandatory”.
• GC use optional, but must be consistent across application.
  – If you have to trace a section of the heap, you might as well collect it.
• Program sections specify “gc_forbidden”, “gc_required”, or “gc_safe” (default).
  – Linker diagnoses conflicts.
• Annotations can specify when integral types may contain pointers.
• This proposal is currently on hold, not in CD.
![Page 6: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/6.jpg)
Issues with original proposal (1)
• gc_required / gc_forbidden must be consistent for whole program:
  – Too coarse.
  – Need to deal with plug-ins with limited interface.
![Page 7: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/7.jpg)
Issues with original proposal (2)
• Finalization is needed for interaction of GC with explicit resource management.
• Finalization is problematic in the presence of dead variable elimination.
    class C {
      int indx;
      // E[indx] contains associated data.
      // Finalizer cleans up E[indx].
      void foo() {
        int i = indx;
        // `this` is dead here.
        // May be finalized?
        bar(E[i]);
      }
    };
![Page 8: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/8.jpg)
Our proposal, version 2
• Minimal compromise proposal
  – Garbage collected implementations are allowed, not required.
• Officially allows collection of memory allocated with built-in operator new.
  – malloc() is arguably in the domain of the C committee.
  – malloc() garbage collection may be harder to retrofit.
• Not intended as long term replacement for proposal 1.
• In current Committee Draft.
![Page 9: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/9.jpg)
Proposal 2 components
1. Allow unreachable objects to be reclaimed.
2. Provide a simple API to
  • Explicitly prevent reclamation of specified objects (declare_reachable()).
  • Declare that certain objects do not need to be traced because they contain no pointers (declare_no_pointers()).
![Page 10: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/10.jpg)
Reclamation of unreachable objects in C++
• Existing conservative collectors reclaim objects not reachable via pointer chains from variables.
• Leak detectors make similar assumptions.
• But current standard does not guarantee that unreachable objects are dead.

    intptr_t q = ~(intptr_t)p;
    p = 0;
    …
    p = (foo *)(~q);
    … *p …

• Disallow this!
• Unavoidably a compatibility issue.
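The pointer-hiding pattern on this slide can be written out as a small compilable sketch. The helper names (`hide_pointer`, `reveal_pointer`, `round_trip`) and the `foo` layout are illustrative assumptions, not part of the proposal. Without a collector the round trip works; with one, the object has no visible pointer to it between the two marked lines.

```cpp
#include <cstdint>

struct foo { int value; };  // hypothetical payload type

// Hide a pointer by storing the bitwise complement of its integer value.
// A collector scanning memory sees no bit pattern equal to the address.
inline std::intptr_t hide_pointer(foo* p) {
    return ~reinterpret_cast<std::intptr_t>(p);
}

// Recover the original pointer from its hidden form.
inline foo* reveal_pointer(std::intptr_t q) {
    return reinterpret_cast<foo*>(~q);
}

// Mirrors the slide: hide, clear, restore, dereference.
inline int round_trip() {
    foo* p = new foo{42};
    std::intptr_t q = hide_pointer(p);
    p = nullptr;            // no visible pointer to the object remains
    p = reveal_pointer(q);  // a collector could have reclaimed it by now
    int v = p->value;
    delete p;
    return v;
}
```

This is the kind of code the proposal would disallow, which is why the slide calls it a compatibility issue.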
![Page 11: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/11.jpg)
This isn’t as easy as it looks …
• Initial attempt:
  – Objects that were once unreachable may not be dereferenced (incl. deallocation).
• Insufficient:

    intptr_t q = ~(intptr_t)p;
    …
    foo *r = (foo *)(~q);
    p = 0;
    … *r …
![Page 12: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/12.jpg)
A better formulation
• Only safely-derived pointers may be dereferenced.
• A safely-derived pointer was computed without intervening integer arithmetic from another safely-derived pointer.
• Safely-derived pointers may only be stored in
  – pointer objects.
  – integer objects of sufficient size.
  – aligned character arrays.
• Whether a value is safely derived depends on how it was computed, not on the bits representing the pointer.
  – Sometimes p is safely derived and r is not, but p == r.
• Draft standard contains a precise inductive definition. Thanks to Clark Nelson (Intel).
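A rough illustration of "p == r, but only p is safely derived", following the informal rules on this slide rather than the draft's precise definition. The `node` type, the `derivations` struct, the `derive` function, and the XOR mask are all hypothetical. The three results hold identical bits; they differ only in how they were computed.

```cpp
#include <cstdint>
#include <cstring>

struct node { int value; };  // hypothetical payload type

struct derivations {
    node* via_pointer;     // copied as a pointer
    node* via_char_array;  // copied through an aligned character array
    node* via_arithmetic;  // rebuilt after integer arithmetic
};

inline derivations derive(node* p) {
    derivations d{};

    // Plain pointer copy: stays safely derived.
    d.via_pointer = p;

    // Byte copy through a suitably aligned character array: one of the
    // storage locations the slide lists as permitted.
    alignas(node*) unsigned char buf[sizeof(node*)];
    std::memcpy(buf, &p, sizeof p);
    std::memcpy(&d.via_char_array, buf, sizeof p);

    // Scramble and unscramble with integer arithmetic. The bits come back
    // unchanged, but by the slide's rule the result is not safely derived.
    const std::uintptr_t mask = 0x5a5a5a5a;  // arbitrary illustrative mask
    std::uintptr_t scrambled = reinterpret_cast<std::uintptr_t>(p) ^ mask;
    d.via_arithmetic = reinterpret_cast<node*>(scrambled ^ mask);

    return d;
}
```

All three members compare equal, so a run-time check on the pointer value cannot tell them apart; the distinction exists only in the derivation history.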
![Page 13: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/13.jpg)
API addition 1
• declare_reachable() / undeclare_reachable() allow a pointer to be dereferenced even if it is not safely-derived.
  – No-ops in non-GC implementation.
  – Allow old code to be retrofitted.
• undeclare_reachable() returns safely derived copy of pointer.
![Page 14: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/14.jpg)
declare_reachable() example

    declare_reachable(p);
    intptr_t q = ~(intptr_t)p;
    p = 0;
    …
    p = undeclare_reachable((foo *)(~q));
    … *p …
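The slide's example, filled out so that it compiles. The functions in namespace `sketch` are hypothetical stand-ins written as no-ops, which is how the slides describe a non-GC implementation; they are not the draft's actual library declarations. `foo` and `hidden_round_trip` are likewise illustrative.

```cpp
#include <cstdint>

namespace sketch {
// Hypothetical no-op stand-ins for a non-GC implementation.
inline void declare_reachable(void*) {}

template <class T>
inline T* undeclare_reachable(T* p) { return p; }
}  // namespace sketch

struct foo { int value; };  // hypothetical payload type

inline int hidden_round_trip() {
    foo* p = new foo{7};

    // Pin the object before its only visible pointer disappears.
    sketch::declare_reachable(p);
    std::intptr_t q = ~reinterpret_cast<std::intptr_t>(p);
    p = nullptr;

    // Unpin, obtaining a copy that may be dereferenced again.
    p = sketch::undeclare_reachable(reinterpret_cast<foo*>(~q));
    int v = p->value;
    delete p;
    return v;
}
```

Wrapping the hidden region in this pair of calls is the retrofit path the previous slide mentions for old code.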
![Page 15: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/15.jpg)
Implementation Challenges
• Implemented as global GC-visible multiset representation, but:
  – declare_reachable() applies to complete objects. undeclare_reachable() argument need not match exactly.
  – Matching calls don’t need to come from the same thread: scalability with thread/processor count.
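One naive way to picture the "global multiset" is a lock-protected `std::multiset` of pinned addresses. This is a minimal sketch under stated assumptions, not the prototype's data structure: `reachable_registry` and its member names are hypothetical, and it matches exact addresses only, so it does not handle an argument that points into the middle of a complete object.

```cpp
#include <cstddef>
#include <mutex>
#include <set>

namespace sketch {

// Hypothetical registry of pinned addresses. A multiset is used because
// the same object may be declared reachable more than once.
class reachable_registry {
public:
    void declare(const void* p) {
        std::lock_guard<std::mutex> lock(mu_);
        pinned_.insert(p);
    }

    // Removes one matching registration; returns false if none existed.
    bool undeclare(const void* p) {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = pinned_.find(p);
        if (it == pinned_.end()) return false;
        pinned_.erase(it);  // erase a single instance, not all equal keys
        return true;
    }

    std::size_t count(const void* p) const {
        std::lock_guard<std::mutex> lock(mu_);
        return pinned_.count(p);
    }

private:
    mutable std::mutex mu_;
    std::multiset<const void*> pinned_;
};

}  // namespace sketch
```

Every call takes the same lock, and calls from different threads must see each other's registrations, which is where the scalability concern on this slide comes from.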
![Page 16: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/16.jpg)
API Addition 2
• declare_no_pointers(p,n) / undeclare_no_pointers(p,n) declare the address range [p, p+n) to not hold pointers; safely derived pointers may not be stored there.
• Allows the programmer to specify more “type” information.
• Much more compatible with C++ constructor/destructor model than allocation-time specifications.
• Can be applied to static/stack/heap objects.
• undeclare_no_pointers() must be called before explicit deallocation.
![Page 17: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/17.jpg)
declare_no_pointers() example

    class foo {
      foo * next;
      char cmprsd[N];
    public:
      foo() { … declare_no_pointers(cmprsd, N); }
      ~foo() { … undeclare_no_pointers(cmprsd, N); }
      …
    };
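The constructor/destructor pairing can also be packaged as a scope guard. This is a sketch only: `no_pointers_guard` and `active_ranges` are hypothetical names, and the two free functions are stand-ins that merely count live registrations so the pairing is observable.

```cpp
#include <cstddef>

namespace sketch {

// Hypothetical stand-ins that only count currently registered ranges.
inline int& active_ranges() {
    static int n = 0;
    return n;
}
inline void declare_no_pointers(char*, std::size_t) { ++active_ranges(); }
inline void undeclare_no_pointers(char*, std::size_t) { --active_ranges(); }

// Registers [p, p+n) on construction and unregisters it on destruction,
// so the range is always undeclared before the storage goes away.
class no_pointers_guard {
public:
    no_pointers_guard(char* p, std::size_t n) : p_(p), n_(n) {
        declare_no_pointers(p_, n_);
    }
    ~no_pointers_guard() { undeclare_no_pointers(p_, n_); }

    no_pointers_guard(const no_pointers_guard&) = delete;
    no_pointers_guard& operator=(const no_pointers_guard&) = delete;

private:
    char* p_;
    std::size_t n_;
};

}  // namespace sketch
```

Declared as a member after the buffer it covers, or as a local next to a stack buffer, such a guard gives the same ordering as the explicit calls in the slide's class.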
![Page 18: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/18.jpg)
Implementation Challenges
• Efficient handling for frequently constructed stack objects.
• Scalability.
![Page 19: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/19.jpg)
Prototype Implementation
• Currently just track registered ranges.
  – Processing deferred to GC time.
• Keep a small number of ranges in a thread-local data structure.
• Very small ranges and smaller objects are currently ignored.
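A minimal sketch of what "a small number of ranges in a thread-local data structure" could look like. The capacity, the size threshold, and every name here (`range_table`, `local_ranges`, `kSlots`, `kMinTracked`) are assumptions for illustration; the slides do not give the prototype's actual values or layout.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

struct range {
    const char* start;
    std::size_t len;
};

constexpr std::size_t kMinTracked = 64;  // hypothetical: smaller ranges are ignored
constexpr std::size_t kSlots = 8;        // hypothetical per-thread capacity

// Fixed-size table of registered no-pointer ranges. Nothing is processed
// here; a collector would read the table at GC time.
struct range_table {
    std::array<range, kSlots> slots{};
    std::size_t used = 0;

    // Returns false when the range is too small to track or the table is full.
    bool add(const char* p, std::size_t n) {
        if (n < kMinTracked || used == kSlots) return false;
        slots[used++] = range{p, n};
        return true;
    }

    // Removes an exactly matching range by swapping in the last entry.
    bool remove(const char* p, std::size_t n) {
        for (std::size_t i = 0; i < used; ++i) {
            if (slots[i].start == p && slots[i].len == n) {
                slots[i] = slots[--used];
                return true;
            }
        }
        return false;
    }
};

// One table per thread, so registration needs no locking.
inline range_table& local_ranges() {
    thread_local range_table table;
    return table;
}

}  // namespace sketch
```

Dropping a registration is harmless in this scheme because the declaration is only a hint: an untracked range is simply scanned like any other memory.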
![Page 20: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/20.jpg)
Preliminary Performance Measurements
[Chart: processor nsecs/op-pair (vertical axis) vs. threads (horizontal axis)]
![Page 21: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/21.jpg)
Conclusions
• Current C++0x draft explicitly allows garbage-collected implementations.
• Support APIs differ from existing implementations.
  – For good reasons, we think.
• New set of implementation challenges.
• More extensive GC support will be considered after C++0x.
• Not too late for comments.
![Page 22: Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec](https://reader035.vdocument.in/reader035/viewer/2022062511/55142c49550346e7488b5d46/html5/thumbnails/22.jpg)
Questions?