Pointer and Escape Analysis for Multithreaded Programs
Alexandru Salcianu Martin Rinard
Laboratory for Computer Science, Massachusetts Institute of Technology
{salcianu, rinard}@lcs.mit.edu
Goal
Automatically extract precise points-to and escape information for multithreaded programs
Application
Analyze and optimize multithreaded programs that use region-based memory allocation
Outline
• Example
• Analysis
• Experimental Results
• Related Work
• Conclusions
Parallel Fibonacci Computation
[Figure: computation tree for Fib(3). Fib(3) spawns threads for Fib(2) and Fib(1); Fib(2) in turn spawns threads for Fib(1) and Fib(0). The threads are then joined, bottom-up, to produce the final result.]
Fibonacci Code
while (true) {
    int i = read_input();
    Fib f = new Fib(i);
    f.run();
}

class Fib implements Runnable {
    int source;
    Fib(int i) { source = i; }

    public void run() {
        Task t = new Task(new Integer(source));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        System.out.println(t.target);
    }
}

class Task extends Thread {
    Integer source, target;

    Task(Integer s) { source = s; }

    public void run() {
        int v = source.intValue();
        if (v <= 1) {
            target = source;   // base case: Fib(0) = 0, Fib(1) = 1
        } else {
            // spawn one child thread per subproblem
            Task t1 = new Task(new Integer(v - 1));
            Task t2 = new Task(new Integer(v - 2));
            t1.start(); t2.start();
            try {
                t1.join(); t2.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            int x = t1.target.intValue();
            int y = t2.target.intValue();
            target = new Integer(x + y);
        }
    }
}
Impact of Dynamic Object Allocation
• More garbage collection
• Execution time overhead
• The garbage collection cycles interfere with the application
• Real time constraints are difficult to meet
• Try to solve this by exploiting the strong correlation between lifetime of objects and lifetime of computation
Solution
Execute each computation in its own memory region:
• computation allocates its objects in that region
• when computation ends, all objects are deallocated
Advantages of Regions
• Good news: no need for garbage collection!
• More predictable programs
• Great for real time applications with hard time constraints
• Adopted in the Real Time Specification for Java (Bollella et al., 2000)
Using Regions in Example
while (true) {
    int i = read_input();
    Fib f = new Fib(i);
    Region r = new Region();
    r.enter(f);
}
r.enter(f) executes the run() method of f in the memory region r.
Lifetime of region = lifetime of computation
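To make "lifetime of region = lifetime of computation" concrete, here is a minimal sketch in Java. The `Region` class below is hypothetical (it is not the Real Time Specification for Java API), and since Java has no explicit deallocation, clearing the list only models freeing all objects at once.

```java
import java.util.ArrayList;
import java.util.List;

class RegionDemo {
    public static void main(String[] args) {
        Region r = new Region();
        r.enter(() -> {
            // the computation registers its objects with the current region
            Region.current().allocate(Integer.valueOf(42));
            System.out.println("live objects inside: " + Region.current().liveObjects());
        });
        System.out.println("live objects after exit: " + r.liveObjects());
    }
}

/** Hypothetical region: all objects registered during enter() are dropped when it returns. */
class Region {
    private static final ThreadLocal<Region> CURRENT = new ThreadLocal<>();
    private final List<Object> objects = new ArrayList<>();

    static Region current() { return CURRENT.get(); }

    <T> T allocate(T obj) { objects.add(obj); return obj; }

    int liveObjects() { return objects.size(); }

    void enter(Runnable computation) {
        Region previous = CURRENT.get();
        CURRENT.set(this);
        try {
            computation.run();
        } finally {
            objects.clear();        // models deallocating the whole region at once
            CURRENT.set(previous);
        }
    }
}
```

The sketch only tracks the current region per thread; a region shared by several threads, as in the Fib example, would need more bookkeeping.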
Nested Regions
• Short-lived computations are embedded into bigger computations
• The nesting of regions corresponds to the nesting of computations
• Hierarchy of memory regions
• Lifetime of a child region is included in the lifetime of its parent region
Nested Regions Example
[Figure: a parent memory region containing an object, with a child memory region nested inside it.]
Safety Problem
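[Figure: parent memory region, child memory region and an object. A reference into the child region becomes a dangling reference once the child region is deallocated.]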
Dynamic Check Approach
[Figure: memory regions and an object.]
• Referencing down regions is NOT OK
• Referencing up regions is OK
Dynamic checks to make sure all references go up
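A minimal sketch of such a dynamic check, assuming a hypothetical `ScopedRegion` class in which each region knows its parent. A store is allowed only when the referenced object lives in the same region as the holder or in one of its enclosing (longer-lived) regions.

```java
class RegionCheckDemo {
    public static void main(String[] args) {
        ScopedRegion parent = new ScopedRegion(null);
        ScopedRegion child = new ScopedRegion(parent);
        // object in child refers to object in parent: reference goes up
        System.out.println(ScopedRegion.storeAllowed(child, parent));   // prints "true"
        // object in parent refers to object in child: reference goes down
        System.out.println(ScopedRegion.storeAllowed(parent, child));   // prints "false"
    }
}

/** Hypothetical region with a parent link; null parent means outermost region. */
class ScopedRegion {
    final ScopedRegion parent;

    ScopedRegion(ScopedRegion parent) { this.parent = parent; }

    /** True if ancestor is this region or one of its enclosing regions. */
    boolean isEnclosedBy(ScopedRegion ancestor) {
        for (ScopedRegion r = this; r != null; r = r.parent) {
            if (r == ancestor) return true;
        }
        return false;
    }

    /** A store "holder.field = target" is safe if the target's region outlives the holder's. */
    static boolean storeAllowed(ScopedRegion holderRegion, ScopedRegion targetRegion) {
        return holderRegion.isEnclosedBy(targetRegion);
    }

    /** The runtime check: fails with an exception instead of creating a dangling reference. */
    static void checkStore(ScopedRegion holderRegion, ScopedRegion targetRegion) {
        if (!storeAllowed(holderRegion, targetRegion)) {
            throw new IllegalStateException("reference would point down into a shorter-lived region");
        }
    }
}
```

Running `checkStore` on every reference store is what produces the overhead and the extra runtime exception.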
Problems with Dynamic Check Approach
• Execution time overhead
• Programs have to cope with a new kind of runtime exception
• Detecting the error at runtime may not be that useful …
Dynamic Check Approach
[Figure: memory regions and an object; referencing down regions is NOT OK, referencing up regions is OK. One object is marked as an escaped object.]
Our Goal
• Analyze the program and statically check that it never creates dangling references
• If no object is reachable from outside the computation that creates it, then clearly there are no dangling references
• Escape analysis: given a computation, find out which objects escape from the computation, i.e., are reachable from outside the computation
Region Safety Analysis
1. Identify all run() methods that might be called by Region.enter()
   • Each such method + threads it starts represent one possible computation
2. Use pointer and escape analysis to check that for every computation, no object created inside it is reachable from outside
3. If so, no dangling references
4. Can remove all checks!
Why Do We Need a New Analysis?
• Existing analyses treat threads in a very conservative way:
  • All objects reachable from a thread are considered to escape
  • No attempt is made to recapture them
• In the Fib example, every object escapes into some thread, but none of them escapes the whole computation
Key Contribution of Analysis
• Analyze interactions between threads
• Can recognize when objects do not escape a multithreaded computation
• Even when the objects are accessed by multiple threads within the computation
Analysis Key Features
• Uses graphs to model heap
  • Nodes represent objects
  • Edges represent references
• Intra-procedural analysis is flow sensitive
• Inter-procedural analysis is bottom-up
• Compositional at both method and thread level:
  • Analyzes a method / thread once, specializes the result for each use
  • Records enough info to analyze the interactions between parallel threads
Nodes
• NI = inside nodes
  • represent objects created within the analyzed part of the program
  • one inside node for each object creation site; it represents all objects created at that site
  • thread nodes represent thread objects
• NO = outside nodes
  • placeholders for unknown nodes
  • will be disambiguated in the inter-procedural / inter-thread analysis
  • key element for compositionality
[Figure: sample nodes labeled nI (inside node) and nO (outside node).]
Outside node types
• NP = parameter nodes
  • represent objects passed as incoming parameters
• NL = load nodes
  • represent objects loaded from a node reachable from outside the analyzed part of the program
  • one load node for each load statement in a method
Edges
• Used to model heap references
• Inside edges
  • represent references created by the analyzed part of the program
• Outside edges
  • represent heap references read from nodes reachable from outside the analyzed part of the program
[Figure: two example edges labeled f, one between n1 and n2 and one between n3 and n4.]
Escape function
• A node escapes if it is reachable from outside the analyzed part of the program:
  • Parameter nodes escape
  • Nodes corresponding to unanalyzed started threads escape
  • Nodes reachable from an escaped node escape too
• The escape function records how each node escapes: through a parameter, through an unanalyzed started thread, etc.
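The escape function can be sketched as plain graph reachability. This is an illustrative sketch, not the authors' implementation: nodes are strings, `edges` holds all heap edges, and `roots` are the nodes that escape directly (parameter nodes and unanalyzed started threads). For each reachable node it records which roots it escapes through.

```java
import java.util.*;

class EscapeDemo {
    /** Maps each escaped node to the set of roots it escapes through. */
    static Map<String, Set<String>> escapeFunction(Map<String, List<String>> edges, Set<String> roots) {
        Map<String, Set<String>> e = new TreeMap<>();
        for (String root : roots) {
            Deque<String> work = new ArrayDeque<>();
            Set<String> seen = new HashSet<>();
            work.push(root);
            seen.add(root);
            while (!work.isEmpty()) {
                String n = work.pop();
                e.computeIfAbsent(n, k -> new TreeSet<>()).add(root);
                // everything reachable from an escaped node escapes too
                for (String m : edges.getOrDefault(n, List.of())) {
                    if (seen.add(m)) work.push(m);
                }
            }
        }
        return e;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("p", List.of("a"));   // parameter node p references a
        edges.put("a", List.of("b"));   // a references b
        edges.put("c", List.of("b"));   // c is not reachable from p
        System.out.println(escapeFunction(edges, Set.of("p")));
        // prints {a=[p], b=[p], p=[p]}
    }
}
```

Node `c` is absent from the result: it is captured, so an object it represents could safely live in the computation's region.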
Parallel Interaction Graph
• Models the result of the execution of the analyzed part of the program
• Contains:
  • Inside edges, outside edges and escape function: inherited from the base algorithm for sequential programs
  • Started threads: key extension for multithreaded programs
  • Action ordering: improves precision of analysis
Intra-procedural analysis
• Analysis scope = one method
• Initial state:
  • formals point to parameter nodes
  • each parameter nP escapes through itself: e(nP) = { nP }
  • no thread has been started yet
• Transfer functions for each type of instruction
void foo() {
    SThread a = new SThread();
    C b = new C();
    a.f = b;
    a.start();
    C c = b.g;
}
[Figure: the graph for foo(), built up one statement at a time.]
• a = new SThread(): inside node 1; a points to 1
• b = new C(): inside node 2; b points to 2
• a.f = b: inside edge f from 1 to 2
• a.start(): thread 1 is started; nodes 1 and 2 escape into thread 1
• c = b.g: load node 3; c points to 3; outside edge g from 2 to 3; nodes 1, 2 and 3 escape into thread 1
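The walk through foo() can be replayed with a small sketch of the transfer functions. This is a simplification written for illustration, not the analysis itself: every variable points to exactly one node, node names such as `n1` are chosen to match the node numbers in the figure, and a load only handles the case where the base node escapes (so a load node and an outside edge are created).

```java
import java.util.*;

class FooAnalysisDemo {
    final Map<String, String> vars = new TreeMap<>();     // variable -> node
    final Set<String> insideEdges = new TreeSet<>();      // e.g. "n1.f->n2"
    final Set<String> outsideEdges = new TreeSet<>();
    final Set<String> startedThreads = new TreeSet<>();
    private final Map<String, List<String>> succ = new HashMap<>();

    void newObject(String var, String node) { vars.put(var, node); }

    void store(String base, String field, String src) {   // base.field = src
        addEdge(insideEdges, vars.get(base), field, vars.get(src));
    }

    void start(String var) { startedThreads.add(vars.get(var)); }

    void load(String dst, String base, String field, String loadNode) {  // dst = base.field
        String from = vars.get(base);
        if (escaped().contains(from)) {   // a parallel thread may have written the field
            addEdge(outsideEdges, from, field, loadNode);
            vars.put(dst, loadNode);
        }
    }

    /** Nodes reachable from a started (unanalyzed) thread node. */
    Set<String> escaped() {
        Set<String> seen = new TreeSet<>(startedThreads);
        Deque<String> work = new ArrayDeque<>(startedThreads);
        while (!work.isEmpty()) {
            for (String m : succ.getOrDefault(work.pop(), List.of())) {
                if (seen.add(m)) work.push(m);
            }
        }
        return seen;
    }

    private void addEdge(Set<String> kind, String from, String field, String to) {
        kind.add(from + "." + field + "->" + to);
        succ.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    static FooAnalysisDemo analyzeFoo() {
        FooAnalysisDemo g = new FooAnalysisDemo();
        g.newObject("a", "n1");        // a = new SThread();
        g.newObject("b", "n2");        // b = new C();
        g.store("a", "f", "b");        // a.f = b;
        g.start("a");                  // a.start();
        g.load("c", "b", "g", "n3");   // c = b.g;
        return g;
    }

    public static void main(String[] args) {
        FooAnalysisDemo g = analyzeFoo();
        System.out.println("inside edges:  " + g.insideEdges);
        System.out.println("outside edges: " + g.outsideEdges);
        System.out.println("escaped:       " + g.escaped());
    }
}
```

With thread 1 still unanalyzed, the sketch reports nodes n1, n2 and n3 as escaped, matching the figure.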
Inter-thread analysis
• Extends the scope of the analysis from a method to a method + threads it starts
• Given a program point P, find a parallel interaction graph that reflects the interaction of:
  • Current method up to P
  • Threads it [transitively] starts
• Suppose there is only one started thread
• First step: get the parallel interaction graph at the end of the run() method of that thread
class SThread extends Thread {
    C f;   // written by foo() before the thread is started
    public void run() {
        C x = this.f;
        C y = new C();
        x.g = y;
    }
}
[Figure: the graph for foo() next to the graph at the end of SThread.run().]
• foo(): unchanged; thread 1 is started and nodes 1, 2 and 3 escape into thread 1
• run(): this points to parameter node 4; x points to load node 5, reached from 4 by an outside edge f; y points to inside node 6, reached from 5 by an inside edge g
• In run(), nodes 4, 5 and 6 escape through 4
Inter-thread analysis
• We want to combine the two parallel interaction graphs in a single one
• Need to disambiguate the outside nodes
• Second step: map the outside nodes from one graph to nodes from the other graph
  • Initial mappings
  • Rules for extending them
[Figure sequence: the graph for foo() and the graph for SThread.run() are matched step by step. As far as it can be recovered from the node labels:]
• Node 4 (this in run()) maps to thread node 1
• Node 5 maps to node 2: the outside edge f from 4 matches the inside edge f from 1 to 2
• Node 3 maps to node 6: the outside edge g from 2 matches the inside edge g from 5 to 6
• Combined graph: nodes 1, 2, 3, 5 and 6 remain; a points to 1, b points to 2, c points to 3
Nothing escapes in the new graph! Thread 1 has been analyzed; objects no longer escape through it!
Load nodes 3 and 5 can be removed.
[Figure: final graph. a points to 1, b points to 2, c points to 6; edge f from 1 to 2, edge g from 2 to 6.]
No node escapes from the scope represented by method foo() + the thread it starts.
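The matching step for this example can be sketched as a small fixed-point loop. This is an illustrative simplification, not the authors' algorithm: each outside node maps to a single node (the real analysis works with sets of nodes and more rules), and node names `n1` to `n6` are chosen to match the node numbers in the figures. Starting from the initial mapping of `this` (n4) to the thread node (n1), an outside edge is matched against an inside edge with the same field whose source is the same node under the mapping.

```java
import java.util.*;

class InterThreadDemo {
    /** A heap edge from.field -> to. */
    static final class Edge {
        final String from, field, to;
        Edge(String from, String field, String to) {
            this.from = from; this.field = field; this.to = to;
        }
    }

    /** Extends the initial mapping until no more outside edges can be matched. */
    static Map<String, String> match(List<Edge> inside, List<Edge> outside, Map<String, String> initial) {
        Map<String, String> mu = new TreeMap<>(initial);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Edge o : outside) {
                String oFrom = mu.getOrDefault(o.from, o.from);
                for (Edge i : inside) {
                    String iFrom = mu.getOrDefault(i.from, i.from);
                    if (oFrom.equals(iFrom) && o.field.equals(i.field) && !mu.containsKey(o.to)) {
                        // the placeholder o.to stands for the object the inside edge points to
                        mu.put(o.to, mu.getOrDefault(i.to, i.to));
                        changed = true;
                    }
                }
            }
        }
        return mu;
    }

    public static void main(String[] args) {
        List<Edge> inside = List.of(new Edge("n1", "f", "n2"),    // foo: a.f = b
                                    new Edge("n5", "g", "n6"));   // run: x.g = y
        List<Edge> outside = List.of(new Edge("n2", "g", "n3"),   // foo: c = b.g
                                     new Edge("n4", "f", "n5"));  // run: x = this.f
        Map<String, String> mu = match(inside, outside, Map.of("n4", "n1"));
        System.out.println("mapping: " + mu);
        // inside edges of the combined graph, with placeholders replaced
        for (Edge e : inside) {
            System.out.println(mu.getOrDefault(e.from, e.from) + "." + e.field
                    + " -> " + mu.getOrDefault(e.to, e.to));
        }
    }
}
```

The sketch ends with the mapping n4 to n1, n5 to n2 and n3 to n6, and the combined inside edges n1.f -> n2 and n2.g -> n6, which is the final graph once load nodes 3 and 5 are dropped.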
Final look over the analysis
• What makes the inter-thread analysis possible?
• Analysis deals with unknown execution contexts by using placeholders (outside nodes)
• In the inter-thread analysis, the matching rules are able to disambiguate these placeholders
Analyzed applications
Http     web server
Quote    quote server
Barnes   scientific computation
Water    scientific computation
Tree     synthetic benchmark
Array    synthetic benchmark
Use of regions in the applications
• Http and Quote use one region per connection
• Barnes and Water:
  • Sequence of computations
  • Each computation spawns multiple threads and executes in its own memory region
Intra vs. Inter-thread
Http     Intra-thread
Quote    Intra-thread
Barnes   Inter-thread
Water    Inter-thread
Tree     Intra-thread
Array    Intra-thread
Analysis was able to check that regions are correctly used!
Analysis vs. Backend time
[Figure: bar chart of analysis time and backend time in seconds (axis 0 to 80) for http, quote, barnes, water, tree and array.]
Execution time
[Figure: bar chart of execution time in seconds (axis 0 to 50) for http, quote, barnes, water, tree and array, comparing the original version, the version with checks and the version with no checks.]
Related Work
• Standard escape/pointer analyses:
  • Blanchet (OOPSLA 99)
  • Bogda and Hoelzle (OOPSLA 99)
  • Choi, Gupta, Serrano, Sreedhar and Midkiff (OOPSLA 99)
  • Whaley and Rinard (OOPSLA 99)
• Treat threads very conservatively:
  • Any object reachable from a thread is considered to escape forever
Related Work
• Rugina and Rinard (PLDI 99) go beyond this but deal only with structured parallelism: parbegin / parend blocks of code
  • We are able to analyze general threads (POSIX style)
• Ruf (PLDI 00) is able to remove synchronizations on objects synced on by a single thread
  • In some cases, we can do so even for objects synced on by multiple threads
Conclusions
• Inter-thread analysis is challenging but possible
• The main use of the analysis will be to check statically that programs use regions correctly
• Removing dynamic checks can also provide a modest performance improvement