Interprocedural Path Profiling
David Melski
Thomas Reps
University of Wisconsin
Introduction
• What is path profiling?
  – Counts the number of times particular path fragments are executed
• Our work: extensions of Ball-Larus
  – New interprocedural techniques
  – New intraprocedural techniques
Applications
• Program Optimization
  – Path-qualified dataflow analysis (Ammons, Larus)
• Software Maintenance
  – Path spectra (Reps et al.)
  – “Oddball” paths
  – Debugger applications
Ball-Larus Tech.
• “Remove” cycles
  – Add surrogate edges
  – Remove backedges
• Left with a DAG, which has a finite number of acyclic paths
Ball-Larus Tech.
• Label each vertex v with numPaths[v], the number of paths from v to Exit
• Use a bottom-up traversal (successors before predecessors)
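The bottom-up labeling can be sketched in C as follows. This is a minimal illustration, not the authors' implementation: the six-vertex DAG, the arrays `NSUCC` and `SUCC`, and the function `compute_num_paths` are assumptions made for the example, and vertices are assumed to be numbered in topological order with Exit last.

```c
#include <assert.h>

/* Hypothetical example DAG (not from the slides):
 *   0 -> 1,2   1 -> 2,3   2 -> 3   3 -> 4,5   4 -> 5   5 = Exit
 * Vertices are numbered in topological order, so every edge goes
 * from a lower index to a higher one. Unused slots hold -1. */
enum { NVERTS = 6, MAX_SUCC = 2 };
static const int NSUCC[NVERTS] = {2, 2, 1, 2, 1, 0};
static const int SUCC[NVERTS][MAX_SUCC] = {
    {1, 2}, {2, 3}, {3, -1}, {4, 5}, {5, -1}, {-1, -1}
};

/* Fill numPaths[v] with the number of paths from v to Exit.
 * Visiting vertices from last to first is a bottom-up traversal:
 * every successor is finished before any of its predecessors. */
void compute_num_paths(unsigned long numPaths[NVERTS]) {
    for (int v = NVERTS - 1; v >= 0; v--) {
        if (NSUCC[v] == 0) {          /* Exit: exactly one (empty) path */
            numPaths[v] = 1;
            continue;
        }
        numPaths[v] = 0;
        for (int i = 0; i < NSUCC[v]; i++) {
            int w = SUCC[v][i];
            assert(w > v);            /* topological-order assumption */
            numPaths[v] += numPaths[w];
        }
    }
}
```

For this graph the call leaves numPaths = {6, 4, 2, 2, 1, 1}, so there are six acyclic paths from vertex 0 to Exit.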
Ball-Larus Tech.
• Label edges such that:
  – For each path p, p’s path number (the sum of the labels on p’s edges) is a unique value
[Figure: example DAG with each of its four paths annotated with its path number: pathNum = 0, 3, 1, 2]
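One common way to obtain such labels is to give the edge from v to its i-th successor the total numPaths of v's earlier successors. The sketch below applies that rule to a hypothetical six-vertex DAG; the graph, the array names, and the helper `count_path_numbers` are assumptions for illustration only.

```c
#include <assert.h>

/* Hypothetical example DAG (not from the slides), topologically numbered:
 *   0 -> 1,2   1 -> 2,3   2 -> 3   3 -> 4,5   4 -> 5   5 = Exit */
enum { NVERTS = 6, MAX_SUCC = 2 };
static const int NSUCC[NVERTS] = {2, 2, 1, 2, 1, 0};
static const int SUCC[NVERTS][MAX_SUCC] = {
    {1, 2}, {2, 3}, {3, -1}, {4, 5}, {5, -1}, {-1, -1}
};

/* Compute numPaths[] and the edge labels in one bottom-up pass.
 * val[v][i] is the label on the edge from v to its i-th successor:
 * the number of paths that leave v through an earlier successor. */
void label_edges(unsigned long numPaths[NVERTS],
                 unsigned long val[NVERTS][MAX_SUCC]) {
    for (int v = NVERTS - 1; v >= 0; v--) {
        if (NSUCC[v] == 0) {
            numPaths[v] = 1;
            continue;
        }
        numPaths[v] = 0;
        for (int i = 0; i < NSUCC[v]; i++) {
            int w = SUCC[v][i];
            assert(w > v);                 /* topological-order assumption */
            val[v][i] = numPaths[v];       /* paths via successors 0..i-1 */
            numPaths[v] += numPaths[w];
        }
    }
}

/* Walk every path from v to Exit and count how often each path
 * number (the running sum of edge labels) is produced. */
void count_path_numbers(int v, unsigned long sum,
                        unsigned long val[NVERTS][MAX_SUCC],
                        unsigned seen[]) {
    if (NSUCC[v] == 0) {
        seen[sum]++;
        return;
    }
    for (int i = 0; i < NSUCC[v]; i++) {
        count_path_numbers(SUCC[v][i], sum + val[v][i], val, seen);
    }
}
```

On this graph the six paths receive the numbers 0 through 5, each exactly once, so the numbering is both unique and dense.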
Ball-Larus Tech.
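• Add instrumentation:
  – Variable pathNum
  – Example: start with pathNum = 0; on each edge traversed, add the edge label (pathNum += label); at the end of the path, Profile[pathNum]++
[Figure: step-by-step trace of one path over the labeled DAG (legend: edge labels), starting from pathNum = 0 and ending with Profile[pathNum]++]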
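To make the instrumentation concrete, here is a hand-instrumented sketch of a loop body with two independent branches. The function `classify`, the `Profile` array size, and the chosen labels are assumptions for illustration: two paths remain after the first branch joins, so its taken edge is labeled 2, the taken edge of the second branch is labeled 1, and all other edges are labeled 0.

```c
#include <assert.h>

/* One counter per path; a body with two two-way branches has 4 paths. */
static unsigned long Profile[4];

/* Hand-instrumented body: pathNum accumulates the (assumed) edge labels
 * along the path actually taken, then indexes the profile. */
void classify(int i) {
    unsigned pathNum = 0;             /* start of path */
    if ((i % 2) == 0) {
        pathNum += 2;                 /* label on the first taken edge */
    }
    if ((i % 3) == 0) {
        pathNum += 1;                 /* label on the second taken edge */
    }
    assert(pathNum < 4);              /* numbering is dense: 0..3 */
    Profile[pathNum]++;               /* end of path: record it */
}
```

Calling classify(i) for i = 1..18 yields Profile = {6, 3, 6, 3}: six values take neither branch, three take only the second, six take only the first, and three take both.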
Introductory Example (main)
#include <math.h>

int main() {
    double t, result = 0.0;
    int i = 1;

    /* Add i^2 for every even i and again for every multiple of 3, i = 1..18 */
    while( i <= 18 ) {
        if( (i%2) == 0 ) {
            t = pow( i, 2 );
            result += t;
        }
        if( (i%3) == 0 ) {
            t = pow( i, 2 );
            result += t;
        }
        i++;
    }
    return 0;
}
Computes: $\sum_{j=1}^{9} (2j)^2 + \sum_{k=1}^{6} (3k)^2$
Supergraph G*
• Unique vertices Entry_global and Exit_global
• CFG for each procedure P
  – Unique Entry_P and Exit_P
• call and return-site vertices for each procedure call
• call-edges and return-edges connect calls to procedures
Invalid Path Example
• Do not want to consider invalid paths for profiling
Valid Paths
[Figure: supergraph with procedures main and pow between Entry_global and Exit_global]
• Label interprocedural edges with parens
• Don’t accept paths with mismatched parens
[Figure: interprocedural edges labeled with the parenthesis pairs ( ), [ ], and { }]
Invalid: ( { ] )
Same-Level: ( { } )
Unbalanced-Left: ( {
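The parenthesis rule can be checked with a stack, as in the sketch below. The names `PathKind` and `classify_path` are hypothetical, the path is given as a string of its interprocedural edge labels, and a closing parenthesis with no matching open one is treated as invalid here, which is an assumption suited to context profiling only.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PATH_INVALID, PATH_SAME_LEVEL, PATH_UNBALANCED_LEFT } PathKind;

/* Return the closing parenthesis that matches an opening one,
 * or '\0' if the character is not an opening parenthesis. */
static char closer_for(char open) {
    switch (open) {
    case '(': return ')';
    case '[': return ']';
    case '{': return '}';
    default:  return '\0';
    }
}

/* Classify a path by the labels on its call-edges (open parentheses)
 * and return-edges (close parentheses). Spaces are ignored. */
PathKind classify_path(const char *labels) {
    char stack[256];                  /* pending calls; sketch-sized limit */
    size_t depth = 0;
    assert(labels != NULL);
    for (const char *p = labels; *p != '\0'; p++) {
        if (*p == ' ') {
            continue;
        }
        if (closer_for(*p) != '\0') {             /* call-edge */
            if (depth == sizeof stack) {
                return PATH_INVALID;              /* too deep for this sketch */
            }
            stack[depth++] = *p;
        } else if (*p == ')' || *p == ']' || *p == '}') {   /* return-edge */
            if (depth == 0 || closer_for(stack[depth - 1]) != *p) {
                return PATH_INVALID;              /* returns to the wrong call site */
            }
            depth--;
        } else {
            return PATH_INVALID;                  /* unknown label */
        }
    }
    return depth == 0 ? PATH_SAME_LEVEL : PATH_UNBALANCED_LEFT;
}
```

With this sketch, "( { ] )" is classified as invalid, "( { } )" as same-level, and "( {" as unbalanced-left, matching the three examples above.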
Interprocedural Cycles
• Complicate interprocedural profiling
Creating G*-fin
• Modify G*:
  – In each procedure:
    • Add GExit
    • Remove backedges
  – (Removes cycles in control-flow graphs)
[Figure: G* with the added vertex u8: GExit_pow]
Creating G*-fin
• Modify G* (cont.):
  – Remove recursive edges
  – (Removes cycles in call graph)
[Figure: recursive procedure R and main between Entry_global and Exit_global]
Observable Paths
• In G*-fin:
  – A finite number of unbalanced-left paths
  – Each unbalanced-left path defines an observable path: an item that we log in a profile
  – (Observable paths are unbalanced-left because they may end in the middle of a procedure)
Context-Prefix and Active-Suffix
[Figure: procedures S, Q, and R between Entry_global and Exit_global]
• Each path has
  – a context-prefix
  – an active-suffix
  – a counter
• The counter counts the executions of the active-suffix in the context of the context-prefix
Overview: Instrumentation
• Each procedure Q takes additional parameters:
  – pathNum (passed by reference)
  – numPathsExitQ (passed by value)
• On a procedure call to Q from P, calculate numPathsExitQ for the current context:
  – numPathsExitQ = ψ_r(numPathsExitP), where r is the call's return-site vertex
Overview: Instrumentation
E.g., Function call:
  numPathsExitPow = ψ_r(numPathsExitMain)
  pathNumOnEntryPow = pathNum
Overview: Instrumentation
E.g., Edge Traversal:
  pathNum += ρ_e(numPathsExitPow)
Overview: Instrumentation
E.g., Backedge Traversal:
  pathNum += ρ_e(numPathsExitPow)
  Profile[pathNum]++
  pathNum = pathNumOnEntryPow
  pathNum += ρ_e(numPathsExitPow)
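The parameter passing described in these instrumentation slides can be sketched on a tiny, loop-free program. Everything specific here is an assumption for illustration: procedure `q` has one two-way branch, `run_main` calls it once and then branches once more, and the return-site function (ψ_r(x) = 2x) and edge functions (ρ_e(x) = x or 0) were derived by hand for this shape.

```c
#include <assert.h>

/* Two choices in q times two choices in main: 4 observable paths. */
static unsigned long Profile[4];

/* Instrumented callee. pathNum is passed by reference; numPathsExitQ
 * (the number of paths from Exit_q onward in this calling context)
 * is passed by value. */
static void q(int takeB, unsigned long *pathNum, unsigned long numPathsExitQ) {
    if (takeB) {
        /* Edge function rho(x) = x: skip the x paths that go
         * through the other branch of q. */
        *pathNum += numPathsExitQ;
    }
    /* The other branch has rho(x) = 0, so nothing is added. */
}

void run_main(int takeB, int takeY) {
    unsigned long pathNum = 0;
    const unsigned long numPathsExitMain = 1;   /* Exit_main leads only to Exit_global */

    /* Two paths lead from the return site to Exit_main, so psi_r(x) = 2x. */
    unsigned long numPathsExitQ = 2 * numPathsExitMain;
    q(takeB, &pathNum, numPathsExitQ);

    if (takeY) {
        pathNum += numPathsExitMain;            /* rho(x) = x on the second edge */
    }
    assert(pathNum < 4);
    Profile[pathNum]++;
}
```

The four combinations of (takeB, takeY) map to path numbers 0 through 3. If `q` were called from a second site with a different continuation, the same callee code would receive a different numPathsExitQ, which is what lets one instrumented body serve every context.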
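Assigning ψ functions
• Solve the following equations:
$$
\psi_v =
\begin{cases}
\lambda x.\,x & \text{if } v \text{ is an } Exit_P \text{ vertex} \\
\lambda x.\,1 & \text{if } v \text{ is a } GExit_P \text{ vertex} \\
\psi_{Entry_Q} \circ \psi_r & \text{if } v \text{ is a call vertex } c \text{ to } Q \text{ with return-site vertex } r \\
\sum_{w \in \mathrm{succ}(v)} \psi_w & \text{otherwise}
\end{cases}
$$
where $f + g = \lambda x.\, f(x) + g(x)$.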
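Every function these equations can produce has the form a·x + b, because the identity, the constant 1, sums, and compositions of such functions stay in that form. The sketch below represents each function as a coefficient pair; the type `Affine` and the helper names are assumptions for illustration, not the authors' data structure.

```c
#include <assert.h>

/* The function x -> a*x + b, stored as its two coefficients. */
typedef struct {
    unsigned long a;
    unsigned long b;
} Affine;

/* lambda x. x : the function at an Exit vertex */
Affine affine_id(void) { return (Affine){1, 0}; }

/* lambda x. 1 : the function at a GExit vertex */
Affine affine_one(void) { return (Affine){0, 1}; }

/* (f + g)(x) = f(x) + g(x) : used to sum over successors */
Affine affine_add(Affine f, Affine g) {
    return (Affine){f.a + g.a, f.b + g.b};
}

/* (f o g)(x) = f(g(x)) = f.a * (g.a * x + g.b) + f.b : used at call vertices */
Affine affine_compose(Affine f, Affine g) {
    return (Affine){f.a * g.a, f.a * g.b + f.b};
}

unsigned long affine_apply(Affine f, unsigned long x) {
    return f.a * x + f.b;
}

/* Self-check: composing coefficients agrees with applying g, then f. */
void check_compose(Affine f, Affine g, unsigned long x) {
    assert(affine_apply(affine_compose(f, g), x) ==
           affine_apply(f, affine_apply(g, x)));
}
```

As a usage example, a vertex whose successors are Exit and GExit gets x + 1; an entry vertex with two such successors gets 2x + 2; composing that with a return-site function x + 1 gives 2x + 4 for the call vertex.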
Assigning ψ_call functions
numPaths[ExitPow] = ψ_rtn(numPaths[ExitMain])
numPaths[EntryPow] = ψ_entry(ψ_rtn(numPaths[ExitMain]))
numPaths[Call] = ψ_entry(ψ_rtn(numPaths[ExitMain]))
ψ_call = ψ_entry ∘ ψ_rtn
Discussion
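• Relationship of ψ functions to interprocedural DFA (e.g., Sharir and Pnueli’s φ functions):
$$
\psi_v =
\begin{cases}
\lambda x.\,x & \text{if } v \text{ is an } Exit_P \text{ vertex} \\
\lambda x.\,1 & \text{if } v \text{ is a } GExit_P \text{ vertex} \\
\psi_{Entry_Q} \circ \psi_r & \text{if } v \text{ is a call vertex } c \text{ to } Q \text{ with return-site vertex } r \\
\sum_{w \in \mathrm{succ}(v)} \psi_w & \text{otherwise}
\end{cases}
$$
[Equations: the analogous φ system, with the same case structure minus the GExit case: λx.x for an Exit_P vertex, φ_{Entry_Q} ∘ φ_r for a call vertex c to Q with return-site vertex r, and a combination of φ_w over w ∈ succ(v) otherwise]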
Conclusion
• Interprocedural Context Path Profiling
  – Still some difficulties:
    • Doubly exponential number of observable paths (but can “prune” paths)
    • Instrumentation is somewhat more costly (2 ops per edge instead of 1)
  – Need static analysis to find ψ and ρ functions
Conclusion
• Developed a toolkit of path-profiling techniques:
  – Interprocedural vs. intraprocedural
  – Edge functions vs. edge values
  – Context vs. piecewise
Possible Approaches
• Remove most call-edges and return-edges
• Inline every non-recursive procedure
• Duplicate every non-recursive procedure
• Parameterize instrumentation in each procedure to behave differently for different contexts
Handling Recursion
• Assign values to each edge Entry_global → Entry_{R_i}:
$$
\begin{cases}
0 & \text{if } i = 0 \\
\sum_{j < i} \psi_{Entry_{R_j}}(1) & \text{otherwise}
\end{cases}
$$
Handling Recursion
• Before a recursive call to R:
  – save pathNum (in pathNumBeforeCall)
  – set pathNum to the value on Entry_global → Entry_R
• After a recursive call:
  – update profile with pathNum
  – restore pathNum to its pre-call value (from pathNumBeforeCall)
Theory: Context-Free DAGs
• Let L be a context-free language over Σ
• Let G be a directed graph whose edges are labeled with members of Σ
• A path in G is an L-path if its word is in L
• (L,G) is a context-free DAG if the number of L-paths through G is finite
Interprocedural Piecewise Profiling
• Modification of context profiling:
  – For each procedure P:
    • Add the vertex GEntry_P
    • Add the edge Entry_global → GEntry_P
  – Replace each surrogate edge Entry_P → v with GEntry_P → v
• Use unbalanced-right-left paths in G*-fin
• (Must handle unbalanced-right paths)
Other Techniques
• Intraprocedural context path profiling
  – Context indicates the path taken to a loop header
• Hybrid techniques
  – Exploit parameterization of the instrumentation
Discussion
• Keeping the numbering dense:
  – The number of paths can be exponential in the size of the graph
  – Might require O(n) bits for pathNum
  – Dense numbering is important
  – Piecewise or hybrid may be more practical
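A small calculation illustrates the bit-width concern. The sketch assumes a hypothetical graph made of k two-way branches in sequence (so its size grows linearly in k) and counts the bits a dense numbering 0..numPaths-1 needs; both function names are made up for this example.

```c
#include <assert.h>

/* Number of acyclic paths through k two-way branches in sequence.
 * Each branch doubles the count, so the result is 2^k. */
unsigned long long paths_in_diamond_chain(unsigned k) {
    assert(k < 64);                       /* keep 2^k within 64 bits */
    unsigned long long numPaths = 1;      /* one path from Exit itself */
    for (unsigned d = 0; d < k; d++) {
        numPaths += numPaths;             /* two successors, equal counts */
    }
    return numPaths;
}

/* Bits needed to hold the largest path number, numPaths - 1,
 * when paths are numbered densely from 0. */
unsigned bits_for_path_numbers(unsigned long long numPaths) {
    assert(numPaths >= 1);
    unsigned bits = 0;
    unsigned long long maxNum = numPaths - 1;
    while (maxNum > 0) {
        bits++;
        maxNum >>= 1;
    }
    return bits;
}
```

Twenty branches in sequence already give 2^20 paths and need 20 bits; a numbering with gaps would need more, which is why density matters.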