Post on 22-Dec-2015
Presented by Krishna Balasubramanian
Lightweight Defect Localization for Java
Valentin Dallmeier, Christian Lindig, and Andreas Zeller

Contents
Introduction
– Coverage method for defect localization
– Why use Call Sequences?
Motivation: Experiment performed
– Defect Indicated by Call Sequences
Approach: Summarizing Call Sequences
– Deriving Call Sequences from Traces
– Going from Objects to Classes
– Incoming vs Outgoing Calls
– Collecting Traces
– Overhead of the method

Introduction
Defect localization is time-consuming
Coverage method: compare the coverage of passing and failing runs
A method executed only in failing runs is assumed to be defective
This might not always be the case
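
The coverage comparison above amounts to a set difference. A minimal sketch, assuming each run's coverage is already available as a set of method names; the class `CoverageDiff` and the method names in the example are hypothetical and not taken from the paper:

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: methods covered by the failing run but by no passing run. */
class CoverageDiff {
    static Set<String> onlyInFailing(Set<String> failing, List<Set<String>> passingRuns) {
        Set<String> suspects = new TreeSet<>(failing);
        for (Set<String> passing : passingRuns) {
            // Anything a passing run also executed is no longer suspicious.
            suspects.removeAll(passing);
        }
        return suspects;
    }

    public static void main(String[] args) {
        Set<String> failing = Set.of("A.init", "A.read", "B.parse");
        Set<String> passing = Set.of("A.init", "A.read");
        System.out.println(onlyInFailing(failing, List.of(passing))); // [B.parse]
    }
}
```

The result is only a list of candidates: a method that appears here need not contain the defect.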

Call Sequences Pointing to Defect
Certain failures occur only through a sequence of method calls tied to an object
For streams in Java:
– The destructor closes the stream after use if this has not been done already
– File handles run out if too many streams are left open
– Call sequence: read() --> finalize() -X-> close()
– Defect occurs
Indicated by a sequence of method calls.
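
To make the stream example concrete, here is a small sketch of how the per-object call sequence differs between a client that closes its stream and one that does not. `RecordingStream` is a hypothetical logging wrapper introduced only for this illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class StreamCallSequence {
    /** Hypothetical stream that logs the calls it receives. */
    static class RecordingStream extends ByteArrayInputStream {
        final List<String> calls = new ArrayList<>();

        RecordingStream(byte[] data) {
            super(data);
        }

        @Override
        public synchronized int read() {
            calls.add("read");
            return super.read();
        }

        @Override
        public void close() throws IOException {
            calls.add("close");
            super.close();
        }
    }

    /** Well-behaved client: try-with-resources guarantees close() after read(). */
    static List<String> readAndClose(RecordingStream in) {
        try (RecordingStream s = in) {
            s.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return in.calls;
    }

    /** Defective client: reads but never closes the stream explicitly. */
    static List<String> readAndLeak(RecordingStream in) {
        in.read();
        return in.calls;
    }

    public static void main(String[] args) {
        System.out.println(readAndClose(new RecordingStream(new byte[] {1}))); // [read, close]
        System.out.println(readAndLeak(new RecordingStream(new byte[] {1})));  // [read]
    }
}
```

Both clients execute read(), so method coverage alone cannot tell them apart; only the sequence of calls on the stream object does.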

3 Questions Explored
1. Sequences of method calls vs. single calls?
2. Per-object vs. global collection?
3. Defect indication in callee or caller?
Experiment conducted
– Instrumented a Java program
– Collected sequences on a per-object basis

Contributions
Results:
– Sequences of calls are better defect predictors than single calls
– Per-object sequences are better defect predictors than globally collected ones
– The caller is more likely to be defective than the callee
– Lightweight: performance comparable to coverage-based approaches

Defect Indication using Call Sequences
16 aspect Log {
17   pointcut assign(Object newval, Object targ):
18     set(* test..*) && args(newval) && target(targ);
19
20   before(Object newval, Object targ): assign(newval,targ) {
21     Signature sign = thisJoinPoint.getSignature();
22     System.out.println(targ.toString() + "." + sign.getName() +
23       ":=" + newval);
24   }
25
26   pointcut tracedCall():
27     call(* test..*(..)) && !within(Log);
28
29   after() returning (Object o): tracedCall() {
30     // Works if you comment out either of these two lines
31     thisJoinPoint.getSignature();
32     System.out.println(thisJoinPoint);
33   }
34 }
Figure 1: Part of an AspectJ program that causes the Java Virtual Machine to crash
JVM crashes when the AspectJ program is compiled -> rfail
The AspectJ compiler has 2929 classes
Passing run: comment out line 32 -> rpass
Compare rfail and rpass
Method getThisJoinPointVar() of class BcelShadow is called only in rfail
BcelShadow.getThisJoinPointVar() is a potential candidate for the defect
This proved to be incorrect
The bug was fixed elsewhere

Defect Indication using Call Sequences
Failure -> a sequence of method calls that occurs only in rfail
Sequences are collected per object for incoming and outgoing calls
Sequence of outgoing calls for a ThisJoinPointVisitor object, collected only in rfail:
ThisJoinPointVisitor.isRef(),
ThisJoinPointVisitor.canTreatAsStatic(),
MethodDeclaration.traverse(),
ThisJoinPointVisitor.isRef(),
ThisJoinPointVisitor.isRef()
The AspectJ bug was fixed in class ThisJoinPointVisitor!
A difference in coverage may not point to a defect
A difference in call sequences may do so.

Performance Measures
The comparison yielded 556 differing sequences of length 5.
The originating class of each sequence is determined.
Each sequence is assigned a weight.
Classes with the most important sequences are ranked at the top.
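
The ranking step can be sketched as follows. The slide does not say how weights are computed or combined, so this sketch takes the per-sequence weights as given input and assumes that a class is scored by the average weight of its sequences. `ClassRanking`, `WeightedSequence`, and the class names in the example are hypothetical:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClassRanking {
    /** A differing sequence reduced to its originating class and an assumed weight. */
    static final class WeightedSequence {
        final String originClass;
        final double weight;

        WeightedSequence(String originClass, double weight) {
            this.originClass = originClass;
            this.weight = weight;
        }
    }

    /** Ranks classes by average sequence weight (assumption), highest first. */
    static List<String> rank(List<WeightedSequence> sequences) {
        Map<String, double[]> sumAndCount = new HashMap<>();
        for (WeightedSequence s : sequences) {
            double[] acc = sumAndCount.computeIfAbsent(s.originClass, c -> new double[2]);
            acc[0] += s.weight; // running sum of weights
            acc[1] += 1;        // number of sequences seen for this class
        }
        List<String> classes = new ArrayList<>(sumAndCount.keySet());
        classes.sort((a, b) -> {
            double avgA = sumAndCount.get(a)[0] / sumAndCount.get(a)[1];
            double avgB = sumAndCount.get(b)[0] / sumAndCount.get(b)[1];
            int byScore = Double.compare(avgB, avgA);
            return byScore != 0 ? byScore : a.compareTo(b); // ties broken by name
        });
        return classes;
    }

    public static void main(String[] args) {
        List<WeightedSequence> differing = List.of(
            new WeightedSequence("Visitor", 1.0),
            new WeightedSequence("Shadow", 0.5),
            new WeightedSequence("Visitor", 0.8));
        System.out.println(rank(differing)); // [Visitor, Shadow]
    }
}
```

The programmer would then inspect classes from the top of the returned list.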

Observations
Class ThisJoinPointVisitor is ranked 10 out of 542 executed classes
The programmer has to examine only:
– 1.8% of the executed classes
– 3.3% of the executed code
– 0.3% of the total classes
– 0.8% of the entire code
With the coverage method, class ThisJoinPointVisitor is not among the candidates
This is worse than a random guess!

Approach: Summarizing Call Sequences
An object receives and initiates millions of method calls
A means is required to capture and summarize them.
Approach:
– Record observed sequences in sets, not the full trace.
– Sequence sets are aggregated per class
– Incoming and outgoing calls are considered
– Overhead in collecting and analyzing traces is kept to a minimum.

Deriving Call Sequences from Traces
Trace: a recording of all calls an object receives
Recording the whole trace becomes unmanageable due to the large number of calls received by the object
Abstract representation:
– Record only characteristic sequences of the trace

Deriving Call Sequences from Traces
Figure 2: The trace of an object is abstracted to a sequence set using a sliding window.
1. A window is slid over the trace to get the sequence set.
2. The window content characterizes the trace.
3. A wider window gives a more precise characteristic set.

Deriving Call Sequences from Traces: Effect of Window Size
Let trace S be defined as a string of calls: S = <m1, m2, …, mn>
For window size k, the sequence set P(S, k) is:
P(S, k) = {w | w is a substring of S ∧ |w| = k}
Example, window size k = 2:
Trace S = <abcabcdc>
Then the resulting set of sequences is P(S, 2) = {ab, bc, ca, cd, dc}
Different traces may lead to the same set
For trace T = <abcdcdca>, P(S, 2) = P(T, 2)
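
The definition of P(S, k) translates directly into a sliding window over the trace. A minimal sketch, assuming a trace is encoded as a string in which each character stands for one call, as in the slide's example; the class name `SequenceSets` is hypothetical:

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SequenceSets {
    /** P(S, k): the set of all substrings of length k of the trace. */
    static Set<String> of(String trace, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("window size must be positive");
        }
        Set<String> result = new LinkedHashSet<>();
        // Slide a window of width k over the trace, one call at a time.
        for (int i = 0; i + k <= trace.length(); i++) {
            result.add(trace.substring(i, i + k));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(of("abcabcdc", 2)); // [ab, bc, ca, cd, dc]
        System.out.println(of("abcdcdca", 2)); // [ab, bc, cd, dc, ca]
        System.out.println(of("abcabcdc", 2).equals(of("abcdcdca", 2))); // true
    }
}
```

The two traces differ, yet their sequence sets for k = 2 are equal as sets, which is the loss of information the example points out.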

Deriving Call Sequences from Traces: Effect of Window Size
Trace -> sequence set: entails a loss of information
Window size is important
For window size k > 2, P(S, k) != P(T, k)
The context sensitivity of the approach depends on the window size.
Exponential growth of sequence sets does not happen
– Method calls do not happen randomly.
– They are part of static code with loops that lead to similar sequences of calls.
The underlying regularity makes sequence sets a useful and compact abstraction
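
The effect of the window size can be checked on the two example traces S = <abcabcdc> and T = <abcdcdca>. This is a small sketch under the same one-character-per-call encoding; `WindowSizeDemo` is a hypothetical name:

```java
import java.util.Set;
import java.util.TreeSet;

class WindowSizeDemo {
    /** Sequence set of a trace for window size k. */
    static Set<String> sequenceSet(String trace, int k) {
        Set<String> result = new TreeSet<>();
        for (int i = 0; i + k <= trace.length(); i++) {
            result.add(trace.substring(i, i + k));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "abcabcdc";
        String t = "abcdcdca";
        for (int k = 1; k <= 4; k++) {
            boolean same = sequenceSet(s, k).equals(sequenceSet(t, k));
            System.out.println("k=" + k + " equal=" + same);
        }
        // Prints equal=true for k=1 and k=2, equal=false for k=3 and k=4.
    }
}
```

For k = 3 the sets are {abc, bca, cab, bcd, cdc} and {abc, bcd, cdc, dcd, dca}, so the wider window separates the two traces.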

Going from Objects to Classes
Collecting call traces of objects individually poses memory issues
Traces are summarized at the class level
Objects are traced individually
Sequence sets of objects are aggregated
Traces X and Y of two objects are:
X = <a b c d d c>; Y = <a a b c a b>
P(X, 2) = {ab, bc, cd, dd, dc}
P(Y, 2) = {ab, bc, ca, aa}
P(X, 2) U P(Y, 2) = {aa, ab, bc, cd, dd, dc, ca}
The union characterizes the behavior of the class.
To compare a class across passing and failing runs, its sequence sets are compared.
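
Aggregation and comparison can be sketched as a union followed by a symmetric difference. The sketch below reuses the traces X and Y from the slide; treating X alone as the "passing" run is an invented scenario for illustration, and `ClassSequenceSets` is a hypothetical name:

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ClassSequenceSets {
    static Set<String> sequenceSet(String trace, int k) {
        Set<String> result = new TreeSet<>();
        for (int i = 0; i + k <= trace.length(); i++) {
            result.add(trace.substring(i, i + k));
        }
        return result;
    }

    /** Union of the sequence sets of all traced objects of one class. */
    static Set<String> aggregate(List<String> objectTraces, int k) {
        Set<String> union = new TreeSet<>();
        for (String trace : objectTraces) {
            union.addAll(sequenceSet(trace, k));
        }
        return union;
    }

    /** Sequences that occur in exactly one of the two runs. */
    static Set<String> differing(Set<String> passing, Set<String> failing) {
        Set<String> either = new TreeSet<>(passing);
        either.addAll(failing);
        Set<String> both = new TreeSet<>(passing);
        both.retainAll(failing);
        either.removeAll(both);
        return either;
    }

    public static void main(String[] args) {
        Set<String> failing = aggregate(List.of("abcddc", "aabcab"), 2);
        System.out.println(failing); // [aa, ab, bc, ca, cd, dc, dd]
        Set<String> passing = aggregate(List.of("abcddc"), 2); // invented passing run
        System.out.println(differing(passing, failing)); // [aa, ca]
    }
}
```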

Incoming vs Outgoing Calls
An object receives incoming calls
– The trace says how the object is used by its clients
An object initiates outgoing calls
– The trace says how the object is implemented
Both traces are used to detect control flow deviations between a passing and a failing run
However, they differ in their ability to relate deviations to defects.

Incoming vs Outgoing Calls: An Example
Figure 3: Traces of incoming calls (left) and outgoing calls (right) for the aQueue object.
Queue object aQueue is implemented as a linked list
Incoming calls are:
– add()
– get()
Outgoing calls are:
– add()
– firstElement()
– removeFirst()

Incoming Calls
Incoming calls reflect client behavior
They detect non-conforming clients.
Client behavior is recorded as sequence sets
The class of the receiving object is known, so only method names are recorded.
Trace of incoming calls for the aQueue object: <add(), isEmpty(), . . . , add(), add()>
Deviation is detected …
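
Detecting a non-conforming client can be sketched as looking for windows in an observed trace that never occurred in the learned one. The traces in the example are invented for illustration and `IncomingCallMonitor` is a hypothetical name:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IncomingCallMonitor {
    /** All windows of k consecutive method names in the trace. */
    static Set<List<String>> windows(List<String> trace, int k) {
        Set<List<String>> result = new LinkedHashSet<>();
        for (int i = 0; i + k <= trace.size(); i++) {
            result.add(new ArrayList<>(trace.subList(i, i + k)));
        }
        return result;
    }

    /** Windows of the observed trace that never occur in the learned trace. */
    static Set<List<String>> deviations(List<String> learned, List<String> observed, int k) {
        Set<List<String>> unseen = windows(observed, k);
        unseen.removeAll(windows(learned, k));
        return unseen;
    }

    public static void main(String[] args) {
        List<String> learned = List.of("add", "isEmpty", "get", "add", "get");
        List<String> observed = List.of("add", "add", "get");
        System.out.println(deviations(learned, observed, 2)); // [[add, add]]
    }
}
```

Only method names are stored because, for incoming calls, the receiving class is already known.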

Incoming Calls
Advantages
The number of methods an object can receive calls for is restricted by its class
– Leads to smaller traces
– Window size can be tuned to the number of methods
Class behavior can be learnt across different applications
Limitation
Difficult to identify the client that causes the deviation
This led to analyzing outgoing calls

Outgoing Calls
Method calls for aQueue: <LinkedList.add(), LinkedList.size(), Logger.add(), . . . >
The object calls several classes
Method names are no longer unique
Class name and method name are recorded in the trace
Detection of a sequence not in the learned set leads to the Queue class
The trace of outgoing calls guides the programmer to the defect

Collecting Traces
Before execution of the program, its bytecode is instrumented
The instrumented program collects traces, computes the sequence sets, and emits them in XML format
Analysis of the sequence sets is done offline
The Bytecode Engineering Library (BCEL) is used for instrumentation
Requires only the program's class files
Works on any JVM

Collecting Traces
Each object builds a trace of its incoming and outgoing calls
Trace data is stored in global hash tables
Indexed by an object's identity
Each object creates a unique integer for identification in its constructor
Incoming call
– Callee adds its name and signature to its own trace
Outgoing call
– Callee adds its name, signature and class to the caller's trace
– Requires the caller's id for this

Example of instrumentation to trace outgoing calls

class Caller extends Object {
  ...
  public void m() {
    Callee c;
    ...
    Tracer.storeCaller(this.id);
    c.message(anObject);
    { body of m }
  }
}

class Callee extends Object {
  ...
  public void message(Object x) {
    Tracer.addCall({ message id for Callee.message });
    { body of message }
  }
}

Figure 4: Instrumentation of caller and callee to capture outgoing calls.
1. The id of the caller is stored in Tracer in method Caller.m before the invocation of Callee.message
2. At the start of Callee.message, Tracer.addCall adds the method id of Callee.message to the trace of the calling object
3. Hence addCall only receives the message id
4. This is an integer key associated with a method, its class, and signature.
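
The protocol in Figure 4 can be made runnable with a small stand-in tracer. This is a single-threaded sketch and not the authors' implementation: the internals of `Tracer`, the constant `MESSAGE_ID`, and the `outgoingTrace` accessor are assumptions; only the names `storeCaller` and `addCall` come from the figure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical tracer: a global hash table of outgoing traces, indexed by object id. */
class Tracer {
    private static final Map<Integer, List<Integer>> outgoing = new HashMap<>();
    private static int currentCaller = -1; // id stored by the most recent caller

    static void storeCaller(int callerId) {
        currentCaller = callerId;
    }

    /** Called at the start of the callee; appends the method id to the caller's trace. */
    static void addCall(int messageId) {
        outgoing.computeIfAbsent(currentCaller, id -> new ArrayList<>()).add(messageId);
    }

    static List<Integer> outgoingTrace(int objectId) {
        return outgoing.getOrDefault(objectId, List.of());
    }
}

class Callee {
    static final int MESSAGE_ID = 1; // assumed integer key for Callee.message

    void message(Object x) {
        Tracer.addCall(MESSAGE_ID);
        // body of message
    }
}

class Caller {
    final int id; // unique integer assigned in the constructor
    private final Callee c = new Callee();

    Caller(int id) {
        this.id = id;
    }

    void m() {
        Tracer.storeCaller(this.id);
        c.message("anObject");
        // body of m
    }
}
```

Calling `m()` twice on `new Caller(7)` leaves the trace `[1, 1]` under id 7. A real tracer would also have to cope with multiple threads and nested calls, which this sketch ignores.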

Generating Sequence Sets
The original trace is not stored due to its large size
The sequence set for each class is computed online
Sequence sets are
– Small in size
– Kept in memory
– Emitted when the program quits
Limitations
– Window size must be fixed for a program run
  Sequence sets for many window sizes could be computed offline from a raw trace
– Trace is ordered, sequence set is not
  Some of the trace's inherent notion of time is lost
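
Computing a sequence set online only requires remembering the last k calls. A minimal sketch, assuming calls arrive one at a time as strings; `OnlineSequenceSet` and its methods are hypothetical names:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OnlineSequenceSet {
    private final int k;                                   // fixed for the whole run
    private final Deque<String> window = new ArrayDeque<>();
    private final Set<List<String>> sequences = new LinkedHashSet<>();

    OnlineSequenceSet(int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("window size must be positive");
        }
        this.k = k;
    }

    /** Records one call; only the last k calls are kept, never the full trace. */
    void record(String call) {
        window.addLast(call);
        if (window.size() > k) {
            window.removeFirst();
        }
        if (window.size() == k) {
            sequences.add(new ArrayList<>(window));
        }
    }

    Set<List<String>> sequences() {
        return Collections.unmodifiableSet(sequences);
    }

    public static void main(String[] args) {
        OnlineSequenceSet set = new OnlineSequenceSet(2);
        for (String call : "a b c a b c d c".split(" ")) {
            set.record(call);
        }
        System.out.println(set.sequences()); // [[a, b], [b, c], [c, a], [c, d], [d, c]]
    }
}
```

Memory use is bounded by the window plus the set of distinct sequences, which is why the window size has to be chosen before the run.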

Evaluation of Overhead
Evaluation was done on programs from the SPEC JVM 98 benchmark suite
SPEC JVM 98 benchmark suite
– Collection of Java programs
– Deployed as 543 class files
– Total size of 1.48 MB
Overhead was compared with JCoverage, a tool for coverage analysis

Overhead of Instrumentation
Instrumenting to trace incoming calls ran at 100 KB, or 38 class files, per second
Instrumented class files increased in size by 26%
Running the instrumented program takes longer and requires more memory than the original program
Tracing the AspectJ compiler with a window size of 8 gave a modest overhead
This is considered more typical for the approach