02/13/2007 1
Indexing Noncrashing Failures: A Dynamic Program Slicing-Based Approa
ch
Chao Liu, Xiangyu Zhang, Jiawei Han, Yu Zhang, Bharat K. Bhargava
University of Illinois at Urbana-ChampaignPurdue University
Supported by NSF 0242840, 0219110
2
Overview
Problem: Automatically cluster program failures that are due
to the same bug. Solution:
By looking at the similarity between the dynamic slices of program failures.
3
Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion
4
Automated Failure Reporting End-users as Beta testers
Valuable information about failure occurrences in reality 24.5 million/day in Redmond (if all users send)
– John Dvorak, PC Magazine Widely adopted because of its usefulness
Microsoft Windows, Linux Gentoo, Mozilla applications … Any applications can implement this functionality
5
Failure Report
Automatic reports (windows/mozilla) Application name, version (e.g., winword.exe) . Module name, version (e.g., mso.dll) Offset into module (for example, 00003cbb). Calling context.
Manual reports (bugzilla) Textual description of the symptoms Failure inducing input
6
After Failures Collected …
Failure triage Failure prioritization:
What are the most severe bugs? Worst 1% bugs = 50% failures
Duplicate failure removal Same failures can be reported multiple times
Patch suggestion Automatically locating the patch by querying the patch
database with the reported failure
7
Cluster failure reports that may correspond to the same fault .
A Solution: Failure Indexing
Most Severe
Less Severe
Least Severe
Failure Reports
Failure Reports
++++ ++
++ +++++
++ ++++
++
+
X
Y
0
8
Current Status of Failure Indexing Great success in indexing crashing failures
Same crashing venues likely imply the same failure
E.g., Microsoft Dr. Watson System, Mozilla Quality Feedback Agent …
Elusive: How to index noncrashing failures Noncrashing failures are mainly due to semantic b
ugs Hard to index because crashing contexts not avail
able anymore
9
Noncrashing Failures
Examples. Unwanted dialogs. Undesired visual outputs, e.g. colors, layouts. Periodical loss of focus. Periodical loss of connection. Abnormal memory consumption. Abnormal performance.
Caused by semantic bugs.
10
Semantic Bugs Dominate
16%
3%
78%
3%
Semantic Bugs:• Application specific• Only few are detectable• Mostly require annotations or specifications
Memory-related Bugs:• Many are detectable
Others
Concurrency bugs
Bug Distribution [Li et al., ICSE’07]• 264 bugs in Mozilla and 98 bugs in Apache manually checked• 29,000 bugs in Bugzilla automatically checked
Courtesy of Zhenmin Li
11
Existing Approaches to Indexing Noncrashing Failures T-Proximity [Podgurski et al., ICSE 2003]
Failures exhibiting similar behaviors (e.g., similar branchings) are indexed together
Entire execution is considered R-Proximity [Liu and Han, FSE 2006]
Failures likely due to the same bug are indexed together
Bug location for each failure is automatically found through statistical debugging tool SOBER [Liu et al., FSE 2005]
12
Comments on Existing Approaches Ideal Solution (possible through manual effort)
Index by root causes (i.e., the exact fault location) Finding root causes for every failure is exactly what failure indexi
ng wants to circumvent T-Proximity
Indexing based on the entire execution But usually only a small part of an execution is failure-relevant
R-Proximity Indexing by likely fault location – failure-relevant Better quality than T-Proximity, but requires a set of passing exec
utions to find the likely fault location Theme of this paper
Can we index noncrashing failures as effectively as R-Proximity without any successful executions?
13
Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion
14
Failure Indexing in Formulation A failure indexing technique is a function pair
: Signature function that represents a failing execution in certain ways
: Distance function that calculates the dissimilarity between two failure signatures
Indexing result A proximity matrix where the (i, j) cell is the dissimilarity bet
ween failure and , i.e.,
Failures and are indexed together if is small
),( D
D
))(),((),(),( jiD xxDjiM ),(),( jiM D
ix jx
ix jx
15
Metrics for Indexing Effectiveness No quantitative metric for indexing effectiveness exi
sts Indexing effectiveness
Cohesion: To what extent failures due to the same bug are close to each other
Separation: To what extent failures due to different bug are separated from each other
Silhouette coefficient A measure adapted from data mining A value ranges from -1 to 1, the higher the better More details in paper (Section 2.2)
16
Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion
18
Dynamic Slicing
Full dynamic slice (FS) is the set of statements that DID affect the value of a variable at a program point for ONE specific execution. [Korel and
Laski, 1988]
……
10. A =
…...
20. B =
……
30. P =
31. If (P<0) {
......
35. A = A + 1
36. }
37. B=B+1
……
40. Error(A)
FS (A@40) = {10, 30, 35, 40}FS (A@40) = {10, 30, 35, 40}
19
Data Slicing
Full dynamic slice (FS) is the set of statements that DID affect the value of a variable at a program point for ONE specific execution. [Korel and Laski, 1988]
Data slice (DS): only data dependence is considered.
……
10. A =
…...
20. B =
……
30. P =
31. If (P<0) {
......
35. A = A + 1
36. }
37. B=B+1
……
40. Error(A)DS (A@40) = {10, 35, 40}DS (A@40) = {10, 35, 40}
20
Distance between Dynamic Slices For any two non-empty dynamic slices an
d of the same program, the distance between them is
||
||1),(
ji
jiji ee
eeeeD
ieje
5.0|}40,35,31,10{|
|}40,10{|
||
||1),(
}40,31,10{},40,35,10{
21
2121
21
ee
eeeeD
ee
21
Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion
22
Experiment Result
Experiment setup Benchmark (gzip 1.2.3) obtained from the Softwar
e-artifact Infrastructure Repository (SIR from Nebraska Lincoln), together with a test suite
6,184 lines of C code Ground-truth determination
group 1group 2
group 1 &2
-
23
Two Semantic Bugs in Gzip-1.2.3
Ground Truth: 217 input test cases (executions) in total 82 cases fail due to both faults, no crashes 65 fail due to Fault 1, 17 fail due to Fault 2
deflate.c/*Fault 1*/
/*Fault 2*/
24
Indexing Result R-Proximity is the mos
t effective Expected because it u
ses information from both passing and failing executions
T-Proximity is the worst Expected because it e
ssentially indexes the entire execution, rather than the failure relevant part
FS-Proximity and DS-proximity More effective than T-
Proximity because indexing on failure-relevant information
Less effective than R-Proximity because of no access to passing executions
Red crosses are for failures due to Fault 1 Blue circles are for failures due to Fault 2
Proximity Graph(PG): the axes are meaningless, if two objects are distant in the PG, they are distant in their original space
25
Indexing Result- A Closer Look (1) Data slices can
precisely capture the error propagation mechanism of Fault two.
Red crosses are for failures due to Fault 1 Blue circles are for failures due to Fault 2
26
Indexing Result- A Closer Look (2)
Data slices can precisely capture the two different error propagation mechanisms of Fault 1
Red crosses are for failures due to Fault 1 Blue circles are for failures due to Fault 2
27
Observations
Dynamic slicing based failure proximity is more effective than T-Proximity
DS-Proximity is more accurate than FS-Proximity
DS-Proximity is able to produce more cohesive individual clusters. However, clusters belong to the same bug may be
distant due to the different error propagations. Not as good as R-Proximity But does not require passing reports.
28
Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion
29
Conclusions
Indexing noncrashing failures An increasingly important question as crashing failures are
tackled more and more nicely Not intensively studied yet
Dynamic slicing-based failure indexing Effective and does not rely on passing executions
A framework to develop and evaluate more indexing techniques Decomposition of an indexing technique into signature func
tion and distance function – Many instantiations Quantitative evaluation metrics for scientific study