
1

Indexing Noncrashing Failures: A Dynamic Program Slicing-Based Approach

Chao Liu, Xiangyu Zhang, Jiawei Han, Yu Zhang, Bharat K. Bhargava

University of Illinois at Urbana-Champaign; Purdue University

Supported by NSF 0242840, 0219110

2

Overview

Problem: Automatically cluster program failures that are due to the same bug.

Solution: Cluster failures by the similarity between the dynamic slices of the failing executions.

3

Outline

Motivation

Failure Indexing in Formulation

Dynamic Slicing-Based Failure Indexing

Experiments

Conclusion

4

Automated Failure Reporting

End-users as beta testers: valuable information about failure occurrences in the field; 24.5 million reports/day in Redmond (if all users send) – John Dvorak, PC Magazine.

Widely adopted because of its usefulness: Microsoft Windows, Linux Gentoo, Mozilla applications, …

Any application can implement this functionality.

5

Failure Report

Automatic reports (Windows/Mozilla): application name and version (e.g., winword.exe); module name and version (e.g., mso.dll); offset into the module (e.g., 00003cbb); calling context (written out as a record sketch below).

Manual reports (Bugzilla): textual description of the symptoms; failure-inducing input.
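The automatic report fields above, written out as a minimal record type; a sketch in Python, with field names that are illustrative rather than taken from any real reporting tool:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class FailureReport:
        app_name: str          # e.g., "winword.exe"
        app_version: str
        module_name: str       # e.g., "mso.dll"
        module_version: str
        module_offset: int     # offset into the module, e.g., 0x00003cbb
        call_stack: List[str]  # calling context at the time of failure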

6

After Failures Are Collected …

Failure triage:

Failure prioritization: what are the most severe bugs? The worst 1% of bugs account for 50% of failures.

Duplicate failure removal: the same failure can be reported multiple times.

Patch suggestion: automatically locate a patch by querying the patch database with the reported failure.

7

A Solution: Failure Indexing

Cluster failure reports that may correspond to the same fault.

[Figure: failure reports plotted as points in a 2-D space (axes X and Y); clusters are labeled Most Severe, Less Severe, and Least Severe]

8

Current Status of Failure Indexing

Great success in indexing crashing failures: the same crashing venue likely implies the same failure. E.g., Microsoft Dr. Watson System, Mozilla Quality Feedback Agent, …

Elusive: how to index noncrashing failures. Noncrashing failures are mainly due to semantic bugs, and they are hard to index because crashing contexts are no longer available.

9

Noncrashing Failures

Examples: unwanted dialogs; undesired visual outputs (e.g., colors, layouts); periodic loss of focus; periodic loss of connection; abnormal memory consumption; abnormal performance.

Caused by semantic bugs.

10

Semantic Bugs Dominate

[Pie chart of the bug distribution: 78% semantic bugs, 16% memory-related bugs, 3% concurrency bugs, 3% others]

Semantic bugs: application specific; only a few are detectable; mostly require annotations or specifications.

Memory-related bugs: many are detectable.

Bug distribution [Li et al., ICSE’07]: 264 bugs in Mozilla and 98 bugs in Apache manually checked; 29,000 bugs in Bugzilla automatically checked.

Courtesy of Zhenmin Li

11

Existing Approaches to Indexing Noncrashing Failures

T-Proximity [Podgurski et al., ICSE 2003]: failures exhibiting similar behaviors (e.g., similar branchings) are indexed together; the entire execution is considered.

R-Proximity [Liu and Han, FSE 2006]: failures likely due to the same bug are indexed together; the bug location for each failure is found automatically with the statistical debugging tool SOBER [Liu et al., FSE 2005].

12

Comments on Existing Approaches

Ideal solution (possible through manual effort): index by root causes (i.e., the exact fault locations). But finding the root cause of every failure is exactly what failure indexing wants to circumvent.

T-Proximity: indexes based on the entire execution, but usually only a small part of an execution is failure-relevant.

R-Proximity: indexes by the likely fault location, which is failure-relevant; better quality than T-Proximity, but requires a set of passing executions to find the likely fault location.

Theme of this paper: can we index noncrashing failures as effectively as R-Proximity without any passing executions?

13

Outline

Motivation

Failure Indexing in Formulation

Dynamic Slicing-Based Failure Indexing

Experiments

Conclusion

14

Failure Indexing in Formulation

A failure indexing technique is a function pair (σ, D):

σ: a signature function that represents a failing execution in a certain way.

D: a distance function that calculates the dissimilarity between two failure signatures.

Indexing result: a proximity matrix M whose (i, j) cell is the dissimilarity between failures x_i and x_j, i.e., M(i, j) = D(σ(x_i), σ(x_j)).

Failures x_i and x_j are indexed together if M(i, j) is small.
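A minimal sketch of this formulation in Python; proximity_matrix, sig, and dist are illustrative names, not from the paper:

    def proximity_matrix(failures, sig, dist):
        """M[i][j] = dist(sig(x_i), sig(x_j)) over all failing runs."""
        signatures = [sig(x) for x in failures]
        n = len(signatures)
        return [[dist(signatures[i], signatures[j]) for j in range(n)]
                for i in range(n)]

Any concrete indexing technique is one choice of sig (e.g., a dynamic slice, as below) and dist (e.g., the slice distance defined later).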

15

Metrics for Indexing Effectiveness

No quantitative metric for indexing effectiveness exists.

Indexing effectiveness: cohesion (to what extent failures due to the same bug are close to each other) and separation (to what extent failures due to different bugs are separated from each other).

Silhouette coefficient: a measure adapted from data mining; its value ranges from -1 to 1, the higher the better. More details in the paper (Section 2.2).
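As a rough illustration, one common form of the silhouette coefficient can be computed from the proximity matrix M and a list of cluster labels (one per failure); the paper's exact variant (Section 2.2) may differ:

    def silhouette(M, labels):
        """Average silhouette over all failures. Assumes at least two
        clusters; a singleton cluster scores 0 by convention."""
        n = len(labels)
        clusters = set(labels)
        scores = []
        for i in range(n):
            own = [M[i][j] for j in range(n) if labels[j] == labels[i] and j != i]
            if not own:
                scores.append(0.0)  # singleton cluster
                continue
            a = sum(own) / len(own)  # cohesion: mean distance within own cluster
            b = min(sum(M[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
                    for c in clusters if c != labels[i])  # separation: nearest other cluster
            scores.append((b - a) / max(a, b))
        return sum(scores) / n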

16

Outline

Motivation

Failure Indexing in Formulation

Dynamic Slicing-Based Failure Indexing

Experiments

Conclusion

17

Dynamic Slicing-Based Failure Indexing

Dynamic slicing serves as the failure signature function.

18

Dynamic Slicing

Full dynamic slice (FS) is the set of statements that DID affect the value of a variable at a program point for ONE specific execution. [Korel and Laski, 1988]

……

10. A =

…...

20. B =

……

30. P =

31. If (P<0) {

......

35. A = A + 1

36. }

37. B=B+1

……

40. Error(A)

FS(A@40) = {10, 30, 31, 35, 40}

(Statement 35 is control dependent on the branch at line 31, whose predicate uses P defined at line 30, so both belong to the full slice.)

19

Data Slicing

Full dynamic slice (FS) is the set of statements that DID affect the value of a variable at a program point for ONE specific execution. [Korel and Laski, 1988]

Data slice (DS): only data dependence is considered.

……

10. A =

…...

20. B =

……

30. P =

31. If (P<0) {

......

35. A = A + 1

36. }

37. B=B+1

……

40. Error(A)

DS(A@40) = {10, 35, 40}
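To make the two slice flavors concrete, here is a small Python sketch that computes both slices of the example as backward closures over dynamic dependence edges; the edge list is a hand-written encoding of the example above, not tool output:

    # (user, definer, kind) edges for the run where the branch at 31 is taken
    deps = [
        (40, 35, "data"),    # Error(A) uses A defined at line 35
        (35, 10, "data"),    # A = A + 1 uses A defined at line 10
        (35, 31, "control"), # line 35 executes because of the branch at 31
        (31, 30, "data"),    # the predicate at 31 uses P defined at line 30
    ]

    def slice_of(criterion, deps, kinds):
        """Backward closure over the chosen dependence kinds."""
        result, frontier = {criterion}, [criterion]
        while frontier:
            stmt = frontier.pop()
            for user, definer, kind in deps:
                if user == stmt and kind in kinds and definer not in result:
                    result.add(definer)
                    frontier.append(definer)
        return sorted(result)

    print(slice_of(40, deps, {"data", "control"}))  # FS: [10, 30, 31, 35, 40]
    print(slice_of(40, deps, {"data"}))             # DS: [10, 35, 40]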

20

Distance between Dynamic Slices

For any two non-empty dynamic slices e_i and e_j of the same program, the distance between them is

D(e_i, e_j) = 1 - |e_i ∩ e_j| / |e_i ∪ e_j|

Example: for e_1 = {10, 35, 40} and e_2 = {10, 31, 40},

D(e_1, e_2) = 1 - |{10, 40}| / |{10, 31, 35, 40}| = 1 - 2/4 = 0.5
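A direct transcription of this distance (the Jaccard distance between slices, treated as sets of statement line numbers):

    def slice_distance(e_i, e_j):
        """D(e_i, e_j) = 1 - |intersection| / |union| for non-empty slices."""
        e_i, e_j = set(e_i), set(e_j)
        return 1.0 - len(e_i & e_j) / len(e_i | e_j)

    print(slice_distance({10, 35, 40}, {10, 31, 40}))  # 0.5, the worked example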

21

Outline

Motivation

Failure Indexing in Formulation

Dynamic Slicing-Based Failure Indexing

Experiments

Conclusion

22

Experiment Results

Experiment setup: the benchmark (gzip 1.2.3) was obtained from the Software-artifact Infrastructure Repository (SIR, University of Nebraska-Lincoln), together with a test suite; 6,184 lines of C code.

Ground-truth determination:

[Figure: failing runs partitioned into group 1, group 2, and group 1 & 2]

23

Two Semantic Bugs in Gzip-1.2.3

Ground truth: 217 input test cases (executions) in total; 82 cases fail due to the two faults, with no crashes; of these, 65 fail due to Fault 1 and 17 due to Fault 2.

[Code excerpt from deflate.c showing the locations of Fault 1 and Fault 2]

24

Indexing Result

R-Proximity is the most effective. Expected, because it uses information from both passing and failing executions.

T-Proximity is the worst. Expected, because it essentially indexes the entire execution rather than the failure-relevant part.

FS-Proximity and DS-Proximity: more effective than T-Proximity because they index on failure-relevant information; less effective than R-Proximity because they have no access to passing executions.

Red crosses are failures due to Fault 1; blue circles are failures due to Fault 2.

Proximity graph (PG): the axes are meaningless; if two objects are distant in the PG, they are distant in their original space.

25

Indexing Result: A Closer Look (1)

Data slices precisely capture the error propagation mechanism of Fault 2.

Red crosses are failures due to Fault 1; blue circles are failures due to Fault 2.

26

Indexing Result: A Closer Look (2)

Data slices precisely capture the two different error propagation mechanisms of Fault 1.

Red crosses are failures due to Fault 1; blue circles are failures due to Fault 2.

27

Observations

Dynamic slicing-based failure proximity is more effective than T-Proximity.

DS-Proximity is more accurate than FS-Proximity: it produces more cohesive individual clusters. However, clusters belonging to the same bug may still be distant due to different error propagations.

Not as good as R-Proximity, but it does not require passing reports.

28

Outline

Motivation

Failure Indexing in Formulation

Dynamic Slicing-Based Failure Indexing

Experiments

Conclusion

29

Conclusions

Indexing noncrashing failures: an increasingly important problem as crashing failures are handled better and better, yet not intensively studied.

Dynamic slicing-based failure indexing: effective, and does not rely on passing executions.

A framework to develop and evaluate more indexing techniques: an indexing technique decomposes into a signature function and a distance function, admitting many instantiations; quantitative metrics enable scientific evaluation.

30

For further discussion, contact: [email protected], @cs.purdue.edu