Bug Isolation via Remote Program Sampling Ben Liblit, Alex Aiken, Alice X. Zheng, Michael I. Jordan Presented by: Xia Cheng


Post on 18-Dec-2015



Bug Isolation via Remote Program Sampling

Ben Liblit, Alex Aiken, Alice X. Zheng, Michael I. Jordan

Presented by: Xia Cheng


Outline of My Talk

• Bug Isolation Using Predicate Elimination
• Statistical Debugging
• Related Work
• Privacy and Security
• Future Work
• Conclusions


Bug Isolation Using Predicate Elimination

• Instrumentation strategy
• Elimination strategies
• Data collection and analysis
• Refinement over time
• Performance impact


Bug Isolation

Instrumentation strategy

• Automatic isolation of deterministic bugs
  • Release 1.2 of the ccrypt encryption tool
• Randomly sampling function return values may identify key operations that behave differently in successful versus crashed runs
• Group return values into three classes: negative values, zero, and positive values


Bug Isolation

Instrument ccrypt

• Instrument each syntactic call site that returns a scalar value
• Each sampled return value updates one of three counters
• Every call site therefore has a triple of counters: negative value, zero, positive value
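The counter triple described above can be sketched in a few lines. This is only an illustration under assumptions: the `counters` registry, the `record_return` helper, and the call-site label are hypothetical names, and the real instrumentation is applied to the program itself rather than written by hand like this.

```python
import random
from collections import defaultdict

# Hypothetical registry: call-site label -> [negative, zero, positive] counts.
counters = defaultdict(lambda: [0, 0, 0])

def record_return(site, value, rate=1 / 1000, rng=random):
    """With probability `rate`, bump the counter matching the sign of `value`."""
    if rng.random() < rate:
        index = 0 if value < 0 else (1 if value == 0 else 2)
        counters[site][index] += 1
    return value  # pass the value through so the call site behaves as before

# Example with rate=1.0 so that every return value is observed.
for v in (-3, 0, 7, 5):
    record_return("example.c:10 f()", v, rate=1.0)
```

At the end of a run, the triples in `counters` plus a crashed/succeeded flag would form the report for that run.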


Bug Isolation

Elimination strategies: discard irrelevant predicates

• Elimination by universal falsehood
  • Disregard any counter that is zero on all runs
  • These counters represent predicates that can never be true
• Elimination by lack of failing coverage
  • Disregard any triple of counters all three of which are zero on all failed runs
  • These call sites are not even reached in failing executions


Bug Isolation

Discard irrelevant predicates (cont.)

• Elimination by lack of failing example
  • Disregard any counter that is zero on all failed runs
  • These predicates need not be true for a failure to occur
• Elimination by successful counterexample
  • Disregard any counter that has a non-zero value on any successful run
  • These predicates can be true without a subsequent program failure
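The four elimination strategies can be expressed as set filters over per-run counter triples. The sketch below is a minimal illustration, assuming each run is a pair `(counts, crashed)` where `counts[site]` is a `[negative, zero, positive]` triple; the function name and data layout are hypothetical, and the three toy runs are made up.

```python
def eliminate(runs, num_sites):
    """Return, per strategy, the set of (site, k) predicates that survive it."""
    preds = {(s, k) for s in range(num_sites) for k in range(3)}
    failed = [counts for counts, crashed in runs if crashed]
    passed = [counts for counts, crashed in runs if not crashed]

    def zero_on_all(group, s, k):
        return all(counts[s][k] == 0 for counts in group)

    return {
        # Keep counters that are non-zero on at least one run.
        "universal_falsehood": {
            (s, k) for s, k in preds if not zero_on_all(failed + passed, s, k)},
        # Keep sites whose triple is not entirely zero on the failed runs.
        "lack_of_failing_coverage": {
            (s, k) for s, k in preds
            if not all(zero_on_all(failed, s, j) for j in range(3))},
        # Keep counters that are non-zero on at least one failed run.
        "lack_of_failing_example": {
            (s, k) for s, k in preds if not zero_on_all(failed, s, k)},
        # Keep counters that are zero on every successful run.
        "successful_counterexample": {
            (s, k) for s, k in preds if zero_on_all(passed, s, k)},
    }

# Toy data: two call sites, one crashing run and two successful runs.
runs = [
    ([[0, 0, 1], [0, 2, 0]], True),
    ([[0, 1, 0], [0, 3, 0]], False),
    ([[0, 0, 0], [0, 1, 0]], False),
]
survivors = eliminate(runs, num_sites=2)
suspects = (survivors["lack_of_failing_example"]
            & survivors["successful_counterexample"])  # -> {(0, 2)}
```

Intersecting the survivor sets of two strategies, as in `suspects`, corresponds to applying both strategies together.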


Bug Isolation

Data collection and analysis

• Universal falsehood discards 1569 counters that are zero on all runs, leaving 141 candidate predicates
• Lack of failing coverage discards 526 counter triples that are all zero on all crashes, leaving 132 candidate predicates
• Lack of failing example discards 1665 counters that are zero on all crashes, leaving 45 candidate predicates
• Successful counterexample discards 139 counters that are non-zero on any successful run, leaving 1571 candidate predicates

• Successful counterexample is distinct; the other three strategies partially overlap
• Universal falsehood and successful counterexample test disjoint properties, so they combine to good effect
• Both can be analyzed together using the successful runs
• Lack of failing example eliminates the most features and combines well with successful counterexample
• Lack of failing coverage is an inherently weaker strategy

2990 trial runs at a sampling rate of 1/1000; 88 of these end in a crash
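As a consistency check on the figures above, the discarded and remaining counts for each strategy should add up to the same total. The total of 1710 counters (570 call sites) is inferred here from the slide's numbers; the slide does not state it directly.

```python
# (discarded counters, remaining candidates) per strategy, as read off the slide.
# Lack of failing coverage discards whole triples: 526 triples = 1578 counters.
results = {
    "universal falsehood": (1569, 141),
    "lack of failing coverage": (526 * 3, 132),
    "lack of failing example": (1665, 45),
    "successful counterexample": (139, 1571),
}
totals = {name: gone + left for name, (gone, left) in results.items()}
total_counters = set(totals.values())   # a single value if the figures agree
call_sites = min(total_counters) // 3   # three counters per call site
crash_rate = 88 / 2990                  # fraction of trial runs that crashed
```

All four strategies give 1710 counters, and roughly 2.9% of the trial runs crash.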


Bug Isolation

Refinement over time

• Elimination strategies benefit from increasing the number of runs
• On average, 1750 runs are enough to isolate twenty candidate features
• Greater diversity benefits the analysis


Bug Isolation

Performance impact

• Sampling transformation: a simpler but slower pattern that checks the next-sample countdown at each and every site
• The performance impact is minimal for sampled instrumentation
• Overhead for 1/1000 sampling is less than 4%
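The countdown pattern mentioned above can be sketched as follows. This is a minimal illustration, not the tool's actual transformation: the `Sampler` class and `next_countdown` helper are hypothetical names, and drawing the countdown from a geometric distribution is one standard way to make each site visit an independent trial with the given sampling rate.

```python
import math
import random

def next_countdown(rate, rng):
    """Draw the number of site visits until the next sample (geometric, mean 1/rate)."""
    if rate >= 1.0:
        return 1
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - rate)) + 1

class Sampler:
    def __init__(self, rate, seed=None):
        self.rate = rate
        self.rng = random.Random(seed)
        self.countdown = next_countdown(rate, self.rng)

    def should_sample(self):
        """Called at every instrumentation site: decrement, and sample at zero."""
        self.countdown -= 1
        if self.countdown > 0:
            return False
        self.countdown = next_countdown(self.rate, self.rng)
        return True

# Example: at rate 1/100, about 1% of 100000 site visits are sampled.
sampler = Sampler(0.01, seed=1)
hits = sum(sampler.should_sample() for _ in range(100_000))
```

Checking the countdown at every site is simple but pays a decrement and branch on each visit, which is the cost the slide refers to as "slower".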


Statistical Debugging

• Instrumentation strategy
• Crash prediction using logistic regression
• Data collection and analysis
• Performance impact


Statistical Debugging

Instrumentation strategy

• Automatic isolation of non-deterministic bugs
• Instrument bc to guess and randomly check a large number of predicates
• Goal: identify predicates that capture bad behavior (false on success, true on crashing)
• Cast an extremely broad net, with an eye toward pointer and buffer errors


Statistical Debugging

Crash prediction using logistic regression

• Goal: narrow down the set of features
• Method: balance good classification performance with aggressive feature selection
• Binary classifier
  • Given by quantizing the logistic function output
  • Takes feature values as input and outputs a prediction of either 0 or 1
• Feature selection
  • Achieved by regularizing the function parameters to ignore most input features
  • This forces the model to predict success or failure using just a small selection of sampled features
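A small sketch can make the combination of classification and feature selection concrete. It assumes an L1 penalty as the regularizer and uses proximal gradient ascent as the solver; both the solver and the toy data (every combination of four binary features, with a "crash" exactly when feature 2 is set) are choices made for this illustration, not details taken from the slides.

```python
import itertools
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def soft_threshold(v, t):
    # Proximal step for the L1 penalty: shrink toward zero, clamp small values to 0.
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit_l1_logistic(X, y, lam=0.05, step=0.5, iters=300):
    n, d = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * d
    for _ in range(iters):
        g0, g = 0.0, [0.0] * d
        for xi, yi in zip(X, y):
            err = yi - sigmoid(b0 + sum(b * v for b, v in zip(beta, xi)))
            g0 += err
            for j, v in enumerate(xi):
                g[j] += err * v
        b0 += step * g0 / n  # the intercept is not penalized
        beta = [soft_threshold(b + step * gj / n, step * lam)
                for b, gj in zip(beta, g)]
    return b0, beta

def predict(b0, beta, x):
    # Quantize the logistic output to a 0/1 prediction.
    return 1 if sigmoid(b0 + sum(b * v for b, v in zip(beta, x))) >= 0.5 else 0

X = [list(bits) for bits in itertools.product([0, 1], repeat=4)]
y = [x[2] for x in X]  # toy label: crash exactly when feature 2 is set
b0, beta = fit_l1_logistic(X, y)
selected = [j for j, b in enumerate(beta) if b != 0.0]
```

On this toy data only feature 2 keeps a non-zero coefficient, so `selected` names the single feature the classifier relies on; the other coefficients are driven to exactly zero by the penalty.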


Statistical Debugging

• To learn a good classifier, maximize the log likelihood of the training set
• The distribution is modeled as a logistic function
• Penalized log likelihood function
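The formulas for these three items are not present in the transcript. A standard formulation consistent with the captions is given below as an assumption: the notation (labels $y_i \in \{0, 1\}$ with 1 meaning a crash, feature vectors $x_i$, parameters $\beta$, regularization weight $\lambda$) and the choice of an L1 penalty are introduced here for illustration.

```latex
\begin{align*}
\mu_\beta(x) &= \frac{1}{1 + e^{-\left(\beta_0 + \sum_{j} \beta_j x_j\right)}} \\
LL(D) &= \sum_{i=1}^{N} \Bigl[ y_i \log \mu_\beta(x_i)
        + (1 - y_i) \log\bigl(1 - \mu_\beta(x_i)\bigr) \Bigr] \\
\hat{\beta} &= \arg\max_{\beta} \; LL(D) - \lambda \lVert \beta \rVert_1
\end{align*}
```

A larger $\lambda$ drives more coefficients to zero, which is how the penalty doubles as feature selection.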


Statistical Debugging

Data collection and analysis

• The bc data set consists of 4390 runs, each with a distinct random input and distinct randomized 1/1000 sampling


Statistical Debugging

• Performance Impact


Related Work

• Performance profiling and optimization
  • Triggers: periodic hardware timers/interrupts, periodic software event counters, or both [Arnold 2000]
  • Digital Continuous Profiling Infrastructure: chooses sampling intervals randomly [Anderson 1997]
• Trace collection for program understanding
  • Difficulty: minimizing performance overhead and managing large quantities of captured data
  • Dynamic trace analysis techniques can be adapted directly to this domain


Related Work (cont.)

The following techniques are shared, but applied with a new approach:

• Daikon makes fairly unstructured guesses and eliminates those that do not hold [Ernst 2001]
  • New approach: gathering data from production code
• DIDUCE identifies bugs using analysis of executions [Hangal 2002]
  • New approach: more probabilistic, correlating predicate violations with increased likelihood of failure
• Software tomography, through the GAMMA system, provides low-overhead distributed monitoring of deployed code [Bowring 2002]
  • New approach: bug isolation


Privacy and Security

• Statistical model: a mechanism for protecting user anonymity
• Logistic regression: parameters are updated with each new trace
• Statistical approach that tolerates noise: a defense against malicious users
• Collaborative filtering system
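The point about updating logistic regression parameters with each new trace can be illustrated with a single stochastic-gradient step. This is a sketch under assumptions: the `update_with_trace` function, its step size, and the two-feature trace are hypothetical, and the slides do not specify the update rule.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def update_with_trace(b0, beta, x, crashed, step=0.1):
    """One stochastic-gradient step on a single trace; returns the new (b0, beta)."""
    p = sigmoid(b0 + sum(b * v for b, v in zip(beta, x)))
    err = (1.0 if crashed else 0.0) - p
    return b0 + step * err, [b + step * err * v for b, v in zip(beta, x)]

# Example: start from an all-zero model and fold in one crashing trace.
b0, beta = update_with_trace(0.0, [0.0, 0.0], x=[1, 0], crashed=True)
```

Because only the parameters are carried forward, the trace itself need not be stored once the update is applied, which is one way a statistical model can help protect the user who submitted it.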


Future Work

• Public Deployment of Cooperative Bug Isolation
• Scalable Statistical Bug Isolation
• Path Optimization in Programs and its Application to Debugging
• Statistical Debugging: Simultaneous Identification of Multiple Bugs
• The Cooperative Bug Isolation Project: visit www.cs.wisc.edu/cbi/


Conclusions

• Sampling infrastructure: gathers information from the set of runs produced by the user community
• A Bernoulli process does the sampling
• Several sample applications
  • Sharing the overhead of assertions
  • Predicate guessing and elimination to isolate a deterministic bug
  • Regularized logistic regression to isolate a non-deterministic memory corruption error