Building a Better Backtrace: Techniques for Postmortem Program Analysis Ben Liblit & Alex Aiken


Page 1: Building a Better Backtrace: Techniques for Postmortem Program

Building a Better Backtrace: Techniques for Postmortem Program Analysis

Ben Liblit & Alex Aiken

Page 2: Building a Better Backtrace: Techniques for Postmortem Program

A Few Grim Realities

• Programs fail post-deployment
  – Ship with known bugs
  – Users discover new bugs

• Users are lousy testers
  – Never do the same thing twice
  – Wild variation in execution environment
  – Poor bug reporting, if any

• Users’ bugs are the ones that really matter

Page 3: Building a Better Backtrace: Techniques for Postmortem Program

Program Analysis for Pessimists

• Assume & prepare for postmortem analysis
  – Compile-time analysis, stashed away for later
  – Lightweight (deployable) instrumentation

• Analyze failed program instances
  – Mix of automated / interactive tools
  – Not quite static analysis, not quite dynamic

• Help humans find and fix bugs that matter

Page 4: Building a Better Backtrace: Techniques for Postmortem Program

This Talk: Reconstructing Execution Chronologies

• Control flow decision history captures important properties

• Fundamental questions
  – “How in the world did it get here?”
  – “What happened just before this point?”
  – “How can I make this happen again?”

• Broader interest than just crashes

Page 5: Building a Better Backtrace: Techniques for Postmortem Program

Striking a Compromise

• Heavyweight approaches
  – Replay debugging
  – Program tracing

• Lightweight approaches
  – Examine stack trace in debugger
  – printf() debugging

• Middleweight (our) approach
  – “How might we have gotten here, given …?”

Page 12: Building a Better Backtrace: Techniques for Postmortem Program

The Big Idea: “Gotten Here” is Control Flow Reachability

• Interested in paths
  – “How”, not just “yes/no”

• Transitive paths within one function

• Multiple functions?
  – Matched call/return paths
  – This is a form of context-free language reachability

Page 13: Building a Better Backtrace: Techniques for Postmortem Program

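Global Control Flow Graph

[Figure: global control flow graph. Call and return edges connect call sites to callee entry and exit nodes; the call/return edge pairs are labeled with matching brackets, ( ) and [ ].]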

Page 14: Building a Better Backtrace: Techniques for Postmortem Program

Variations in Matching Grammar

• Complete execution
  – All calls & returns must be matched

{()(){()}[{}(())]}

Page 15: Building a Better Backtrace: Techniques for Postmortem Program

Variations in Matching Grammar

• Aborted execution
  – Some calls without returns
  – We use a variant of this

{()(){()}[{}(())]}

Page 16: Building a Better Backtrace: Techniques for Postmortem Program

CFL Reachability Algorithm

• Similar to transitive graph search
  – Use a work list to incrementally extend frontier
  – Forward from α or backward from ω
  – Transitively adding flow edges is one case

• Several additional cases for calls/returns

• Complexity
  – O(N^3) for arbitrary grammar and graph
  – O(E) for our analyses (and many others)
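
The work-list idea on this slide can be sketched for the simplest matching grammar, in which every call must be matched by its own return. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `matched_reachability`, the edge encodings, and the toy graph are all hypothetical, and for brevity the sketch computes all node pairs rather than achieving the O(E) bound quoted above.

```python
from collections import defaultdict, deque

def matched_reachability(nodes, flow, calls, returns):
    """Return all (u, v) such that some path u -> v has balanced calls/returns.

    flow:    iterable of (u, v) intraprocedural edges
    calls:   iterable of (call_site, i, callee_entry), i.e. an edge labeled "(i"
    returns: iterable of (callee_exit, i, return_site), i.e. an edge labeled ")i"
    """
    succ = defaultdict(set)          # flow edges plus discovered summary edges
    for u, v in flow:
        succ[u].add(v)
    calls_into = defaultdict(set)    # callee entry -> {(call site, label)}
    for site, i, entry in calls:
        calls_into[entry].add((site, i))
    rets_from = defaultdict(set)     # callee exit -> {(label, return site)}
    for exit_node, i, ret_site in returns:
        rets_from[exit_node].add((i, ret_site))

    sources_of = defaultdict(set)    # v -> {u : (u, v) is already known}
    facts = set()
    work = deque()

    def add(u, v):
        if (u, v) not in facts:
            facts.add((u, v))
            sources_of[v].add(u)
            work.append((u, v))

    for n in nodes:                  # the empty path is trivially balanced
        add(n, n)

    while work:
        u, v = work.popleft()
        # Case 1: extend a balanced path across a flow (or summary) edge.
        for w in list(succ[v]):
            add(u, w)
        # Case 2: (u, v) is a balanced callee body; pair "(i" with ")i".
        for site, i in calls_into[u]:
            for j, ret_site in rets_from[v]:
                if i == j and ret_site not in succ[site]:
                    succ[site].add(ret_site)      # summary edge over the call
                    for t in list(sources_of[site]):
                        add(t, ret_site)
    return facts

# Toy graph (hypothetical): f is called from two sites with labels 1 and 2.
nodes = ["a0", "a1", "b0", "b1", "f0", "f1"]
flow = [("f0", "f1")]
calls = [("a0", 1, "f0"), ("b0", 2, "f0")]
returns = [("f1", 1, "a1"), ("f1", 2, "b1")]
facts = matched_reachability(nodes, flow, calls, returns)
print(("a0", "a1") in facts)  # True: call 1 is matched by return 1
print(("a0", "b1") in facts)  # False: call 1 cannot be matched by return 2
```

A plain transitive search would report `b1` as reachable from `a0`; the label check is what rules out the mismatched path.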

Page 17: Building a Better Backtrace: Techniques for Postmortem Program

Reconstruction With Crash Site Only

• Work backward from crash site

• Remember why each path is extended
  – Record justifications in route map
  – route(x, z) = { r_1, …, r_n }
  – r_i = cross from x to y, then see route(y, z)
    • x and y must be “adjacent”: one of four cases

• route(α, ω) defines possible chronologies
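
A route map of this kind can be sketched for the plain flow-edge case only, which is one of the adjacency cases; the call/return cases are omitted. The names `build_route_map` and `chronologies` and the toy graph are assumptions made for illustration. Since the crash site z is fixed, `route[x]` here stands for route(x, ω), and each entry y records the justification "cross from x to y, then see route(y, ω)".

```python
from collections import defaultdict, deque

def build_route_map(flow_edges, crash):
    """Work backward from the crash site, recording why each node was reached."""
    preds = defaultdict(set)
    for x, y in flow_edges:
        preds[y].add(x)
    route = defaultdict(set)   # route[x] = {y : cross x -> y, then see route[y]}
    seen = {crash}
    work = deque([crash])
    while work:
        y = work.popleft()
        for x in preds[y]:
            route[x].add(y)    # remember the justification for extending to x
            if x not in seen:
                seen.add(x)
                work.append(x)
    return route

def chronologies(route, start, crash, path=()):
    """Yield loop-free chronologies from start to crash (loops are cut short)."""
    path = path + (start,)
    if start == crash:
        yield path
        return
    for nxt in sorted(route.get(start, ())):
        if nxt not in path:
            yield from chronologies(route, nxt, crash, path)

# Toy flow graph (hypothetical): "dead" never leads to the crash site "e".
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e"), ("c", "dead")]
route = build_route_map(edges, "e")
for chronology in chronologies(route, "a", "e"):
    print(" -> ".join(chronology))
```

Because the search only ever walks predecessors of the crash site, nodes that cannot lead to the crash never enter the route map.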

Page 20: Building a Better Backtrace: Techniques for Postmortem Program

Reconstruction With Crash Site Only

• One case, unmatched call, determines stack
  – Unmatched parens: {()(){()}[{}(())]}
  – Stack trace: {[(

• But we probably have a specific stack trace in mind…
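
The link between unmatched calls and the stack can be sketched with ordinary bracket matching. The function name `unmatched_calls` is hypothetical, and the aborted prefix used below is an assumption: it is the slide's string with its last three closers dropped, chosen because that prefix yields the stack trace `{[(` shown above.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}

def unmatched_calls(trace):
    """Return the calls still open at the end of trace, outermost first.

    Returns None if some return does not match the most recent open call,
    which means the trace is not a valid (complete or aborted) execution.
    """
    stack = []
    for ch in trace:
        if ch in PAIRS:
            stack.append(ch)                     # a call: push it
        elif not stack or PAIRS[stack.pop()] != ch:
            return None                          # a mismatched return
    return stack

complete = "{()(){()}[{}(())]}"
aborted = "{()(){()}[{}(()"   # assumed crash point: last three returns never ran

print(unmatched_calls(complete))           # []
print("".join(unmatched_calls(aborted)))   # {[(
```

A complete execution leaves the stack empty; an aborted one leaves exactly the calls that appear in the stack trace.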

Page 23: Building a Better Backtrace: Techniques for Postmortem Program

Reconstruction With Crash Site + Stack Trace

• S ::= vector of call edges

• Build |S| + 1 clones of global flow graph

• Two types of call edge
  – (_i must match )_i
    • Stays on same layer
  – c_i must be unmatched
    • Only way to next layer
    • Determined by S

[Figure: layered clones of the global flow graph, with call edges labeled c6, c3, and c14.]

Page 24: Building a Better Backtrace: Techniques for Postmortem Program

Reconstruction With Crash Site + Stack Trace

• Possible histories
  – Start at α on top layer
  – End at ω on bottom layer
  – route(α, 0, ω, |S|)

• Backward, not forward
  – More deterministic

• Complexity
  – O(E) work, |S| + 1 times

[Figure: layered clones of the global flow graph, with call edges labeled c6, c3, and c14.]
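
The layering can be sketched as a backward search over (node, layer) pairs. This is a simplified illustration, not the authors' algorithm: it assumes matched call/return pairs have already been collapsed into same-layer edges, so only the unmatched calls named by S change layers. The function name `layered_backward`, the encoding of S as (call site, callee entry) pairs ordered outermost first, and the toy program are all hypothetical.

```python
from collections import defaultdict, deque

def layered_backward(same_layer_edges, stack, alpha, omega):
    """Search backward from (omega, |S|) toward (alpha, 0).

    same_layer_edges: (u, v) flow edges, assumed to include summaries
                      of matched call/return pairs
    stack:            list of (call_site, callee_entry), outermost call first
    Returns (route, reachable): the route map over (node, layer) pairs and
    whether (alpha, 0) can lead to the crash.
    """
    preds = defaultdict(set)
    for u, v in same_layer_edges:
        preds[v].add(u)
    goal = (omega, len(stack))
    route = defaultdict(set)
    seen = {goal}
    work = deque([goal])
    while work:
        node, layer = work.popleft()
        back = [(p, layer) for p in preds[node]]      # stay on the same layer
        if layer > 0 and stack[layer - 1][1] == node:
            # Unmatched call from S: the only way up to the previous layer.
            back.append((stack[layer - 1][0], layer - 1))
        for prev in back:
            route[prev].add((node, layer))
            if prev not in seen:
                seen.add(prev)
                work.append(prev)
    return route, (alpha, 0) in seen

# Toy program (hypothetical): main (m0, m1) and g (g0, g1) both call f.
edges = [("m0", "m1"), ("g0", "g1"), ("f0", "f1"), ("f1", "f2")]
route, ok = layered_backward(edges, [("m1", "f0")], "m0", "f1")
print(ok)                    # True: the stack says f was called from m1
_, ok = layered_backward(edges, [("g1", "f0")], "m0", "f1")
print(ok)                    # False: m0 cannot reach the call site g1
```

Each layer is searched at most once, which is consistent with the slide's bound of O(E) work repeated |S| + 1 times.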

Page 25: Building a Better Backtrace: Techniques for Postmortem Program

Reconstruction With Crash Site + Event Trace

• V ::= vector of trace nodes

• Use |V| + 1 layered clones, as before

• Must report event when crossing trace node
  – On each layer, knock out all trace nodes but one
    • On bottommost layer, no trace nodes at all!
  – Further restricts set of possible paths

• Complexity: O(E|V|)
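
The effect of knocking out trace nodes can be sketched on a single function with flow edges only. The function name `reconstruct`, the state encoding (node, k) meaning "at node with k events reported so far", and the toy graph are assumptions; the sketch also assumes α itself is not a trace node and returns just one shortest consistent chronology rather than the full route map.

```python
from collections import defaultdict, deque

def reconstruct(flow, trace_nodes, events, alpha, omega):
    """Return one path alpha -> omega whose trace-node crossings equal events."""
    preds = defaultdict(set)
    for u, v in flow:
        preds[v].add(u)

    def valid(node, k):
        # A trace node survives on a layer only if it is the event expected there.
        return node not in trace_nodes or (k > 0 and events[k - 1] == node)

    goal = (omega, len(events))
    if not valid(*goal):
        return None
    next_state = {goal: None}
    work = deque([goal])
    while work:                                   # backward breadth-first search
        node, k = work.popleft()
        k_before = k - 1 if node in trace_nodes else k
        for p in sorted(preds[node]):
            prev = (p, k_before)
            if prev not in next_state and valid(*prev):
                next_state[prev] = (node, k)
                work.append(prev)

    state = (alpha, 0)
    if state not in next_state:
        return None                               # no chronology fits the trace
    path = []
    while state is not None:
        path.append(state[0])
        state = next_state[state]
    return path

# Toy graph (hypothetical): a diamond inside a loop; b and c report events.
flow = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "a"), ("d", "e")]
print(reconstruct(flow, {"b", "c"}, ["b", "c"], "a", "e"))
# ['a', 'b', 'd', 'a', 'c', 'd', 'e']
print(reconstruct(flow, {"b", "c"}, [], "a", "e"))
# None: every path to e must cross b or c
```

The search visits at most |V| + 1 copies of each edge, in line with the O(E|V|) bound on the slide.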

Page 26: Building a Better Backtrace: Techniques for Postmortem Program

Reconstruction With …

• Stack trace + event trace

• Multiple event traces

• Ambiguous traces

• Incomplete event trace
  – Recent-branch registers

• Program counter sampling

• Finite state machine of your choosing…

Page 27: Building a Better Backtrace: Techniques for Postmortem Program

Practical Considerations

• Dynamic dispatch / function pointers
  – Usual static techniques (points-to, receiver-class, etc.)
  – Event tracing can help
  – Note: stack trace is never dynamic

• Interactivity
  – Backward analysis is best: most bugs are close to crash
  – FIFO work list, demand-driven search
  – Deterministic versus non-deterministic state machines

Page 28: Building a Better Backtrace: Techniques for Postmortem Program

Areas For Future Exploration

• Sparsity of trace information
  – Identify state-preserving regions
  – Explore such regions only once

• Summarization / visualization
  – Basis: dominator tree walk-back
  – Opportunity for novel algorithms here

Page 29: Building a Better Backtrace: Techniques for Postmortem Program

Areas For Future Exploration

• Adaptive Gap Reduction
  – Programmer inquiries guide future annotation
    • “Which way did this branch really go?”
    • “How many times did this loop really execute?”
  – Identification of key inflection points
  – Insert lightweight event tracing nodes

• Related work in efficient path profiling
  – More evidence for future reconstructions

Page 30: Building a Better Backtrace: Techniques for Postmortem Program

Summary and Conclusions

• Program analysis in an imperfect world
  – Post-crash: unique challenges / leverage points

• CFL path recovery as basis for analysis
  – Efficient, demand-driven, adaptable

• Future work
  – Adaptive annotation to fill in gaps
  – Leveraging multiple runs
  – Data value modeling