a technique for enabling and supporting debugging of field failures (icse 2007)
Post on 19-Jun-2015
70 Views
Preview:
TRANSCRIPT
A Technique for Enabling and Supporting Debugging of
Field Failures
James Clause and Alessandro OrsoGeorgia Institute of Technology
This work was supported in part by NSF awardsCCF-0541080 and CCR-0205422 to Georgia Tech.
1
3
3
Field failures: Anomalous behavior (or crashes) of deployed software that occur on user machines
4
Crash logs
4
Crash logs
User-providedinformation
4
Our solution
5
Our solution
5
Record
Our solution
5
Record
Replay
Our solution
5
Record
Replay
Minimize
Our solution
5
Record
Replay
Minimize
Debug
Usage Scenario
6
In house In the field
Replay / Minimize(off line)
Record (on line)Develop
Replay / Debug
Execution
repository
✔/✘
Existing record / replay approaches
7
Regression testing
(e.g. Elbaum et al. 06, Orso et al. 06, Orso and Kennedy 05, Saff et al. 05, Mercury WinRunner)
• Replay only a portion of an execution by recording events for specific subsystems
Both types of technique are not amenable tominimization and may cause unacceptable overhead
Deterministic debugging
(e.g. Chen et al. 01, King et al. 05, Narayanasamy et al. 05, Netzer and Weaver 94, Srinivasan et al. 04, VMWare)
• Replay an entire execution by recording every component of an application
Outline
✓Motivation & background
• Our technique
• record
• replay
• minimization
• Empirical evaluation
• Conclusion & future work
8
Record & Replay
• Goal: develop an approach that has low overhead and is amenable to minimization
• Key insight: avoid focusing on low-level (internal) events
• expensive (large number of events)
• not amenable to minimization (high interdependence)
9
Record & Replay
• Goal: develop an approach that has low overhead and is amenable to minimization
• Key insight: avoid focusing on low-level (internal) events
• expensive (large number of events)
• not amenable to minimization (high interdependence)
➡ Focus on high-level (external) interactions with the environment
• efficient (fewer, more “expensive” interactions)
• amenable to minimization (low interdependence)
10
Environment interactions
11
Environment interactions
Streams
11
Environment interactions
Streams Files
11
Environment interactions
Streams Files
11
Environment interactions
Streams Files
11
Interaction events: FILE — interaction with a file POLL — checks for availability of data on a stream PULL — read data from a stream
Event log:
Environment data (files):
12
Environment data (streams):
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
POLL KEYBOARD NOK
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
POLL KEYBOARD NOK
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
KEYBOARD: {5680}
POLL KEYBOARD OK
POLL KEYBOARD NOK
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
KEYBOARD: {5680}
POLL KEYBOARD OK
POLL KEYBOARD NOK
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
KEYBOARD: {5680} hello
POLL KEYBOARD OK
PULL KEYBOARD 5
POLL KEYBOARD NOK
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
KEYBOARD: {5680} hello
POLL KEYBOARD OK
PULL KEYBOARD 5
POLL KEYBOARD NOK
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
KEYBOARD: {5680} hello
POLL KEYBOARD OK
PULL KEYBOARD 5
POLL KEYBOARD NOK
POLL NETWORK OK
NETWORK: {3405}
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
KEYBOARD: {5680} hello
POLL KEYBOARD OK
PULL KEYBOARD 5
POLL KEYBOARD NOK
POLL NETWORK OK
NETWORK: {3405}
❙
Event log:
Environment data (files):
12
Environment data (streams):
FILE foo.1
foo.1
KEYBOARD: {5680} hello
POLL KEYBOARD OK
PULL KEYBOARD 5
POLL KEYBOARD NOK
POLL NETWORK OK
NETWORK: {3405}
❙
Environment data (files):
13
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2...PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK...
foo.1 foo.2 bar.1
Environment data (files):
14
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2...PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK...
foo.1 foo.2 bar.1
Environment data (files):
14
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2...PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK...
foo.1 foo.2 bar.1
Environment data (files):
14
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2...PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK...
foo.1 foo.2 bar.1
✔
Environment data (files):
14
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2...PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK...
foo.1 foo.2 bar.1
✔
Environment data (files):
14
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2...PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK...
foo.1 foo.2 bar.1
✔
Environment data (files):
14
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2...PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK...
foo.1 foo.2 bar.1
✔✔
✔✔✔✔
✔✔
✔✔
✔✔
✔
Minimize
15
Goal: focus debugging effort
Minimize
15
Goal: focus debugging effort
Executionrecording
Minimize
15
Goal: focus debugging effort
Executionrecording
↺
Time
minimization
Minimize
15
Goal: focus debugging effort
Executionrecording
Executionrecording
↺
Time
minimization
Minimize
15
Goal: focus debugging effort
Executionrecording
Executionrecording
↺
Time
minimization
✂Data
minimization
Minimize
15
Goal: focus debugging effort
Executionrecording
Executionrecording
Executionrecording
↺
Time
minimization
✂Data
minimization
Minimize: time
Environment data (files):
16
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK
Minimize: time
Environment data (files):
16
Event log:
Environment data (streams):KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
FILE foo.1POLL KEYBOARD NOKPOLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1POLL NETWORK NOKPOLL NETWORK OKFILE foo.2PULL NETWORK 1024FILE foo.2POLL KEYBOARD NOK
Minimize: time
Environment data (files):
17
Event log:
Environment data (streams):
FILE foo.1
POLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1
POLL NETWORK OKFILE foo.2PULL NETWORK 1024FILE foo.2
KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
Minimize: time
Environment data (files):
17
Event log:
Environment data (streams):
FILE foo.1
POLL KEYBOARD OKPULL KEYBOARD 1POLL NETWORK OKPULL NETWORK 1024FILE bar.1
POLL NETWORK OKFILE foo.2PULL NETWORK 1024FILE foo.2
KEYBOARD: {5680}hello ❙ {4056}c ❙ {300}...
NETWORK: {3405}<html><body>... ❙ {202}...
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
✔
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
✘
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
Minimize: data
18
Atoms
Chunks
Whole
entities
Data minimization Environment
The tool: ADDAAssisting the Debugging
of Deployed Applications
• Record and Replay:
• Works on x86 (c-lib based) binaries
• Based on dynamic instrumentation (Pin)
• Maps c-library calls to interaction events
• Minimization:
• Set of extensible scripts
19
Limitations
Two main limitations:
• Technique: May not replay non-deterministic failures
• Implementation:Does not handle window system events (yet)
20
Empirical evaluation
• Research questions
• Can ADDA produce minimized executions that can be used to debug the original failure?
• How much overhead does ADDA impose?
• Subject:
• Pine — widely-used email / news client
• Data:
• Two real field failures from Pine’s history
• Set of 20 failing executions, 10 per failure
21
Empirical evaluation
• Research questions
• Can ADDA produce minimized executions that can be used to debug the original failure?
• How much overhead does ADDA impose?
• Subject:
• Pine — widely-used email / news client
• Data:
• Two real field failures from Pine’s history
• Set of 20 failing executions, 10 per failure
22
Minimization results
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
# entities streams size files size
23
Ave
rage
val
ue a
fter
min
imiz
atio
n
Header-color fault Address book fault
Minimization results
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
# entities streams size files size
23
Ave
rage
val
ue a
fter
min
imiz
atio
n
Header-color fault Address book fault
Moreover, these results are conservative: recorded executions only contain the minimal amount of data needed to perform an action.
Minimization results
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
# entities streams size files size
23
Ave
rage
val
ue a
fter
min
imiz
atio
n
Header-color fault Address book fault
Overhead• Offline: less than 75 minutes for minimization
• Online: negligible overhead while recording
Moreover, these results are conservative: recorded executions only contain the minimal amount of data needed to perform an action.
Specific Example: Address Book Failure
• Complete execution
• 34 entities (files and streams)
• ≈800kb
• Minimized execution
• 5 partial entities (4 files,1 stream)
• ≈72kb
Future work
• More studies: additional applications and real users
• Extend technique / implementation
• Support windowing system
• Investigate ad-hoc minimization algorithms
• Include non-deterministic events (if needed)
25
Conclusions
• Novel approach that supports debugging
field failures
• Prototype implementation for x86 binaries
• Preliminary empirical evaluation: for the cases considered, our technique can
1. minimize failing executions,
2. preserve their failing behavior, and
3. impose low overhead on users26
Questions?
27
top related