low overhead debugging with dise
TRANSCRIPT
![Page 1: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/1.jpg)
Low Overhead Debuggingwith DISE
Marc L. Corliss E Christopher Lewis Amir RothDepartment of Computer and Information Science
University of Pennsylvania
![Page 2: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/2.jpg)
Low Overhead Debugging with DISE – Marc Corliss 2
Overview• Goal: Low overhead interactive debugging
• Solution: Implement efficient debugging primitives• e.g. breakpoints and watchpoints• using Dynamic Instruction Stream Editing (DISE) [ISCA ‘03]:
General-purpose tool for dynamic instrumentation
![Page 3: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/3.jpg)
Low Overhead Debugging with DISE – Marc Corliss 3
Breakpoints and Watchpoints• Breakpoint
• Interrupts program at specific point
• Watchpoint• Interrupts program when value of expression changes
• ConditionalBreakpoint/Watchpoint• Interrupts program only when predicate is true
break test.c:100
watch x
break test.c:100 if i==93
![Page 4: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/4.jpg)
Low Overhead Debugging with DISE – Marc Corliss 4
User
Debugging ArchitectureDebugger Application
int main(){
}
• User/debugger transitions• Debugger/application transitions
• High overhead• May be masked by user/debugger transitions• Otherwise perceived as latency
SpuriousTransitions
![Page 5: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/5.jpg)
Low Overhead Debugging with DISE – Marc Corliss 5
Eliminating Spurious Transitions• Instrument app. with breakpoint/watchpoint logic
• No spurious transitions• Static approaches already exist
• During compilation or post-compilation (binary rewriting)
• We propose dynamic instrumentation• Using DISE
![Page 6: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/6.jpg)
Low Overhead Debugging with DISE – Marc Corliss 6
Talk Outline Introduction• Watchpoint implementations• DISE background• Watching with DISE• Evaluation• Related work and conclusion
![Page 7: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/7.jpg)
Low Overhead Debugging with DISE – Marc Corliss 7
Watchpoint Implementations• Single-stepping• Virtual memory support• Dedicated hardware registers
![Page 8: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/8.jpg)
Low Overhead Debugging with DISE – Marc Corliss 8
Single-Stepping
User Applicationint main(){
}
Trap after every statement
Debugger
+ Easy to implement– Poor performance (many spurious transitions)
Debugger Applicationint main() {
}diff?yes
run
no
no
nodiff?
diff?
diff?
![Page 9: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/9.jpg)
Low Overhead Debugging with DISE – Marc Corliss 9
Virtual Memory SupportTrap when pages containing watched variables written
Debugger Applicationint main() {
}diff?yes
run
diff?
diff?page written
page written
page written
+ Reduces spurious transitions– Coarse-granularity (still may incur spurious transitions)– Spurious transitions on silent writes
![Page 10: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/10.jpg)
Low Overhead Debugging with DISE – Marc Corliss 10
Dedicated Hardware RegistersTrap when particular address is written
+ Reduces spurious transitions– Spurious transitions on silent writes– Number and size of watchpoints limited
Debugger Applicationint main() {
}diff?yes
run
diff?watchpt written
watchpt written
![Page 11: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/11.jpg)
Low Overhead Debugging with DISE – Marc Corliss 11
Conditional WatchpointsTrap like unconditional, debugger evaluates predicate
Debugger Applicationint main() {
}
+ Simple extension of unconditional implementation– Introduces more spurious transitions
yes
run
diff?pred?
diff?pred?
diff?pred?
![Page 12: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/12.jpg)
Low Overhead Debugging with DISE – Marc Corliss 12
Instrumenting the ApplicationEmbed (conditional) watchpoint logic into application
Debugger Applicationint main() {
}
run
diff?pred?
diff?pred?
diff?pred?
+ Eliminates all spurious transitions– Adds small overhead for each write
![Page 13: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/13.jpg)
Low Overhead Debugging with DISE – Marc Corliss 13
• Dynamic Instruction Stream Editing (DISE) [ISCA ‘03]• Programmable instruction macro-expander• Like hardware SED (DISE = dynamic instruction SED)• General-purpose mechanism for dynamic instrumentation
I$ executeDISEapp app+instrumentation
srli r9,4,r1cmp r1,r2,r1bne r1,Errorstore r4,8(r9)
store r4,8(r9)
• Example: memory fault isolation
DISE
![Page 14: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/14.jpg)
Low Overhead Debugging with DISE – Marc Corliss 14
DISE Productions• Production: static rewrite rule
T.OPCLASS==store=> srli T.RS,4,dr0
cmp dr0,dr1,dr0 bne dr0,Error T.INST
srli r9,4,dr0cmp dr0,dr1,dr0bne dr0,Errorstore r4,8(r9)
store r4,8(r9)
• Expansion: dynamic instantiation of production
PatternDirective
DISERegisterParameterized
replacementsequence
![Page 15: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/15.jpg)
Low Overhead Debugging with DISE – Marc Corliss 15
Watching with DISE• Monitor writes to memory• Check if watched value(s) modified
– Requires expensive load(s) for every write• Optimization: address match gating
• Split into address check (fast) and value check (slow)• Check if writing to watched address• If so, then handler routine called• Handler routine does value check
![Page 16: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/16.jpg)
Low Overhead Debugging with DISE – Marc Corliss 16
Watchpoint Production• Interactive debugger injects production:
T.OPCLASS == store=> T.INST # original instruction
lda dr1,T.IMM(T.RS) # compute addressbic dr1,7,dr1 # quad align addresscmpeq dr1,dwr,dr1 # cmp to watched addressccall dr1,HNDLR # if equal call handler
![Page 17: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/17.jpg)
Low Overhead Debugging with DISE – Marc Corliss 17
Other Implementation Issues• Conditional watchpoints
• Inline simple predicates in replacement sequence• Put complex predicates in handler routine
• Multiple watchpoints/complex expressions• For small #, inline checks in replacement sequence• For large #, use bloom filter
Key point: DISE is flexible
![Page 18: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/18.jpg)
Low Overhead Debugging with DISE – Marc Corliss 18
Virtues of DISE• Versus dedicated hardware registers
• General-purpose: DISE has many other uses• Safety checking [ISCA ‘03], security checking [WASSA ‘04],
profiling [TR ‘02], (de)compression [LCTES ‘03], etc.
• Efficient: no spurious transitions to the debugger• Flexible: more total watchpoints permitted
• Versus static binary transformation• Simple-to-program: transformation often cumbersome• Efficient: no code bloat, no transformation cost• Less intrusive: Debugger and application separate
![Page 19: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/19.jpg)
Low Overhead Debugging with DISE – Marc Corliss 19
Evaluation• Show DISE efficiently supports watchpoints
• Compare performance to other approaches
• Analyze debugging implementations in general• Characterize performance of different approaches
![Page 20: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/20.jpg)
Low Overhead Debugging with DISE – Marc Corliss 20
Methodology• Simulation using SimpleScalar Alpha
• Modeling aggressive, 4-way processor• Benchmarks
• (subset of) SPEC Int 2000• Watchpoints for each benchmark
• HOT, WARM1, WARM2, COLD• Debugger/application transition overhead
• 100,000 cycles
![Page 21: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/21.jpg)
Low Overhead Debugging with DISE – Marc Corliss 21
Unconditional WatchpointsGCC
• Single-stepping has slowdowns from 6,000-40,000
![Page 22: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/22.jpg)
Low Overhead Debugging with DISE – Marc Corliss 22
Unconditional Watchpoints
• VM sometimes good, sometimes awful• Erratic behavior primarily due to coarse-granularity
GCC
![Page 23: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/23.jpg)
Low Overhead Debugging with DISE – Marc Corliss 23
Unconditional Watchpoints
• Hardware registers usually good (no overhead)• Hardware registers perform poorly for HOT
• Significant number of silent writes
GCC
![Page 24: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/24.jpg)
Low Overhead Debugging with DISE – Marc Corliss 24
Unconditional WatchpointsGCC
• DISE overhead usually less than 25%
![Page 25: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/25.jpg)
Low Overhead Debugging with DISE – Marc Corliss 25
Conditional Watchpoints
• In many cases DISE outperforms hardware regs.• Spurious transitions for HW regs. whenever WP written• DISE/HW registers can differ by 3 orders of magnitude
![Page 26: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/26.jpg)
Low Overhead Debugging with DISE – Marc Corliss 26
Conditional Watchpoints
• Instrumentation overhead more consistent• Instrumentation adds small cost on all writes• Non-instrumentation adds high cost on some writes
![Page 27: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/27.jpg)
Low Overhead Debugging with DISE – Marc Corliss 27
Multiple Watchpoints
• For <5 watchpoints can use hardware registers• Performance good 1-3, degrades at 4 due to silent writes
• For >4 watchpoints must use virtual memory• Performance degrades due to coarse-granularity
GCC
![Page 28: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/28.jpg)
Low Overhead Debugging with DISE – Marc Corliss 28
Multiple Watchpoints
• For <4 watchpoints DISE/Inlined slightly worse• DISE/Inlined much better for >3 watchpoints
GCC
![Page 29: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/29.jpg)
Low Overhead Debugging with DISE – Marc Corliss 29
Multiple Watchpoints
• For <4 DISE/B.F. slightly worse than Inlined• DISE/B.F. replacement sequence includes load
• For >3 DISE/B.F. does the best• DISE/Inlined replacement sequence too large
GCC
![Page 30: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/30.jpg)
Low Overhead Debugging with DISE – Marc Corliss 30
Evaluation Results• DISE watchpoints have low overhead
• DISE overhead usually less than 25%• In many cases DISE outperforms other approaches• Silent writes/conditionals ⇒ spurious transitions• DISE flexibility helps keep low overhead in all scenarios
• Overhead of instrumentation more consistent• Small cost on all writes rather than occasional large cost• Non-instrumentation has 1x to 100,000x slowdown
![Page 31: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/31.jpg)
Low Overhead Debugging with DISE – Marc Corliss 31
Related Work• iWatcher [Zhou et. al ‘04]
• Hardware-assisted debugger• Associates program-specified functions with memory locations
• Address-based versus instruction-based• Not general-purpose mechanism like DISE
• More significant hardware modifications than DISE• Other related areas
• Static transformation [Kessler ‘90, Wahbe et al. ‘93]
• Instrumentation mechanisms [Valgrind, ATOM, EEL, Etch]
![Page 32: Low Overhead Debugging with DISE](https://reader031.vdocument.in/reader031/viewer/2022022219/6214da7a8c4b4509031437d9/html5/thumbnails/32.jpg)
Low Overhead Debugging with DISE – Marc Corliss 32
Conclusion• DISE effectively supports low overhead debugging
• Virtues: general-purpose, flexible, simple-to-program,efficient, non-intrusive
• Characterize interactive debugging implementations• Instrumentation has consistently low overhead