RUGRAT: Runtime Test Case Generation using Dynamic Compilers

Ben Breech, NASA Goddard Space Flight Center
Lori Pollock and John Cavazos, University of Delaware

Motivating Example

if ((sptr = malloc (size + 1)) == NULL) {
    findmem ();
    if ((sptr = malloc (size + 1)) == NULL)
        xlfail ("insufficient string space");
}

How do I test this callsite?
• Make the machine run out of memory?
• Flip the conditional, recompile, flip back?
• Pretend it doesn’t exist during testing?

Generalizing the Problem

Code to handle uncommon situations
• Difficult to test
• May need external environment event to trigger

Examples:
• Error handling code
• Testing program security mechanisms

Observation

Hard to reach code executes when program thinks something uncommon has occurred

if ((sptr = malloc (size + 1)) == NULL) {
    findmem ();
    if ((sptr = malloc (size + 1)) == NULL)
        xlfail ("insufficient string space");
}

Could test findmem() by simulating error
• e.g., could add instructions to program so program believes malloc failed

RUGRAT Approach

Use dynamic compilers to generate test cases for hard to reach code.
Automatically add instructions to program during execution to simulate uncommon situation.

Dynamic Compilers

Dynamic compilers perform compilation tasks during program execution

[Figure: dynamic compiler. Labels: code; create basic block; basic block; translate (analysis, transformation, optimization); modified basic block; execute on CPU]

RUGRAT Architecture

[Figure: dynamic compiler extended with RUGRAT components. Labels: code; create basic block; basic block; translate (analysis, transformation, optimization); Dynatest Generator; test spec; modified basic block; execute on CPU; test oracle; test report]

Test Spec

Details where/how for inserting tests
Current prototype limited (environment variables). Can express:
• Function locations
  • "test all calls to function x"
  • "test only second call to x in function y"
• Failure value (e.g., 0, -1, etc.)
• Some side effects
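
The slide does not give the names or format of the environment variables, so the following is only a minimal sketch of how such a test spec might be represented and matched. The variable names (RUGRAT_TARGET, RUGRAT_CALLER, RUGRAT_CALL_INDEX, RUGRAT_FAIL_VALUE), the struct, and both functions are hypothetical and are not taken from the RUGRAT prototype.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical test spec; field names are illustrative, not RUGRAT's. */
struct test_spec {
    const char *target;   /* function whose calls are tested, e.g. "malloc" */
    const char *caller;   /* only test calls made inside this function; NULL = any */
    long call_index;      /* 1-based call to test; 0 = all calls */
    long fail_value;      /* value the call should appear to return */
};

/* Read the spec from (assumed) environment variables.
 * Returns 0 on success, -1 if no target function was given. */
int load_test_spec(struct test_spec *spec)
{
    const char *idx = getenv("RUGRAT_CALL_INDEX");
    const char *val = getenv("RUGRAT_FAIL_VALUE");

    spec->target = getenv("RUGRAT_TARGET");
    spec->caller = getenv("RUGRAT_CALLER");
    spec->call_index = idx ? strtol(idx, NULL, 10) : 0;
    spec->fail_value = val ? strtol(val, NULL, 10) : 0;
    return spec->target ? 0 : -1;
}

/* Decide whether the nth call (1-based) to `callee` made inside
 * `caller` should have a test inserted. */
int spec_matches(const struct test_spec *spec, const char *callee,
                 const char *caller, long nth)
{
    if (strcmp(spec->target, callee) != 0)
        return 0;
    if (spec->caller && strcmp(spec->caller, caller) != 0)
        return 0;
    return spec->call_index == 0 || spec->call_index == nth;
}
```

Under these assumptions, "test all calls to function x" corresponds to target "x" with no caller and call index 0, and "test only second call to x in function y" corresponds to target "x", caller "y", call index 2.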

Dynatest Generator

Scans instructions for location to insert test (e.g., call to function X)
Allows function X to execute
Adds instructions to simulate error
• Instructions added after function X
• Program thinks error happened, reacts
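
The scan-and-insert step can be sketched on a toy instruction list. This is an illustration under stated assumptions, not the prototype's code: the `insn` struct, the `op` values, and `insert_tests` are all hypothetical, and a real dynamic compiler would work on its own instruction representation.

```c
#include <stddef.h>
#include <string.h>

/* Toy instruction kinds; purely illustrative. */
enum op { OP_CALL, OP_SET_RETVAL, OP_SET_ERRNO, OP_OTHER };

struct insn {
    enum op op;
    const char *callee;   /* name of called function, used by OP_CALL only */
    long value;           /* constant operand, used by OP_SET_* only */
};

/* Copy the basic block `in` to `out`. After every call to `target`,
 * add instructions that overwrite the return value and errno, so the
 * call still executes but appears to have failed.
 * Returns the number of instructions written, or 0 if `out` is too
 * small (or the block is empty). */
size_t insert_tests(const struct insn *in, size_t n,
                    struct insn *out, size_t cap,
                    const char *target, long fail_value, long errno_value)
{
    size_t w = 0;

    for (size_t i = 0; i < n; i++) {
        int hit = in[i].op == OP_CALL && strcmp(in[i].callee, target) == 0;
        size_t need = hit ? 3 : 1;

        if (cap - w < need)
            return 0;
        out[w++] = in[i];                 /* function X is allowed to execute */
        if (hit) {
            out[w++] = (struct insn){ OP_SET_RETVAL, NULL, fail_value };
            out[w++] = (struct insn){ OP_SET_ERRNO, NULL, errno_value };
        }
    }
    return w;
}
```

The added instructions go after the call rather than replacing it, which matches the slide: the function runs normally and only its visible result is changed.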

Example

if ((sptr = malloc (size + 1)) == NULL) {
    findmem ();
    if ((sptr = malloc (size + 1)) == NULL)
        xlfail ("insufficient string space");
}
L1:

Original instructions:

call malloc            (code for malloc)
movl <return val>, sptr
cmpl sptr, 0
jnz L1
call findmem
….
L1: …

After the Dynatest Generator:

call malloc            (code for malloc)
movl 0, <return val>
movl ENOMEM, errno
movl <return val>, sptr
cmpl sptr, 0
jnz L1
call findmem
….
L1: …
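
A source-level analogue of the rewritten instructions may make the effect easier to see. RUGRAT adds the instructions at run time without changing the source; the sketch below only imitates that in plain C. The names `tested_malloc`, `failures_to_simulate`, `findmem_calls`, and `alloc_string` are hypothetical, `findmem` is a stand-in, and returning NULL replaces the call to xlfail.

```c
#include <errno.h>
#include <stdlib.h>

int failures_to_simulate = 0;   /* hypothetical stand-in for the test spec */
int findmem_calls = 0;          /* records that the recovery path ran */

/* Let malloc execute, then overwrite its result, mirroring the two
 * added instructions. */
void *tested_malloc(size_t size)
{
    void *p = malloc(size);

    if (failures_to_simulate > 0) {
        failures_to_simulate--;
        free(p);          /* avoids a leak in this sketch only */
        p = NULL;         /* movl 0, <return val> */
        errno = ENOMEM;   /* movl ENOMEM, errno */
    }
    return p;
}

/* Stand-in for the program's recovery routine. */
void findmem(void)
{
    findmem_calls++;
}

/* Same shape as the slide's code; returns NULL where the original
 * would call xlfail. */
char *alloc_string(size_t size)
{
    char *sptr;

    if ((sptr = tested_malloc(size + 1)) == NULL) {
        findmem();
        if ((sptr = tested_malloc(size + 1)) == NULL)
            return NULL;
    }
    return sptr;
}
```

Setting `failures_to_simulate` to 1 exercises `findmem` followed by a successful retry; setting it to 2 also exercises the final failure branch.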

The Good, the Bad and the Ugly

The Good:
• Adequate simulation
• Can target system or application calls
• Saves quite a lot of tester effort
The Bad:
• Not a perfect simulation
The Ugly:
• Still a prototype

Security Mechanism Testing: Encrypting Function Pointers

Protects programs against function pointer attacks
Difficult to test (need vulnerable program and attack)
RUGRAT can simulate attack by adding instructions
• Very different from error handling code case

RUGRAT can be used for a variety of testing tasks.
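
The slides do not describe the encryption scheme or the instructions RUGRAT adds for this case. Purely as an illustration of what is being tested, the sketch below assumes a simple XOR-with-key scheme and models the attack as overwriting the stored pointer with a raw address. Every name here, the fixed key, and the scheme itself are assumptions.

```c
#include <stdint.h>

typedef int (*handler_fn)(void);

int real_handler(void)  { return 1; }
int attacker_code(void) { return -1; }

/* Hypothetical fixed key; a real scheme would choose it randomly. */
static const uintptr_t key = (uintptr_t)0x5a5a5a5a;

/* Function pointers are stored encrypted and decrypted before use. */
uintptr_t encrypt_ptr(handler_fn f)
{
    return (uintptr_t)f ^ key;
}

handler_fn decrypt_ptr(uintptr_t stored)
{
    return (handler_fn)(stored ^ key);
}

/* Simulated attack: the stored value is overwritten with a raw
 * address, as an attacker who does not know the key would write it. */
uintptr_t simulate_attack(void)
{
    return (uintptr_t)attacker_code;
}
```

With a nonzero key, decrypting the overwritten value does not yield the attacker's address, so a test can compare the decrypted pointer against the intended targets instead of calling it.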

Current Implementation Notes

Used DynamoRIO [1] dynamic compiler
• Some limitations (but new version is available)
Test spec from environment variables
Nothing fancy for oracle

[1] Bruening et al., CGO 2003

Experiments

Ran a variety of programs with RUGRAT
• space, SPEC, MiBENCH
Tested handling of errors in
• malloc / fopen / write
• Application calls

Experiments Summary

Can RUGRAT generate tests to cover error handling code?

YES! RUGRAT tested error handling code at 120+ callsites
(missed one because DynamoRIO incurred a segfault)

Experiments Summary

Can RUGRAT increase statement coverage for error handling code?

YES! RUGRAT increased code coverage of error handling code by ~50% (on average)
• Not all statements executed because of different options
• RUGRAT detected cases of omission errors

Fault Detection

Could RUGRAT help detect failures in error handling code?

Grad students seeded faults into error handling code for space program
• Changed assignments, loops, conditionals, etc.
• Seeded total of 34 faults

Fault Detection Summary

RUGRAT detected 15 / 34 faults
Of 19 undetected faults:
• 6 changed return values, but callers only checked certain values (e.g., if (func () != 0))
• 2 allocated too little memory (malloc may allocate more memory than requested anyway)
• 2 unknown
• 1 caused space to quit
• 8 instances where caller performed same code as callee (e.g., any fault in callee was undone by caller)
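
The first category of undetected faults can be illustrated with a small sketch. The functions and the specific values -1 and -2 are hypothetical; the point is only that a caller testing `!= 0` behaves the same for any nonzero return value, so a seeded change from one nonzero value to another cannot be observed.

```c
/* Hypothetical callee: the original version signals failure with -1. */
int callee_original(void)
{
    return -1;
}

/* Seeded fault: the failure value is changed to -2. */
int callee_faulty(void)
{
    return -2;
}

/* The caller only distinguishes zero from nonzero, in the style of
 * `if (func () != 0)`, so both versions look identical to it. */
int caller_sees_error(int (*func)(void))
{
    return func() != 0;
}
```

Here `caller_sees_error(callee_original)` and `caller_sees_error(callee_faulty)` produce the same result, so no test that observes only the caller's behavior can tell the two versions apart.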

Some related work

Holodeck [1], FIG [2]
• Require tester provide alternative "stub" functions to do testing
• Miss application calls
Dynamic branch switching [3]
• Not originally intended for testing error code
• Need to know which branch to change
• Far less accurate simulation

[1] Thompson et al., SAC 2002
[2] Broadwell et al., SHAMAN 2002
[3] Zhang et al., ICSE 2006

Conclusions and Summary

Presented RUGRAT architecture
• Can test hard to reach (and seldom tested) code by using dynamic compilers
• Saves tester effort
RUGRAT is a general tool

RUGRAT Architecture

[Figure: RUGRAT architecture. Labels: code; create basic block; basic block; Dynatest Generator; test spec; modified basic block; execute on CPU; test oracle; test report]

Experiments Summary

Tested a variety of programs with RUGRAT
120+ error handling code callsites covered
• Both application and system calls
Increased error code coverage by ~50% over regular test cases
• Not all error code statements could be covered
  • Different options, etc.
Reasonable time overhead