
Post on 19-Dec-2015


RUGRAT: Runtime Test Case Generation using Dynamic Compilers

Ben Breech, NASA Goddard Space Flight Center
Lori Pollock, University of Delaware
John Cavazos, University of Delaware

Motivating Example

    if ((sptr = malloc (size + 1)) == NULL) {
        findmem ();
        if ((sptr = malloc (size + 1)) == NULL)
            xlfail ("insufficient string space");
    }

How do I test this callsite?
• Make the machine run out of memory?
• Flip the conditional, recompile, flip back?
• Pretend it doesn’t exist during testing?

Generalizing the Problem

Code to handle uncommon situations
• Difficult to test
• May need external environment event to trigger

Examples:
• Error handling code
• Testing program security mechanisms

Observation

Hard to reach code executes when the program thinks something uncommon has occurred

    if ((sptr = malloc (size + 1)) == NULL) {
        findmem ();
        if ((sptr = malloc (size + 1)) == NULL)
            xlfail ("insufficient string space");
    }

Could test findmem() by simulating error
• e.g., could add instructions to the program so the program believes malloc failed

RUGRAT Approach

Use Dynamic Compilers to generate test cases for hard to reach code.

Automatically add instructions to the program during execution to simulate an uncommon situation.

Dynamic Compilers

Dynamic compilers perform compilation tasks during program execution

[Figure: dynamic compiler. Labels: code; create basic block; basic block; analysis, transformation, optimization, translate; modified basic block; execute on CPU.]

RUGRAT Architecture

[Figure: RUGRAT architecture. Dynamic compiler labels: code; create basic block; basic block; analysis, transformation, optimization, translate; modified basic block; execute on CPU. RUGRAT labels: test spec; Dynatest Generator; Test Oracle; test report.]

Test Spec

Details where/how for inserting tests
Current prototype limited (environment vars). Can express:
• Function locations
  • “test all calls to function x”
  • “test only second call to x in function y”
• Failure value (e.g., 0, -1, etc.)
• Some side effects
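As a rough illustration of what an environment-variable test spec could carry, the sketch below reads a target function, an optional enclosing function and call index, and a failure value. The variable names (RUGRAT_FUNC, RUGRAT_CALLER, RUGRAT_NTH, RUGRAT_FAILVAL), the struct layout, and both helper functions are hypothetical; the slides do not give the prototype's actual format.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of a test spec; not the prototype's real layout. */
struct test_spec {
    const char *func;    /* function whose calls are tested, e.g. "malloc" */
    const char *caller;  /* restrict to calls inside this function; NULL = any */
    long nth;            /* test only the nth call (1-based); 0 = all calls */
    long fail_value;     /* value the call should appear to return */
};

/* Fill *spec from the (assumed) environment variables.
 * Returns 0 on success, -1 if no target function was given. */
int load_test_spec(struct test_spec *spec)
{
    assert(spec != NULL);
    const char *nth = getenv("RUGRAT_NTH");
    const char *val = getenv("RUGRAT_FAILVAL");

    spec->func = getenv("RUGRAT_FUNC");
    spec->caller = getenv("RUGRAT_CALLER");
    spec->nth = nth ? strtol(nth, NULL, 10) : 0;
    spec->fail_value = val ? strtol(val, NULL, 10) : 0;
    return spec->func ? 0 : -1;
}

/* Does the spec select this particular call? */
int spec_matches(const struct test_spec *spec, const char *callee,
                 const char *caller, long call_index)
{
    if (strcmp(spec->func, callee) != 0)
        return 0;
    if (spec->caller && strcmp(spec->caller, caller) != 0)
        return 0;
    return spec->nth == 0 || spec->nth == call_index;
}
```

Under these assumptions, “test only second call to x in function y” with failure value -1 would correspond to func = "x", caller = "y", nth = 2, fail_value = -1, while leaving caller unset and nth at 0 expresses “test all calls to function x”.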

Dynatest Generator

Scans instructions for location to insert test (e.g., call to function X)
Allows function X to execute
Adds instructions to simulate error
• Instructions added after function X
• Program thinks error happened, reacts
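The scan-and-insert step can be sketched on a toy instruction list. This is only an illustration: the struct instr type, the insert_tests function, and the string form of the instructions are hypothetical stand-ins, not the DynamoRIO API or the prototype's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy instruction: an opcode name and one operand string. A hypothetical
 * stand-in for a dynamic compiler's instruction type. */
struct instr {
    const char *op;
    const char *arg;
};

/* Copy `in` to `out`, adding a simulated failure after every call to
 * `target`. Returns the number of instructions written, or 0 if `out`
 * (capacity `cap`) is too small or the input is empty. */
size_t insert_tests(const struct instr *in, size_t n, const char *target,
                    struct instr *out, size_t cap)
{
    assert(in != NULL && out != NULL && target != NULL);
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        if (w == cap)
            return 0;
        out[w++] = in[i];                 /* the original call still runs */
        if (strcmp(in[i].op, "call") == 0 && strcmp(in[i].arg, target) == 0) {
            if (cap - w < 2)
                return 0;
            /* Added after the call, so the program sees a failure. */
            out[w++] = (struct instr){ "movl", "0, <return val>" };
            out[w++] = (struct instr){ "movl", "ENOMEM, errno" };
        }
    }
    return w;
}
```

The key design point mirrored here is that the call to function X is copied through unchanged and the extra instructions come after it, so the rest of the block reacts to the overwritten return value.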

Example

    if ((sptr = malloc (size + 1)) == NULL) {
        findmem ();
        if ((sptr = malloc (size + 1)) == NULL)
            xlfail ("insufficient string space");
    }
    L1:

Original instructions:

    call malloc              (code for malloc)
    movl <return val> sptr
    cmpl sptr, 0
    jnz L1
    call findmem
    ….
    L1: …

After the Dynatest Generator:

    call malloc              (code for malloc)
    movl 0, <return val>
    movl ENOMEM, errno
    movl <return val> sptr
    cmpl sptr, 0
    jnz L1
    call findmem
    ….
    L1: …
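The effect of the two added instructions can be imitated at source level with a wrapper, which may help in reading the example. This is a sketch under assumptions, not how RUGRAT works (RUGRAT adds instructions at run time and needs no source change). The names simulated_malloc, fail_on_call, and alloc_string are hypothetical, findmem is a stub, and xlfail is replaced by a message and a NULL return.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the test spec: fail the nth call (1-based),
 * 0 = never fail. */
static int fail_on_call = 0;
static int calls_seen = 0;
static int findmem_calls = 0;

/* Source-level imitation of the inserted instructions. */
static void *simulated_malloc(size_t n)
{
    void *p = malloc(n);            /* the real call still executes */
    if (++calls_seen == fail_on_call) {
        free(p);                    /* sketch only: do not leak the real block */
        p = NULL;                   /* movl 0, <return val> */
        errno = ENOMEM;             /* movl ENOMEM, errno */
    }
    return p;
}

/* Stub for the recovery routine; counts how often it is reached. */
static void findmem(void) { findmem_calls++; }

/* The callsite from the example, using the wrapper. */
static char *alloc_string(size_t size)
{
    char *sptr;
    if ((sptr = simulated_malloc(size + 1)) == NULL) {
        findmem();
        if ((sptr = simulated_malloc(size + 1)) == NULL) {
            fprintf(stderr, "insufficient string space\n");
            return NULL;
        }
    }
    return sptr;
}
```

Setting fail_on_call to 1 drives execution into findmem() even though the machine has plenty of memory, which is the behavior the modified instruction sequence aims for.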

The Good, the Bad and the Ugly

The Good:
• Adequate simulation
• Can target system or application calls
• Saves quite a lot of tester effort

The Bad:
• Not a perfect simulation

The Ugly:
• Still a prototype

Security Mechanism Testing: Encrypting Function Pointers

Protects programs against function pointer attacks
Difficult to test (need vulnerable program and attack)
RUGRAT can simulate attack by adding instructions
• Very different from error handling code case

RUGRAT can be used for a variety of testing tasks.
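To make the idea concrete, here is a minimal sketch of one possible pointer-encryption scheme and a simulated attack on it. The slides do not say which scheme was tested. The XOR encoding, the fixed key, and every name below are assumptions for illustration only, and the casts between function pointers and uintptr_t rely on implementation-defined behavior that common platforms support.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*handler_fn)(void);

/* Hypothetical key; a real scheme would choose it randomly at startup. */
static const uintptr_t fp_key = (uintptr_t)0x5a5a5a5aUL;

/* Assumed scheme: pointers are stored XORed with the key. */
static uintptr_t fp_encrypt(handler_fn f) { return (uintptr_t)f ^ fp_key; }
static handler_fn fp_decrypt(uintptr_t e) { return (handler_fn)(e ^ fp_key); }

static int legit(void)    { return 1; }
static int attacker(void) { return 666; }

/* Stored (encrypted) pointer the program later calls through. */
static uintptr_t stored_handler;

/* Stand-in for added instructions: overwrite the stored pointer with a
 * raw address, as an attacker who does not know the key would. */
static void simulate_attack(void) { stored_handler = (uintptr_t)attacker; }
```

After simulate_attack(), decrypting the stored value no longer yields the attacker's address, so a tester can check that the protection reacts without needing a genuinely vulnerable program and exploit.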

Current Implementation Notes

Used DynamoRIO [1] dynamic compiler
• Some limitations (but new version is available)
Test spec from env. vars
Nothing fancy for oracle

[1] Bruening et al., CGO 2003

Experiments

Ran a variety of programs with RUGRAT
• space, SPEC, MiBENCH
Tested handling of errors in
• malloc / fopen / write
• Application calls

Experiments Summary

Can RUGRAT generate tests to cover error handling code?

YES! RUGRAT tested error handling code at 120+ callsites
(missed one because DynamoRIO incurred a segfault)

Experiments Summary

Can RUGRAT increase statement coverage for error handling code?

YES! RUGRAT increased coverage of error handling code by ~50% (on average)
• Not all statements executed because of different options
• RUGRAT detected cases of omission errors

Fault Detection

Could RUGRAT help detect failures in error handling code?

Grad students seeded faults into error handling code for the space program
• Changed assignments, loops, conditionals, etc.
• Seeded total of 34 faults

Fault Detection Summary

RUGRAT detected 15 / 34 faults
Of 19 undetected faults:
• 6 changed return values, but callers only checked certain vals (e.g., if (func () != 0))
• 2 allocated too little memory (malloc may allocate more memory than requested anyway)
• 2 unknown
• 1 caused space to quit
• 8 instances where caller performed same code as callee (e.g., any fault in callee was undone by caller)
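The first undetected category is easy to reproduce in miniature. In the sketch below, all function names are hypothetical and not taken from the space program. A seeded fault changes the callee's failure value, but the caller only tests for nonzero, so the observable behavior is unchanged.

```c
#include <assert.h>

/* Hypothetical callee: reports failure with -1. */
static int open_log_original(int ok) { return ok ? 0 : -1; }

/* Same callee with a seeded fault: the failure value is now -2. */
static int open_log_faulty(int ok) { return ok ? 0 : -2; }

/* The caller only asks "was it nonzero?", so both versions look identical. */
static int caller_sees_error(int (*callee)(int), int ok)
{
    return callee(ok) != 0;
}
```

Because caller_sees_error gives the same answer for both callees, no test that observes only the caller's reaction can distinguish the faulty version from the original.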

Some related work

Holodeck [1], FIG [2]
• Require tester to provide alternative “stub” functions to do testing
• Miss application calls
Dynamic branch switching [3]
• Not originally intended for testing error code
• Need to know which branch to change
• Far less accurate simulation

[1] Thompson et al., SAC 2002
[2] Broadwell et al., SHAMAN 2002
[3] Zhang et al., ICSE 2006

Conclusions and Summary

Presented RUGRAT architecture
• Can test hard to reach (and seldom tested) code by using dynamic compilers
• Saves tester effort

RUGRAT is a general tool

RUGRAT Architecture

[Figure: RUGRAT architecture. Labels: code; create basic block; basic block; modified basic block; execute on CPU; dynamic compiler; test spec; Dynatest Generator; Test Oracle; test report.]

Experiments Summary

Tested a variety of programs with RUGRAT
120+ error handling callsites covered
• Both application and system calls
Increased error code coverage ~50% over regular test cases
• Not all error code statements could be covered
  • Different options, etc.
Reasonable time overhead