differential testing - cis.upenn.edumhnaik/edu/cis700/lessons/differential... · mayur naik cis 700...
TRANSCRIPT
9/24/18
1
DifferentialTesting
MayurNaikCIS700– Fall2018
Motivation
Providingfunctionalspecifications forlargesoftwareisveryhard
• Functionalspecification:desiredpropertiesofinputsandoutputs
• E.g.,Ccompiler
> 600pagesinEnglish
=?
C Compiler
Input
Output
9/24/18
2
AnIdea
C Compiler
Input
Other C Compilers
Compare OutputBuggy Correct
different same
DifferentialTesting
Cross-check– Executedifferentimplementationsofthesamefunctionality(e.g.,GCC andLLVM)withthesameinputs
– Comparetheiroutputs– Report anyanomaliesasbugs
9/24/18
3
SuccessStories
• Compilers
• NeuralNets
int foo() {signed char x = 1;unsigned char y = 255;return x > y;
}
1
gcc inUbuntu8.04
Othercompilers
0Buggy Correct
OneversionofNvidia self-drivingcar
systemOtherversions
Correct“Go straight” “Turn left”
Buggy
Challenges
Howdowegenerategoodinputs?
– Concise:Avoidsillegalandredundanttests
– Diverse:Givesgoodcoverageofdiscrepantparts
9/24/18
4
Approaches
• UnguidedApproach– Generatetestinputsindependentlyacrossiterations
– e.g.,Csmith• GuidedApproach
– Generatetestinputsbyobservingprogrambehaviorforpastinputs
– e.g.,NEZHA,DeepXplore
UnguidedTestingforCCompilers
CSmith:UnguidedDifferential Testing toolforCCompilers
Random C program generator
C-program
Binary 1
...
Binary 2
Binary n
Compare
9/24/18
5
InputGeneration
• Basedonrandomtesting– RandomlygenerateCprograms
• Consideringdomain-specificknowledge– Well-formedness(Csyntax)– Well-definedness (Csemantics)
FoundCompilerBugs
LLVMVersion
1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8
#Bugs 27 20 18 22 22 12 13 7 10 1
GCCVersion
3.0.0 3.1.0 3.2.0 3.3.0 3.4.0 4.0.0 4.1.0 4.2.0 4.3.0 4.4.0
#Bugs 10 11 9 7 7 11 7 6 14 5
9/24/18
6
LimitationsofUnguidedTesting
Generally,highlyinefficienttofinddiscrepancies,because:
– Randomlygenerates inputs
– Ignores anyinformationfrompastinputs
GuidedTestingforBinaries
NEZHA:GuidedDifferential Testing tool forBinaries
– Exploitbehavioraldiscrepanciesbetweenmultiplebinaryexecutables
– Evolveaninputcorpusthatisguidedbasedonruntimeinformation(obtainedbydoingbinaryinstrumentation)
9/24/18
7
GuidedTesting
GuidedTesting
• Startingfrominputseeds,keepgeneratingmutationsuntildiscrepanciesarefound
• Insteadofmaintainingallmutants,onlykeep“promising”ones– Mutantsofinputsthatledtocombinations
ofstatesunseensofar
9/24/18
8
Example
Inputcorpus:{ 7,0,1}
Exercisedinput:
Exercisededges: { }
Example
Inputcorpus:{ 7,0,1}
Exercisedinput:7
Exercisededges:{
{E1,E2’,E3’}}
9/24/18
9
Example
Inputcorpus:{ 7, 0,1 }
Exercisedinput:0
Exercisededges:{
{E1,E2’,E3’},{E2,E3,E1’}
}
Example
Inputcorpus:{ 7, 0,1 }
Exercisedinput:1
Exercisededges:{
{E1,E2’,E3’},{E2,E3,E1’},{E1,E1’}
}
Notaddedanddiscarded
9/24/18
10
Example
Inputcorpus:{ 7, 0,1 }
Exercisedinput:2(pick1andmutateitbyadding1)
Exercisededges:{
{E1,E2’,E3’},{E2,E3,E1’},{E1,E1’},{E3,E4,E1’}
}
Intuition
• Inputsthatexercisedifferentcoderegions
inthetwoappsmightimplydifferencesin
handlinglogic
• Suchinputsarelikelytofinddiscrepancies
9/24/18
11
CaseStudies
Application Tests
SSLLibraries OpenSSL,LibreSSL,BoringSSL,GnuTLS,wolfSSL,mbedTLS
PDFReaders EvincePDF,MuPDF,Xpdf
ELF Parser ClamAV,binutils
XZParser ClamAV,XZ
DiscrepanciesandBugsFound
Type SSLCertificate
PDFFile
ELFBinary
XZArchive
Discrepancies 764 7 2 5
ErrorsandCrashes 6 0 0 2
9/24/18
12
GuidedTestingforDNNs
DeepXplore:guideddifferentialtestingtoolforDeepNeuralNetworks(DNNs)
GuidedTestingforDNNs
Decision boundary of DNN 1 Decision boundary
of DNN 2
Seed input
9/24/18
13
InputGeneration
• Iteratively changing a seed input• Two criteria:1) Maximizing differential behaviors2) Maximizing neuron coverage
• Formally, maximizing the following objective function
obj(x) = (𝚺k≠jF(x)[c] - ƛ1∙Fj(x)[c]) + ƛ2∙fn(x)
obj1 obj2
MaximizingDifferentialBehavior
• Supposewehaven DNNsF1…Fn• LetFk(x)[c]betheclassprobability thatFk(x)predicts xtobe c
• RandomlyselectoneDNNFk• Maximizethisobjectivefunction:
• Easilysolvedusinggradientascent
obj1(x) = 𝚺k≠jF(x)[c] - ƛ1∙Fj(x)[c]
9/24/18
14
MaximizingNeuronCoverage
• Iterativelypickaninactivatedneuronf
• Modifytheinputsuchthataneuronbecomesactivated,thatis,maximizethis objectivefunctionforneuronn:
• Easilysolvedusinggradientascent
obj2(x) = fn(x)
EmpiricalEvaluation
Dataset Description DNN Description DNN Name #Neurons #Discripancies/ #Seeds
MNIST Hand-written digits LeNet variations
MNI_C1MNI_C2MNI_C3
52148268
1,073 / 2,0001,968 / 2,000
827 / 2,000
Imagenet General images
State-of-the-art image classifiers
IMG_C1IMG_C2IMG_C3
14,88816,16894,059
1,969 / 2,0001,976 / 2,0001,996 / 2,000
Driving Driving video frames
Nvidia DAVE self-driving systems
DRV_C1DRV_C2DRV_C3
1,5601,560
844
1,720 / 2,0001,866 / 2,0001,930 / 2,000
Contagio/Virustoral PDFs PDF malware detectors
PDF_C1PDF_C2PDF_C3
402602802
1,103 / 2,000789 / 2,000
1,253 / 2,000
Drebin Android apps Android app malware detectors
APP_C1APP_C2APP_C3
402102212
2,000 / 2,0002,000 / 2,0002,000 / 2,000
9/24/18
15
EmpiricalEvaluation
WhatHaveWeLearned?
• Feedingsameinputstodifferentimplementations
• Observingbehavioraldifferences
• Reportingdiscrepanciesasbugs
• Inputgenerationmethods– Unguidedvs. Guided– Tools:Csmith,NEZHA,DeepXPlore