Modeling Input-Dependent Error Propagation in Programs
Guanpeng (Justin) Li and Karthik Pattabiraman, University of British Columbia
Soft Errors
[Figure: bit values 0001 and 0101, illustrating a single bit flip]
• Error propagation in programs
• Silent Data Corruption (SDC)
• Incorrect program output
• Crash
• Benign
• Traditional solutions are too expensive
• Hardware duplication
• Circuit hardening
Researchers have expected modern software applications to tolerate hardware errors
Bounding SDC Rate
[Figure: Pool of Representative Inputs + Fault Injections; for each input, Evaluation of Program SDC Rate; result: Bound of Program SDC Rate]
Fault Injection for Measurement of SDC Rate
[Figure: One Program Input → Program Execution (artificially introduce a fault) → Observe Failure: SDC, Crash, or Benign]
Repeat for thousands of samples for the same input
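The fault-injection loop on this slide can be sketched in a few lines. This is a minimal illustration, not LLFI and not the authors' tooling: `toy_program`, `classify`, and `campaign` are hypothetical names, the "program" is a made-up table lookup, and the fault model is one random bit flip in one intermediate value per run.

```python
import random

def toy_program(values, fault=None):
    """Hypothetical program under test: sum of table lookups.

    fault is an optional (step, bit) pair: flip that bit of the index
    computed at that loop step (a single bit flip, one per execution).
    """
    table = list(range(256))
    acc = 0
    for step, v in enumerate(values):
        idx = v % 256
        if fault is not None and fault[0] == step:
            idx ^= 1 << fault[1]      # artificially introduce the fault
        acc += table[idx] >> 2        # low two bits are masked out
    return acc

def classify(values, fault):
    """Run once with the fault and compare against the fault-free output."""
    golden = toy_program(values)
    try:
        out = toy_program(values, fault)
    except IndexError:
        return "crash"                # out-of-range access
    return "benign" if out == golden else "sdc"

def campaign(values, samples=3000, seed=0):
    """Repeat the injection for many random faults on the same input."""
    rng = random.Random(seed)
    counts = {"sdc": 0, "crash": 0, "benign": 0}
    for _ in range(samples):
        fault = (rng.randrange(len(values)), rng.randrange(16))
        counts[classify(values, fault)] += 1
    return {kind: n / samples for kind, n in counts.items()}

if __name__ == "__main__":
    print(campaign([3, 200, 77, 1000]))
```

Each sample needs a full execution of the program, which is why the cost grows with both the number of samples and the number of inputs.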
Problems
• Fault injection
• Even one execution may take hours in a large program
• Need to repeat for thousands of samples for one input
• SDC is highly input-dependent (shown later)
• SDC rate changes if the program input is changed
• Repeat the whole fault injection campaign for each input
• Even worse…
• Need to redo the whole evaluation every time the code is changed
Already time-consuming for only one program input
Bounding the program SDC rate takes too much time
Impractical to integrate into the development cycle
Our Goals & Contributions
Fast prediction of the bounds of the program SDC rate
1. Understand how different program inputs affect error propagation
2. Develop a fast model to bound the SDC rate of the program given multiple program inputs
[Figure: Accuracy vs. Performance for Fault Injection, Czek et al. / Folkesson et al., Trident, and vTrident]
Challenges
• Fault injection approach is black-box
• Don’t know what happens during the execution of billions of instructions
Approach
• Understand what to model
• Our analytical model: Trident
• Closed formula of error propagation
• Key Insight:
• Only some parts of the entire model are critical to bounding the program SDC rate
• Remove from Trident the parts that are not sensitive to the change of input
vTrident: Volatility Prediction for Trident
Trident Insights
Trident: Three-level modeling
• Register-communication module
• Control-flow module
• Memory dependency module
SDC propagation = fs • fc • fm = x1 • x2 • x3 …
Given a program & different inputs
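The closed-form product on the Trident slide can be read as multiplying the propagation probabilities contributed by each modeling level. The sketch below only illustrates that reading: the function name and the sample probabilities are hypothetical, and treating the factors as independent is an assumption, not something the slides state.

```python
from math import prod

def sdc_propagation(factors):
    """Multiply per-level (or per-step) propagation probabilities.

    factors: iterable of probabilities in [0, 1], e.g. (fs, fc, fm).
    Independence of the factors is assumed for illustration only.
    """
    factors = list(factors)
    for p in factors:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each factor must be a probability in [0, 1]")
    return prod(factors)

if __name__ == "__main__":
    # Hypothetical values for the register, control-flow and memory levels.
    fs, fc, fm = 0.8, 0.5, 0.9
    print(sdc_propagation((fs, fc, fm)))
```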
What to model?
• Program SDC rate is an aggregation of both…
• Instruction Execution
• Instruction SDC rate
Program SDC Rate = f ( Inst Exec, Inst SDC Rate )
Input A vs. Input B, for the instruction ADD R0, R1, 4:
• Variation of Instruction Execution: 400 times vs. 100 times (easy to profile)
• Variation of Instruction SDC Rate: 40% vs. 70% (hard to model)
• Variation of Program SDC Rate: program SDC rate changes!
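One plausible form of f is an execution-weighted average of the per-instruction SDC rates. That form is an assumption made here for illustration (it corresponds to faults landing uniformly over dynamic instructions); the slides do not give the exact formula. The second instruction in the example is hypothetical and is added only so that the weighting has an effect.

```python
def program_sdc_rate(instructions):
    """Execution-weighted average of instruction SDC rates.

    instructions: list of (exec_count, inst_sdc_rate) pairs.
    Assumes each dynamic instruction is equally likely to be hit.
    """
    total_exec = sum(count for count, _ in instructions)
    if total_exec == 0:
        raise ValueError("no instructions executed")
    return sum(count * rate for count, rate in instructions) / total_exec

if __name__ == "__main__":
    # ADD R0, R1, 4 from the slide plus one hypothetical instruction
    # that behaves the same under both inputs (600 executions, 10%).
    input_a = [(400, 0.40), (600, 0.10)]
    input_b = [(100, 0.70), (600, 0.10)]
    print(program_sdc_rate(input_a), program_sdc_rate(input_b))
```

Both the execution count and the instruction SDC rate move the result, so profiling execution counts alone cannot capture the change between inputs.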
Prior Work
• Only models the variation of instruction execution…
• Poor accuracy in quantifying the variation of program SDC rate
[Figure: Variation of Program SDC Rate, with legend entry Inst Exec. Error bar: 0.03% - 0.55% at 95% confidence]
Need to model the variations of both instruction SDC rate and instruction execution!
Variation of Instruction SDC Rate
One example: CMPGT R1, R0 with a 32-bit data width
• Input A: R0 = 512, R1 = 16; propagation probability ~71%
• Input B: R0 = 64, R1 = 16; propagation probability ~84%
[Figure: 32-bit register value split into 9 and 23 bits for Input A, and into 5 and 27 bits for Input B]
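The input dependence of this comparison can be explored by enumerating every single-bit flip and counting how often the outcome changes. The sketch below rests on assumptions that the transcript does not confirm: the fault hits R1, values are unsigned 32-bit integers, and CMPGT R1, R0 tests R1 > R0. Under those assumptions R0 = 512 gives 23 of 32 flips (about 71.9%, in line with the slide's ~71%), while R0 = 64 gives 26 of 32 (about 81%), which is close to but not the same as the slide's ~84%. The transcript does not preserve enough detail to explain that difference.

```python
def cmpgt_flip_fraction(r1, r0, width=32):
    """Fraction of single-bit flips in r1 that change the result of r1 > r0.

    Assumes unsigned values and that the fault corrupts r1 only.
    """
    golden = r1 > r0
    changed = sum(
        ((r1 ^ (1 << bit)) > r0) != golden   # outcome after flipping one bit
        for bit in range(width)
    )
    return changed / width

if __name__ == "__main__":
    print(cmpgt_flip_fraction(16, 512))  # Input A on the slide
    print(cmpgt_flip_fraction(16, 64))   # Input B on the slide
```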
vTrident: Steps
• Bounding program SDC rates for given inputs
[Figure: Pool of Program Inputs + Program under Test → vTrident (run each input) → Get Rankings and Variation of Program SDC Rate (Max - Min) → Measure SDC Rate of the reference input (Ref) → Derive Bounding]
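The steps on this slide can be sketched as follows. The slides say that vTrident ranks the inputs, determines the variation (Max - Min), and that one reference input is measured. How the predicted variation is anchored to that measurement is not shown, so the offset rule below is an assumption, and all names are hypothetical.

```python
def derive_bounds(predicted, ref_input, ref_measured):
    """Derive SDC-rate bounds from model predictions and one measurement.

    predicted:    dict mapping input name -> predicted SDC rate
    ref_input:    the input whose SDC rate was measured (e.g. by fault injection)
    ref_measured: that measured SDC rate
    Assumption: the measured reference sits at the same relative position
    inside the range as its prediction does.
    """
    lo_pred = min(predicted.values())
    hi_pred = max(predicted.values())
    variation = hi_pred - lo_pred                      # Max - Min
    ranking = sorted(predicted, key=predicted.get)     # lowest to highest
    lower = max(0.0, ref_measured - (predicted[ref_input] - lo_pred))
    upper = min(1.0, ref_measured + (hi_pred - predicted[ref_input]))
    return ranking, variation, (lower, upper)

if __name__ == "__main__":
    preds = {"in1": 0.10, "in2": 0.14, "in3": 0.12}    # hypothetical values
    print(derive_bounds(preds, "in3", 0.15))
```

Only one input needs an actual fault-injection campaign in this scheme; the others are covered by the model's predicted spread.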
Experimental Setup
• Comparison of the Bounding of SDC Rate
• Accuracy and performance
• Fault injection results derived by LLFI [1] as baseline
• Fault Model
• Single bit-flip
• One fault injected per program execution
• Benchmarks
• 9 open-source benchmarks from various domains taking numerical inputs
• 10 inputs generated for each benchmark
[1] LLVM Fault Injector [DSN’14]
[Table: benchmarks and their application domains]
Evaluation: Variation of Program SDC Rate
• Methodology
• Derived by vTrident, Inst Exec, and fault injection
• The closer to the fault injection result, the better the prediction
Variation of Program SDC Rate = Max - Min
[Figure: Variation of Program SDC Rate per benchmark. Error bar: 0.03% - 0.55% at 95% confidence]
vTrident is much better at predicting the variation of program SDC rate
Evaluation: Bounding SDC Rate
• Methodology:
• Ranking of SDC rates by fault injection and vTrident: average distance of 2.11
• Bounding as many of the measured SDC rates as possible with the predicted variation of program SDC rate
[Figure: Measured SDC Rate by Fault Injection, Bounding Derived by Inst-Exec, and Bounding Derived by vTrident. Y-axis: SDC Rate. Error bar: 0.03% - 0.55% at 95% confidence]
vTrident bounds 79% of the measured SDC rates, whereas the other model bounds merely 32%
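The two quantities on this slide, a ranking distance and the share of measured SDC rates that fall inside the bounds, could be computed along these lines. The metric definitions are assumptions made for illustration (mean absolute difference in rank position, and simple interval membership); the slides do not define them, and the sample data is invented.

```python
def average_rank_distance(measured, predicted):
    """Mean absolute difference between each input's rank in the two orderings."""
    rank_m = {name: i for i, name in enumerate(sorted(measured, key=measured.get))}
    rank_p = {name: i for i, name in enumerate(sorted(predicted, key=predicted.get))}
    return sum(abs(rank_m[n] - rank_p[n]) for n in measured) / len(measured)

def bounded_fraction(measured_rates, lower, upper):
    """Share of measured SDC rates that lie inside [lower, upper]."""
    inside = sum(lower <= rate <= upper for rate in measured_rates)
    return inside / len(measured_rates)

if __name__ == "__main__":
    measured = {"a": 0.10, "b": 0.20, "c": 0.30}     # hypothetical data
    predicted = {"a": 0.12, "b": 0.31, "c": 0.25}
    print(average_rank_distance(measured, predicted))
    print(bounded_fraction(measured.values(), 0.15, 0.35))
```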
Evaluation: Performance
• Wall-Clock Time
• Sample 3,000 faults for each input, 10 inputs in total for each benchmark
• vTrident takes 2.6 hours: 8x faster than Trident and 37x faster than fault injection
• Memory Required (Peak)
• vTrident: 14.97 GB
• Trident: 4 out of 9 benchmarks require more than 32 GB of memory
vTrident is significantly faster than prior techniques and requires far fewer hardware resources
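The error bars quoted on the evaluation slides (0.03% - 0.55% at 95% confidence) are consistent in magnitude with sampling 3,000 faults per input. A minimal sketch of that relationship follows. It assumes the usual normal approximation for a binomial proportion, which the slides do not name, so it should not be read as the authors' exact procedure.

```python
import math

def margin_of_error(sdc_rate, samples, z=1.96):
    """Half-width of a normal-approximation confidence interval.

    sdc_rate: measured fraction of injections that ended in an SDC
    samples:  number of fault injections
    z:        1.96 corresponds to 95% confidence
    """
    if samples <= 0:
        raise ValueError("samples must be positive")
    return z * math.sqrt(sdc_rate * (1.0 - sdc_rate) / samples)

if __name__ == "__main__":
    for rate in (0.001, 0.01, 0.025):
        print(rate, margin_of_error(rate, 3000))
```

For example, a measured SDC rate of 2.5% over 3,000 injections gives a half-width of roughly 0.56 percentage points under this approximation.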
vTrident in Practice
• Built as a compiler module
• Fully automated
• Fast bounding of program SDC rate
• Integration into the software development process
[Figure: a step in the development process, labeled "Now can be replaced by vTrident"]
Summary
• Error propagation is highly input-dependent
• Fault injections are too slow to bound the program SDC rate given multiple inputs
• Understanding input-dependent error propagation
• vTrident is a fast and accurate model to bound the program SDC rate
• Open Source: Code available in the same repo as Trident
• https://github.com/DependableSystemsLab/Trident
Guanpeng (Justin) Li, University of British Columbia
Backup Slides
Software Approach
Device/Circuit Level
Architectural Level
Operating System Level
Application Level
[Figure labels: Soft Error, Impactful Errors, Increasing Protection Overhead]
vTrident: Methodology
• Modifying Trident
• Simplify memory dependency modeling that is not sensitive to the variation
• Given inputs, vTrident…
• Predicts the relative ranking of SDC rates
• Determines the variation of program SDC rates
Evaluation: Performance
• Wall-Clock Time
• Sample 3,000 faults for each input, 10 inputs in total for each benchmark
• vTrident takes 2.6 hours: 8x faster than Trident and 37x faster than fault injection
• Memory Required
• Average Trace Size
• vTrident: 0.13 MB
• Trident: 28.13 GB
• Peak Memory Consumption
• vTrident: 14.97 GB
• Trident: 4 out of 9 benchmarks require more than 32 GB of memory
vTrident is significantly faster than prior techniques and requires far fewer hardware resources