spectaint: speculative taint analysis for discovering

14
SpecTaint: Speculative Taint Analysis for Discovering Spectre Gadgets Zhenxiao Qi UC Riverside [email protected] Qian Feng Baidu USA [email protected] Yueqiang Cheng * NIO Security Research [email protected] Mengjia Yan MIT [email protected] Peng Li ByteDance [email protected] Heng Yin UC Riverside [email protected] Tao Wei Ant Group [email protected] Abstract—Software patching is a crucial mitigation approach against Spectre-type attacks. It utilizes serialization instructions to disable speculative execution of potential Spectre gadgets in a program. Unfortunately, there are no effective solutions to detect gadgets for Spectre-type attacks. In this paper, we propose a novel Spectre gadget detection technique by enabling dynamic taint analysis on speculative execution paths. To this end, we simulate and explore speculative execution at system level (within a CPU emulator). We have implemented a prototype called SpecTaint to demonstrate the efficacy of our proposed approach. We eval- uated SpecTaint on our Spectre Samples Dataset, and compared SpecTaint with existing state-of-the-art Spectre gadget detection approaches on real-world applications. Our experimental results demonstrate that SpecTaint outperforms existing methods with respect to detection precision and recall by large margins, and it also detects new Spectre gadgets in real-world applications such as Caffe and Brotli. Besides, SpecTaint significantly reduces the performance overhead after patching the detected gadgets, compared with other approaches. I. I NTRODUCTION Spectre has now become an emerging attack vector [29]– [31] that breaks down the isolation between processes. Spectre attacks have prompted widespread security concerns, since they can exploit critical vulnerabilities across the spectrum of different processor architectures, including those from Intel, AMD, and ARM. Any devices with these vulnerable proces- sors could be leveraged to defeat the security protection in operating systems (OSs) [29], browsers [21] and infrastruc- tures [45] built on these devices. Spectre mitigation at the hardware level is challenging. To ensure the security of processors, SafeSpec [27] proposes shadow hardware structures designed for speculative execution, such that the microarchitectural state can also be discarded to avoid leakage through side channels. InvisiSpec [46] provides a speculative buffer that stores speculative loads. Thereby, * The main work was done when Yueqiang Cheng worked at Baidu Security. speculative loads are invisible to the cache line state. While these techniques can mitigate Spectre attacks by disabling cache side channels, they are still at the design stage. Software patching is another way to mitigate Spectre attacks. Software vendors cannot always assume their services are running on patched processors, especially for these services running on third-party cloud platforms. Fortunately, software patching provides vendors an opportunity to take control over the security of their products. When there is no effective remedy applied to the hardware, software vendors can utilize serialization instructions [13] to patch Spectre gadgets in software and protect their products. However, it is nontrivial to find a Spectre gadget (a sequence of instructions that leaks information via speculative execution). It is impossible to blindly serialize all conditional branching instructions in a program, which will introduce a very high performance penalty. Many researchers propose various solutions to strike a balance between security and per- formance. Generally speaking, they try to find a set of potential Spectre gadgets in a program and only patch these gadgets to avoid high runtime overhead. Spectre V1 Scanner from RedHat [5] and MSCV Spectre 1 pass [41] search in binary for gadget patterns, and only patch gadgets that match predefined patterns. Tools like SPECTECTOR [23] and oo7 [43] conduct more advanced static analysis such as symbolic execution and taint analysis to detect Spectre gadgets. They are more accurate and generic than simple pattern matching. However, static anal- ysis approaches are known to be imprecise, bringing high false positives and sometimes false negatives when detecting Spectre gadgets. Having realized the limitations in the static analysis, SpecFuzz [39] takes a dynamic analysis approach. It extends fuzzing to not only monitor the normal execution of a program but also simulate its speculative execution paths. It simulates speculative execution by inserting speculative execution logic into the original program at compile time, and detects Spectre gadgets when out-of-bound memory accesses are observed during fuzzing. However, to ensure high fuzzing throughput during simulation of speculative execution, its simulation logic is oversimplified, causing both high false positives and false negatives (see Section II-B for more detailed discussions). In this paper, we would like to enable dynamic taint analysis to discover Spectre gadgets. We propose to conduct Network and Distributed Systems Security (NDSS) Symposium 2021 21-24 February 2021 ISBN 1-891562-66-5 https://dx.doi.org/10.14722/ndss.2021.23466 www.ndss-symposium.org

Upload: others

Post on 27-Nov-2021

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SpecTaint: Speculative Taint Analysis for Discovering

SpecTaint Speculative Taint Analysis forDiscovering Spectre Gadgets

Zhenxiao QiUC Riverside

zqi020ucredu

Qian FengBaidu USA

fengqianbaiducom

Yueqiang ChenglowastNIO Security Researchyueqiangchengnioio

Mengjia YanMIT

mengjiaymitedu

Peng LiByteDance

penglibytedancecom

Heng YinUC Riverside

hengcsucredu

Tao WeiAnt Group

lenxweiantgroupcom

AbstractmdashSoftware patching is a crucial mitigation approachagainst Spectre-type attacks It utilizes serialization instructionsto disable speculative execution of potential Spectre gadgets in aprogram Unfortunately there are no effective solutions to detectgadgets for Spectre-type attacks In this paper we propose a novelSpectre gadget detection technique by enabling dynamic taintanalysis on speculative execution paths To this end we simulateand explore speculative execution at system level (within a CPUemulator) We have implemented a prototype called SpecTaintto demonstrate the efficacy of our proposed approach We eval-uated SpecTaint on our Spectre Samples Dataset and comparedSpecTaint with existing state-of-the-art Spectre gadget detectionapproaches on real-world applications Our experimental resultsdemonstrate that SpecTaint outperforms existing methods withrespect to detection precision and recall by large margins andit also detects new Spectre gadgets in real-world applicationssuch as Caffe and Brotli Besides SpecTaint significantly reducesthe performance overhead after patching the detected gadgetscompared with other approaches

I INTRODUCTION

Spectre has now become an emerging attack vector [29]ndash[31] that breaks down the isolation between processes Spectreattacks have prompted widespread security concerns sincethey can exploit critical vulnerabilities across the spectrum ofdifferent processor architectures including those from IntelAMD and ARM Any devices with these vulnerable proces-sors could be leveraged to defeat the security protection inoperating systems (OSs) [29] browsers [21] and infrastruc-tures [45] built on these devices

Spectre mitigation at the hardware level is challengingTo ensure the security of processors SafeSpec [27] proposesshadow hardware structures designed for speculative executionsuch that the microarchitectural state can also be discarded toavoid leakage through side channels InvisiSpec [46] providesa speculative buffer that stores speculative loads Thereby

lowastThe main work was done when Yueqiang Cheng worked at Baidu Security

speculative loads are invisible to the cache line state Whilethese techniques can mitigate Spectre attacks by disablingcache side channels they are still at the design stage

Software patching is another way to mitigate Spectreattacks Software vendors cannot always assume their servicesare running on patched processors especially for these servicesrunning on third-party cloud platforms Fortunately softwarepatching provides vendors an opportunity to take control overthe security of their products When there is no effectiveremedy applied to the hardware software vendors can utilizeserialization instructions [13] to patch Spectre gadgets insoftware and protect their products

However it is nontrivial to find a Spectre gadget (asequence of instructions that leaks information via speculativeexecution) It is impossible to blindly serialize all conditionalbranching instructions in a program which will introducea very high performance penalty Many researchers proposevarious solutions to strike a balance between security and per-formance Generally speaking they try to find a set of potentialSpectre gadgets in a program and only patch these gadgetsto avoid high runtime overhead Spectre V1 Scanner fromRedHat [5] and MSCV Spectre 1 pass [41] search in binary forgadget patterns and only patch gadgets that match predefinedpatterns Tools like SPECTECTOR [23] and oo7 [43] conductmore advanced static analysis such as symbolic execution andtaint analysis to detect Spectre gadgets They are more accurateand generic than simple pattern matching However static anal-ysis approaches are known to be imprecise bringing high falsepositives and sometimes false negatives when detecting Spectregadgets Having realized the limitations in the static analysisSpecFuzz [39] takes a dynamic analysis approach It extendsfuzzing to not only monitor the normal execution of a programbut also simulate its speculative execution paths It simulatesspeculative execution by inserting speculative execution logicinto the original program at compile time and detects Spectregadgets when out-of-bound memory accesses are observedduring fuzzing However to ensure high fuzzing throughputduring simulation of speculative execution its simulation logicis oversimplified causing both high false positives and falsenegatives (see Section II-B for more detailed discussions)

In this paper we would like to enable dynamic taintanalysis to discover Spectre gadgets We propose to conduct

Network and Distributed Systems Security (NDSS) Symposium 202121-24 February 2021ISBN 1-891562-66-5httpsdxdoiorg1014722ndss202123466wwwndss-symposiumorg

dynamic taint analysis on speculative execution paths and dis-cover Spectre gadgets based on data flow patterns Comparedwith syntax-based [5] and sanitizer-based approaches [39] ourgadget pattern can capture the semantics of Spectre gadgetsAs substantiated in the evaluation (Section VI) our semantic-based gadget pattern produces fewer false positives comparedwith the existing works Furthermore static taint analysis suf-fers from the severe over-tainting and under-tainting issues dueto imprecise memory alias analysis incomplete control flowgraph extraction etc As shown in our evaluation (Section VI)oo7 has poor precision and recall rates when analyzing real-world programs

However existing dynamic binary taint analysis platformscannot detect Spectre gadgets because speculative execution isinvisible to normal program execution Therefore we extenda dynamic taint analysis platform and instrument speculativeexecution logic on-the-fly As a result we can simulate spec-ulative execution and capture data flow patterns on specula-tive paths for Spectre gadget discovery To the best of ourknowledge we are the first to enable dynamic taint analysison speculative paths for Spectre gadget detection

We have implemented a prototype SpecTaint to demon-strate the efficacy of speculative taint analysis in detectingSpectre gadgets We evaluated the performance of SpecTainton our Spectre Samples Dataset and six real-world programsThe experimental results demonstrate that SpecTaint out-performs the baseline approaches in terms of the precisionand recall by large margins It also has reasonable runtimeefficiency and reduces the performance overhead after patchingdetected gadgets by 73 compared with the conservativehardening strategy Besides SpecTaint can detect new Spectregadgets in real-world applications like Brotli [7] and Caffeframework [1]

We summarize our contributions as follows

bull We propose a dynamic speculative execution simulationplatform that enables dynamic taint analysis on specula-tive execution paths With the support of dynamic taintanalysis we deploy a semantic-based gadget detector thatdetects exploitable Spectre gadgets during the programexecution

bull We build a synthetic dataset by inserting known Spectregadgets into selected real-world programs The datasetcan be considered as a benchmark to help researchersevaluate their Spectre gadget detection tools It will ben-efit the security community

bull We implement a prototype SpecTaint to demonstratethe efficacy of our approach and compare SpecTaintwith state-of-the-art tools on the Spectre Samples Datasetand real-world programs Our evaluation indicates thatSpecTaint outperforms existing methods with respect toprecision and recall with reasonable runtime efficiencyMoreover SpecTaint discovered new Spectre gadgetsthat were not detected by the other tools Besides it sig-nificantly reduces the execution overhead after patchingdetected gadgets compared with the other approaches

II OVERVIEW

In this section we first walk through a motivating exampleto explain how a Spectre V1 gadget is exploited by attackers

Then we introduce the background and limitations of state-of-the-art tools that discover Spectre gadgets from binariesFurthermore we propose our approach and show the capabil-ities of our approach to address these limitations Then wegive a brief introduction about the overview and mechanismof SpecTaint At last we discuss the scope and assumptionsof this work

A Motivating Example

Listing 1 shows a code snippet that can be exploited tolaunch a Spectre V1 attack [34]

1 vo id v i c t i m f u n c t i o n ( s i z e t u s e r i n p u t )2 3 4 i f ( u s e r i n p u t lt g e t l e n ( a r r a y 1 ) )5 s e c r e t = a r r a y 1 [ u s e r i n p u t ] RS Read S e c r e t6 temp amp= a r r a y 2 [ s e c r e t lowast 2 5 6 ] LS Leak S e c r e t7 8 9 i n t main ( )

10 11 v i c t i m f u n c t i o n ( u s e r i n p u t ) 12 r e t u r n 0 13

Listing 1 Code snippet containing Spectre gadgets

In this example the if statement is a sanity check thatensures the following array access is within a valid rangeWhen evaluating the sanity check the outcome of the branchat line 4 may take many CPU cycles to be determined (egdue to the delay caused by a load from the main memory)To avoid the performance penalty caused by this delay thebranch prediction unit (BPU) will predict the branch outcomefrom where the instructions will be executed speculatively Anattacker can poison the branch predictor by feeding craftedinputs to the program to intentionally trick the BPU intomaking an expected prediction on that branch Then theattacker can launch the Spectre attack by running the code withan out-of-bounds value as input In this case the BPU predictsthe branch outcome to be true and the processor speculativelyexecutes instructions at line 5 and line 6 Consequently anarbitrary value can be read using an out-of-bounds index toaccess array1 at line 5 Then another index related to theloaded secret is used to access array2 and results in cacheline state changes for array2 After the processor finds outthe prediction to be wrong it discards all architectural effectsmade by speculative instructions However side effects (egcache line state changes) still remain at the micro-architecturallevel and the attacker can launch a cache side-channel attack(eg Flush+reload [48]) to retrieve the secret value

B Background and Rationale

Speculative execution is a hardware feature that is invisibleto the program execution Therefore to apply software-basedprogram analysis techniques on detecting Spectre gadgets thefirst step is to simulate speculative execution at the softwarelevel which is also the principle for all related works To thisend existing approaches utilize different methods to simulatespeculative execution either statically or dynamically RHscanner [5] (also known as Spectre V1 scanner) is a staticanalysis tool It simulates speculative execution by scanning

2

both targets of a conditional branch By doing so it at leastcovers one path that is not taken during real execution which isconsidered as a speculative path During scanning it searchesa certain pattern in the binary to detect Spectre gadgetsHowever the syntax-based code pattern used by RH scannercould produce a large number of unexploitable candidatessince not all detected gadgets can be controlled by attackers

Oo7 [43] also conducts a static binary analysis The differ-ence lies in that it conducts a semantic-based gadget detectionThat is it leverages the data flow analysis to construct asemantic code pattern and identifies the code snippet that notonly satisfies predefined patterns but also can be controlled viauser inputs The intuition behind it is that if the gadgets canbe influenced by user inputs attackers can leverage carefully-constructed inputs to exploit these gadgets To this end oo7utilizes static taint analysis to trace information flow frominputs However it is well-accepted that static taint analysissuffers from severe over-tainting and under-tainting issues dueto imprecise memory alias analysis inaccurate control flowgraphs and call graphs etc

The aforementioned works examine speculative executionstatically and deploy different program analysis techniques forSpectre gadget detection However the detection capabilitiesof these approaches inherit the limitations of static analysistechniques Therefore these techniques by design suffer fromhigh false positives and false negatives

SPECTECTOR [23] mathematically defines a semanticnotion of security against speculative execution and developsan algorithm based on symbolic execution to prove the absenceof speculative leaks However this approach inherits the bottle-necks of symbolic execution and has to sacrifice the soundnessand completeness of analysis when analyzing large programs

SpecFuzz [39] takes a fuzzing approach to dynamicallydetect Spectre gadgets It exposes speculative execution tofuzzing by inserting the speculative execution logic into theprogram at compile time and relies on random mutation ofprogram inputs to detect speculative execution errors duringprogram execution To ensure high fuzzing throughput itsgadget detection logic is oversimplified More specifically ithas the following limitations

bull Simplistic Gadget Modeling To avoid high runtimeoverhead during fuzzing SpecFuzz simply leverages amemory safety checker (eg AddressSanitizer) to detectout-of-bounds memory accesses and considers all out-of-bounds memory accesses to be potential Spectre gadgetsThis modeling is oversimplified and error-prone An out-of-bounds memory access during speculative executionmay not be controlled via user inputs and does not nec-essarily constitute a Spectre V1 gadget Our experimentsin Section VI substantiate this claim

bull Probabilistic Detection SpecFuzz detects Spectre gad-gets by capturing out-of-bounds memory access errorsduring speculative execution However even if a trueSpectre gadget is indeed exercised during fuzzing it mayor may not trigger any out-of-bounds memory accesserrors SpecFuzz relies on random mutation of programinputs to trigger these errors As a result the Spectregadget detection of SpecFuzz is probabilistic

bull Flawed Exception Handling Exceptions are likely tooccur during simulated speculative execution because itexecutes a path which might not be expected To dealwith exceptions during speculative execution again forsimplicity SpecFuzz stops the simulation immediatelyand restores the execution to a previously saved state Thisexception handling is flawed because it does not correctlysimulate how the hardware actually behaves In realitywhen encountering an exception during speculative execu-tion the processor can continue the speculative executionuntil it is terminated [34] Consequently SpecFuzz mightmiss Spectre gadgets that are located after the exception-raising instruction thereby causing false negatives

In this work we would also like to take a dynamic analysisapproach to detect Spectre gadgets to ensure high detectionaccuracy We resort to independent test case generation sys-tems (such as fuzzing and symbolic execution) to producehigh-quality test cases to achieve high detection coverageIn order to achieve high detection accuracy in real-worldsoftware we need to strike a balance between scalability andgadget detection fidelity That is our analysis must be able tocope with complex real-world software and faithful enough toensure high detection accuracy

To this end we propose to perform dynamic taint analysison speculative execution paths and conduct taint-based pat-tern checking to characterize Spectre gadgets at the semanticlevel Essentially by performing dynamic taint analysis onspeculative execution paths we can detect memory accessesthat are dependent on the program input (which attackers cancontrol) and may cause information leakage through cacheside channels This taint-based gadget pattern checking mightnot be as precise as SPECTECTOR [23] but our scheme isdesigned to be scalable and practical for detecting Spectregadgets from real-world programs It is more faithful thanthe one used in SpecFuzz [39] as it is deterministic (oncethe program inputs are determined) rather than probabilistic(relying on random mutation of program inputs to triggererrors) Our evaluation in Section VI shows that this taint-basedapproach is able to achieve much better detection accuracy thanSpecFuzz by paying more runtime overhead on speculativetaint analysis In other words our trade-off between scalabilityand gadget detection fidelity is justified

C System Overview

Architecture As illustrated in Fig 1 we extend a systememulator to simulate speculative execution and enable dynamictaint analysis on top of it Given a target program we simulatespeculative execution of the CPU by dynamically forcing theCPU emulator to execute code paths (speculative instructions)which will not be executed at normal execution We alsoconduct tainting analysis on speculatively executed instructionsto detect Spectre gadgets

Workflow Fig 2 illustrates the workflow of SpecTaintSpecTaint will go through two stages the normal executionstage and speculative execution stage At the normal executionstage SpecTaint runs the target program with seeds generatedby external fuzzers to explore as many execution paths aspossible SpecTaint will start to simulate speculative execution

3

Fig 1 Architecture of SpecTaint

Fig 2 Workflow of SpecTaint

when encountering a conditional branch and backup the cur-rent execution state When the speculative execution window(SEW) is reached SpecTaint will roll back the executionstate to the previously saved checkpoint Like the rollback ofspeculative execution in the hardware our simulated rollbackalso squashes all the side effects produced during speculativeexecution At the speculative execution stage a Spectre gadgetdetector conducts pattern checking on each speculative execu-tion path to detect gadgets To mitigate Spectre attacks thereported gadgets are forwarded to automatic serialization tools(eg Speculative Load Hardening [18])

Threat Model and Scope We share the same threat modelwith other Spectre gadget detection tools [39] [43] That isour analyzed programs are benign but might be vulnerableWe do not deal with malicious code that might deliberatelythwart or escape our analysis Our approach can be used toanalyze the exploit code for Spectre gadgets in a malwaresample but it is not our focus in this paper In this workwe focus on detecting gadgets in victim programs that canbe exploited by Spectre V1 attacks and leak sensitive datathrough cache side channels Meltdown-type attacks do notrequire gadgets in victim programs and can be launched inmalicious programs Therefore they fall out of the scope ofthis work Our simulation focuses on Spectre V1 gadgetsdetection at the binary level We do not simulate micro-operations instruction execution including instruction fetchingdecoding etc and irrelevant hardware structures such as thereorder buffer (ROB) We only use the size of ROB to calculatethe speculative execution window (SEW)

III DYNAMIC SPECULATIVE EXECUTION SIMULATION

In this section we introduce our speculative execution sim-ulation platform for Spectre gadgets detection Our platformextends the binary analysis platform DECAF [19] which isbuilt on top of the system emulator QEMU [14] Specificallywe will showcase how we utilize the system emulator to sim-ulate speculative execution triggered by branch mispredictionsand how we explore speculative paths in a depth-first manner

A Misprediction Simulation

Spectre V1 exploits out-of-bounds memory accesses inspeculative execution triggered by one or more mispredic-tions of conditional branches SpecTaint extends a systememulator to simulate this behavior In the system emulatorpc stores the next instruction to be executed in the caseof a conditional jump it stores the jump target address Tosimulate the misprediction behavior of processors SpecTaintinverts the direction of the conditional jump by replacingthe pc with the untaken target address As a result it willexecute the untaken path first at a conditional branch andenter the simulated speculative execution Before it goes tothe untaken path a state checkpoint is saved which containsthe current state of the emulator eg CPU registers Duringthe simulated speculative execution any modification to thememory will be logged Specifically the original value will besaved before a speculatively executed instruction modifies itWhen the speculative execution window is reached the simu-lated speculative execution will be terminated and the controlflow transfers back to the last saved checkpoint To maintainthe correctness of execution the memory modifications willbe restored by writing the original values back to memoryand the CPU state will be restored with the saved state Asfor other conditions to terminate speculative execution weconsider serialization instructions (eg SYSCALL LFENCEetc) listed by Mambretti et al [35] and terminate the simulatedspeculative execution when encountering these instructions

Exception Handling Exceptions can be very common duringthe simulation of speculative executions This is because aCPU will execute a path which is not expected to executeA straightforward way for simulation is to terminate thesimulation and roll back whenever an exception is triggered asdone in SpecFuzz [39] However it could miss Spectre gadgetsafter the exception point along the speculative path sincespeculative execution is not supposed to be terminated whenhardware exceptions are raised in many CPU models [34] Thespeculative execution is expected to continue even if an excep-tion occurs To guarantee a precise simulation our approachdoes not roll back when an exception occurs and instead forcesit to silently simulate this exception instruction without raisingany exception and continue speculative execution on the nextinstruction

To simulate this behavior we extend the exception handlerof the system emulator More specifically we use a customizedexception handler to capture errors in speculative executionWhen an exception due to the permission violation (eg ac-cessing kernel space from user-land) occurs during a simulatedspeculative execution we will silently simulate this exceptioninstruction without raising any exception and continue theuser-land execution after the exception instruction By doing

4

so SpecTaint is able to continue the simulation even whenexceptions occur For other exceptions which SpecTaint isnot able to handle (eg caused by invalid jump target orincomplete address translation) because the emulator does notknow where the next instruction to be executed is SpecTaintwill make a state rollback and restore to the previously savedcheckpoint

B Exploration of Speculative Execution Paths

The path exploration involves two path types the normalexecution (NE) path and speculative execution (SE) path Toexplore the NE paths we feed seeds generated by fuzzingtools into the target program When encountering a conditionalbranching instruction SpecTaint redirects the execution tothe untaken branch first to explore SE paths Crucially CPUscan perform nested speculative execution in a SE path Sincewe center on Spectre V1 which exploits the misprediction ofconditional branches we take into account nested speculativeexecutions triggered by conditional branches To simulate thisbehavior SpecTaint explores SE paths for each conditionalbranch in NE paths and SE paths within SEW In other wordsSpecTaint explores the speculative paths in a depth-firstmanner During the exploration SpecTaint monitors whetherany termination condition is met and restore to the last savedstate if there is any

Execution State Rollback It is crucial to restore the executionstate after the speculative execution is terminated so as tomaintain the correctness of program execution An executionstate includes the state of all the CPU registers and used mem-ory regions The execution state rollback has two operationsstate backup and resume For a conditional branch SpecTaintfirst backups the current execution state before simulatesspeculative execution on that branch When the SE simulationis terminated SpecTaint resumes the backup state by resettingthe current execution environment with the backup state Tobackup an execution state SpecTaint saves the emulator stateeg all registers For memory it is too heavyweight to savethe entire memory so we adopt a lightweight ldquocopy-on-writerdquoapproach More specifically SpecTaint keeps track of memoryregions that are modified during speculative execution Beforethe memory regions are modified SpecTaint saves the originalvalues When making a state rollback SpecTaint writes theoriginal values back to the memory in a reversed order asthey are modified Moreover SpecTaint uses dynamic tainttracking to detect Spectre gadgets Thus it also restorestaint information that is created during simulated speculativeexecution More details are discussed in Section IV-A

Path Exploration The path exploration considers two typesof path coverage the NE and SE path coverage The pathcoverage on normal execution is used to explore more pathsduring the normal execution We utilize fuzzing techniquessuch as AFL [2] to improve path coverage on normal execu-tion The SE path coverage is to measure how many speculativepaths are covered The goal is to explore speculative executioncomprehensively so as to avoid missing Spectre gadgets onuncovered speculative paths

The switch point is the instruction that transfers the exe-cution mode from NE to SE SpecTaint treats a conditionalbranch as a switch point and conducts the SE simulation Once

Algorithm 1 Speculative Path Exploration AlgorithmInput Entry point pc SEW size w Backup state

set saved stateOutput gadgets gadget setgadget setlarr empty inst countlarr 0Function explorer(pc w)

while inst count lt w doif is terminator(pc) then

state = saved statepop()restore(state)pc = statepcexplorer(pc stateinsn count)

endif is branch(pc) then

saved statepush(checkpoint(pc))foreach t isin get targets(pc) do

explorer(t w minus insn count)end

endexecute(pc)if gadget checker(pc) then

gadget setlarr pcendpclarr next pcinst count ++

endreturn

entering the speculative execution mode SpecTaint exploreseach conditional branch and its targets for the speculativeexploration The speculative path exploration algorithm isshown in Algorithm 1 We use pc to represent the currentinstruction and gadget set to store the locations of detectedSpectre gadgets The speculative execution path explorationis a depth-first traversal on the control flow graph of theprogram at the speculative execution mode When exploringspeculative paths SpecTaint will first check whether thecurrent execution path has reached the SEW limit w or anyother termination conditions and if so restore to the last savedstate If the current instruction at pc is a conditional branchSpecTaint will simulate a misprediction by walking throughboth targets of this branch instruction Before exploring eachtarget SpecTaint will first backup the state and push it into astack saved state After finishing the path exploration on atarget (eg reaching the SEW limit) SpecTaint will resumethe last saved state and continue for the next target Alongwith the exploration SpecTaint conducts the gadget patternchecking on the current instruction If it matches SpecTaintwill save this gadget into gadget set and continue explorationuntil termination conditions are met

Mitigating Path Explosion As presented in our evaluationthe length of each speculative execution path is bounded by theSEW limit (see TableV) However there can be a large numberof paths within this SEW limit in the worst case Thereforewe may still encounter a path explosion problem Accordingto the analysis of our evaluation dataset we identify two kindsof cases where the exponential growth of paths may happenloops and recursive functions We propose our approach to

5

address the path explosion issue That is SpecTaint has athreshold on how many times simulation can happen on thesame branch In real hardware it is unlikely that speculativeexecution can be triggered by the same branch repeatedlyAfter iterations the same branch will not be mispredictedsince its memory value are very likely to be stored in registersor be cached after iterations [35] As demonstrated by Kocheret al [29] executing the same branch five times is enough totrain the branch predictor Therefore we also set the thresholdto five and SpecTaint will not simulate speculative executionrepeatedly on the same branch if it reaches the thresholdWe admit that this approach might miss gadgets in paths wehave not explored yet More details about false negatives arediscussed in Section VII

IV SPECTRE GADGET DETECTION

This section addresses how we detect Spectre gadgetsduring dynamic speculative execution simulation More specif-ically we first formalize the Spectre gadget definitions and thendescribe how we check the patterns to detect potential Spectregadgets on speculative paths

A Dynamic Taint Tracking

If a variable is data-dependent on user inputs we con-sider this variable is under an attackerrsquos control To findexploitable Spectre gadgets it is essential to find gadgetsthat can be controlled through external inputs Therefore inorder to capture this control relation we utilize dynamic taintanalysis to trace variables that are data-dependent on externalinputs during execution To this end we label user inputsas taint sources and observe how the data flows from userinputs Specifically we utilize the whole-system dynamic taintanalysis shipped with the platform we extend DECAF [19]and perform taint propagation along with the execution of theprogram DECAFrsquos tainting rules have been formally verifiedto be sound (guarantee of no under-tainting at instructionlevel) and most of them have also been verified to be precise(guarantee of no over-tainting) The details are documented inthis paper [24] We conduct dynamic taint analysis on bothNE and SE paths Performing taint analysis on NE paths isto ensure the propagation of taint labels in normal programexecution performing taint analysis on SE paths is to facilitatethe gadget checker to detect exploitable Spectre gadgets Whenthe SE path is terminated (eg when reaching SEW size) theCPU state and memory modifications will be restored so as thetaint information When restoring to the previously saved statewe also clean the variables which are marked as tainted duringthe simulated speculative execution in order to maintain thecorrectness of taint propagation

B Spectre Gadget Modeling

In this section we formulate the gadget patterns for twotypes of Spectre V1 gadgets Bounds Check Bypass (BCB)and Bounds Check Bypass Store (BCBS) To facilitate thediscussion we define the following notions before giving theSpectre gadget definitions

bull c is a conditional branch instructionbull T (c) denotes a set of instructions in a speculative execu-

tion trace from c

bull m(i) denotes i is a memory read instructionbull str(i) denotes i is a memory write instructionbull [i ] denotes the memory value accessed by instruction i bull dep(i j ) denotes instruction i is data dependent on j bull t(i) denotes the operands of instruction i is taintedbull δ denotes the size of speculative execution window

BCB Gadget In BCB attacks the speculative load instructionis under an attackerrsquos control thus the attacker could readarbitrary values from memory Then another load instructionis required to load the secret-indexed memory location withthe intention of leaking the secret To leak the secret throughthe cache side channel the leak instruction has to use theloaded secret as the index to read from memory thus thesecret can be retrieved by monitoring the cache line statechanges In general the BCB gadget involves a set of arrayoperations and the index in the latter array operation is data-dependent on the value from the former array [34] Essentiallythe first array access is responsible for loading secrets and thesecond one is responsible for leaking secrets However notall the code sequences matching such patterns are consideredas a BCB gadget The BCB gadget demands the index of theformer array access should be under the attackerrsquos control suchthat the attacker could read arbitrary values from memory inthe target program space by carefully manipulating the inputvalues To capture this control relation we utilize dynamictaint analysis to track data flow from external inputs whichis discussed in IV-A Suppose the speculative execution startsfrom a conditional branch c and we formalize our BCB gadgetpattern as Φbcb(c) and its definition is as follows

Φbcb(c) = existi j isin T (c)m(i) andm(j ) and dep(j [i ])

andt(i) and |c j | lt δ(1)

BCBS Gadget Unlike BCB gadget BCBS uses a speculativewrite (SW) to modify arbitrary memory locations In BCBSattacks attackers control the index of an array that wouldaccess out of boundary memory during speculative executionAs a result attackers can modify an arbitrary memory location(eg return address) by manipulating the index of the arrayWe formalize our BCBS gadget pattern as Φbcbs(c) and itsdefinition is as follows

Φbcbs(c)= existi isin T (c) str(i) and t(i) and |c i | lt δ (2)

In both patterns we leverage dynamic taint analysis todetermine whether the index of an array is under attackersrsquocontrol or not If an instruction i is tainted we consider i isunder attackersrsquo control and t(i) is set to be true

Gadget Classification Our gadget patterns are similar towhat is proposed in oo7 [43] The primary difference liesin whether the branch instruction is tainted or not Oo7 [43]considers that the branch should be tainted and controlled bythe inputs Thus attackers can poison the branch predictor tocause intentional misprediction of that branch by executing thevictim program with carefully-constructed inputs However asdiscovered by Canella et al [17] the branch prediction buffers

6

are shared and commonly indexed by the virtual address of thebranch instruction thus can be poisoned from another attack-controlled process by executing a congruent branch with thesame virtual address Based on this finding Spectre gadgetscan be exploited in the same address space (ie intra-process)or across address spaces (ie cross-process) [17] Thereforeonly considering tainted branches will exclude cross-processgadgets Our gadget patterns cover both situations if thespeculative execution is triggered by a taint conditional branchwe will mark this gadget as an intra-process Spectre gadgetotherwise this gadget will be marked as a cross-processSpectre gadget

C Gadget Detection

The gadget detection is to check whether the speculativeinstruction trace conforms to the gadget patterns describedabove We deploy the gadget checker in simulated speculativeexecution Although both our gadget patterns involve memoryaccess operations it is inefficient and unnecessary to checkall the memory access instructions on SE paths Instead weonly check memory access instructions which are tainted Tothis end we instrument memory read and write operationsTo detect BCB gadgets for each memory read we checkwhether its source is tainted To make sure the dependencydep(j [i ]) rule is satisfied between two instructions we alsotrack whether the tainted operand of one instruction j ispropagated from instruction i For BCBS gadgets the patterncaptures any tainted memory write whose destination addressis marked as tainted within the SEW

V IMPLEMENTATION

We have implemented the prototype SpecTaint in C Morespecifically we wrote a C plugin of 1 KLOC in C codeon top of DECAF [25] (a dynamic binary analysis platformbuilt on top of QEMU 10) This plugin implements thestate checkpoint management and Spectre gadget detectioncomponents We reused the dynamic taint analysis plugin ofDECAF for our taint analysis Overall the changes to developour prototype do not exceed 2 KLOC Besides to increase thecode coverage we use AFL 252b [2] and honggfuzz [8] togenerate seed inputs

VI EXPERIMENTAL EVALUATION

In this section we evaluate SpecTaint to answer thefollowing research questions

1) How effective is SpecTaint to find Spectre gadgets com-pared with other existing tools

2) How efficient is SpecTaint to find Spectre gadgets inreal-world applications

This section is composed as follows First we brieflydescribe the experiment setup the datasets and the evaluationmetrics used in our experiments (Sections VI-A) Second weevaluate the efficacy of SpecTaint (Sections VI-B) Then weevaluate the efficiency of SpecTaint on real-world applications(Sections VI-E) Finally we conduct case studies of someSpectre gadgets detected by SpecTaint in real-world applica-tions and the deep learning framework Caffe (Sections VI-F)

A Experiment Setup

Baseline Methods We compare SpecTaint with three base-line approaches Spectre 1 Scanner from Red Hat (RH Scan-ner) [5] oo7 [43] and SpecFuzz [39] RH Scanner is a staticanalysis tool that can be used to scan for Spectre gadgetsOo7 [43] is another static analysis tool that utilizes static taintanalysis to find Spectre gadgets SpecFuzz [39] extends thefuzzing technique to detect errors in speculative execution andreport Spectre gadgets Another related work SPECTECTORleverages symbolic execution to detect information-flow differ-ences introduced by speculative execution Since the manualsettings to make it work on large programs are not open-source it is hard to evaluate it on our real-world benchmarksAs discussed in SpecFuzz [39] the authors failed to runSPECTECTOR on the real-world benchmarks due to a largenumber of unsupported instructions Therefore we did notcompare with SPECTECTOR on real-world benchmarks (samewith SpecFuzzrsquos)

Evaluation Dataset The evaluation is conducted on twodatasets the Spectre Samples Dataset and Real-world DatasetAligned with other baseline works we utilize the same SpectreSamples Dataset to demonstrate the detection capability ofSpecTaint For baseline comparison with related works wecollected six real-world applications and created two real-world datasets

bull Spectre Samples Dataset This dataset is designed todemonstrate the efficacy of SpecTaint We collected15 Spectre V1 samples created by Paul Kocher [3] andcompiled 15 samples with the same configuration (gcc-484 with O0) [39]

bull Real-world V1 Dataset For fair baseline comparisonwe use the same dataset as SpecFuzzrsquos [39] This datasetcontains six widely used applications one cryptographicprogram from OpenSSL [12] a compression program(Brotli [7]) and four parsing programs (JSON [10]LibHTP [11] HTTP [9] and YAML [6])

bull Real-world V2 Dataset Systematic baseline evaluationis difficult due to the shortage of ground truth in real-world programs To solve this problem we injectedknown Spectre gadgets into programs from Real-worldV1 Dataset and created the Real-world V2 Dataset Weadopt the same injection approach proposed in LAVA [20]to build this dataset More specifically we utilize dynamictaint analysis to find attack points which can be controlledby input bytes that do not determine control flow andhave not been modified much (see [20] for more details)Then we inject the Spectre gadgets from Spectre SampleDataset into target programs and add code to makeinjected gadgets controllable via input bytes As a resultwe injected 15 Spectre V1 gadgets from Spectre SampleDataset into 52 different locations in six programs Tomake a fair comparison the input seeds used in the evalu-ation are all generated by a fuzzing tool [8] Therefore wehave no clue whether the injected locations are coveredby input seeds in the evaluation It is worth mentioningthat this dataset is used to evaluate the detection coverageof SpecTaint and related works instead of pathcodecoverage and we choose the same seed corpus withSpecFuzzrsquos to guarantee a fair comparison

7

Evaluation Metrics For the Real-world V1 dataset whichdoes not have ground truth we manually verify the detectionresults and calculate the precision rate to quantify the perfor-mance of SpecTaint and baseline approaches The precisionis calculated as precision(P ) = TP

TP+FP where TP is thenumber of detected gadgets that are manually verified to beexploitable and FP is the number of detected gadgets thatare not exploitable based on manual analysis For Real-worldV2 Dataset we only consider injected Spectre gadgets to betrue positives and other reported results to be false positivesThen we calculate the precision and recall to quantify theeffectiveness of the proposed approach and baseline methodsThe recall is calculated as recall(R) = TP

TP+FN where TP isthe number of inserted gadgets which are correctly detectedand FN is the number of inserted gadgets that are missed Theprecision is calculated as precision(P ) = TP

TP+FP where FPis the number of detected gadgets other than injected ones Tomeasure the efficiency we reported runtime per speculativeexecution and number of paths explored per speculative exe-cution along with other statistics (see Section VI-E for moredetails)

Configuration The experiments were conducted on a desktopwith 16 GB memory Intel Core i7 12 cores at 370 GHzCPU and running Linux 415 The Guest OS in QEMU isUbuntu 1404 with 1 GB memory The speculative window isdependent on the space limit ie ROB size and timing limitsWe follow the configuration used by SpecFuzz [39] and alsoset the speculative window size to 250

B Baseline Evaluation on Spectre Samples Dataset

In this experiment we compared SpecTaint with threebaseline tools on the Spectre Sample Dataset As presented inTable II SpecTaint successfully detected all Spectre gadgetsin the Spectre Samples Dataset while RH Scanner relies onsyntax-based pattern matching and missed three cases

C Baseline Comparison on Real-world V2 Dataset

We conducted the baseline comparison with three relatedworks RH Scanner oo7 and SpecFuzz on Real-world V2Dataset In this experiment we focused on detecting theinserted gadgets therefore we only consider inserted gadgetsto be true positives and all other detection results to befalse positives For tools that utilize taint tracking to detectSpectre gadget (oo7 and SpecTaint) we mark input bytes astaint sources Since the analysis of oo7 is very slow whenperforming whole input bytes tainting we only mark inputbytes that influence injected gadgets as taint sources We adoptthe same configuration for SpecTaint To compare with RHScanner and SpecFuzz we have another configuration forSpecTaint where we mark all input bytes as tainted Theresults for the former configuration are labeled with ldquordquo inTable I For dynamic analysis tools (SpecTaint and SpecFuzz)we used an external fuzzing tool [8] to fuzz the six programsfor 10 hours and fed the generated seeds as inputs to run theprograms

1 2 i f ( p a r s e rminusgtt o k n e x t gt= num tokens ) 3 I n s e r t e d S p e c t r e Gadget4 i f d e f SPECTRE VARIANT5 i f ( g l o b a l i d x lt a r r a y 1 s i z e )6 tmp amp= a r r a y 2 [ a r r a y 1 [ g l o b a l i d x ] lowast 5 1 2 ] 7 8 e n d i f9

Listing 2 Missed Spectre gadget 1 by SpecFuzz

1 2 i f ( lminusgtf i r s t + i d x lt lminusgtmax size ) 3 r e t u r n ( vo id lowast) lminusgte l e m e n t s [ lminusgtf i r s t + i d x ] 4 e l s e 5 I n s e r t e d S p e c t r e Gadget6 i f d e f SPECTRE VARIANT7 i n t temp = 0 8 i n t lowastadd r = ampg l o b a l i d x 9 i f (lowast add r lt a r r a y 1 s i z e )

10 temp amp= a r r a y 2 [ a r r a y 1 [lowast add r ] lowast 5 1 2 ] 11 12 e n d i f13

Listing 3 Missed Spectre gadget 2 by SpecFuzz

Table I shows that when tainting gadget-related inputbytes SpecTaint has no false positives and achieves a pre-cision rate of 100 Under the same configuration howeveroo7 still produced false positives For instance it reported13 gadget candidates in LibHTP but 12 of them are falsepositives The results show that static taint analysis suffersfrom the over-tainting issue thereby is hard to achieve highprecision in detecting Spectre gadgets As presented in Table Ioo7 has many false negatives For example it missed allinserted gadgets in YAML and Brotli We examined these falsenegatives and found the reasons are as follows Firstly thedetection results of oo7 [43] depend on the completeness of thecontrol-flow graph (CFG) extraction Some inserted gadgetsare missed since it failed to extract a complete CFG dueto the limitation of the static approach Also oo7 is limitedby static inter-procedure taint tracking and it missed manyinserted gadgets because it failed to propagate the taint sourceto the injection points in the target programs The evaluationresults substantiate our claim that dynamic taint analysis ismuch more accurate and effective than static taint analysis indetecting Spectre gadgets

As presented in Table I SpecTaint outperforms RHScanner and SpecFuzz under the whole input bytes taintingconfiguration Since we only consider injected gadgets to betrue positives in this dataset other detection results are labeledas false positives The analysis results of other gadgets reportedby SpecTaint are presented in Table IV Note that Spec-Taint missed some injected gadgets in this experiment Wefurther investigated the results and found that missed gadgetsare not covered by input seeds However SpecFuzz missedmany injected gadgets that are covered by input seeds anddetected by SpecTaint According to our analysis the reasonsare first SpecFuzz adopts a prioritized simulation of branchmispredictions it selectively chooses whether to simulate themisprediction or not on a conditional branch Therefore itmissed some injected gadgets For example as presented inListing 2 the seeds are able to reach the branch at line 2 butSpecFuzz did not simulate branch misprediction over these

8

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 2: SpecTaint: Speculative Taint Analysis for Discovering

dynamic taint analysis on speculative execution paths and dis-cover Spectre gadgets based on data flow patterns Comparedwith syntax-based [5] and sanitizer-based approaches [39] ourgadget pattern can capture the semantics of Spectre gadgetsAs substantiated in the evaluation (Section VI) our semantic-based gadget pattern produces fewer false positives comparedwith the existing works Furthermore static taint analysis suf-fers from the severe over-tainting and under-tainting issues dueto imprecise memory alias analysis incomplete control flowgraph extraction etc As shown in our evaluation (Section VI)oo7 has poor precision and recall rates when analyzing real-world programs

However existing dynamic binary taint analysis platformscannot detect Spectre gadgets because speculative execution isinvisible to normal program execution Therefore we extenda dynamic taint analysis platform and instrument speculativeexecution logic on-the-fly As a result we can simulate spec-ulative execution and capture data flow patterns on specula-tive paths for Spectre gadget discovery To the best of ourknowledge we are the first to enable dynamic taint analysison speculative paths for Spectre gadget detection

We have implemented a prototype SpecTaint to demon-strate the efficacy of speculative taint analysis in detectingSpectre gadgets We evaluated the performance of SpecTainton our Spectre Samples Dataset and six real-world programsThe experimental results demonstrate that SpecTaint out-performs the baseline approaches in terms of the precisionand recall by large margins It also has reasonable runtimeefficiency and reduces the performance overhead after patchingdetected gadgets by 73 compared with the conservativehardening strategy Besides SpecTaint can detect new Spectregadgets in real-world applications like Brotli [7] and Caffeframework [1]

We summarize our contributions as follows

bull We propose a dynamic speculative execution simulationplatform that enables dynamic taint analysis on specula-tive execution paths With the support of dynamic taintanalysis we deploy a semantic-based gadget detector thatdetects exploitable Spectre gadgets during the programexecution

bull We build a synthetic dataset by inserting known Spectregadgets into selected real-world programs The datasetcan be considered as a benchmark to help researchersevaluate their Spectre gadget detection tools It will ben-efit the security community

bull We implement a prototype SpecTaint to demonstratethe efficacy of our approach and compare SpecTaintwith state-of-the-art tools on the Spectre Samples Datasetand real-world programs Our evaluation indicates thatSpecTaint outperforms existing methods with respect toprecision and recall with reasonable runtime efficiencyMoreover SpecTaint discovered new Spectre gadgetsthat were not detected by the other tools Besides it sig-nificantly reduces the execution overhead after patchingdetected gadgets compared with the other approaches

II OVERVIEW

In this section we first walk through a motivating exampleto explain how a Spectre V1 gadget is exploited by attackers

Then we introduce the background and limitations of state-of-the-art tools that discover Spectre gadgets from binariesFurthermore we propose our approach and show the capabil-ities of our approach to address these limitations Then wegive a brief introduction about the overview and mechanismof SpecTaint At last we discuss the scope and assumptionsof this work

A Motivating Example

Listing 1 shows a code snippet that can be exploited tolaunch a Spectre V1 attack [34]

1 vo id v i c t i m f u n c t i o n ( s i z e t u s e r i n p u t )2 3 4 i f ( u s e r i n p u t lt g e t l e n ( a r r a y 1 ) )5 s e c r e t = a r r a y 1 [ u s e r i n p u t ] RS Read S e c r e t6 temp amp= a r r a y 2 [ s e c r e t lowast 2 5 6 ] LS Leak S e c r e t7 8 9 i n t main ( )

10 11 v i c t i m f u n c t i o n ( u s e r i n p u t ) 12 r e t u r n 0 13

Listing 1 Code snippet containing Spectre gadgets

In this example the if statement is a sanity check thatensures the following array access is within a valid rangeWhen evaluating the sanity check the outcome of the branchat line 4 may take many CPU cycles to be determined (egdue to the delay caused by a load from the main memory)To avoid the performance penalty caused by this delay thebranch prediction unit (BPU) will predict the branch outcomefrom where the instructions will be executed speculatively Anattacker can poison the branch predictor by feeding craftedinputs to the program to intentionally trick the BPU intomaking an expected prediction on that branch Then theattacker can launch the Spectre attack by running the code withan out-of-bounds value as input In this case the BPU predictsthe branch outcome to be true and the processor speculativelyexecutes instructions at line 5 and line 6 Consequently anarbitrary value can be read using an out-of-bounds index toaccess array1 at line 5 Then another index related to theloaded secret is used to access array2 and results in cacheline state changes for array2 After the processor finds outthe prediction to be wrong it discards all architectural effectsmade by speculative instructions However side effects (egcache line state changes) still remain at the micro-architecturallevel and the attacker can launch a cache side-channel attack(eg Flush+reload [48]) to retrieve the secret value

B Background and Rationale

Speculative execution is a hardware feature that is invisibleto the program execution Therefore to apply software-basedprogram analysis techniques on detecting Spectre gadgets thefirst step is to simulate speculative execution at the softwarelevel which is also the principle for all related works To thisend existing approaches utilize different methods to simulatespeculative execution either statically or dynamically RHscanner [5] (also known as Spectre V1 scanner) is a staticanalysis tool It simulates speculative execution by scanning

2

both targets of a conditional branch By doing so it at leastcovers one path that is not taken during real execution which isconsidered as a speculative path During scanning it searchesa certain pattern in the binary to detect Spectre gadgetsHowever the syntax-based code pattern used by RH scannercould produce a large number of unexploitable candidatessince not all detected gadgets can be controlled by attackers

Oo7 [43] also conducts a static binary analysis The differ-ence lies in that it conducts a semantic-based gadget detectionThat is it leverages the data flow analysis to construct asemantic code pattern and identifies the code snippet that notonly satisfies predefined patterns but also can be controlled viauser inputs The intuition behind it is that if the gadgets canbe influenced by user inputs attackers can leverage carefully-constructed inputs to exploit these gadgets To this end oo7utilizes static taint analysis to trace information flow frominputs However it is well-accepted that static taint analysissuffers from severe over-tainting and under-tainting issues dueto imprecise memory alias analysis inaccurate control flowgraphs and call graphs etc

The aforementioned works examine speculative executionstatically and deploy different program analysis techniques forSpectre gadget detection However the detection capabilitiesof these approaches inherit the limitations of static analysistechniques Therefore these techniques by design suffer fromhigh false positives and false negatives

SPECTECTOR [23] mathematically defines a semanticnotion of security against speculative execution and developsan algorithm based on symbolic execution to prove the absenceof speculative leaks However this approach inherits the bottle-necks of symbolic execution and has to sacrifice the soundnessand completeness of analysis when analyzing large programs

SpecFuzz [39] takes a fuzzing approach to dynamicallydetect Spectre gadgets It exposes speculative execution tofuzzing by inserting the speculative execution logic into theprogram at compile time and relies on random mutation ofprogram inputs to detect speculative execution errors duringprogram execution To ensure high fuzzing throughput itsgadget detection logic is oversimplified More specifically ithas the following limitations

bull Simplistic Gadget Modeling To avoid high runtimeoverhead during fuzzing SpecFuzz simply leverages amemory safety checker (eg AddressSanitizer) to detectout-of-bounds memory accesses and considers all out-of-bounds memory accesses to be potential Spectre gadgetsThis modeling is oversimplified and error-prone An out-of-bounds memory access during speculative executionmay not be controlled via user inputs and does not nec-essarily constitute a Spectre V1 gadget Our experimentsin Section VI substantiate this claim

bull Probabilistic Detection SpecFuzz detects Spectre gad-gets by capturing out-of-bounds memory access errorsduring speculative execution However even if a trueSpectre gadget is indeed exercised during fuzzing it mayor may not trigger any out-of-bounds memory accesserrors SpecFuzz relies on random mutation of programinputs to trigger these errors As a result the Spectregadget detection of SpecFuzz is probabilistic

bull Flawed Exception Handling Exceptions are likely tooccur during simulated speculative execution because itexecutes a path which might not be expected To dealwith exceptions during speculative execution again forsimplicity SpecFuzz stops the simulation immediatelyand restores the execution to a previously saved state Thisexception handling is flawed because it does not correctlysimulate how the hardware actually behaves In realitywhen encountering an exception during speculative execu-tion the processor can continue the speculative executionuntil it is terminated [34] Consequently SpecFuzz mightmiss Spectre gadgets that are located after the exception-raising instruction thereby causing false negatives

In this work we would also like to take a dynamic analysisapproach to detect Spectre gadgets to ensure high detectionaccuracy We resort to independent test case generation sys-tems (such as fuzzing and symbolic execution) to producehigh-quality test cases to achieve high detection coverageIn order to achieve high detection accuracy in real-worldsoftware we need to strike a balance between scalability andgadget detection fidelity That is our analysis must be able tocope with complex real-world software and faithful enough toensure high detection accuracy

To this end we propose to perform dynamic taint analysison speculative execution paths and conduct taint-based pat-tern checking to characterize Spectre gadgets at the semanticlevel Essentially by performing dynamic taint analysis onspeculative execution paths we can detect memory accessesthat are dependent on the program input (which attackers cancontrol) and may cause information leakage through cacheside channels This taint-based gadget pattern checking mightnot be as precise as SPECTECTOR [23] but our scheme isdesigned to be scalable and practical for detecting Spectregadgets from real-world programs It is more faithful thanthe one used in SpecFuzz [39] as it is deterministic (oncethe program inputs are determined) rather than probabilistic(relying on random mutation of program inputs to triggererrors) Our evaluation in Section VI shows that this taint-basedapproach is able to achieve much better detection accuracy thanSpecFuzz by paying more runtime overhead on speculativetaint analysis In other words our trade-off between scalabilityand gadget detection fidelity is justified

C System Overview

Architecture As illustrated in Fig 1 we extend a systememulator to simulate speculative execution and enable dynamictaint analysis on top of it Given a target program we simulatespeculative execution of the CPU by dynamically forcing theCPU emulator to execute code paths (speculative instructions)which will not be executed at normal execution We alsoconduct tainting analysis on speculatively executed instructionsto detect Spectre gadgets

Workflow Fig 2 illustrates the workflow of SpecTaintSpecTaint will go through two stages the normal executionstage and speculative execution stage At the normal executionstage SpecTaint runs the target program with seeds generatedby external fuzzers to explore as many execution paths aspossible SpecTaint will start to simulate speculative execution

3

Fig 1 Architecture of SpecTaint

Fig 2 Workflow of SpecTaint

when encountering a conditional branch and backup the cur-rent execution state When the speculative execution window(SEW) is reached SpecTaint will roll back the executionstate to the previously saved checkpoint Like the rollback ofspeculative execution in the hardware our simulated rollbackalso squashes all the side effects produced during speculativeexecution At the speculative execution stage a Spectre gadgetdetector conducts pattern checking on each speculative execu-tion path to detect gadgets To mitigate Spectre attacks thereported gadgets are forwarded to automatic serialization tools(eg Speculative Load Hardening [18])

Threat Model and Scope We share the same threat modelwith other Spectre gadget detection tools [39] [43] That isour analyzed programs are benign but might be vulnerableWe do not deal with malicious code that might deliberatelythwart or escape our analysis Our approach can be used toanalyze the exploit code for Spectre gadgets in a malwaresample but it is not our focus in this paper In this workwe focus on detecting gadgets in victim programs that canbe exploited by Spectre V1 attacks and leak sensitive datathrough cache side channels Meltdown-type attacks do notrequire gadgets in victim programs and can be launched inmalicious programs Therefore they fall out of the scope ofthis work Our simulation focuses on Spectre V1 gadgetsdetection at the binary level We do not simulate micro-operations instruction execution including instruction fetchingdecoding etc and irrelevant hardware structures such as thereorder buffer (ROB) We only use the size of ROB to calculatethe speculative execution window (SEW)

III DYNAMIC SPECULATIVE EXECUTION SIMULATION

In this section we introduce our speculative execution sim-ulation platform for Spectre gadgets detection Our platformextends the binary analysis platform DECAF [19] which isbuilt on top of the system emulator QEMU [14] Specificallywe will showcase how we utilize the system emulator to sim-ulate speculative execution triggered by branch mispredictionsand how we explore speculative paths in a depth-first manner

A Misprediction Simulation

Spectre V1 exploits out-of-bounds memory accesses inspeculative execution triggered by one or more mispredic-tions of conditional branches SpecTaint extends a systememulator to simulate this behavior In the system emulatorpc stores the next instruction to be executed in the caseof a conditional jump it stores the jump target address Tosimulate the misprediction behavior of processors SpecTaintinverts the direction of the conditional jump by replacingthe pc with the untaken target address As a result it willexecute the untaken path first at a conditional branch andenter the simulated speculative execution Before it goes tothe untaken path a state checkpoint is saved which containsthe current state of the emulator eg CPU registers Duringthe simulated speculative execution any modification to thememory will be logged Specifically the original value will besaved before a speculatively executed instruction modifies itWhen the speculative execution window is reached the simu-lated speculative execution will be terminated and the controlflow transfers back to the last saved checkpoint To maintainthe correctness of execution the memory modifications willbe restored by writing the original values back to memoryand the CPU state will be restored with the saved state Asfor other conditions to terminate speculative execution weconsider serialization instructions (eg SYSCALL LFENCEetc) listed by Mambretti et al [35] and terminate the simulatedspeculative execution when encountering these instructions

Exception Handling Exceptions can be very common duringthe simulation of speculative executions This is because aCPU will execute a path which is not expected to executeA straightforward way for simulation is to terminate thesimulation and roll back whenever an exception is triggered asdone in SpecFuzz [39] However it could miss Spectre gadgetsafter the exception point along the speculative path sincespeculative execution is not supposed to be terminated whenhardware exceptions are raised in many CPU models [34] Thespeculative execution is expected to continue even if an excep-tion occurs To guarantee a precise simulation our approachdoes not roll back when an exception occurs and instead forcesit to silently simulate this exception instruction without raisingany exception and continue speculative execution on the nextinstruction

To simulate this behavior we extend the exception handlerof the system emulator More specifically we use a customizedexception handler to capture errors in speculative executionWhen an exception due to the permission violation (eg ac-cessing kernel space from user-land) occurs during a simulatedspeculative execution we will silently simulate this exceptioninstruction without raising any exception and continue theuser-land execution after the exception instruction By doing

4

so SpecTaint is able to continue the simulation even whenexceptions occur For other exceptions which SpecTaint isnot able to handle (eg caused by invalid jump target orincomplete address translation) because the emulator does notknow where the next instruction to be executed is SpecTaintwill make a state rollback and restore to the previously savedcheckpoint

B Exploration of Speculative Execution Paths

The path exploration involves two path types the normalexecution (NE) path and speculative execution (SE) path Toexplore the NE paths we feed seeds generated by fuzzingtools into the target program When encountering a conditionalbranching instruction SpecTaint redirects the execution tothe untaken branch first to explore SE paths Crucially CPUscan perform nested speculative execution in a SE path Sincewe center on Spectre V1 which exploits the misprediction ofconditional branches we take into account nested speculativeexecutions triggered by conditional branches To simulate thisbehavior SpecTaint explores SE paths for each conditionalbranch in NE paths and SE paths within SEW In other wordsSpecTaint explores the speculative paths in a depth-firstmanner During the exploration SpecTaint monitors whetherany termination condition is met and restore to the last savedstate if there is any

Execution State Rollback It is crucial to restore the executionstate after the speculative execution is terminated so as tomaintain the correctness of program execution An executionstate includes the state of all the CPU registers and used mem-ory regions The execution state rollback has two operationsstate backup and resume For a conditional branch SpecTaintfirst backups the current execution state before simulatesspeculative execution on that branch When the SE simulationis terminated SpecTaint resumes the backup state by resettingthe current execution environment with the backup state Tobackup an execution state SpecTaint saves the emulator stateeg all registers For memory it is too heavyweight to savethe entire memory so we adopt a lightweight ldquocopy-on-writerdquoapproach More specifically SpecTaint keeps track of memoryregions that are modified during speculative execution Beforethe memory regions are modified SpecTaint saves the originalvalues When making a state rollback SpecTaint writes theoriginal values back to the memory in a reversed order asthey are modified Moreover SpecTaint uses dynamic tainttracking to detect Spectre gadgets Thus it also restorestaint information that is created during simulated speculativeexecution More details are discussed in Section IV-A

Path Exploration The path exploration considers two typesof path coverage the NE and SE path coverage The pathcoverage on normal execution is used to explore more pathsduring the normal execution We utilize fuzzing techniquessuch as AFL [2] to improve path coverage on normal execu-tion The SE path coverage is to measure how many speculativepaths are covered The goal is to explore speculative executioncomprehensively so as to avoid missing Spectre gadgets onuncovered speculative paths

The switch point is the instruction that transfers the exe-cution mode from NE to SE SpecTaint treats a conditionalbranch as a switch point and conducts the SE simulation Once

Algorithm 1 Speculative Path Exploration AlgorithmInput Entry point pc SEW size w Backup state

set saved stateOutput gadgets gadget setgadget setlarr empty inst countlarr 0Function explorer(pc w)

while inst count lt w doif is terminator(pc) then

state = saved statepop()restore(state)pc = statepcexplorer(pc stateinsn count)

endif is branch(pc) then

saved statepush(checkpoint(pc))foreach t isin get targets(pc) do

explorer(t w minus insn count)end

endexecute(pc)if gadget checker(pc) then

gadget setlarr pcendpclarr next pcinst count ++

endreturn

entering the speculative execution mode SpecTaint exploreseach conditional branch and its targets for the speculativeexploration The speculative path exploration algorithm isshown in Algorithm 1 We use pc to represent the currentinstruction and gadget set to store the locations of detectedSpectre gadgets The speculative execution path explorationis a depth-first traversal on the control flow graph of theprogram at the speculative execution mode When exploringspeculative paths SpecTaint will first check whether thecurrent execution path has reached the SEW limit w or anyother termination conditions and if so restore to the last savedstate If the current instruction at pc is a conditional branchSpecTaint will simulate a misprediction by walking throughboth targets of this branch instruction Before exploring eachtarget SpecTaint will first backup the state and push it into astack saved state After finishing the path exploration on atarget (eg reaching the SEW limit) SpecTaint will resumethe last saved state and continue for the next target Alongwith the exploration SpecTaint conducts the gadget patternchecking on the current instruction If it matches SpecTaintwill save this gadget into gadget set and continue explorationuntil termination conditions are met

Mitigating Path Explosion As presented in our evaluationthe length of each speculative execution path is bounded by theSEW limit (see TableV) However there can be a large numberof paths within this SEW limit in the worst case Thereforewe may still encounter a path explosion problem Accordingto the analysis of our evaluation dataset we identify two kindsof cases where the exponential growth of paths may happenloops and recursive functions We propose our approach to

5

address the path explosion issue That is SpecTaint has athreshold on how many times simulation can happen on thesame branch In real hardware it is unlikely that speculativeexecution can be triggered by the same branch repeatedlyAfter iterations the same branch will not be mispredictedsince its memory value are very likely to be stored in registersor be cached after iterations [35] As demonstrated by Kocheret al [29] executing the same branch five times is enough totrain the branch predictor Therefore we also set the thresholdto five and SpecTaint will not simulate speculative executionrepeatedly on the same branch if it reaches the thresholdWe admit that this approach might miss gadgets in paths wehave not explored yet More details about false negatives arediscussed in Section VII

IV SPECTRE GADGET DETECTION

This section addresses how we detect Spectre gadgetsduring dynamic speculative execution simulation More specif-ically we first formalize the Spectre gadget definitions and thendescribe how we check the patterns to detect potential Spectregadgets on speculative paths

A Dynamic Taint Tracking

If a variable is data-dependent on user inputs we con-sider this variable is under an attackerrsquos control To findexploitable Spectre gadgets it is essential to find gadgetsthat can be controlled through external inputs Therefore inorder to capture this control relation we utilize dynamic taintanalysis to trace variables that are data-dependent on externalinputs during execution To this end we label user inputsas taint sources and observe how the data flows from userinputs Specifically we utilize the whole-system dynamic taintanalysis shipped with the platform we extend DECAF [19]and perform taint propagation along with the execution of theprogram DECAFrsquos tainting rules have been formally verifiedto be sound (guarantee of no under-tainting at instructionlevel) and most of them have also been verified to be precise(guarantee of no over-tainting) The details are documented inthis paper [24] We conduct dynamic taint analysis on bothNE and SE paths Performing taint analysis on NE paths isto ensure the propagation of taint labels in normal programexecution performing taint analysis on SE paths is to facilitatethe gadget checker to detect exploitable Spectre gadgets Whenthe SE path is terminated (eg when reaching SEW size) theCPU state and memory modifications will be restored so as thetaint information When restoring to the previously saved statewe also clean the variables which are marked as tainted duringthe simulated speculative execution in order to maintain thecorrectness of taint propagation

B Spectre Gadget Modeling

In this section we formulate the gadget patterns for twotypes of Spectre V1 gadgets Bounds Check Bypass (BCB)and Bounds Check Bypass Store (BCBS) To facilitate thediscussion we define the following notions before giving theSpectre gadget definitions

bull c is a conditional branch instructionbull T (c) denotes a set of instructions in a speculative execu-

tion trace from c

bull m(i) denotes i is a memory read instructionbull str(i) denotes i is a memory write instructionbull [i ] denotes the memory value accessed by instruction i bull dep(i j ) denotes instruction i is data dependent on j bull t(i) denotes the operands of instruction i is taintedbull δ denotes the size of speculative execution window

BCB Gadget In BCB attacks the speculative load instructionis under an attackerrsquos control thus the attacker could readarbitrary values from memory Then another load instructionis required to load the secret-indexed memory location withthe intention of leaking the secret To leak the secret throughthe cache side channel the leak instruction has to use theloaded secret as the index to read from memory thus thesecret can be retrieved by monitoring the cache line statechanges In general the BCB gadget involves a set of arrayoperations and the index in the latter array operation is data-dependent on the value from the former array [34] Essentiallythe first array access is responsible for loading secrets and thesecond one is responsible for leaking secrets However notall the code sequences matching such patterns are consideredas a BCB gadget The BCB gadget demands the index of theformer array access should be under the attackerrsquos control suchthat the attacker could read arbitrary values from memory inthe target program space by carefully manipulating the inputvalues To capture this control relation we utilize dynamictaint analysis to track data flow from external inputs whichis discussed in IV-A Suppose the speculative execution startsfrom a conditional branch c and we formalize our BCB gadgetpattern as Φbcb(c) and its definition is as follows

Φbcb(c) = existi j isin T (c)m(i) andm(j ) and dep(j [i ])

andt(i) and |c j | lt δ(1)

BCBS Gadget Unlike BCB gadget BCBS uses a speculativewrite (SW) to modify arbitrary memory locations In BCBSattacks attackers control the index of an array that wouldaccess out of boundary memory during speculative executionAs a result attackers can modify an arbitrary memory location(eg return address) by manipulating the index of the arrayWe formalize our BCBS gadget pattern as Φbcbs(c) and itsdefinition is as follows

Φbcbs(c)= existi isin T (c) str(i) and t(i) and |c i | lt δ (2)

In both patterns we leverage dynamic taint analysis todetermine whether the index of an array is under attackersrsquocontrol or not If an instruction i is tainted we consider i isunder attackersrsquo control and t(i) is set to be true

Gadget Classification Our gadget patterns are similar towhat is proposed in oo7 [43] The primary difference liesin whether the branch instruction is tainted or not Oo7 [43]considers that the branch should be tainted and controlled bythe inputs Thus attackers can poison the branch predictor tocause intentional misprediction of that branch by executing thevictim program with carefully-constructed inputs However asdiscovered by Canella et al [17] the branch prediction buffers

6

are shared and commonly indexed by the virtual address of thebranch instruction thus can be poisoned from another attack-controlled process by executing a congruent branch with thesame virtual address Based on this finding Spectre gadgetscan be exploited in the same address space (ie intra-process)or across address spaces (ie cross-process) [17] Thereforeonly considering tainted branches will exclude cross-processgadgets Our gadget patterns cover both situations if thespeculative execution is triggered by a taint conditional branchwe will mark this gadget as an intra-process Spectre gadgetotherwise this gadget will be marked as a cross-processSpectre gadget

C Gadget Detection

The gadget detection is to check whether the speculativeinstruction trace conforms to the gadget patterns describedabove We deploy the gadget checker in simulated speculativeexecution Although both our gadget patterns involve memoryaccess operations it is inefficient and unnecessary to checkall the memory access instructions on SE paths Instead weonly check memory access instructions which are tainted Tothis end we instrument memory read and write operationsTo detect BCB gadgets for each memory read we checkwhether its source is tainted To make sure the dependencydep(j [i ]) rule is satisfied between two instructions we alsotrack whether the tainted operand of one instruction j ispropagated from instruction i For BCBS gadgets the patterncaptures any tainted memory write whose destination addressis marked as tainted within the SEW

V IMPLEMENTATION

We have implemented the prototype SpecTaint in C Morespecifically we wrote a C plugin of 1 KLOC in C codeon top of DECAF [25] (a dynamic binary analysis platformbuilt on top of QEMU 10) This plugin implements thestate checkpoint management and Spectre gadget detectioncomponents We reused the dynamic taint analysis plugin ofDECAF for our taint analysis Overall the changes to developour prototype do not exceed 2 KLOC Besides to increase thecode coverage we use AFL 252b [2] and honggfuzz [8] togenerate seed inputs

VI EXPERIMENTAL EVALUATION

In this section we evaluate SpecTaint to answer thefollowing research questions

1) How effective is SpecTaint to find Spectre gadgets com-pared with other existing tools

2) How efficient is SpecTaint to find Spectre gadgets inreal-world applications

This section is composed as follows First we brieflydescribe the experiment setup the datasets and the evaluationmetrics used in our experiments (Sections VI-A) Second weevaluate the efficacy of SpecTaint (Sections VI-B) Then weevaluate the efficiency of SpecTaint on real-world applications(Sections VI-E) Finally we conduct case studies of someSpectre gadgets detected by SpecTaint in real-world applica-tions and the deep learning framework Caffe (Sections VI-F)

A Experiment Setup

Baseline Methods We compare SpecTaint with three base-line approaches Spectre 1 Scanner from Red Hat (RH Scan-ner) [5] oo7 [43] and SpecFuzz [39] RH Scanner is a staticanalysis tool that can be used to scan for Spectre gadgetsOo7 [43] is another static analysis tool that utilizes static taintanalysis to find Spectre gadgets SpecFuzz [39] extends thefuzzing technique to detect errors in speculative execution andreport Spectre gadgets Another related work SPECTECTORleverages symbolic execution to detect information-flow differ-ences introduced by speculative execution Since the manualsettings to make it work on large programs are not open-source it is hard to evaluate it on our real-world benchmarksAs discussed in SpecFuzz [39] the authors failed to runSPECTECTOR on the real-world benchmarks due to a largenumber of unsupported instructions Therefore we did notcompare with SPECTECTOR on real-world benchmarks (samewith SpecFuzzrsquos)

Evaluation Dataset The evaluation is conducted on twodatasets the Spectre Samples Dataset and Real-world DatasetAligned with other baseline works we utilize the same SpectreSamples Dataset to demonstrate the detection capability ofSpecTaint For baseline comparison with related works wecollected six real-world applications and created two real-world datasets

bull Spectre Samples Dataset This dataset is designed todemonstrate the efficacy of SpecTaint We collected15 Spectre V1 samples created by Paul Kocher [3] andcompiled 15 samples with the same configuration (gcc-484 with O0) [39]

bull Real-world V1 Dataset For fair baseline comparisonwe use the same dataset as SpecFuzzrsquos [39] This datasetcontains six widely used applications one cryptographicprogram from OpenSSL [12] a compression program(Brotli [7]) and four parsing programs (JSON [10]LibHTP [11] HTTP [9] and YAML [6])

bull Real-world V2 Dataset Systematic baseline evaluationis difficult due to the shortage of ground truth in real-world programs To solve this problem we injectedknown Spectre gadgets into programs from Real-worldV1 Dataset and created the Real-world V2 Dataset Weadopt the same injection approach proposed in LAVA [20]to build this dataset More specifically we utilize dynamictaint analysis to find attack points which can be controlledby input bytes that do not determine control flow andhave not been modified much (see [20] for more details)Then we inject the Spectre gadgets from Spectre SampleDataset into target programs and add code to makeinjected gadgets controllable via input bytes As a resultwe injected 15 Spectre V1 gadgets from Spectre SampleDataset into 52 different locations in six programs Tomake a fair comparison the input seeds used in the evalu-ation are all generated by a fuzzing tool [8] Therefore wehave no clue whether the injected locations are coveredby input seeds in the evaluation It is worth mentioningthat this dataset is used to evaluate the detection coverageof SpecTaint and related works instead of pathcodecoverage and we choose the same seed corpus withSpecFuzzrsquos to guarantee a fair comparison

7

Evaluation Metrics For the Real-world V1 dataset whichdoes not have ground truth we manually verify the detectionresults and calculate the precision rate to quantify the perfor-mance of SpecTaint and baseline approaches The precisionis calculated as precision(P ) = TP

TP+FP where TP is thenumber of detected gadgets that are manually verified to beexploitable and FP is the number of detected gadgets thatare not exploitable based on manual analysis For Real-worldV2 Dataset we only consider injected Spectre gadgets to betrue positives and other reported results to be false positivesThen we calculate the precision and recall to quantify theeffectiveness of the proposed approach and baseline methodsThe recall is calculated as recall(R) = TP

TP+FN where TP isthe number of inserted gadgets which are correctly detectedand FN is the number of inserted gadgets that are missed Theprecision is calculated as precision(P ) = TP

TP+FP where FPis the number of detected gadgets other than injected ones Tomeasure the efficiency we reported runtime per speculativeexecution and number of paths explored per speculative exe-cution along with other statistics (see Section VI-E for moredetails)

Configuration The experiments were conducted on a desktopwith 16 GB memory Intel Core i7 12 cores at 370 GHzCPU and running Linux 415 The Guest OS in QEMU isUbuntu 1404 with 1 GB memory The speculative window isdependent on the space limit ie ROB size and timing limitsWe follow the configuration used by SpecFuzz [39] and alsoset the speculative window size to 250

B Baseline Evaluation on Spectre Samples Dataset

In this experiment we compared SpecTaint with threebaseline tools on the Spectre Sample Dataset As presented inTable II SpecTaint successfully detected all Spectre gadgetsin the Spectre Samples Dataset while RH Scanner relies onsyntax-based pattern matching and missed three cases

C Baseline Comparison on Real-world V2 Dataset

We conducted the baseline comparison with three relatedworks RH Scanner oo7 and SpecFuzz on Real-world V2Dataset In this experiment we focused on detecting theinserted gadgets therefore we only consider inserted gadgetsto be true positives and all other detection results to befalse positives For tools that utilize taint tracking to detectSpectre gadget (oo7 and SpecTaint) we mark input bytes astaint sources Since the analysis of oo7 is very slow whenperforming whole input bytes tainting we only mark inputbytes that influence injected gadgets as taint sources We adoptthe same configuration for SpecTaint To compare with RHScanner and SpecFuzz we have another configuration forSpecTaint where we mark all input bytes as tainted Theresults for the former configuration are labeled with ldquordquo inTable I For dynamic analysis tools (SpecTaint and SpecFuzz)we used an external fuzzing tool [8] to fuzz the six programsfor 10 hours and fed the generated seeds as inputs to run theprograms

1 2 i f ( p a r s e rminusgtt o k n e x t gt= num tokens ) 3 I n s e r t e d S p e c t r e Gadget4 i f d e f SPECTRE VARIANT5 i f ( g l o b a l i d x lt a r r a y 1 s i z e )6 tmp amp= a r r a y 2 [ a r r a y 1 [ g l o b a l i d x ] lowast 5 1 2 ] 7 8 e n d i f9

Listing 2 Missed Spectre gadget 1 by SpecFuzz

1 2 i f ( lminusgtf i r s t + i d x lt lminusgtmax size ) 3 r e t u r n ( vo id lowast) lminusgte l e m e n t s [ lminusgtf i r s t + i d x ] 4 e l s e 5 I n s e r t e d S p e c t r e Gadget6 i f d e f SPECTRE VARIANT7 i n t temp = 0 8 i n t lowastadd r = ampg l o b a l i d x 9 i f (lowast add r lt a r r a y 1 s i z e )

10 temp amp= a r r a y 2 [ a r r a y 1 [lowast add r ] lowast 5 1 2 ] 11 12 e n d i f13

Listing 3 Missed Spectre gadget 2 by SpecFuzz

Table I shows that when tainting gadget-related inputbytes SpecTaint has no false positives and achieves a pre-cision rate of 100 Under the same configuration howeveroo7 still produced false positives For instance it reported13 gadget candidates in LibHTP but 12 of them are falsepositives The results show that static taint analysis suffersfrom the over-tainting issue thereby is hard to achieve highprecision in detecting Spectre gadgets As presented in Table Ioo7 has many false negatives For example it missed allinserted gadgets in YAML and Brotli We examined these falsenegatives and found the reasons are as follows Firstly thedetection results of oo7 [43] depend on the completeness of thecontrol-flow graph (CFG) extraction Some inserted gadgetsare missed since it failed to extract a complete CFG dueto the limitation of the static approach Also oo7 is limitedby static inter-procedure taint tracking and it missed manyinserted gadgets because it failed to propagate the taint sourceto the injection points in the target programs The evaluationresults substantiate our claim that dynamic taint analysis ismuch more accurate and effective than static taint analysis indetecting Spectre gadgets

As presented in Table I SpecTaint outperforms RHScanner and SpecFuzz under the whole input bytes taintingconfiguration Since we only consider injected gadgets to betrue positives in this dataset other detection results are labeledas false positives The analysis results of other gadgets reportedby SpecTaint are presented in Table IV Note that Spec-Taint missed some injected gadgets in this experiment Wefurther investigated the results and found that missed gadgetsare not covered by input seeds However SpecFuzz missedmany injected gadgets that are covered by input seeds anddetected by SpecTaint According to our analysis the reasonsare first SpecFuzz adopts a prioritized simulation of branchmispredictions it selectively chooses whether to simulate themisprediction or not on a conditional branch Therefore itmissed some injected gadgets For example as presented inListing 2 the seeds are able to reach the branch at line 2 butSpecFuzz did not simulate branch misprediction over these

8

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 3: SpecTaint: Speculative Taint Analysis for Discovering

both targets of a conditional branch By doing so it at leastcovers one path that is not taken during real execution which isconsidered as a speculative path During scanning it searchesa certain pattern in the binary to detect Spectre gadgetsHowever the syntax-based code pattern used by RH scannercould produce a large number of unexploitable candidatessince not all detected gadgets can be controlled by attackers

Oo7 [43] also conducts a static binary analysis The differ-ence lies in that it conducts a semantic-based gadget detectionThat is it leverages the data flow analysis to construct asemantic code pattern and identifies the code snippet that notonly satisfies predefined patterns but also can be controlled viauser inputs The intuition behind it is that if the gadgets canbe influenced by user inputs attackers can leverage carefully-constructed inputs to exploit these gadgets To this end oo7utilizes static taint analysis to trace information flow frominputs However it is well-accepted that static taint analysissuffers from severe over-tainting and under-tainting issues dueto imprecise memory alias analysis inaccurate control flowgraphs and call graphs etc

The aforementioned works examine speculative executionstatically and deploy different program analysis techniques forSpectre gadget detection However the detection capabilitiesof these approaches inherit the limitations of static analysistechniques Therefore these techniques by design suffer fromhigh false positives and false negatives

SPECTECTOR [23] mathematically defines a semanticnotion of security against speculative execution and developsan algorithm based on symbolic execution to prove the absenceof speculative leaks However this approach inherits the bottle-necks of symbolic execution and has to sacrifice the soundnessand completeness of analysis when analyzing large programs

SpecFuzz [39] takes a fuzzing approach to dynamicallydetect Spectre gadgets It exposes speculative execution tofuzzing by inserting the speculative execution logic into theprogram at compile time and relies on random mutation ofprogram inputs to detect speculative execution errors duringprogram execution To ensure high fuzzing throughput itsgadget detection logic is oversimplified More specifically ithas the following limitations

bull Simplistic Gadget Modeling To avoid high runtimeoverhead during fuzzing SpecFuzz simply leverages amemory safety checker (eg AddressSanitizer) to detectout-of-bounds memory accesses and considers all out-of-bounds memory accesses to be potential Spectre gadgetsThis modeling is oversimplified and error-prone An out-of-bounds memory access during speculative executionmay not be controlled via user inputs and does not nec-essarily constitute a Spectre V1 gadget Our experimentsin Section VI substantiate this claim

bull Probabilistic Detection SpecFuzz detects Spectre gad-gets by capturing out-of-bounds memory access errorsduring speculative execution However even if a trueSpectre gadget is indeed exercised during fuzzing it mayor may not trigger any out-of-bounds memory accesserrors SpecFuzz relies on random mutation of programinputs to trigger these errors As a result the Spectregadget detection of SpecFuzz is probabilistic

bull Flawed Exception Handling Exceptions are likely tooccur during simulated speculative execution because itexecutes a path which might not be expected To dealwith exceptions during speculative execution again forsimplicity SpecFuzz stops the simulation immediatelyand restores the execution to a previously saved state Thisexception handling is flawed because it does not correctlysimulate how the hardware actually behaves In realitywhen encountering an exception during speculative execu-tion the processor can continue the speculative executionuntil it is terminated [34] Consequently SpecFuzz mightmiss Spectre gadgets that are located after the exception-raising instruction thereby causing false negatives

In this work we would also like to take a dynamic analysisapproach to detect Spectre gadgets to ensure high detectionaccuracy We resort to independent test case generation sys-tems (such as fuzzing and symbolic execution) to producehigh-quality test cases to achieve high detection coverageIn order to achieve high detection accuracy in real-worldsoftware we need to strike a balance between scalability andgadget detection fidelity That is our analysis must be able tocope with complex real-world software and faithful enough toensure high detection accuracy

To this end we propose to perform dynamic taint analysison speculative execution paths and conduct taint-based pat-tern checking to characterize Spectre gadgets at the semanticlevel Essentially by performing dynamic taint analysis onspeculative execution paths we can detect memory accessesthat are dependent on the program input (which attackers cancontrol) and may cause information leakage through cacheside channels This taint-based gadget pattern checking mightnot be as precise as SPECTECTOR [23] but our scheme isdesigned to be scalable and practical for detecting Spectregadgets from real-world programs It is more faithful thanthe one used in SpecFuzz [39] as it is deterministic (oncethe program inputs are determined) rather than probabilistic(relying on random mutation of program inputs to triggererrors) Our evaluation in Section VI shows that this taint-basedapproach is able to achieve much better detection accuracy thanSpecFuzz by paying more runtime overhead on speculativetaint analysis In other words our trade-off between scalabilityand gadget detection fidelity is justified

C System Overview

Architecture As illustrated in Fig 1 we extend a systememulator to simulate speculative execution and enable dynamictaint analysis on top of it Given a target program we simulatespeculative execution of the CPU by dynamically forcing theCPU emulator to execute code paths (speculative instructions)which will not be executed at normal execution We alsoconduct tainting analysis on speculatively executed instructionsto detect Spectre gadgets

Workflow Fig 2 illustrates the workflow of SpecTaintSpecTaint will go through two stages the normal executionstage and speculative execution stage At the normal executionstage SpecTaint runs the target program with seeds generatedby external fuzzers to explore as many execution paths aspossible SpecTaint will start to simulate speculative execution

3

Fig 1 Architecture of SpecTaint

Fig 2 Workflow of SpecTaint

when encountering a conditional branch and backup the cur-rent execution state When the speculative execution window(SEW) is reached SpecTaint will roll back the executionstate to the previously saved checkpoint Like the rollback ofspeculative execution in the hardware our simulated rollbackalso squashes all the side effects produced during speculativeexecution At the speculative execution stage a Spectre gadgetdetector conducts pattern checking on each speculative execu-tion path to detect gadgets To mitigate Spectre attacks thereported gadgets are forwarded to automatic serialization tools(eg Speculative Load Hardening [18])

Threat Model and Scope We share the same threat modelwith other Spectre gadget detection tools [39] [43] That isour analyzed programs are benign but might be vulnerableWe do not deal with malicious code that might deliberatelythwart or escape our analysis Our approach can be used toanalyze the exploit code for Spectre gadgets in a malwaresample but it is not our focus in this paper In this workwe focus on detecting gadgets in victim programs that canbe exploited by Spectre V1 attacks and leak sensitive datathrough cache side channels Meltdown-type attacks do notrequire gadgets in victim programs and can be launched inmalicious programs Therefore they fall out of the scope ofthis work Our simulation focuses on Spectre V1 gadgetsdetection at the binary level We do not simulate micro-operations instruction execution including instruction fetchingdecoding etc and irrelevant hardware structures such as thereorder buffer (ROB) We only use the size of ROB to calculatethe speculative execution window (SEW)

III DYNAMIC SPECULATIVE EXECUTION SIMULATION

In this section we introduce our speculative execution sim-ulation platform for Spectre gadgets detection Our platformextends the binary analysis platform DECAF [19] which isbuilt on top of the system emulator QEMU [14] Specificallywe will showcase how we utilize the system emulator to sim-ulate speculative execution triggered by branch mispredictionsand how we explore speculative paths in a depth-first manner

A Misprediction Simulation

Spectre V1 exploits out-of-bounds memory accesses inspeculative execution triggered by one or more mispredic-tions of conditional branches SpecTaint extends a systememulator to simulate this behavior In the system emulatorpc stores the next instruction to be executed in the caseof a conditional jump it stores the jump target address Tosimulate the misprediction behavior of processors SpecTaintinverts the direction of the conditional jump by replacingthe pc with the untaken target address As a result it willexecute the untaken path first at a conditional branch andenter the simulated speculative execution Before it goes tothe untaken path a state checkpoint is saved which containsthe current state of the emulator eg CPU registers Duringthe simulated speculative execution any modification to thememory will be logged Specifically the original value will besaved before a speculatively executed instruction modifies itWhen the speculative execution window is reached the simu-lated speculative execution will be terminated and the controlflow transfers back to the last saved checkpoint To maintainthe correctness of execution the memory modifications willbe restored by writing the original values back to memoryand the CPU state will be restored with the saved state Asfor other conditions to terminate speculative execution weconsider serialization instructions (eg SYSCALL LFENCEetc) listed by Mambretti et al [35] and terminate the simulatedspeculative execution when encountering these instructions

Exception Handling Exceptions can be very common duringthe simulation of speculative executions This is because aCPU will execute a path which is not expected to executeA straightforward way for simulation is to terminate thesimulation and roll back whenever an exception is triggered asdone in SpecFuzz [39] However it could miss Spectre gadgetsafter the exception point along the speculative path sincespeculative execution is not supposed to be terminated whenhardware exceptions are raised in many CPU models [34] Thespeculative execution is expected to continue even if an excep-tion occurs To guarantee a precise simulation our approachdoes not roll back when an exception occurs and instead forcesit to silently simulate this exception instruction without raisingany exception and continue speculative execution on the nextinstruction

To simulate this behavior we extend the exception handlerof the system emulator More specifically we use a customizedexception handler to capture errors in speculative executionWhen an exception due to the permission violation (eg ac-cessing kernel space from user-land) occurs during a simulatedspeculative execution we will silently simulate this exceptioninstruction without raising any exception and continue theuser-land execution after the exception instruction By doing

4

so SpecTaint is able to continue the simulation even whenexceptions occur For other exceptions which SpecTaint isnot able to handle (eg caused by invalid jump target orincomplete address translation) because the emulator does notknow where the next instruction to be executed is SpecTaintwill make a state rollback and restore to the previously savedcheckpoint

B Exploration of Speculative Execution Paths

The path exploration involves two path types the normalexecution (NE) path and speculative execution (SE) path Toexplore the NE paths we feed seeds generated by fuzzingtools into the target program When encountering a conditionalbranching instruction SpecTaint redirects the execution tothe untaken branch first to explore SE paths Crucially CPUscan perform nested speculative execution in a SE path Sincewe center on Spectre V1 which exploits the misprediction ofconditional branches we take into account nested speculativeexecutions triggered by conditional branches To simulate thisbehavior SpecTaint explores SE paths for each conditionalbranch in NE paths and SE paths within SEW In other wordsSpecTaint explores the speculative paths in a depth-firstmanner During the exploration SpecTaint monitors whetherany termination condition is met and restore to the last savedstate if there is any

Execution State Rollback It is crucial to restore the executionstate after the speculative execution is terminated so as tomaintain the correctness of program execution An executionstate includes the state of all the CPU registers and used mem-ory regions The execution state rollback has two operationsstate backup and resume For a conditional branch SpecTaintfirst backups the current execution state before simulatesspeculative execution on that branch When the SE simulationis terminated SpecTaint resumes the backup state by resettingthe current execution environment with the backup state Tobackup an execution state SpecTaint saves the emulator stateeg all registers For memory it is too heavyweight to savethe entire memory so we adopt a lightweight ldquocopy-on-writerdquoapproach More specifically SpecTaint keeps track of memoryregions that are modified during speculative execution Beforethe memory regions are modified SpecTaint saves the originalvalues When making a state rollback SpecTaint writes theoriginal values back to the memory in a reversed order asthey are modified Moreover SpecTaint uses dynamic tainttracking to detect Spectre gadgets Thus it also restorestaint information that is created during simulated speculativeexecution More details are discussed in Section IV-A

Path Exploration The path exploration considers two typesof path coverage the NE and SE path coverage The pathcoverage on normal execution is used to explore more pathsduring the normal execution We utilize fuzzing techniquessuch as AFL [2] to improve path coverage on normal execu-tion The SE path coverage is to measure how many speculativepaths are covered The goal is to explore speculative executioncomprehensively so as to avoid missing Spectre gadgets onuncovered speculative paths

The switch point is the instruction that transfers the exe-cution mode from NE to SE SpecTaint treats a conditionalbranch as a switch point and conducts the SE simulation Once

Algorithm 1 Speculative Path Exploration AlgorithmInput Entry point pc SEW size w Backup state

set saved stateOutput gadgets gadget setgadget setlarr empty inst countlarr 0Function explorer(pc w)

while inst count lt w doif is terminator(pc) then

state = saved statepop()restore(state)pc = statepcexplorer(pc stateinsn count)

endif is branch(pc) then

saved statepush(checkpoint(pc))foreach t isin get targets(pc) do

explorer(t w minus insn count)end

endexecute(pc)if gadget checker(pc) then

gadget setlarr pcendpclarr next pcinst count ++

endreturn

entering the speculative execution mode SpecTaint exploreseach conditional branch and its targets for the speculativeexploration The speculative path exploration algorithm isshown in Algorithm 1 We use pc to represent the currentinstruction and gadget set to store the locations of detectedSpectre gadgets The speculative execution path explorationis a depth-first traversal on the control flow graph of theprogram at the speculative execution mode When exploringspeculative paths SpecTaint will first check whether thecurrent execution path has reached the SEW limit w or anyother termination conditions and if so restore to the last savedstate If the current instruction at pc is a conditional branchSpecTaint will simulate a misprediction by walking throughboth targets of this branch instruction Before exploring eachtarget SpecTaint will first backup the state and push it into astack saved state After finishing the path exploration on atarget (eg reaching the SEW limit) SpecTaint will resumethe last saved state and continue for the next target Alongwith the exploration SpecTaint conducts the gadget patternchecking on the current instruction If it matches SpecTaintwill save this gadget into gadget set and continue explorationuntil termination conditions are met

Mitigating Path Explosion As presented in our evaluationthe length of each speculative execution path is bounded by theSEW limit (see TableV) However there can be a large numberof paths within this SEW limit in the worst case Thereforewe may still encounter a path explosion problem Accordingto the analysis of our evaluation dataset we identify two kindsof cases where the exponential growth of paths may happenloops and recursive functions We propose our approach to

5

address the path explosion issue That is SpecTaint has athreshold on how many times simulation can happen on thesame branch In real hardware it is unlikely that speculativeexecution can be triggered by the same branch repeatedlyAfter iterations the same branch will not be mispredictedsince its memory value are very likely to be stored in registersor be cached after iterations [35] As demonstrated by Kocheret al [29] executing the same branch five times is enough totrain the branch predictor Therefore we also set the thresholdto five and SpecTaint will not simulate speculative executionrepeatedly on the same branch if it reaches the thresholdWe admit that this approach might miss gadgets in paths wehave not explored yet More details about false negatives arediscussed in Section VII

IV SPECTRE GADGET DETECTION

This section addresses how we detect Spectre gadgetsduring dynamic speculative execution simulation More specif-ically we first formalize the Spectre gadget definitions and thendescribe how we check the patterns to detect potential Spectregadgets on speculative paths

A Dynamic Taint Tracking

If a variable is data-dependent on user inputs we con-sider this variable is under an attackerrsquos control To findexploitable Spectre gadgets it is essential to find gadgetsthat can be controlled through external inputs Therefore inorder to capture this control relation we utilize dynamic taintanalysis to trace variables that are data-dependent on externalinputs during execution To this end we label user inputsas taint sources and observe how the data flows from userinputs Specifically we utilize the whole-system dynamic taintanalysis shipped with the platform we extend DECAF [19]and perform taint propagation along with the execution of theprogram DECAFrsquos tainting rules have been formally verifiedto be sound (guarantee of no under-tainting at instructionlevel) and most of them have also been verified to be precise(guarantee of no over-tainting) The details are documented inthis paper [24] We conduct dynamic taint analysis on bothNE and SE paths Performing taint analysis on NE paths isto ensure the propagation of taint labels in normal programexecution performing taint analysis on SE paths is to facilitatethe gadget checker to detect exploitable Spectre gadgets Whenthe SE path is terminated (eg when reaching SEW size) theCPU state and memory modifications will be restored so as thetaint information When restoring to the previously saved statewe also clean the variables which are marked as tainted duringthe simulated speculative execution in order to maintain thecorrectness of taint propagation

B Spectre Gadget Modeling

In this section we formulate the gadget patterns for twotypes of Spectre V1 gadgets Bounds Check Bypass (BCB)and Bounds Check Bypass Store (BCBS) To facilitate thediscussion we define the following notions before giving theSpectre gadget definitions

bull c is a conditional branch instructionbull T (c) denotes a set of instructions in a speculative execu-

tion trace from c

bull m(i) denotes i is a memory read instructionbull str(i) denotes i is a memory write instructionbull [i ] denotes the memory value accessed by instruction i bull dep(i j ) denotes instruction i is data dependent on j bull t(i) denotes the operands of instruction i is taintedbull δ denotes the size of speculative execution window

BCB Gadget In BCB attacks the speculative load instructionis under an attackerrsquos control thus the attacker could readarbitrary values from memory Then another load instructionis required to load the secret-indexed memory location withthe intention of leaking the secret To leak the secret throughthe cache side channel the leak instruction has to use theloaded secret as the index to read from memory thus thesecret can be retrieved by monitoring the cache line statechanges In general the BCB gadget involves a set of arrayoperations and the index in the latter array operation is data-dependent on the value from the former array [34] Essentiallythe first array access is responsible for loading secrets and thesecond one is responsible for leaking secrets However notall the code sequences matching such patterns are consideredas a BCB gadget The BCB gadget demands the index of theformer array access should be under the attackerrsquos control suchthat the attacker could read arbitrary values from memory inthe target program space by carefully manipulating the inputvalues To capture this control relation we utilize dynamictaint analysis to track data flow from external inputs whichis discussed in IV-A Suppose the speculative execution startsfrom a conditional branch c and we formalize our BCB gadgetpattern as Φbcb(c) and its definition is as follows

Φbcb(c) = existi j isin T (c)m(i) andm(j ) and dep(j [i ])

andt(i) and |c j | lt δ(1)

BCBS Gadget Unlike BCB gadget BCBS uses a speculativewrite (SW) to modify arbitrary memory locations In BCBSattacks attackers control the index of an array that wouldaccess out of boundary memory during speculative executionAs a result attackers can modify an arbitrary memory location(eg return address) by manipulating the index of the arrayWe formalize our BCBS gadget pattern as Φbcbs(c) and itsdefinition is as follows

Φbcbs(c)= existi isin T (c) str(i) and t(i) and |c i | lt δ (2)

In both patterns we leverage dynamic taint analysis todetermine whether the index of an array is under attackersrsquocontrol or not If an instruction i is tainted we consider i isunder attackersrsquo control and t(i) is set to be true

Gadget Classification Our gadget patterns are similar towhat is proposed in oo7 [43] The primary difference liesin whether the branch instruction is tainted or not Oo7 [43]considers that the branch should be tainted and controlled bythe inputs Thus attackers can poison the branch predictor tocause intentional misprediction of that branch by executing thevictim program with carefully-constructed inputs However asdiscovered by Canella et al [17] the branch prediction buffers

6

are shared and commonly indexed by the virtual address of thebranch instruction thus can be poisoned from another attack-controlled process by executing a congruent branch with thesame virtual address Based on this finding Spectre gadgetscan be exploited in the same address space (ie intra-process)or across address spaces (ie cross-process) [17] Thereforeonly considering tainted branches will exclude cross-processgadgets Our gadget patterns cover both situations if thespeculative execution is triggered by a taint conditional branchwe will mark this gadget as an intra-process Spectre gadgetotherwise this gadget will be marked as a cross-processSpectre gadget

C Gadget Detection

The gadget detection is to check whether the speculativeinstruction trace conforms to the gadget patterns describedabove We deploy the gadget checker in simulated speculativeexecution Although both our gadget patterns involve memoryaccess operations it is inefficient and unnecessary to checkall the memory access instructions on SE paths Instead weonly check memory access instructions which are tainted Tothis end we instrument memory read and write operationsTo detect BCB gadgets for each memory read we checkwhether its source is tainted To make sure the dependencydep(j [i ]) rule is satisfied between two instructions we alsotrack whether the tainted operand of one instruction j ispropagated from instruction i For BCBS gadgets the patterncaptures any tainted memory write whose destination addressis marked as tainted within the SEW

V IMPLEMENTATION

We have implemented the prototype SpecTaint in C Morespecifically we wrote a C plugin of 1 KLOC in C codeon top of DECAF [25] (a dynamic binary analysis platformbuilt on top of QEMU 10) This plugin implements thestate checkpoint management and Spectre gadget detectioncomponents We reused the dynamic taint analysis plugin ofDECAF for our taint analysis Overall the changes to developour prototype do not exceed 2 KLOC Besides to increase thecode coverage we use AFL 252b [2] and honggfuzz [8] togenerate seed inputs

VI EXPERIMENTAL EVALUATION

In this section we evaluate SpecTaint to answer thefollowing research questions

1) How effective is SpecTaint to find Spectre gadgets com-pared with other existing tools

2) How efficient is SpecTaint to find Spectre gadgets inreal-world applications

This section is composed as follows First we brieflydescribe the experiment setup the datasets and the evaluationmetrics used in our experiments (Sections VI-A) Second weevaluate the efficacy of SpecTaint (Sections VI-B) Then weevaluate the efficiency of SpecTaint on real-world applications(Sections VI-E) Finally we conduct case studies of someSpectre gadgets detected by SpecTaint in real-world applica-tions and the deep learning framework Caffe (Sections VI-F)

A Experiment Setup

Baseline Methods We compare SpecTaint with three base-line approaches Spectre 1 Scanner from Red Hat (RH Scan-ner) [5] oo7 [43] and SpecFuzz [39] RH Scanner is a staticanalysis tool that can be used to scan for Spectre gadgetsOo7 [43] is another static analysis tool that utilizes static taintanalysis to find Spectre gadgets SpecFuzz [39] extends thefuzzing technique to detect errors in speculative execution andreport Spectre gadgets Another related work SPECTECTORleverages symbolic execution to detect information-flow differ-ences introduced by speculative execution Since the manualsettings to make it work on large programs are not open-source it is hard to evaluate it on our real-world benchmarksAs discussed in SpecFuzz [39] the authors failed to runSPECTECTOR on the real-world benchmarks due to a largenumber of unsupported instructions Therefore we did notcompare with SPECTECTOR on real-world benchmarks (samewith SpecFuzzrsquos)

Evaluation Dataset The evaluation is conducted on twodatasets the Spectre Samples Dataset and Real-world DatasetAligned with other baseline works we utilize the same SpectreSamples Dataset to demonstrate the detection capability ofSpecTaint For baseline comparison with related works wecollected six real-world applications and created two real-world datasets

bull Spectre Samples Dataset This dataset is designed todemonstrate the efficacy of SpecTaint We collected15 Spectre V1 samples created by Paul Kocher [3] andcompiled 15 samples with the same configuration (gcc-484 with O0) [39]

bull Real-world V1 Dataset For fair baseline comparisonwe use the same dataset as SpecFuzzrsquos [39] This datasetcontains six widely used applications one cryptographicprogram from OpenSSL [12] a compression program(Brotli [7]) and four parsing programs (JSON [10]LibHTP [11] HTTP [9] and YAML [6])

bull Real-world V2 Dataset Systematic baseline evaluationis difficult due to the shortage of ground truth in real-world programs To solve this problem we injectedknown Spectre gadgets into programs from Real-worldV1 Dataset and created the Real-world V2 Dataset Weadopt the same injection approach proposed in LAVA [20]to build this dataset More specifically we utilize dynamictaint analysis to find attack points which can be controlledby input bytes that do not determine control flow andhave not been modified much (see [20] for more details)Then we inject the Spectre gadgets from Spectre SampleDataset into target programs and add code to makeinjected gadgets controllable via input bytes As a resultwe injected 15 Spectre V1 gadgets from Spectre SampleDataset into 52 different locations in six programs Tomake a fair comparison the input seeds used in the evalu-ation are all generated by a fuzzing tool [8] Therefore wehave no clue whether the injected locations are coveredby input seeds in the evaluation It is worth mentioningthat this dataset is used to evaluate the detection coverageof SpecTaint and related works instead of pathcodecoverage and we choose the same seed corpus withSpecFuzzrsquos to guarantee a fair comparison

7

Evaluation Metrics For the Real-world V1 dataset whichdoes not have ground truth we manually verify the detectionresults and calculate the precision rate to quantify the perfor-mance of SpecTaint and baseline approaches The precisionis calculated as precision(P ) = TP

TP+FP where TP is thenumber of detected gadgets that are manually verified to beexploitable and FP is the number of detected gadgets thatare not exploitable based on manual analysis For Real-worldV2 Dataset we only consider injected Spectre gadgets to betrue positives and other reported results to be false positivesThen we calculate the precision and recall to quantify theeffectiveness of the proposed approach and baseline methodsThe recall is calculated as recall(R) = TP

TP+FN where TP isthe number of inserted gadgets which are correctly detectedand FN is the number of inserted gadgets that are missed Theprecision is calculated as precision(P ) = TP

TP+FP where FPis the number of detected gadgets other than injected ones Tomeasure the efficiency we reported runtime per speculativeexecution and number of paths explored per speculative exe-cution along with other statistics (see Section VI-E for moredetails)

Configuration The experiments were conducted on a desktopwith 16 GB memory Intel Core i7 12 cores at 370 GHzCPU and running Linux 415 The Guest OS in QEMU isUbuntu 1404 with 1 GB memory The speculative window isdependent on the space limit ie ROB size and timing limitsWe follow the configuration used by SpecFuzz [39] and alsoset the speculative window size to 250

B Baseline Evaluation on Spectre Samples Dataset

In this experiment we compared SpecTaint with threebaseline tools on the Spectre Sample Dataset As presented inTable II SpecTaint successfully detected all Spectre gadgetsin the Spectre Samples Dataset while RH Scanner relies onsyntax-based pattern matching and missed three cases

C Baseline Comparison on Real-world V2 Dataset

We conducted the baseline comparison with three relatedworks RH Scanner oo7 and SpecFuzz on Real-world V2Dataset In this experiment we focused on detecting theinserted gadgets therefore we only consider inserted gadgetsto be true positives and all other detection results to befalse positives For tools that utilize taint tracking to detectSpectre gadget (oo7 and SpecTaint) we mark input bytes astaint sources Since the analysis of oo7 is very slow whenperforming whole input bytes tainting we only mark inputbytes that influence injected gadgets as taint sources We adoptthe same configuration for SpecTaint To compare with RHScanner and SpecFuzz we have another configuration forSpecTaint where we mark all input bytes as tainted Theresults for the former configuration are labeled with ldquordquo inTable I For dynamic analysis tools (SpecTaint and SpecFuzz)we used an external fuzzing tool [8] to fuzz the six programsfor 10 hours and fed the generated seeds as inputs to run theprograms

1 2 i f ( p a r s e rminusgtt o k n e x t gt= num tokens ) 3 I n s e r t e d S p e c t r e Gadget4 i f d e f SPECTRE VARIANT5 i f ( g l o b a l i d x lt a r r a y 1 s i z e )6 tmp amp= a r r a y 2 [ a r r a y 1 [ g l o b a l i d x ] lowast 5 1 2 ] 7 8 e n d i f9

Listing 2 Missed Spectre gadget 1 by SpecFuzz

1 2 i f ( lminusgtf i r s t + i d x lt lminusgtmax size ) 3 r e t u r n ( vo id lowast) lminusgte l e m e n t s [ lminusgtf i r s t + i d x ] 4 e l s e 5 I n s e r t e d S p e c t r e Gadget6 i f d e f SPECTRE VARIANT7 i n t temp = 0 8 i n t lowastadd r = ampg l o b a l i d x 9 i f (lowast add r lt a r r a y 1 s i z e )

10 temp amp= a r r a y 2 [ a r r a y 1 [lowast add r ] lowast 5 1 2 ] 11 12 e n d i f13

Listing 3 Missed Spectre gadget 2 by SpecFuzz

Table I shows that when tainting gadget-related inputbytes SpecTaint has no false positives and achieves a pre-cision rate of 100 Under the same configuration howeveroo7 still produced false positives For instance it reported13 gadget candidates in LibHTP but 12 of them are falsepositives The results show that static taint analysis suffersfrom the over-tainting issue thereby is hard to achieve highprecision in detecting Spectre gadgets As presented in Table Ioo7 has many false negatives For example it missed allinserted gadgets in YAML and Brotli We examined these falsenegatives and found the reasons are as follows Firstly thedetection results of oo7 [43] depend on the completeness of thecontrol-flow graph (CFG) extraction Some inserted gadgetsare missed since it failed to extract a complete CFG dueto the limitation of the static approach Also oo7 is limitedby static inter-procedure taint tracking and it missed manyinserted gadgets because it failed to propagate the taint sourceto the injection points in the target programs The evaluationresults substantiate our claim that dynamic taint analysis ismuch more accurate and effective than static taint analysis indetecting Spectre gadgets

As presented in Table I SpecTaint outperforms RHScanner and SpecFuzz under the whole input bytes taintingconfiguration Since we only consider injected gadgets to betrue positives in this dataset other detection results are labeledas false positives The analysis results of other gadgets reportedby SpecTaint are presented in Table IV Note that Spec-Taint missed some injected gadgets in this experiment Wefurther investigated the results and found that missed gadgetsare not covered by input seeds However SpecFuzz missedmany injected gadgets that are covered by input seeds anddetected by SpecTaint According to our analysis the reasonsare first SpecFuzz adopts a prioritized simulation of branchmispredictions it selectively chooses whether to simulate themisprediction or not on a conditional branch Therefore itmissed some injected gadgets For example as presented inListing 2 the seeds are able to reach the branch at line 2 butSpecFuzz did not simulate branch misprediction over these

8

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 4: SpecTaint: Speculative Taint Analysis for Discovering

Fig 1 Architecture of SpecTaint

Fig 2 Workflow of SpecTaint

when encountering a conditional branch and backup the cur-rent execution state When the speculative execution window(SEW) is reached SpecTaint will roll back the executionstate to the previously saved checkpoint Like the rollback ofspeculative execution in the hardware our simulated rollbackalso squashes all the side effects produced during speculativeexecution At the speculative execution stage a Spectre gadgetdetector conducts pattern checking on each speculative execu-tion path to detect gadgets To mitigate Spectre attacks thereported gadgets are forwarded to automatic serialization tools(eg Speculative Load Hardening [18])

Threat Model and Scope We share the same threat modelwith other Spectre gadget detection tools [39] [43] That isour analyzed programs are benign but might be vulnerableWe do not deal with malicious code that might deliberatelythwart or escape our analysis Our approach can be used toanalyze the exploit code for Spectre gadgets in a malwaresample but it is not our focus in this paper In this workwe focus on detecting gadgets in victim programs that canbe exploited by Spectre V1 attacks and leak sensitive datathrough cache side channels Meltdown-type attacks do notrequire gadgets in victim programs and can be launched inmalicious programs Therefore they fall out of the scope ofthis work Our simulation focuses on Spectre V1 gadgetsdetection at the binary level We do not simulate micro-operations instruction execution including instruction fetchingdecoding etc and irrelevant hardware structures such as thereorder buffer (ROB) We only use the size of ROB to calculatethe speculative execution window (SEW)

III DYNAMIC SPECULATIVE EXECUTION SIMULATION

In this section we introduce our speculative execution sim-ulation platform for Spectre gadgets detection Our platformextends the binary analysis platform DECAF [19] which isbuilt on top of the system emulator QEMU [14] Specificallywe will showcase how we utilize the system emulator to sim-ulate speculative execution triggered by branch mispredictionsand how we explore speculative paths in a depth-first manner

A Misprediction Simulation

Spectre V1 exploits out-of-bounds memory accesses inspeculative execution triggered by one or more mispredic-tions of conditional branches SpecTaint extends a systememulator to simulate this behavior In the system emulatorpc stores the next instruction to be executed in the caseof a conditional jump it stores the jump target address Tosimulate the misprediction behavior of processors SpecTaintinverts the direction of the conditional jump by replacingthe pc with the untaken target address As a result it willexecute the untaken path first at a conditional branch andenter the simulated speculative execution Before it goes tothe untaken path a state checkpoint is saved which containsthe current state of the emulator eg CPU registers Duringthe simulated speculative execution any modification to thememory will be logged Specifically the original value will besaved before a speculatively executed instruction modifies itWhen the speculative execution window is reached the simu-lated speculative execution will be terminated and the controlflow transfers back to the last saved checkpoint To maintainthe correctness of execution the memory modifications willbe restored by writing the original values back to memoryand the CPU state will be restored with the saved state Asfor other conditions to terminate speculative execution weconsider serialization instructions (eg SYSCALL LFENCEetc) listed by Mambretti et al [35] and terminate the simulatedspeculative execution when encountering these instructions

Exception Handling Exceptions can be very common duringthe simulation of speculative executions This is because aCPU will execute a path which is not expected to executeA straightforward way for simulation is to terminate thesimulation and roll back whenever an exception is triggered asdone in SpecFuzz [39] However it could miss Spectre gadgetsafter the exception point along the speculative path sincespeculative execution is not supposed to be terminated whenhardware exceptions are raised in many CPU models [34] Thespeculative execution is expected to continue even if an excep-tion occurs To guarantee a precise simulation our approachdoes not roll back when an exception occurs and instead forcesit to silently simulate this exception instruction without raisingany exception and continue speculative execution on the nextinstruction

To simulate this behavior we extend the exception handlerof the system emulator More specifically we use a customizedexception handler to capture errors in speculative executionWhen an exception due to the permission violation (eg ac-cessing kernel space from user-land) occurs during a simulatedspeculative execution we will silently simulate this exceptioninstruction without raising any exception and continue theuser-land execution after the exception instruction By doing

4

so SpecTaint is able to continue the simulation even whenexceptions occur For other exceptions which SpecTaint isnot able to handle (eg caused by invalid jump target orincomplete address translation) because the emulator does notknow where the next instruction to be executed is SpecTaintwill make a state rollback and restore to the previously savedcheckpoint

B Exploration of Speculative Execution Paths

The path exploration involves two path types the normalexecution (NE) path and speculative execution (SE) path Toexplore the NE paths we feed seeds generated by fuzzingtools into the target program When encountering a conditionalbranching instruction SpecTaint redirects the execution tothe untaken branch first to explore SE paths Crucially CPUscan perform nested speculative execution in a SE path Sincewe center on Spectre V1 which exploits the misprediction ofconditional branches we take into account nested speculativeexecutions triggered by conditional branches To simulate thisbehavior SpecTaint explores SE paths for each conditionalbranch in NE paths and SE paths within SEW In other wordsSpecTaint explores the speculative paths in a depth-firstmanner During the exploration SpecTaint monitors whetherany termination condition is met and restore to the last savedstate if there is any

Execution State Rollback It is crucial to restore the executionstate after the speculative execution is terminated so as tomaintain the correctness of program execution An executionstate includes the state of all the CPU registers and used mem-ory regions The execution state rollback has two operationsstate backup and resume For a conditional branch SpecTaintfirst backups the current execution state before simulatesspeculative execution on that branch When the SE simulationis terminated SpecTaint resumes the backup state by resettingthe current execution environment with the backup state Tobackup an execution state SpecTaint saves the emulator stateeg all registers For memory it is too heavyweight to savethe entire memory so we adopt a lightweight ldquocopy-on-writerdquoapproach More specifically SpecTaint keeps track of memoryregions that are modified during speculative execution Beforethe memory regions are modified SpecTaint saves the originalvalues When making a state rollback SpecTaint writes theoriginal values back to the memory in a reversed order asthey are modified Moreover SpecTaint uses dynamic tainttracking to detect Spectre gadgets Thus it also restorestaint information that is created during simulated speculativeexecution More details are discussed in Section IV-A

Path Exploration The path exploration considers two typesof path coverage the NE and SE path coverage The pathcoverage on normal execution is used to explore more pathsduring the normal execution We utilize fuzzing techniquessuch as AFL [2] to improve path coverage on normal execu-tion The SE path coverage is to measure how many speculativepaths are covered The goal is to explore speculative executioncomprehensively so as to avoid missing Spectre gadgets onuncovered speculative paths

The switch point is the instruction that transfers the exe-cution mode from NE to SE SpecTaint treats a conditionalbranch as a switch point and conducts the SE simulation Once

Algorithm 1 Speculative Path Exploration AlgorithmInput Entry point pc SEW size w Backup state

set saved stateOutput gadgets gadget setgadget setlarr empty inst countlarr 0Function explorer(pc w)

while inst count lt w doif is terminator(pc) then

state = saved statepop()restore(state)pc = statepcexplorer(pc stateinsn count)

endif is branch(pc) then

saved statepush(checkpoint(pc))foreach t isin get targets(pc) do

explorer(t w minus insn count)end

endexecute(pc)if gadget checker(pc) then

gadget setlarr pcendpclarr next pcinst count ++

endreturn

entering the speculative execution mode SpecTaint exploreseach conditional branch and its targets for the speculativeexploration The speculative path exploration algorithm isshown in Algorithm 1 We use pc to represent the currentinstruction and gadget set to store the locations of detectedSpectre gadgets The speculative execution path explorationis a depth-first traversal on the control flow graph of theprogram at the speculative execution mode When exploringspeculative paths SpecTaint will first check whether thecurrent execution path has reached the SEW limit w or anyother termination conditions and if so restore to the last savedstate If the current instruction at pc is a conditional branchSpecTaint will simulate a misprediction by walking throughboth targets of this branch instruction Before exploring eachtarget SpecTaint will first backup the state and push it into astack saved state After finishing the path exploration on atarget (eg reaching the SEW limit) SpecTaint will resumethe last saved state and continue for the next target Alongwith the exploration SpecTaint conducts the gadget patternchecking on the current instruction If it matches SpecTaintwill save this gadget into gadget set and continue explorationuntil termination conditions are met

Mitigating Path Explosion As presented in our evaluationthe length of each speculative execution path is bounded by theSEW limit (see TableV) However there can be a large numberof paths within this SEW limit in the worst case Thereforewe may still encounter a path explosion problem Accordingto the analysis of our evaluation dataset we identify two kindsof cases where the exponential growth of paths may happenloops and recursive functions We propose our approach to

5

address the path explosion issue That is SpecTaint has athreshold on how many times simulation can happen on thesame branch In real hardware it is unlikely that speculativeexecution can be triggered by the same branch repeatedlyAfter iterations the same branch will not be mispredictedsince its memory value are very likely to be stored in registersor be cached after iterations [35] As demonstrated by Kocheret al [29] executing the same branch five times is enough totrain the branch predictor Therefore we also set the thresholdto five and SpecTaint will not simulate speculative executionrepeatedly on the same branch if it reaches the thresholdWe admit that this approach might miss gadgets in paths wehave not explored yet More details about false negatives arediscussed in Section VII

IV SPECTRE GADGET DETECTION

This section addresses how we detect Spectre gadgetsduring dynamic speculative execution simulation More specif-ically we first formalize the Spectre gadget definitions and thendescribe how we check the patterns to detect potential Spectregadgets on speculative paths

A Dynamic Taint Tracking

If a variable is data-dependent on user inputs we con-sider this variable is under an attackerrsquos control To findexploitable Spectre gadgets it is essential to find gadgetsthat can be controlled through external inputs Therefore inorder to capture this control relation we utilize dynamic taintanalysis to trace variables that are data-dependent on externalinputs during execution To this end we label user inputsas taint sources and observe how the data flows from userinputs Specifically we utilize the whole-system dynamic taintanalysis shipped with the platform we extend DECAF [19]and perform taint propagation along with the execution of theprogram DECAFrsquos tainting rules have been formally verifiedto be sound (guarantee of no under-tainting at instructionlevel) and most of them have also been verified to be precise(guarantee of no over-tainting) The details are documented inthis paper [24] We conduct dynamic taint analysis on bothNE and SE paths Performing taint analysis on NE paths isto ensure the propagation of taint labels in normal programexecution performing taint analysis on SE paths is to facilitatethe gadget checker to detect exploitable Spectre gadgets Whenthe SE path is terminated (eg when reaching SEW size) theCPU state and memory modifications will be restored so as thetaint information When restoring to the previously saved statewe also clean the variables which are marked as tainted duringthe simulated speculative execution in order to maintain thecorrectness of taint propagation

B Spectre Gadget Modeling

In this section we formulate the gadget patterns for twotypes of Spectre V1 gadgets Bounds Check Bypass (BCB)and Bounds Check Bypass Store (BCBS) To facilitate thediscussion we define the following notions before giving theSpectre gadget definitions

bull c is a conditional branch instructionbull T (c) denotes a set of instructions in a speculative execu-

tion trace from c

bull m(i) denotes i is a memory read instructionbull str(i) denotes i is a memory write instructionbull [i ] denotes the memory value accessed by instruction i bull dep(i j ) denotes instruction i is data dependent on j bull t(i) denotes the operands of instruction i is taintedbull δ denotes the size of speculative execution window

BCB Gadget In BCB attacks the speculative load instructionis under an attackerrsquos control thus the attacker could readarbitrary values from memory Then another load instructionis required to load the secret-indexed memory location withthe intention of leaking the secret To leak the secret throughthe cache side channel the leak instruction has to use theloaded secret as the index to read from memory thus thesecret can be retrieved by monitoring the cache line statechanges In general the BCB gadget involves a set of arrayoperations and the index in the latter array operation is data-dependent on the value from the former array [34] Essentiallythe first array access is responsible for loading secrets and thesecond one is responsible for leaking secrets However notall the code sequences matching such patterns are consideredas a BCB gadget The BCB gadget demands the index of theformer array access should be under the attackerrsquos control suchthat the attacker could read arbitrary values from memory inthe target program space by carefully manipulating the inputvalues To capture this control relation we utilize dynamictaint analysis to track data flow from external inputs whichis discussed in IV-A Suppose the speculative execution startsfrom a conditional branch c and we formalize our BCB gadgetpattern as Φbcb(c) and its definition is as follows

Φbcb(c) = existi j isin T (c)m(i) andm(j ) and dep(j [i ])

andt(i) and |c j | lt δ(1)

BCBS Gadget Unlike BCB gadget BCBS uses a speculativewrite (SW) to modify arbitrary memory locations In BCBSattacks attackers control the index of an array that wouldaccess out of boundary memory during speculative executionAs a result attackers can modify an arbitrary memory location(eg return address) by manipulating the index of the arrayWe formalize our BCBS gadget pattern as Φbcbs(c) and itsdefinition is as follows

Φbcbs(c)= existi isin T (c) str(i) and t(i) and |c i | lt δ (2)

In both patterns we leverage dynamic taint analysis todetermine whether the index of an array is under attackersrsquocontrol or not If an instruction i is tainted we consider i isunder attackersrsquo control and t(i) is set to be true

Gadget Classification Our gadget patterns are similar towhat is proposed in oo7 [43] The primary difference liesin whether the branch instruction is tainted or not Oo7 [43]considers that the branch should be tainted and controlled bythe inputs Thus attackers can poison the branch predictor tocause intentional misprediction of that branch by executing thevictim program with carefully-constructed inputs However asdiscovered by Canella et al [17] the branch prediction buffers

6

are shared and commonly indexed by the virtual address of thebranch instruction thus can be poisoned from another attack-controlled process by executing a congruent branch with thesame virtual address Based on this finding Spectre gadgetscan be exploited in the same address space (ie intra-process)or across address spaces (ie cross-process) [17] Thereforeonly considering tainted branches will exclude cross-processgadgets Our gadget patterns cover both situations if thespeculative execution is triggered by a taint conditional branchwe will mark this gadget as an intra-process Spectre gadgetotherwise this gadget will be marked as a cross-processSpectre gadget

C Gadget Detection

The gadget detection is to check whether the speculativeinstruction trace conforms to the gadget patterns describedabove We deploy the gadget checker in simulated speculativeexecution Although both our gadget patterns involve memoryaccess operations it is inefficient and unnecessary to checkall the memory access instructions on SE paths Instead weonly check memory access instructions which are tainted Tothis end we instrument memory read and write operationsTo detect BCB gadgets for each memory read we checkwhether its source is tainted To make sure the dependencydep(j [i ]) rule is satisfied between two instructions we alsotrack whether the tainted operand of one instruction j ispropagated from instruction i For BCBS gadgets the patterncaptures any tainted memory write whose destination addressis marked as tainted within the SEW

V IMPLEMENTATION

We have implemented the prototype SpecTaint in C Morespecifically we wrote a C plugin of 1 KLOC in C codeon top of DECAF [25] (a dynamic binary analysis platformbuilt on top of QEMU 10) This plugin implements thestate checkpoint management and Spectre gadget detectioncomponents We reused the dynamic taint analysis plugin ofDECAF for our taint analysis Overall the changes to developour prototype do not exceed 2 KLOC Besides to increase thecode coverage we use AFL 252b [2] and honggfuzz [8] togenerate seed inputs

VI EXPERIMENTAL EVALUATION

In this section we evaluate SpecTaint to answer thefollowing research questions

1) How effective is SpecTaint to find Spectre gadgets com-pared with other existing tools

2) How efficient is SpecTaint to find Spectre gadgets inreal-world applications

This section is composed as follows First we brieflydescribe the experiment setup the datasets and the evaluationmetrics used in our experiments (Sections VI-A) Second weevaluate the efficacy of SpecTaint (Sections VI-B) Then weevaluate the efficiency of SpecTaint on real-world applications(Sections VI-E) Finally we conduct case studies of someSpectre gadgets detected by SpecTaint in real-world applica-tions and the deep learning framework Caffe (Sections VI-F)

A Experiment Setup

Baseline Methods We compare SpecTaint with three base-line approaches Spectre 1 Scanner from Red Hat (RH Scan-ner) [5] oo7 [43] and SpecFuzz [39] RH Scanner is a staticanalysis tool that can be used to scan for Spectre gadgetsOo7 [43] is another static analysis tool that utilizes static taintanalysis to find Spectre gadgets SpecFuzz [39] extends thefuzzing technique to detect errors in speculative execution andreport Spectre gadgets Another related work SPECTECTORleverages symbolic execution to detect information-flow differ-ences introduced by speculative execution Since the manualsettings to make it work on large programs are not open-source it is hard to evaluate it on our real-world benchmarksAs discussed in SpecFuzz [39] the authors failed to runSPECTECTOR on the real-world benchmarks due to a largenumber of unsupported instructions Therefore we did notcompare with SPECTECTOR on real-world benchmarks (samewith SpecFuzzrsquos)

Evaluation Dataset The evaluation is conducted on twodatasets the Spectre Samples Dataset and Real-world DatasetAligned with other baseline works we utilize the same SpectreSamples Dataset to demonstrate the detection capability ofSpecTaint For baseline comparison with related works wecollected six real-world applications and created two real-world datasets

bull Spectre Samples Dataset This dataset is designed todemonstrate the efficacy of SpecTaint We collected15 Spectre V1 samples created by Paul Kocher [3] andcompiled 15 samples with the same configuration (gcc-484 with O0) [39]

bull Real-world V1 Dataset For fair baseline comparisonwe use the same dataset as SpecFuzzrsquos [39] This datasetcontains six widely used applications one cryptographicprogram from OpenSSL [12] a compression program(Brotli [7]) and four parsing programs (JSON [10]LibHTP [11] HTTP [9] and YAML [6])

bull Real-world V2 Dataset Systematic baseline evaluationis difficult due to the shortage of ground truth in real-world programs To solve this problem we injectedknown Spectre gadgets into programs from Real-worldV1 Dataset and created the Real-world V2 Dataset Weadopt the same injection approach proposed in LAVA [20]to build this dataset More specifically we utilize dynamictaint analysis to find attack points which can be controlledby input bytes that do not determine control flow andhave not been modified much (see [20] for more details)Then we inject the Spectre gadgets from Spectre SampleDataset into target programs and add code to makeinjected gadgets controllable via input bytes As a resultwe injected 15 Spectre V1 gadgets from Spectre SampleDataset into 52 different locations in six programs Tomake a fair comparison the input seeds used in the evalu-ation are all generated by a fuzzing tool [8] Therefore wehave no clue whether the injected locations are coveredby input seeds in the evaluation It is worth mentioningthat this dataset is used to evaluate the detection coverageof SpecTaint and related works instead of pathcodecoverage and we choose the same seed corpus withSpecFuzzrsquos to guarantee a fair comparison

7

Evaluation Metrics For the Real-world V1 dataset whichdoes not have ground truth we manually verify the detectionresults and calculate the precision rate to quantify the perfor-mance of SpecTaint and baseline approaches The precisionis calculated as precision(P ) = TP

TP+FP where TP is thenumber of detected gadgets that are manually verified to beexploitable and FP is the number of detected gadgets thatare not exploitable based on manual analysis For Real-worldV2 Dataset we only consider injected Spectre gadgets to betrue positives and other reported results to be false positivesThen we calculate the precision and recall to quantify theeffectiveness of the proposed approach and baseline methodsThe recall is calculated as recall(R) = TP

TP+FN where TP isthe number of inserted gadgets which are correctly detectedand FN is the number of inserted gadgets that are missed Theprecision is calculated as precision(P ) = TP

TP+FP where FPis the number of detected gadgets other than injected ones Tomeasure the efficiency we reported runtime per speculativeexecution and number of paths explored per speculative exe-cution along with other statistics (see Section VI-E for moredetails)

Configuration The experiments were conducted on a desktopwith 16 GB memory Intel Core i7 12 cores at 370 GHzCPU and running Linux 415 The Guest OS in QEMU isUbuntu 1404 with 1 GB memory The speculative window isdependent on the space limit ie ROB size and timing limitsWe follow the configuration used by SpecFuzz [39] and alsoset the speculative window size to 250

B Baseline Evaluation on Spectre Samples Dataset

In this experiment we compared SpecTaint with threebaseline tools on the Spectre Sample Dataset As presented inTable II SpecTaint successfully detected all Spectre gadgetsin the Spectre Samples Dataset while RH Scanner relies onsyntax-based pattern matching and missed three cases

C Baseline Comparison on Real-world V2 Dataset

We conducted the baseline comparison with three relatedworks RH Scanner oo7 and SpecFuzz on Real-world V2Dataset In this experiment we focused on detecting theinserted gadgets therefore we only consider inserted gadgetsto be true positives and all other detection results to befalse positives For tools that utilize taint tracking to detectSpectre gadget (oo7 and SpecTaint) we mark input bytes astaint sources Since the analysis of oo7 is very slow whenperforming whole input bytes tainting we only mark inputbytes that influence injected gadgets as taint sources We adoptthe same configuration for SpecTaint To compare with RHScanner and SpecFuzz we have another configuration forSpecTaint where we mark all input bytes as tainted Theresults for the former configuration are labeled with ldquordquo inTable I For dynamic analysis tools (SpecTaint and SpecFuzz)we used an external fuzzing tool [8] to fuzz the six programsfor 10 hours and fed the generated seeds as inputs to run theprograms

1 2 i f ( p a r s e rminusgtt o k n e x t gt= num tokens ) 3 I n s e r t e d S p e c t r e Gadget4 i f d e f SPECTRE VARIANT5 i f ( g l o b a l i d x lt a r r a y 1 s i z e )6 tmp amp= a r r a y 2 [ a r r a y 1 [ g l o b a l i d x ] lowast 5 1 2 ] 7 8 e n d i f9

Listing 2 Missed Spectre gadget 1 by SpecFuzz

1 2 i f ( lminusgtf i r s t + i d x lt lminusgtmax size ) 3 r e t u r n ( vo id lowast) lminusgte l e m e n t s [ lminusgtf i r s t + i d x ] 4 e l s e 5 I n s e r t e d S p e c t r e Gadget6 i f d e f SPECTRE VARIANT7 i n t temp = 0 8 i n t lowastadd r = ampg l o b a l i d x 9 i f (lowast add r lt a r r a y 1 s i z e )

10 temp amp= a r r a y 2 [ a r r a y 1 [lowast add r ] lowast 5 1 2 ] 11 12 e n d i f13

Listing 3 Missed Spectre gadget 2 by SpecFuzz

Table I shows that when tainting gadget-related inputbytes SpecTaint has no false positives and achieves a pre-cision rate of 100 Under the same configuration howeveroo7 still produced false positives For instance it reported13 gadget candidates in LibHTP but 12 of them are falsepositives The results show that static taint analysis suffersfrom the over-tainting issue thereby is hard to achieve highprecision in detecting Spectre gadgets As presented in Table Ioo7 has many false negatives For example it missed allinserted gadgets in YAML and Brotli We examined these falsenegatives and found the reasons are as follows Firstly thedetection results of oo7 [43] depend on the completeness of thecontrol-flow graph (CFG) extraction Some inserted gadgetsare missed since it failed to extract a complete CFG dueto the limitation of the static approach Also oo7 is limitedby static inter-procedure taint tracking and it missed manyinserted gadgets because it failed to propagate the taint sourceto the injection points in the target programs The evaluationresults substantiate our claim that dynamic taint analysis ismuch more accurate and effective than static taint analysis indetecting Spectre gadgets

As presented in Table I SpecTaint outperforms RHScanner and SpecFuzz under the whole input bytes taintingconfiguration Since we only consider injected gadgets to betrue positives in this dataset other detection results are labeledas false positives The analysis results of other gadgets reportedby SpecTaint are presented in Table IV Note that Spec-Taint missed some injected gadgets in this experiment Wefurther investigated the results and found that missed gadgetsare not covered by input seeds However SpecFuzz missedmany injected gadgets that are covered by input seeds anddetected by SpecTaint According to our analysis the reasonsare first SpecFuzz adopts a prioritized simulation of branchmispredictions it selectively chooses whether to simulate themisprediction or not on a conditional branch Therefore itmissed some injected gadgets For example as presented inListing 2 the seeds are able to reach the branch at line 2 butSpecFuzz did not simulate branch misprediction over these

8

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 5: SpecTaint: Speculative Taint Analysis for Discovering

so SpecTaint is able to continue the simulation even whenexceptions occur For other exceptions which SpecTaint isnot able to handle (eg caused by invalid jump target orincomplete address translation) because the emulator does notknow where the next instruction to be executed is SpecTaintwill make a state rollback and restore to the previously savedcheckpoint

B Exploration of Speculative Execution Paths

The path exploration involves two path types the normalexecution (NE) path and speculative execution (SE) path Toexplore the NE paths we feed seeds generated by fuzzingtools into the target program When encountering a conditionalbranching instruction SpecTaint redirects the execution tothe untaken branch first to explore SE paths Crucially CPUscan perform nested speculative execution in a SE path Sincewe center on Spectre V1 which exploits the misprediction ofconditional branches we take into account nested speculativeexecutions triggered by conditional branches To simulate thisbehavior SpecTaint explores SE paths for each conditionalbranch in NE paths and SE paths within SEW In other wordsSpecTaint explores the speculative paths in a depth-firstmanner During the exploration SpecTaint monitors whetherany termination condition is met and restore to the last savedstate if there is any

Execution State Rollback It is crucial to restore the executionstate after the speculative execution is terminated so as tomaintain the correctness of program execution An executionstate includes the state of all the CPU registers and used mem-ory regions The execution state rollback has two operationsstate backup and resume For a conditional branch SpecTaintfirst backups the current execution state before simulatesspeculative execution on that branch When the SE simulationis terminated SpecTaint resumes the backup state by resettingthe current execution environment with the backup state Tobackup an execution state SpecTaint saves the emulator stateeg all registers For memory it is too heavyweight to savethe entire memory so we adopt a lightweight ldquocopy-on-writerdquoapproach More specifically SpecTaint keeps track of memoryregions that are modified during speculative execution Beforethe memory regions are modified SpecTaint saves the originalvalues When making a state rollback SpecTaint writes theoriginal values back to the memory in a reversed order asthey are modified Moreover SpecTaint uses dynamic tainttracking to detect Spectre gadgets Thus it also restorestaint information that is created during simulated speculativeexecution More details are discussed in Section IV-A

Path Exploration The path exploration considers two typesof path coverage the NE and SE path coverage The pathcoverage on normal execution is used to explore more pathsduring the normal execution We utilize fuzzing techniquessuch as AFL [2] to improve path coverage on normal execu-tion The SE path coverage is to measure how many speculativepaths are covered The goal is to explore speculative executioncomprehensively so as to avoid missing Spectre gadgets onuncovered speculative paths

The switch point is the instruction that transfers the exe-cution mode from NE to SE SpecTaint treats a conditionalbranch as a switch point and conducts the SE simulation Once

Algorithm 1 Speculative Path Exploration AlgorithmInput Entry point pc SEW size w Backup state

set saved stateOutput gadgets gadget setgadget setlarr empty inst countlarr 0Function explorer(pc w)

while inst count lt w doif is terminator(pc) then

state = saved statepop()restore(state)pc = statepcexplorer(pc stateinsn count)

endif is branch(pc) then

saved statepush(checkpoint(pc))foreach t isin get targets(pc) do

explorer(t w minus insn count)end

endexecute(pc)if gadget checker(pc) then

gadget setlarr pcendpclarr next pcinst count ++

endreturn

entering the speculative execution mode SpecTaint exploreseach conditional branch and its targets for the speculativeexploration The speculative path exploration algorithm isshown in Algorithm 1 We use pc to represent the currentinstruction and gadget set to store the locations of detectedSpectre gadgets The speculative execution path explorationis a depth-first traversal on the control flow graph of theprogram at the speculative execution mode When exploringspeculative paths SpecTaint will first check whether thecurrent execution path has reached the SEW limit w or anyother termination conditions and if so restore to the last savedstate If the current instruction at pc is a conditional branchSpecTaint will simulate a misprediction by walking throughboth targets of this branch instruction Before exploring eachtarget SpecTaint will first backup the state and push it into astack saved state After finishing the path exploration on atarget (eg reaching the SEW limit) SpecTaint will resumethe last saved state and continue for the next target Alongwith the exploration SpecTaint conducts the gadget patternchecking on the current instruction If it matches SpecTaintwill save this gadget into gadget set and continue explorationuntil termination conditions are met

Mitigating Path Explosion As presented in our evaluationthe length of each speculative execution path is bounded by theSEW limit (see TableV) However there can be a large numberof paths within this SEW limit in the worst case Thereforewe may still encounter a path explosion problem Accordingto the analysis of our evaluation dataset we identify two kindsof cases where the exponential growth of paths may happenloops and recursive functions We propose our approach to

5

address the path explosion issue That is SpecTaint has athreshold on how many times simulation can happen on thesame branch In real hardware it is unlikely that speculativeexecution can be triggered by the same branch repeatedlyAfter iterations the same branch will not be mispredictedsince its memory value are very likely to be stored in registersor be cached after iterations [35] As demonstrated by Kocheret al [29] executing the same branch five times is enough totrain the branch predictor Therefore we also set the thresholdto five and SpecTaint will not simulate speculative executionrepeatedly on the same branch if it reaches the thresholdWe admit that this approach might miss gadgets in paths wehave not explored yet More details about false negatives arediscussed in Section VII

IV SPECTRE GADGET DETECTION

This section addresses how we detect Spectre gadgetsduring dynamic speculative execution simulation More specif-ically we first formalize the Spectre gadget definitions and thendescribe how we check the patterns to detect potential Spectregadgets on speculative paths

A Dynamic Taint Tracking

If a variable is data-dependent on user inputs we con-sider this variable is under an attackerrsquos control To findexploitable Spectre gadgets it is essential to find gadgetsthat can be controlled through external inputs Therefore inorder to capture this control relation we utilize dynamic taintanalysis to trace variables that are data-dependent on externalinputs during execution To this end we label user inputsas taint sources and observe how the data flows from userinputs Specifically we utilize the whole-system dynamic taintanalysis shipped with the platform we extend DECAF [19]and perform taint propagation along with the execution of theprogram DECAFrsquos tainting rules have been formally verifiedto be sound (guarantee of no under-tainting at instructionlevel) and most of them have also been verified to be precise(guarantee of no over-tainting) The details are documented inthis paper [24] We conduct dynamic taint analysis on bothNE and SE paths Performing taint analysis on NE paths isto ensure the propagation of taint labels in normal programexecution performing taint analysis on SE paths is to facilitatethe gadget checker to detect exploitable Spectre gadgets Whenthe SE path is terminated (eg when reaching SEW size) theCPU state and memory modifications will be restored so as thetaint information When restoring to the previously saved statewe also clean the variables which are marked as tainted duringthe simulated speculative execution in order to maintain thecorrectness of taint propagation

B Spectre Gadget Modeling

In this section we formulate the gadget patterns for twotypes of Spectre V1 gadgets Bounds Check Bypass (BCB)and Bounds Check Bypass Store (BCBS) To facilitate thediscussion we define the following notions before giving theSpectre gadget definitions

bull c is a conditional branch instructionbull T (c) denotes a set of instructions in a speculative execu-

tion trace from c

bull m(i) denotes i is a memory read instructionbull str(i) denotes i is a memory write instructionbull [i ] denotes the memory value accessed by instruction i bull dep(i j ) denotes instruction i is data dependent on j bull t(i) denotes the operands of instruction i is taintedbull δ denotes the size of speculative execution window

BCB Gadget In BCB attacks the speculative load instructionis under an attackerrsquos control thus the attacker could readarbitrary values from memory Then another load instructionis required to load the secret-indexed memory location withthe intention of leaking the secret To leak the secret throughthe cache side channel the leak instruction has to use theloaded secret as the index to read from memory thus thesecret can be retrieved by monitoring the cache line statechanges In general the BCB gadget involves a set of arrayoperations and the index in the latter array operation is data-dependent on the value from the former array [34] Essentiallythe first array access is responsible for loading secrets and thesecond one is responsible for leaking secrets However notall the code sequences matching such patterns are consideredas a BCB gadget The BCB gadget demands the index of theformer array access should be under the attackerrsquos control suchthat the attacker could read arbitrary values from memory inthe target program space by carefully manipulating the inputvalues To capture this control relation we utilize dynamictaint analysis to track data flow from external inputs whichis discussed in IV-A Suppose the speculative execution startsfrom a conditional branch c and we formalize our BCB gadgetpattern as Φbcb(c) and its definition is as follows

Φbcb(c) = existi j isin T (c)m(i) andm(j ) and dep(j [i ])

andt(i) and |c j | lt δ(1)

BCBS Gadget Unlike BCB gadget BCBS uses a speculativewrite (SW) to modify arbitrary memory locations In BCBSattacks attackers control the index of an array that wouldaccess out of boundary memory during speculative executionAs a result attackers can modify an arbitrary memory location(eg return address) by manipulating the index of the arrayWe formalize our BCBS gadget pattern as Φbcbs(c) and itsdefinition is as follows

Φbcbs(c)= existi isin T (c) str(i) and t(i) and |c i | lt δ (2)

In both patterns we leverage dynamic taint analysis todetermine whether the index of an array is under attackersrsquocontrol or not If an instruction i is tainted we consider i isunder attackersrsquo control and t(i) is set to be true

Gadget Classification Our gadget patterns are similar towhat is proposed in oo7 [43] The primary difference liesin whether the branch instruction is tainted or not Oo7 [43]considers that the branch should be tainted and controlled bythe inputs Thus attackers can poison the branch predictor tocause intentional misprediction of that branch by executing thevictim program with carefully-constructed inputs However asdiscovered by Canella et al [17] the branch prediction buffers

6

are shared and commonly indexed by the virtual address of thebranch instruction thus can be poisoned from another attack-controlled process by executing a congruent branch with thesame virtual address Based on this finding Spectre gadgetscan be exploited in the same address space (ie intra-process)or across address spaces (ie cross-process) [17] Thereforeonly considering tainted branches will exclude cross-processgadgets Our gadget patterns cover both situations if thespeculative execution is triggered by a taint conditional branchwe will mark this gadget as an intra-process Spectre gadgetotherwise this gadget will be marked as a cross-processSpectre gadget

C Gadget Detection

The gadget detection is to check whether the speculativeinstruction trace conforms to the gadget patterns describedabove We deploy the gadget checker in simulated speculativeexecution Although both our gadget patterns involve memoryaccess operations it is inefficient and unnecessary to checkall the memory access instructions on SE paths Instead weonly check memory access instructions which are tainted Tothis end we instrument memory read and write operationsTo detect BCB gadgets for each memory read we checkwhether its source is tainted To make sure the dependencydep(j [i ]) rule is satisfied between two instructions we alsotrack whether the tainted operand of one instruction j ispropagated from instruction i For BCBS gadgets the patterncaptures any tainted memory write whose destination addressis marked as tainted within the SEW

V IMPLEMENTATION

We have implemented the prototype SpecTaint in C Morespecifically we wrote a C plugin of 1 KLOC in C codeon top of DECAF [25] (a dynamic binary analysis platformbuilt on top of QEMU 10) This plugin implements thestate checkpoint management and Spectre gadget detectioncomponents We reused the dynamic taint analysis plugin ofDECAF for our taint analysis Overall the changes to developour prototype do not exceed 2 KLOC Besides to increase thecode coverage we use AFL 252b [2] and honggfuzz [8] togenerate seed inputs

VI EXPERIMENTAL EVALUATION

In this section we evaluate SpecTaint to answer thefollowing research questions

1) How effective is SpecTaint to find Spectre gadgets com-pared with other existing tools

2) How efficient is SpecTaint to find Spectre gadgets inreal-world applications

This section is composed as follows First we brieflydescribe the experiment setup the datasets and the evaluationmetrics used in our experiments (Sections VI-A) Second weevaluate the efficacy of SpecTaint (Sections VI-B) Then weevaluate the efficiency of SpecTaint on real-world applications(Sections VI-E) Finally we conduct case studies of someSpectre gadgets detected by SpecTaint in real-world applica-tions and the deep learning framework Caffe (Sections VI-F)

A Experiment Setup

Baseline Methods We compare SpecTaint with three base-line approaches Spectre 1 Scanner from Red Hat (RH Scan-ner) [5] oo7 [43] and SpecFuzz [39] RH Scanner is a staticanalysis tool that can be used to scan for Spectre gadgetsOo7 [43] is another static analysis tool that utilizes static taintanalysis to find Spectre gadgets SpecFuzz [39] extends thefuzzing technique to detect errors in speculative execution andreport Spectre gadgets Another related work SPECTECTORleverages symbolic execution to detect information-flow differ-ences introduced by speculative execution Since the manualsettings to make it work on large programs are not open-source it is hard to evaluate it on our real-world benchmarksAs discussed in SpecFuzz [39] the authors failed to runSPECTECTOR on the real-world benchmarks due to a largenumber of unsupported instructions Therefore we did notcompare with SPECTECTOR on real-world benchmarks (samewith SpecFuzzrsquos)

Evaluation Dataset The evaluation is conducted on twodatasets the Spectre Samples Dataset and Real-world DatasetAligned with other baseline works we utilize the same SpectreSamples Dataset to demonstrate the detection capability ofSpecTaint For baseline comparison with related works wecollected six real-world applications and created two real-world datasets

bull Spectre Samples Dataset This dataset is designed todemonstrate the efficacy of SpecTaint We collected15 Spectre V1 samples created by Paul Kocher [3] andcompiled 15 samples with the same configuration (gcc-484 with O0) [39]

bull Real-world V1 Dataset For fair baseline comparisonwe use the same dataset as SpecFuzzrsquos [39] This datasetcontains six widely used applications one cryptographicprogram from OpenSSL [12] a compression program(Brotli [7]) and four parsing programs (JSON [10]LibHTP [11] HTTP [9] and YAML [6])

bull Real-world V2 Dataset Systematic baseline evaluationis difficult due to the shortage of ground truth in real-world programs To solve this problem we injectedknown Spectre gadgets into programs from Real-worldV1 Dataset and created the Real-world V2 Dataset Weadopt the same injection approach proposed in LAVA [20]to build this dataset More specifically we utilize dynamictaint analysis to find attack points which can be controlledby input bytes that do not determine control flow andhave not been modified much (see [20] for more details)Then we inject the Spectre gadgets from Spectre SampleDataset into target programs and add code to makeinjected gadgets controllable via input bytes As a resultwe injected 15 Spectre V1 gadgets from Spectre SampleDataset into 52 different locations in six programs Tomake a fair comparison the input seeds used in the evalu-ation are all generated by a fuzzing tool [8] Therefore wehave no clue whether the injected locations are coveredby input seeds in the evaluation It is worth mentioningthat this dataset is used to evaluate the detection coverageof SpecTaint and related works instead of pathcodecoverage and we choose the same seed corpus withSpecFuzzrsquos to guarantee a fair comparison

7

Evaluation Metrics For the Real-world V1 dataset whichdoes not have ground truth we manually verify the detectionresults and calculate the precision rate to quantify the perfor-mance of SpecTaint and baseline approaches The precisionis calculated as precision(P ) = TP

TP+FP where TP is thenumber of detected gadgets that are manually verified to beexploitable and FP is the number of detected gadgets thatare not exploitable based on manual analysis For Real-worldV2 Dataset we only consider injected Spectre gadgets to betrue positives and other reported results to be false positivesThen we calculate the precision and recall to quantify theeffectiveness of the proposed approach and baseline methodsThe recall is calculated as recall(R) = TP

TP+FN where TP isthe number of inserted gadgets which are correctly detectedand FN is the number of inserted gadgets that are missed Theprecision is calculated as precision(P ) = TP

TP+FP where FPis the number of detected gadgets other than injected ones Tomeasure the efficiency we reported runtime per speculativeexecution and number of paths explored per speculative exe-cution along with other statistics (see Section VI-E for moredetails)

Configuration The experiments were conducted on a desktopwith 16 GB memory Intel Core i7 12 cores at 370 GHzCPU and running Linux 415 The Guest OS in QEMU isUbuntu 1404 with 1 GB memory The speculative window isdependent on the space limit ie ROB size and timing limitsWe follow the configuration used by SpecFuzz [39] and alsoset the speculative window size to 250

B Baseline Evaluation on Spectre Samples Dataset

In this experiment we compared SpecTaint with threebaseline tools on the Spectre Sample Dataset As presented inTable II SpecTaint successfully detected all Spectre gadgetsin the Spectre Samples Dataset while RH Scanner relies onsyntax-based pattern matching and missed three cases

C Baseline Comparison on Real-world V2 Dataset

We conducted the baseline comparison with three relatedworks RH Scanner oo7 and SpecFuzz on Real-world V2Dataset In this experiment we focused on detecting theinserted gadgets therefore we only consider inserted gadgetsto be true positives and all other detection results to befalse positives For tools that utilize taint tracking to detectSpectre gadget (oo7 and SpecTaint) we mark input bytes astaint sources Since the analysis of oo7 is very slow whenperforming whole input bytes tainting we only mark inputbytes that influence injected gadgets as taint sources We adoptthe same configuration for SpecTaint To compare with RHScanner and SpecFuzz we have another configuration forSpecTaint where we mark all input bytes as tainted Theresults for the former configuration are labeled with ldquordquo inTable I For dynamic analysis tools (SpecTaint and SpecFuzz)we used an external fuzzing tool [8] to fuzz the six programsfor 10 hours and fed the generated seeds as inputs to run theprograms

1 2 i f ( p a r s e rminusgtt o k n e x t gt= num tokens ) 3 I n s e r t e d S p e c t r e Gadget4 i f d e f SPECTRE VARIANT5 i f ( g l o b a l i d x lt a r r a y 1 s i z e )6 tmp amp= a r r a y 2 [ a r r a y 1 [ g l o b a l i d x ] lowast 5 1 2 ] 7 8 e n d i f9

Listing 2 Missed Spectre gadget 1 by SpecFuzz

1 2 i f ( lminusgtf i r s t + i d x lt lminusgtmax size ) 3 r e t u r n ( vo id lowast) lminusgte l e m e n t s [ lminusgtf i r s t + i d x ] 4 e l s e 5 I n s e r t e d S p e c t r e Gadget6 i f d e f SPECTRE VARIANT7 i n t temp = 0 8 i n t lowastadd r = ampg l o b a l i d x 9 i f (lowast add r lt a r r a y 1 s i z e )

10 temp amp= a r r a y 2 [ a r r a y 1 [lowast add r ] lowast 5 1 2 ] 11 12 e n d i f13

Listing 3 Missed Spectre gadget 2 by SpecFuzz

Table I shows that when tainting gadget-related inputbytes SpecTaint has no false positives and achieves a pre-cision rate of 100 Under the same configuration howeveroo7 still produced false positives For instance it reported13 gadget candidates in LibHTP but 12 of them are falsepositives The results show that static taint analysis suffersfrom the over-tainting issue thereby is hard to achieve highprecision in detecting Spectre gadgets As presented in Table Ioo7 has many false negatives For example it missed allinserted gadgets in YAML and Brotli We examined these falsenegatives and found the reasons are as follows Firstly thedetection results of oo7 [43] depend on the completeness of thecontrol-flow graph (CFG) extraction Some inserted gadgetsare missed since it failed to extract a complete CFG dueto the limitation of the static approach Also oo7 is limitedby static inter-procedure taint tracking and it missed manyinserted gadgets because it failed to propagate the taint sourceto the injection points in the target programs The evaluationresults substantiate our claim that dynamic taint analysis ismuch more accurate and effective than static taint analysis indetecting Spectre gadgets

As presented in Table I SpecTaint outperforms RHScanner and SpecFuzz under the whole input bytes taintingconfiguration Since we only consider injected gadgets to betrue positives in this dataset other detection results are labeledas false positives The analysis results of other gadgets reportedby SpecTaint are presented in Table IV Note that Spec-Taint missed some injected gadgets in this experiment Wefurther investigated the results and found that missed gadgetsare not covered by input seeds However SpecFuzz missedmany injected gadgets that are covered by input seeds anddetected by SpecTaint According to our analysis the reasonsare first SpecFuzz adopts a prioritized simulation of branchmispredictions it selectively chooses whether to simulate themisprediction or not on a conditional branch Therefore itmissed some injected gadgets For example as presented inListing 2 the seeds are able to reach the branch at line 2 butSpecFuzz did not simulate branch misprediction over these

8

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 6: SpecTaint: Speculative Taint Analysis for Discovering

address the path explosion issue That is SpecTaint has athreshold on how many times simulation can happen on thesame branch In real hardware it is unlikely that speculativeexecution can be triggered by the same branch repeatedlyAfter iterations the same branch will not be mispredictedsince its memory value are very likely to be stored in registersor be cached after iterations [35] As demonstrated by Kocheret al [29] executing the same branch five times is enough totrain the branch predictor Therefore we also set the thresholdto five and SpecTaint will not simulate speculative executionrepeatedly on the same branch if it reaches the thresholdWe admit that this approach might miss gadgets in paths wehave not explored yet More details about false negatives arediscussed in Section VII

IV SPECTRE GADGET DETECTION

This section addresses how we detect Spectre gadgetsduring dynamic speculative execution simulation More specif-ically we first formalize the Spectre gadget definitions and thendescribe how we check the patterns to detect potential Spectregadgets on speculative paths

A Dynamic Taint Tracking

If a variable is data-dependent on user inputs we con-sider this variable is under an attackerrsquos control To findexploitable Spectre gadgets it is essential to find gadgetsthat can be controlled through external inputs Therefore inorder to capture this control relation we utilize dynamic taintanalysis to trace variables that are data-dependent on externalinputs during execution To this end we label user inputsas taint sources and observe how the data flows from userinputs Specifically we utilize the whole-system dynamic taintanalysis shipped with the platform we extend DECAF [19]and perform taint propagation along with the execution of theprogram DECAFrsquos tainting rules have been formally verifiedto be sound (guarantee of no under-tainting at instructionlevel) and most of them have also been verified to be precise(guarantee of no over-tainting) The details are documented inthis paper [24] We conduct dynamic taint analysis on bothNE and SE paths Performing taint analysis on NE paths isto ensure the propagation of taint labels in normal programexecution performing taint analysis on SE paths is to facilitatethe gadget checker to detect exploitable Spectre gadgets Whenthe SE path is terminated (eg when reaching SEW size) theCPU state and memory modifications will be restored so as thetaint information When restoring to the previously saved statewe also clean the variables which are marked as tainted duringthe simulated speculative execution in order to maintain thecorrectness of taint propagation

B Spectre Gadget Modeling

In this section we formulate the gadget patterns for twotypes of Spectre V1 gadgets Bounds Check Bypass (BCB)and Bounds Check Bypass Store (BCBS) To facilitate thediscussion we define the following notions before giving theSpectre gadget definitions

bull c is a conditional branch instructionbull T (c) denotes a set of instructions in a speculative execu-

tion trace from c

bull m(i) denotes i is a memory read instructionbull str(i) denotes i is a memory write instructionbull [i ] denotes the memory value accessed by instruction i bull dep(i j ) denotes instruction i is data dependent on j bull t(i) denotes the operands of instruction i is taintedbull δ denotes the size of speculative execution window

BCB Gadget In BCB attacks the speculative load instructionis under an attackerrsquos control thus the attacker could readarbitrary values from memory Then another load instructionis required to load the secret-indexed memory location withthe intention of leaking the secret To leak the secret throughthe cache side channel the leak instruction has to use theloaded secret as the index to read from memory thus thesecret can be retrieved by monitoring the cache line statechanges In general the BCB gadget involves a set of arrayoperations and the index in the latter array operation is data-dependent on the value from the former array [34] Essentiallythe first array access is responsible for loading secrets and thesecond one is responsible for leaking secrets However notall the code sequences matching such patterns are consideredas a BCB gadget The BCB gadget demands the index of theformer array access should be under the attackerrsquos control suchthat the attacker could read arbitrary values from memory inthe target program space by carefully manipulating the inputvalues To capture this control relation we utilize dynamictaint analysis to track data flow from external inputs whichis discussed in IV-A Suppose the speculative execution startsfrom a conditional branch c and we formalize our BCB gadgetpattern as Φbcb(c) and its definition is as follows

Φbcb(c) = existi j isin T (c)m(i) andm(j ) and dep(j [i ])

andt(i) and |c j | lt δ(1)

BCBS Gadget Unlike BCB gadget BCBS uses a speculativewrite (SW) to modify arbitrary memory locations In BCBSattacks attackers control the index of an array that wouldaccess out of boundary memory during speculative executionAs a result attackers can modify an arbitrary memory location(eg return address) by manipulating the index of the arrayWe formalize our BCBS gadget pattern as Φbcbs(c) and itsdefinition is as follows

Φbcbs(c)= existi isin T (c) str(i) and t(i) and |c i | lt δ (2)

In both patterns we leverage dynamic taint analysis todetermine whether the index of an array is under attackersrsquocontrol or not If an instruction i is tainted we consider i isunder attackersrsquo control and t(i) is set to be true

Gadget Classification Our gadget patterns are similar towhat is proposed in oo7 [43] The primary difference liesin whether the branch instruction is tainted or not Oo7 [43]considers that the branch should be tainted and controlled bythe inputs Thus attackers can poison the branch predictor tocause intentional misprediction of that branch by executing thevictim program with carefully-constructed inputs However asdiscovered by Canella et al [17] the branch prediction buffers

6

are shared and commonly indexed by the virtual address of thebranch instruction thus can be poisoned from another attack-controlled process by executing a congruent branch with thesame virtual address Based on this finding Spectre gadgetscan be exploited in the same address space (ie intra-process)or across address spaces (ie cross-process) [17] Thereforeonly considering tainted branches will exclude cross-processgadgets Our gadget patterns cover both situations if thespeculative execution is triggered by a taint conditional branchwe will mark this gadget as an intra-process Spectre gadgetotherwise this gadget will be marked as a cross-processSpectre gadget

C Gadget Detection

The gadget detection is to check whether the speculativeinstruction trace conforms to the gadget patterns describedabove We deploy the gadget checker in simulated speculativeexecution Although both our gadget patterns involve memoryaccess operations it is inefficient and unnecessary to checkall the memory access instructions on SE paths Instead weonly check memory access instructions which are tainted Tothis end we instrument memory read and write operationsTo detect BCB gadgets for each memory read we checkwhether its source is tainted To make sure the dependencydep(j [i ]) rule is satisfied between two instructions we alsotrack whether the tainted operand of one instruction j ispropagated from instruction i For BCBS gadgets the patterncaptures any tainted memory write whose destination addressis marked as tainted within the SEW

V IMPLEMENTATION

We have implemented the prototype SpecTaint in C Morespecifically we wrote a C plugin of 1 KLOC in C codeon top of DECAF [25] (a dynamic binary analysis platformbuilt on top of QEMU 10) This plugin implements thestate checkpoint management and Spectre gadget detectioncomponents We reused the dynamic taint analysis plugin ofDECAF for our taint analysis Overall the changes to developour prototype do not exceed 2 KLOC Besides to increase thecode coverage we use AFL 252b [2] and honggfuzz [8] togenerate seed inputs

VI EXPERIMENTAL EVALUATION

In this section we evaluate SpecTaint to answer thefollowing research questions

1) How effective is SpecTaint to find Spectre gadgets com-pared with other existing tools

2) How efficient is SpecTaint to find Spectre gadgets inreal-world applications

This section is composed as follows First we brieflydescribe the experiment setup the datasets and the evaluationmetrics used in our experiments (Sections VI-A) Second weevaluate the efficacy of SpecTaint (Sections VI-B) Then weevaluate the efficiency of SpecTaint on real-world applications(Sections VI-E) Finally we conduct case studies of someSpectre gadgets detected by SpecTaint in real-world applica-tions and the deep learning framework Caffe (Sections VI-F)

A Experiment Setup

Baseline Methods We compare SpecTaint with three base-line approaches Spectre 1 Scanner from Red Hat (RH Scan-ner) [5] oo7 [43] and SpecFuzz [39] RH Scanner is a staticanalysis tool that can be used to scan for Spectre gadgetsOo7 [43] is another static analysis tool that utilizes static taintanalysis to find Spectre gadgets SpecFuzz [39] extends thefuzzing technique to detect errors in speculative execution andreport Spectre gadgets Another related work SPECTECTORleverages symbolic execution to detect information-flow differ-ences introduced by speculative execution Since the manualsettings to make it work on large programs are not open-source it is hard to evaluate it on our real-world benchmarksAs discussed in SpecFuzz [39] the authors failed to runSPECTECTOR on the real-world benchmarks due to a largenumber of unsupported instructions Therefore we did notcompare with SPECTECTOR on real-world benchmarks (samewith SpecFuzzrsquos)

Evaluation Dataset The evaluation is conducted on twodatasets the Spectre Samples Dataset and Real-world DatasetAligned with other baseline works we utilize the same SpectreSamples Dataset to demonstrate the detection capability ofSpecTaint For baseline comparison with related works wecollected six real-world applications and created two real-world datasets

bull Spectre Samples Dataset This dataset is designed todemonstrate the efficacy of SpecTaint We collected15 Spectre V1 samples created by Paul Kocher [3] andcompiled 15 samples with the same configuration (gcc-484 with O0) [39]

bull Real-world V1 Dataset For fair baseline comparisonwe use the same dataset as SpecFuzzrsquos [39] This datasetcontains six widely used applications one cryptographicprogram from OpenSSL [12] a compression program(Brotli [7]) and four parsing programs (JSON [10]LibHTP [11] HTTP [9] and YAML [6])

bull Real-world V2 Dataset Systematic baseline evaluationis difficult due to the shortage of ground truth in real-world programs To solve this problem we injectedknown Spectre gadgets into programs from Real-worldV1 Dataset and created the Real-world V2 Dataset Weadopt the same injection approach proposed in LAVA [20]to build this dataset More specifically we utilize dynamictaint analysis to find attack points which can be controlledby input bytes that do not determine control flow andhave not been modified much (see [20] for more details)Then we inject the Spectre gadgets from Spectre SampleDataset into target programs and add code to makeinjected gadgets controllable via input bytes As a resultwe injected 15 Spectre V1 gadgets from Spectre SampleDataset into 52 different locations in six programs Tomake a fair comparison the input seeds used in the evalu-ation are all generated by a fuzzing tool [8] Therefore wehave no clue whether the injected locations are coveredby input seeds in the evaluation It is worth mentioningthat this dataset is used to evaluate the detection coverageof SpecTaint and related works instead of pathcodecoverage and we choose the same seed corpus withSpecFuzzrsquos to guarantee a fair comparison

7

Evaluation Metrics For the Real-world V1 dataset whichdoes not have ground truth we manually verify the detectionresults and calculate the precision rate to quantify the perfor-mance of SpecTaint and baseline approaches The precisionis calculated as precision(P ) = TP

TP+FP where TP is thenumber of detected gadgets that are manually verified to beexploitable and FP is the number of detected gadgets thatare not exploitable based on manual analysis For Real-worldV2 Dataset we only consider injected Spectre gadgets to betrue positives and other reported results to be false positivesThen we calculate the precision and recall to quantify theeffectiveness of the proposed approach and baseline methodsThe recall is calculated as recall(R) = TP

TP+FN where TP isthe number of inserted gadgets which are correctly detectedand FN is the number of inserted gadgets that are missed Theprecision is calculated as precision(P ) = TP

TP+FP where FPis the number of detected gadgets other than injected ones Tomeasure the efficiency we reported runtime per speculativeexecution and number of paths explored per speculative exe-cution along with other statistics (see Section VI-E for moredetails)

Configuration The experiments were conducted on a desktopwith 16 GB memory Intel Core i7 12 cores at 370 GHzCPU and running Linux 415 The Guest OS in QEMU isUbuntu 1404 with 1 GB memory The speculative window isdependent on the space limit ie ROB size and timing limitsWe follow the configuration used by SpecFuzz [39] and alsoset the speculative window size to 250

B Baseline Evaluation on Spectre Samples Dataset

In this experiment we compared SpecTaint with threebaseline tools on the Spectre Sample Dataset As presented inTable II SpecTaint successfully detected all Spectre gadgetsin the Spectre Samples Dataset while RH Scanner relies onsyntax-based pattern matching and missed three cases

C Baseline Comparison on Real-world V2 Dataset

We conducted the baseline comparison with three relatedworks RH Scanner oo7 and SpecFuzz on Real-world V2Dataset In this experiment we focused on detecting theinserted gadgets therefore we only consider inserted gadgetsto be true positives and all other detection results to befalse positives For tools that utilize taint tracking to detectSpectre gadget (oo7 and SpecTaint) we mark input bytes astaint sources Since the analysis of oo7 is very slow whenperforming whole input bytes tainting we only mark inputbytes that influence injected gadgets as taint sources We adoptthe same configuration for SpecTaint To compare with RHScanner and SpecFuzz we have another configuration forSpecTaint where we mark all input bytes as tainted Theresults for the former configuration are labeled with ldquordquo inTable I For dynamic analysis tools (SpecTaint and SpecFuzz)we used an external fuzzing tool [8] to fuzz the six programsfor 10 hours and fed the generated seeds as inputs to run theprograms

1 2 i f ( p a r s e rminusgtt o k n e x t gt= num tokens ) 3 I n s e r t e d S p e c t r e Gadget4 i f d e f SPECTRE VARIANT5 i f ( g l o b a l i d x lt a r r a y 1 s i z e )6 tmp amp= a r r a y 2 [ a r r a y 1 [ g l o b a l i d x ] lowast 5 1 2 ] 7 8 e n d i f9

Listing 2 Missed Spectre gadget 1 by SpecFuzz

1 2 i f ( lminusgtf i r s t + i d x lt lminusgtmax size ) 3 r e t u r n ( vo id lowast) lminusgte l e m e n t s [ lminusgtf i r s t + i d x ] 4 e l s e 5 I n s e r t e d S p e c t r e Gadget6 i f d e f SPECTRE VARIANT7 i n t temp = 0 8 i n t lowastadd r = ampg l o b a l i d x 9 i f (lowast add r lt a r r a y 1 s i z e )

10 temp amp= a r r a y 2 [ a r r a y 1 [lowast add r ] lowast 5 1 2 ] 11 12 e n d i f13

Listing 3 Missed Spectre gadget 2 by SpecFuzz

Table I shows that when tainting gadget-related inputbytes SpecTaint has no false positives and achieves a pre-cision rate of 100 Under the same configuration howeveroo7 still produced false positives For instance it reported13 gadget candidates in LibHTP but 12 of them are falsepositives The results show that static taint analysis suffersfrom the over-tainting issue thereby is hard to achieve highprecision in detecting Spectre gadgets As presented in Table Ioo7 has many false negatives For example it missed allinserted gadgets in YAML and Brotli We examined these falsenegatives and found the reasons are as follows Firstly thedetection results of oo7 [43] depend on the completeness of thecontrol-flow graph (CFG) extraction Some inserted gadgetsare missed since it failed to extract a complete CFG dueto the limitation of the static approach Also oo7 is limitedby static inter-procedure taint tracking and it missed manyinserted gadgets because it failed to propagate the taint sourceto the injection points in the target programs The evaluationresults substantiate our claim that dynamic taint analysis ismuch more accurate and effective than static taint analysis indetecting Spectre gadgets

As presented in Table I SpecTaint outperforms RHScanner and SpecFuzz under the whole input bytes taintingconfiguration Since we only consider injected gadgets to betrue positives in this dataset other detection results are labeledas false positives The analysis results of other gadgets reportedby SpecTaint are presented in Table IV Note that Spec-Taint missed some injected gadgets in this experiment Wefurther investigated the results and found that missed gadgetsare not covered by input seeds However SpecFuzz missedmany injected gadgets that are covered by input seeds anddetected by SpecTaint According to our analysis the reasonsare first SpecFuzz adopts a prioritized simulation of branchmispredictions it selectively chooses whether to simulate themisprediction or not on a conditional branch Therefore itmissed some injected gadgets For example as presented inListing 2 the seeds are able to reach the branch at line 2 butSpecFuzz did not simulate branch misprediction over these

8

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 7: SpecTaint: Speculative Taint Analysis for Discovering

are shared and commonly indexed by the virtual address of thebranch instruction thus can be poisoned from another attack-controlled process by executing a congruent branch with thesame virtual address Based on this finding Spectre gadgetscan be exploited in the same address space (ie intra-process)or across address spaces (ie cross-process) [17] Thereforeonly considering tainted branches will exclude cross-processgadgets Our gadget patterns cover both situations if thespeculative execution is triggered by a taint conditional branchwe will mark this gadget as an intra-process Spectre gadgetotherwise this gadget will be marked as a cross-processSpectre gadget

C Gadget Detection

The gadget detection is to check whether the speculativeinstruction trace conforms to the gadget patterns describedabove We deploy the gadget checker in simulated speculativeexecution Although both our gadget patterns involve memoryaccess operations it is inefficient and unnecessary to checkall the memory access instructions on SE paths Instead weonly check memory access instructions which are tainted Tothis end we instrument memory read and write operationsTo detect BCB gadgets for each memory read we checkwhether its source is tainted To make sure the dependencydep(j [i ]) rule is satisfied between two instructions we alsotrack whether the tainted operand of one instruction j ispropagated from instruction i For BCBS gadgets the patterncaptures any tainted memory write whose destination addressis marked as tainted within the SEW

V IMPLEMENTATION

We have implemented the prototype SpecTaint in C Morespecifically we wrote a C plugin of 1 KLOC in C codeon top of DECAF [25] (a dynamic binary analysis platformbuilt on top of QEMU 10) This plugin implements thestate checkpoint management and Spectre gadget detectioncomponents We reused the dynamic taint analysis plugin ofDECAF for our taint analysis Overall the changes to developour prototype do not exceed 2 KLOC Besides to increase thecode coverage we use AFL 252b [2] and honggfuzz [8] togenerate seed inputs

VI EXPERIMENTAL EVALUATION

In this section we evaluate SpecTaint to answer thefollowing research questions

1) How effective is SpecTaint to find Spectre gadgets com-pared with other existing tools

2) How efficient is SpecTaint to find Spectre gadgets inreal-world applications

This section is composed as follows First we brieflydescribe the experiment setup the datasets and the evaluationmetrics used in our experiments (Sections VI-A) Second weevaluate the efficacy of SpecTaint (Sections VI-B) Then weevaluate the efficiency of SpecTaint on real-world applications(Sections VI-E) Finally we conduct case studies of someSpectre gadgets detected by SpecTaint in real-world applica-tions and the deep learning framework Caffe (Sections VI-F)

A Experiment Setup

Baseline Methods We compare SpecTaint with three base-line approaches Spectre 1 Scanner from Red Hat (RH Scan-ner) [5] oo7 [43] and SpecFuzz [39] RH Scanner is a staticanalysis tool that can be used to scan for Spectre gadgetsOo7 [43] is another static analysis tool that utilizes static taintanalysis to find Spectre gadgets SpecFuzz [39] extends thefuzzing technique to detect errors in speculative execution andreport Spectre gadgets Another related work SPECTECTORleverages symbolic execution to detect information-flow differ-ences introduced by speculative execution Since the manualsettings to make it work on large programs are not open-source it is hard to evaluate it on our real-world benchmarksAs discussed in SpecFuzz [39] the authors failed to runSPECTECTOR on the real-world benchmarks due to a largenumber of unsupported instructions Therefore we did notcompare with SPECTECTOR on real-world benchmarks (samewith SpecFuzzrsquos)

Evaluation Dataset The evaluation is conducted on twodatasets the Spectre Samples Dataset and Real-world DatasetAligned with other baseline works we utilize the same SpectreSamples Dataset to demonstrate the detection capability ofSpecTaint For baseline comparison with related works wecollected six real-world applications and created two real-world datasets

bull Spectre Samples Dataset This dataset is designed todemonstrate the efficacy of SpecTaint We collected15 Spectre V1 samples created by Paul Kocher [3] andcompiled 15 samples with the same configuration (gcc-484 with O0) [39]

bull Real-world V1 Dataset For fair baseline comparisonwe use the same dataset as SpecFuzzrsquos [39] This datasetcontains six widely used applications one cryptographicprogram from OpenSSL [12] a compression program(Brotli [7]) and four parsing programs (JSON [10]LibHTP [11] HTTP [9] and YAML [6])

bull Real-world V2 Dataset Systematic baseline evaluationis difficult due to the shortage of ground truth in real-world programs To solve this problem we injectedknown Spectre gadgets into programs from Real-worldV1 Dataset and created the Real-world V2 Dataset Weadopt the same injection approach proposed in LAVA [20]to build this dataset More specifically we utilize dynamictaint analysis to find attack points which can be controlledby input bytes that do not determine control flow andhave not been modified much (see [20] for more details)Then we inject the Spectre gadgets from Spectre SampleDataset into target programs and add code to makeinjected gadgets controllable via input bytes As a resultwe injected 15 Spectre V1 gadgets from Spectre SampleDataset into 52 different locations in six programs Tomake a fair comparison the input seeds used in the evalu-ation are all generated by a fuzzing tool [8] Therefore wehave no clue whether the injected locations are coveredby input seeds in the evaluation It is worth mentioningthat this dataset is used to evaluate the detection coverageof SpecTaint and related works instead of pathcodecoverage and we choose the same seed corpus withSpecFuzzrsquos to guarantee a fair comparison

7

Evaluation Metrics For the Real-world V1 dataset whichdoes not have ground truth we manually verify the detectionresults and calculate the precision rate to quantify the perfor-mance of SpecTaint and baseline approaches The precisionis calculated as precision(P ) = TP

TP+FP where TP is thenumber of detected gadgets that are manually verified to beexploitable and FP is the number of detected gadgets thatare not exploitable based on manual analysis For Real-worldV2 Dataset we only consider injected Spectre gadgets to betrue positives and other reported results to be false positivesThen we calculate the precision and recall to quantify theeffectiveness of the proposed approach and baseline methodsThe recall is calculated as recall(R) = TP

TP+FN where TP isthe number of inserted gadgets which are correctly detectedand FN is the number of inserted gadgets that are missed Theprecision is calculated as precision(P ) = TP

TP+FP where FPis the number of detected gadgets other than injected ones Tomeasure the efficiency we reported runtime per speculativeexecution and number of paths explored per speculative exe-cution along with other statistics (see Section VI-E for moredetails)

Configuration The experiments were conducted on a desktopwith 16 GB memory Intel Core i7 12 cores at 370 GHzCPU and running Linux 415 The Guest OS in QEMU isUbuntu 1404 with 1 GB memory The speculative window isdependent on the space limit ie ROB size and timing limitsWe follow the configuration used by SpecFuzz [39] and alsoset the speculative window size to 250

B Baseline Evaluation on Spectre Samples Dataset

In this experiment we compared SpecTaint with threebaseline tools on the Spectre Sample Dataset As presented inTable II SpecTaint successfully detected all Spectre gadgetsin the Spectre Samples Dataset while RH Scanner relies onsyntax-based pattern matching and missed three cases

C Baseline Comparison on Real-world V2 Dataset

We conducted the baseline comparison with three relatedworks RH Scanner oo7 and SpecFuzz on Real-world V2Dataset In this experiment we focused on detecting theinserted gadgets therefore we only consider inserted gadgetsto be true positives and all other detection results to befalse positives For tools that utilize taint tracking to detectSpectre gadget (oo7 and SpecTaint) we mark input bytes astaint sources Since the analysis of oo7 is very slow whenperforming whole input bytes tainting we only mark inputbytes that influence injected gadgets as taint sources We adoptthe same configuration for SpecTaint To compare with RHScanner and SpecFuzz we have another configuration forSpecTaint where we mark all input bytes as tainted Theresults for the former configuration are labeled with ldquordquo inTable I For dynamic analysis tools (SpecTaint and SpecFuzz)we used an external fuzzing tool [8] to fuzz the six programsfor 10 hours and fed the generated seeds as inputs to run theprograms

1 2 i f ( p a r s e rminusgtt o k n e x t gt= num tokens ) 3 I n s e r t e d S p e c t r e Gadget4 i f d e f SPECTRE VARIANT5 i f ( g l o b a l i d x lt a r r a y 1 s i z e )6 tmp amp= a r r a y 2 [ a r r a y 1 [ g l o b a l i d x ] lowast 5 1 2 ] 7 8 e n d i f9

Listing 2 Missed Spectre gadget 1 by SpecFuzz

1 2 i f ( lminusgtf i r s t + i d x lt lminusgtmax size ) 3 r e t u r n ( vo id lowast) lminusgte l e m e n t s [ lminusgtf i r s t + i d x ] 4 e l s e 5 I n s e r t e d S p e c t r e Gadget6 i f d e f SPECTRE VARIANT7 i n t temp = 0 8 i n t lowastadd r = ampg l o b a l i d x 9 i f (lowast add r lt a r r a y 1 s i z e )

10 temp amp= a r r a y 2 [ a r r a y 1 [lowast add r ] lowast 5 1 2 ] 11 12 e n d i f13

Listing 3 Missed Spectre gadget 2 by SpecFuzz

Table I shows that when tainting gadget-related inputbytes SpecTaint has no false positives and achieves a pre-cision rate of 100 Under the same configuration howeveroo7 still produced false positives For instance it reported13 gadget candidates in LibHTP but 12 of them are falsepositives The results show that static taint analysis suffersfrom the over-tainting issue thereby is hard to achieve highprecision in detecting Spectre gadgets As presented in Table Ioo7 has many false negatives For example it missed allinserted gadgets in YAML and Brotli We examined these falsenegatives and found the reasons are as follows Firstly thedetection results of oo7 [43] depend on the completeness of thecontrol-flow graph (CFG) extraction Some inserted gadgetsare missed since it failed to extract a complete CFG dueto the limitation of the static approach Also oo7 is limitedby static inter-procedure taint tracking and it missed manyinserted gadgets because it failed to propagate the taint sourceto the injection points in the target programs The evaluationresults substantiate our claim that dynamic taint analysis ismuch more accurate and effective than static taint analysis indetecting Spectre gadgets

As presented in Table I SpecTaint outperforms RHScanner and SpecFuzz under the whole input bytes taintingconfiguration Since we only consider injected gadgets to betrue positives in this dataset other detection results are labeledas false positives The analysis results of other gadgets reportedby SpecTaint are presented in Table IV Note that Spec-Taint missed some injected gadgets in this experiment Wefurther investigated the results and found that missed gadgetsare not covered by input seeds However SpecFuzz missedmany injected gadgets that are covered by input seeds anddetected by SpecTaint According to our analysis the reasonsare first SpecFuzz adopts a prioritized simulation of branchmispredictions it selectively chooses whether to simulate themisprediction or not on a conditional branch Therefore itmissed some injected gadgets For example as presented inListing 2 the seeds are able to reach the branch at line 2 butSpecFuzz did not simulate branch misprediction over these

8

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 8: SpecTaint: Speculative Taint Analysis for Discovering

Evaluation Metrics For the Real-world V1 dataset whichdoes not have ground truth we manually verify the detectionresults and calculate the precision rate to quantify the perfor-mance of SpecTaint and baseline approaches The precisionis calculated as precision(P ) = TP

TP+FP where TP is thenumber of detected gadgets that are manually verified to beexploitable and FP is the number of detected gadgets thatare not exploitable based on manual analysis For Real-worldV2 Dataset we only consider injected Spectre gadgets to betrue positives and other reported results to be false positivesThen we calculate the precision and recall to quantify theeffectiveness of the proposed approach and baseline methodsThe recall is calculated as recall(R) = TP

TP+FN where TP isthe number of inserted gadgets which are correctly detectedand FN is the number of inserted gadgets that are missed Theprecision is calculated as precision(P ) = TP

TP+FP where FPis the number of detected gadgets other than injected ones Tomeasure the efficiency we reported runtime per speculativeexecution and number of paths explored per speculative exe-cution along with other statistics (see Section VI-E for moredetails)

Configuration The experiments were conducted on a desktopwith 16 GB memory Intel Core i7 12 cores at 370 GHzCPU and running Linux 415 The Guest OS in QEMU isUbuntu 1404 with 1 GB memory The speculative window isdependent on the space limit ie ROB size and timing limitsWe follow the configuration used by SpecFuzz [39] and alsoset the speculative window size to 250

B Baseline Evaluation on Spectre Samples Dataset

In this experiment we compared SpecTaint with threebaseline tools on the Spectre Sample Dataset As presented inTable II SpecTaint successfully detected all Spectre gadgetsin the Spectre Samples Dataset while RH Scanner relies onsyntax-based pattern matching and missed three cases

C Baseline Comparison on Real-world V2 Dataset

We conducted the baseline comparison with three relatedworks RH Scanner oo7 and SpecFuzz on Real-world V2Dataset In this experiment we focused on detecting theinserted gadgets therefore we only consider inserted gadgetsto be true positives and all other detection results to befalse positives For tools that utilize taint tracking to detectSpectre gadget (oo7 and SpecTaint) we mark input bytes astaint sources Since the analysis of oo7 is very slow whenperforming whole input bytes tainting we only mark inputbytes that influence injected gadgets as taint sources We adoptthe same configuration for SpecTaint To compare with RHScanner and SpecFuzz we have another configuration forSpecTaint where we mark all input bytes as tainted Theresults for the former configuration are labeled with ldquordquo inTable I For dynamic analysis tools (SpecTaint and SpecFuzz)we used an external fuzzing tool [8] to fuzz the six programsfor 10 hours and fed the generated seeds as inputs to run theprograms

1 2 i f ( p a r s e rminusgtt o k n e x t gt= num tokens ) 3 I n s e r t e d S p e c t r e Gadget4 i f d e f SPECTRE VARIANT5 i f ( g l o b a l i d x lt a r r a y 1 s i z e )6 tmp amp= a r r a y 2 [ a r r a y 1 [ g l o b a l i d x ] lowast 5 1 2 ] 7 8 e n d i f9

Listing 2 Missed Spectre gadget 1 by SpecFuzz

1 2 i f ( lminusgtf i r s t + i d x lt lminusgtmax size ) 3 r e t u r n ( vo id lowast) lminusgte l e m e n t s [ lminusgtf i r s t + i d x ] 4 e l s e 5 I n s e r t e d S p e c t r e Gadget6 i f d e f SPECTRE VARIANT7 i n t temp = 0 8 i n t lowastadd r = ampg l o b a l i d x 9 i f (lowast add r lt a r r a y 1 s i z e )

10 temp amp= a r r a y 2 [ a r r a y 1 [lowast add r ] lowast 5 1 2 ] 11 12 e n d i f13

Listing 3 Missed Spectre gadget 2 by SpecFuzz

Table I shows that when tainting gadget-related inputbytes SpecTaint has no false positives and achieves a pre-cision rate of 100 Under the same configuration howeveroo7 still produced false positives For instance it reported13 gadget candidates in LibHTP but 12 of them are falsepositives The results show that static taint analysis suffersfrom the over-tainting issue thereby is hard to achieve highprecision in detecting Spectre gadgets As presented in Table Ioo7 has many false negatives For example it missed allinserted gadgets in YAML and Brotli We examined these falsenegatives and found the reasons are as follows Firstly thedetection results of oo7 [43] depend on the completeness of thecontrol-flow graph (CFG) extraction Some inserted gadgetsare missed since it failed to extract a complete CFG dueto the limitation of the static approach Also oo7 is limitedby static inter-procedure taint tracking and it missed manyinserted gadgets because it failed to propagate the taint sourceto the injection points in the target programs The evaluationresults substantiate our claim that dynamic taint analysis ismuch more accurate and effective than static taint analysis indetecting Spectre gadgets

As presented in Table I SpecTaint outperforms RHScanner and SpecFuzz under the whole input bytes taintingconfiguration Since we only consider injected gadgets to betrue positives in this dataset other detection results are labeledas false positives The analysis results of other gadgets reportedby SpecTaint are presented in Table IV Note that Spec-Taint missed some injected gadgets in this experiment Wefurther investigated the results and found that missed gadgetsare not covered by input seeds However SpecFuzz missedmany injected gadgets that are covered by input seeds anddetected by SpecTaint According to our analysis the reasonsare first SpecFuzz adopts a prioritized simulation of branchmispredictions it selectively chooses whether to simulate themisprediction or not on a conditional branch Therefore itmissed some injected gadgets For example as presented inListing 2 the seeds are able to reach the branch at line 2 butSpecFuzz did not simulate branch misprediction over these

8

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 9: SpecTaint: Speculative Taint Analysis for Discovering

RH Scanner oo7 SpecFuzz SpecTaintProgram GT TP FP FN Precision Recall TP FP FN Precision Recall TP FP FN Precision Recall TP FP FP FN Precision Precision Recall

JSMN 3 1 448 2 0002 033 3 0 0 100 100 2 17 1 0105 067 3 0 1 0 100 0750 100Brotli 13 2 811 11 0003 015 0 0 13 NA 0 7 43 6 0140 054 12 0 17 1 100 0141 092HTTP 9 3 128 6 0023 033 1 1 8 05 011 8 9 1 0471 089 8 0 6 1 100 0574 089LibHTP 7 4 254 3 0016 057 1 12 6 0077 014 5 79 2 0059 071 7 0 14 0 100 0333 100YAML 10 2 36 8 0526 020 0 0 10 NA 0 4 215 6 0018 040 7 0 3 3 100 0700 070SSL 10 3 100 7 0029 030 7 8 3 0467 070 6 55 4 0098 060 10 0 16 0 100 0385 100

TABLE I Evaluation Results on Real-world V2 dataset (GT ground truth TP true positive FP false positive FN falsenegative) means we only mark input bytes that influence injected gadgets as taint sources

Total RH Scanner oo7 SpecFuzz SpecTaintSpectre 15 12 15 15 15

TABLE II The number of detected Spectre gadgets from theSpectre Sample Dataset

JSMN Brotli HTTP LibHTP YAML SSLSpecFuzz 16 68 9 91 140 589Reproduce 17 43 9 79 215 55SpecTaint 1 17 6 14 3 16Tainted branch 1 13 6 9 0 13Unique 0 6 1 0 0 4

TABLE III The number of detected gadgets in Real-world V1Dataset Reproduce shows the results that were reproducedusing the open-sourced SpecFuzz implementation Taintedbranch means the gadgets are detected during speculativeexecution over tainted branches Unique means the gadgetsare detected only by SpecTaint

two conditional branches and thus missed the inserted gadgetat line 5 Second SpecFuzz stops the simulation and rollsback to a previously saved state once an invalid memoryaccess or other exceptions are captured Thus it will missgadgets located after the invalid memory access or exceptionFor example as presented in Listing 3 SpecFuzz detected anout-of-bounds access at line 3 then it stopped the simulatedspeculative execution and restored As a result it failed todetect the inserted gadget in the ldquoelserdquo branch

Table I also shows that RH Scanner missed many insertedgadgets This is because it failed to explore the execution pathsof target programs due to the limitation of the static pathexploration it uses It also produced a large number of detectionresults due to the simplistic syntax-based gadget modeling Inthis experiment since there are too many results reported byRH Scanner and we only focus on inserted gadgets we did notinspect all detection results and only consider inserted gadgetsto be true positives In conclusion the evaluation results showthat SpecTaint outperforms state-of-the-art tools in terms ofprecision and recall when analyzing real-world programs

D Baseline Evaluation on Real-world V1 Dataset

We further conducted a baseline comparison with anotherdynamic analysis approach SpecFuzz [39] In this experimentwe evaluate two aspects of dynamic Spectre gadget detectiontools The first is the effectiveness which aims at comparingthe number of detected gadgets Since there is no groundtruth in this dataset we manually analyze detected gadgetsand consider exploitable gadgets to be true positives based onhuman knowledge The second is efficiency This is to evaluate

SpecFuzz SpecTaintProgram TP FP Precision TP FP PrecisionJSMN 1 16 0059 1 0 1Brotli 5 38 0116 11 6 0647HTTP 2 7 0222 2 4 0333LibHTP 3 76 0038 3 11 0214YAML 0 215 0 0 3 0SSL 2 53 0036 6 10 0375

TABLE IV True positive false positive and precision ofgadgets detected by SpecFuzz and SpecTaint on Real-worldV1 dataset

the performance of gadget detection tools and runtime perfor-mance of target programs after patching detected gadgets

1 2 f o r ( k = 1 k lt wid th k ++)3 4 o c t e t = p a r s e rminusgtr aw buf fe r p o i n t e r [ k ] 5 v a l u e = ( v a l u e ltlt 6) + ( o c t e t amp 0x3F ) 6 7

Listing 4 One false positive in YAML reported bySpecFuzz

Effectiveness We evaluated the detection effectiveness on theReal-world V1 Dataset six original programs without gadgetinstrumentation We run SpecTaint and SpecFuzz on theseprograms and manually analyze how many Spectre gadgetsare detected in these programs To make a fair comparisonwe first reproduced similar detection results as reported inthe SpecFuzz paper and used the same seed corpus to runSpecTaint More specifically we ran SpecFuzz on each pro-gram for 10 hours (single-thread mode) collected the detectionresults and generated seeds then we used the same seeds torun SpecTaint on six applications The number of detectedgadgets is presented in Table III Note that for OpenSSL wewere not able to reproduce similar results so we listed it asa reference For the rest we were able to reproduce similarresults as presented in their paper [39]

As presented in Table III SpecTaint detected fewer gad-gets than SpecFuzz did Then we manually investigated eachof the detected gadgets and determined true positives and falsepositives As listed in Table IV SpecTaint outperforms Spec-Fuzz in terms of precision rate Although SpecFuzz reporteda large number of gadget candidates most of them are falsepositives according to our manual inspection For examplesome candidates reported by SpecFuzz only contain one out-of-bound access because SpecFuzz considers a memory errorto be a sign of a Spectre gadget Furthermore some false

9

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 10: SpecTaint: Speculative Taint Analysis for Discovering

positives are not data dependent on input bytes as presentedin Listing 4 so they can not be controlled to read and leakarbitrary values We also found that some candidate gadgetsare under loops or constant comparison statements wherespeculation can be resolved shortly Therefore these gadgetsare considered false positives This is also why SpecTaintproduces false positives (see more discussion in Section VII)As presented in Table IV SpecTaint has significantly higherprecision than SpecFuzz For instance SpecFuzz and Spec-Taint detected three true positives for LibHTP but SpecTaintonly has 11 false positives while SpecFuzz reported 76 falsepositives Also SpecFuzz missed some true positives that arereported by SpecTaint For instance SpecTaint reported 11true positives for Brotli and six of them are not detected bySpecFuzz (labeled as unique in Table III) One explanationis that the gadget detection of SpecFuzz is probabilistic andgenerated seeds did not trigger memory errors when executingthose gadgets thus they were missed by SpecFuzzrsquos sanitizer-based gadget detector This also substantiates that the taint-based gadget pattern checking of SpecTaint is deterministicbecause it detects any executed gadgets as long as they satisfythe pre-defined gadget patterns

1

2 s t a t i c BROTLI INLINE B r o t l i D e c o d e r E r r o r C o d eP r o c e s s C o m m a n d s I n t e r n a l (

3 i n t s a f e B r o t l i D e c o d e r S t a t elowast s ) 4 5 t r a n s f o r m i d x i s t a i n t e d6 i f ( t r a n s f o r m i d x lt ( i n t ) t r a n s f o r m sminusgtnum transforms )

7 c o n s t u i n t 8 tlowast word = ampwordsminusgtd a t a [ o f f s e t ] 8 i n t l e n = i 9 i f ( t r a n s f o r m i d x == t r a n s f o r m sminusgtc u t O f f T r a n s f o r m s

[ 0 ] ) 10 memcpy(ampsminusgtr i n g b u f f e r [ pos ] word ( s i z e t ) l e n ) 11 e l s e 12 l e n = B r o t l i T r a n s f o r m D i c t i o n a r y W o r d (ampsminusgt

r i n g b u f f e r [ pos ] word l en t r a n s f o r m s t r a n s f o r m i d x )

13 14 15 16 17 18 r e a d and l e a k s e c r e t u s i n g t r a n s f o r m i d x as i n d e x 19 d e f i n e BROTLI TRANSFORM PREFIX ID( T I ) ( ( T )minusgtt r a n s f o r m s

[ ( ( I ) lowast 3) + 0 ] )20 d e f i n e BROTLI TRANSFORM PREFIX( T I ) (amp(T )minusgt

p r e f i x s u f f i x [ ( T )minusgtp r e f i x s u f f i x m a p [BROTLI TRANSFORM PREFIX ID( T I ) ] ] )

21

22 i n t B r o t l i T r a n s f o r m D i c t i o n a r y W o r d ( u i n t 8 tlowast d s t c o n s tu i n t 8 tlowast word i n t l en

23 c o n s t B r o t l i T r a n s f o r m slowast t r a n s f o r m s i n t t r a n s f o r m i d x )

24 i n t i d x = 0 25 c o n s t u i n t 8 tlowast p r e f i x = BROTLI TRANSFORM PREFIX(

t r a n s f o r m s t r a n s f o r m i d x ) 26 27

Listing 5 Speculative BCB gadget found in Brotli

Performance Overhead after Patching Afterward we ap-plied a serialization tool to patch all reported gadget locationsin six programs and then compared the performance of sixprograms after patching the reported gadgets Specificallywe used a modified version of Speculative Load Hardening(SLH) [18] shipped with SpecFuzz and patched the pro-grams We only patched the gadgets reported by SpecFuzz

Fig 3 The performance overhead after patching wrt native

and SpecTaint and compared the runtime performance withfully hardened programs (patching all conditional branches)In this experiment we used benchmarks shipped with theprograms if available as well as test benchmarks provided bySpecFuzz Figure 3 shows the comparison results As we cansee patching the gadgets detected by SpecTaint introducesnegligible overhead compared with patching gadgets detectedby SpecFuzz and full hardening On average the performanceoverhead was reduced by 55 compared with SpecFuzzrsquospatching and 73 compared with full hardening As presentedin Table III SpecTaint produced fewer gadget candidatesthan SpecFuzz Therefore the runtime overhead introduced bypatching those gadgets were reduced greatly We also foundthat in some cases SLH patching introduces large runtimeoverhead For instance SpecFuzz found three more gadgetcandidates in HTTP but the performance slowdown causedby these three candidates is almost 60 It is because thesethree candidates are located on hot paths that are exercisedfrequently This also demonstrates the importance of highprecision in gadget detection

E Efficiency Evaluation

In this experiment we evaluated the runtime performanceof SpecTaint on the six real-world applications Since theworkflow of SpecTaint is fundamentally different from Spec-Fuzz it is not easy to have an end-to-end runtime comparisonSpecFuzz extends the fuzzing technique and its fuzzing proce-dure and gadget detector are closely coupled While SpecTaintis a detection tool and can receive test cases from fuzzers Inthis evaluation we marked the user inputs as taint sources andsimulate speculative execution over tainted branches Note thatSpecTaint is able to simulate speculative execution on everyconditional branch The runtime of simulating speculative overtainted branches is reasonable to reflect the efficiency becauseas presented in Table III around 80 of detected gadgets areenclosed by tainted branches (intra-process gadgets IV-B)

We collected several statistical results including the to-tal number of executed branches at the normal mode the

10

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 11: SpecTaint: Speculative Taint Analysis for Discovering

Lib Name of Seeds Binary Size Analysis Time(h) of Branch of Tainted Branch of Nested Branch PathBranch TimePath(ms)JSMN 6204 17K 1075 554017 178542 5498472 31 70Brotli 100 440K 065 33912 8900 274948 31 86HTTP 944 85K 524 184044 22531 94477 5 2000

LibHTP 155 549K 04 108101 29836 423263 14 34YAML 2525 251K 684 5311472 604917 30303111 50 08

SSL 46 11M 175 122694 56914 189332 4 4320

TABLE V Runtime Performance Results on real-world applications

total number of tainted branches for speculative executionsimulation the total number of nested branches explored inspeculative execution mode and the number of speculativeexecution paths on average to be explored We also collectedthe execution time for each simulated speculative executionpath The execution time includes the taint analysis pathexploration time and gadget detection time

Since SpecTaint extends a whole-system emulation plat-form and performs dynamic taint analysis by design it paysmore runtime overhead for speculative taint analysis Howeveras presented in Table V we can see that SpecTaint isable to analyze large programs within a reasonable amountof time For example for Brotli SpecTaint finished 8900branches as switch points for speculative execution simulationwithin 40 minutes That means that SpecTaint can finishthe SE path exploration including state management taintanalysis and pattern checking for one switch point within004s Besides SpecTaint has reasonable analysis time Itcan finish the analysis within a few hours for most of theprograms only JSMN and SSL exceed 10 hours In fact SSLis substantially more complex than the other programs andrunning it in the emulator is already very slow On averageSpecTaint takes around 68 hours to analyze one programCompared with SpecFuzz which takes 10 hours to reproducethe results presented in Table III the analysis time in Table Vsuggests that SpecTaint can achieve precise simulation ofspeculative execution and perform dynamic speculative taintanalysis without introducing much overhead

F Case Study

As presented in Table III SpecTaint discovered 11 newSpectre gadgets that were not detected by SpecFuzz [39]After manual inspection we confirmed that ten of them areexploitable and one gadget is considered a false positive (notexploitable see VII) In this section we showcase one detectedSpectre gadget from Real-world V1 Dataset due to the pagelimitation To further demonstrate the capability of SpecTaintwe present one Spectre gadget detected by SpecTaint from awell-known machine learning framework Caffe [1]

Speculative BCB in BROTLI Brotli is a generic-purposelossless compression program SpecTaint found an exploitableSpectre gadget in function ProcessCommandsInternal andBrotliTransformDictionaryWord Listing 5 presents the relevantcode snippets Before calling the function BrotliTransformDic-tionaryWord at line 11 it first checks whether transform -idx is less than num transforms to avoid potentialoverflow In function BrotliTransformDictionaryWord it usestransform idx as index to perform two memory ac-cesses using a macro BROTLI TRANSFORM PREFIX (loada value from an array using transform idx then uses

the loaded value as index to read another array) If thebranch at line 5 was mispredicted during speculative execu-tion BROTLI TRANSFORM PREFIX would perform out-of-bound memory access and leak the loaded value via convertedcache side channels

In this example three properties make this Spectre gadgetexploitable Firstly transform idx is marked as taintedwhich is propagated from user inputs This means the attackercan control its value by manipulating the input Secondlycomputing the branch outcome at line 5 may take hundredsof CPU cycles when transforms-gtnum transformsis not in cache and needs to be fetched from memory Thusit opens a large speculative execution window and allows theexecution of the following gadget during speculative executionFinally the macro BROTLI TRANSFORM PREFIX loads theout-of-bound value using transform idx as an index thenuse the loaded value as an index to access another arrayit first reads potential secret via an out-of-bound memoryaccess and leaks the secret by using the secret as an indexto access another array Thus the attacker can retrieve thesecret via the cache side channel As mentioned this gadgetis newly detected by SpecTaint Although it is hard to findthe reason why SpecFuzz missed this gadget we speculatethat SpecFuzz failed to detect it due to incomplete speculativeexecution simulation As discussed earlier SpecFuzz favorsfuzzing throughput by adopting a lightweight speculative ex-ecution simulation strategy which selectively overlooks somespeculative paths

1 t e m p l a t e lttypename Dtypegt2 vo id So lve rltDtype gt UpdateSmoothedLoss ( Dtype l o s s i n t

s t a r t i t e r 3 i n t a v e r a g e l o s s ) 4 i f ( l o s s e s s i z e ( ) lt a v e r a g e l o s s ) 5 l o s s e s push back ( l o s s ) 6 i n t s i z e = l o s s e s s i z e ( ) 7 smoothed loss = ( smoothed loss lowast ( s i z e minus 1) +

l o s s ) s i z e 8 e l s e 9 i n t i d x = ( i t e r minus s t a r t i t e r ) a v e r a g e l o s s

10 smoothed loss += ( l o s s minus l o s s e s [ i d x ] ) a v e r a g e l o s s

11 l o s s e s [ i d x ] = l o s s 12 13

Listing 6 Speculative Bounds Check Bypass (on Stores)gadget found in Caffe framework

Caffe (v10) We also show a typical bounds check bypasson store gadget in Listing 6 This gadget is discovered inthe Caffe framework using CIFAR-10 as the input modelWe found the gadget during the model training process Inthis experiment we focus on the coverage of the internalcode invoked by the test model rather than the precision

11

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 12: SpecTaint: Speculative Taint Analysis for Discovering

of the model itself Thus we trained the CIFAR-10 modelwith the CIFAR-10 dataset consisting of 60000 images in 10classes The function UpdateSmoothedLoss is defined insrccaffesolvercpp and is frequently called duringthe training process The branch outcome at line 4 dependson the value from losses size() which may takehundreds of cycles to compute In this case the CPU willpredict the branch target and continue to execute speculativelyThe attackers can carefully poison the branch predictor intomispredicting the direction of the branch This may resultin the CPU making a wrong speculative prediction over thisbranch and the value of idx will be greater than its value innormal execution As a result the idx would reach out of thearray boundary and a boundary check bypass on store wouldhappen at line 11

VII DISCUSSION

In this section we discuss the limitations of SpecTaintand possible remedies

1 s t a t i c B r o t l i D e c o d e r E r r o r C o d e ReadSymbolCodeLengths (2 u i n t 3 2 t a l p h a b e t s i z e B r o t l i D e c o d e r S t a t elowast s ) 3 4 code len = BROTLI HC FAST LOAD VALUE( p ) 5 i f ( code len lt BROTLI REPEAT PREVIOUS CODE LENGTH) 6 P r o c e s s S i n g l e C o d e L e n g t h ( code len ampsymbol amp

r e p e a t ampspace 7 ampprev code len s y m b o l l i s t s c o d e l e n g t h h i s t o

next symbol ) 8 9

10 11 s t a t i c BROTLI INLINE vo id P r o c e s s S i n g l e C o d e L e n g t h (

u i n t 3 2 t code len 12 u i n t 3 2 tlowast symbol u i n t 3 2 tlowast r e p e a t u i n t 3 2 tlowast space 13 u i n t 3 2 tlowast prev code len u i n t 1 6 tlowast s y m b o l l i s t s 14 u i n t 1 6 tlowast c o d e l e n g t h h i s t o i n t lowast next symbol ) 15 lowast r e p e a t = 0 16 i f ( code len = 0) 17 s y m b o l l i s t s [ next symbol [ code len ] ] = ( u i n t 1 6 t )

(lowast symbol ) 18 next symbol [ code len ] = ( i n t ) (lowast symbol ) 19 lowastprev code len = code len 20 lowast s p a c e minus= 32768U gtgt code len 21 c o d e l e n g t h h i s t o [ code len ] + + 22 23 (lowast symbol ) ++24

Listing 7 A false positive in Brotli detected by SpecTaint

False positives in detection In this section we discuss thecases where detected gadgets are not exploitable First whenthe triggering instruction can be resolved shortly eg loopsand constant comparison the following gadgets cannot beexecuted before the speculation is resolved However in thiscase SpecTaint would simulate speculative execution with thedefault SEW and detect the following gadgets that satisfy thepre-defined pattern

Second when the triggering instruction opens a large SEW(eg due to a cache miss) SpecTaint may produce false posi-tives in some cases For instance SpecTaint detects a Spectregadget in Brotli shown in Listing 7 In this example thecode len is tainted which is propagated from user inputsIf the branch at line 4 was mispredicted and the CPU specula-tively executes the function ProcessSingleCodeLengthit would pass an out-of-bounds code len to function

ProcessSingleCodeLength At line 17 the out-of-bounds code len is used as an index to access the arraynext symbol the loaded value is further used as an indexto access another array symbol list Thus the attackcan retrieve the out-of-bound value by monitoring the cachestate In practice this gadget is hard to exploit to launch aSpectre V1 attack because the out-of-bound access at line17 uses code len as an index By the time code lenis available the conditional branch at line 5 has already beenresolved which means the speculative execution is terminatedAccording to our pattern checking policy SpecTaint still treatsit as a Spectre gadget

To guarantee a low false negative rate we conservativelyassume the maximum values for CPU optimization parameterssuch as ROB limit This means our detected gadgets may notbe exploitable in some processor models

Incomplete path coverage Like other dynamic analysis toolsour approach is also bounded by the quality of test casesIn other words our approach would fail to detect gadgets inuncovered paths We can leverage state-of-the-art fuzzers [2][8] to increase path coverage Moreover SpecTaint can becombined with other static Spectre gadget detection tools [5][43] and test the uncovered paths to improve coverage Toensure security all the uncovered paths can be hardenedconservatively for security-sensitive projects

Control dependent attacks Our Spectre gadget detectiondepends on dynamic taint analysis that keeps track of directdata flows It will not detect Spectre gadgets that are controldependent on user inputs Compared with the gadgets thatare controlled via direct data-floe dependency the control-dependent gadgets often are of limited attack capabilities andare currently beyond the scope of this work We leave theinvestigation of this kind of gadgets as future work

VIII RELATED WORK

We have discussed related works closely throughout thepaper In this section we briefly survey additional relatedworks We focus on gadget detection techniques Many otherapproaches aim at designing the hardware architecture todefeat the transient execution vulnerabilities [36] [47] [49]Since they are orthogonal to our approach we will only brieflydiscuss these approaches in this section

Transient Execution Attack Variants Transient executionattacks include Spectre-type attacks and Meltdown-type at-tacks in general Spectre-type attacks can be categorized intoSpectre-PHT [29] Spectre-BTB [29] [31] Spectre-RSB [30][34] and Spectre-STL [17] [37] They focus on exploitingdifferent hardware caches For example Spectre-PHT attackspoison the Pattern History Table (PHT) to trigger speculativeexecution Spectre-BTB exploits the Branch Target Buffer(BTB) Meltdown-type attacks usually exploit fault-handlingexceptions such as virtual memory exception [16] [32] [45]or an exception reading a disabled or privileged register [26][42] Aligned with other tools [23] [39] we only focus ondetecting gadgets in victim programs that can be exploited bySpectre V1 attacks and leak sensitive data through cache sidechannels

12

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 13: SpecTaint: Speculative Taint Analysis for Discovering

Spectre Gadget Detection Spectre gadget detection can becategorized as static and dynamic detection techniques Onedirection of the static analysis technique is to model the Spectregadget by using its syntax pattern such as Spectre 1 Scannerfrom RedHat [5] and MSCV Spectre 1 pass [41] and conductthe pattern search on binaries for potential candidates Thesetools produce a large number of false positives Furthermoretheir approaches are not generic and only designed for gad-gets with special patterns Another direction is to explore amore precise modeling by using symbolic execution or statictaint analysis to detect Spectre gadgets These approachesare more reliable and generic [23] [43] but they are stillbounded by limitations of the static analysis For exampleoo7 utilizes a static tainting analysis to capture the data-dependency of Spectre gadget for detection However it inher-its the limitation of static taint analysis such as over-taintingand under-tainting issues SPECTECTOR [23] proposes touse symbolic execution to automatically prove speculativenon-interference or to detect violations However it inheritsthe limitations of symbolic execution and has to sacrificesoundness and completeness of analysis when analyzing largeprograms SpecFuzz [39] extends the fuzzing technique anddetect memory errors during simulated speculative executionHowever to ensure high fuzzing throughput the simulationlogic is over-simplified resulting in poor precision and recallCompared with these approaches SpecTaint is designed toprovide a more precise detection approach that is also scalablefor detecting Spectre gadgets from real-world programs

Mitigation and Defense Intel proposed hardware fixes [44]including improved process and privilege-level separationbut they are only designed for Spectre 20 ConTExT [36]provides a new architecture design using a temporary bufferto mitigate information leakage during speculative executionIt relies on users to identify the confidential information andperform dynamic taint analysis on hardware to keep trackof the confidential information Other approaches [28] [46]either proposed to isolate the cache side channel or providea speculative buffer as a temporary buffer to mitigate thecache leakage but they are still at the design stage At thesystem level the kernel page-table isolation is proposed tomitigate Meltdown attack [22] Many approaches [4] [18]start to add mitigation instructions (serializing instructions ormitigation instructions) at the compiler time to mitigate Spectreattacks Since SpecTaint can effectively provide more preciseSpectre gadget candidates it can greatly reduce the numberof instructions for mitigation insertion Therefore SpecTaintcan improve the runtime performance after patching whichhas been substantiated in Section 3

Dynamic Analysis PIN [33] DynamoRIO [15] and Val-grind [38] are powerful dynamic instrumentation tools In factour approach can also be implemented on these platformsXforce [40] is the first tool to propose the idea of force execu-tion but Xforce is used for code coverage based explorationand it is not designed for the speculative execution simulationOur approach is inspired by their approach and builds theunique features for the speculative execution simulation thatenables dynamic taint analysis on speculative paths

IX CONCLUSION

In this paper we enable dynamic taint analysis for SpectreV1 gadget detection To this end we present a system-levelapproach to simulate and explore the speculative executionand provide fine-grained gadget patterns for precise gadgetdetection We have implemented a prototype SpecTaint todemonstrate the efficacy of our proposed approach We eval-uated the effectiveness of SpecTaint on our Spectre SamplesDataset and real-world programs Our experimental resultsdemonstrate that SpecTaint outperforms the existing meth-ods with reasonable runtime efficiency and it discloses newSpectre V1 gadgets from real-world applications

AVAILABILITY

The source code of SpecTaint and the dataset used in theevaluation can be found via httpsgithubcombitsecurerlabSpecTaintgit

ACKNOWLEDGEMENT

We would like to thank the anonymous reviewers fortheir valuable suggestions and comments This work wassupported by the Office of Naval Research under Award NoN00014-17-1-2893 Any opinions findings and conclusionsor recommendations expressed in this paper are those of theauthors and do not necessarily reflect the views of the fundingagencies

REFERENCES

[1] Caffe httpscaffeberkeleyvisionorg[2] American Fuzzy Lop (252b) httplcamtufcoredumpcxafl 2011[3] Spectre Mitigations in Microsoftrsquos CC++ Compiler httpswww

paulkochercomdocMicrosoftCompilerSpectreMitigationhtml 2018[4] Spectre mitigations in MSVC httpsblogsmsdnmicrosoftcom

vcblog20180115spectre-mitigations-in-msvc 2018[5] SPECTRE Variant 1 scanning tool httpsaccessredhatcomblogs

766093posts3510331 2018[6] LibYAML httpspyyamlorgwikiLibYAML 2019[7] Brotli httpsbrotliorg Accessed June 2020[8] Honggfuzz httphonggfuzzcom Accessed June 2020[9] HTTP httpsgithubcomnodejshttp-parser Accessed June 2020

[10] JSMN httpsgithubcomzsergejsmn Accessed June 2020[11] LibHTP httpsgithubcomOISFlibhtp Accessed June 2020[12] OpenSSL httpswwwopensslorg Accessed June 2020[13] Intel httpsxemgithubiominix86manual

intel-x86-and-64-manual-vol3o fe12b1e2a880e0ce-273html citedby 2019

[14] Fabrice Bellard Qemu a fast and portable dynamic translator In InProceedings of the USENIX Annual Technical Conference (ATC rsquo05)ATC 2005

[15] Derek Bruening Timothy Garnett and Saman Amarasinghe Aninfrastructure for adaptive dynamic optimization In Proceedings ofthe International Symposium on Code Generation and OptimizationFeedback-directed and Runtime Optimization CGO rsquo03 2003

[16] Jo Van Bulck Marina Minkin Ofir Weisse Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Thomas F Wenisch YuvalYarom and Raoul Strackx Foreshadow Extracting the keys to the intelSGX kingdom with transient out-of-order execution In 27th USENIXSecurity Symposium (USENIX Security 18) 2018

[17] Claudio Canella Jo Van Bulck Michael Schwarz Moritz Lipp Ben-jamin von Berg Philipp Ortner Frank Piessens Dmitry Evtyushkin andDaniel Gruss A systematic evaluation of transient execution attacks anddefenses In 28th USENIX Security Symposium (USENIX Security 19)2019

13

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References
Page 14: SpecTaint: Speculative Taint Analysis for Discovering

[18] C Carruth Speculative load hardening httpsdocsgooglecomdocumentd1wwcfv3UV9ZnZVcGiGuoITT61eKo3TmoCS3uXLcJR0editheading=hphdehs44eom6 2018

[19] Ali Davanian Zhenxiao Qi Yu Qu and Heng Yin Decaf++ Elas-tic whole-system dynamic taint analysis In the 22nd InternationalSymposium on Research in Attacks Intrusions and Defenses (RAID)September 2019 RAID 2019

[20] B Dolan-Gavitt P Hulin E Kirda T Leek A Mambretti W Robert-son F Ulrich and R Whelan Lava Large-scale automated vulnerabil-ity addition In 2016 IEEE Symposium on Security and Privacy (SP)pages 110ndash121 2016

[21] J Fustos and H Yun Spectrerewind A framework for leaking secretsto past instructions arXiv 200312208

[22] Daniel Gruss Moritz Lipp Michael Schwarz Richard FellnerClementine Maurice and Stefan Mangard Kaslr is dead Long livekaslr In the 9th International Symposium on Engineering SecureSoftware and Systems (ESSoSrsquo17) 2017

[23] Marco Guarnieri Boris Kopf Jose F Morales Jan Reineke andAndres Sanchez SPECTECTOR principled detection of speculativeinformation flows CoRR abs181208639 2018

[24] A Henderson L K Yan X Hu A Prakash H Yin and S McCa-mant Decaf A platform-neutral whole-system dynamic binary analysisplatform IEEE Transactions on Software Engineering 43(2)164ndash1842017

[25] Andrew Henderson Aravind Prakash Lok Kwong Yan Xunchao HuXujiewen Wang Rundong Zhou and Heng Yin Make it work make itright make it fast Building a platform-neutral whole-system dynamicbinary analysis platform In Proceedings of the 2014 InternationalSymposium on Software Testing and Analysis ISSTA 2014

[26] Intel Q2 2018 speculative execution side channel update 2018[27] K N Khasawneh E M Koruyeh C Song D Evtyushkin D Pono-

marev and N Abu-Ghazaleh Safespec Banishing the spectre of ameltdown with leakage-free speculation In 2019 56th ACMIEEEDesign Automation Conference (DAC) pages 1ndash6 2019

[28] V Kiriansky I Lebedev S Amarasinghe S Devadas and J EmerDawg A defense against cache timing attacks in speculative executionprocessors 2018

[29] Paul Kocher Daniel Genkin Daniel Gruss Werner Haas MikeHamburg Moritz Lipp Stefan Mangard Thomas Prescher MichaelSchwarz and Yuval Yarom Spectre attacks Exploiting speculativeexecution CoRR abs180101203 2018

[30] Esmaeil Mohammadian Koruyeh Khaled N Khasawneh ChengyuSong and Nael Abu-Ghazaleh Spectre returns speculation attacksusing the return stack buffer In Proceedings of the 12th USENIXConference on Offensive Technologies WOOTrsquo18 2018

[31] Sangho Lee Ming-Wei Shih Prasun Gera Taesoo Kim Hyesoon Kimand Marcus Peinado Inferring fine-grained control flow inside SGXenclaves with branch shadowing In 26th USENIX Security Symposium(USENIX Security 17) 2017

[32] Moritz Lipp Michael Schwarz Daniel Gruss Thomas Prescher WernerHaas Anders Fogh Jann Horn Stefan Mangard Paul Kocher DanielGenkin Yuval Yarom and Mike Hamburg Meltdown Reading kernelmemory from user space In 27th USENIX Security Symposium(USENIX Security 18) 2018

[33] Chi-Keung Luk Robert Cohn Robert Muth Harish Patil Artur KlauserGeoff Lowney Steven Wallace Vijay Janapa Reddi and Kim Hazel-wood Pin Building customized program analysis tools with dynamicinstrumentation In Proceedings of the 2005 ACM SIGPLAN Conferenceon Programming Language Design and Implementation PLDI rsquo052005

[34] Giorgi Maisuradze and Christian Rossow Ret2spec Speculative execu-tion using return stack buffers In Proceedings of the 2018 ACM SIGSACConference on Computer and Communications Security(CCSrsquo18) 2018

[35] Andrea Mambretti Matthias Neugschwandtner Alessandro SorniottiEngin Kirda William Robertson and Anil Kurmus Speculator A toolto analyze speculative execution attacks and mitigations In Proceedingsof the 35th Annual Computer Security Applications Conference ACSACrsquo19 page 747ndash761 New York NY USA 2019 Association forComputing Machinery

[36] Claudio Canella Robert Schilling Florian Kargl Daniel Gruss

Michael Schwarz Moritz Lipp Context A generic approach formitigating spectre Proceedings of the Annual Network and DistributedSystem Security Symposium (NDSSrsquo20) 2020

[37] Marina Minkin Daniel Moghimi Moritz Lipp Michael SchwarzJo Van Bulck Daniel Genkin Daniel Gruss Frank Piessens Berk Sunarand Yuval Yarom Fallout Reading kernel writes from user spaceCoRR 2019

[38] Nicholas Nethercote and Julian Seward Valgrind A framework forheavyweight dynamic binary instrumentation In Proceedings of the28th ACM SIGPLAN Conference on Programming Language Designand Implementation PLDI rsquo07 2007

[39] Oleksii Oleksenko Bohdan Trach Mark Silberstein and ChristofFetzer Specfuzz Bringing spectre-type vulnerabilities to the surfaceCoRR abs190510311 2019

[40] Fei Peng Zhui Deng Xiangyu Zhang Dongyan Xu Zhiqiang Lin andZhendong Su X-force Force-executing binary programs for securityapplications In 23rd USENIX Security Symposium (USENIX Security14) 2014

[41] Read Sprabery Konstantin Evchenko Abhilash Raj Rakesh B BobbaSibin Mohan and R H Campbell Scheduling isolation and cacheallocation A side-channel defense 2018

[42] Julian Stecklina and Thomas Prescher Lazyfp Leaking FPU registerstate using microarchitectural side-channels CoRR abs1806074802018

[43] Guanhua Wang Sudipta Chattopadhyay Ivan Gotovchits Tulika Mitraand Abhik Roychoudhury oo7 Low-overhead defense against spectreattacks via binary analysis CoRR abs180705843 2018

[44] T Warren Intel processors are being redesigned to protect againstspectre 2018

[45] Ofir Weisse Jo Van Bulck Marina Minkin Daniel Genkin BarisKasikci Frank Piessens Mark Silberstein Raoul Strackx Thomas FWenisch and Yuval Yarom Foreshadow-ng Breaking the virtualmemory abstraction with transient out-of-order execution Technicalreport 2018

[46] Mengjia Yan Jiho Choi Dimitrios Skarlatos Adam Morrison Christo-pher W Fletcher and Josep Torrellas Invisispec Making speculativeexecution invisible in the cache hierarchy 2018 51st Annual IEEEACMInternational Symposium on Microarchitecture (MICRO) 2018

[47] Mengjia Yan Bhargava Gopireddy Thomas Shull and Josep TorrellasSecure hierarchy-aware cache replacement policy (sharp) Defendingagainst cache-based side channel atacks In Proceedings of the 44thAnnual International Symposium on Computer Architecture ISCA rsquo172017

[48] Yuval Yarom and Katrina Falkner Flush+reload A high resolutionlow noise l3 cache side-channel attack In 23rd USENIX SecuritySymposium (USENIX Security 14) pages 719ndash732 San Diego CAAugust 2014 USENIX Association

[49] Si Yu Xiaolin Gui and Jiancai Lin An approach with two-stage modeto detect cache-based side channel attacks In Proceedings of the 2013International Conference on Information Networking (ICOIN) ICOINrsquo13 2013

14

  • Introduction
  • Overview
    • Motivating Example
    • Background and Rationale
    • System Overview
      • Dynamic Speculative Execution Simulation
        • Misprediction Simulation
        • Exploration of Speculative Execution Paths
          • Spectre Gadget Detection
            • Dynamic Taint Tracking
            • Spectre Gadget Modeling
            • Gadget Detection
              • Implementation
              • Experimental Evaluation
                • Experiment Setup
                • Baseline Evaluation on Spectre Samples Dataset
                • Baseline Comparison on Real-world V2 Dataset
                • Baseline Evaluation on Real-world V1 Dataset
                • Efficiency Evaluation
                • Case Study
                  • Discussion
                  • Related Work
                  • Conclusion
                  • References