static taint-analysis on binary executablesweb.cs.iastate.edu/~weile/cs513x/5.taintanalysis2.pdf ·...

30
Static Taint-Analysis on Binary Executables Sanjay Rawat, Laurent Mounier, Marie-Laure Potet VERIMAG University of Grenoble October 2011 Static Taint-Analysis on Binary Executables 1/29

Upload: others

Post on 14-Jul-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Static Taint-Analysis on Binary Executables

Sanjay Rawat, Laurent Mounier, Marie-Laure Potet

VERIMAGUniversity of Grenoble

October 2011

Static Taint-Analysis on Binary Executables 1/29

Page 2: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Outline

1 Introduction

2 Intra-procedural taint analysis

3 Inter-procedural taint analysis

4 Tool plateform and experimental results

5 Conclusion

Static Taint-Analysis on Binary Executables 2/29

Page 3: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Context: vulnerability analysis

Vulnerable functions VF

unsafe library functions (strcpy, memcpy, etc.) or codepatterns (unchecked buffer copies, memory de-allocations)

critical parts of the code (credential checkings)

etc.

Vulnerable paths = execution paths allowing to

read external inputs (keyboard, network, files, etc.) on amemory location Mi

call a vulnerable function VF with parameter values dependingon Mi

Static Taint-Analysis on Binary Executables 3/29

Page 4: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Vulnerable paths detection based on taint analysis

Input:

a set of input sources (IS) = tainted data

a set of vulnerable functions (VF)

Output:

a set of tainted paths =tainted data-dependency paths from IS to VF

x=IS() · · · −→ · · · y := x · · · −→ · · · VF(y)

Static Taint-Analysis on Binary Executables 4/29

Page 5: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Work objective

→ Staticaly compute vulnerable execution paths:

on large applications (several thousands of functions)scalability issues ⇒ lightweight analysis

from binary executable code

Evaluation on existing vulnerable code(Firefox, Acroread, MediaPlayer, . . . )

→ Links with more general (test related) problems:

interprocedural information flow analysis

program chopping [T. Reps], impact analysis

Static Taint-Analysis on Binary Executables 5/29

Page 6: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Outline

1 Introduction

2 Intra-procedural taint analysis

3 Inter-procedural taint analysis

4 Tool plateform and experimental results

5 Conclusion

Static Taint-Analysis on Binary Executables 6/29

Page 7: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Taint Analysis

• Identify input dependent variables at each program location

• Two kinds of dependencies:

Data dependencies

// x is tainted

y = x ; z = y + 1 ; y = 3 ;

// z is tainted

Control dependencies

// x is tainted

if (x > 0) y = 3 else y = 4 ;

// y is tainted

⇒ we focus on data dependencies . . .

Static Taint-Analysis on Binary Executables 7/29

Page 8: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Static taint data-dependency analysis

(classical) data-flow analysis problem:

input functions return tainted values

constants are untainted

forward computation of a set of pairs (v , T ) at each programlocation:

v is a variableT ∈ {Tainted ,Untainted} is a taint value

fix-point computation (backward dependencies inside loops)

Main difficulties:

memory aliases (pointers)

non scalar variables (e.g., arrays)

read (T[i]) ; ... ; x = T[j]

Static Taint-Analysis on Binary Executables 8/29

Page 9: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

A source-level Example

int x, y, z, t ;

...

read(y) ; read(t) ;

y = 3 ;

x = t ;

z = y ;

// x and t are now tainted

Static Taint-Analysis on Binary Executables 9/29

Page 10: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Taint analysis at the assembly level

y at ebp-8, x at ebp-4 and z at ebp-12.

y = 3 ;

...

z = y ;

1: t3 := 3

2: t4 := ebp-8

3: Mem[t4] := t3

...

7: t5 := ebp-8

8: t6 := Mem[t5]

9: t7 := ebp-12

10: Mem[t7] := t6

Needs to identify that:

value written at ebp-8 ← mem. loc. written at line 3

content of reg. t4 at line 2 = content of reg. t5 at line 7

⇒ compute possible values of each registers and mem. locations . . .

Static Taint-Analysis on Binary Executables 10/29

Page 11: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Value Set Analysis (VSA)

Compute the sets of mem. addresses defined at each prog. loc.

Difficult because:

addresses and other values are not distinguishable

both direct and indirect memory adresssing

address arithmetic is pervasive

Compute at each prog. loc. an over-approximation of:

the set of (abstract) adresses that are defined

the value contained in each register and each abstract address

⇒ Can be expressed as a forward data-flow analysis . . .

Static Taint-Analysis on Binary Executables 11/29

Page 12: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Memory model

• Memory = (unbounded) set of fix-sized memory cells• Memloc = (consecutive) memory cells accessed during load/store ops.

Memloc addresses:

local variables and parameters offset w.r.t to ebp

global variables fixed value

dynamically allocated memory return values from malloc

However:

the exact value of ebp is unknown

the value returned by a malloc() is unknown

arithmetic computations to access non-scalar variables

set of memory locations accessed is unknown statically

Static Taint-Analysis on Binary Executables 12/29

Page 13: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Abstracting Adresses and Values

Abstract address/value =

- set of offsets w.r.t. register content at a given instruction- expressed as a pair < B,X > s.t.:

B is an (abstract) “base value”, which can be either

a pair (instruction, register)an element of {Empty,None,Any}

X is a finite set of integers (X ⊆ Z).

Concrete values represented by < B,X > =∅ if B = Empty (empty value)

Z if B = Any (any value)

X if B = None (constant value)

{v + x | x ∈ X ∧ v ∈ concrete val. of t at i} if B = (i , t)

Static Taint-Analysis on Binary Executables 13/29

Page 14: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Example

1. t0 = ebp + 8

t0 =< (1, ebp), {8} >

2. t1 = Mem[t0] {the content of Mem[t0] is unknown ... }t1 =< (2, t1), {0} >

3. t2 = t1 + 4

t2 =< (2, t1), {4} >

4. Mem[t2] = 50

< (2, t1), 4 >=< None, {50} >

Static Taint-Analysis on Binary Executables 14/29

Page 15: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Intraprocedural VSA as a data-flow analysis

Mapping

{ Register × Abstract addresses } → Abstract values

associated to each CFG node

Forward least-fix-point computation:

lattice of abstract address/values (more precise ≤ less precise)

widening operator (set of offsets is bounded)

merge operator: least upper bound

< B1,X1 > t < B2,X2 >=

{< Any, ∅ > if B1 6= B2

< B1,X1 ∪ X2 > if B1 = B2

transfer function: abstracts the instruction semantics . . .

Static Taint-Analysis on Binary Executables 15/29

Page 16: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Example 1: conditional statement

1 #include <stdio.h>

2 int main()

3 {

4 int x, y=5, z ;

5 if (y<4) {

6 x=3; z=4;

7 } else {

8 x=4; z=3;

9 } ;

10 y=x+z;

11 return z;

12 }

’ebp’ = <’40105601’, {0}>

’esp’ = <’init’, {4}>

’init-12’ = <’noval’, {6, 7, 8}>

’init-16’ = <’noval’, {3, 4}>

’init-8’ = <’noval’, {3, 4}>

Static Taint-Analysis on Binary Executables 16/29

Page 17: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Example 2: iterative statement

1 #include <stdio.h>

2 int main()

3 {

4 int x=0, i, y;

5 for (i=0; i<4;i++) {

6 y=6; x=x+i;

7 }

8 return 0;

9 }

’ebp’ = <’40105701’, {0}>

’esp’ = ’init’, {4}>

’initESP-12’ = <’anyval’, {}>

’initESP-16’ = <’noval’, {6}>

’initESP-8’ = <’anyval’, {}>

Static Taint-Analysis on Binary Executables 17/29

Page 18: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Possible improvements

use more sophisticated abstract domain ?(e.g., stridded intervals ?)

restrict VSA to registers and memory locations involved inaddress computationsPartially done . . .

take into account the size of memory transfers

Static Taint-Analysis on Binary Executables 18/29

Page 19: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Outline

1 Introduction

2 Intra-procedural taint analysis

3 Inter-procedural taint analysis

4 Tool plateform and experimental results

5 Conclusion

Static Taint-Analysis on Binary Executables 19/29

Page 20: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Hypothesis on information flows

Inside procedures:

assigments: x := y + z

From caller to callee:

arguments: foo (x, y+12)

From callee to caller:

return value and pointer to arguments: z = foo (x, &y)

And global variables . . .

⇒ compute procedure summaries to express these dependencies.

Static Taint-Analysis on Binary Executables 20/29

Page 21: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

A summary-based inter-procedural data-flow analysis

intra-procedural level: summary computation

→ express side-effects wrt taintedness and aliases

int foo(int x, int *y){

int z;

z = x+1 ; *y = z ;

return z ;

}

Summary: x is tainted ⇒ z is tainted, z and *y are aliases

inter-procedural level: apply summaries to effective parameters

read(b) ; // taints b

a = foo (b+12, &c) ; // a and c are now tainted ...

Static Taint-Analysis on Binary Executables 21/29

Page 22: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

A summary-based inter-procedural data-flow analysis

intra-procedural level: summary computation

→ express side-effects wrt taintedness and aliases

int foo(int x, int *y){

int z;

z = x+1 ; *y = z ;

return z ;

}

Summary: x is tainted ⇒ z is tainted, z and *y are aliases

inter-procedural level: apply summaries to effective parameters

read(b) ; // taints b

a = foo (b+12, &c) ; // a and c are now tainted ...

Static Taint-Analysis on Binary Executables 21/29

Page 23: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Scalability issues

Fine-grained data-flow analysis → not applicable on large programs

⇒ needs some “aggressive” approximations:

some deliberate over-approximations(global variables, complex data structures, etc.)

consider only data-flow propagation

operate at fine-grained level only on a program slice(parts of the code outside the slice either irrelevant orapproximated)

Static Taint-Analysis on Binary Executables 22/29

Page 24: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Slicing the Call Graph

Inter-procedural information flow from IS to VFHow to reduce the set of procedures to be analysed ?

→ A slice computation performed at the call graph level

→ Split this set into 3 parts:

1 procedure that are not relevant

2 procedure those side-effect can be (implicitely)over-approximated→ use of default summaries . . .

3 procedure requiring a more detailed analysis→ summary computation through intra-procedural analysis

Static Taint-Analysis on Binary Executables 23/29

Page 25: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Outline

1 Introduction

2 Intra-procedural taint analysis

3 Inter-procedural taint analysis

4 Tool plateform and experimental results

5 Conclusion

Static Taint-Analysis on Binary Executables 24/29

Page 26: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Tool Highlights

Based on two existing platforms:

IDA Pro, a “general purpose” disassembler

BinNavi:

translation to an intermediate representation (REIL)a data-flow analysis engine (MonoREIL)

+ an additional set of Jython procedures

But still under construction/evaluation . . .

Static Taint-Analysis on Binary Executables 25/29

Page 27: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Tool Architecture

IDA Pro BinNavix86 REILExecutable

var identificationactual param identification

Value Analysis

VF detection

detection

Vulnerable Path

Static Taint-Analysis on Binary Executables 26/29

Page 28: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Example of experimental result

Name: Fox Player

Total functions: 1074

Total vulnerable functions: 48

Total slices found: 16 (5 with a tainted data flow)

Smallest slice: 3 func

Largest slice: 40 func

Average func in slice: 18

⇒ About 10 “vulnerable paths” discovered . . .

Static Taint-Analysis on Binary Executables 27/29

Page 29: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Taintflow slice for Foxplayer Example

ReadFile

sscanf

sub_40DE00

984

985

sub_40D810

975 sub_40DD20

980 986 995

sub_4106A0

941

sub_411480

943

sub_40EB80

1231

sub_40EED0

1232

sub_40F6F0

1321

972

1075 1076 1078 1073 1097 1160

Static Taint-Analysis on Binary Executables 28/29

Page 30: Static Taint-Analysis on Binary Executablesweb.cs.iastate.edu/~weile/cs513x/5.TaintAnalysis2.pdf · Static taint data-dependency analysis (classical) data-ow analysis problem: input

Conclusion

Part of a more complete tool chain for“Vulnerability Detection and Exploitability Analysis”

To be continued within the BinSec ANR project . . .

Static Taint-Analysis on Binary Executables 29/29