Symbolic Path Simulation in Path-Sensitive Dataflow Analysis

Hari Hampapuram

Jason Yue Yang

Manuvir Das

Center for Software Excellence (CSE), Microsoft Corporation

PASTE'05, Jason Yang, Microsoft

Gist of Results

Symbolic path simulation engine supporting:

1. Merge – For merge-based path-sensitive analysis

2. Function summaries – For scalable global analysis

3. Pointers – Our main client is Windows

Infeasible Path False Positive

    extern int a, b;

    void Process(int handle) {
        int x, y;
        if (a > 0) {
            CloseHandle(handle);
            x = 1;
        } else
            x = 2;

        if (b > 0)
            y = 1;
        else
            y = 2;

        if (x != 1)
            UseHandle(handle);
    }

[Figure: property state machine for handle usage, with states START, OPEN, CLOSE, and ERROR and transitions labeled OpenHandle, UseHandle, CloseHandle, and UseHandle.]

Need for Merge

The “knob” for the scalability vs. precision tradeoff
– Always merge (traditional dataflow): false errors
– Always separate: exponential blow-up

Driven by client analyses

Merge Criterion for ESP

Selective merging based on property states
– Partition symbolic states into property states and everything else
– If the incoming paths differ in property states, track them separately; otherwise, merge them.
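The merge criterion above can be sketched in C as follows. This is only an illustration under stated assumptions, not ESP's implementation: the types `PropertyState`, `AbsInt`, and `SymState` and the function `add_at_join` are hypothetical names, and "everything else" is reduced to two tracked integers, x and y, each either a known constant or unknown.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical property states for the handle protocol. */
typedef enum { PS_OPEN, PS_CLOSED } PropertyState;

/* Abstract integer: a known constant, or unknown when known is false. */
typedef struct { bool known; int value; } AbsInt;

/* A symbolic state split into its property state and "everything else". */
typedef struct {
    PropertyState prop;
    AbsInt x, y;
} SymState;

/* Join of two abstract integers: keep the constant only if both agree. */
static AbsInt join_absint(AbsInt a, AbsInt b) {
    AbsInt unknown = { false, 0 };
    return (a.known && b.known && a.value == b.value) ? a : unknown;
}

/* Add an incoming path state at a join point.
   Same property state as an existing entry: merge into that entry.
   Different property state: track it as a separate entry.
   Returns false only if the state array is full. */
static bool add_at_join(SymState *states, size_t *count, size_t cap,
                        SymState incoming) {
    for (size_t i = 0; i < *count; i++) {
        if (states[i].prop == incoming.prop) {
            states[i].x = join_absint(states[i].x, incoming.x);
            states[i].y = join_absint(states[i].y, incoming.y);
            return true;
        }
    }
    if (*count == cap) return false;
    states[*count] = incoming;
    (*count)++;
    return true;
}
```

Feeding the four paths of the Process example into `add_at_join` would leave two states, one per property state: y becomes unknown in each, while x keeps its constant, which is the fact needed at the final branch.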

Merge Criterion for ESP Example

    extern int a, b;

    void Process(int handle) {
        int x, y;
        if (a > 0) {
            CloseHandle(handle);
            x = 1;
        } else
            x = 2;

        if (b > 0)
            y = 1;
        else
            y = 2;

        if (x != 1)
            UseHandle(handle);
    }

– After the first branch (a > 0): property states change along paths. Do not merge.
– After the second branch (b > 0): property states are the same. Merge.
– Still maintains the needed fact: “If CloseHandle is called, branch should fail.”

Need for Function Summaries

    extern int a, b;

    void Process(int handle) {
        int x, y;
        if (a > 0) {
            CloseHandle(handle);
            x = 1;
        } else
            x = 2;

        if (b > 0)
            y = Foo(b);
        else
            y = 2;

        if (x != 1)
            UseHandle(handle);
    }

Partial transfer functions

Computed on-demand

Enforced by “into-binding” and “back-binding”

Support for Language Features

Pointers

Field-based objects

Operator expressions

…

Symbolic Simulator Architecture

[Figure: layered architecture.
– Client Application: defect detection, core dump analysis, test generation, code review, ...
– Simulation Interface (SI): “semantic translator”
– Simulation State Manager (SSM): “theorem prover”]

Semantic Domains

Environment
– ProgramSymbol → Loc
– Managed by Simulation Interface

Store
– Loc → Val
– Managed by Simulation State Manager

Region-based model for symbolic store
– region: Loc
– value: Val
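The two-level mapping above can be illustrated with a small C sketch. It assumes fixed-size tables and plain integers for locations and values; `Env`, `Store`, `env_lookup`, `store_write`, and `store_read` are hypothetical names, not the simulator's actual interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

typedef int  Loc;  /* a region identifier */
typedef long Val;  /* a value; a real simulator would use symbolic values */

/* Environment: ProgramSymbol -> Loc. */
typedef struct {
    const char *names[MAX_SYMS];
    size_t      count;
} Env;

/* Store: Loc -> Val, with a flag for locations never written. */
typedef struct {
    Val  vals[MAX_SYMS];
    bool bound[MAX_SYMS];
} Store;

/* Look up a symbol, allocating a fresh location on first use.
   Returns -1 if the table is full. */
static Loc env_lookup(Env *env, const char *name) {
    for (size_t i = 0; i < env->count; i++) {
        if (strcmp(env->names[i], name) == 0) return (Loc)i;
    }
    if (env->count == MAX_SYMS) return -1;
    env->names[env->count] = name;
    return (Loc)env->count++;
}

/* Bind a value to a location. Out-of-range locations are ignored. */
static void store_write(Store *st, Loc loc, Val v) {
    if (loc < 0 || loc >= MAX_SYMS) return;
    st->vals[loc] = v;
    st->bound[loc] = true;
}

/* Read a location; returns false if nothing has been bound to it. */
static bool store_read(const Store *st, Loc loc, Val *out) {
    if (loc < 0 || loc >= MAX_SYMS || !st->bound[loc]) return false;
    *out = st->vals[loc];
    return true;
}
```

Keeping the two maps separate mirrors the division of labor on the slide: resolving a program symbol to a location is one concern, and tracking what that location holds is another.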

Simulation State Manager (SSM)

Tracking symbolic simulation states to answer queries about path feasibility

What should be tracked?
– Mapping of store: region → value
– Constraints on values

Regions

Variable regions vs. deref regions
– Important for pointer dereference
– Important for supporting merge and binding

    void Process(int *p, int *q) {
        int x = *p;
        int y = *q;
        if (p != q)
            return;
        if (*p != *q)
            … // Not reachable
    }

Variable regions: R(p), R(q), R(x), R(y)

Deref regions: R(*p), R(*q)

Values

Constant values (integers, floats, …)

Operator values (arithmetic, bitwise, relational)

Symbolic values (general constraint variables)

Region-initial values (constraint variables for initial values)

Pointer values (for points-to relationship)

Field-based values (for compound types)

Need for Region-Initial Values

Important for function summary
– Pre-condition: simulation state at Entry node
– Post-condition: simulation state at Exit node
– Input values vs. current values

To support lazy initialization for input values
– An input region gets region-initial values by default, unless it has been killed
– Need to maintain a kill set
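Lazy initialization with a kill set might look like the following C sketch. It is a simplified illustration: `Value`, `FnState`, `read_region`, and `write_region` are hypothetical names, regions are plain array indices, and only constants and region-initial values are modeled.

```c
#include <stdbool.h>

#define MAX_REGIONS 32

typedef enum { VAL_CONST, VAL_REGION_INITIAL } ValKind;

/* Either a constant, or "the value region `region` held at function entry". */
typedef struct {
    ValKind kind;
    long    constant;  /* used when kind == VAL_CONST */
    int     region;    /* used when kind == VAL_REGION_INITIAL */
} Value;

typedef struct {
    bool  killed[MAX_REGIONS];   /* the kill set */
    Value current[MAX_REGIONS];  /* meaningful only for killed regions */
} FnState;

/* Reading a region that was never written yields its region-initial value;
   nothing has to be materialized at the Entry node. */
static Value read_region(const FnState *s, int region) {
    if (s->killed[region]) return s->current[region];
    Value initial = { VAL_REGION_INITIAL, 0, region };
    return initial;
}

/* Writing a region kills it: later reads see the new value. */
static void write_region(FnState *s, int region, Value v) {
    s->killed[region] = true;
    s->current[region] = v;
}
```

With this layout, the state at the Exit node distinguishes input values (region-initial) from current values without the analysis having to enumerate every input region up front.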

Decision Procedures

Current implementation:
– Equality (e.g. a == b): equivalence classes
– Disequality (e.g. a != b): multi-maps between equivalence classes
– Inequality (e.g. a < b): a graph (nodes are equivalence classes and edges are inequality relations)

Can plug in other theorem provers if needed
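The equality and disequality parts can be sketched with a union-find structure, as below. This is an assumption-laden illustration rather than the actual implementation: values are small integer ids, disequalities are kept in a flat list instead of multi-maps, the inequality graph is omitted, and all names (`Constraints`, `cs_assume_eq`, `cs_assume_neq`, and so on) are hypothetical.

```c
#include <stdbool.h>

#define MAX_VALS  64
#define MAX_DISEQ 64

typedef struct {
    int parent[MAX_VALS];     /* union-find forest: equivalence classes */
    int diseq[MAX_DISEQ][2];  /* recorded disequalities between value ids */
    int ndiseq;
} Constraints;

static void cs_init(Constraints *c) {
    for (int i = 0; i < MAX_VALS; i++) c->parent[i] = i;
    c->ndiseq = 0;
}

/* Representative of v's equivalence class (with path halving). */
static int cs_find(Constraints *c, int v) {
    while (c->parent[v] != v) {
        c->parent[v] = c->parent[c->parent[v]];
        v = c->parent[v];
    }
    return v;
}

/* True if some recorded disequality separates the classes of a and b. */
static bool cs_known_diseq(Constraints *c, int a, int b) {
    int ra = cs_find(c, a), rb = cs_find(c, b);
    for (int i = 0; i < c->ndiseq; i++) {
        int x = cs_find(c, c->diseq[i][0]);
        int y = cs_find(c, c->diseq[i][1]);
        if ((x == ra && y == rb) || (x == rb && y == ra)) return true;
    }
    return false;
}

/* Assume a == b. Returns false if the path becomes infeasible. */
static bool cs_assume_eq(Constraints *c, int a, int b) {
    if (cs_known_diseq(c, a, b)) return false;
    int ra = cs_find(c, a), rb = cs_find(c, b);
    c->parent[ra] = rb;
    return true;
}

/* Assume a != b. Returns false if the path becomes infeasible. */
static bool cs_assume_neq(Constraints *c, int a, int b) {
    if (cs_find(c, a) == cs_find(c, b)) return false;
    if (c->ndiseq < MAX_DISEQ) {  /* if full, the fact is dropped: less precise, still safe */
        c->diseq[c->ndiseq][0] = a;
        c->diseq[c->ndiseq][1] = b;
        c->ndiseq++;
    }
    return true;
}
```

A branch such as `a == b` would call `cs_assume_eq` on the true edge and `cs_assume_neq` on the false edge; a `false` result means that edge can be pruned as infeasible.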

Merge

Moves symbolic states upwards in the lattice
– Fewer constraints on path feasibility after merge

Maps the memory graphs and the associated constraints on values

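[Figure: JOIN of memory graphs over the region pairs R1, R2; R1’, R2’; and R1’’, R2’’, each holding the value 0xEFD0, with symbolic values $1, $2, $3 and constraints $1 > 0, $2 > 0, $3 > 0.]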
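A much-reduced C sketch of such a join is shown below. It assumes both incoming states have the same region layout and that the only constraint form is a strict lower bound on a region's value; `Fact`, `State`, `join_fact`, and `join_state` are hypothetical names. The real merge also has to map the two memory graphs onto each other, which this sketch does not attempt.

```c
#include <stdbool.h>

#define NREGIONS 4

/* A single constraint form: value > lower (when has_lower is true). */
typedef struct { bool has_lower; long lower; } Fact;

typedef struct { Fact facts[NREGIONS]; } State;

/* Keep only what holds on both paths: if both sides bound the value
   from below, the weaker (smaller) bound survives; otherwise nothing does. */
static Fact join_fact(Fact a, Fact b) {
    Fact none = { false, 0 };
    if (!a.has_lower || !b.has_lower) return none;
    Fact r = { true, a.lower < b.lower ? a.lower : b.lower };
    return r;
}

/* Region-by-region join of two states with identical layout. */
static State join_state(const State *a, const State *b) {
    State out;
    for (int i = 0; i < NREGIONS; i++) {
        out.facts[i] = join_fact(a->facts[i], b->facts[i]);
    }
    return out;
}
```

The result never carries more constraints than either input, which is what moving upwards in the lattice means here.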

Example Client Analysis: ESP

Path-sensitive, context-sensitive, inter-procedural defect detection tool for large C/C++ programs

Simulation Interface (SI)

Fetching regions and values

Assignments
– E.g., x = 1;

Branches
– E.g., a == b;

Procedure call (into-binding)

Call back (back-binding)

Into-Binding

Two approaches:
– Binding precise calling context into callee
    Less demand in reasoning power to refute infeasible path
    More suitable for top-down analysis
– Binding no constraints (TOP) into callee
    More demand in reasoning power to refute infeasible path
    More suitable for bottom-up analysis

Binding from caller Call node to callee Entry node
– Bind parameters
– Bind global variables
– Bind constraints

Back-Binding

Binding from callee Exit node to caller Return node
– Bind the region-initial values of input regions
– Bind values of output regions
– Bind constraints
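One way to picture back-binding is as a substitution: wherever the callee's summary mentions the region-initial value of an input, the caller plugs in what it actually passed. The C sketch below illustrates only that idea, for a return value that is either a constant or the initial value of one parameter; `SummaryValue`, `CallerValue`, and `back_bind` are hypothetical names, and binding of constraints is left out.

```c
#include <stdbool.h>

typedef enum { SV_CONST, SV_INITIAL_OF_PARAM } SummaryKind;

/* What a callee summary says about its return value at the Exit node. */
typedef struct {
    SummaryKind kind;
    long        constant;  /* used when kind == SV_CONST */
    int         param;     /* used when kind == SV_INITIAL_OF_PARAM */
} SummaryValue;

/* What the caller knows about a value at the Call node. */
typedef struct { bool known; long value; } CallerValue;

/* Translate the callee's return value into the caller's terms by
   replacing a region-initial value with the matching actual argument. */
static CallerValue back_bind(SummaryValue ret, const CallerValue *args,
                             int nargs) {
    CallerValue unknown = { false, 0 };
    if (ret.kind == SV_CONST) {
        CallerValue v = { true, ret.constant };
        return v;
    }
    if (ret.kind == SV_INITIAL_OF_PARAM && ret.param >= 0 && ret.param < nargs) {
        return args[ret.param];
    }
    return unknown;
}
```

For a call such as `y = Foo(b)`, a summary stating that `Foo` returns the initial value of its first parameter would bind back to whatever the caller knows about `b`.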

Experiences

Security properties for future version of Windows

Difficult to check with other tools

Scalability
– E.g., for all device drivers, found ~500 errors in 20 hours

Precision
– E.g., for Windows kernel (216,000 LOC, 9755 functions)

                           Bugs   False Positives   Time (sec)
With Path Simulation       2      0                 1098
Without Path Simulation    2      12                1037

Summary

Critical for improving precision

Scalable enough for industrial programs

Other client analyses
– PSE
– Iterative refinement for ESP

Beneficial to have built-in support for merge, function summaries, and pointers

Thank You!

For more information, please visit http://www.microsoft.com/windows/cse/pa
