Symbolic Path Simulation in Path-Sensitive Dataflow Analysis
Hari Hampapuram
Jason Yue Yang
Manuvir Das
Center for Software Excellence (CSE)
Microsoft Corporation
PASTE'05 Jason Yang, Microsoft
Gist of Results
Symbolic path simulation engine supporting:
1. Merge
– For merge-based path-sensitive analysis
2. Function summaries
– For scalable global analysis
3. Pointers
– Our main client is Windows
Infeasible Path False Positive
extern int a, b;

void Process(int handle) {
    int x, y;
    if (a > 0) {
        CloseHandle(handle);
        x = 1;
    } else
        x = 2;

    if (b > 0)
        y = 1;
    else
        y = 2;

    if (x != 1)
        UseHandle(handle);
}

[Figure: handle property state machine with states START, OPEN, CLOSE, and ERROR, and transitions labeled OpenHandle, UseHandle, CloseHandle, and UseHandle]
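The property being checked can be sketched as a small state machine in C. This is a minimal illustration, not the tool's implementation: the state and event names follow the slide, but the exact transition edges (and the choice to send every unlisted call to the error state) are assumptions, and `handle_step` is a hypothetical name.

```c
#include <assert.h>

/* Property states named on the slide (the ST_ prefix is ours). */
typedef enum { ST_START, ST_OPEN, ST_CLOSE, ST_ERROR } HandleState;

/* Events are the API calls that label the transitions. */
typedef enum { EV_OPEN_HANDLE, EV_USE_HANDLE, EV_CLOSE_HANDLE } HandleEvent;

/* Assumed transition function: any call not listed for a state is an error. */
HandleState handle_step(HandleState s, HandleEvent e)
{
    assert(s >= ST_START && s <= ST_ERROR);
    switch (s) {
    case ST_START:
        return e == EV_OPEN_HANDLE ? ST_OPEN : ST_ERROR;
    case ST_OPEN:
        if (e == EV_USE_HANDLE)   return ST_OPEN;   /* use keeps the handle open */
        if (e == EV_CLOSE_HANDLE) return ST_CLOSE;
        return ST_ERROR;
    case ST_CLOSE:
        return ST_ERROR;                            /* e.g. UseHandle after CloseHandle */
    default:
        return ST_ERROR;                            /* ERROR is absorbing */
    }
}
```

Under these assumptions, the path through Process that calls CloseHandle and then UseHandle ends in the error state. That path requires both x = 1 and x != 1, so it is infeasible, and reporting it is the false positive.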
Need for Merge
The “knob” for the scalability vs. precision tradeoff
– Always merge (traditional dataflow): false errors
– Always separate: exponential blow-up
Driven by client analyses
Merge Criterion for ESP
Selective merging based on property states
– Partition symbolic states into property states and everything else
– If the incoming paths differ in property states, track them separately; otherwise, merge them.
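The criterion can be sketched in C as a join over two incoming states. This is a simplified illustration under stated assumptions: the "everything else" part is reduced to a table of known constant values for two variables, and `SymState`, `PropState`, and `esp_join` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define NVARS 2                 /* assumption: track only x (0) and y (1) */

typedef enum { P_OPEN, P_CLOSED } PropState;

typedef struct {
    PropState prop;             /* property state */
    bool known[NVARS];          /* "everything else": is variable i a known constant? */
    int  value[NVARS];          /* the constant, valid only when known[i] */
} SymState;

/* Joins a and b into out[]; returns the number of surviving states (1 or 2). */
int esp_join(const SymState *a, const SymState *b, SymState out[2])
{
    assert(a && b && out);
    if (a->prop != b->prop) {   /* property states differ: track separately */
        out[0] = *a;
        out[1] = *b;
        return 2;
    }
    out[0] = *a;                /* same property state: merge */
    for (int i = 0; i < NVARS; i++) {
        bool agree = a->known[i] && b->known[i] && a->value[i] == b->value[i];
        if (!agree) {           /* keep only facts both paths agree on */
            out[0].known[i] = false;
            out[0].value[i] = 0;
        }
    }
    return 1;
}
```

Merging drops the facts the two paths disagree on, which is where precision is traded for scalability; keeping states with different property states apart preserves the facts that are correlated with the property.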
Merge Criterion for ESP Example

extern int a, b;

void Process(int handle) {
    int x, y;
    if (a > 0) {
        CloseHandle(handle);
        x = 1;
    } else
        x = 2;

    if (b > 0)
        y = 1;
    else
        y = 2;

    if (x != 1)
        UseHandle(handle);
}

After the first branch (a > 0), property states differ along the paths: do not merge
After the second branch (b > 0), property states are the same: merge
– Still maintains the needed fact: “If CloseHandle is called, branch should fail.”
Need for Function Summaries

extern int a, b;

void Process(int handle) {
    int x, y;
    if (a > 0) {
        CloseHandle(handle);
        x = 1;
    } else
        x = 2;

    if (b > 0)
        y = Foo(b);
    else
        y = 2;

    if (x != 1)
        UseHandle(handle);
}

Partial transfer functions
Computed on-demand
Enforced by “into-binding” and “back-binding”
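One way to picture a partial, on-demand transfer function is a cache from entry states to exit states that is filled only for the calling contexts that actually occur. The sketch below is an assumption-laden illustration: states are reduced to plain integers, and `SummaryTable`, `summary_apply`, and the stand-in callee analysis `analyze_foo` are hypothetical names, not part of the tool.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUMMARIES 16

typedef struct { int entry_state; int exit_state; } SummaryEntry;

typedef struct {
    SummaryEntry e[MAX_SUMMARIES];
    size_t n;                   /* number of cached entries */
    int analyses_run;           /* how often the callee body was analyzed */
} SummaryTable;

typedef int (*AnalyzeFn)(int entry_state);

/* Stand-in for analyzing the body of Foo from a given entry state. */
int analyze_foo(int entry_state) { return entry_state + 1; }

/* Returns the exit state for entry_state, analyzing the callee only on a miss. */
int summary_apply(SummaryTable *t, AnalyzeFn analyze, int entry_state)
{
    assert(t && analyze);
    for (size_t i = 0; i < t->n; i++)
        if (t->e[i].entry_state == entry_state)
            return t->e[i].exit_state;          /* summary hit: reuse */

    int exit_state = analyze(entry_state);      /* miss: compute on demand */
    t->analyses_run++;
    if (t->n < MAX_SUMMARIES) {                 /* cache while there is room */
        t->e[t->n].entry_state = entry_state;
        t->e[t->n].exit_state = exit_state;
        t->n++;
    }
    return exit_state;
}
```

The table is partial because it only ever contains the entry states that some caller bound into the callee; a second call with the same entry state reuses the cached exit state instead of re-analyzing the body.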
Support for Language Features
Pointers
Field-based objects
Operator expressions
…
Symbolic Simulator Architecture
[Figure: layered architecture, top to bottom]
Client Application: defect detection, core dump analysis, test generation, code review ...
Simulation Interface (SI): “Semantic translator”
Simulation State Manager (SSM): “Theorem prover”
Semantic Domains
Environment
– ProgramSymbol → Loc
– Managed by Simulation Interface
Store
– Loc → Val
– Managed by Simulation State Manager
Region-based model for symbolic store
– A region plays the role of Loc
– A value plays the role of Val
Simulation State Manager (SSM)
Tracking symbolic simulation states to answer queries about path feasibility
What should be tracked?
– Mapping of the store: region → value
– Constraints on values
Regions
Variable regions vs. deref regions
– Important for pointer dereference
– Important for supporting merge and binding

void Process(int *p, int *q) {
    int x = *p;
    int y = *q;
    if (p != q)
        return;
    if (*p != *q)
        …  // Not reachable
}

Variable regions: R(p), R(q), R(x), R(y)
Deref regions: R(*p), R(*q)
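The distinction can be sketched in C: each pointer variable has its own variable region, whose value names a deref region, and learning p == q makes the two deref regions one. This is a minimal illustration under assumptions; `Store`, `assume_ptr_equal`, and `deref_may_differ` are hypothetical names, and only the pointer-related regions of the example are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Variable regions R(p), R(q) and deref regions R(*p), R(*q). */
enum { R_P, R_Q, R_DEREF_P, R_DEREF_Q, NREGIONS };

typedef struct {
    int points_to[NREGIONS];    /* target region of a pointer variable, else -1 */
    int alias[NREGIONS];        /* representative after aliasing; identity at first */
} Store;

void store_init(Store *s)
{
    for (int r = 0; r < NREGIONS; r++) {
        s->points_to[r] = -1;
        s->alias[r] = r;
    }
    s->points_to[R_P] = R_DEREF_P;
    s->points_to[R_Q] = R_DEREF_Q;
}

/* Follows alias links to the representative region. */
int store_rep(const Store *s, int r)
{
    while (s->alias[r] != r)
        r = s->alias[r];
    return r;
}

/* Past "if (p != q) return;" we know p == q: unify the two deref regions. */
void assume_ptr_equal(Store *s, int p, int q)
{
    assert(s->points_to[p] >= 0 && s->points_to[q] >= 0);
    int a = store_rep(s, s->points_to[p]);
    int b = store_rep(s, s->points_to[q]);
    s->alias[b] = a;
}

/* "*p != *q" can only hold if the deref regions are still distinct. */
bool deref_may_differ(const Store *s, int p, int q)
{
    return store_rep(s, s->points_to[p]) != store_rep(s, s->points_to[q]);
}
```

Once the deref regions are unified, both dereferences read the same region, so the branch on *p != *q is refuted without any reasoning about the values stored there.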
Values
Constant values (integers, floats, …)
Operator values (arithmetic, bitwise, relational)
Symbolic values (general constraint variables)
Region-initial values (constraint variables for initial values)
Pointer values (for points-to relationship)
Field-based values (for compound types)
Need for Region-Initial Values
Important for function summary
– Pre-condition: simulation state at Entry node
– Post-condition: simulation state at Exit node
– Input values vs. current values
To support lazy initialization for input values
– An input region gets region-initial values by default, unless it has been killed
– Need to maintain a kill set
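Lazy initialization with a kill set can be sketched in C as follows. This is an illustrative sketch, not the SSM's data structure: values are reduced to an id plus a flag, and `SimState`, `sim_read`, and `sim_assign` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define NREGIONS 4

typedef struct {
    bool is_initial;            /* true: the region-initial value of region `id` */
    int  id;                    /* region index if initial, else a value id */
} Value;

typedef struct {
    bool  killed[NREGIONS];     /* the kill set */
    Value current[NREGIONS];    /* meaningful only for killed regions */
} SimState;

void sim_init(SimState *s)
{
    for (int r = 0; r < NREGIONS; r++) {
        s->killed[r] = false;
        s->current[r] = (Value){ false, 0 };
    }
}

/* An assignment kills the region and records its new value. */
void sim_assign(SimState *s, int region, int value_id)
{
    assert(region >= 0 && region < NREGIONS);
    s->killed[region] = true;
    s->current[region] = (Value){ false, value_id };
}

/* A read of an unkilled region yields its region-initial value on demand. */
Value sim_read(const SimState *s, int region)
{
    assert(region >= 0 && region < NREGIONS);
    if (!s->killed[region])
        return (Value){ true, region };
    return s->current[region];
}
```

No storage is spent on input regions the function never writes, and at the Exit node the kill set separates regions that still hold their input values from those that hold new ones.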
Decision Procedures
Current implementation:
– Equality (e.g. a == b): equivalence classes
– Disequality (e.g. a != b): multi-maps between equivalence classes
– Inequality (e.g. a < b): a graph (nodes are equivalence classes and edges are inequality relations)
Can plug in other theorem provers if needed
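The equality and disequality procedures can be sketched in C with a union-find structure. This is a simplified sketch under assumptions: a boolean matrix between class representatives stands in for the multi-maps, the inequality graph is omitted, and `Constraints`, `cs_assume_eq`, and `cs_assume_ne` are hypothetical names. Each `assume` function returns false when the new constraint contradicts what is already known, meaning the path is infeasible.

```c
#include <assert.h>
#include <stdbool.h>

#define NVALS 8

typedef struct {
    int  parent[NVALS];         /* union-find forest: equivalence classes */
    bool diseq[NVALS][NVALS];   /* disequalities between class representatives */
} Constraints;

void cs_init(Constraints *c)
{
    for (int i = 0; i < NVALS; i++) {
        c->parent[i] = i;
        for (int j = 0; j < NVALS; j++)
            c->diseq[i][j] = false;
    }
}

/* Finds the class representative, halving paths along the way. */
int cs_find(Constraints *c, int v)
{
    assert(v >= 0 && v < NVALS);
    while (c->parent[v] != v) {
        c->parent[v] = c->parent[c->parent[v]];
        v = c->parent[v];
    }
    return v;
}

/* Assume a == b; false means the path is infeasible. */
bool cs_assume_eq(Constraints *c, int a, int b)
{
    int ra = cs_find(c, a), rb = cs_find(c, b);
    if (ra == rb) return true;
    if (c->diseq[ra][rb]) return false;         /* contradicts a != b */
    c->parent[rb] = ra;                         /* merge the classes */
    for (int i = 0; i < NVALS; i++)             /* ra inherits rb's disequalities */
        if (c->diseq[rb][i])
            c->diseq[ra][i] = c->diseq[i][ra] = true;
    return true;
}

/* Assume a != b; false means the path is infeasible. */
bool cs_assume_ne(Constraints *c, int a, int b)
{
    int ra = cs_find(c, a), rb = cs_find(c, b);
    if (ra == rb) return false;                 /* contradicts a == b */
    c->diseq[ra][rb] = c->diseq[rb][ra] = true;
    return true;
}
```

For example, after assuming v0 == v1 and v1 != v2, assuming v0 == v2 is rejected, which is how a simulated branch gets refuted.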
Merge
Moves symbolic states upwards in the lattice
– Fewer constraints on path feasibility after merge
Maps the memory graphs and the associated constraints on values
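[Figure: JOIN of two memory graphs. Regions R1, R2 and R1’, R2’ are mapped to regions R1’’, R2’’. The value 0xEFD0 appears in all three graphs; the symbolic values $1 and $2, with constraints $1 > 0 and $2 > 0, are joined into $3 with constraint $3 > 0.]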
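One reading of the join pictured here can be sketched in C for a single region: equal constants survive, different values are replaced by a fresh symbol, and a constraint is kept on the fresh symbol only if it holds on both incoming paths. This is a sketch under assumptions: only the single constraint "value > 0" is tracked, and `Cell` and `join_cell` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool is_const;      /* true: holds konst; false: holds symbol $sym */
    long konst;
    int  sym;
    bool gt_zero;       /* for symbols: the constraint "$sym > 0" is known */
} Cell;

/* Does "value > 0" hold for this cell? */
static bool cell_gt_zero(const Cell *c)
{
    return c->is_const ? c->konst > 0 : c->gt_zero;
}

/* Joins the values of one region from two paths; *next_sym numbers fresh symbols. */
Cell join_cell(const Cell *a, const Cell *b, int *next_sym)
{
    assert(a && b && next_sym);
    if (a->is_const && b->is_const && a->konst == b->konst)
        return *a;                              /* same constant on both paths */

    Cell out;
    out.is_const = false;
    out.konst = 0;
    out.sym = (*next_sym)++;                    /* fresh symbol for the merged value */
    out.gt_zero = cell_gt_zero(a) && cell_gt_zero(b);  /* keep only shared facts */
    return out;
}
```

With inputs $1 > 0 and $2 > 0, the result is a fresh $3 with $3 > 0; if only one path had the constraint, the merged symbol would be unconstrained, which is the move upwards in the lattice.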
Example Client Analysis: ESP
Path-sensitive, context-sensitive, inter-procedural defect detection tool for large C/C++ programs
Simulation Interface (SI)
Fetching regions and values
Assignments
– E.g., x = 1;
Branches
– E.g., a == b;
Procedure call (into-binding)
Call back (back-binding)
Into-Binding
Two approaches:
– Binding precise calling context into callee
    – Less demand on reasoning power to refute infeasible paths
    – More suitable for top-down analysis
– Binding no constraints (TOP) into callee
    – More demand on reasoning power to refute infeasible paths
    – More suitable for bottom-up analysis
Binding from caller Call node to callee Entry node
– Bind parameters
– Bind global variables
– Bind constraints
Back-Binding
Binding from callee Exit node to caller Return node
– Bind the region-initial values of input regions
– Bind values of output regions
– Bind constraints
Experiences
Security properties for future version of Windows
Difficult to check with other tools
Scalability
– E.g., for all device drivers, found ~500 errors in 20 hours
Precision
– E.g., for Windows kernel (216,000 LOC, 9755 functions)

                          Bugs   False Positives   Time (sec)
With Path Simulation      2      0                 1098
Without Path Simulation   2      12                1037
Summary
Critical for improving precision
Scalable enough for industrial programs
Other client analyses
– PSE
– Iterative refinement for ESP
Beneficial to have built-in support for merge, function summaries, and pointers
Thank You!
For more information, please visit http://www.microsoft.com/windows/cse/pa