Type-Based Data Structure Verification
Ming Kawaguchi, Patrick Rondon, Ranjit Jhala
University of California, San Diego
Goal: Static Software Verification
Verify absence of run-time errors
• Buffer overflows
• Deadlocks
• Assertion failures
Requires precise data structure verification
The Problem With Structures
[Figure: example data structures holding the elements 1 to 5.]
Unbounded Size
Need Universally Quantified Properties
“Every element has property P”
Contributions
Precise properties of individual cells
Types
Predicates
Lift properties to invariants on structures
Contributions
• Recursive Structures (Lists, Trees)
• Table Structures (Fields, Arrays, Hash Maps)
• Supports Inference
• Expressive: Sorted, Duplicate-Free, Height-Balanced, Acyclic, …
• Practical: Sorting Algorithms, Splay Heaps, Binary Heaps, AVL Trees, Red-Black Trees, Vectors, Union-Find, BDDs, …
Type Mechanisms
Predicate-Refined Type Mechanisms
Plan
• Contributions
• Types & Structures
• Refined Types & Data Structures
• Expressiveness
• Results
Types & Structures
How do types handle structures?
1. Represent Universal Properties
2. Algorithm for Instantiating Properties
3. Algorithm for Generalizing Properties
4. Algorithm for Inference
1. Representation: Recursive Types
1 2 3
1::(2::(3::[]))
type int list =
| []
| :: of x:int * int list
[Figure: the recursive type, with constructors [] and :: of x:int, unfolded three times to expose heads h1:int, h2:int, h3:int.]
Type Unfolding
Universal Property: for all x in l, x is an int
Universal Property: h1: int, h2: int, h3: int, …
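As a rough illustration of the representation slide, the sketch below declares the list type by hand and states a universal property as one check per unfolding. The names ilist, Nil, Cons, example and for_all are hypothetical, chosen for this sketch only; OCaml's built-in list already plays this role.

```ocaml
(* Hand-written recursive type; Nil and Cons stand in for [] and ::. *)
type ilist =
  | Nil
  | Cons of int * ilist

(* The slide's example 1::(2::(3::[])). *)
let example = Cons (1, Cons (2, Cons (3, Nil)))

(* "For all x in l, p x": each recursive call unfolds the type once
   and checks the exposed head. *)
let rec for_all (p : int -> bool) (l : ilist) : bool =
  match l with
  | Nil -> true
  | Cons (h, t) -> p h && for_all p t
```

The type checker establishes "every element is an int" statically; for_all is only a runtime stand-in for richer properties.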
2. Instantiation Algorithm: Unfold
[Figure: Unfold. The type of l is unfolded once at the :: constructor, exposing a head h:int.]
Instantiate: from l : int list and l = h::t, obtain h : int and t : int list
3. Generalization Algorithm: Fold
[Figure: Fold. One unfolded :: layer with head h:int is folded back into the recursive type.]
Generalize: from h : int and t : int list, with l = h::t, obtain l : int list
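The two steps can be pictured as a pair of ordinary functions. This is only a sketch under the assumption that unfolding corresponds to pattern matching and folding to applying the :: constructor; the function names unfold and fold are hypothetical.

```ocaml
(* Instantiate: expose one layer of the list.
   From l : int list we get h : int and t : int list. *)
let unfold (l : int list) : (int * int list) option =
  match l with
  | [] -> None
  | h :: t -> Some (h, t)

(* Generalize: rebuild the list.
   From h : int and t : int list we get h :: t : int list. *)
let fold (h : int) (t : int list) : int list = h :: t
```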
let rec insert (x, l) =
  match l with
  | [] -> x :: []
  | h :: t ->
    if x < h then x :: l
    else h :: insert (x, t)
insert :: (x: int, l: int list) → int list
Verification = Generalization + Instantiation
Ex: Typecheck Insertion Into List
Ex: Typecheck Insertion Into List 1/3
insert :: (x: int, l: int list) → int list
Branch 1: | [] -> x::[]
Assume input type: x : int, [] : int list
Generalize (G): x::[] : int list
Output checks!
Ex: Typecheck Insertion into List 2/3
insert :: (x: int, l: int list) → int list
Branch 2: | h::t -> if x<h then x::l
Assume input type: x : int, l : int list
Generalize (G): x::l : int list
Output checks!
Ex: Typecheck Insertion into List 3/3
insert :: (x: int, l: int list) → int list
Branch 3: | h::t -> ... else h::insert(x,t)
Input assumption: x : int, h::t : int list
Instantiate (I): h : int, t : int list
By the type of insert: insert(x,t) : int list
Generalize (G): from h : int and insert(x,t) : int list, h::insert(x,t) : int list
Output checks!
Verification = Generalization + Instantiation
insert :: (x: int, l: int list) → int list
let rec insert (x, l) =
  match l with
  | [] -> x :: []                 (* G *)
  | h :: t ->                     (* I *)
    if x < h then x :: l          (* G *)
    else h :: insert (x, t)       (* G *)
• Generalize (G) when adding to a structure.
• Instantiate (I) when taking from a structure.
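For reference, here is the slide's insert written as a self-contained OCaml definition with the signature given as an explicit annotation. The G and I comments mark where a checker would generalize or instantiate; their placement is this sketch's reading of the slide.

```ocaml
(* insert :: (x: int, l: int list) -> int list *)
let rec insert ((x, l) : int * int list) : int list =
  match l with
  | [] -> x :: []                 (* G: build an int list from x and [] *)
  | h :: t ->                     (* I: l yields h : int and t : int list *)
    if x < h then x :: l          (* G: cons x onto l *)
    else h :: insert (x, t)       (* G: cons h onto the recursive result *)
```

For example, insert (3, [1; 2; 4]) evaluates to [1; 2; 3; 4].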
Plan
• Contributions
• Types & Structures
• Refined Types & Data Structures
• Expressiveness
• Results
Refined Types & Data Structures
Idea: “Piggyback” Predicates over Types
1. Representation for Universal Data Properties
2. Algorithm for Instantiating Data Properties
3. Algorithm for Generalizing Data Properties
4. Algorithm for Inference
1. Representation: Refined RecTypes
[Figure: the list type with its element refined by 0<x; unfolding once exposes a head h:int with 0<h.]
Refined Type Unfolding
[Figure: three unfoldings expose heads h1, h2, h3 with 0<h1, 0<h2, 0<h3; the remaining elements still satisfy 0<x.]
Universal Data Property: for all x in l, 0<x
Universal Data Property: l : {x: int | 0<x} list
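The property l : {x: int | 0<x} list can be mimicked at run time, which may help in reading the notation. The sketch below is a dynamic stand-in, not the static check the talk describes; pos, all_pos and cons_pos are hypothetical names.

```ocaml
(* Runtime stand-in for the element refinement {x: int | 0<x}. *)
let pos (x : int) : bool = 0 < x

(* "For all x in l, 0<x". *)
let all_pos (l : int list) : bool = List.for_all pos l

(* Generalize: consing is allowed only when the head satisfies the
   refinement and the tail already has the refined type. *)
let cons_pos (h : int) (t : int list) : int list option =
  if pos h && all_pos t then Some (h :: t) else None
```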
1. Representation: RecRefined RecTypes
[Figure: the list type with two refinements: 0<x on the element (refines all elements) and x<V on the recursive edge (refines the tail's elements).]
V Refers to Each Element in Tail
RecRefined Type Unfolding
[Figure: unfolding once exposes a head h:int. Instantiating V and pushing the edge predicate into the node gives h<x for each tail element x; the tail keeps its own edge refinement x<V.]
Instantiate V
Push Edge Predicate Into Node
[Figure: three unfoldings expose heads h1, h2, h3 with h1<h2, h1<h3, h2<h3; the remaining elements x satisfy h1<x, h2<x, h3<x, and the tail keeps x<V.]
Universal Recursive Data Property: h1 < h2 < h3 < …
Universal Recursive Data Property: l : sorted list
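Read operationally, the edge refinement x<V says that every head is below every element of its tail, at every level of the list. A minimal runtime sketch of that reading follows; the name sorted is an assumption of this sketch, and the quadratic check is chosen to mirror the unfolding rather than for efficiency.

```ocaml
(* At each unfolding, the head h must satisfy h < v for every v in the
   tail (the pushed-in edge predicate), and the tail must itself be sorted. *)
let rec sorted (l : int list) : bool =
  match l with
  | [] -> true
  | h :: t -> List.for_all (fun v -> h < v) t && sorted t
```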
2. Instantiation Algorithm: Unfold
[Figure: Unfold. The sorted-list type, with edge refinement x<V, is unfolded once; the tail's elements gain h<x.]
Instantiate: from l : sorted list and l = h::t, obtain h : int and t : sorted list & {h<x} list
3. Generalization Algorithm: Fold
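[Figure: Fold. A head h:int and a tail whose elements satisfy h<x, with edge refinement x<V, are folded back into the sorted-list type.]
Generalize: from h : int and t : sorted list & {h<x} list, with l = h::t, obtain l : sorted list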
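The refined fold and unfold steps can be sketched with runtime checks standing in for the static refinements. All names here (sorted, tail_ok, instantiate, generalize) are hypothetical, and the checks are an illustration rather than how a type checker would discharge them.

```ocaml
let rec sorted (l : int list) : bool =
  match l with
  | [] -> true
  | h :: t -> List.for_all (fun v -> h < v) t && sorted t

(* t : sorted list & {h<x} list, checked dynamically. *)
let tail_ok (h : int) (t : int list) : bool =
  sorted t && List.for_all (fun x -> h < x) t

(* Instantiate: taking a sorted list apart yields h and a tail
   that satisfies tail_ok h. *)
let instantiate (l : int list) : (int * int list) option =
  match l with
  | [] -> None
  | h :: t -> Some (h, t)

(* Generalize: h :: t is a sorted list only if tail_ok h t holds. *)
let generalize (h : int) (t : int list) : int list option =
  if tail_ok h t then Some (h :: t) else None
```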
Refinement Type Inference
[Figure: the list type with unknown refinement variables K1 and K2 in place of the refinements 0<x and x<V.]
Refinements Determine Property
1. Unknown refinements are variables
2. Constraints over variables
3. Solve to find refinements
Hints for Type Inference
• Hints: *<*, 0<*, *=*, …
• Instantiate With Program Variables: x<V, 0<x, x=V, …
• Apply at Refinement Points (K1, K2)
• Keep Only Valid Refinements
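The "instantiate with program variables" step can be sketched as filling the holes of each hint with every variable in scope. The types and names below (side, hint, fill, instantiate, show) are invented for this sketch and are not Dsolve's interface; the later step of keeping only valid refinements needs a constraint solver and is not modelled here.

```ocaml
(* One side of a hint: a hole ("*") or the constant 0. *)
type side = Star | Zero

(* A hint such as "* < *" or "0 < *". *)
type hint = { lhs : side; op : string; rhs : side }

(* Terms that may fill a side: every variable for a hole, "0" otherwise. *)
let fill (vars : string list) (s : side) : string list =
  match s with
  | Star -> vars
  | Zero -> [ "0" ]

(* All candidate refinements obtained from one hint. *)
let instantiate (vars : string list) (h : hint) : (string * string * string) list =
  List.concat
    (List.map
       (fun l -> List.map (fun r -> (l, h.op, r)) (fill vars h.rhs))
       (fill vars h.lhs))

let show ((l, op, r) : string * string * string) : string = l ^ op ^ r
```

With variables x and V, the hint 0<* produces the candidates 0<x and 0<V, and *<* produces four candidates including x<V.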
Refined Types & Data Structures
Idea: “Piggyback” Predicates over Types
1. Representation for Universal Data Properties
2. Algorithm for Instantiating Data Properties
3. Algorithm for Generalizing Data Properties
4. Algorithm for Inference
Free Representation
Free Algorithms
Verify Insertion Into List
insert :: (x: int, l: int list) → int list
Verify Insertion Into Sorted List
insert :: (x: int, l: sorted list) → sorted list
let rec insert (x, l) =
  match l with
  | [] -> x :: []                 (* G *)
  | h :: t ->                     (* I *)
    if x < h then x :: l          (* G *)
    else h :: insert (x, t)       (* G *)
Verification = Generalization + Instantiation
Generalize, Instantiate at same place as typechecker!
• Generalize (G) when adding to a structure.
• Instantiate (I) when taking from a structure.
Plan
• Contributions
• Types & Structures
• Refined Types & Data Structures
• Expressiveness
• Results
Refinements Determine Property
Type: the list type with edge refinement x<V
[Figure: refined type unfolded. Heads h1, h2, h3 satisfy h1<h2, h1<h3, h2<h3; the remaining elements x satisfy h1<x, h2<x, h3<x; the tail keeps x<V.]
Property: h1 < h2 < h3 < …
Property: sorted list
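Type: the list type with edge refinement x≠V
[Figure: refined type unfolded. Heads h1, h2, h3 satisfy h1≠h2, h1≠h3, h2≠h3; the remaining elements x satisfy h1≠x, h2≠x, h3≠x; the tail keeps x≠V.]
Property: h1 ≠ h2 ≠ h3 ≠ … (pairwise)
Property: duplicate-free list
Non-aliasing in Collections, e.g. list of distinct addresses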
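Swapping the edge predicate from < to ≠ changes the property from sortedness to distinctness while the shape of the check stays the same. A runtime sketch of that reading, with the hypothetical name duplicate_free:

```ocaml
(* Same recursion as a sortedness check, but the pushed-in edge
   predicate is h <> x instead of h < x. *)
let rec duplicate_free (l : int list) : bool =
  match l with
  | [] -> true
  | h :: t -> List.for_all (fun x -> h <> x) t && duplicate_free t
```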
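Trees: Type
[Figure: the tree type with constructors Leaf and Node, where Node holds x:int and two subtrees.]
Trees: Refined Type
[Figure: Node's left edge is refined with V<x and its right edge with x<V.]
Unfold Refined Type
[Figure: unfolding at a root r:int instantiates the edge refinements to V<r and r<V. Pushing them inside gives x<r for each node x of the left subtree and r<x for each node x of the right subtree; the subtrees keep their own edge refinements.]
Push edge predicate inside
LHS nodes < root < RHS nodes
Property: binary-search tree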
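The tree refinement can be read as the following runtime check: every node of the left subtree is below the root, every node of the right subtree is above it, and the same holds recursively. The type tree and the functions for_all and bst are names assumed for this sketch.

```ocaml
type tree =
  | Leaf
  | Node of int * tree * tree   (* value, left subtree, right subtree *)

(* "Every node value in t satisfies p". *)
let rec for_all (p : int -> bool) (t : tree) : bool =
  match t with
  | Leaf -> true
  | Node (x, l, r) -> p x && for_all p l && for_all p r

(* LHS nodes < root < RHS nodes, at every level. *)
let rec bst (t : tree) : bool =
  match t with
  | Leaf -> true
  | Node (r, left, right) ->
    for_all (fun x -> x < r) left
    && for_all (fun x -> r < x) right
    && bst left
    && bst right
```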
Refined Type
[Figure: the tree type with constructors Leaf and Node, where Node has subtrees l and r and is refined with |Hl – Hr| < 2.]
Refine Node
l, r = names of left, right trees
Hl, Hr = heights of left, right trees
Unfold Refined Type
[Figure: unfolding exposes a root with subtrees l1, r1 satisfying |Hl1 – Hr1| < 2; every Node in the subtrees keeps |Hl – Hr| < 2.]
Height balanced at each level
Property: balanced tree
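As a runtime reading of the node refinement |Hl – Hr| < 2, the sketch below recomputes heights and checks the bound at every node. The type tree and the functions height and balanced are hypothetical, and taking the height of a Leaf to be 0 is an assumption of this sketch.

```ocaml
type tree =
  | Leaf
  | Node of tree * tree   (* left subtree l, right subtree r *)

(* Height of a tree; a Leaf is assumed to have height 0. *)
let rec height (t : tree) : int =
  match t with
  | Leaf -> 0
  | Node (l, r) -> 1 + max (height l) (height r)

(* |Hl - Hr| < 2 at the root and, recursively, at each level. *)
let rec balanced (t : tree) : bool =
  match t with
  | Leaf -> true
  | Node (l, r) -> abs (height l - height r) < 2 && balanced l && balanced r
```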
Plan
• Contributions
• Types & Structures
• Refined Types & Data Structures
• Expressiveness
• Results
Our Inference Tool
[Figure: Dsolve takes an OCaml Program, a Specification, and Hints as input, and reports Safe or Unsafe.]
Data Structures
Program      Lines   Property
List-sort    111     Sorted, Outputs Permutation of Input
Map          98      Balance, BST, Implements a Set
Redblack     106     Balance, BST, Color
Stablesort   124     Sorted
Vec          343     Balance, Bounds Checking, …
BinHeap      122     Heap, Returns Minimum, …
SplayHeap    134     BST, Returns Minimum, Implements a Set
Malloc       71      Used and Free Lists Are Accurate
Bdd          206     Maintains Variable Order
UnionFind    65      Acyclic
SubvSolve    264     Acyclic
Total        1736
Data Structures
Program      Lines   Hints   Time (sec)
List-sort    111     7       5
Map          98      14      25
Redblack     106     2       29
Stablesort   124     1       4
Vec          343     9       87
BinHeap      122     6       33
SplayHeap    134     3       6
Malloc       71      2       2
Bdd          206     3       80
UnionFind    65      2       5
SubvSolve    264     2       20
Total        1736    54      300
Hints: 3% of code size
Vec: Extensible Arrays (317 LOC)
“Python-style” arrays for OCaml
find, insert, delete, join, etc.
Efficiency via balanced trees
Balanced
Height difference between siblings ≤ 2
Dsolve found balance violation
fatal off-by-one error
Recursive Rebalance
Debugging via Inference
Using Dsolve we found
Where imbalance occurred
(specific path conditions)
How imbalance occurred
(left tree off by up to 4)
Leading to test and fix
Plan
• Contributions
• Types & Structures
• Refined Types & Data Structures
• Expressiveness
• Results
http://pho.ucsd.edu/liquid
source, papers, demo, etc.
Data Precision
Types
Predicates
Lifting to Structures
Data Structure Verification
Conclusion
(Finite) Maps
[Figure: a graph over the nodes 1 to 5.]
n1.succs = [n2; n3; n4]
(node, node list) Map (Key: node, Data: node list)
Field Read/Get: n.succs corresponds to get succs n
Field Write/Set: n.succs := e corresponds to set succs n e
Refined Maps
[Figure: the same graph over the nodes 1 to 5.]
(node, node list) Map
(n:node, {x:node | n<x} list) Map
Acyclic Graph!
P(x0), P(x1), … versus ∀x. P(x)
How to Generalize? Refine poly-type for set (when setting key to data)
How to Instantiate? Refine poly-type for get (when getting data from key)
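The refined map type says that every successor of a key n is strictly greater than n, which rules out cycles. The sketch below models nodes as integers and checks that invariant at run time; IntMap, set, get and ordered are names assumed here, and representing nodes by int is a simplification.

```ocaml
(* Nodes are modelled as ints; the map sends a node to its successors. *)
module IntMap = Map.Make (struct
  type t = int
  let compare = compare
end)

(* Field write: n.succs := e *)
let set (succs : int list IntMap.t) (n : int) (e : int list) : int list IntMap.t =
  IntMap.add n e succs

(* Field read: n.succs (a missing key is treated as having no successors) *)
let get (succs : int list IntMap.t) (n : int) : int list =
  match IntMap.find_opt n succs with
  | Some l -> l
  | None -> []

(* (n:node, {x:node | n<x} list) Map: every edge goes to a larger node. *)
let ordered (succs : int list IntMap.t) : bool =
  IntMap.for_all (fun n l -> List.for_all (fun x -> n < x) l) succs
```

If every edge points to a strictly larger node, no path can return to where it started, which is why the refinement implies acyclicity.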
Textual Representation
μt. [] + ::(x: int, t)
type int list =
  | []
  | :: of x:int * int list
[Figure: the list type with element refinement 0<x and tail refinement x<V.]
<<>, <0<x, true>>
• <>: [] has no parameters
• 0<x: Refines Elements
• true: Refines Tail
[] + ::(x: {0 < x}, μt. [] + ::(x: int, <…>t))
μt. [] + ::(x: int, <<>, <x<V, true>>t)
Insertion Sort Type and Hint
let rec ins l x =
  match l with
  | [] -> x :: []
  | h :: xs ->
    if x < h then x :: h :: xs
    else h :: (ins xs x)

let insert_sort lst = List.fold_left ins [] lst
hint: * =< *
sorted = μt. [] + ::(x: int, <<>, <V >= x, true>>t)
insert_sort: sorted → sorted
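To try the example outside the verifier, the slide's two definitions can be run as ordinary OCaml together with a runtime check of the sorted property. The function sorted below is a hypothetical dynamic stand-in for the refined type, using the non-strict V >= x from the slide.

```ocaml
let rec ins l x =
  match l with
  | [] -> x :: []
  | h :: xs ->
    if x < h then x :: h :: xs
    else h :: (ins xs x)

let insert_sort lst = List.fold_left ins [] lst

(* Runtime reading of the refined type: every tail element v satisfies v >= h. *)
let rec sorted (l : int list) : bool =
  match l with
  | [] -> true
  | h :: t -> List.for_all (fun v -> v >= h) t && sorted t
```

For example, insert_sort [3; 1; 2; 1] evaluates to [1; 1; 2; 3], which satisfies sorted.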