the benefits of exposing calls and returns rajeev alur university of pennsylvania concur/spin,...
Post on 21-Dec-2015
216 views
TRANSCRIPT
The Benefits of Exposing Calls and Returns
Rajeev AlurUniversity of Pennsylvania
CONCUR/SPIN, August 2005
Software Model Checking
Code Abstractor Model
Verifier Specification
Yes
Counter-example
Predicate abstraction
Control flow graph +Boolean vars
(Pushdown automata)
On-the-fly explicit stateSymbolic fixpoint evaluation
Temporal logics/Automata Regular!
Observables
Regular specifications not expressive enough
Classical Hoare-style pre/post conditions If p holds when procedure A is invoked, q holds upon return Total correctness: every invocation of A terminates Integral part of emerging standard JML
Stack inspection properties: security/access control If a setuuid bit is being set, process root must be in the call
stack
Inter-procedural data-flow analysis An expression e is very busy at a control point p if on all
paths from p, e will be used before any of its variables (possibly local in current procedure) are modified
Need matching of calls with returns, or finding pending calls, or local paths --- Context-free properties!
Checking Context-free Specifications
Obstcales Context-free languages are not closed under intersection Checking context-free properties against context-free models
is undecidable
However, many such properties are verifiable Existing work in security that handles some stack inspection
properties [JMT99,JKS03] Adding assert statements in the program (with additional
local variables, if needed), and then checking regular properties (e.g. reachability) amounts to checking context-free properties
Inter-procedural data-flow analysis algorithms [RHS95]
Exposing Calls and Returns
What’s common to the checkable properties? Both model and property have their own stack, but the two
stacks are synchronized and grow/shrink together!
As a generator, program exposes its calls and returns, and as an acceptor, property pushes on calls and pops on returns
Formalization of this intuition: Visibly Pushdown Languages
A surprisingly robust class of languages with properties like the regular languages and potentially many applications
Talk Outline
Visibly Pushdown Languages
Temporal Logic CaRet and Model Checking
Ongoing Work
References:
Visibly pushdown languages
Alur, Madhusudan; STOC’04 A temporal logic of nested calls and returns
Alur, Etessami, Madhusudan; TACAS’04 Congruences for visibly pushdown languages
Alur, Kumar, Madhusudan, Viswanthan; ICALP’05
Context-free Languages: Recap (1/2)
Given an alphabet , a language L is a set of finite words over
A pushdown automaton (PDA) has a finite control and a stack, and while reading a word, it can push/pop stack symbols while updating control state
Configuration of a PDA: control state + a string of stack symbols
Acceptance defined by empty stack or final state A language L is a context-free language (CFL) if
there is a pushdown automaton that accepts it Sample CFLs
All regular languages Set of words of the form an bn, for some n Set of words with equal number of a and b symbols
Non-CFL: Set of words of the form an bn cn
Context-free Languages: Recap (2/2)
Alternative characterization: Context-free grammars Natural and popular for defining syntax Nondeterministic PDAs are more expressive than
deterministic ones Emptiness of a PDA solvable in polynomial-time Closed under union, but not closed under
intersection or complementation Language inclusion, emptiness of intersection
undecidable Applications: Parsing, Natural language processing,
Program analysis…
Exposing Calls and Returns
Pushdown alphabet: partitioned into 3 disjoint sets
Σ = push pop local
Pushdown words: finite words over pushdown Σ
A visibly pushdown automaton over a pushdown alphabet Σ is a pushdown automaton that pushes a symbol onto the stack on a symbol in push
pops the stack on a symbol in pop
cannot change the stack on a symbol in local
Key: Stack size at any time is determined by the input wordbut not control state or stack content
A language L is a VPL over a pushdown alphabet Σ, if there is a visibly pushdown automaton that accepts it (acceptance by final state)
The language {an bn | for some n} VPL if a is in push and b is in pop
Not a VPL for other partitions
The language of words with equal number of a and b symbols is not a VPL (independent of partition)
Every regular language L is a VPL independent of partitioning
Dyck language (words with well-balanced parantheses) is a VPL provided left/right parantheses are in push/pop resp
Visibly pushdown languages (VPL)
VPLs in Program AnalysisProgram
bool P(u:int) { global int x; local int y; … a: if Q { x = (x+y) }; …}
bool Q { local int y; if { …. y++; return 1;} else return P(x)}
To figure out whether the expressione=(x+y) is very busy at program point a,
push = {call-p, call-q}pop = {ret-p, ret-q}local = {used-e, mod-x, mod-y, skip}
Executions are pushdown words, e.g.
call-q, skip, mod-y, ret-q, used-e, mod-x, skip, ret-p
Set of executions starting at a location a is a VPL: LaSet of executions in which e is very-busy is also a VPL: Le e is very busy at a if La is included in Le
Analysis
VPLs for Document ProcessingXML Document
<conference> <name> CONCUR 2005 </name> <location> <city> San Francisco </city> <hotel> Stanford Court </hotel> </location> <sponsor> CISCO </sponsor> <sponsor> Microsoft </sponsor> …</conference>
Pushdown alphabet
push = {<name>, <location>, …}pop = {</name>, </location>, …}local = {San Francisco, Microsoft, …}
A document d is a pushdown word
Sample Query: Find documents related to conferences sponsored by Microsoft inSan Francisco
Specify query as a VPL: LAnalysis: Membership question Does document d satisfy query L ?
Use VPAs instead of tree automata!(typically, no recursion, but only hierarchy)
Query Processing
Note: can’t combine languages wrt different partitions
Closed under intersection: Given two VPAs A and B, build a product C accepting intersection of L(A) and L(B) State of C: (state of A, state of B) Stack symbol of C: (stack symbol of A, stack symbol of B) C can simulate the stacks of A and B together
Closed under union Closed under complementation Closed under concatenation and Kleene-* Closed under partition-preserving homomorphisms
Closure Properties
Given a nondeterministic VPA A, we can construct a deterministic VPA B that accepts the same language and has size exponential in A
Potentially useful for building runtime monitors for checking program executions, and online algorithms for XML query processing
VPLs are a subclass of DCFLs (languages defined by deterministic PDAs) DCFLs not closed under union Equivalence problem for DCFLs decidable, but complex
Determinization
Determinization of nondeterministic automata uses subset construction: a state R of B is a set of states of A (the states that A can be, having read the word w so far)
Subset construction does not apply to stack But we can do subsets of summaries: if w is a well-
matched word, (q,q’) is a summary of A on w, if A can go from (q,$) to (q’,$), where $ is stack bottom
More precisely, if w=w1c1w2c2…cnwn+1, where ci’s are calls and wi’s are well-matched words, then after reading w, determinized automaton B has Stack is (Sn,Rn,cn),….(S1,R1,c1)$
Control state is (Sn+1,Rn+1)
Ri = Set of all states A can be in after w1c1…wi
Si= Set of all summaries of A on the segment wi
Determinization: Sketch of the construction
Emptiness: Given a VPA A, is its language empty? Same as for PDAs: Polynomial-time complete (cubic)
Language inclusion (or equivalence): Given VPAs A and B, is language of A contained in that of B? Determinize B, take its complement, take product with A,
and test for emptiness Exponential-time complete Recall: Inclusion is PSPACE-complete for (nondeterministic)
finite automata, and undecidable for PDAs
Decision Problems
VPL Properties Summary
Regular
CFL
DCFL
VPL
L Emptiness Inclusion
Yes Yes Yes
Yes No No
No
UndecPTIME
No Yes PTIME Undec
Yes Yes Yes PTIME Exptime
NLOG Pspace
Pushdown Words as Binary Trees
Let w = i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3
i5
c1
r1i1
c2
i4
i3
i3
i5
i3r2
Stack trees
r1
c1
i1
VPL: Connection to tree languages
Tree-language characterization:
Let L be a set of pushdown words and let ST(L) be the set of stack trees that correspond to L.
Theorem: L is a VPL iff ST(L) is a regular tree language
Note: It is well-known that the set of parse trees correspondingto a context-free grammar is a regular tree language
Finite word automata that can jump
Let w = i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3
Summary Automata Finite-state automaton that reads pushdown word While reading a call, can send a copy to matching return (q,a) is a set of pairs of states if a is in push
Nondeterministic summary automata are expressively equivalent to VPAs
Deterministic VPA (= VPL)> Deterministic summary automata > Deterministic tree automata (on stack trees)
Robustness: Alternative Characterizations
Monadic second order logic with matching predicate (x,y) means x is a call and y is matching return Sample formula:
forall x. if p(x) then exists y,z. ( q(y) and x<y<z and (x,z) )
Thm. MSO + matching predicate interpreted over pushdown words is expressively equivalent to VPLs
Thm: Every CFL is a homomorphic image of a VPL
Context-free grammar based characterization Two types of non-terminals V0 (matched words) and V1
All productions are of the formX a if X is in V0 then a must be local
X a Y b Z a is a call, b is a return, Y is in V0
if X is in V0 then Z must be in V0
“Regular-like” properties continue..
Congruences and minimization (Myhill-Nerode Theorem) central to theory of regular languages
Given a language L, for well-matched words u and v, define u ~L v iff for all words x and y, xuy in L iff xvy in L
Theorem: A language L of well-matched words is a VPL iff the congruence ~L is of finite index
Minimization No unique minimal deterministic VPA in general, but… Minimization of (single-entry) RSMs (i.e. procedural
boolean programs) possible. Partitioning into k procedures/modules is adequate to get canonicity!
ω-VPL - Extension to Infinite Words
A Büchi VPA: VPA over infinite pushdown words A word is accepted if along a run, the set F is seen infinitely often
ω-VPL – class of languages accepted by Büchi VPAs
ω-VPL is closed under all Boolean operations
Characterization using regular trees and MSO characterization hold.
However, ω-VPLs are not determinizable!
Let L be set of all words such that the stack is repeatedly bounded i.e. for some n, the stack depth is n infinitely often. L is an ω-VPL but there is no deterministic (Muller) VPA for it Language inclusion and equivalence are still decidable
Talk Outline
Visibly Pushdown Languages
Temporal Logic CaRet and Model Checking
Ongoing Work
Software Model Checking
Code Abstractor Model
Verifier Specification
Yes
Counter-example
Predicate abstraction
Control flow graph +Boolean vars
(Pushdown automata)
CaRet/VPAs
Observables
Abstracting Software
int x, y;
if x>0 { ……. y=x+1 .……}else { …… y=x+1 ……}
bx: x>0
by: y>0
Program
bool bx, by;
if bx { ……… by=true ………}else { ………… by={true,false} ……….}
Boolean Program
Abstracting Modular Programs
main() { bool y; … x = P(y); … z = P(x); …}bool P(u: bool) {…return Q(u);}bool Q(w: bool) { if … else return P(~w)}
A2
A1
A3
A2
A2
A3
A3
A1Entry/Inputs
Exit/outputs
Box (function-calls)
Program Recursive State Machine (RSM)/ Pushdown automaton
Linear-time Propositional Temporal Logic
Q ::- p | not Q | Q or Q’ | Next Q | Always Q | Eventually Q | Q Until Q’
Interpreted over (infinite) sequences.Models of an LTL formula is a -regular language.Useful for stating sequencing properties:
If req happens, then req holds until it is granted: Always ( req → (req Until grant) )
An exception is never raised: Always ( not Exception )
CARET
CARET: A temporal logic for Calls and Returns Expresses context-free properties
A
B
C
A
Global successor used by LTL
………….
CARET
CARET: A temporal logic for Calls and Returns Expresses context-free properties
A
B
C
D
Global successor used by LTL
………….
Local successor: Jump from calls to returns Otherwise global successor at the same level
CARET
CARET: A temporal logic for Calls and Returns Expresses context-free properties
A
B
C
A
Global successor used by LTL
………….
Local successor: Jump from calls to returns Otherwise global successor at the same level
CARET
CARET: A temporal logic for Calls and Returns Expresses context-free properties
A
B
C
A
Global successor used by LTL
………….
Local successor: Jump from calls to returns Otherwise global successor at the same level
Local path
CARET
CARET: A temporal logic for Calls and Returns Expresses context-free properties
A
B
C
A
Global successor used by LTL
………….
Local successor: Jump from calls to returns Otherwise global successor at the same level
Caller modality: Jump to the caller of the current module Defined for every position except top-level ones
CARET
CARET: A temporal logic for Calls and Returns Expresses context-free properties
A
B
C
A
Global successor used by LTL
………….
Abstract successor: Jump from calls to returns Otherwise global successor at the same level
Caller modality: Jump to the caller of the current module Defined for every position except top-level ones
Caller path gives the stack content!
CARET Definition
Syntax: Q ::- p | not Q | Q or Q’ |
Next Q | Always Q | Eventually Q | Q Until Q’
Local-Next Q | Local-Always Q Local-Eventually Q | Q Local-Until Q’
Caller Q | CallerPath-Always Q CallerPath-Eventually Q | Q CallerPath-Until Q’
Local- and Caller- versions of all temporal operators All these operators can be nested
Expressing properties in Caret
Pre-post conditions:If P holds when A is called, then Q must hold when
the call returns
Always ( (P and call-to-A) Local-Next Q )
A
PQ
Integrating Manna/Pnueli-style reasoning for reactive computationswith Hoare-style reasoning for structured programs
Expressing properties in Caret
If A is called with low priority, then it cannot access the file Always ( call-to-A and low-priority
Local-Always ( not access-file ) )
Alow-priority
Ahigh-priority access-file
Expressing properties in Caret
Stack inspection properties
If a variable x is accessed, then A must be on the call stack Always ( access-to-x
CallerPath-Eventually call-to-A )
access-to-x
A
Model checking CARET
Given: A (boolean) recursive state machine/ visibly pushdown automaton M A CARET formula Q Model-checking: Do all runs of M satisfy the specification Q?
CARET can be model-checked in time that is polynomial in M and exponential in Q.
|M|3 . 2O(|Q|)
Complexity class same as that for LTL !Generalization of Vardi-Wolper construction
Model-checking CARET: Intuition
The specification matches calls and returns of the program, so the runs of the program and models of the formula are both visibly pushdown languages
Given M and formula Q, Build a Buchi pushdown automaton that accepts words exhibited by M that satisfy (not Q) Check this pushdown automaton for emptiness Construction builds on the classical tableaux for LTL
s, Q1
sPush s and Q1
Local-Next Q1 Pop s and Q1 ;
Check Q1
Conclusions and Ongoing Work
Exposing calls and returns lets you hide the stack! VPLs seem robust and adequate to model software
analysis problems VPL-triggered research
Dynamic logic with VPL (Loding,Serre) Visibly pushdown games (Loding,Madhusudan,Serre) XML query processing (Pitcher) Third-order Algol with iteration (Murawski,Walukiewicz)
Active area of current research DTDs, XML, and query languages Branching-time logics, Fixpoint calculus, and visibly
pushdown tree automata (Alur, Chaudhuri, Madhusudan) Expressive completeness of temporal operators Implementing a model checker for VPL monitors