the benefits of exposing calls and returns rajeev alur university of pennsylvania concur/spin,...

The Benefits of Exposing Calls and Returns

Rajeev Alur, University of Pennsylvania

CONCUR/SPIN, August 2005

Software Model Checking

[Diagram: Code → Abstractor → Model → Verifier ← Specification; the Verifier answers Yes or returns a counter-example]

Abstractor: predicate abstraction
Model: control flow graph + Boolean vars (pushdown automata)
Verifier: on-the-fly explicit state, symbolic fixpoint evaluation
Specification: temporal logics/automata. Regular!
Observables

Regular specifications not expressive enough

Classical Hoare-style pre/post conditions
  If p holds when procedure A is invoked, q holds upon return
  Total correctness: every invocation of A terminates
  Integral part of emerging standard JML

Stack inspection properties: security/access control
  If a setuid bit is being set, process root must be in the call stack

Inter-procedural data-flow analysis
  An expression e is very busy at a control point p if on all paths from p, e will be used before any of its variables (possibly local in current procedure) are modified

Need matching of calls with returns, or finding pending calls, or local paths: context-free properties!

Checking Context-free Specifications

Obstacles
  Context-free languages are not closed under intersection
  Checking context-free properties against context-free models is undecidable

However, many such properties are verifiable
  Existing work in security that handles some stack inspection properties [JMT99,JKS03]
  Adding assert statements in the program (with additional local variables, if needed), and then checking regular properties (e.g. reachability) amounts to checking context-free properties
  Inter-procedural data-flow analysis algorithms [RHS95]

Exposing Calls and Returns

What’s common to the checkable properties?
  Both model and property have their own stack, but the two stacks are synchronized and grow/shrink together!

As a generator, program exposes its calls and returns, and as an acceptor, property pushes on calls and pops on returns

Formalization of this intuition: Visibly Pushdown Languages

A surprisingly robust class of languages with properties like the regular languages and potentially many applications

Talk Outline

Visibly Pushdown Languages

Temporal Logic CaRet and Model Checking

Ongoing Work

References:

Visibly pushdown languages. Alur, Madhusudan; STOC’04
A temporal logic of nested calls and returns. Alur, Etessami, Madhusudan; TACAS’04
Congruences for visibly pushdown languages. Alur, Kumar, Madhusudan, Viswanathan; ICALP’05

Context-free Languages: Recap (1/2)

Given an alphabet Σ, a language L is a set of finite words over Σ

A pushdown automaton (PDA) has a finite control and a stack, and while reading a word, it can push/pop stack symbols while updating control state

Configuration of a PDA: control state + a string of stack symbols

Acceptance defined by empty stack or final state

A language L is a context-free language (CFL) if there is a pushdown automaton that accepts it

Sample CFLs
  All regular languages
  Set of words of the form a^n b^n, for some n
  Set of words with equal number of a and b symbols

Non-CFL: Set of words of the form a^n b^n c^n

Context-free Languages: Recap (2/2)

Alternative characterization: Context-free grammars
  Natural and popular for defining syntax
Nondeterministic PDAs are more expressive than deterministic ones
Emptiness of a PDA solvable in polynomial-time
Closed under union, but not closed under intersection or complementation
Language inclusion, emptiness of intersection undecidable
Applications: Parsing, Natural language processing, Program analysis…

Exposing Calls and Returns

Pushdown alphabet: partitioned into 3 disjoint sets

  Σ = push ∪ pop ∪ local

Pushdown words: finite words over a pushdown alphabet Σ

A visibly pushdown automaton over a pushdown alphabet Σ is a pushdown automaton that
  pushes a symbol onto the stack on a symbol in push
  pops the stack on a symbol in pop
  cannot change the stack on a symbol in local

Key: Stack size at any time is determined by the input word, but not by control state or stack content
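The key observation can be made concrete with a small Python sketch. The partition below (`call` in push, `ret` in pop, everything else local), the function name `stack_depths`, and the choice to leave the depth at 0 when a pop arrives on an empty stack are assumptions made for illustration only.

```python
# Hypothetical partition, chosen only for this illustration.
PUSH = {"call"}
POP = {"ret"}


def stack_depths(word, push=PUSH, pop=POP):
    """Return the stack depth after each symbol of a pushdown word.

    The depth is computed from the word and the partition alone; no
    control state or stack content is consulted.
    Assumption: a pop on an empty stack leaves the depth at 0.
    """
    depth = 0
    depths = []
    for sym in word:
        if sym in push:
            depth += 1
        elif sym in pop:
            depth = max(0, depth - 1)
        # symbols outside push and pop are local: depth unchanged
        depths.append(depth)
    return depths


example = ["call", "x", "call", "ret", "ret", "ret"]
print(stack_depths(example))
```

Any two visibly pushdown automata over the same partition therefore have stacks of the same height at every position of the same input.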

Visibly pushdown languages (VPL)

A language L is a VPL over a pushdown alphabet Σ, if there is a visibly pushdown automaton that accepts it (acceptance by final state)

The language {a^n b^n | for some n} is a VPL if a is in push and b is in pop
  Not a VPL for other partitions

The language of words with equal number of a and b symbols is not a VPL (independent of partition)

Every regular language L is a VPL independent of partitioning

Dyck language (words with well-balanced parentheses) is a VPL provided left/right parentheses are in push/pop resp.
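As a rough illustration of the a^n b^n example, here is a minimal Python sketch of a deterministic visibly pushdown automaton with acceptance by final state. The class name `VPA`, the state names, the stack symbols `Z` and `A`, the marker `BOTTOM` for the empty stack, and the choice n ≥ 1 are all assumptions made for this sketch, not notation from the talk.

```python
BOTTOM = "$"  # assumed marker that a pop sees when the stack is empty


class VPA:
    """A deterministic visibly pushdown automaton (illustrative sketch)."""

    def __init__(self, push, pop, start, finals, d_push, d_pop, d_local):
        self.push, self.pop = push, pop
        self.start, self.finals = start, finals
        self.d_push = d_push    # (state, symbol) -> (state, stack symbol)
        self.d_pop = d_pop      # (state, symbol, top) -> state
        self.d_local = d_local  # (state, symbol) -> state

    def accepts(self, word):
        state, stack = self.start, []
        for sym in word:
            if sym in self.push:
                step = self.d_push.get((state, sym))
                if step is None:
                    return False
                state, gamma = step
                stack.append(gamma)
            elif sym in self.pop:
                top = stack.pop() if stack else BOTTOM
                state = self.d_pop.get((state, sym, top))
            else:
                state = self.d_local.get((state, sym))
            if state is None:  # missing transition: reject
                return False
        return state in self.finals


# a is in push, b is in pop. The first a pushes Z so that the automaton
# can tell when the last b has been matched.
anbn = VPA(
    push={"a"}, pop={"b"}, start="start", finals={"done"},
    d_push={("start", "a"): ("as", "Z"), ("as", "a"): ("as", "A")},
    d_pop={("as", "b", "A"): "bs", ("as", "b", "Z"): "done",
           ("bs", "b", "A"): "bs", ("bs", "b", "Z"): "done"},
    d_local={},
)

for w in ["ab", "aabb", "aab", "abb", "ba", ""]:
    print(repr(w), anbn.accepts(w))
```

Note that the automaton never chooses whether to push or pop; the input symbol alone decides, which is exactly the restriction that separates VPAs from general PDAs.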

VPLs in Program Analysis

Program

  bool P(u:int) {
    global int x; local int y;
    …
    a: if Q { x = (x+y) };
    …
  }

  bool Q {
    local int y;
    if { …. y++; return 1; }
    else return P(x)
  }

Analysis

To figure out whether the expression e = (x+y) is very busy at program point a, use the pushdown alphabet

  push = {call-p, call-q}
  pop = {ret-p, ret-q}
  local = {used-e, mod-x, mod-y, skip}

Executions are pushdown words, e.g.

  call-q, skip, mod-y, ret-q, used-e, mod-x, skip, ret-p

Set of executions starting at a location a is a VPL: La
Set of executions in which e is very-busy is also a VPL: Le
e is very busy at a if La is included in Le
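To see why the property needs the call stack, the following Python sketch checks a single execution against one possible reading of "e is used before any of its variables are modified". The assumptions are loud ones: x is global, so `mod-x` at any depth invalidates e; y is local, so `mod-y` and `used-e` only count at the depth where the execution started; and returning from the starting procedure before e is used counts as failure. The function name `very_busy_on` is hypothetical. Checking one execution is of course not the inclusion test La ⊆ Le itself.

```python
def very_busy_on(execution):
    """Check one execution for the very-busy property of e = (x+y).

    Assumptions (not from the slides): x is global, y is local to the
    procedure where the execution starts, and events inside nested calls
    refer to the callee's own y.
    """
    depth = 0  # call depth relative to the starting point
    for ev in execution:
        if ev.startswith("call-"):
            depth += 1
        elif ev.startswith("ret-"):
            if depth == 0:
                return False  # starting procedure returned before e was used
            depth -= 1
        elif ev == "mod-x":
            return False      # global x modified before e was used
        elif ev == "mod-y" and depth == 0:
            return False      # the caller's own y modified
        elif ev == "used-e" and depth == 0:
            return True       # e used before any relevant modification
    return False


slide_execution = ["call-q", "skip", "mod-y", "ret-q",
                   "used-e", "mod-x", "skip", "ret-p"]
print(very_busy_on(slide_execution))
```

On the sample execution, the `mod-y` event happens inside the call to Q and so touches a different y; a specification that cannot match `call-q` with `ret-q` would have no way to tell the two apart.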

VPLs for Document Processing

XML Document

  <conference>
    <name> CONCUR 2005 </name>
    <location>
      <city> San Francisco </city>
      <hotel> Stanford Court </hotel>
    </location>
    <sponsor> CISCO </sponsor>
    <sponsor> Microsoft </sponsor>
    …
  </conference>

Query Processing

Pushdown alphabet

  push = {<name>, <location>, …}
  pop = {</name>, </location>, …}
  local = {San Francisco, Microsoft, …}

A document d is a pushdown word

Sample Query: Find documents related to conferences sponsored by Microsoft in San Francisco

Specify query as a VPL: L
Analysis: Membership question. Does document d satisfy query L?

Use VPAs instead of tree automata! (typically, no recursion, but only hierarchy)
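The membership question can be sketched in Python by reading a document as a pushdown word in a single left-to-right pass with one stack. This is a hand-written recognizer for the sample query only, not a general VPA; the tokenizer, the function names, and the exact element paths required by the query (`conference/sponsor` and `conference/location/city`) are assumptions made for illustration.

```python
import re


def tokenize(doc):
    """Split a document into open tags (push), close tags (pop), text (local)."""
    tokens = []
    for tok in re.findall(r"<[^>]+>|[^<]+", doc):
        tok = tok.strip()
        if tok:
            tokens.append(tok)
    return tokens


def satisfies_query(doc):
    """Is doc a well-matched conference document sponsored by Microsoft
    and located in San Francisco? (assumed reading of the sample query)"""
    stack = []
    sponsor_ok = city_ok = False
    for tok in tokenize(doc):
        if tok.startswith("</"):            # pop symbol
            if not stack or stack.pop() != tok[2:-1]:
                return False                # mismatched close tag
        elif tok.startswith("<"):           # push symbol
            stack.append(tok[1:-1])
        else:                               # local symbol
            path = tuple(stack)
            if path == ("conference", "sponsor") and tok == "Microsoft":
                sponsor_ok = True
            if path == ("conference", "location", "city") and tok == "San Francisco":
                city_ok = True
    return not stack and sponsor_ok and city_ok


DOC = """
<conference> <name> CONCUR 2005 </name>
  <location> <city> San Francisco </city> <hotel> Stanford Court </hotel> </location>
  <sponsor> CISCO </sponsor> <sponsor> Microsoft </sponsor>
</conference>
"""
print(satisfies_query(DOC))
```

Because the document is consumed as a stream, the memory used is bounded by the nesting depth rather than by the document size, which is the appeal of word automata over tree automata here.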

Closure Properties

Closed under intersection: Given two VPAs A and B, build a product C accepting intersection of L(A) and L(B)
  State of C: (state of A, state of B)
  Stack symbol of C: (stack symbol of A, stack symbol of B)
  C can simulate the stacks of A and B together

Closed under union
Closed under complementation
Closed under concatenation and Kleene-*
Closed under partition-preserving homomorphisms

Note: can’t combine languages wrt different partitions
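The product construction for intersection can be sketched in Python for deterministic VPAs. The `VPA` tuple layout, the `BOTTOM` marker for the empty stack, and the two example automata (a^n b^n with n ≥ 1, and "even number of a's") are assumptions made for this illustration.

```python
from collections import namedtuple

BOTTOM = "$"  # assumed marker that a pop sees when the stack is empty
VPA = namedtuple("VPA", "push pop start finals d_push d_pop d_local")


def accepts(m, word):
    state, stack = m.start, []
    for sym in word:
        if sym in m.push:
            step = m.d_push.get((state, sym))
            if step is None:
                return False
            state, gamma = step
            stack.append(gamma)
        elif sym in m.pop:
            top = stack.pop() if stack else BOTTOM
            state = m.d_pop.get((state, sym, top))
        else:
            state = m.d_local.get((state, sym))
        if state is None:
            return False
    return state in m.finals


def product(a, b):
    """Product automaton for L(a) ∩ L(b); both must share one partition."""
    assert a.push == b.push and a.pop == b.pop
    d_push = {((qa, qb), s): ((qa2, qb2), (ga, gb))
              for (qa, s), (qa2, ga) in a.d_push.items()
              for (qb, s2), (qb2, gb) in b.d_push.items() if s == s2}
    d_pop = {}
    for (qa, s, ga), qa2 in a.d_pop.items():
        for (qb, s2, gb), qb2 in b.d_pop.items():
            # The stacks grow and shrink together, so they are empty together.
            if s != s2 or (ga == BOTTOM) != (gb == BOTTOM):
                continue
            top = BOTTOM if ga == BOTTOM else (ga, gb)
            d_pop[((qa, qb), s, top)] = (qa2, qb2)
    d_local = {((qa, qb), s): (qa2, qb2)
               for (qa, s), qa2 in a.d_local.items()
               for (qb, s2), qb2 in b.d_local.items() if s == s2}
    finals = {(fa, fb) for fa in a.finals for fb in b.finals}
    return VPA(a.push, a.pop, (a.start, b.start), finals, d_push, d_pop, d_local)


# A accepts a^n b^n (n >= 1); B accepts words with an even number of a's.
A = VPA({"a"}, {"b"}, "start", {"done"},
        {("start", "a"): ("as", "Z"), ("as", "a"): ("as", "A")},
        {("as", "b", "A"): "bs", ("as", "b", "Z"): "done",
         ("bs", "b", "A"): "bs", ("bs", "b", "Z"): "done"}, {})
B = VPA({"a"}, {"b"}, "even", {"even"},
        {("even", "a"): ("odd", "g"), ("odd", "a"): ("even", "g")},
        {(p, "b", t): p for p in ("even", "odd") for t in ("g", BOTTOM)}, {})
C = product(A, B)
print(accepts(C, "aabb"), accepts(C, "ab"))
```

The construction works only because both automata push and pop on the same symbols; with different partitions the two stacks would drift apart and a single product stack could not represent both.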

Determinization

Given a nondeterministic VPA A, we can construct a deterministic VPA B that accepts the same language and has size exponential in A

Potentially useful for building runtime monitors for checking program executions, and online algorithms for XML query processing

VPLs are a subclass of DCFLs (languages defined by deterministic PDAs)
  DCFLs not closed under union
  Equivalence problem for DCFLs decidable, but complex

Determinization: Sketch of the construction

Determinization of nondeterministic automata uses subset construction: a state R of B is a set of states of A (the states that A can be in, having read the word w so far)

Subset construction does not apply to stack

But we can do subsets of summaries: if w is a well-matched word, (q,q’) is a summary of A on w, if A can go from (q,$) to (q’,$), where $ is stack bottom

More precisely, if w = w1 c1 w2 c2 … cn wn+1, where ci’s are calls and wi’s are well-matched words, then after reading w, determinized automaton B has
  Stack is (Sn,Rn,cn), …, (S1,R1,c1) $
  Control state is (Sn+1,Rn+1)
  Ri = Set of all states A can be in after w1 c1 … wi
  Si = Set of all summaries of A on the segment wi
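The summary-based construction can be sketched in Python as an on-the-fly simulation that keeps the pair (S, R) as control state and pushes (S, R, c) on calls. This is an illustrative reading of the sketch above, not the construction from the paper verbatim: the example automaton (it guesses a call whose matched body contains an x), the treatment of returns on an empty stack, and all names are assumptions. A brute-force simulation over configurations is included only as a cross-check.

```python
import itertools

BOTTOM = "$"
Q, Q0, F = {"idle", "look", "seen", "ok"}, {"idle"}, {"ok"}
CALLS, RETS = {"c"}, {"r"}
D_CALL = {("idle", "c"): {("idle", "N"), ("look", "M")},
          ("look", "c"): {("look", "N")}, ("seen", "c"): {("seen", "N")},
          ("ok", "c"): {("ok", "N")}}
D_RET = {("idle", "r", "N"): {"idle"}, ("idle", "r", BOTTOM): {"idle"},
         ("look", "r", "N"): {"look"}, ("seen", "r", "N"): {"seen"},
         ("seen", "r", "M"): {"ok"}, ("ok", "r", "N"): {"ok"},
         ("ok", "r", BOTTOM): {"ok"}}
D_LOC = {("idle", "x"): {"idle"}, ("look", "x"): {"seen"},
         ("seen", "x"): {"seen"}, ("ok", "x"): {"ok"}}
IDENT = {(q, q) for q in Q}


def det_accepts(word):
    """Deterministic simulation: control state (S, R), stack of (S, R, c)."""
    S, R, stack = set(IDENT), set(Q0), []
    for a in word:
        if a in CALLS:
            stack.append((S, R, a))
            R = {q2 for q in R for (q2, _) in D_CALL.get((q, a), ())}
            S = set(IDENT)  # a fresh well-matched segment starts here
        elif a in RETS and stack:
            S_old, R_old, c = stack.pop()
            # state before the call -> state after the matching return
            update = {(q, q3) for q in Q
                      for (q1, g) in D_CALL.get((q, c), ())
                      for (p, q2) in S if p == q1
                      for q3 in D_RET.get((q2, a, g), ())}
            S = {(q, q3) for (q, p) in S_old for (p2, q3) in update if p == p2}
            R = {q3 for (q, q3) in update if q in R_old}
        else:  # local symbol, or a return on the empty stack (assumption)
            if a in RETS:
                step = lambda q: D_RET.get((q, a, BOTTOM), ())
            else:
                step = lambda q: D_LOC.get((q, a), ())
            S = {(q, q2) for (q, p) in S for q2 in step(p)}
            R = {q2 for q in R for q2 in step(q)}
    return bool(R & F)


def nondet_accepts(word):
    """Reference: track every (state, stack) configuration explicitly."""
    configs = {(q, ()) for q in Q0}
    for a in word:
        nxt = set()
        for q, st in configs:
            if a in CALLS:
                nxt |= {(q2, st + (g,)) for (q2, g) in D_CALL.get((q, a), ())}
            elif a in RETS:
                top, rest = (st[-1], st[:-1]) if st else (BOTTOM, ())
                nxt |= {(q2, rest) for q2 in D_RET.get((q, a, top), ())}
            else:
                nxt |= {(q2, st) for q2 in D_LOC.get((q, a), ())}
        configs = nxt
    return any(q in F for q, _ in configs)


def agree_up_to(n):
    """Do both simulations agree on every word over {c, r, x} up to length n?"""
    return all(det_accepts(w) == nondet_accepts(w)
               for k in range(n + 1) for w in itertools.product("crx", repeat=k))
```

The deterministic simulation stores one entry per pending call, while the reference simulation may carry many distinct stacks at once; the summaries in S are what let the single stack stand in for all of them.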

Decision Problems

Emptiness: Given a VPA A, is its language empty?
  Same as for PDAs: Polynomial-time complete (cubic)

Language inclusion (or equivalence): Given VPAs A and B, is language of A contained in that of B?
  Determinize B, take its complement, take product with A, and test for emptiness
  Exponential-time complete
  Recall: Inclusion is PSPACE-complete for (nondeterministic) finite automata, and undecidable for PDAs

VPL Properties Summary

          Closure under                        Decision problems
          Union   Intersection   Complement    Emptiness   Inclusion
Regular   Yes     Yes            Yes           NLOG        Pspace
CFL       Yes     No             No            PTIME       Undec
DCFL      No      No             Yes           PTIME       Undec
VPL       Yes     Yes            Yes           PTIME       Exptime

Pushdown Words as Binary Trees

Let w = i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3

Stack trees

[Figure: stack tree of w]

VPL: Connection to tree languages

Tree-language characterization:

Let L be a set of pushdown words and let ST(L) be the set of stack trees that correspond to L.

Theorem: L is a VPL iff ST(L) is a regular tree language

Note: It is well-known that the set of parse trees corresponding to a context-free grammar is a regular tree language

Finite word automata that can jump

Let w = i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3

Summary Automata
  Finite-state automaton that reads pushdown word
  While reading a call, can send a copy to matching return
  The transition δ(q,a) is a set of pairs of states if a is in push

Nondeterministic summary automata are expressively equivalent to VPAs

Deterministic VPA (= VPL) > Deterministic summary automata > Deterministic tree automata (on stack trees)
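The "jump" from a call to its matching return can be computed with one pass and a stack of pending call positions. The Python sketch below does this for the sample word w; it assumes, as the naming suggests, that symbols starting with c are calls, symbols starting with r are returns, and symbols starting with i are internal. The function name `matching` is hypothetical.

```python
def matching(word, is_call, is_return):
    """Map each call position to the position of its matching return."""
    pending, match = [], {}
    for i, sym in enumerate(word):
        if is_call(sym):
            pending.append(i)
        elif is_return(sym) and pending:
            match[pending.pop()] = i
    return match


w = "i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3".split()
jumps = matching(w, lambda s: s.startswith("c"), lambda s: s.startswith("r"))
for call, ret in sorted(jumps.items()):
    print(f"{w[call]} at position {call} jumps to {w[ret]} at position {ret}")
```

Positions are 0-indexed. A summary automaton would send its copy along exactly these edges.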

Robustness: Alternative Characterizations

Monadic second order logic with matching predicate
  μ(x,y) means x is a call and y is matching return
  Sample formula:

    forall x. if p(x) then exists y,z. ( q(y) and x<y<z and μ(x,z) )

Thm. MSO + matching predicate interpreted over pushdown words is expressively equivalent to VPLs

Thm: Every CFL is a homomorphic image of a VPL

Context-free grammar based characterization
  Two types of non-terminals V0 (matched words) and V1
  All productions are of the form
    X → a          if X is in V0 then a must be local
    X → a Y b Z    a is a call, b is a return, Y is in V0; if X is in V0 then Z must be in V0
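The sample formula can be evaluated directly on a finite pushdown word once the matching relation is known. In the Python sketch below a position is a pair (kind, set of propositions), with kind one of `call`, `ret`, `int`; this encoding and the function names are assumptions made for illustration.

```python
def matching(word):
    """Map each call position to its matching return position."""
    pending, match = [], {}
    for i, (kind, _) in enumerate(word):
        if kind == "call":
            pending.append(i)
        elif kind == "ret" and pending:
            match[pending.pop()] = i
    return match


def sample_formula_holds(word):
    """forall x. p(x) -> exists y,z. q(y) and x<y<z and z matches x."""
    mu = matching(word)
    for x, (_, props) in enumerate(word):
        if "p" not in props:
            continue
        z = mu.get(x)  # None if x is not a call or its call is pending
        if z is None:
            return False
        if not any("q" in word[y][1] for y in range(x + 1, z)):
            return False
    return True


good = [("call", {"p"}), ("int", {"q"}), ("ret", set())]
bad = [("call", {"p"}), ("ret", set()), ("int", {"q"})]
print(sample_formula_holds(good), sample_formula_holds(bad))
```

In the second word q does occur after the p-labelled call, but only after the call has returned, so the formula fails; an order-only formula without the matching predicate could not express this difference.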

“Regular-like” properties continue..

Congruences and minimization (Myhill-Nerode Theorem) central to theory of regular languages

Given a language L, for well-matched words u and v, define u ~L v iff for all words x and y, xuy in L iff xvy in L

Theorem: A language L of well-matched words is a VPL iff the congruence ~L is of finite index

Minimization
  No unique minimal deterministic VPA in general, but…
  Minimization of (single-entry) RSMs (i.e. procedural boolean programs) possible.
  Partitioning into k procedures/modules is adequate to get canonicity!

ω-VPL - Extension to Infinite Words

A Büchi VPA: VPA over infinite pushdown words
  A word is accepted if along a run, the set F is seen infinitely often

ω-VPL – class of languages accepted by Büchi VPAs

ω-VPL is closed under all Boolean operations

Characterization using regular trees and MSO characterization hold.

However, ω-VPLs are not determinizable!

Let L be the set of all words such that the stack is repeatedly bounded, i.e. for some n, the stack depth is n infinitely often.
  L is an ω-VPL but there is no deterministic (Muller) VPA for it

Language inclusion and equivalence are still decidable

Talk Outline

Visibly Pushdown Languages

Temporal Logic CaRet and Model Checking

Ongoing Work

Software Model Checking

[Diagram: Code → Abstractor → Model → Verifier ← Specification; the Verifier answers Yes or returns a counter-example]

Abstractor: predicate abstraction
Model: control flow graph + Boolean vars (pushdown automata)
Specification: CaRet/VPAs
Observables

Abstracting Software

Program

  int x, y;

  if x>0 {
    …….
    y=x+1
    .……
  } else {
    ……
    y=x+1
    ……
  }

Predicates

  bx: x>0
  by: y>0

Boolean Program

  bool bx, by;

  if bx {
    ………
    by=true
    ………
  } else {
    …………
    by={true,false}
    ……….
  }
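A quick way to convince oneself that the Boolean program over-approximates the original is to run the concrete assignment on sample inputs and check that each observed value of by is allowed by the abstract assignment. The Python sketch below does this; the function names and the sample range are assumptions, and sampling is only a sanity check, not a proof of soundness.

```python
def concrete_step(x):
    """Concrete program: both branches assign y = x + 1.

    Returns the predicate values (bx, by) = (x > 0 before, y > 0 after).
    """
    bx = x > 0
    y = x + 1
    return bx, y > 0


def abstract_step(bx):
    """Boolean program: the set of values by may take after the assignment."""
    return {True} if bx else {True, False}


def abstraction_covers(samples):
    """Every concrete outcome must be allowed by the abstract program."""
    return all(by in abstract_step(bx) for bx, by in map(concrete_step, samples))


print(abstraction_covers(range(-5, 6)))
# In the else branch both outcomes really occur: x = 0 gives y > 0,
# x = -1 gives y = 0, so by must be left nondeterministic there.
print(concrete_step(0), concrete_step(-1))
```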

Abstracting Modular Programs

Program

  main() {
    bool y;
    …
    x = P(y);
    …
    z = P(x);
    …
  }
  bool P(u: bool) {
    …
    return Q(u);
  }
  bool Q(w: bool) {
    if … else return P(~w)
  }

Recursive State Machine (RSM) / Pushdown automaton

[Figure: RSM with modules A1, A2, A3; each module has entry nodes (inputs), exit nodes (outputs), and boxes (function calls) that refer to other modules]

Linear-time Propositional Temporal Logic

Q ::= p | not Q | Q or Q’ | Next Q | Always Q | Eventually Q | Q Until Q’

Interpreted over (infinite) sequences.
The models of an LTL formula form an ω-regular language.
Useful for stating sequencing properties:

If req happens, then req holds until it is granted:
  Always ( req → (req Until grant) )

An exception is never raised:
  Always ( not Exception )

CARET

CARET: A temporal logic for Calls and Returns Expresses context-free properties

[Figure: an execution passing through modules A, B, C and back to A]

Global successor used by LTL


Local successor: Jump from calls to returns Otherwise global successor at the same level



Local path



Caller modality: Jump to the caller of the current module Defined for every position except top-level ones


Abstract successor: Jump from calls to returns Otherwise global successor at the same level

Caller modality: Jump to the caller of the current module Defined for every position except top-level ones

Caller path gives the stack content!
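The successor notions above can be computed for a finite prefix of an execution with a single stack. The Python sketch below is one reading of the definitions: the abstract successor of a call is its matching return; for any other position it is the next position unless that position is a return, in which case it is undefined; and the caller of a position is the innermost call still pending when that position is reached, with a return placed at the level of its call. The encoding of positions as `call`, `ret`, `int` and the function names are assumptions.

```python
def caret_successors(kinds):
    """Return (abstract_successor, caller) as lists indexed by position.

    kinds is a list with entries "call", "ret" or "int".
    Undefined successors and callers are reported as None.
    """
    n = len(kinds)
    match, caller, pending = {}, [None] * n, []
    for i, kind in enumerate(kinds):
        if kind == "ret" and pending:
            match[pending.pop()] = i   # the return sits at its call's level
        caller[i] = pending[-1] if pending else None
        if kind == "call":
            pending.append(i)
    abstract = []
    for i, kind in enumerate(kinds):
        if kind == "call":
            abstract.append(match.get(i))          # jump to matching return
        elif i + 1 < n and kinds[i + 1] != "ret":
            abstract.append(i + 1)                 # same level: global successor
        else:
            abstract.append(None)                  # module ends here
    return abstract, caller


def caller_path(caller, i):
    """Follow the caller modality upward: the pending calls at position i."""
    path = []
    while caller[i] is not None:
        i = caller[i]
        path.append(i)
    return path


trace = ["call", "int", "call", "int", "ret", "int", "ret"]
abstract, caller = caret_successors(trace)
print(abstract)
print(caller)
print(caller_path(caller, 3))
```

Following `caller_path` from a position lists the pending calls from innermost to outermost, which is the sense in which the caller path gives the stack content.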

Expressing properties in Caret

Pre-post conditions: If P holds when A is called, then Q must hold when the call returns

  Always ( (P and call-to-A) → Local-Next Q )

Integrating Manna/Pnueli-style reasoning for reactive computations with Hoare-style reasoning for structured programs

If A is called with low priority, then it cannot access the file

  Always ( (call-to-A and low-priority) → Local-Always ( not access-file ) )


Stack inspection properties

If a variable x is accessed, then A must be on the call stack

  Always ( access-to-x → CallerPath-Eventually call-to-A )
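On a single finite execution, the stack inspection property can be checked by replaying calls and returns on an explicit stack, as in the Python sketch below. The event naming (`call-<name>`, `ret-<name>`, `access-to-x`) and the function name are assumptions made for illustration; model checking CARET establishes the property for all runs of a model, which this per-trace check does not.

```python
def stack_inspection_ok(trace, required="A", guarded="access-to-x"):
    """Whenever the guarded event occurs, `required` must be on the call stack."""
    stack = []
    for ev in trace:
        if ev.startswith("call-"):
            stack.append(ev[len("call-"):])
        elif ev.startswith("ret-"):
            if stack:
                stack.pop()
        elif ev == guarded and required not in stack:
            return False
    return True


ok_trace = ["call-A", "call-B", "access-to-x", "ret-B", "ret-A"]
bad_trace = ["call-A", "ret-A", "call-B", "access-to-x", "ret-B"]
print(stack_inspection_ok(ok_trace), stack_inspection_ok(bad_trace))
```

In the second trace A was called earlier but has already returned, so a regular property such as "call-to-A happened before access-to-x" would wrongly accept it.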

Model checking CARET

Given:
  A (boolean) recursive state machine / visibly pushdown automaton M
  A CARET formula Q
Model-checking: Do all runs of M satisfy the specification Q?

CARET can be model-checked in time that is polynomial in M and exponential in Q:

  |M|^3 · 2^O(|Q|)

Complexity class same as that for LTL!
Generalization of Vardi-Wolper construction

Model-checking CARET: Intuition

The specification matches calls and returns of the program, so the runs of the program and models of the formula are both visibly pushdown languages

Given M and formula Q,
  Build a Büchi pushdown automaton that accepts words exhibited by M that satisfy (not Q)
  Check this pushdown automaton for emptiness
  Construction builds on the classical tableaux for LTL

[Figure: at a call in state s where Local-Next Q1 must hold, push s and Q1; at the matching return, pop s and Q1, then check Q1]

Conclusions and Ongoing Work

Exposing calls and returns lets you hide the stack!

VPLs seem robust and adequate to model software analysis problems

VPL-triggered research
  Dynamic logic with VPL (Löding, Serre)
  Visibly pushdown games (Löding, Madhusudan, Serre)
  XML query processing (Pitcher)
  Third-order Algol with iteration (Murawski, Walukiewicz)

Active area of current research
  DTDs, XML, and query languages
  Branching-time logics, Fixpoint calculus, and visibly pushdown tree automata (Alur, Chaudhuri, Madhusudan)
  Expressive completeness of temporal operators
  Implementing a model checker for VPL monitors
