Learning Universally Quantified Invariants of Linear Data Structures Pranav Garg 1 , Christof Loding, 2 P. Madhusudan 1 and Daniel Neider 2 1 University of Illinois at Urbana-Champaign 2 RWTH Aachen, Germany


Learning Universally Quantified Invariants of Linear Data Structures

Pranav Garg (1), Christof Loding (2), P. Madhusudan (1) and Daniel Neider (2)

(1) University of Illinois at Urbana-Champaign
(2) RWTH Aachen, Germany


2

Black-box learning of invariants

Renewed interest in application of learning to synthesizing invariants [Sharma et al. CAV-12], [Sharma et al. SAS-13], [Kong et al. APLAS-10]

Black-box learning of invariants:

Advantages with respect to white-box techniques:
- verification of complex programs with simple invariants
- generalization
- apply extremely scalable machine learning algorithms for verification.

[Figure: the Learner sends a hypothesis H to the Teacher, which checks the hypothesis against the Program.]


3

Active Learning and Passive Learning

Active learning:
- learner queries teacher with equivalence and membership queries

Passive learning:
- given a sample = (examples, counter-examples), learn the simplest concept

[Figure: an Active Learner sends membership/equivalence queries to a Teacher and receives yes/no answers; a passive Learner receives a Sample S.]


4

Overview

Build active learning algorithms for learning quantified formulas over linear data structures (arrays/lists).
- introduce Quantified Data Automata (QDAs), a normal form for such invariants.
- build an active learning algorithm for QDAs.

Build a passive learning algorithm using the active learning algorithm.
- based on an imprecise teacher that answers questions with respect to the samples.

Introduce elastic QDAs (EQDAs) that translate to decidable logics.
- develop learning algorithms for EQDAs.

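[Figure: list 5, 7, 8, 9 pointed to by head.]

List pointed to by head is sorted:

$\forall y_1 y_2.\ (head \rightarrow^{*}_{next} y_1 \rightarrow^{*}_{next} y_2 \Rightarrow data(y_1) \le data(y_2))$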


5

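Program Configuration/Data words

Program configuration:

[Figure: list with nodes 2, 4, 8, 3, 9, 7; head points to the node holding 2 and i points to the node holding 8.]

Data word:

({head}, 2) ({}, 4) ({i}, 8) ({}, 3) ({}, 9) ({}, 7)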
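As a rough illustration of the encoding on this slide, the following Python sketch turns a list configuration into a data word, that is, a sequence of (pointer set, data value) letters. The function name `to_data_word`, the `Letter` alias and the dictionary-based pointer map are assumptions made for this example only; the node values follow the list drawn on the slide.

```python
from typing import Dict, FrozenSet, List, Tuple

# One letter of a data word: (pointer variables at this node, data value).
Letter = Tuple[FrozenSet[str], int]


def to_data_word(values: List[int], pointers: Dict[str, int]) -> List[Letter]:
    """Encode node values (in list order) plus pointer positions as a data word.

    `pointers` maps a pointer name to the index of the node it points to.
    """
    word = []
    for index, value in enumerate(values):
        here = frozenset(name for name, pos in pointers.items() if pos == index)
        word.append((here, value))
    return word


# Hypothetical configuration: head at the first node, i at the third node.
config = to_data_word([2, 4, 8, 3, 9, 7], {"head": 0, "i": 2})
```

Nodes without any pointer carry the empty set, which plays the role of the blank symbol in the later slides.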


6

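Quantified Data Automata

QDAs represent universally quantified properties of linear data structures:

$\bigwedge_i \forall y.\ (Guard_i(p, y) \Rightarrow Data_i(p, y))$

Example:

$\forall y_1 y_2.\ (head \rightarrow^{*}_{next} y_1 \rightarrow^{*}_{next} y_2 \Rightarrow data(y_1) \le data(y_2))$

[Figure: QDA that reads head, y1 and y2 in this order, with self-loops on the blank symbol b; the final state is labeled data(y1) <= data(y2).]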


7

Quantified Data Automata

Fix P – program pointer variables
Fix Y – set of quantified variables
Fix F – numerical abstract domain over data formulas

QDA over linear data structures:
- reads a data word annotated with pointers P and Y
- checks whether data stored at these positions satisfy a data property

QDA accepts a data word w with pointers P if it accepts all possible extensions of w with valuations for Y.

$\forall y_1 y_2.\ (head \rightarrow^{*}_{next} y_1 \rightarrow^{*}_{next} y_2 \Rightarrow data(y_1) \le data(y_2))$

[Figure: QDA that reads head, y1 and y2 in this order, with self-loops on the blank symbol b; the final state is labeled data(y1) <= data(y2).]


8

Valuation words

Valuation word = data word over P + valuation for Y

Universal Quantification: QDA accepts a data word iff it accepts ALL corresponding valuation words.

[Figure: a data word with pointers head and i, and two of its valuation words: one with y1 on the head node and y2 on the i node, the other with y1 on a node of its own and y2 on the i node.]
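To make "all corresponding valuation words" concrete, here is a small Python sketch that enumerates every way of placing the quantified variables on the positions of a data word. It assumes that each variable in Y is placed on exactly one node of the word; the function name `valuation_words` and the tuple-based word encoding are hypothetical choices for this example.

```python
from itertools import product


def valuation_words(data_word, quantified_vars):
    """Yield every extension of `data_word` with positions for the quantified variables.

    Assumption of this sketch: each variable is placed on exactly one node.
    """
    positions = range(len(data_word))
    for choice in product(positions, repeat=len(quantified_vars)):
        word = []
        for index, (pointers, value) in enumerate(data_word):
            here = {y for y, pos in zip(quantified_vars, choice) if pos == index}
            word.append((frozenset(pointers) | frozenset(here), value))
        yield word


# Shortened version of the slide's list: head on the first node, i on the third.
data_word = [({"head"}, 2), (set(), 4), ({"i"}, 8)]
words = list(valuation_words(data_word, ["y1", "y2"]))
```

With three nodes and two quantified variables this produces 3 * 3 = 9 valuation words, and the QDA has to accept all of them for the data word to be accepted.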


9

Quantified Data Automata

Deterministic, finite, register automata over words
- each state labeled with a data formula f

For a valuation word, QDA reads pointer and universal variables and stores the data values in the register reg.

At the final state, QDA checks if these data values satisfy the formula labeling the state.
- reg satisfies f(q): accepts the valuation word
- reg does not satisfy f(q): rejects the valuation word

reg: head = 2, y1 = 4, i = 8, y2 = 8

f(q) = data(y1) <= data(y2)

[Figure: valuation word with head on one node, y1 on the next, and i, y2 together on a later node.]
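The run described on this slide can be sketched in Python for the sortedness property. This is an illustrative model, not the authors' implementation: the state names, the transition table `DELTA`, the sink `TOP` (used for valuation words that match no guard and are therefore accepted) and the restriction to the single pointer head are all assumptions of the example.

```python
from itertools import product

BLANK = frozenset()
TOP = "top"  # assumed sink: the guard is not matched, so the word is accepted


def sym(*names):
    return frozenset(names)


# Transitions of a sortedness QDA over P = {head}, Y = {y1, y2}.
DELTA = {
    ("q0", sym("head")): "q1",
    ("q0", sym("head", "y1")): "q2",
    ("q0", sym("head", "y1", "y2")): "q3",
    ("q1", BLANK): "q1",
    ("q1", sym("y1")): "q2",
    ("q1", sym("y1", "y2")): "q3",
    ("q2", BLANK): "q2",
    ("q2", sym("y2")): "q3",
    ("q3", BLANK): "q3",
}
# Data formula labeling the final state: data(y1) <= data(y2).
FORMULA = {"q3": lambda reg: reg["y1"] <= reg["y2"]}


def accepts_valuation_word(word):
    state, reg = "q0", {}
    for variables, value in word:
        for name in variables:
            reg[name] = value  # store the data value in the register
        state = DELTA.get((state, variables), TOP)
    check = FORMULA.get(state)
    return True if check is None else check(reg)


def accepts_data_word(values):
    """Accept iff every placement of y1 and y2 yields an accepted valuation word."""
    for p1, p2 in product(range(len(values)), repeat=2):
        word = []
        for index, value in enumerate(values):
            names = set()
            if index == 0:
                names.add("head")
            if index == p1:
                names.add("y1")
            if index == p2:
                names.add("y2")
            word.append((frozenset(names), value))
        if not accepts_valuation_word(word):
            return False
    return True
```

For instance, `accepts_data_word([2, 4, 8])` holds, while the slide's list 2, 4, 8, 3, 9, 7 is rejected because placing y1 on 8 and y2 on 3 violates the data formula.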


10

Learning QDAs

QDAs are finite automata which output data formulas.

Lift Angluin’s L* algorithm for learning DFAs to learn QDAs.

Given a teacher, the unique minimal QDA can be learned in time polynomial in the size of this minimal QDA.

[Figure: QDA that reads head, y1 and y2 in this order, with self-loops on the blank symbol b; the final state is labeled data(y1) <= data(y2).]

Regular expression {head} b* {y1} b* {y2} b* outputs data(y1) <= data(y2)


11

Elastic Quantified Data Automata (EQDA)

Subclass of QDAs which translate to decidable logics
- Array Property Fragment (APF) [Bradley et al. VMCAI-06]
- decidable fragment of Strand over lists [Madhusudan et al. POPL-11]

Cannot test whether two universal variables are a bounded distance away.

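Restriction for EQDAs: all transitions on blank symbols (no pointer or universal variable) must be self-loops:

$\delta(q_1, b) = q_2 \Rightarrow q_1 = q_2$

$\forall y_1 y_2\, (y_2 = y_1 + 1 \ldots)$ is outside APF; $\forall y_1 y_2\, (y_1 \le y_2 \ldots)$ is inside APF.

[Figure: a QDA (left) and an EQDA (right) over y1, y2 and the blank symbol b.]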
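The EQDA restriction is easy to check mechanically. The sketch below assumes a transition function stored as a Python dictionary from (state, symbol) to state, with the empty set standing for the blank symbol b; the name `is_elastic` and both example automata are hypothetical.

```python
BLANK = frozenset()  # blank symbol b: no pointer or universal variable


def is_elastic(delta):
    """Return True if every transition on the blank symbol is a self-loop."""
    return all(
        source == target
        for (source, symbol), target in delta.items()
        if symbol == BLANK
    )


# Blank transitions only loop: the automaton cannot count blanks.
elastic = {
    ("q0", frozenset({"y1"})): "q1",
    ("q1", BLANK): "q1",
    ("q1", frozenset({"y2"})): "q2",
}

# Exactly one blank between y1 and y2: a bounded-distance test.
not_elastic = {
    ("p0", frozenset({"y1"})): "p1",
    ("p1", BLANK): "p2",
    ("p2", frozenset({"y2"})): "p3",
}
```

The second automaton distinguishes "y2 is two positions after y1" from other distances, which is the kind of test the restriction rules out.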


12

Elastic Quantified Data Automata (EQDA)

Unique minimal over-approximation theorem:
A QDA A can be uniquely minimally over-approximated by a language of valuation words that is accepted by an EQDA A_el.

The construction of A_el given QDA A is called elastification.

Learning EQDAs <= learning QDAs + elastification.

[Figure: A is contained in A_el, which is contained in the other elastic over-approximations B_el and C_el.]


13

Passively learning QDAs

Given the samples S+ and S-, the teacher uses them to answer the active learner.
The teacher wants the active learner to construct a QDA that includes S+ and excludes S-.

Membership query:
- if s belongs to S+, return yes
- if s belongs to S-, return no
- otherwise, return no (errs on keeping the learned concept semantically small)

Equivalence query:
- checks if conjectured invariant is consistent with S+ and S-

The learned QDA might be non-optimal (usually small).
Running time is polynomial in the size of the learned QDA.

[Figure: the Passive Learner consists of an Active Learner and a Teacher that answers from the sample S+, S-.]
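The sample-based teacher described on this slide can be sketched as a small Python class. The class name `SampleTeacher`, the representation of words as hashable tuples and the modeling of a hypothesis as a function from words to booleans are assumptions made for illustration; the query semantics follow the bullets above.

```python
class SampleTeacher:
    """Imprecise teacher that answers queries only from the samples S+ and S-."""

    def __init__(self, positive, negative):
        self.positive = set(positive)  # S+
        self.negative = set(negative)  # S-

    def membership(self, word):
        # Words in S+ are answered yes; words in S- and all unseen words are
        # answered no, which keeps the learned concept small.
        return word in self.positive

    def equivalence(self, hypothesis):
        """Return None if the hypothesis is consistent with the samples,
        otherwise a sample word on which it is wrong."""
        for word in self.positive:
            if not hypothesis(word):
                return word
        for word in self.negative:
            if hypothesis(word):
                return word
        return None


teacher = SampleTeacher(positive={("a",)}, negative={("b",)})
```

An active learner written against the two methods `membership` and `equivalence` can then be run unchanged on a finite sample, which is the reduction from passive to active learning that the slide describes.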


14

Experiments

Run the program on arrays/lists of small bounded sizes, with data values from a bounded data domain, e.g. {0, 1, 2}.

Extract the concrete data structures that manifest at loop headers.

Obtain the set S+ on which passive learning is performed.
- fix F to the cartesian lattice of atomic formulas over relations {=, <, ≤}

Learn QDAs using Angluin’s algorithm
- The learner never asks long membership queries
- The teacher, thus, often has correct answers.

The learned QDA is over-approximated to an elastic QDA to get a quantified invariant over decidable Strand or APF.
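The sampling step can be illustrated with a short Python sketch that runs insertion sort on all small arrays over {0, 1, 2} and records the configuration each time the outer loop header is reached. The choice of insertion sort, the function names and the representation of a configuration as (array contents, index i) are assumptions for this example and not the authors' tooling.

```python
from itertools import product


def insertion_sort_configs(array):
    """Run insertion sort and snapshot (array, i) at every outer loop-header visit."""
    a = list(array)
    configs = []
    i = 1
    while True:
        configs.append((tuple(a), i))  # configuration at the loop header
        if i >= len(a):
            break
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
        i += 1
    return configs


def positive_samples(max_len=3, domain=(0, 1, 2)):
    """Collect S+ from all arrays up to `max_len` with values from `domain`."""
    samples = set()
    for n in range(1, max_len + 1):
        for array in product(domain, repeat=n):
            samples.update(insertion_sort_configs(array))
    return samples


S_plus = positive_samples()
```

Every recorded configuration has a sorted prefix before index i, which is the kind of universally quantified property the learner is expected to generalize from such samples.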


15

Experiments

Programs        #Equiv.  #Mem   #States  Time (teacher)  Time (learner)
BUBBLE-SORT     3        447    12       0.19            0.01
QUICK-SORT      1        37     5        0.03            0.00
SELECTION-SORT  3        306    11       0.18            0.01
INSERTION-SORT  3        305    11       0.19            0.00
HEAP-SORT       1        57     6        0.05            0.01
SORTED-FIND     6        1683   15       0.04            0.01
SORTED-INSERT   3        1096   20       0.04            0.01
SORTED-MERGE    1        5775   42       10.50           0.06
SORTED-REVERSE  2        439    18       0.02            0.00
COPY            2        146    10       1.75            0.00
COMPARE         2        146    10       0.51            0.00
MAX             7        1608   14       0.08            0.00
INIT            5        879    10       0.07            0.01
FIND            2        121    8        0.05            0.00
PARTITION       10       11807  38       11.40           0.11
SPLIT           2        287    14       0.21            0.00
COREUTILS-SORT  17       37     5        0.03            0.07


20

Related Work

Daikon [Ernst et al. ICSE-00]
- conjunctive Boolean learning
- learns quantified invariants over arrays, to some extent.

Applications of learning in verification
- rely-guarantee contracts [Cobleigh et al. TACAS-03, Alur et al. CAV-05]
- stateful interfaces [Alur et al. POPL-05]
- learning quantified invariants over predicates [Kong et al. APLAS-10]

Machine learning algorithms for invariant synthesis [Sharma et al. CAV-12, SAS-13, ESOP-13]


21

Conclusion

Learning universally quantified invariants over linear data structures
- Quantified Data Automata (QDA) / elastic QDAs
- Active learning for QDAs
- Unique elastification
- Algorithm for passive learning QDAs/EQDAs.
- Experimental validation

Future Work:
- Extensions to trees to capture universally quantified properties like binary-search-tree, max-heap, …
- Combining automata based structural learning with machine learning algorithms for learning data formulas

Thank You!