Efficient Tabling of Structured Data Using Indexing and Program Transformation

Christian Theil Have and Henning Christiansen

Research group PLIS: Programming, Logic and Intelligent Systems
Department of Communication, Business and Information Technologies
Roskilde University, P.O.Box 260, DK-4000 Roskilde, Denmark

PADL in Philadelphia, January 23, 2012


Outline

1 Motivation and background
    The trouble with tabling of structured data

2 A workaround implemented in Prolog
    Examples
        Example: Edit Distance
        Example: Hidden Markov Model in PRISM

3 An automatic program transformation


Motivation and background

Motivation

In the LoSt project we explore the use of Probabilistic Logic programming (with PRISM) for biological sequence analysis.

We had some problems analyzing very long sequences...

... and identified the culprit: Tabling of structured data.

Inspired by earlier work by Christiansen and Gallagher, which addresses a similar problem related to tabling of non-discriminating arguments.

Henning Christiansen, John P. Gallagher. Non-discriminating Arguments and their Uses. ICLP 2009.


The trouble with tabling of structured data

Tabling in logic programming is an established technique which

can give a significant speed-up of program execution,

makes it easier to write efficient programs in a declarative style,

is similar to memoization in functional programming.

The idea:

The system maintains a table of calls and their answers.

When a new call is entered, the system checks whether it is already stored in the table.

If so, the previously found solutions are used.

Tabling is included in several recognized Prolog systems such as B-Prolog, YAP and XSB.
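
As a small illustration of the idea, the sketch below tables a naive Fibonacci predicate. The predicate fib/2 is not from the slides; it is a hypothetical example, and the `:- table` directive is assumed to be available in the form used later in this talk.

```prolog
% Hypothetical example, not from the slides.
% Assumes a Prolog system that supports the ":- table" directive.
:- table fib/2.

% fib(+N, -F): F is the N-th Fibonacci number.
fib(0, 0).
fib(1, 1).
fib(N, F) :-
    N > 1,
    N1 is N - 1,
    N2 is N - 2,
    fib(N1, F1),      % answered from the table after the first evaluation
    fib(N2, F2),
    F is F1 + F2.
```

Without the directive the number of calls grows exponentially in N; with it, each call fib(K, _) is evaluated once and later occurrences are answered from the table. Note that the tabled arguments here are small integers, so the cost of storing and comparing calls is negligible.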


The trouble with tabling of structured data

An innocent-looking predicate: last/2

last([X],X).
last([_|L],X) :-
    last(L,X).

Traverses a list to find the last element.

Time/space complexity: O(n).

If we table last/2, every recursive call is stored together with its list argument: n + (n − 1) + (n − 2) + ... + 1 = n(n + 1)/2, which is O(n^2)!

call: last([1,2,3,4,5],X)

Resulting calls: last([1,2,3,4,5],X), last([2,3,4,5],X), last([3,4,5],X), last([4,5],X), last([5],X)

call table

last([1,2,3,4,5],X).
last([2,3,4,5],X).
last([3,4,5],X).
last([4,5],X).
last([5],X).


Wait a minute..

Tabling systems do employ some advanced techniques to avoid expensive copying, which may reduce memory consumption and/or time complexity. For instance,

B-Prolog uses hashing of goals.

XSB uses a trie data structure.

YAP uses a trie structure, which is refined into a so-called global trie that applies a sharing strategy for common subterms whenever possible.

These techniques may reduce space consumption, but if there is no sharing between the tables and the actual arguments of an active call, each execution of a call may involve a full traversal (naive copying) of its arguments.


Benchmarking last/2 for B-Prolog, Yap and XSB

To investigate, we benchmarked the tabled version of last/2 for each of the major Prolog tabling engines.

To further investigate whether performance is data dependent, we benchmarked with both repeated data and random data.
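
The slides do not show the benchmark harness, so the following is only a sketch of how such a measurement could be set up. All predicate names (tabled_last/2, repeated_data/2, random_data/3, time_last/2) are hypothetical, the pseudo-random data comes from a simple linear congruential generator rather than from any system library, and statistics(runtime, ...) is assumed to report CPU time in milliseconds.

```prolog
% Hypothetical benchmark sketch; not the authors' harness.
:- table tabled_last/2.

% Same definition as last/2 on the earlier slide, renamed to avoid
% clashing with a library predicate of the same name.
tabled_last([X], X).
tabled_last([_|L], X) :-
    tabled_last(L, X).

% repeated_data(+N, -List): a list of N copies of the same element.
repeated_data(0, []) :- !.
repeated_data(N, [a|Xs]) :-
    N > 0,
    N1 is N - 1,
    repeated_data(N1, Xs).

% random_data(+N, +Seed, -List): N pseudo-random integers in 0..999,
% produced by a linear congruential generator.
random_data(0, _, []) :- !.
random_data(N, Seed, [X|Xs]) :-
    N > 0,
    Next is (Seed * 1103515245 + 12345) mod 2147483648,
    X is Next mod 1000,
    N1 is N - 1,
    random_data(N1, Next, Xs).

% time_last(+List, -Millis): CPU time spent on one tabled call.
time_last(List, Millis) :-
    statistics(runtime, [T0|_]),
    tabled_last(List, _),
    statistics(runtime, [T1|_]),
    Millis is T1 - T0.
```

A run for one list length would then be, for example, `random_data(3000, 1, L), time_last(L, Ms)`. The tables should be cleared between runs so that earlier measurements do not answer later calls.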


Space usage, tabled last/2 (1)

[Plot b) Space usage: space usage (kilobytes, 0 to 2000000) against list length (N, 0 to 3000). Curves: XSB (random data), B-Prolog (random data), Yap (random data), XSB (repeated data), B-Prolog (repeated data), Yap (repeated data).]


Space usage, tabled last/2 (2)

[Plot d) Space usage expanded for the four lower curves: space usage (kilobytes, 0 to 10000) against list length (N, 0 to 15000). Curves: B-Prolog (random data), XSB (repeated data), B-Prolog (repeated data), Yap (repeated data).]


Time usage, tabled last/2 (1)

[Plot a) Time usage: time usage (seconds, 0 to 4) against list length (N, 0 to 3000). Curves: XSB (random data), B-Prolog (random data), Yap (random data), XSB (repeated data), B-Prolog (repeated data), Yap (repeated data).]


Time usage, tabled last/2 (2)

[Plot c) Time usage expanded for the three lower curves: time usage (seconds, 0.0 to 1.0) against list length (N, 0 to 50000). Curves: B-Prolog (random data), XSB (repeated data), Yap (repeated data).]


A workaround implemented in Prolog

We describe a workaround giving O(1) time and space complexity for table lookups for programs with arbitrarily large ground structured data as input arguments.

A term is represented as a set of facts.

A subterm is referenced by a unique integer serving as an abstract pointer.

Matching related to tabling is done solely by comparison of such pointers.


An abstract data type

The representation is given by the following predicates, which together can be understood as an abstract datatype.

store_term(+ground-term, -pointer)
    The ground-term is any ground term, and the pointer returned is a unique reference (an integer) for that term.

retrieve_term(+pointer, ?functor, ?arg-pointers-list)
    Returns the functor and a list of pointers to representations of the substructures of the term represented by pointer.

full_retrieve_term(+pointer, ?ground-term)
    Returns the term represented by pointer.


ADT properties

Property 1

It must hold for any ground term s that the query

store_term(s, P), full_retrieve_term(P, S),

assigns to the variable S a value identical to s.

Property 2

It must hold for any ground term s of the form f(s1, ..., sn) that

store_term(s, P), retrieve_term(P, F, Ss),

assigns to the variable F the symbol f, and to Ss a list of ground values [p1, ..., pn] such that additional queries

full_retrieve_term(pi, Si), i = 1, ..., n

assign to the variables Si values identical to si.


ADT example

Example

The following call converts the term f(a,g(b)) into its internal representation and returns a pointer value in the variable P.

store_term(f(a,g(b)),P).

After this, the following sequence of calls will succeed.

retrieve_term(P,f,[P1,P2]),
retrieve_term(P1,a,[]),
retrieve_term(P2,g,[P21]),
retrieve_term(P21,b,[]),
full_retrieve_term(P,f(a,g(b))).


Implementation with assert

One possible way of implementing the predicates introduced above is to have store_term/2 assert facts for the retrieve_term/3 predicate, using increasing integers as pointers.

Example

The call store_term(f(a,g(b)),P) considered in the previous example may assign the value 100 to P and, as a side effect, assert the following facts.

retrieve_term(100,f,[101,102]).
retrieve_term(101,a,[]).
retrieve_term(102,g,[103]).
retrieve_term(103,b,[]).

Notice that Prolog’s indexing on first arguments ensures a constant lookup time.
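
The slides describe this implementation only in words, so the following is a minimal sketch of one way it could look. The counter next_pointer/1, the helper fresh_pointer/1 and the auxiliary predicates store_args/2 and full_retrieve_args/2 are hypothetical names introduced here; the starting value 100 is chosen only to match the example.

```prolog
% Sketch only: an assert-based store_term/2, assuming a global counter.
:- dynamic retrieve_term/3.
:- dynamic next_pointer/1.

next_pointer(100).          % arbitrary start value, matching the example

% fresh_pointer(-P): hand out the next unused integer (hypothetical helper).
fresh_pointer(P) :-
    retract(next_pointer(P)),
    P1 is P + 1,
    assertz(next_pointer(P1)).

% store_term(+Term, -Pointer): assert one retrieve_term/3 fact per subterm.
store_term(Term, P) :-
    ground(Term),
    Term =.. [F|Args],
    fresh_pointer(P),
    store_args(Args, ArgPs),
    assertz(retrieve_term(P, F, ArgPs)).

store_args([], []).
store_args([A|As], [P|Ps]) :-
    store_term(A, P),
    store_args(As, Ps).

% full_retrieve_term(+Pointer, ?Term): rebuild the whole term.
full_retrieve_term(P, Term) :-
    retrieve_term(P, F, ArgPs),
    full_retrieve_args(ArgPs, Args),
    Term =.. [F|Args].

full_retrieve_args([], []).
full_retrieve_args([P|Ps], [A|As]) :-
    full_retrieve_term(P, A),
    full_retrieve_args(Ps, As).
```

With a fresh counter, `store_term(f(a,g(b)), P)` binds P to 100 and asserts the four facts listed above (in a different order, since each fact is asserted after its subterms). Pointers are handed out in pre-order, which is why a is 101, g(b) is 102 and b is 103.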


Another level of abstraction: lookup_pattern/2

Finally we introduce a utility predicate which may simplify the use of the representation in application programs. It utilizes a special kind of terms, called patterns, which are not necessarily ground and which may contain subterms of the form lazy(variable).

lookup_pattern(+pointer, +pattern)
    The pattern is matched recursively against the term represented by the pointer p in the following way.
    – lookup_pattern(p,X), where X is a variable, is treated as full_retrieve_term(p,X).
    – lookup_pattern(p,lazy(X)) unifies X with p.
    – For any other pattern =.. [F,X1,...,Xn] we call retrieve_term(p, F, [P1,...,Pn]) followed by lookup_pattern(Pi,Xi), i = 1, ..., n.
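
The three cases above translate almost directly into clauses. The following is a sketch under the assumption that retrieve_term/3 facts already exist; here the four facts from the earlier f(a,g(b)) example are written out by hand so that the block is self-contained. The helper names lookup_patterns/2 and full_retrieve_args/2 are hypothetical.

```prolog
% Facts as they might be asserted by store_term(f(a,g(b)),P).
retrieve_term(100, f, [101,102]).
retrieve_term(101, a, []).
retrieve_term(102, g, [103]).
retrieve_term(103, b, []).

% full_retrieve_term(+Pointer, ?Term): rebuild the whole term.
full_retrieve_term(P, Term) :-
    retrieve_term(P, F, ArgPs),
    full_retrieve_args(ArgPs, Args),
    Term =.. [F|Args].

full_retrieve_args([], []).
full_retrieve_args([P|Ps], [A|As]) :-
    full_retrieve_term(P, A),
    full_retrieve_args(Ps, As).

% lookup_pattern(+Pointer, +Pattern)
lookup_pattern(P, Pattern) :-          % case 1: plain variable
    var(Pattern), !,
    full_retrieve_term(P, Pattern).
lookup_pattern(P, lazy(X)) :- !,       % case 2: lazy(X) just returns the pointer
    X = P.
lookup_pattern(P, Pattern) :-          % case 3: descend one level
    Pattern =.. [F|Xs],
    retrieve_term(P, F, Ps),
    lookup_patterns(Ps, Xs).

lookup_patterns([], []).
lookup_patterns([P|Ps], [X|Xs]) :-
    lookup_pattern(P, X),
    lookup_patterns(Ps, Xs).
```

One consequence of this sketch is that a stored term whose own functor is lazy/1 cannot be matched structurally, since lazy(X) is always read as a request for the pointer.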


A lookup_pattern/2 example

Example

Continuing the previous example, after the call store_term(f(a,g(b)),P) we get that

lookup_pattern(100, f(X,lazy(Y)))

leads to X=a and Y=102.

The lookup_pattern/2 predicate will be used to simplify the automatic program transformation introduced later.

Further efficiency can be gained by compiling it out for each specific pattern (i.e. replacing it with calls to retrieve_term/3).


Examples

We consider applying the workaround to two example programs:

Edit Distance implemented in Prolog

Hidden Markov Models in PRISM

For each of these we measure the impact (in time and space) of a transformed version using the workaround.


Implementation of store_term/2 and retrieve_term/2

The programs have been transformed manually for these experiments, based on the pointer-based representation previously introduced but simplified slightly for lists.

For store_term/2 and retrieve_term/2 we use the following simple implementation, where the second argument of store_term/2 is the index at which the list starts:

store_term([],Idx) :-
    assert(retrieve_term(Idx,[])).
store_term([X|Xs],Idx) :-
    Idx1 is Idx + 1,
    assert(retrieve_term(Idx,[X,Idx1])),
    store_term(Xs,Idx1).


Edit Distance

Calculate the minimum number of edits (insertions, deletions and replacements) needed to transform one list into another.

A minimal edit-distance algorithm written in Prolog, which is dependent on tabling for any non-trivial problem.

The standard dynamic programming algorithm for edit distance has time complexity O(N^2).

Measure for various problem sizes.
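
For reference, the recurrence that the Prolog implementation on the following slide encodes can be written as below. The notation is introduced here and is not from the slides: $d(xs, ys)$ is the edit distance between lists $xs$ and $ys$, and $[x \mid xs']$ is a list with head $x$ and tail $xs'$.

```latex
\[
d(xs, ys) =
\begin{cases}
0 & \text{if } xs = [\,],\ ys = [\,] \\
1 + d([\,], ys') & \text{if } xs = [\,],\ ys = [y \mid ys'] \\
1 + d(xs', [\,]) & \text{if } xs = [x \mid xs'],\ ys = [\,] \\
d(xs', ys') & \text{if } xs = [x \mid xs'],\ ys = [y \mid ys'],\ x = y \\
1 + \min\{\, d(xs, ys'),\ d(xs', ys),\ d(xs', ys') \,\} & \text{if } xs = [x \mid xs'],\ ys = [y \mid ys'],\ x \neq y
\end{cases}
\]
```

For input lists of lengths $N$ and $M$, only suffixes of the inputs ever occur as arguments, so there are $(N+1)(M+1)$ distinct subproblems. Tabling evaluates each of them once, which is where the quadratic bound comes from, provided each table lookup itself takes constant time.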


Edit distance: Implementation in Prolog.

:- table edit/3.

edit([],[],0).
edit([],[Y|Ys],Dist) :-
    edit([],Ys,Dist1),
    Dist is 1 + Dist1.
edit([X|Xs],[],Dist) :-
    edit(Xs,[],Dist1),
    Dist is 1 + Dist1.
edit([X|Xs],[Y|Ys],Dist) :-
    edit([X|Xs],Ys,InsDist),
    edit(Xs,[Y|Ys],DelDist),
    edit(Xs,Ys,TailDist),
    (X==Y ->
        Dist = TailDist
    ;
        % Minimum of insertion,
        % deletion or substitution
        sort([InsDist,DelDist,TailDist],
             [MinDist|_]),
        Dist is 1 + MinDist).


Edit distance, transformed (1).

original version

edit([],[],0).
edit([],[Y|Ys],Dist) :-
    edit([],Ys,Dist1),
    Dist is 1 + Dist1.
edit([X|Xs],[],Dist) :-
    edit(Xs,[],Dist1),
    Dist is 1 + Dist1.

transformed version

edit(XIdx,YIdx,0) :-
    retrieve_term(XIdx,[]),
    retrieve_term(YIdx,[]).
edit(XIdx,YIdx,Dist) :-
    retrieve_term(XIdx,[]),
    retrieve_term(YIdx,[_,YIdxNext]),
    edit(XIdx,YIdxNext,Dist1),
    Dist is Dist1 + 1.
edit(XIdx,YIdx,Dist) :-
    retrieve_term(YIdx,[]),
    retrieve_term(XIdx,[_,XIdxNext]),
    edit(XIdxNext,YIdx,Dist1),
    Dist is Dist1 + 1.

continued on next slide...


Edit distance, transformed (2).

original version

edit([X|Xs],[Y|Ys],Dist) :-
    edit([X|Xs],Ys,InsDist),
    edit(Xs,[Y|Ys],DelDist),
    edit(Xs,Ys,TailDist),
    (X==Y ->
        Dist = TailDist
    ;
        sort([InsDist,DelDist,TailDist],
             [MinDist|_]),
        Dist is 1 + MinDist).

transformed version

edit(XIdx,YIdx,Dist) :-
    retrieve_term(XIdx,[X,NextXIdx]),
    retrieve_term(YIdx,[Y,NextYIdx]),
    edit(XIdx,NextYIdx,InsDist),
    edit(NextXIdx,YIdx,DelDist),
    edit(NextXIdx,NextYIdx,TailDist),
    (X==Y ->
        Dist = TailDist
    ;
        sort([InsDist,DelDist,TailDist],
             [MinDist|_]),
        Dist is 1 + MinDist).
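
Putting the pieces together, the sketch below combines the list-specialised store_term/2 with the four transformed clauses into one runnable program. The wrapper edit_distance/3 is a hypothetical addition, not from the slides: it stores the first list from index 0 and the second list directly after it, so the two index ranges do not overlap. The call to abolish_all_tables/0 is an assumption about the host system; it is there because the table contents depend on the asserted facts.

```prolog
% Sketch: the transformed edit distance as a complete program.
:- table edit/3.
:- dynamic retrieve_term/2.

% store_term(+List, +StartIdx): one fact per list cell.
store_term([], Idx) :-
    assertz(retrieve_term(Idx, [])).
store_term([X|Xs], Idx) :-
    Idx1 is Idx + 1,
    assertz(retrieve_term(Idx, [X,Idx1])),
    store_term(Xs, Idx1).

edit(XIdx, YIdx, 0) :-
    retrieve_term(XIdx, []),
    retrieve_term(YIdx, []).
edit(XIdx, YIdx, Dist) :-
    retrieve_term(XIdx, []),
    retrieve_term(YIdx, [_,YIdxNext]),
    edit(XIdx, YIdxNext, Dist1),
    Dist is Dist1 + 1.
edit(XIdx, YIdx, Dist) :-
    retrieve_term(YIdx, []),
    retrieve_term(XIdx, [_,XIdxNext]),
    edit(XIdxNext, YIdx, Dist1),
    Dist is Dist1 + 1.
edit(XIdx, YIdx, Dist) :-
    retrieve_term(XIdx, [X,NextXIdx]),
    retrieve_term(YIdx, [Y,NextYIdx]),
    edit(XIdx, NextYIdx, InsDist),
    edit(NextXIdx, YIdx, DelDist),
    edit(NextXIdx, NextYIdx, TailDist),
    (   X == Y
    ->  Dist = TailDist
    ;   sort([InsDist,DelDist,TailDist], [MinDist|_]),
        Dist is 1 + MinDist
    ).

% edit_distance(+Xs, +Ys, -Dist): hypothetical wrapper.
edit_distance(Xs, Ys, Dist) :-
    retractall(retrieve_term(_, _)),   % forget lists from earlier runs
    abolish_all_tables,                % assumed available; clears stale answers
    length(Xs, N),
    YStart is N + 1,                   % Xs occupies indices 0..N
    store_term(Xs, 0),
    store_term(Ys, YStart),
    edit(0, YStart, Dist).
```

A query such as `edit_distance([k,i,t,t,e,n], [s,i,t,t,i,n,g], D)` gives D = 3. Every tabled call now has two small integers as input arguments, regardless of how long the lists are.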


Edit distance: Benchmarking results (time)

[Plot: time usage (seconds, 0.0 to 1.0) against list length (N, 0 to 350). Curves: XSB 3.3, B-Prolog 7.5#4, Yap 6.2.1, index(XSB), index(B-Prolog), index(Yap).]


Edit distance: Benchmarking results (space)

[Plot: space usage (kilobytes, 0e+00 to 1e+05) against list length (N, 0 to 150). Curves: XSB 3.3, B-Prolog 7.5#4, Yap 6.2.1, index(XSB), index(B-Prolog), index(Yap).]


PRISM

PRISM (PRogramming In Statistical Modeling)

Extends Prolog (B-Prolog) with special goals representing random variables (msws).

Semantics: Probabilistic Herbrand models.

Supports various probabilistic inferences.

Relies heavily on tabling for the efficiency of the probabilistic inferences.


A Hidden Markov Model in PRISM

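values(init,[s0,s1]).
values(out(_),[a,b]).
values(tr(_),[s0,s1]).

hmm(L) :-
    msw(init,S),
    hmm(S,L).

hmm(_,[]).
hmm(S,[Ob|Y]) :-
    msw(out(S),Ob),
    msw(tr(S),Next),
    hmm(Next,Y).

[Figure: state diagram of the HMM with nodes init, s0 and s1. State s0 emits a with probability 0.3 and b with probability 0.7; state s1 emits a with probability 0.6 and b with probability 0.4. The edges carry the labels 0.5, 0.5, 0.1, 0.9, 0.2 and 0.8.]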


Transformed Hidden Markov Model

We only need to consider the recursive predicate, hmm/2.

original version

hmm(_,[]).
hmm(S,[Ob|Y]) :-
    msw(out(S),Ob),
    msw(tr(S),Next),
    hmm(Next,Y).

transformed version

hmm(_,ObsPtr) :-
    retrieve_term(ObsPtr,[]).
hmm(S,ObsPtr) :-
    retrieve_term(ObsPtr,[Ob,Y]),
    msw(out(S),Ob),
    msw(tr(S),Next),
    hmm(Next,Y).


Hidden Markov Model: Benchmarking results (time)

[Plot b) Running time without indexed lookup: running time (seconds, 0 to 140) against sequence length (0 to 5000).]

[Plot a) Running time with indexed lookup: running time (seconds, 0.00 to 0.08) against sequence length (0 to 5000).]


An automatic program transformation

We introduce an automatic transformation from a tabled program to an efficient version using our approach.

To support the transformation, the user must declare modes stating which predicate arguments should be indexed.

table_index_mode(hmm(+))
table_index_mode(hmm(-,+))

Each clause whose head is covered by a table mode declaration is transformed and all other clauses are left untouched.


Transformation of HMM in PRISM

The transformation moves any term appearing in an indexed position in the head of a clause into a call to the lookup_pattern predicate, which is added to the body. Variables in such terms are marked lazy when they do not occur in any non-indexed argument inside the clause.

original program

hmm(_,[]).
hmm(S,[Ob|Y]) :-
    msw(out(S),Ob),
    msw(tr(S),Next),
    hmm(Next,Y).

transformed program

hmm(_,ObsPtr) :-
    lookup_pattern(ObsPtr,[]).
hmm(S,ObsPtr) :-
    lookup_pattern(ObsPtr,[Ob | lazy(Y)]),
    msw(out(S),Ob),
    msw(tr(S),Next),
    hmm(Next,Y).


Transformation algorithm

For the HMM program of section 4.2 this may look as follows; plus means transform the argument, minus means keep it unchanged.

table_index_mode(hmm(+))
table_index_mode(hmm(-,+))

The correctness of the transformation depends on the following properties of the program.

– The arguments indicated for indexing must be called with ground data only.
– Variables that occur in an indexed argument in the head of a clause cannot occur in the body of that clause in both an indexed and a non-indexed argument.
– Any argument of a goal within a clause body which is declared to be indexed must be given as a variable that also occurs in an indexed argument in the head of that clause.

Each clause whose head predicate is covered by a table mode declaration is transformed using the procedure outlined in algorithm 1, and all other clauses are left untouched. The transformation moves any term appearing in an indexed position in the head of a clause into a call to the lookup_pattern predicate, which is added to the body. Variables in such terms are marked lazy when they do not occur in any non-indexed argument inside the clause.

for each clause H :- B in original program do
    if table_index_mode(M) matching H then
        for each argument Hi ∈ H, Mi ∈ M do
            if Mi = '+' then
                H'i ← MarkLazy(Hi, B)
                B ← (lookup_pattern(Vi, H'i), B)
                Hi ← Vi
            end
        end
    end
end

where MarkLazy is defined as

MarkLazy(Hi, B):
    PotentialLazy = variables in all goals G ∈ B where G has a table_index_mode declaration
    NonLazy = variables in all goals G ∈ B where G has no table_index_mode declaration
    Lazy = PotentialLazy \ NonLazy
    for each variable V ∈ Hi do
        if V ∈ Lazy then
            V ← lazy(V)
    end

Algorithm 1: Program transformation.
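
Algorithm 1 can be prototyped in a few clauses of Prolog. The sketch below is one possible reading of the pseudocode, not the authors' implementation: transform_clause/2 and all of its helpers are hypothetical names, the mode declarations are stored as plain facts, facts are treated as clauses with body true, and when several arguments are indexed the lookup_pattern calls are emitted in argument order.

```prolog
% Sketch of Algorithm 1; all predicate names below are hypothetical.
table_index_mode(hmm(+)).
table_index_mode(hmm(-, +)).

% transform_clause(+Clause, -NewClause)
transform_clause(Clause, (NewH :- NewB)) :-
    clause_parts(Clause, H, B),
    indexed_modes(H, Modes), !,
    H =.. [Name|Args],
    lazy_vars(B, Lazy),
    transform_args(Args, Modes, Lazy, B, NewArgs, NewB),
    NewH =.. [Name|NewArgs].
transform_clause(Clause, Clause).      % no declaration: leave untouched

clause_parts((H :- B), H, B) :- !.
clause_parts(H, H, true).

% indexed_modes(+Goal, -Modes): Goal is covered by a mode declaration.
indexed_modes(Goal, Modes) :-
    callable(Goal),
    functor(Goal, Name, Arity),
    functor(M, Name, Arity),
    table_index_mode(M),
    M =.. [Name|Modes].

% Replace each '+' argument by a fresh variable V and add a lookup for it.
transform_args([], [], _, B, [], B).
transform_args([A|As], [Mode|Ms], Lazy, B0, [V|Vs], B) :-
    (   Mode == (+)
    ->  mark_lazy(A, Lazy, P),
        B = (lookup_pattern(V, P), B1)
    ;   V = A,
        B = B1
    ),
    transform_args(As, Ms, Lazy, B0, Vs, B1).

% lazy_vars(+Body, -Lazy): Lazy = PotentialLazy \ NonLazy.
lazy_vars(Body, Lazy) :-
    body_goals(Body, Goals),
    split_goals(Goals, Indexed, Plain),
    term_variables(Indexed, Potential),
    term_variables(Plain, NonLazy),
    exclude_eq(Potential, NonLazy, Lazy).

body_goals((A, B), Goals) :- !,
    body_goals(A, GA), body_goals(B, GB), append(GA, GB, Goals).
body_goals(G, [G]).

split_goals([], [], []).
split_goals([G|Gs], [G|Is], Ps) :- indexed_modes(G, _), !, split_goals(Gs, Is, Ps).
split_goals([G|Gs], Is, [G|Ps]) :- split_goals(Gs, Is, Ps).

exclude_eq([], _, []).
exclude_eq([V|Vs], Ex, Out) :-
    ( memberchk_eq(V, Ex) -> Out = Rest ; Out = [V|Rest] ),
    exclude_eq(Vs, Ex, Rest).

memberchk_eq(X, [Y|Ys]) :- ( X == Y -> true ; memberchk_eq(X, Ys) ).

% mark_lazy(+Term, +Lazy, -Pattern): wrap lazy variables in lazy/1.
mark_lazy(T, Lazy, lazy(T)) :- var(T), memberchk_eq(T, Lazy), !.
mark_lazy(T, _, T) :- var(T), !.
mark_lazy(T, Lazy, P) :-
    T =.. [F|Args],
    mark_lazy_list(Args, Lazy, PArgs),
    P =.. [F|PArgs].

mark_lazy_list([], _, []).
mark_lazy_list([A|As], Lazy, [P|Ps]) :-
    mark_lazy(A, Lazy, P),
    mark_lazy_list(As, Lazy, Ps).
```

Applied to the recursive hmm/2 clause, PotentialLazy is {Next, Y} and NonLazy is {S, Ob, Next}, so only Y is wrapped and the result has the shape `hmm(S,V) :- lookup_pattern(V,[Ob|lazy(Y)]), msw(out(S),Ob), msw(tr(S),Next), hmm(Next,Y)`. For a fact such as hmm(_,[]) this sketch leaves a trailing `true` in the body, which a fuller implementation would strip.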


Conclusions

All the Prolog implementations we tested (B-Prolog, Yap and XSB) handle tabling of structured data inefficiently.

We presented a Prolog-based program transformation that ensures O(1) time and space complexity of tabled lookups.

The transformation is data invariant and works with all the existing tabling systems.

The transformation makes it possible to scale to much larger problem instances.

Some limitations and problems to be solved:

Only applies to ground input arguments.

Abstract pointers in the heads of clauses circumvent the usual pattern-based indexing (constant time overhead).


The road ahead...

Our program transformation should be seen as a workaround until such optimizations find their way into the tabling systems. We hope that Prolog implementors will pick up on this and integrate such optimizations directly in the tabling systems, so that users do not need to transform their programs and need not worry about the underlying tabled representation and its implicit complexity.
