efficient tabling of structured data using indexing and program transformation

Efficient Tabling of Structured DataUsing Indexing and Program Transformation

Christian Theil Have and Henning Christiansen

Research group PLIS: Programming, Logic and Intelligent SystemsDepartment of Communication, Business and Information Technologies

Roskilde University, P.O.Box 260, DK-4000 Roskilde, Denmark

PADL in Philadelphia, January 23, 2012

Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data

Outline

1 Motivation and backgroundThe trouble with tabling of structured data

2 A workaround implemented in PrologExamples

Example: Edit DistanceExample: Hidden Markov Model in PRISM

3 An automatic program transformation


Motivation and background

Motivation

In the LoSt project we explore the use of Probabilistic Logicprogramming (with PRISM) for biological sequence analysis.

We had some problems analyzing very long sequences..

.. and identified the culprit: Tabling of structured data.

Inspired by earlier work by Christiansen and Gallagher which address asimilar problem related to tabling non-discrimininatory arguments.

Henning Christiansen, John P. GallagherNon-discriminating Arguments and their Uses.ICLP 2009.


Motivation and background The trouble with tabling of structured data

The trouble with tabling of structured data

Tabling in logic programming is an established technique which,

can give a significant speed-up of program execution.

make it easier to write efficient programs in a declarative style.

is similar to memoization in functional programming.

The idea:

The system maintains a table of calls and their answers.

when a new call is entered, check if it is stored in the table

if so, use previously found solution.

Tabling is included in several recognized Prolog systems such as B-Prolog,YAP and XSB.



The trouble with tabling of structured data

An innocent lookingpredicate: last/2

last([X],X).

last([_|L],X) :-

last(L,X).

Traverses a list to findthe last element.

Time/space complexity:O(n).

If we table last/2:n + n − 1 + n − 2 . . . 1≈ O(n2) !

call: last([1,2,3,4,5],X)

last([1,2,3,4,5],X) last([1,2,3,4],X) last([1,2,3],X) last([1,2],X) last([1],X)

call table

last([1,2,3,4,5],X).last([1,2,3,4],X).last([1,2,3],X).last([1,2],X).last([1],X).



Wait a minute..

Tabling systems do employ some advanced techniques to avoid theexpensive copying and which may reduce memory consumption and/ortime complexity.For instance,

B-Prolog uses hashing of goals.

XSB uses a trie data structure.

YAP uses a trie structure, which is refined into a so-called global triewhich applies a sharing strategy for common subterms wheneverpossible.

These techniques may reduce space consumption, but if there is no sharingbetween the tables and the actual arguments of an active call, eachexecution of a call may involve a full traversal (naive copying) of itsarguments.



Benchmarking last/2 for B-Prolog, Yap and XSB

To investigate, we benchmarked the tabled version of last/2 for each ofthe major Prolog tabling engines.

To further investigate whether performance is data dependent, webenchmarked with both repeated data and random data.



Space usage, tabled last/2 (1)

● ● ● ●●

●●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

0 500 1000 1500 2000 2500 3000

050

0000

1000

000

1500

000

2000

000

b) Space usage

list length (N)

spac

e us

age

(kilo

byte

s)●

●

XSB (random data)B−Prolog (random data)Yap (random data)XSB (repeated data)B−Prolog (repeated data)Yap (repeated data)

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●



Space usage, tabled last/2 (2)

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

0 5000 10000 15000

020

0040

0060

0080

0010

000

d) Space usage expanded for the four lower curves

list length (N)

spac

e us

age

(kilo

byte

s)● B−Prolog (random data)

XSB (repeated data)B−Prolog (repeated data)Yap (repeated data)



Time usage, tabled last/2 (1)

●●

● ●●

●

●

●●

●

● ●

●●

●

●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

0 500 1000 1500 2000 2500 3000

01

23

4

a) Time usage

list length (N)

time

usag

e (s

econ

ds)

●

●

XSB (random data)B−Prolog (random data)Yap (random data)XSB (repeated data)B−Prolog (repeated data)Yap (repeated data)

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●



Time usage, tabled last/2 (2)

● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

●

● ● ●●

● ●●

● ●●

●●

●●

●●

●●

●

●●

●

●

●

●

●

●

●

●

●

●

●

0 10000 20000 30000 40000 50000

0.0

0.2

0.4

0.6

0.8

1.0

c) Time usage expanded for the three lower curves

list length (N)

time

usag

e (s

econ

ds)

● B−Prolog (random data)XSB (repeated data)Yap (repeated data)


A workaround implemented in Prolog


We describe a workaround giving O(1) time and space complexity for tablelookups for programs with arbitrarily large ground structured data as inputarguments.

A term is represented as a set of facts.

A subterm is referenced by a unique integer serving as an abstractpointer.

Matching related to tabling is done solely by comparison of suchpointers.



An abstract data type

The representation is given by the following predicates which all togethercan be understood as an abstract datatype.

store term( +ground-term, pointer)The ground-term is any ground term, and the pointerreturned is a unique reference (an integer) for that term.

retrieve term( +pointer, ?functor, ?arg-pointers-list)Returns the functor and a list of pointers to representationsof the substructures of the term represented by pointer.

full retrieve term( +pointer, ?ground-term)

Returns the term represented by pointer.



ADT properties

Property 1

It must hold for any ground term s, that the query

store term(s, P), full retrieve term(P, S),

assigns to the variable S a value identical to s.

Property 2

It must hold for any ground term s of the form f (s1, . . . ,sn) that

store term(s, P), retrieve term(P, F, Ss),

assigns to the variable F the symbol f , and to Ss a list of ground values[p1,. . .,pn] such that additional queries

full retrieve term(pi, Si), i = 1, . . . , n

assign to the variables Si values identical to si .



ADT example

Example

The following call converts the term f(a,g(b)) into its internalrepresentation and returns a pointer value in the variable P.

store_term(f(a,g(b)),P).

After this, the following sequence of calls will succeed.

retrieve_term(P,f,[P1,P2]),

retrieve_term(P1,a,[]),

retrieve_term(P2,g,[P21]),

retrieve_term(P21,b,[]),

full_retrieve_term(P,f(a,g(b))).



Implementation with assert

One possible way of implementing the predicates introduced above is tohave store term/2 asserting facts for the retrieve term/3 predicateusing increasing integers as pointers.

Example

The call store term(f(a,g(b)),P) considered in example 1 may assignthe value 100 to P and as a side-effect assert the following facts.

retrieve_term(100,f,[101,102]).

retrieve_term(101,a,[]).

retrieve_term(102,g,[103]).

retrieve_term(103,b,[]).

Notice that Prolog’s indexing on first arguments ensures a constant lookuptime.



Another level of abstraction: lookup pattern/2

Finally we introduce a utility predicate which may simplify the use of therepresentation in application programs. It utilizes a special kind of terms,called patterns, which are not necessarily ground and which may containsubterms of the form lazy(variable).

lookup pattern( +pointer, +pattern)The pattern is matched in a recursive way against the termrepresented by the pointer p in the following way.– lookup pattern(p,X) is treated asfull retrieve term(p,X).– lookup pattern(p,lazy(X)) unifies X with p.– For any other pattern =.. [F,X1,. . . ,Xn] we call

retrieve term(p, F, [P1,. . .,Pn])

followed by lookup pattern(Pi,Xi), i = 1, . . . , n.



A lookup pattern/2 example

Example

Continuing the previous example, we get that after the callstore term(f(a,g(b)),P).

lookup_pattern(100, f(X,lazy(Y)))

leads to X=a and Y=102.

The lookup pattern/2 predicate will be use to simplify the automaticprogram transformation introduced later.Further efficiency can be gained by compiling it out for each specificpattern (i.e. replacing it with calls to retrieve term/2).


A workaround implemented in Prolog Examples

Examples

We consider two applying the workaround to two example programs:

Edit Distance implemented in Prolog

Hidden Markov Models in PRISM

For each of these we measure the impact (in time and space) of atransformed version using the workaround.



Implementation of store term/2 and retrieve term/2

The programs have been transformed manually for these experimentsbased on the pointer based representation previously introduced, butsimplified slightly for lists.For store term/2 and retrieve term/2 we use the following simpleimplementation:

store_term([],Index) :- assert(retrieve_term([],Index)).

store_term([X|Xs],Idx) :-

Idx1 is Idx + 1,

assert(retrieve_term(Idx,[X,Idx1])),

store_term(Xs,Idx1).



Edit Distance

Calculate the minimum number of edits (insertions, deletions andreplacements) to transform on list into another.

a minimal edit-distance algorithm written in Prolog which isdependent on tabling for any non-trivial problem.

The theoretical best time complexity of edit distance has been provento be O(N2).

Measure for various problem sizes.



Edit distance: Implementation in Prolog.

:- table edit/3.

edit([],[],0).

edit([],[Y|Ys],Dist) :-

edit([],Ys,Dist1),

Dist is 1 + Dist1.

edit([X|Xs],[],Dist) :-

edit(Xs,[],Dist1),

Dist is 1 + Dist1.

edit([X|Xs],[Y|Ys],Dist) :-

edit([X|Xs],Ys,InsDist),

edit(Xs,[Y|Ys],DelDist),

edit(Xs,Ys,TailDist),

(X==Y ->

Dist = TailDist

;

% Minimum of insertion,

% deletion or substitution

sort([InsDist,DelDist,TailDist],

[MinDist|_]),

Dist is 1 + MinDist).



Edit distance, transformed (1).

original version

edit([],[],0).

edit([],[Y|Ys],Dist) :-

edit([],Ys,Dist1),

Dist is 1 + Dist1.

edit([X|Xs],[],Dist) :-

edit(Xs,[],Dist1),

Dist is 1 + Dist1.

transformed versionedit(XIdx,YIdx,0) :-

retrieve_term(XIdx,[]),

retrieve_term(YIdx,[]).

edit(XIdx,YIdx,Dist) :-

retrieve_term(XIdx,[]),

retrieve_term(YIdx,[_,YIdxNext]),

edit(XIdx,YIdxNext,Dist1),

Dist is Dist1 + 1.

edit(XIdx,YIdx,Dist) :-

retrieve_term(YIdx,[]),

retrieve_term(XIdx,[_,XIdxNext]),

edit(XIdxNext,YIdx,Dist1),

Dist is Dist1 + 1.

continued on next slide...



Edit distance, transformed (2).

original versionedit([X|Xs],[Y|Ys],Dist) :-

edit([X|Xs],Ys,InsDist),

edit(Xs,[Y|Ys],DelDist),

edit(Xs,Ys,TailDist),

(X==Y ->

Dist = TailDist

;


[MinDist|_]),


transformed versionedit(XIdx,YIdx,Dist) :-

retrieve_term(XIdx,[X,NextXIdx]),

retrieve_term(YIdx,[Y,NextYIdx]),

edit(XIdx,NextYIdx,InsDist),

edit(NextXIdx,YIdx,DelDist),

edit(NextXIdx,NextYIdx,TailDist),

(X==Y ->

Dist = TailDist

;


[MinDist|_]),




Edit distance: Benchmarking results (time)

●●

●●●●●●●●●●●●●●●

●●●●●●●●●●●●●

●●

●

●●●●●●●●●●●●●●●●●●●●●●●●

●●●

●●

●

●

●

●

●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●

●

●

●

●●●●●●●●●●

●●

●

●●

●

●

●

●

●●

●

●●

●

●

0 50 100 150 200 250 300 350

0.0

0.2

0.4

0.6

0.8

1.0

list length (N)

time

usag

e (s

econ

ds)

●

●

XSB 3.3B−Prolog 7.5#4Yap 6.2.1index(XSB)index(B−Prolog)index(Yap)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●

●●

●

●●●●

●

●

●

●

●●●

●●●

●

●●●●●●

●●●

●

●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●

●●●●●

●●●

●

●●●●●●●●

●●

●●●●●

●●●●●●

●

●●●

●

●

●●

●●●●

●●●

●

●

●●●

●●

●

●

●●●

●●

●

●

●

●●●

●

●

●

●

●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●

●●

●

●

●

●●●●●●●●●●●●●●●●●●●

●●

●●●

●●●●●●

●●●●

●

●●

●

●●●●●

●●●●●●●●●●●

●

●●●●



Edit distance: Benchmarking results (space)

●●●●●●●●●●●●●●●●●●●●●

●●●●●●●

●●●●●

●●●●

●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

0 50 100 150

0e+0

02e

+04

4e+0

46e

+04

8e+0

41e

+05

list length (N)

spac

e us

age

(kilo

byte

s)●

●

XSB 3.3B−Prolog 7.5#4Yap 6.2.1index(XSB)index(B−Prolog)index(Yap)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●



PRISM

PRISM ( PRogramming In Statistical Modelling )

Extends Prolog (B-Prolog) with special goals representing randomvariables (msws).

Semantics: Probabilistic Herbrand models.

Supports various probabilistic inferences.

Relies heavily on tabling for the efficiency of the probabilisticinferences.



A Hidden Markov Model in PRISM

values(init,[s0,s1]).

values(out(_),[a,b]).

values(tr(_),[s0,s1]).

hmm(L):-

msw(init,S),

hmm(S,L).

hmm(_,[]).

hmm(S,[Ob|Y]) :-

msw(out(S),Ob),

msw(tr(S),Next),

hmm(Next,Y).

init

s0

0.3: a

0.7: b0.5 s1

0.6: a

0.4: b0.5

0.1

0.9

0.2

0.8



Transformed Hidden Markov Model

We only need to consider the recursive predicate, hmm/2.

original version

hmm(_,[]).

hmm(S,[Ob|Y]) :-

msw(out(S),Ob),

msw(tr(S),Next),

hmm(Next,Y).

transformed version

hmm(S,ObsPtr):-

retrieve_term(ObsPtr,[]).

hmm(S,ObsPtr) :-

retrieve_term(ObsPtr,[Ob,Y]),

msw(out(S),Ob),

msw(tr(S),Next),

hmm(Next,Y).



Hidden Markov Model: Benchmarking results (time)

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●

●●

●●

●●

●●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

0 1000 2000 3000 4000 5000

020

4060

8010

012

014

0

b) Running time without indexed lookup

sequence length

Run

ning

tim

e (s

econ

ds)

●●

●●

● ●●

●●

●● ●

●●

●●

●

● ●

●

●●

● ●● ●

●●

●●

●●

●●

● ●

●

●

● ●●

●●

●●

● ● ●●

●

0 1000 2000 3000 4000 5000

0.00

0.02

0.04

0.06

0.08

a) Running time with indexed lookup

sequence length

Run

ning

tim

e (s

econ

ds)


An automatic program transformation


We introduce an automatic transformation from a tabled program to anefficient version using our approach.To support the transformation, the user must declare modes for whichpredicate arguments that should be indexed.

table_index_mode(hmm(+))

table_index_mode(hmm(-,+))

Each clause whose head is covered by a table mode declaration istransformed and all other clauses are left untouched.



Transformation of HMM in PRISM

The transformation moves any term appearing in an indexed position inthe head of a clause into a call to the lookup pattern predicate, which isadded to the body. Variables in such terms are marked lazy when they donot occur in any non-indexed argument inside the clause.

original program

hmm(_,[]).

hmm(S,[Ob|Y]) :-

msw(out(S),Ob),

msw(tr(S),Next),

hmm(Next,Y).

transformed program

hmm(S,ObsPtr):-

lookup_pattern(ObsPtr,[]).

hmm(S,ObsPtr) :-

lookup_pattern(ObsPtr,[Ob | lazy(Y)]),

msw(out(S),Ob),

msw(tr(S),Next),

hmm(Next,Y).



Transformation algorithm

indexed. For the HMM program of section 4.2 this may look as follows; plusmeans transform the argument, minus means keep it unchanged.



The correctness of the transformation depends on the following properties of theprogram.

– The arguments indicated for indexing must be called with ground data only.– Variables that occur in an indexed argument in the head of a clause, cannot

occur in the body of that clause in both an indexed and a non-indexedargument.

– Any argument of a goal within a clause body which is declared to be indexed,must be given as a variable that also occurs in an indexed argument in thehead of that clause.

Each clause whose head predicate is covered by a table mode declaration istransformed using the procedure outlined in algorithm 1, and all other clausesare left untouched. The transformation moves any term appearing in an indexed

for each clause H:-B in original program doif table index mode (M) matching H then

for each argument Hi ∈ H, Mi ∈M doif Mi =’+’ then

H �i ← MarkLazy(Hi, B)

B ← (lookup pattern(Vi, H�i), B)

Hi ← Vi

end

end

where MarkLazy is defined as

MarkLazy(Hi,B) :PotentialLazy = variables in all goals G ∈ B

where G has table index mode declarationNonLazy = variables in all goals G ∈ B

where G has no table index mode declarationLazy = PotentialLazy \ NonLazyfor each variable V ∈ Hi do

if V ∈ Lazy thenV ← lazy(V )

endAlgorithm 1: Program transformation.

position in the head of a clause into a call to the lookup pattern predicate,which is added to the body. Variables in such terms are marked lazy when theydo not occur in any non-indexed argument inside the clause. This transformation



Transformation algorithm: MarkLazy

indexed. For the HMM program of section 4.2 this may look as follows; plusmeans transform the argument, minus means keep it unchanged.



The correctness of the transformation depends on the following properties of theprogram.

– The arguments indicated for indexing must be called with ground data only.– Variables that occur in an indexed argument in the head of a clause, cannot

occur in the body of that clause in both an indexed and a non-indexedargument.

– Any argument of a goal within a clause body which is declared to be indexed,must be given as a variable that also occurs in an indexed argument in thehead of that clause.

Each clause whose head predicate is covered by a table mode declaration istransformed using the procedure outlined in algorithm 1, and all other clausesare left untouched. The transformation moves any term appearing in an indexed

for each clause H:-B in original program doif table index mode (M) matching H then

for each argument Hi ∈ H, Mi ∈M doif Mi =’+’ then

H �i ← MarkLazy(Hi, B)

B ← (lookup pattern(Vi, H�i), B)

Hi ← Vi

end

end

where MarkLazy is defined as

MarkLazy(Hi,B) :PotentialLazy = variables in all goals G ∈ B

where G has table index mode declarationNonLazy = variables in all goals G ∈ B

where G has no table index mode declarationLazy = PotentialLazy \ NonLazyfor each variable V ∈ Hi do

if V ∈ Lazy thenV ← lazy(V )

endAlgorithm 1: Program transformation.

position in the head of a clause into a call to the lookup pattern predicate,which is added to the body. Variables in such terms are marked lazy when theydo not occur in any non-indexed argument inside the clause. This transformation



Conclusions

All Prolog implementations handle tabling of structured datainefficiently.

We presented a Prolog based program transformation that ensuresO(1) time and space complexity of tabled lookups.

The transformation is data invariant and works with all the existingtabling systems.

The transformation makes it possible to scale to much larger probleminstances.

Some limitations and problems to be solved:

Only applies to ground input arguments.

Abstract pointers in head of clauses circumvents usual pattern basedindexing (constant time overhead).



The road ahead...

Our program transformation should be seen as workaround, until suchoptimizations find their way into the tabling systems. We hope thatProlog implementors will pick up on this and integrate such optimizationsdirectly in the tabling systems, so that the user does not need to transformhis program, and need not worry about the underlying tabledrepresentation and its implicit complexity.


efficient tabling of structured data using indexing and program transformation

Technology

structured

structured

prolog examplesedit

structured

full retrieve

structured

retrieve term2

structured