datalog and its extensions for semantic web databases
DESCRIPTION
TRANSCRIPT
![Page 1: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/1.jpg)
Datalog and its Extensionsfor Semantic Web Databases
Georg Gottlob1 Giorgio Orsi1 Andreas Pieris1 Mantas Šimkus2
1Department of Computer Science, University of Oxford, UK2Institute of Information Systems, Vienna University of Technology, Austria
Reasoning Web Summer School, Vienna, Austria, September 3 - 8, 2012
![Page 2: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/2.jpg)
Structure of the Lecture
¡ Relational Databases and Datalog
¡ Complexity of Datalog
¡ Datalog and Ontological Reasoning
¡ Datalog§
![Page 3: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/3.jpg)
Relational Databases
Predominant technologyfor data storage and processing
![Page 4: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/4.jpg)
Relational Databases
“On the fly” example:
London
Vienna
Larnaca
Glasgow
Edinburgh
Predominant technologyfor data storage and processing
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
![Page 5: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/5.jpg)
Relational Databases (Terminology)
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
VIE, LHR, …
BA, U2, OS
Vienna, London, …
Constants(from a domain)
![Page 6: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/6.jpg)
Relational Databases (Terminology)
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
Relations
VIE, LHR, …
BA, U2, OS
Vienna, London, …
Constants(from a domain)
![Page 7: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/7.jpg)
Relational Databases (Terminology)
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
Relations
Tuples
Relational atoms
Flight(LHR,EDI,BA)
Airport(LGW,London)
VIE, LHR, …
BA, U2, OS
Vienna, London, …
Constants(from a domain)
![Page 8: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/8.jpg)
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
List all the airlines
{BA, U2, OS}
Πairline Flight
Querying: Relational Algebra
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
![Page 9: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/9.jpg)
List the codes of the airports in London
{LHR, LGW}
Πcode (σcity = “London” Airport)
Querying: Relational Algebra
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
![Page 10: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/10.jpg)
Which airlines fly directly from London to Glasgow?
T1 Ã σcity = “London” Airport
Querying: Relational Algebra
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
T2 Ã σcity = “Glasgow” Airport
![Page 11: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/11.jpg)
Querying: Relational Algebra
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
T1 code city
LHR London
LGW London
T2 code city
GLA Glasgow
T1 Ã σcity = “London” Airport
T2 Ã σcity = “Glasgow” Airport
Which airlines fly directly from London to Glasgow?
![Page 12: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/12.jpg)
T3 Ã Flight origin = code T1
Querying: Relational Algebra
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
T1 code city
LHR London
LGW London
T2 code city
GLA Glasgow
Which airlines fly directly from London to Glasgow?
![Page 13: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/13.jpg)
Querying: Relational Algebra
T2 code city
GLA Glasgow
T3 origin destination airline code city
LHR EDI BA LHR London
LGW GLA U2 LGW London
Which airlines fly directly from London to Glasgow?
T3 Ã Flight origin = code T1
![Page 14: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/14.jpg)
Querying: Relational Algebra
T2 code city
GLA Glasgow
T3 origin destination airline code city
LHR EDI BA LHR London
LGW GLA U2 LGW London
T4 Ã T3 destination = code T2
Which airlines fly directly from London to Glasgow?
![Page 15: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/15.jpg)
Querying: Relational Algebra
T4 origin destination airline code city code city
LGW GLA U2 LGW London GLA Glasgow
Which airlines fly directly from London to Glasgow?
T4 Ã T3 destination = code T2
![Page 16: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/16.jpg)
Querying: Relational Algebra
T4 origin destination airline code city code city
LGW GLA U2 LGW London GLA Glasgow
Πairline T4
Which airlines fly directly from London to Glasgow?
![Page 17: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/17.jpg)
List all the airlines
Querying: FOL and SQL Representation
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
FOL Representation SQL Representation
9X9Y Flight(X,Y,Z)SELECT airlineFROM Flight
Z is free
![Page 18: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/18.jpg)
List the codes of the airports in London
Querying: FOL and SQL Representation
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
FOL Representation SQL Representation
Airport(X,London)SELECT codeFROM AirportWHERE city = “London”
![Page 19: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/19.jpg)
Querying: FOL and SQL Representation
Flight origin destination airline
VIE LHR BA
LHR EDI BA
LGW GLA U2
LCA VIE OS
Airport code city
VIE Vienna
LHR London
LGW London
LCA Larnaca
GLA Glasgow
EDI Edinburgh
9X9Y Airport(X,London) ^
Airport(Y,Glasgow) ^
Flight(X,Y,Z) ^
SELECT F.airlineFROM Flight as F,
Airport as A1,Airport as A2
WHERE A1.code = F.origin ANDA2.code = F.destination ANDA1.city = “London” ANDA2.city = “Glasgow”
Which airlines fly directly from London to Glasgow?
Z is free
![Page 20: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/20.jpg)
London
Vienna
Larnaca
Glasgow
Edinburgh
What we Cannot Ask?
Is Glasgow reachable from Vienna?
Looks like this FOL query can do the job…
9X9Y9Z9W9V Airport(X,Vienna) ^ Airport(Y,Glasgow) ^ Flight(X,Z,W) ^ Flight(Z,Y,V)
YES
![Page 21: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/21.jpg)
London
Larnaca
Glasgow
Edinburgh
What we Cannot Ask?
Is Glasgow reachable from Vienna?
Looks like this FOL query can do the job…
Intermediate
NO
Vienna
9X9Y9Z9W9V Airport(X,Vienna) ^ Airport(Y,Glasgow) ^ Flight(X,Z,W) ^ Flight(Z,Y,V)
![Page 22: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/22.jpg)
London
Larnaca
Glasgow
Edinburgh
What we Cannot Ask?
Is Glasgow reachable from Vienna?
Intermediate
¡ List all the pairs of airports (A1,A2) such that A2 is reachable from A1
¡ Is their a pair (A1,A2) such that A1 is in Vienna and A2 is in Glasgow?
Vienna
![Page 23: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/23.jpg)
What we Cannot Ask?
¡ List all the pairs of airports (A1,A2) such that A2 is reachable from A1
Flight origin destination airline Airport code city
¡ Is their a pair (A1,A2) such that A1 is in Vienna and A2 is in Glasgow?
Reachable(X,Y) Flight(X,Y,Z)
Reachable(X,W) Flight(X,Y,Z), Reachable(Y,W)
Ans() Airport(X,Vienna), Airport(Y,Glasgow), Reachable(X,Y)
![Page 24: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/24.jpg)
What we Cannot Ask?
RECURSION
RA, FOL, SQL not enough
¡ List all the pairs of airports (A1,A2) such that A2 is reachable from A1
Flight origin destination airline Airport code city
Reachable(X,Y) Flight(X,Y,Z)
Reachable(X,W) Flight(X,Y,Z), Reachable(Y,W)
![Page 25: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/25.jpg)
What we Cannot Ask?
DATALOG
Select-Project-Join + Recursion
¡ List all the pairs of airports (A1,A2) such that A2 is reachable from A1
Flight origin destination airline Airport code city
Reachable(X,Y) Flight(X,Y,Z)
Reachable(X,W) Flight(X,Y,Z), Reachable(Y,W)
![Page 26: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/26.jpg)
Datalog at First Glance
A B C D
Transitive closure of a graph:
TrClosure(X,Y) Graph(X,Y)
TrClosure(X,Y) Graph(X,Z),TrClosure(Z,Y)
![Page 27: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/27.jpg)
Datalog at First Glance
Transitive closure of a graph:
A B C D
Graph
A B
B C
C D
TrClosure
A B
A C
A D
B C
B D
C D
TrClosure(X,Y) Graph(X,Y)
TrClosure(X,Y) Graph(X,Z),TrClosure(Z,Y)
![Page 28: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/28.jpg)
Datalog at First Glance
¡ Semantics - a mapping from instances over body-relations to instances
over head-relations
¡ Equivalent approaches for defining the semantics:
¡ Model-Theoretic - logical sentences asserting a property of the result
¡ Fixpoint - solution of a fixpoint equation
¡ Proof-Theoretic - obtaining proofs of facts
Graph
A B
B C
C D
TrClosure
A B
A C
A D
B C
B D
C D
![Page 29: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/29.jpg)
Datalog
¡ Formal Syntax
¡ Fixpoint Semantics
¡ Complexity Results
Note: For details on the model-theoretic and proof-theoretic semantics see:
- Foundations of Databases by Abiteboul, Hull and Vianu
- Logic Programming and Databases by Ceri, Gottlob and Tanca
![Page 30: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/30.jpg)
Syntax of Datalog
R0(X0) R1(X1),…,Rn(Xn)
head body
¡ n ¸ 0 (empty body)
¡ R0,…,Rn are relation symbols (or predicates)
¡ X0,…,Xn are tuples of terms (constants or variables)
¡ Safe - each variable occurring in head must occur in body
A Datalog rule is an expression
![Page 31: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/31.jpg)
Syntax of Datalog
¡ Datalog program P - a (finite) set of Datalog rules
¡ Extensional predicate - does not occur in the head of a rule of P
¡ Intensional predicate - occurs in the head of some rule of P
¡ edb(P) - extensional predicates of P
¡ idb(P) - intensional predicates of P
¡ sch(P) - the set of predicates edb(P) [ idb(P)
![Page 32: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/32.jpg)
Syntax of Datalog
¡ Datalog program P - a (finite) set of Datalog rules
¡ Extensional predicate - does not occur in the head of a rule of P
¡ Intensional predicate - occurs in the head of some rule of P
¡ edb(P) - extensional predicates of P
¡ idb(P) - intensional predicates of P
¡ sch(P) - the set of predicates edb(P) [ idb(P)
not necessary
![Page 33: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/33.jpg)
Example: Recursive Program P
Flight origin destination airline Airport code city
Reachable(X,Y) Flight(X,Y,Z)
Reachable(X,W) Flight(X,Y,Z),Reachable(Y,W)
Ans() Airport(X,Vienna),Airport(Y,Glasgow),Reachable(X,Y)
Is Glasgow reachable from Vienna?
dom(P) = {Vienna, Glasgow}
sch(P) = {Flight, Airport, Reachable, Ans}
edb(P) = {Flight, Airport}
idb(P) = {Reachable, Ans}
![Page 34: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/34.jpg)
Fixpoint Semantics
¡ Relies on the immediate consequence operator TP
¡ Given a database D and a Datalog program P, an atom R(c1,…,cn)
¡ is an immediate consequence for D and P if:
- R(c1,…,cn) 2 D, or
- exists R0(X0) R1(X1),…,Rn(Xn) in P, and a homomorphism h:
{h(R1(X1)),…,h(Rn(Xn))} µ D and h(R0(X0)) = R(c1,…,cn)
¡ TP - mapping from databases for sch(P) to databases for sch(P)
¡ TP (D) = { immediate consequences for D and P }
![Page 35: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/35.jpg)
Fixpoint Semantics
¡ A crucial fact - for each P and D for edb(P):
TP has a minimum fixpoint containing D
¡ The semantics of P on D, denoted P(D), is this minimum fixpoint
¡ How do we compute it?
I is a fixpoint of TP if TP(I) = I
Note: For the proof see, e.g., Theorem 12.3.2
in Foundations of Databases
![Page 36: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/36.jpg)
Fixpoint Semantics
¡ TP,0(I) = I and TP,i+1(I) = TP(TP,i(I))
¡ Compute TP,ω(I) = [i ¸ 0 TP,i(I)
![Page 37: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/37.jpg)
Fixpoint Semantics: Example
¡ Let P the program which computes the transitive closure of a graph:
TrClosure(X,Y) Graph(X,Y)
TrClosure(X,Y) Graph(X,Z),TrClosure(Z,Y)
¡ Consider the input database D = {Graph(A,B),Graph(B,C),Graph(C,D)}
¡ TP,1(D) = D [ {TrClosure(A,B),TrClosure(B,C),TrClosure(C,D)}
¡ TP,2(D) = TP,1(D) [ {TrClosure(A,C),TrClosure(B,D)}
¡ TP,3(D) = TP,2(D) [ {TrClosure(A,D)}
¡ TP,4(D) = TP,3(D)
¡ Thus, TP,ω(D) = TP,3(D)
A B C D
![Page 38: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/38.jpg)
Fixpoint Semantics
¡ TP,0(I) = I and TP,i+1(I) = TP(TP,i(I))
¡ Compute TP,ω(I) = [i ¸ 0 TP,i(I)
¡ TP,ω(D) is the minimum fixpoint of TP containing D
¡ TP,ω(D) is the µ-minimal model of P containing D
Note: For the proof see, e.g., Theorem 12.3.4
in Foundations of Databases
![Page 39: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/39.jpg)
Model-Theoretic Approach
Transitive closure of a graph:
Graph
A B
B C
C D
TrClosure
A B
A C
A D
B C
B D
C D
TrClosure(X,Y) Graph(X,Y)
TrClosure(X,Y) Graph(X,Z),TrClosure(Z,Y)
µ-minimal model
M
A A
A B
A C
A D
D D
not µ-minimal model
![Page 40: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/40.jpg)
Complexity of Datalog
¡ Fact inference problem:
- Input: program P, database D for edb(P), atom α
- Question: P [ D ² α, or, equivalently, α 2 P(D)?
¡ Data complexity - P fixed, D part of the input[Vardi, STOC 1982]
¡ Combined complexity - both P and D part of the input[Vardi, STOC 1982]
![Page 41: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/41.jpg)
Data Complexity of Datalog
Theorem: Datalog is PTIME-complete in data complexity
Proof (in PTIME):
¡ consider a program P and a database D
¡ |P(D)| · |sch(P)| ¢ (|dom(P) [ dom(D)|)maxarity
¡ maxarity is a constant ) P(D) can be constructed in PTIME
maximum number oftuples using constants of
dom(P) [ dom(D)
dom(P) - constants in P
dom(D) - constants in D
![Page 42: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/42.jpg)
Data Complexity of Datalog
Proof (PTIME-hard): reduction from fact inference for propositional LP(2)
Theorem: Datalog is PTIME-complete in data complexity
![Page 43: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/43.jpg)
Data Complexity of Datalog
¡ Propositional LP(2) - set of rules R0 R1,R2
¡ Fact inference for propositional LP(2):
- Input: propositional LP(2) program P, propositional atom Q
- Question: P ² Q?
![Page 44: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/44.jpg)
PTIME-hardness of LP(2)
Theorem: LP(2) is PTIME-hard
Proof: Logspace reduction from Monotone Circuit Value Problem
Æ Ç
Ç
g4 g5
g6
g1 g2 g3
1 0 1
Does the circuit evaluate to true?
![Page 45: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/45.jpg)
PTIME-hardness of LP(2)
Theorem: LP(2) is PTIME-hard
Proof: Logspace reduction from Monotone Circuit Value Problem
Æ Ç
Ç
g4 g5
g6
g1 g2 g3
1 0 1
encoding of the circuit as LP(2) program P
g1
g3
g6 g4
g6 g5
g4 g1, g2
g5 g3
g5 g2
Circuit evaluates to true iff P ² g6Does the circuit evaluate to true?
![Page 46: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/46.jpg)
Data Complexity of Datalog
¡ Propositional LP(2) - set of rules R0 R1,R2
¡ Fact inference for propositional LP(2):
- Input: propositional LP(2) program P, propositional atom Q
- Question: P ² Q?
¡ PTIME-hard - we can also simulate a PTIME Turing machinesee, e.g., [Dantsin, Eiter, Gottlob & Voronkov, ACM Computing Surveys 2001]
![Page 47: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/47.jpg)
Data Complexity of Datalog
Theorem: Datalog is PTIME-complete in data complexity
Proof (PTIME-hard): reduction from fact inference for propositional LP(2)
¡ For each R0 add in D the atom True(R0)
¡ For each R0 R1,R2 add in D the atom S(R0,R1,R2)
¡ Construct the fixed Datalog program PDAT:
¡ P ² Q iff T(Q) 2 PDAT(D)
encode program
P in database D
T(X) True(X)
T(Z) T(X),T(Y),S(Z,X,Y)meta-interpreter for LP(2)
![Page 48: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/48.jpg)
Combined Complexity of Datalog
Theorem: Datalog is EXPTIME-complete in combined complexity
Proof (in EXPTIME):
¡ consider a program P and a database D
¡ |P(D)| · |sch(P)| ¢ (|dom(P) [ dom(D)|)maxarity
¡ P(D) can be constructed in EXPTIME
maximum number oftuples using constants of
dom(P) [ dom(D)
![Page 49: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/49.jpg)
Combined Complexity of Datalog
Theorem: Datalog is EXPTIME-complete in combined complexity
Proof (EXPTIME-hard):
by simulating an EXPTIME Turing machine
![Page 50: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/50.jpg)
Deterministic Turing Machine (DTM)
M = (S, Σ, t, δ, s0, sacc)
states tapesymbols
blanksymbol
Sn{sacc} £ Σ ! S £ Σ £ {-1,0,1}
initial state
accepting state
![Page 51: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/51.jpg)
Deterministic Turing Machine (DTM)
M = (S, Σ, t, δ, s0, sacc)
states tapesymbols
blanksymbol
Sn{sacc} £ Σ ! S £ Σ £ {-1,0,1}
initial state
accepting state
δ(s1,a) = (s2,b,1)
IF at some time instant τ the machine is in sate s1, the cursor
points to cell κ, and this cell contains a
THEN at instant τ+1 the machine is in state s2, cell κ contains b,
and the cursor points to cell κ+1
![Page 52: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/52.jpg)
EXPTIME-hardness of Datalog
The goal: encode the EXPTIME computation of a DTM M on input
string I with a Datalog program P, a database D, and an atom α such
that α 2 P(D) iff M accepts I in at most N = 2m steps, where m =|I|k
![Page 53: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/53.jpg)
The Relational Schema
Time points and tape positions from 0 to N-1, are encoded using m-ary tuples
{0,1}m (recall that N = 2m) such that 0 = (0,…,0), 1 = (0,…,1), …, N-1 = (1,…,1)
¡ Symbol[a](T,C) - at time instant T, cell C contains a
¡ Cursor(T,C) - at time instant T, cursor points to cell C
¡ State[s](T) - at time instant T, the machine is in state s
¡ Accept(T) - at time instant T, the machine accepts
where T = T1,…,Tm and C = C1,…,Cm
![Page 54: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/54.jpg)
Initialization Rules
¡ Assume that I = a1…an
¡ Assume that we have the relations Firstm, Succm and Ám (will be defined later)
Symbol[ai](T,ti) Firstm(T)
Cursor(T,T) Firstm(T)
Symbol[t](T,Y) Firstm(T), Ám(t,C)
the number i
State[s0](T) Firstm(T)
the number n
![Page 55: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/55.jpg)
Transition Rules
Symbol[b](T0,C) State[s1](T), Symbol[a](T,C), Cursor(T,C), Succm(T,T0)
δ(s1,a) = (s2,b,1)
Cursor(T0,C0) State[s1](T), Symbol[a](T,C), Cursor(T,C), Succm(T,T0),
Succm(C,C0)
State[s2](T0) State[s1](T), Symbol[a](T,C), Cursor(T,C), Succm(T,T0)
![Page 56: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/56.jpg)
Inertia Rules
Symbol[a](T0,C) Symbol[a](T,C), Cursor(T,C0), Ám(C,C0), Succm(T,T0)
Cells which are not changed during the transition keep their old values
Symbol[a](T0,C) Symbol[a](T,C), Cursor(T,C0), Ám(C0,C), Succm(T,T0)
![Page 57: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/57.jpg)
Accepting Rule
Once we reach the accepting state we accept
Accept State[sacc](T)
![Page 58: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/58.jpg)
Succi+1(Z,X,Z,Y) Succi(X,Y)
Defining Firstm, Succm and Ám
We assume that D = {First0(0), Last1(1), Succ1(0,1)}
Z 2 {0,1}
Succi+1(Z,X,W,Y) Succ1(Z,W), Lasti(X), Firsti(Y)
Firsti+1(Z,X) First1(Z), Firsti(X)
Lasti+1(Z,X) Last1(Z), Lasti(X)
inductive definitionof Firsti+1 and Succi+1
Ám(X,Y) Succm(X,Y)
Ám(X,Y) Succm(X,Z), Ám(Z,Y)definition of Ám
![Page 59: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/59.jpg)
Concluding EXPTIME-hardness of Datalog
¡ Several rules but polynomially many ) feasible in PTIME
¡ Accept 2 P(D) iff M accepts I in at most N steps
¡ Can be formally shown by induction on the time steps
![Page 60: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/60.jpg)
Datalog as an Ontology Language
¡ Ontology languages are usually based on description logics (prev. lecture)
¡ Much is possible with Datalog
DL Axiom Datalog Rule
Parent u Male v Father Father(X) Parent(X),Male(X)
MetalDevice v 8hasPart.Metal Metal(Y) MetalDevice(X), hasPart(X,Y)
brotherOf v relativeOf relativeOf(X,Y) brotherOf(X,Y)
parentOf inv childOf childOf(Y,X) parentOf(X,Y)
trans(ancestorOf) ancOf(X,Z) ancOf(X,Y), ancOf(Y,Z)
SeniorEmp £ Emp v moreThan moreThan(X,Y) SeniorEmp(X), Emp(Y)
![Page 61: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/61.jpg)
Datalog as an Ontology Language
¡ Much is not possible with Datalog[Patel-Schneider & Horrocks, Journal of Web Semantics 2007]
DL Axiom ?
Employee v reportsTo 9Y reportsTo(X,Y) Employee(X)
funct(reportsTo) Y = Z reportsTo(X,Y), reportsTo(X,Z)
Employee disj Customer ? Employee(X), Customer(X)
¡ Ontology languages are usually based on description logics (prev. lecture)
![Page 62: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/62.jpg)
Datalog§
¡ Extend Datalog by allowing in the head:
- Existential quantification (9)
- Equality atoms (=)
- Constant false (?)
¡ As we shall see, Datalog[9] is undecidable
¡ Datalog[9,=,?] is syntactically restricted ! Datalog§
Datalog[9,=,?]
![Page 63: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/63.jpg)
Datalog Extensions
¡ Formal Syntax of Datalog[9]
¡ Fixpoint Semantics of Datalog[9]
¡ Undecidability of Datalog[9]
¡ Guardedness, Linearity and Stickiness
¡ Additional Features (=,?)
![Page 64: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/64.jpg)
Syntax of Datalog[9]
Y R0(X0,Y) R1(X1),…,Rn(Xn)
head body
¡ n ¸ 0 (empty body)
¡ R0,…,Rn are relation symbols (or predicates)
¡ X0,…,Xn are tuples of terms (constants or variables)
¡ Y is a tuple of variables (disjoint from X0 […[ Xn)
A Datalog[9] rule is an expression
![Page 65: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/65.jpg)
Syntax of Datalog[9]
¡ Datalog[9] program P - a (finite) set of Datalog[9] rules
¡ sch(P) - the set of predicates ocurring in P
¡ Note: sch(P) is no longer partitioned into idb(P) and edb(P) - why?
![Page 66: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/66.jpg)
Syntax of Datalog[9]
¡ Datalog[9] program P - a (finite) set of Datalog[9] rules
¡ sch(P) - the set of predicates ocurring in P
¡ Note: sch(P) is no longer partitioned into idb(P) and edb(P) - why?
DL Ontology Datalog[9] Program P
Person v hasFather 9Y hasFather(X,Y) Person(X)
hasFather ¡ v Person Person(Y) hasFather(X,Y)
All predicates of sch(P) appear in the body and in the head
![Page 67: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/67.jpg)
Fixpoint Semantics
Analogous to the fixpoint semantics of Datalog - chase procedure
Input: Database D, Datalog[9] program P
Output: Instance for sch(P) that satisfies P
Person(John)
9Y hasFather(X,Y) Person(X) Person(Y) hasFather(X,Y)
chase(D,P) = D [ ?
![Page 68: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/68.jpg)
Fixpoint Semantics
Analogous to the fixpoint semantics of Datalog - chase procedure
Input: Database D, Datalog[9] program P
Output: Instance for sch(P) that satisfies P
Person(John)
9Y hasFather(X,Y) Person(X) Person(Y) hasFather(X,Y)
chase(D,P) = D [ {hasFather(John,z1)
![Page 69: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/69.jpg)
Fixpoint Semantics
Analogous to the fixpoint semantics of Datalog - chase procedure
Input: Database D, Datalog[9] program P
Output: Instance for sch(P) that satisfies P
Person(John)
9Y hasFather(X,Y) Person(X) Person(Y) hasFather(X,Y)
chase(D,P) = D [ {hasFather(John,z1), Person(z1)
![Page 70: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/70.jpg)
Fixpoint Semantics
Analogous to the fixpoint semantics of Datalog - chase procedure
Input: Database D, Datalog[9] program P
Output: Instance for sch(P) that satisfies P
Person(John)
9Y hasFather(X,Y) Person(X) Person(Y) hasFather(X,Y)
chase(D,P) = D [ {hasFather(John,z1), Person(z1), hasFather(z1,z2)
![Page 71: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/71.jpg)
Fixpoint Semantics
Analogous to the fixpoint semantics of Datalog - chase procedure
Input: Database D, Datalog[9] program P
Output: Instance for sch(P) that satisfies P
Person(John)
9Y hasFather(X,Y) Person(X) Person(Y) hasFather(X,Y)
chase(D,P) = D [ {hasFather(John,z1), Person(z1), hasFather(z1,z2) …
![Page 72: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/72.jpg)
Fixpoint Semantics
Ι = {S(a,b), R(a)}
9Y S(X,Y) R(X)
h = {X! a}μ = {X! a, Υ! b}
Ι = {S(b,a), R(a)}
9Y S(X,Y) R(X)
h = {X! a}£
¡ Chase rule - the building block of the chase procedure
¡ A rule ρ = Y R0(X0,Y) R1(X1),…,Rn(Xn) is applicable to instance I if:
- exists homomorphism h such that {h(R1(X1)),…,h(Rn(Xn))} µ I
- but no μ ¶ h such that μ(R0(X0,Y)) 2 Ι
![Page 73: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/73.jpg)
Fixpoint Semantics
¡ Chase rule - the building block of the chase procedure
¡ A rule ρ = Y R0(X0,Y) R1(X1),…,Rn(Xn) is applicable to instance I if:
- exists homomorphism h such that {h(R1(X1)),…,h(Rn(Xn))} µ I
- but no μ ¶ h such that μ(R0(X0,Y)) 2 Ι
¡ Let J = I [ {μ(R0(X0,Y))}, where μ ¶ h and μ(Yi) is a fresh value not in I
¡ The result of applying ρ to Ι is J, denoted - chase stepρ,h
Ι J
![Page 74: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/74.jpg)
¡ A finite chase of D w.r.t. to P is a finite sequence
¡ and chase(D,P) is defined as the instance Im
Fixpoint Semantics
D I1ρ1,h1
I2ρ2,h2 ρ3,h3
Imρm,hm
¡ An infinite chase of D w.r.t. to P is an infinite sequence
¡ and chase(D,P) is defined as the instance [j ¸ 0 Ij (with I0 = D)
D I1ρ1,h1
I2ρ2,h2 ρ3,h3
Imρm,hm
¡ The semantics of P on D, denoted P(D), is defined as chase(D,P)
![Page 75: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/75.jpg)
8I (I model of D and P ) chase(D,P) I)
Chase: A Universal Model
Implicit in [ Fagin, Kolaitis, Miller & Popa, Theoretical Computer Science 2005]
D
. . .
C = chase(D,P)
I1I2
h1h2
h1(C) h2(C)
hom
![Page 76: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/76.jpg)
¡ In general is not unique - depends on the order of rule application
Chase: Uniqueness Property
D = {R(a)} ρ1 = 9Y S(Y) R(X) ρ2 = S(X) R(X)
Solution1 = {R(a), S(z), S(a)}
Solution2 = {R(a), S(a)}
ρ1 then ρ2
ρ2 then ρ1
¡ Unique up to homomorphic equivalence
C2C1 C3
h12
h21
h23
h32
![Page 77: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/77.jpg)
¡ In general is infinite
Chase: The Challenge of Infinity
D = {R(a,b)} 9Z R(Y,Z) R(X,Y)
Solution = {R(a,b),R(b,z1),R(z1,z2),R(z2,z3),…}
¡ For plain Datalog, the fixpoint semantics provides an algorithm
¡ The situation changes dramatically for Datalog[9] - undecidable
![Page 78: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/78.jpg)
Undecidability of Datalog[9]
Theorem: Datalog[9] is undecidable
Proof :
by simulating a deterministic Turing machine with an empty tape
![Page 79: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/79.jpg)
Build an Infinite Grid
i-th horizontal line represents the
i-th configuration of the machine
Node(X) Start(X)
Initial(X) Start(X)
Node(Y) H(X,Y)
9Y H(X,Y) Node(X)
9Y V(X,Y) Node(X)
Node(Y) V(X,Y)
V(Y,W) H(X,Y), H(Z,W), V(X,Z)
Start(c) fixesthe starting point
H
V
c
X Y
Z W
![Page 80: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/80.jpg)
Initialization Rules
Initial(Y) Initial(X), H(X,Y)
Cursor[s0](X) Start(X)
Symbol[t](X) Initial(X)
t t ts0
![Page 81: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/81.jpg)
Transition Rules
Cursor[s2](Z) Cursor[s1](X), Symbol[a](X), V(X,Y), H(Y,Z)
δ(s1,a) = (s2,b,1)
Symbol[b](Y) Cursor[s1](X), Symbol[a](X), V(X,Y)
Mark(X) Cursor[s1](X), Symbol[a](X)
s1a
s2
b
![Page 82: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/82.jpg)
Inertia Rules
MarkRight(Y) Mark(X), H(X,Y)
MarkRight(Y) MarkRight(X), H(X,Y)
Symbol[a](Y) MarkRight(X), Symbol[a](X), V(X,Y)
a b c d
MarkLeft MarkRight
a b c d
We need similar rules for the cells before the cursor
![Page 83: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/83.jpg)
Accepting Rule
Once we reach the accepting state we accept
Accept Cursor[sacc](X)
Accept 2 P(D) iff the DTM accepts
![Page 84: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/84.jpg)
Undecidability of Datalog[9]
Theorem: Datalog[9] is undecidable
Proof :
by simulating a deterministic Turing machine with an empty tape
… syntactic restrictions are needed!
![Page 85: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/85.jpg)
Guarded Datalog[9]
¡ Inspired by the guarded fragment of first-order logic
[Andréka, van Benthem & Németi, Journal of Philosophical Logic 98]
¡ There exists a body-atom that contains all the body-variables - guard
Manager(X) Employee(X), supervisorOf(X,Y), Manager(Y)
¡ Chase has finite treewidth ) Guarded Datalog[9] is decidable
[Calì, Gottlob & Kifer, KR 2008]
![Page 86: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/86.jpg)
Treewidth of the Chase
¡ Tree decomposition - a mapping of a graph into a tree
A B F
C
D E
G
H
A B C
B C E
C D E B E G
B F G E G H
¡ Treewidth - number of graph vertices mapped to any treenode (in fact, -1)
Treewidth = 2
![Page 87: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/87.jpg)
Treewidth of the Chase
¡ An instance I can be represented as a graph - Gaifman graph
¡ Treewidth of I is defined as the treewidth of its Gaifman graph
R(a,b,c)
S(c,d)
T(c,d,e)
a
b
c
d
e
¡ Chase has finite treewidth ) is a tree-like structure
![Page 88: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/88.jpg)
Guarded Datalog[9]
¡ Inspired by the guarded fragment of first-order logic
[Andréka, van Benthem & Németi, Journal of Philosophical Logic 98]
¡ There exists a body-atom that contains all the body-variables - guard
Manager(X) Employee(X), supervisorOf(X,Y), Manager(Y)
¡ Chase has finite treewidth ) Guarded Datalog[9] is decidable
[Calì, Gottlob & Kifer, KR 08]
¡ What about the complexity of Guarded Datalog[9]? - now we consider
sdontological query answering (previous lecture)
![Page 89: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/89.jpg)
Complexity of Guarded Datalog[9]
¡ Ontological query answering problem:
- Input: program P, database D for sch(P), conjunctive query Q
- Question: P [ D ² Q, or, equivalently, P(D) ² Q?
¡ Data complexity - P and Q fixed, D part of the input
¡ Combined complexity - everything part of the input
TBox ABox
![Page 90: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/90.jpg)
Ontological Query Answering: Example
… Father(z,John) Person(z) …
Person(John)
D9Y hasFather(X,Y) Person(X)
Person(Y) hasFather(X,Y)
P
P(D)
![Page 91: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/91.jpg)
Ontological Query Answering: Example
Q1 ← Father(X,John), Person(X)
… Father(z,John) Person(z) …
Person(John)
D9Y hasFather(X,Y) Person(X)
Person(Y) hasFather(X,Y)
P
P(D)
![Page 92: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/92.jpg)
Ontological Query Answering: Example
… Father(z,John) Person(z) …
Q2 ← Father(John,X)
Person(John)
D9Y hasFather(X,Y) Person(X)
Person(Y) hasFather(X,Y)
P
P(D)
Q1 ← Father(X,John), Person(X)
![Page 93: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/93.jpg)
Data Complexity of Guarded Datalog[9]
Theorem: Guarded Datalog[9] is PTIME-complete in data complexity
Proof (in PTIME):
¡ Guarded Datalog[9] enjoys the bounded guard-depth property
¡ Construct in PTIME the finite part C of the guarded chase forest
¡ Evaluate the given query over C
[Calì, Gottlob & Lukasiewicz, Journal of Web Semantics 2012]
![Page 94: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/94.jpg)
Bounded Guard-Depth Property
¡ Chase graph
D = {R(a,b), S(b)}
ρ1 = 9Z R(Z,X) R(X,Y),S(Y)
ρ2 = S(X) R(X,Y)P =
R(a,b) S(b)
R(z1,b) S(a)
R(z2,z1) S(z1)
R(z3,z2) S(z2)
![Page 95: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/95.jpg)
Bounded Guard-Depth Property
¡ Guarded chase forest
R(a,b) S(b)
R(z1,b) S(a)
R(z2,z1) S(z1)
R(z3,z2) S(z2)
R(a,b) S(b)
R(z1,b) S(a)
R(z2,z1) S(z1)
R(z3,z2) S(z2)
0
1
2
3
restriction to guardsand their children
![Page 96: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/96.jpg)
Bounded Guard-Depth Property
Q
D
guarded chase forestof D w.r.t. P
C
constant depth
w.r.t. D
P(D) ² Q ) C ² Q
![Page 97: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/97.jpg)
Data Complexity of Guarded Datalog[9]
Theorem: Guarded Datalog[9] is PTIME-complete in data complexity
Proof (in PTIME):
¡ Guarded Datalog[9] enjoys the bounded guard-depth property
¡ Construct in PTIME the finite part C of the guarded chase forest
¡ Evaluate the given query over C
[Calì, Gottlob & Lukasiewicz, Journal of Web Semantics 2012]
![Page 98: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/98.jpg)
Data Complexity of Datalog
Theorem: Datalog[] is PTIME-complete in data complexity
Proof (PTIME-hard): reduction from fact inference for propositional LP(2)
¡ For each R0 add in D the atom True(R0)
¡ For each R0 R1,R2 add in D the atom S(R0,R1,R2)
¡ Construct the fixed Guarded Datalog[] program PDAT:
¡ P ² Q iff T(Q) 2 PDAT(D)
encode program
P in database D
T(X) True(X)
T(Z) T(X),T(Y),S(Z,X,Y)meta-interpreter for LP(2)
![Page 99: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/99.jpg)
Combined Complexity of Guarded Datalog[9]
Theorem: Guarded Datalog[9] is 2EXPTIME-complete in comb. complexity
Proof:
¡ Upper bound: alternating EXPSPACE algorithm
¡ Lower bound: by simulating an alternating EXPSPACE TM
[Calì, Gottlob & Kifer, KR 2008]
![Page 100: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/100.jpg)
DLs vs Guarded Datalog[9]
ELH: Popular DL (for biological applications) with PTIME data complexity[Baader, IJCAI 2003 and Rosati, DL 2007]
ELH TBox Datalog[9] Representation
A v B B(X) A(X)
A u B v C C(X) A(X), B(X)
9R.A v B B(X) R(X,Y), A(Y)
A v 9R.B 9Y Aux(X,Y) A(X) R(X,Y) Aux(X,Y) B(Y) Aux(X,Y)
R v S S(X,Y) R(X,Y)
![Page 101: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/101.jpg)
DLs vs Guarded Datalog[9]
DL-LiteR TBox Datalog[9] Representation
A v B B(X) A(X)
A v 9R 9Y R(X,Y) A(X)
9R v A A(X) R(X,Y)
R v S S(X,Y) R(X,Y)
DL-Lite: Popular family of DLs with AC0 data complexity (OWL 2 QL)[Calvanese, De Giacomo, Lembo, Lenzerini & Rosati, Journal of Automated Reasoning 2007]
Note: Disjointness assertions are harmless - will be treated differently
![Page 102: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/102.jpg)
Linear Datalog[9]
¡ Goal: Lightweight Datalog[9] fragment which is highly tractable
¡ There exists only one atom in the body
9Y hasFather(X,Y) Person(X)
¡ Enjoys first-order rewritability
[Calì, Gottlob & Lukasiewicz, Journal of Web Semantics 2012]
![Page 103: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/103.jpg)
First-Order Rewritability
D
PQ
QP
compilation
Q*
translationevaluation
first-orderquery
SQL
8D (P(D) ² Q , D ² Q*)
[Calvanese, De Giacomo, Lembo, Lenzerini & Rosati, Journal of Automated Reasonig 2007]
![Page 104: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/104.jpg)
Bounded Derivation-Depth Property (BDDP)
Q
D
C
constant depth
w.r.t. D
chase graph of D w.r.t. P(not the guarded chase forest)
P(D) ² Q ) C ² Q
![Page 105: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/105.jpg)
Sufficient condition for first-order rewritability
Bounded Derivation-Depth Property (BDDP)
Q
D
C
constant depth
w.r.t. D
chase graph of D w.r.t. P(not the guarded chase forest)
![Page 106: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/106.jpg)
Linear Datalog[9]
¡ Goal: Lightweight Datalog[9] fragment which is highly tractable
¡ There exists only one atom in the body
9Y hasFather(X,Y) Person(X)
¡ What about the complexity of Linear Datalog[9]?
¡ Enjoys first-order rewritability
[Calì, Gottlob & Lukasiewicz, Journal of Web Semantics 2012]
![Page 107: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/107.jpg)
Data Complexity of Linear Datalog[9]
Theorem: Linear Datalog[9] is in AC0 in data complexity
Proof:
¡ We exploit first-order rewritability of Linear Datalog[9]
¡ Construct the first-order rewritten query QFO (in constant time)
¡ Evaluate QFO over the given database (in AC0)[Vardi, PODS 1995]
[Calì, Gottlob & Lukasiewicz, Journal of Web Semantics 2012]
![Page 108: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/108.jpg)
Combined Complexity of Linear Datalog[9]
Theorem: Linear Datalog[9] is PSPACE-complete in comb. complexity
Proof:
¡ Upper bound: exploit the BDDPimplicit in [Johnson & Klug, JCSS 1984]
¡ Lower bound: by simulating a PSPACE Turing machine
Q
D
...
![Page 109: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/109.jpg)
PSPACE-hardness of Linear Datalog[9]
¡ Assume that the tape alphabet is {0,1,t}
¡ Suppose that M halts on I = a1…am using n = mk cells, for k > 0
¡ Initial configuration (the database D)
Config(sinit,a1,…,am,t,…,t,1,0,…,0)
n - m n - 1
¡ Transition rule - δ(s1,a) = (s2,b,1)
Config(s1,X1,…,Xi-1,a, Xi+1,…,Xn,0,…,0,1, 0,…,0)
n - ii - 1
Config(s2,X1,…,Xi-1,b, Xi+1,…,Xn,0,…,0,1, 0,…,0)
i n - i - 1
¡ Accepting rule
Accept Config(sacc,X1,…,Xn,Y1,…,Yn)
![Page 110: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/110.jpg)
Datalog[9,=,?]: Overview
LinearGuarded
DL-LiteR
ELH
PTIME-complete in AC0
![Page 111: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/111.jpg)
But…
¡ What about joins in rule bodies?
biggerThan(E,M) Elephant(E), Mouse(M)
¡ What about the DL assertion concept product?
9E Employee(E,D,P,A) Runs(D,P), Area(P,A)
![Page 112: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/112.jpg)
biggerThan(E,M) Elephant(E), Mouse(M)
9E Employee(E,D,P,A) Runs(D,P), Area(P,A)
Sticky Datalog[9]
![Page 113: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/113.jpg)
Sticky Datalog[9]
9W T(X,Y,W) R(X,Y),P(Y,Z)
9W S(Y,W) T(X,Y,Z)
![Page 114: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/114.jpg)
Sticky Datalog[9]
9W T(X,Y,W) R(X,Y),P(Y,Z)
9W S(Y,W) T(X,Y,Z)
9W T(X,Y,W) R(X,Y),P(Y,Z)
9W S(X,W) T(X,Y,Z)
£
![Page 115: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/115.jpg)
Data Complexity of Sticky Datalog[9]
Theorem: Sticky Datalog[9] is in AC0 in data complexity
Proof:
¡ Sticky Datalog[9] enjoys the BDDP
¡ Thus, Sticky Datalog[9] is first-order rewritable ) in AC0
[Calì, Gottlob & Pieris, VLDB 2010]
Q
D
...
![Page 116: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/116.jpg)
Combined Complexity of Sticky Datalog[9]
Theorem: Sticky Datalog[9] is EXPTIME-complete in combined complexity
Proof:
¡ EXPTIME-membership: construct a proof of the query by applying
an APSPACE procedure
¡ EXPTIME-hardness: fact inference for lossless Datalog programs
over {0,1} is EXPTIME-hard
S(X,Y,Z) P(X,Y),P(Y,Z),R(Z)
[Calì, Gottlob & Pieris, VLDB 2010]
Q
D
![Page 117: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/117.jpg)
Datalog as an Ontology Language
¡ Much is not possible with Datalog[Patel-Schneider & Horrocks, Journal of Web Semantics 2007]
DL Axiom ?
Employee v reportsTo 9Y reportsTo(X,Y) Employee(X)
funct(reportsTo) Y = Z reportsTo(X,Y), reportsTo(X,Z)
Employee disj Customer ? Employee(X), Customer(X)
¡ Ontology languages are based on description logics (previous lecture)
![Page 118: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/118.jpg)
Equality Atom
¡ Linear Datalog[9,=] is already undecidableimplicit in [Chandra & Vardi, SIAM Journal on Computing 1985]
¡ Separability: given P = P [ P=,
[Calì, Lembo & Rosati, PODS 2003]
8D8Q: D ² P= ) P(D) ² Q iff P(D) ² Q
¡ Non-conflicting Datalog[9,=]: sufficient condition for separabilitysee, e.g., [Calì, Gottlob & Lukasiewicz, Journal of Web Semantics 2012]
Xi = Xj R1(X1),…,Rn(Xn)
![Page 119: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/119.jpg)
Truth Constant False
¡ Preliminary check without adding complexity - given P = P [ P?
P(D) ² Q
m
P(D) ² Q OR P(D) ² Qρ, for some ρ 2 P?
? R1(X1),…,Rn(Xn)
Qρ ← body(ρ)
![Page 120: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/120.jpg)
Non-Conflicting Datalog[9,=,?]: Overview
LinearGuarded
DL-LiteR
ELH
PTIME-complete in AC0
Sticky
in AC0
9W S(Y,W) R(X,Y),P(Y,Z)9W S(X,W) R(X,Y,Y)
![Page 121: Datalog and its Extensions for Semantic Web Databases](https://reader034.vdocument.in/reader034/viewer/2022042601/54753a52b4af9fb90a8b5a4e/html5/thumbnails/121.jpg)
Further Reading (Partial List)
Guarded Datalog[9] (and extensions)- Andrea Calì, Georg Gottlob, and Michael Kifer. Taming the infinite chase: Query answering under expressive
relational constraints. In Proceedings of KR, pages 70-80, 2008.
- Andrea Calì, Georg Gottlob, and Thomas Lukasiewicz. A general Datalog-based framework for tractable queryanswering over ontologies. J. Web Sem., 14:57-83, 2012.
- Jean-François Baget, Marie-Laure Mugnier, Sebastian Rudolph, and Michaël Thomazo. Walking thecomplexity lines for generalized guarded existential rules. In Proceedings of IJCAI, pages 712-717, 2011.
Sticky Datalog[9] (and extensions)- Andrea Calì, Georg Gottlob, Andreas Pieris. Advanced processing for ontological queries. PVLDB 3(1): 554-
565, 2010.
- Andrea Calì, Georg Gottlob, Andreas Pieris. Query answering under non-guarded rules in Datalog+/-. InProceedings of RR, pages 1-17, 2010.
- Georg Gottlob, Giorgio Orsi, Andreas Pieris. Ontological queries: Rewriting and optimization. In Proceedingsof ICDE, pages 2-13, 2011.
Weakly-acyclic Datalog[9] (and extensions)- Ronald Fagin, Phokion G. Kolaitis, Renée J. Miller, Lucian Popa. Data exchange: Semantics and query answering.
Theor. Comput. Sci., 336(1): 89-124, 2005.
- Alin Deutsch, Alan Nash, Jeffrey B. Remmel. The chase revisited. In Proceedings of PODS, pages 149-158, 2008.
- Bruno Marnette. Generalized schema-mappings: From termination to tractability. In Proceedings of PODS, pages13-22, 2009.
Shy Datalog[9]- Nicola Leone, Marco Manna, Giorgio Terracina, Pierfrancesco Veltri. Efficiently computable Datalog∃ Programs.
In Proceedings of KR, 2012.