relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational...

54
Relational Model Advanced Topics in Foundations of Databases, University of Edinburgh, 2019/20

Upload: others

Post on 17-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

RelationalModel

AdvancedTopicsinFoundationsofDatabases,UniversityofEdinburgh,2019/20

Page 2: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

RelationalModel

• Manyadhocmodelsbefore1970

− Hardtoworkwith

− Hardtoreasonabout

• 1970:RelationalModelbyEdgarFrankCodd

− Dataarestoredinrelations (ortables)

− Queriedusingadeclarativelanguage

− DBMSconvertsdeclarativequeriesintoproceduralqueries thatare

optimizedandexecuted

• KeyAdvantages

− Simpleandcleanmathematicalmodel(basedonlogic)

− Separationofdeclarativeandprocedural

Page 3: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

RelationalDatabases

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

Constants

VIE,LHR,…

BA,U2,…

Vienna,London,…

Page 4: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

RelationalDatabases

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

Relations

Constants

VIE,LHR,…

BA,U2,…

Vienna,London,…

Page 5: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

RelationalDatabases

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

Constants

VIE,LHR,…

BA,U2,…

Vienna,London,…

Relations

Tuples

Relationalatoms

Flight(LHR,EDI,BA)

Airport(LGW,London)

Page 6: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:RelationalAlgebra

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

Listalltheairlines

Page 7: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:RelationalAlgebra

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

Listalltheairlines

πairline Flight

{BA,U2,OS}

Page 8: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:RelationalAlgebra

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

ListthecodesoftheairportsinLondon

Page 9: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:RelationalAlgebra

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

ListthecodesoftheairportsinLondon

πcode (σcity=‘London’Airport)

{LHR,LGW}

Page 10: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:RelationalAlgebra

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

ListtheairlinesthatflydirectlyfromLondontoGlasgow

Page 11: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:RelationalAlgebra

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

ListtheairlinesthatflydirectlyfromLondontoGlasgow

πairline ((Flight⋈origin=code(σcity=‘London’ Airport))⋈destination=code(σcity=‘Glasgow’Airport))

Page 12: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:RelationalAlgebra

Aux origin destination airline code city code city

LGW GLA U2 LGW London GLA Glasgow

ListtheairlinesthatflydirectlyfromLondontoGlasgow

πairline ((Flight⋈origin=code(σcity=‘London’ Airport))⋈destination=code(σcity=‘Glasgow’Airport))

{U2}

definestheauxiliaryrelationAux

Page 13: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

• Selection:σ

• Projection:π

• Crossproduct:×

• Naturaljoin:⋈

• Rename:ρ

• Difference:∖

• Union:∪

• Intersection:∩

RelationalAlgebra

inboldaretheprimitiveoperators

Formaldefinitionscanbefoundinanydatabasetextbook

Page 14: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:DomainRelationalCalculus

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

Listalltheairlines

Page 15: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

{z|∃x∃yFlight(x,y,z)}

Querying:DomainRelationalCalculus

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

Listalltheairlines

{BA,U2,OS}

Page 16: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:DomainRelationalCalculus

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

ListthecodesoftheairportsinLondon

Page 17: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

{x|∃yAirport(x,y) ∧ y=London}

Querying:DomainRelationalCalculus

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

ListthecodesoftheairportsinLondon

{LHR,LGW}

Page 18: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Querying:DomainRelationalCalculus

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

ListtheairlinesthatflydirectlyfromLondontoGlasgow

Page 19: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

{z|∃x∃y∃u∃v Airport(x,u) ∧ u=London∧ Airport(x,u) ∧ u=London}∧ Flight(x,y,z)}

Querying:DomainRelationalCalculus

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

ListtheairlinesthatflydirectlyfromLondontoGlasgow

{U2}

Page 20: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

{x|∀yR(x,y)}{x|¬R(x)}{x,y |R(x)∨ R(y)}

DomainRelationalCalculus

{x1,…,xk |φ}

first-orderformulawith

freevariables{x1,…,xk}

But,wecanexpress“problematic”queries,i.e.,dependonthedomain

Page 21: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

DomainRelationalCalculus

domain ={1,2,3}

D ={R(1,1),R(1,2)}Ans={}

{x1,…,xk |φ}

first-orderformulawith

freevariables{x1,…,xk}

But,wecanexpress“problematic”queries,i.e.,dependonthedomain

{x|∀yR(x,y)}{x|¬R(x)}{x,y |R(x)∨ R(y)}

Page 22: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

DomainRelationalCalculus

domain ={1,2}

D ={R(1,1),R(1,2)}Ans={1}

{x1,…,xk |φ}

first-orderformulawith

freevariables{x1,…,xk}

But,wecanexpress“problematic”queries,i.e.,dependonthedomain

{x|∀yR(x,y)}{x|¬R(x)}{x,y |R(x)∨ R(y)}

Page 23: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

DomainRelationalCalculus

…thus,weadopttheactivedomainsemantics - quantifiedvariablesrangeover

theactivedomain,i.e.,theconstantsoccurringintheinputdatabase

{x1,…,xk |φ}

first-orderformulawith

freevariables{x1,…,xk}

But,wecanexpress“problematic”queries,i.e.,dependonthedomain

{x|∀yR(x,y)}{x|¬R(x)}{x,y |R(x)∨ R(y)}

Page 24: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Algebra=Calculus

Afundamentaltheorem(assumingtheactivedomainsemantics):

Theorem: Thefollowingquerylangaugesareequallyexpressive

• RelationalAlgebra(RA)

• DomainRelationalCalculus(DRC)

• TupleRelationalCalculus(TRC)

Note:Tuplerelationalcalculus isthedeclarativelanguageintroducebyCodd.Domainrelational

calculushasbeenintroducedlaterasaformalismclosertofirst-orderlogic

Page 25: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

{|∃x∃y∃z∃w∃vAirport(x,Vienna) ∧ Airport(y,Glasgow) ∧

Flight(x,z,w)∧ Flight(z,y,v)

Quiz!

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

IsGlasgowreachablefromVienna?

London

Vienna

Larnaca

Glasgow

Edinburgh

YES

Page 26: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Quiz!

London

Vienna

Larnaca

Glasgow

Edinburgh

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

IsGlasgowreachablefromVienna?

NO

{|∃x∃y∃z∃w∃vAirport(x,Vienna) ∧ Airport(y,Glasgow) ∧

Flight(x,z,w)∧ Flight(z,y,v)

Page 27: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

{|∃x∃y∃z∃w∃vAirport(x,Vienna) ∧ Airport(y,Glasgow) ∧

Flight(x,z,w)∧ Flight(z,y,v)

Quiz!

London

Vienna

Larnaca

Glasgow

Edinburgh

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

IsGlasgowreachablefromVienna?

YES∧ Flight(z,z1,w1) ∧

∃z1∃w1

Page 28: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Quiz!

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

IsGlasgowreachablefromVienna?

NO

London

Vienna

Larnaca

Glasgow

Edinburgh

{|∃x∃y∃z∃w∃vAirport(x,Vienna) ∧ Airport(y,Glasgow) ∧

Flight(x,z,w)∧ Flight(z,y,v)

∧ Flight(z,z1,w1) ∧

∃z1∃w1

Page 29: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Quiz!

Flight origin destination airline

VIE LHR BA

LHR EDI BA

LGW GLA U2

LCA VIE OS

Airport code city

VIE Vienna

LHR London

LGW London

LCA Larnaca

GLA Glasgow

EDI Edinburgh

IsGlasgowreachablefromVienna?

London

Vienna

Larnaca

Glasgow

Edinburgh

Recursivequery- notexpressibleinRA/DRC/TRC

(unlessweboundthenumberofintermediatestops)

Page 30: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

ComplexityofQueryLanguages

• Thegoalistounderstandthecomplexityofevaluatingaqueryoveradatabase

• Ourmaintechnicaltooliscomplexitytheory

• Whattomeasure?Queriesmayhavealargeoutput,anditwouldbeunfairto

counttheoutputas“complexity”

• Wethereforeconsiderthefollowingdecisionproblems:

− QueryOutputTuple(QOT)

− BooleanQueryEvaluation(BQE)

Page 31: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

AFewWordsonComplexityTheory

detailscanbefoundinthestandardtextbooks

seealsonotesonthewebpageofthecourse

Page 32: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

ComplexityClasses

TIME(f(n)) = {Π |ΠisdecidedbysomeDTMintimeO(f(n))}

NTIME(f(n)) = {Π |ΠisdecidedbysomeNTMintimeO(f(n))}

SPACE(f(n)) = {Π |ΠisdecidedbysomeDTMusingspaceO(f(n))}

NSPACE(f(n)) = {Π |ΠisdecidedbysomeNTMusingspaceO(f(n))}

Page 33: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

PTIME = ⋃k>0 TIME(nk)

NP = ⋃k>0 NTIME(nk)

EXPTIME = ⋃k>0 TIME(2nk)

NEXPTIME = ⋃k>0 NTIME(2nk)

LOGSPACE = SPACE(logn)

NLOGSPACE = NSPACE(logn)

PSPACE = ⋃k>0 SPACE(nk)

EXPSPACE = ⋃k>0 SPACE(2nk)

ComplexityClasses

• Wecannowrecallthestandardtimeandspacecomplexityclasses:

thesedefinitionsarerelyingontwo-

tapeTuringmachineswitharead-

onlyandaread/writetape

• ForeverycomplexityclassCwecandefineitscomplementaryclasscoC

Page 34: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

RelationshipAmongComplexityClasses

LOGSPACE⊆ NLOGSPACE⊆ PTIME⊆ NP,coNP ⊆

PSPACE⊆ EXPTIME⊆ NEXPTIME,coNEXPTIME ⊆⋯

Someusefulnotes:

• ForadeterministiccomplexityclassC,coC =C

• coNLOGSPACE =NLOGSPACE

• ItisgenerallybelievedthatPTIME≠ NP,butwedon’tknow

• PTIME⊂ EXPTIME⇒ atleastonecontainmentbetweenthemisstrict

• PSPACE=NPSPACE,EXPSPACE=NEXPSPACE,etc.

• But,wedon’tknowwhetherLOGSPACE=NLOGSPACE

Page 35: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

CompleteProblems

• Thesearethehardestproblemsinacomplexityclass

• AproblemthatiscompleteforaclassC,itisunlikelytobelonginalowerclass

• AproblemΠiscomplete foracomplexityclassC,orsimplyC-complete,if:

1. Π ∈ C

2. ΠisC-hard,i.e.,everyproblemΠ’∈ Ccanbeefficientlyreduced toΠ

• ToshowthatΠisC-harditsufficestoreducesomeC-hardproblemΠ’toit

thereexistsalogspace algorithmthatcomputesafunctionfsuchthat

w ∈ Π’ iff f(w) ∈ Π - inthiscasewewriteΠ’≤L Π

Page 36: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

SomeCompleteProblems

• NP-complete

‒ SAT(satisfiabilityofpropositionalformulas)

‒ Manygraph-theoreticproblems(e.g.,3-colorability)

‒ Travelingsalesman

‒ etc.

• PSPACE-complete

‒ QuantifiedSAT(orsimplyQSAT)

‒ Equivalenceoftworegularexpressions

‒ Manygames(e.g.,Geography)

‒ etc.

Page 37: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

BacktoQueryLanguages

Page 38: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

ComplexityofQueryLanguages

• Thegoalistounderstandthecomplexityofevaluatingaqueryoveradatabase

• Ourmaintechnicaltooliscomplexitytheory

• Whattomeasure?Queriesmayhavealargeoutput,anditwouldbeunfairto

counttheoutputas“complexity”

• Wethereforeconsiderthefollowingdecisionproblems:

− QueryOutputTuple(QOT)

− BooleanQueryEvaluation(BQE)

Page 39: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

ComplexityofQueryLanguages

QOT(L)

Input:adatabaseD,aqueryQ/k ∈ L,atupleofconstantst∈ adom(D)k

Question: t ∈ Q(D)?

Someusefulnotation:

• GivenadatabaseD,andaqueryQ,Q(D)istheanswer toQ overD

• adom(D)istheactivedomain ofD - theconstantsoccurringinD

• WewriteQ/k forthefactthatthearity ofQ isk≥ 0

L issomequerylanguage;forexample,RA,DRC,etc. - wewillseeseveralquerylanguages

Page 40: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

ComplexityofQueryLanguages

BQE(L)

Input:adatabaseD,aBooleanqueryQ ∈ L

Question: isQ(D)non-empty?

Someusefulnotation:

• GivenadatabaseD,andaqueryQ,Q(D)istheanswer toQ overD

• adom(D)istheactivedomain ofD - theconstantsoccurringinD

• WewriteQ/k forthefactthatthearity ofQ isk≥ 0

L issomequerylanguage;forexample,RA,DRC,etc. - wewillseeseveralquerylanguages

Page 41: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

(≡Lmeanslogspace-equivalent)

Theorem: QOT(L)≡L BQE(L),whereL ∈ {RA,DRC,TRC}

BQE(L)

Input:adatabaseD,aBooleanqueryQ ∈ L

Question: isQ(D)non-empty?

QOT(L)

Input:adatabaseD,aqueryQ/k ∈ L,atupleofconstantst∈ adom(D)k

Question: t ∈ Q(D)?

ComplexityofQueryLanguages

Page 42: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Theorem: QOT(DRC)≡L BQE(DRC)

Proof: (≤L)ConsideradatabaseD,ak-ary queryQ ={x1,…,xk |φ},andatuple(t1,…,tk)

LetQbool ={|∃x1⋯∃xk (φ ∧ x1=t1 ∧ x2=t2 ∧⋯ ∧ xk=tk)}

Clearly,(t1,…,tk) ∈ Q(D) iff Qbool (D)isnon-empty

(≥L)Trivial- aBooleandomainRCqueryisadomainRCquery

ComplexityofQueryLanguages

(letusshowthisfordomainrelationalcalculus)

Page 43: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

BQE[D](L)

Input:aBooleanqueryQ ∈ L

Question: isQ(D)non-empty?

ComplexityMeasures

• Combinedcomplexity - bothD andQ arepartoftheinput

• Querycomplexity - fixedD,inputQ

• Datacomplexity - inputD,fixedQ

BQE[Q](L)

Input:adatabaseD

Question: isQ(D)non-empty?

Page 44: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Theorem: ForL ∈ {RA,DRC,TRC}thefollowinghold:

• BQE(L)isPSPACE-complete(combinedcomplexity)

• BQE[D](L)isPSPACE-complete,forafixeddatabaseD(querycomplexity)

• BQE[Q](L)isinLOGSPACE,forafixedqueryQ ∈ L (datacomplexity)

ComplexityofRA,DRC,TRC

Proof hints:

• Recursive algorithm that uses polynomial space in Q and logarithmic space in D

• Reduction from QSAT (a standardPSPACE-hardproblem)

Page 45: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Evaluating(Boolean)DRCQueries

Eval(D,φ) - forbrevitywewriteφ insteadof{|φ}

• Ifφ= R(t1,…,tk), then YESiff R(t1,…,tk) ∈ D

• Ifφ= ψ1∧ψ2,then YESiff Eval(D,ψ1) =YESandEval(D,ψ2) =YES

• Ifφ= ¬ψ,thenNOiff Eval(D,ψ) =YES

• Ifφ= ∃xψ(x),then YESiff forsomet∈ adom(D),Eval(D,ψ(t)) =YES

Lemma: It holds that

• Eval(D,φ) always terminates - this is trivial

• Eval(D,φ) = YES iff Q(D) is non-empty, whereQ = { | φ}

• Eval(D,φ) uses O(|φ| ⋅ log |φ| + |φ|2 ⋅ log |D|) space

Page 46: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Theorem: ForL ∈ {RA,DRC,TRC}thefollowinghold:

• BQE(L)isPSPACE-complete(combinedcomplexity)

• BQE[D](L)isPSPACE-complete,forafixeddatabaseD(querycomplexity)

• BQE[Q](L)isinLOGSPACE,forafixedqueryQ ∈ L (datacomplexity)

ComplexityofRA,DRC,TRC

Proof hints:

• Recursive algorithm that uses polynomial space in Q and logarithmic space in D

• Reduction from QSAT (a standardPSPACE-hardproblem)

Page 47: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

EQUIV(L)

Input:twoqueriesQ1 ∈ Land Q2 ∈ L

Question:Q1≡Q2? orQ1(D)=Q2(D)forevery(finite)databaseD?

SAT(L)

Input:aqueryQ ∈ L

Question: istherea(finite)databaseD suchthatQ(D)isnon-empty?

OtherImportantAlgorithmicProblems

CONT(L)

Input:twoqueriesQ1 ∈ Land Q2 ∈ L

Question:Q1⊆Q2? orQ1(D)⊆Q2(D)forevery(finite)databaseD?

Page 48: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

EQUIV(L)

Input:twoqueriesQ1 ∈ Land Q2 ∈ L

Question:Q1≡Q2? orQ1(D)=Q2(D)forevery(finite)databaseD?

SAT(L)

Input:aqueryQ∈ L

Question: istherea(finite)databaseDsuchthatQ(D)isnon-empty?

OtherImportantAlgorithmicProblems

CONT(L)

Input:twoqueriesQ1 ∈ Land Q2 ∈ L

Question:Q1⊆Q2? orQ1(D)⊆Q2(D)forevery(finite)databaseD?

theseproblemsareimportant

foroptimizationpurposes

Page 49: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

SAT(L)

Input:aqueryQ ∈ L

Question: istherea(finite)databaseD suchthatQ(D)isnon-empty?

OtherImportantAlgorithmicProblems

• If the answer is no, then the input query Q makes no sense

• Query evaluation becomes trivial - the answer is always NO!

Page 50: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

EQUIV(L)

Input:twoqueriesQ1 ∈ Land Q2 ∈ L

Question:Q1≡Q2? orQ1(D)=Q2(D)forevery(finite)databaseD?

OtherImportantAlgorithmicProblems

• Replace a query Q1 with a query Q2 that is easier to evaluate

• But, we have to be sure that Q1(D) = Q2(D) for every database D

Page 51: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

• Approximate a query Q1 with a query Q2 that is easier to evaluate

• But, we have to be sure that Q1(D)⊆Q2(D) for every database D

CONT(L)

Input:twoqueriesQ1 ∈ Land Q2 ∈ L

Question:Q1⊆Q2? orQ1(D)⊆Q2(D)forevery(finite)databaseD?

OtherImportantAlgorithmicProblems

Page 52: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Theorem: ForL ∈ {RA,DRC,TRC},SAT(L)isundecidable

Proof hint: By reduction from the halting problem.

Given a Turing machine M, we can construct a query QM ∈ L such that:

Mhaltsontheemptystringiff thereexistsadatabaseD suchthatQ(D)isnon-empty

SATisUndecidable

Note:Actually,thisresultgoesbacktothe1950when

BorisA.Trakhtenbrotprovedthattheproblemofdeciding

whetherafirst-ordersentencehasafinitemodelisundecidable

Page 53: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

AneasyconsequenceofthefactthatSATisundecidableisthat:

Theorem: ForL ∈ {RA,DRC,TRC},EQUIV(L)andCONT(L)areundecidable

Proof: By reduction from the complement of SAT(L)

• Consider a query Q ∈ L - i.e., an instance of SAT(L)

• LetQ’ beaquerythatisunsatisfiable,i.e.,Q’(D)isemptyforeveryD

• Forexample,whenL =DRC,Q’ canbethequery{|∃xR(x)∧¬R(x)}

• Clearly,Qisunsatisfiable iff Q≡Q’(orevenQ⊆Q’)

EQUIVandCONTareUndecidable

Page 54: relational model - homepages.inf.ed.ac.ukhomepages.inf.ed.ac.uk/apieris/courses/atfd2020/relational model.pdf · Relational Model • Many ad hoc models before 1970 − Hard to work

Recap

• The main languages for querying relational databases are:

− Relational Algebra (RA)

− Domain Relational Calcuclus (DRC)

− Tuple Relational Calculus (TRC)

• Evaluation is decidable, and highly tractable indata complexity

− Foundations of the database industry

− The core of SQL is equally expressive to RA/DRC/TRC

RA =DRC =TRC

(undertheactivedomainsemantics)

• Satisfiability, equivalence and containment are undecidable

− Perfect query optimization is impossible