Integrating Databases into the Semantic Web through an Ontology-based Framework


Page 1: Integrating Databases into the Semantic Web through an Ontology-based Framework

Integrating Databases into the Semantic Web through an Ontology-based Framework

Dejing Dou, Paea LePendu, Shiwoong Kim

Computer and Information Science, University of Oregon, USA

Peishen Qi

Computer Science Department, Yale University, USA

April, 2006 @ SWDB’06

Page 2

Outline

Introduction
– The status of the Semantic Web
– Realizing the SW needs existing databases

OntoGrate: An Ontology-based Information Integration Framework
– Some previous work
– Modules in the OntoGrate architecture

Case Study for Integrating Databases into the SW
– Without an existing domain ontology
– With an existing domain ontology

Conclusion and Future Work

Page 3

The Semantic Web

One major goal of the Semantic Web is that web-based agents can process and “understand” data [Berners-Lee et al. 01].

Ontologies formally describe the semantics of data, and web-based agents can take SW documents (e.g., in RDF/OWL) as a set of assertions (true statements) and draw inferences from them.

[Figure: human, SW, Web-based agents]

Page 4

What do we have now?

DAML+OIL and OWL (Web Ontology Language)

More and more domain ontologies are defined in DAML+OIL/OWL, even for some specific domains (e.g., GO).

We are developing tools, agents, and services.

See http://www.semwebcentral.org, http://knowledgeweb.semanticweb.org/

http://www.daml.org/

Page 5

Two things are important

Real data for sharing
– Relational databases (may be the biggest resource)
– Other kinds of databases
– WWW/XML data
– Some knowledge bases

Better Semantic Web services/agents

Page 6

Semantic Annotation for Data?

Semantic annotation works well for small data resources.

It works less well for large data resources such as relational databases:
– It creates “redundant” copies of the data.
– Query answering is time consuming.

E.g., annotation currently works by loading OWL data into a knowledge base and then answering queries with DL ABox reasoning. (Can this compete with an existing DBMS, which has well-developed indexing and query optimization techniques?)

It is better if relational databases can be accessed and queried directly by SW agents and services.

Page 7

The difficulties

The Semantic Web: ontologies define the semantics of data.

The relational DBs: schemas define the structure and integrity constraints.

Page 8

A more general question

How can we make databases, SW resources, WWW/XML data, and KBs work together?

The problem is similar:
– SW resources and KBs are defined by ontologies, which are more expressive and focus on semantics
– Databases and XML documents are defined by schemas, which focus on structure
– Syntax differences (e.g., OWL vs. SQL)

Page 9

OntoGrate: An Ontology-based Information Integration System

Page 10

Some Previous Work

Schemas (e.g., the stores7 DB in IBM Informix)

Page 11

Some Previous Work

Schemas, Ontologies and Web-PDDL

Relation               Type/Class
Attribute              Predicate/Property
Integrity Constraint   Axiom/Rule
Primary Key            Fact/Instance

Page 12

Some Previous Work

Merging Ontologies with Bridging Axioms

Page 13

Some Previous Work

The bridging axiom/mapping on customerfname/customerlname vs. customercontactname:

    (forall (c - @stores7:Customer f l - @sql:varchar)
      (if (and (@stores7:customerfname c f)
               (@stores7:customerlname c l))
          (@nwind:customercontactname c (@sql:concat f l))))
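The concat axiom above can be read as a forward rule over facts. The following is a minimal Python sketch of that reading, not the OntoEngine implementation: the triple representation, the sample customer IDs, and the space separator used by `sql_concat` are all assumptions made for illustration.

```python
# Hypothetical stores7 facts as (predicate, subject, value) triples.
stores7_facts = [
    ("customerfname", "C1", "Paea"),
    ("customerlname", "C1", "LePendu"),
    ("customerfname", "C2", "Dejing"),
    ("customerlname", "C2", "Dou"),
]


def sql_concat(first, last):
    # Stand-in for @sql:concat; the space separator is an assumption.
    return f"{first} {last}"


def apply_contact_name_axiom(facts):
    """Derive nwind customercontactname facts from stores7 name facts."""
    first_names = {s: v for p, s, v in facts if p == "customerfname"}
    last_names = {s: v for p, s, v in facts if p == "customerlname"}
    derived = []
    for customer in sorted(first_names):
        # The axiom fires only when both antecedent facts are present.
        if customer in last_names:
            full_name = sql_concat(first_names[customer], last_names[customer])
            derived.append(("customercontactname", customer, full_name))
    return derived


contact_facts = apply_contact_name_axiom(stores7_facts)
```

A customer with only a first name or only a last name yields no derived fact, which mirrors the conjunction in the axiom's antecedent.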

Page 14

Some Previous Work

The bridging axiom/mapping on customerregion vs. customerstatecode/statename/statecode:

    (forall (x - @nwind:Customer y - @sql:varchar)
      (if (@nwind:customerregion x y)
          (exists (z - @stores7:State t - @sql:varchar)
            (and (@stores7:customerstatecode x t)
                 (@stores7:statename z y)
                 (@stores7:statecode z t)))))
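The existential in this axiom says that a customer's region name must correspond to some State row whose code is the customer's state code. One hedged way to picture it is a lookup against a State relation that is assumed to already exist on the stores7 side; the sample rows, IDs, and the `translate_region` helper below are hypothetical.

```python
# Hypothetical State relation on the stores7 side.
states = {
    "S1": {"statecode": "OR", "statename": "Oregon"},
    "S2": {"statecode": "CT", "statename": "Connecticut"},
}

# Hypothetical nwind facts as (predicate, subject, value) triples.
nwind_facts = [
    ("customerregion", "C1", "Oregon"),
    ("customerregion", "C2", "Connecticut"),
    ("customerregion", "C3", "Atlantis"),
]


def translate_region(facts, state_rows):
    """Derive stores7 customerstatecode facts from nwind customerregion facts."""
    code_by_name = {row["statename"]: row["statecode"] for row in state_rows.values()}
    derived, unresolved = [], []
    for predicate, customer, region in facts:
        if predicate != "customerregion":
            continue
        if region in code_by_name:
            # A matching State witnesses the existential in the axiom.
            derived.append(("customerstatecode", customer, code_by_name[region]))
        else:
            # No known State: a full engine would need a fresh (Skolem) term here.
            unresolved.append((customer, region))
    return derived, unresolved


state_code_facts, unresolved_regions = translate_region(nwind_facts, states)
```

The sketch only covers the case where the witness State is already known; regions without a matching row are reported rather than invented.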

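Page 15

Some Previous Work

Inferential Data Integration with OntoEngine

– Data Translation:

View data as true statements, e.g., (statecode S#28 “OR”)

$(M_{s \to t};\ \alpha_s) \rightsquigarrow_D \beta_t$ only if $(M_{s \to t};\ \alpha_s) \models \beta_t$

$(M_{s \to t};\ \alpha_s) \rightsquigarrow_D \beta_t \;\Rightarrow\; (M_{s \to t};\ \alpha_s) \vdash \beta_t \;\Rightarrow\; (M_{s \to t};\ \alpha_s) \models \beta_t$

– Query Translation:

$(M_{s \to t};\ \alpha_s) \rightsquigarrow_Q \beta_t$ only if $(M_{s \to t};\ \theta(\beta_t)) \models \theta(\alpha_s)$

$(M_{s \to t};\ \alpha_s) \rightsquigarrow_Q \beta_t \;\Rightarrow\; (M_{s \to t};\ \theta(\beta_t)) \vdash \theta(\alpha_s) \;\Rightarrow\; (M_{s \to t};\ \theta(\beta_t)) \models \theta(\alpha_s)$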

Page 16

OntoGrate Architecture Revisited

Page 17

Modules in OntoGrate Architecture

The Syntax Translators (Wrappers)
– e.g., PDDSQL (SQL ↔ Web-PDDL), PDDOWL (OWL ↔ Web-PDDL)

The Matching (Correspondence) Generation
– e.g., name and structure (tree, graph) similarity, synonyms, and is-a (part-of) relationships using thesauri and dictionaries such as WordNet

The Data Mining Module
The Machine Learning Module
The Inference Engine (OntoEngine)
The User Interface
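As an illustration of the name-similarity part of matching generation, here is a small Python sketch using the standard library's `difflib.SequenceMatcher`. It is not OntoGrate's matcher: the `normalize` step, the choice to strip a shared table prefix such as `customer`, and the attribute lists are assumptions made for this example.

```python
from difflib import SequenceMatcher


def normalize(name, prefix=""):
    """Lowercase a name and drop an optional leading table prefix."""
    name = name.lower()
    if prefix and name.startswith(prefix):
        name = name[len(prefix):]
    return name


def candidate_matches(source_names, target_names, source_prefix=""):
    """Pick, for each source name, the most string-similar target name."""
    candidates = {}
    for source in source_names:
        scored = [
            (SequenceMatcher(None, normalize(source, source_prefix), normalize(target)).ratio(), target)
            for target in target_names
        ]
        candidates[source] = max(scored)[1]
    return candidates


matches = candidate_matches(
    ["customerfname", "customerlname", "customercity"],
    ["FirstName", "LastName", "City"],
    source_prefix="customer",
)
```

String similarity alone only proposes candidates; a domain expert or a further module would still need to confirm them and express them as bridging axioms.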

Page 18

Learning the mappings from domain experts

    (forall (x - @A1:Invertebrate)
      (if (is @A1:Insect x)
          (and (@A2:legs x 6)
               (@A2:bodySegments x 3))))

Page 19

Mining the mappings from large datasets

For example, consider two medical databases in the same hospital: DB1 lists the blood pressure of patients with nominal values, such as low, normal, at risk, and high, while DB2 records the exact numerical values for systolic and diastolic pressure.

By association rule mining, we may get a rule/mapping like:

@DB2:SystolicPressure > 140 ∧ @DB2:DiastolicPressure > 90 ⇒ @DB1:BloodPressure = 'High'

(support = 40%, confidence = 90%)
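The support and confidence figures attached to such a rule can be computed directly from joined records. The sketch below assumes that DB1 and DB2 rows have already been joined per patient, that the rule's comparisons are "greater than", and uses invented sample records; its numbers are therefore illustrative and do not reproduce the figures on the slide.

```python
# Hypothetical joined records: (systolic, diastolic, nominal label from DB1).
records = [
    (150, 95, "High"),
    (145, 92, "High"),
    (160, 100, "High"),
    (142, 91, "At risk"),
    (120, 80, "Normal"),
    (118, 76, "Normal"),
    (135, 85, "At risk"),
    (100, 60, "Low"),
    (155, 98, "High"),
    (125, 82, "Normal"),
]


def rule_stats(rows, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    matched_antecedent = [row for row in rows if antecedent(row)]
    matched_both = [row for row in matched_antecedent if consequent(row)]
    support = len(matched_both) / len(rows)
    confidence = len(matched_both) / len(matched_antecedent) if matched_antecedent else 0.0
    return support, confidence


support, confidence = rule_stats(
    records,
    antecedent=lambda row: row[0] > 140 and row[1] > 90,
    consequent=lambda row: row[2] == "High",
)
```

Support is the fraction of all records satisfying both sides of the rule, and confidence is the fraction of antecedent-matching records that also satisfy the consequent.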

Page 20

Case Study in Two Scenarios

Integrating DBs into the SW without an existing domain ontology

Integrating DBs into the SW with an existing domain ontology

Page 21

Without an existing domain ontology

Page 22

Generating OWL ontologies from DB Schemas

SQL schema → Web-PDDL (by using PDDSQL)
Web-PDDL → OWL (by using PDDOWL)

– E.g., Stores7.sql → Stores7.pddl → Stores7.owl

    ...
    <owl:Class rdf:ID="Customer">
      <rdfs:subClassOf rdf:resource="http://www.cs.uoregon.edu/~paea/sql#Relation"/>
    </owl:Class>
    <owl:DatatypeProperty rdf:ID="customercity">
      <rdfs:domain rdf:resource="#Customer"/>
      <rdfs:range rdf:resource="#String"/>
    </owl:DatatypeProperty>
    ...
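To make the schema-to-OWL step concrete, the following Python sketch emits the same kind of fragment for one table. It goes straight from a table description to OWL and skips the intermediate Web-PDDL form, so it is not how PDDSQL and PDDOWL work internally. The `table_to_owl` helper, the property naming rule (table name followed by column name), and the column-to-range mapping are assumptions; only the `sql#Relation` URI is taken from the slide.

```python
from xml.sax.saxutils import quoteattr

# Superclass URI as it appears in the Stores7.owl fragment on the slide.
RELATION_URI = "http://www.cs.uoregon.edu/~paea/sql#Relation"


def table_to_owl(table, columns):
    """Render one table as an owl:Class plus one DatatypeProperty per column.

    `columns` maps a column name to a range class name (hypothetical mapping).
    """
    class_name = table.capitalize()
    lines = [
        f"<owl:Class rdf:ID={quoteattr(class_name)}>",
        f"  <rdfs:subClassOf rdf:resource={quoteattr(RELATION_URI)}/>",
        "</owl:Class>",
    ]
    for column, range_class in columns.items():
        # Assumed naming rule: table name followed by column name.
        property_name = f"{table.lower()}{column}"
        lines += [
            f"<owl:DatatypeProperty rdf:ID={quoteattr(property_name)}>",
            f"  <rdfs:domain rdf:resource={quoteattr('#' + class_name)}/>",
            f"  <rdfs:range rdf:resource={quoteattr('#' + range_class)}/>",
            "</owl:DatatypeProperty>",
        ]
    return "\n".join(lines)


owl_fragment = table_to_owl("customer", {"city": "String", "fname": "String"})
```

Using `quoteattr` keeps attribute values correctly escaped if a table or column name ever contains characters that are special in XML.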

Page 23

An OWL-QL query based on Stores7.owl

    <owl-ql:query xmlns:owl-ql="http://www.w3.org/2003/10/owl-ql-syntax#" ...>
      <owl-ql:premise>
        <rdf:RDF>
          <rdf:Description rdf:about="#C">
            <rdf:type rdf:resource="#Customer"/>
            <customercity rdf:resource="#Eugene"/>
          </rdf:Description>
        </rdf:RDF>
      </owl-ql:premise>
      <owl-ql:queryPattern>
        <rdf:RDF>
          <rdf:Description rdf:about="#C">
            <customerfname rdf:resource="http://www.w3.org/2003/10/owl-ql-variables#x"/>
            <customerlname rdf:resource="http://www.w3.org/2003/10/owl-ql-variables#y"/>
          </rdf:Description>
        </rdf:RDF>
      </owl-ql:queryPattern>
      <owl-ql:answerKBPattern>
        <owl-ql:kbRef rdf:resource="...stores7.owl"/>
      …

Page 24

The corresponding Web-PDDL and SQL queries

Web-PDDL query (translated from the OWL-QL query by PDDOWL):

    (and (customercity ?C - Customer "Eugene")
         (customerfname ?C - Customer ?x - String)
         (customerlname ?C - Customer ?y - String))

SQL query (translated from the Web-PDDL query by PDDSQL):

    SELECT C.customerfname, C.customerlname
    FROM Customer C
    WHERE C.customercity = 'Eugene'
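The Web-PDDL-to-SQL step can be sketched for this single-table case as follows. This is a hedged illustration rather than PDDSQL itself: it assumes the query is a conjunction of atoms over one relation, that each predicate name is also the column name (as in the SQL on the slide), and it runs against an in-memory SQLite table with sample rows, one of which is invented.

```python
import sqlite3


def atoms_to_sql(atoms, table, alias="C"):
    """Translate (predicate, subject, value) atoms over one table into SQL.

    Variables start with '?'. Constants become bound parameters.
    Table and predicate names are assumed to come from a trusted schema.
    """
    select_columns, conditions, params = [], [], []
    for predicate, _subject, value in atoms:
        column = f"{alias}.{predicate}"
        if value.startswith("?"):
            select_columns.append(column)      # variable: project the column
        else:
            conditions.append(f"{column} = ?")  # constant: filter on it
            params.append(value)
    sql = f"SELECT {', '.join(select_columns)} FROM {table} {alias}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


query_atoms = [
    ("customercity", "?C", "Eugene"),
    ("customerfname", "?C", "?x"),
    ("customerlname", "?C", "?y"),
]
sql, params = atoms_to_sql(query_atoms, "Customer")

# Run the generated query against a small in-memory table.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Customer (customerfname TEXT, customerlname TEXT, customercity TEXT)"
)
connection.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        ("Paea", "LePendu", "Eugene"),
        ("Dejing", "Dou", "Eugene"),
        ("Shiwoong", "Kim", "Eugene"),
        ("Alex", "Example", "Portland"),  # hypothetical non-matching row
    ],
)
rows = connection.execute(sql, params).fetchall()
connection.close()
```

Passing constants as bound parameters avoids quoting issues in the generated SQL; joins across several relations are outside the scope of this sketch.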

Page 25

Getting Answers from Stores7 DB

Answers from the Stores7 DB:

customerfname   customerlname
Paea            LePendu
Dejing          Dou
Shiwoong        Kim

Web-PDDL bindings (via PDDSQL):

    {?x/Paea, ?y/LePendu} {?x/Dejing, ?y/Dou} {?x/Shiwoong, ?y/Kim}

OWL-QL answer (via PDDOWL):

    <owl-ql:answerBundle xmlns:owl-ql="http://www.w3.org/2003/10/owl-ql-syntax#" ...>
      <owl-ql:answer>
        <owl-ql:binding-set>
          <var:x rdf:resource="#Paea"/>
          <var:y rdf:resource="#LePendu"/>
        </owl-ql:binding-set>
        <owl-ql:answerPatternInstance>
          <rdf:RDF>
            <rdf:Description rdf:about="#C">
              <customerfname rdf:resource="#Paea"/>

Performance: (1000 bindings/3 secs), (1000/100,000/3 secs)

Page 26

With an existing domain ontology

Order ontology: http://www.dayf.de/2004/owl/order.owl

Page 27

An OWL-QL query based on order.owl

    <owl-ql:query xmlns:owl-ql="http://www.w3.org/2003/10/owl-ql-syntax#" ...>
      <owl-ql:premise>
        <rdf:RDF>
          <rdf:Description rdf:about="#C">
            <rdf:type rdf:resource="#Person"/>
            <hasAddress rdf:resource="#A"/>
          </rdf:Description>
          <rdf:Description rdf:about="#A">
            <rdf:type rdf:resource="#Address"/>
            <City rdf:resource="#Eugene"/>
          </rdf:Description>
        </rdf:RDF>
      </owl-ql:premise>
      <owl-ql:queryPattern>
        <rdf:RDF>
          <rdf:Description rdf:about="#C">
            <FirstName rdf:resource="http://www.w3.org/2003/10/owl-ql-variables#x"/>
            <LastName rdf:resource="http://www.w3.org/2003/10/owl-ql-variables#y"/>
            …
      <owl-ql:kbRef rdf:resource="http://www.dayf.de/2004/owl/order.owl"/>
      …

Page 28

The Bridging Axioms/Mappings between Stores7.pddl and Order.pddl

    (T-> @stores7:Customer @order:Person)

    (forall (P - @order:Person A - @order:Address z - String)
      (if (and (@order:hasAddress P A)
               (@order:City A z))
          (@stores7:customercity P z)))

    (forall (C - @stores7:Customer z - String)
      (if (@stores7:customercity C z)
          (exists (A - @order:Address)
            (and (@order:hasAddress C A)
                 (@order:City A z)))))

Page 29

The Bridging Axioms/Mappings between Stores7.pddl and Order.pddl (continued)

    (T-> @stores7:Customer @order:Person)

    (forall (C - @stores7:Customer x - String)
      (iff (@stores7:customerfname C x)
           (@order:FirstName C x)))

    (forall (C - @stores7:Customer y - String)
      (iff (@stores7:customerlname C y)
           (@order:LastName C y)))

Page 30

The Query Translation between Stores7 and Order

The OWL-QL query in order.owl, translated to Web-PDDL by PDDOWL:

    (and (hasAddress ?C - Person ?A - Address)
         (City ?A "Eugene")
         (FirstName ?C - Person ?x - String)
         (LastName ?C - Person ?y - String))

The same query translated into the stores7 ontology by OntoEngine using the bridging axioms (< 1 sec):

    (and (customercity ?C - Customer "Eugene")
         (customerfname ?C - Customer ?x - String)
         (customerlname ?C - Customer ?y - String))
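The effect of the bridging axioms on this particular query can be mimicked with a few hand-written rewrite rules. The sketch below is only that: it hard-codes the three mappings for this example and is not a general inference engine like OntoEngine. The tuple representation of atoms and the `rewrite_order_to_stores7` helper are assumptions.

```python
# One-to-one predicate mappings, from the iff bridging axioms.
RENAMES = {"FirstName": "customerfname", "LastName": "customerlname"}


def rewrite_order_to_stores7(atoms):
    """Rewrite order-ontology query atoms into stores7 query atoms.

    Atoms are (predicate, subject, value) tuples; variables start with '?'.
    """
    # Map each address variable to the person it belongs to.
    owner_of = {address: person for pred, person, address in atoms if pred == "hasAddress"}
    rewritten = []
    for predicate, subject, value in atoms:
        if predicate == "hasAddress":
            # Folded into customercity together with the City atom.
            continue
        if predicate == "City" and subject in owner_of:
            rewritten.append(("customercity", owner_of[subject], value))
        elif predicate in RENAMES:
            rewritten.append((RENAMES[predicate], subject, value))
        else:
            raise ValueError(f"no bridging rule for atom: {predicate}")
    return rewritten


order_query = [
    ("hasAddress", "?C", "?A"),
    ("City", "?A", "Eugene"),
    ("FirstName", "?C", "?x"),
    ("LastName", "?C", "?y"),
]
stores7_query = rewrite_order_to_stores7(order_query)
```

The pair of `hasAddress` and `City` atoms collapses into a single `customercity` atom, which is why the stores7 query has one atom fewer than the order query.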

Page 31

Final Answers in the order ontology

Answers from the Stores7 DB:

customerfname   customerlname
Paea            LePendu
Dejing          Dou
Shiwoong        Kim

Facts in the stores7 ontology (via PDDSQL):

    (customerfname C1 Paea) (customerlname C1 LePendu) (customerfname C2 Dejing) …

Facts in the order ontology (via OntoEngine, 40,000 facts/30 secs):

    (FirstName C1 Paea) (LastName C1 LePendu) (FirstName C2 Dejing) …

OWL-QL answer (via PDDOWL, 10,000 facts/11 secs):

    <owl-ql:answer>
      <owl-ql:binding-set>
        <var:x rdf:resource="#Paea"/>
        <var:y rdf:resource="#LePendu"/>
      </owl-ql:binding-set>
      <owl-ql:answerPatternInstance>
        <rdf:RDF>
          <rdf:Description rdf:about="#C">
            <FirstName rdf:resource="#Paea"/>
            <LastName rdf:resource="#LePendu"/>

Page 32

Some related work

Semantic Annotation
– [Stojanovic et al.@SAC02] maps the relational model to frame logic/RDF.
– DOGMA [Verheyden et al.@SWDB04] translates an ontology query to SQL.

Schema and Ontology Mapping
– Similarity matching, machine learning… useful for generating candidate matchings
– Semi-automatic tool (Clio)

Data Integration and Query Answering
– Federated databases [Sheth&Larson 90], data warehouses, peer-to-peer management [Halevy et al.@ICDE03]; MiniCon [PottingerLevy@VLDB00] uses query rewriting at GLV

Logic and Databases
– Reiter’s reconstruction of the relational model in FOL.
– Carnot, SIMS, and Information Manifold use a global ontology with DL or Datalog

Page 33

Conclusion and Future work

We applied OntoGrate, an ontology-based information integration framework, to integrate relational databases with the Semantic Web. The test results from two scenarios are promising.

We are developing other modules (e.g., learning/mapping/UI) in OntoGrate.

Scalability and efficiency need to be investigated on larger data resources.

We plan to extend the current work to integrate XML data (with or without XML Schemas or DTDs) with the Semantic Web.

Page 34

Thank you for your attention!