ontologies and db schema: what's the difference?

Ontologies and Database Schema: What’s the Difference? Michael Uschold, PhD, Semantic Arts

Upload: michael-uschold

Post on 18-Dec-2014


DESCRIPTION

"What is the difference between an ontology and a database schema?" Since the early days of the now maturing ontology field, this has been a persistent question that has never been adequately answered. We define each concept for a high-level comparison and then ask the following questions about each concept: 1) What is it for? 2) What does it look like? 3) How do you build one? 4) How is it implemented and used? and 5) Where are the semantics?

This gives rise to many other questions. For example: What is the role of constraints? Of instances? Is there an analogy in ontology development for the process of database schema normalization? How is change management handled?

The differences between database schema and ontologies are many, varied and illuminating. Most arise from their different purposes and historical origins. There are also striking similarities. We wondered whether database schema and ontologies were more alike than different. We reached a surprising conclusion!

TRANSCRIPT

Page 1: Ontologies and DB Schema: What's the Difference?

Ontologies and Database Schema: What’s the Difference?

Michael Uschold, PhD, Semantic Arts

Page 2: Ontologies and DB Schema: What's the Difference?

Engineering, Operations & Technology | Phantom Works E&IT | Mathematics and Computing Technology

Copyright © 2011 Michael Uschold. All rights reserved.

Objective

To settle once and for all the question:

What is the difference between an ontology and a database schema?

• Often asked.

• Never adequately answered.

What is the same?

What is different?

Do we need both?

Page 3: Ontologies and DB Schema: What's the Difference?

Example Ontology

Hydraulics

Page 4: Ontologies and DB Schema: What's the Difference?

[Figure: hydraulics ontology diagram. Class labels: Mechanical Device, Pump, Engine, Hydraulic Pump, Fuel Pump, Aircraft Engine Driven Pump, Fuel System, Hydraulic System, Jet Engine, Fuel Filter, Pumping. Relation labels: part-of, connected-to, supplies-fuel-to, has-part, done-by.]

Example: Logical DB Schema (IDEF1X)

Hydraulics

Page 5: Ontologies and DB Schema: What's the Difference?

Example: Hydraulics

Is this similarity just superficial?

Ontology

[Figure: hydraulics ontology diagram, with classes such as Mechanical Device, Pump, Engine and Fuel System, and relations such as part-of, connected-to and supplies-fuel-to.]

Logical DB Schema: IDEF1X

Page 6: Ontologies and DB Schema: What's the Difference?

The Basic Idea for Each

Ontology:

• defines a set of concepts and relationships

• that represent the content and structure

• of some subject matter

• in a formal language.

Database Schema:

• defines the structure of a database in a formal language.

(loosely refers to any of: conceptual, logical, physical)

[Figure: hydraulics ontology diagram (classes such as Pump, Engine, Fuel System; relations such as part-of, has-part, done-by).]

For shared understanding.

For a database.
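To make the two definitions concrete, here is a minimal Python sketch that writes a fragment of the hydraulics example both ways. The class and relation names come from the slide, but the specific triples, the `is-a` relation name, and the `pump` table with its columns are illustrative assumptions, not read off the original diagrams.

```python
import sqlite3

# Ontology side: concepts and relationships stated as
# (subject, relation, object) triples. The triples are illustrative
# guesses; the original diagram is not recoverable from the transcript.
ONTOLOGY = {
    ("Pump", "is-a", "Mechanical Device"),
    ("Engine", "is-a", "Mechanical Device"),
    ("Fuel Pump", "is-a", "Pump"),
    ("Hydraulic Pump", "is-a", "Pump"),
    ("Jet Engine", "is-a", "Engine"),
    ("Fuel Pump", "part-of", "Fuel System"),
    ("Fuel Pump", "supplies-fuel-to", "Jet Engine"),
}

# Database side: the schema only says how rows are laid out.
# Table and column names are hypothetical.
SCHEMA = """
CREATE TABLE pump (
    pump_id   INTEGER PRIMARY KEY,
    kind      TEXT NOT NULL,
    system_id INTEGER
);
"""


def describe(term):
    """List what the ontology states about one concept."""
    return sorted((rel, obj) for subj, rel, obj in ONTOLOGY if subj == term)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# PRAGMA table_info yields one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(pump)")]

print(describe("Fuel Pump"))
print(columns)
```

The triples say what a fuel pump is and how it relates to other things; the schema says only which columns a `pump` row has, which is the contrast the slide draws between "for shared understanding" and "for a database".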

Page 7: Ontologies and DB Schema: What's the Difference?

Five Questions to get a Better Understanding

1. What is it for?

2. What does it look like?

3. How do you build one?

4. How is it implemented and used?

5. Where are the semantics?

“it”: ontology or database schema

Quick poll: are they more alike or more different?

Page 8: Ontologies and DB Schema: What's the Difference?

What is it for?

Focus

• DB Schema: data.

• Ontology: meaning.

Core purpose(s)

• DB Schema: structure instances for efficient storage and querying. Single purpose.

• Ontology: shared understanding (human communication, interoperability, search, software engineering, …). Also for structuring instances.

By the way…

• DB Schema: meaning lost.

• Ontology: instances optional.

Page 9: Ontologies and DB Schema: What's the Difference?

What does it look like? Notation

Notation: syntax

• DB Schema: ER diagrams; no standard serialization syntax.

• Ontology: logic; no standard diagram notation.

Notation: semantics

• DB Schema: minimal focus on formal semantics.

• Ontology: strong focus on formal semantics.

Page 10: Ontologies and DB Schema: What's the Difference?

What does it look like? Expressivity

Expressivity overlap

• DB Schema: entities, attributes, relations, constraints.

• Ontology: classes, properties, axioms.

Expressivity differences

• DB Schema: no taxonomy. Constraints for integrity, foreign key, delete.

• Ontology: taxonomy is backbone. Constraints for meaning, consistency & integrity.

Cardinality constraints.
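As a rough illustration of "taxonomy is backbone", the sketch below stores a subclass hierarchy and answers is-a questions by walking it. The hierarchy is an assumed reading of the hydraulics example (single inheritance only, for brevity), and `SUBCLASS_OF`, `ancestors` and `is_a` are hypothetical names.

```python
# Child class -> direct parent class. An assumed fragment of the
# hydraulics taxonomy; the slide's real hierarchy may differ.
SUBCLASS_OF = {
    "Pump": "Mechanical Device",
    "Engine": "Mechanical Device",
    "Hydraulic Pump": "Pump",
    "Fuel Pump": "Pump",
    "Jet Engine": "Engine",
}


def ancestors(cls):
    """Return every superclass of cls, nearest first."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_a(cls, candidate):
    """True if cls is candidate or a (transitive) subclass of it."""
    return cls == candidate or candidate in ancestors(cls)


print(ancestors("Fuel Pump"))
print(is_a("Jet Engine", "Pump"))
```

Anything stated about Mechanical Device then applies to Fuel Pump without being repeated. A relational schema has no built-in counterpart to this walk, which is what "no taxonomy" refers to on the DB Schema side.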

Page 11: Ontologies and DB Schema: What's the Difference?

How to Build One?

Starting point

• DB Schema: scratch, rarely reuse.

• Ontology: reuse if possible.

Normalization

• DB Schema: standard rules in natural language, little tool support.

• Ontology: no standard rules or guidelines. OntoClean.

Optimization

• DB Schema: fundamental step. Manual, geared to specific queries for specific DB. Toss expensive constraints. Lost meaning.

• Ontology: ontology independent; inference engine developers. Some tuning may be required. A black art.

Page 12: Ontologies and DB Schema: What's the Difference?

How is it Implemented and Used?

Change Management, Agility, Flexibility

• DB Schema: locked into specific set of queries per DB. Tight coupling. Lost meaning. Hard to evolve and maintain. ETL tools to help. Semantics hardwired in procedural code.

• Ontology: no query lock-in. Queries usable on other systems. Looser coupling. Semantics explicit. Potentially easier to evolve & maintain. Few tools. Still no picnic!

Page 13: Ontologies and DB Schema: What's the Difference?

How is it Implemented and Used?

Processing Engines

• DB Schema: SQL engines. Standardized on SQL.

• Ontology: theorem provers. Less standardization.

Primary Focus

• DB Schema: queries; reasoning with views; data integrity.

• Ontology: derive new information from existing information; consistency and integrity.

Both handle complex logical expressions.
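"Derive new information from existing information" can be illustrated with a toy fixed-point loop that applies one rule, transitivity of part-of, until nothing new appears. This is only a sketch of the idea, not how a theorem prover works internally. The asserted facts, including the `Aircraft` class, are hypothetical.

```python
def infer_transitive(facts):
    """Close (part, whole) pairs under: a part-of b, b part-of c => a part-of c."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow inside the loop.
        for a, b in list(derived):
            for c, d in list(derived):
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived


# Hypothetical asserted part-of facts for the hydraulics example.
ASSERTED = {
    ("Fuel Filter", "Fuel Pump"),
    ("Fuel Pump", "Fuel System"),
    ("Fuel System", "Aircraft"),
}

ALL_FACTS = infer_transitive(ASSERTED)
NEW_FACTS = ALL_FACTS - ASSERTED
print(sorted(NEW_FACTS))
```

The three derived pairs were never stored; they follow from the rule. A SQL engine, by contrast, returns what the query selects from stored rows and views.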

Page 14: Ontologies and DB Schema: What's the Difference?

How is it Implemented and Used?

Performance

• DB Schema: highly tuned for performance and scale. Does not work well with too many joins.

• Ontology: full inferencing: much smaller scale. Reduced inferencing: reaching large scale.

RDB and triple stores both have a niche.

Tradeoff: performance vs. function & flexibility

Page 15: Ontologies and DB Schema: What's the Difference?

How is it Implemented and Used?

Building Databases & Applications

• Gather Requirements

• Build Conceptual Schema

(e.g., ER or UML model)

• Refine to Logical Schema

(e.g., normalize, still ER or UML)

• Refine to Physical Schema…

Page 16: Ontologies and DB Schema: What's the Difference?

How is it Implemented and Used?

Refine to Physical Schema

• Define tables, columns, keys.

• Optimize to Specific Kinds of Queries.

• Create Data Dictionary (semantics for humans).

• Integrity Constraints: Domain, Referential & Semantic integrity

• Do the best you can, little automation.

• Where are the Semantics?
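The physical-schema steps above can be sketched with Python's standard `sqlite3` module. The table names, columns and allowed `kind` values are hypothetical. The example shows domain integrity (a CHECK constraint) and referential integrity (a foreign key) declared in the schema; semantic integrity is not shown.

```python
import sqlite3

# Hypothetical physical schema for the hydraulics example.
DDL = """
CREATE TABLE aircraft_system (
    system_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE pump (
    pump_id   INTEGER PRIMARY KEY,
    -- domain integrity: only known pump kinds are allowed
    kind      TEXT NOT NULL CHECK (kind IN ('fuel', 'hydraulic')),
    -- referential integrity: every pump belongs to an existing system
    system_id INTEGER NOT NULL REFERENCES aircraft_system(system_id)
);
"""


def try_insert(conn, params):
    """Return True if the pump row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO pump VALUES (?, ?, ?)", params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(DDL)
with conn:
    conn.execute("INSERT INTO aircraft_system VALUES (1, 'main fuel system')")

accepted = try_insert(conn, (1, "fuel", 1))
bad_kind = try_insert(conn, (2, "water", 1))   # violates the CHECK constraint
bad_system = try_insert(conn, (3, "fuel", 99))  # violates the foreign key
print(accepted, bad_kind, bad_system)
```

The constraints reject bad rows, but nothing in the schema says what a pump or a system is. That meaning stays in the data dictionary, which is the gap the closing bullet asks about.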

Page 17: Ontologies and DB Schema: What's the Difference?

Five Questions to get a Better Understanding

1. What is it for?

2. What does it look like?

3. How do you build one?

4. How is it implemented and used?

5. Where are the semantics?

“it”: ontology or database schema

Page 18: Ontologies and DB Schema: What's the Difference?

Where are the Semantics for Database Schema?

• Mainly in Conceptual Schema and Data Dictionary

• Designed for Humans

• Semantics don’t evolve as DB and applications change

• Conceptual Schema semantics thrown away when building Physical Schema.

• Integrity constraints hardwired in procedural code

Semantics are hardwired, lost, tossed, or out of date.

You cannot maintain what you do not understand!

Page 19: Ontologies and DB Schema: What's the Difference?

Quick Summary

Database Schema

Focus on data

DB Constraints

• to ensure integrity

• may hint at meaning

No ISA hierarchy

SQL Engines

• querying, views

• data integrity

Instances central

Data Dictionary

• separate artifact

Ontology

Focus on meaning

Ontology Axioms

• to specify meaning

• maybe for integrity

ISA hierarchy is backbone

Theorem Provers

• infer new information

• ensure consistency

Instances optional

'Comments'

• part of the ontology

Page 20: Ontologies and DB Schema: What's the Difference?

More Detailed Summary: 24 Features/Aspects

Core for DB Schema, core for Ontology

1. Represented using a formal language.

2. Expressivity: types, properties, constraints.

3. Constraints for consistency checking.

Secondary for DB Schema, core for Ontology

4. Shared meaning of some subject matter.

5. Taxonomy.

6. Multi-purpose.

7. Embedded natural language definitions.

8. Constraints for meaning.

9. Constraints for ensuring self-consistency (not data).

Unimportant for DB Schema, core for Ontology

16. Abstract types w/no instances.

17. Reused to build new ones.

18. Reused in unexpected ways.

19. Formal model-theoretic semantics.

Core for DB Schema, secondary for Ontology

10. Efficient querying and storage for data.

11. Standardized diagram notation.

12. Separate natural language definitions (data dictionary).

13. Constraints for data integrity.

14. Industry-wide construction guidelines (normalization).

15. Scale to huge sizes.

Core for DB Schema, unimportant for Ontology

20. Cardinality constraints for getting foreign keys right and ensuring tables created for many-to-many relationships.

21. Toss semantics after conceptual modeling.

22. Optimization for specific set of queries.

23. Sophisticated tool support for migrating data when schema evolve (ETL).

24. SQL for querying data.

More Alike?

• Over 60% of features are common to both.

• The 3 features core to both are the most important: what is expressed and how.

More Different?

• Only 12% of features are core to both.

Can you convert one into the other?
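The two percentages on this slide can be checked with a few lines of Python. The grouping of feature numbers below is an assumed reading of the slide's grid (features 1 to 3 core to both, 4 to 15 relevant to both but core to only one, 16 to 24 unimportant to one side). "Common to both" is taken to mean "not unimportant to either".

```python
# (importance for DB schema, importance for ontology) -> feature numbers.
# This grouping is an assumed reconstruction of the slide's layout.
FEATURES = {
    ("core", "core"): range(1, 4),
    ("secondary", "core"): range(4, 10),
    ("core", "secondary"): range(10, 16),
    ("unimportant", "core"): range(16, 20),
    ("core", "unimportant"): range(20, 25),
}

total = sum(len(nums) for nums in FEATURES.values())
common = sum(
    len(nums)
    for levels, nums in FEATURES.items()
    if "unimportant" not in levels
)
core_to_both = len(FEATURES[("core", "core")])

print(f"common to both: {common}/{total} = {common / total:.1%}")
print(f"core to both:   {core_to_both}/{total} = {core_to_both / total:.1%}")
```

Under this reading, 15 of 24 features (62.5%) are common to both and 3 of 24 (12.5%) are core to both, which matches "over 60%" and roughly "12%".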

Page 21: Ontologies and DB Schema: What's the Difference?

More Alike? or More Different?

Conversion?

• Conceptual Schema & Ontology

No harder than between two different ontology languages

• Ontology & Logical Schema

Some loss of information, a design artifact

• Ontology & Physical Schema

Much loss of information, an implementation artifact

My Big Fat Greek Wedding toast: “… even though we are apples and oranges, we are all fruit.”

Page 22: Ontologies and DB Schema: What's the Difference?

Conclusion

Speaking of weddings… What do you get if you cross:

• DB schema technology/community?

• Ontology technology/community?

• They are similar beasts.

• They evolved in different communities.

Two cultures divided by a common idea?

Page 23: Ontologies and DB Schema: What's the Difference?

Ontology/Model-Driven Architecture & Development

Basic Idea:

• Explicitly capture the semantics as formal ontologies.

• Base design- and implementation-level artifacts on the ontologies.

• Ontologies drive the applications, directly or indirectly.

Benefits:

• Looser coupling.

• Semantics is explicit.

• Integration/Interoperability by design.

• Inference gives automated consistency checking.

• Easier to evolve and maintain:

• Less hardwiring of semantics means easier to understand and change

• Ontologies evolve with the DB and applications.

Conceptual, Real world

Logical, Design world

Physical, Implementation world

Page 24: Ontologies and DB Schema: What's the Difference?

Acknowledgements

For answering countless questions as a cube-mate:

• John Thompson

For reviews of a companion unpublished paper:

• Phil Bernstein,

• Tim Wilmering,

• Jun Yuan,

• Anhai Doan,

• Bill Andersen,

• Amit Sheth.

For ideas on model-driven development:

• Simon Robe.
