1

Developing Ontologies (and more)

Peter Fox (NCAR)

ESIP Winter Meeting (TIWG)

January 9, 2008, Washington, D.C.

2

Ontology Spectrum

Spectrum, from least to most expressive:

• Catalog/ID
• Terms/glossary
• Thesauri, “narrower term” relation
• Informal is-a
• Formal is-a
• Formal instance
• Frames (properties)
• Value Restrs.
• Selected Logical Constraints (disjointness, inverse, …)
• General Logical constraints

Originally from the AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness. Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html

3

Ontology - declarative knowledge

• The triple: {subject-predicate-object}

interferometer is-a optical instrument

Fabry-Perot is-a interferometer

Optical instrument has focal length

Optical instrument is-a instrument

Instrument has instrument operating mode

Data archive has measured parameter

SO2 concentration is-a concentration

Concentration is-a parameter
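The triples above can be held in a plain list and queried directly. The following is a minimal Python sketch, not a real triple store: the helper names `ancestors` and `inherited_properties` are made up for illustration, and the subject names are lower-cased for consistency.

```python
# Triples from the slide, stored as (subject, predicate, object) tuples.
TRIPLES = [
    ("interferometer", "is-a", "optical instrument"),
    ("Fabry-Perot", "is-a", "interferometer"),
    ("optical instrument", "has", "focal length"),
    ("optical instrument", "is-a", "instrument"),
    ("instrument", "has", "instrument operating mode"),
    ("data archive", "has", "measured parameter"),
    ("SO2 concentration", "is-a", "concentration"),
    ("concentration", "is-a", "parameter"),
]


def ancestors(term, triples):
    """Return every class reachable from `term` by following is-a links."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "is-a" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def inherited_properties(term, triples):
    """Collect 'has' properties declared on `term` or on any of its ancestors."""
    classes = [term] + ancestors(term, triples)
    return sorted(o for s, p, o in triples if p == "has" and s in classes)
```

Calling `inherited_properties("Fabry-Perot", TRIPLES)` walks the is-a chain up to instrument, so the Fabry-Perot picks up both focal length and instrument operating mode without either being stated on it directly.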

4

Semantic Web Layers

http://www.w3.org/2003/Talks/1023-iswc-tbl/slide26-0.html, http://flickr.com/photos/pshab/291147522/

5

Terminology

• Ontology (n.d.). The Free On-line Dictionary of Computing. http://dictionary.reference.com/browse/ontology
– An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.
• Semantic Web
– An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation, www.semanticweb.org
– Primer: http://www.ics.forth.gr/isl/swprimer/
• Languages
– OWL 1.0 (Lite, DL, Full) - Web Ontology Language (W3C)
– RDF - Resource Description Framework (W3C)
– OWL-S/SWSL - Web Services (W3C)
– WSMO/WSML - Web Services (EC/W3C)
– SWRL - Semantic Web Rule Language, RIF - Rule Interchange Format
– Editors: Protégé, SWOOP, CoE, VOM, Medius, SWeDE, …

6

OWL and RDF

• OWL
– Lite
– DL
– Full
• RDF
• Services
– OWL-S
– SWSL
– WSML
– SAWSDL - (WSDL-S)
• Rules
– SWRL

7

Developing Ontologies

• Approach:
– Bottom-up
– Top-down (upper-level or foundational)
– Mid-level (use case)

• Using tools

• Coding and testing

• Iterating

• Maintaining and evolving (curation, preservation)

8

GRDDL - bottom up

• GRDDL - Gleaning Resource Descriptions from Dialects of Languages

• Pretty much = “XML/XHTML (for example) into RDF via XSLT”

• Good support, e.g. Jena

• Handles microformats

• Active community

• How to categorize, use, re-use (parts of)?
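To make the "gleaning" idea concrete, here is a minimal Python sketch that pulls triples out of an XHTML page. It is only a stand-in: GRDDL proper names an XSLT transformation that produces RDF, and the Python standard library has no XSLT engine, so this walks the parsed tree directly. The page content, the subject URI and the function name `glean` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical input page; the metadata values are invented.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example dataset</title>
    <meta name="DC.creator" content="A. Author" />
    <meta name="DC.date" content="2008-01-09" />
  </head>
  <body><p>Landing page text.</p></body>
</html>"""


def glean(xhtml_text, page_uri):
    """Turn <meta name=... content=...> pairs into (subject, predicate, object) triples."""
    root = ET.fromstring(xhtml_text)
    triples = []
    for meta in root.iter(XHTML_NS + "meta"):
        name = meta.get("name")
        content = meta.get("content")
        if name and content:
            triples.append((page_uri, name, content))
    return triples
```

`glean(PAGE, "http://example.org/dataset")` yields one triple per meta element, with the page itself as the subject. A real GRDDL pipeline would emit RDF and resolve the predicate names to full URIs.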

9

Collecting

• RDFa extends XHTML by:
– extending the link and meta elements to include child elements
– adding metadata to any element (a bit like the class attribute in micro-formats, but via dedicated properties)

– It is very similar to micro-formats, but with more rigor:

• it is a general framework (instead of an “agreement” on the meaning of, say, a class attribute value)

• terminologies can be mixed more easily

• ATOM (used with RSS)

10

Foundational Ontologies

CONTENTS

General concepts and relations that apply in all domains: physical object, process, event, …, inheres, participates, …

Rigorously defined: formal logic, philosophical principles, highly structured

Examples: DOLCE, BFO, GFO, SUMO, CYC, (Sowa)

Courtesy: Boyan Brodaric

11

Foundational Ontologies

PURPOSE: help integrate domain ontologies

Geophysics ontology

Marine ontology

Water ontology

Planetary ontology

Geology ontology

Struc ontology

Rock ontology

“…and then there was one…”

Foundational ontology

Courtesy: Boyan Brodaric

12

Foundational Ontologies

PURPOSE: help organize domain ontologies

“…a place for everything, and everything in its place…”

Foundational ontology

shale rock formation

lithification

Courtesy: Boyan Brodaric

13

Problem scenario

Little work done on linking foundational ontologies with geoscience ontologies

Such linkage might benefit various scenarios requiring cross-disciplinary knowledge, e.g.:

water budgets: groundwater (geology) and surface water (hydro)

hazards risk: hazard potential (geology, geophysics) and items at threat (infrastructure, people, environment, economic)

health: toxic substances (geochemistry) and people, wildlife

many others…

Courtesy: Boyan Brodaric

14

DOLCE

15

DOLCE + SWEET

DOLCE = SWEET < SWEET

Physical-body BodyofGround, BodyofWater,…

Material-Artifact Infrastructure, Dam, Product,…

Physical-Object LivingThing, MarineAnimal

Amount-of-Matter Substance

Activity HumanActivity

Physical-Phenomenon Phenomena

Process Process

State StateOfMatter

Quality Quantity, Moisture,…

Physical-Region Basalt,…

Temporal-Region Ordovician,…

Benefits:
• full coverage
• rich relations
• home for orphans
• single superclasses

Issues:
• individuals (e.g. Planet Earth)
• roles (contaminant)
• features (SeaFloor)

Courtesy: Boyan Brodaric

16

Conclusions

Surprisingly good fit amongst ontologies: so far no show-stopper conflicts, a few difficult conflicts

DOLCE richness benefits geoscience ontologies: good conceptual foundation helps clear some existing problems

Unresolved issues in modeling science entities: modeling classifications, interpretations, theories, models, …

Courtesy: Boyan Brodaric

Same procedure with GeoSciML

17

SUMO - Suggested Upper Merged Ontology

• Physical
  • Object
    • SelfConnectedObject
      • ContinuousObject
      • CorpuscularObject
    • Collection
  • Process
• Abstract
  • SetClass
    • Relation
  • Proposition
  • Quantity
    • Number
    • PhysicalQuantity
  • Attribute

18

19

20

Using SNAP/ SPAN

21

GeoSciOnt?

22

23

Using SWEET

• Plug-in (import) domain detailed modules

• Lots of classes, few relations (properties)

24

Mix-n-Match

• The IRI example:

– Collect a lot of different ontologies representing different terms, levels of concepts, etc. into a base form: RDF

– See Benno’s talk in session 1b.

• MMI

• Others

25

[Figure labels: CF attributes; SWEET Ontologies (OWL); Search Terms; CF Standard Names (RDF object); IRIDL Terms; NC basic attributes; IRIDL attributes/objects; SWEET as Terms; CF Standard Names as Terms; Gazetteer Terms; CF data objects; Location]

Blumenthal

26

IRI RDF Architecture

[Figure labels: Data Servers; Ontologies; MMI; JPL; Standards Organizations; Start Point; RDF Crawler; RDFS Semantics; OWL Semantics; SWRL Rules; SeRQL CONSTRUCT; Search Queries; Location Canonicalizer; Time Canonicalizer; Sesame; Search Interface; bibliography]

Blumenthal

27

Mid-Level: Developing ontologies

• Use cases and small team (7-8; 2-3 domain experts, 2 knowledge experts, 1 software engineer, 1 facilitator, 1 scribe)
• Identify classes and properties (leverage controlled vocab.)
– Start with narrower terms, generalize when needed or possible
– Adopt a suitable conceptual decomposition (e.g. SWEET)
– Import modules when concepts are orthogonal
• Review, vet, publish
• Only code them (in RDF or OWL) when needed (CMAP, …)
• Ontologies: small and modular

28

Use Case example

• Plot the neutral temperature from the Millstone-Hill Fabry Perot, operating in the vertical mode during January 2000 as a time series.
• Objects:
– Neutral temperature is a (temperature is a) parameter
– Millstone Hill is a (ground-based observatory is a) observatory
– Fabry-Perot is a interferometer is a optical instrument is a instrument
– Vertical mode is a instrument operating mode
– January 2000 is a date-time range
– Time is a independent variable/coordinate
– Time series is a data plot is a data product

29

Class and property example

• Parameter
– Has coordinates (independent variables)
• Observatory
– Operates instruments
• Instrument
– Has operating mode
• Instrument operating mode
– Has measured parameters
• Date-time interval
• Data product
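One way to see how these classes and properties hang together is to write them as plain Python dataclasses and follow the property chain. This is a sketch only: the field names, the `parameters_measured` helper and the instance data (drawn loosely from the Millstone Hill use case) are assumptions for illustration, not part of any published ontology.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    name: str
    coordinates: List[str] = field(default_factory=list)  # independent variables


@dataclass
class OperatingMode:
    name: str
    measured_parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Instrument:
    name: str
    operating_modes: List[OperatingMode] = field(default_factory=list)


@dataclass
class Observatory:
    name: str
    instruments: List[Instrument] = field(default_factory=list)


def parameters_measured(observatory, instrument_name, mode_name):
    """Follow operates -> has operating mode -> has measured parameters."""
    return [
        parameter.name
        for instrument in observatory.instruments
        if instrument.name == instrument_name
        for mode in instrument.operating_modes
        if mode.name == mode_name
        for parameter in mode.measured_parameters
    ]


# Illustrative instances only.
neutral_temperature = Parameter("neutral temperature", coordinates=["time"])
vertical = OperatingMode("vertical", [neutral_temperature])
fabry_perot = Instrument("Fabry-Perot", [vertical])
millstone_hill = Observatory("Millstone Hill", [fabry_perot])
```

`parameters_measured(millstone_hill, "Fabry-Perot", "vertical")` returns the parameter names reachable through that chain. The property hangs off the operating mode rather than the instrument, which is what lets the same instrument expose different parameters in different modes.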

30

31

32

33

Higher level use case

• Find data which represents the state of the neutral atmosphere above 100km, toward the arctic circle at any time of high geomagnetic activity

34

Translating the Use-Case - non-monotonic?

Input

Physical properties: State of neutral atmosphere

Spatial:

• Above 100km

• Toward arctic circle (above 45N)

Conditions:

• High geomagnetic activity

Action: Return Data

Specification needed for query to CEDARWEB

Instrument

Parameter(s)

Operating Mode

Observatory

Date/time

Return-type: data

GeoMagneticActivity has ProxyRepresentation

GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere)

Kp is a GeophysicalIndex

hasTemporalDomain: “daily”

hasHighThreshold: xsd_number = 8

Date/time when Kp >= 8

35

Translating the Use-Case - ctd.

Input

Physical properties: State of neutral atmosphere

Spatial:

Above 100km

Toward arctic circle (above 45N)

Conditions:

High geomagnetic activity

Action: Return Data

Specification needed for query to CEDARWEB

Instrument

Parameter(s)

Operating Mode

Observatory

Date/time

Return-type: data

NeutralAtmosphere is a subRealm of TerrestrialAtmosphere

hasPhysicalProperties: NeutralTemperature, Neutral Wind, etc.

hasSpatialDomain: [0,360],[0,180],[100,150]

hasTemporalDomain:

NeutralTemperature is a Temperature (which) is a Parameter

FabryPerotInterferometer is a Interferometer, (which) is a Optical Instrument (which) is a Instrument

hasFilterCentralWavelength: Wavelength

hasLowerBoundFormationHeight: Height

ArcticCircle is a GeographicRegion

hasLatitudeBoundary:

hasLatitudeUpperBoundary:

GeoMagneticActivity has ProxyRepresentation

GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere)

Kp is a GeophysicalIndex

hasTemporalDomain: “daily”

hasHighThreshold: xsd_number = 8

Date/time when Kp >= 8
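The translation step can be sketched in Python: the vague condition "high geomagnetic activity" is resolved, via the Kp threshold, into concrete dates that a data query can use. The Kp values below are invented, and the dictionary keys are hypothetical; they are not the actual CEDARWEB query interface.

```python
from datetime import date

HIGH_KP_THRESHOLD = 8  # hasHighThreshold from the slide

# Made-up daily Kp values, for illustration only.
DAILY_KP = {
    date(2000, 1, 10): 3,
    date(2000, 1, 11): 8,
    date(2000, 1, 12): 9,
    date(2000, 1, 13): 5,
}


def high_activity_dates(daily_kp, threshold=HIGH_KP_THRESHOLD):
    """Resolve 'high geomagnetic activity' to the dates where Kp is at or above the threshold."""
    return sorted(day for day, kp in daily_kp.items() if kp >= threshold)


def build_query(daily_kp, min_height_km=100, min_latitude_deg=45):
    """Assemble a query specification (hypothetical keys) from the use-case inputs."""
    return {
        "parameters": ["NeutralTemperature", "NeutralWind"],
        "min_height_km": min_height_km,        # "above 100km"
        "min_latitude_deg": min_latitude_deg,  # "toward arctic circle (above 45N)"
        "dates": high_activity_dates(daily_kp),
        "return_type": "data",
    }
```

With the sample values, `build_query(DAILY_KP)["dates"]` contains 11 and 12 January 2000. The point is that the user never states a date: the ontology supplies the proxy (Kp) and its threshold.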

36

Tools - Using Protégé

37

Creating Ontologies - visual

• UML - new release of ODM/MOF
– Ontology Definition Metamodel/Meta Object Facility (OMG) for UML
– Provides standardized notation
• CMAP Ontology Editor (concept mapping tool from IHMC)
– Drag/drop visual development of classes, subclass (is-a) and property relationships
– Reads and writes OWL
– Formal convention (OWL/RDF tags, etc.)
• White board, text file

38

Using CMAP/COE

39

40

Is OWL the only option? No…

• SKOS - Simple Knowledge Organization System
• Annotations (RDFa)
• Atom
• Natural Language (read results from a web search and transform to a usable form)
– CL (common logic)
– Rabbit, e.g. ShellfishCourse is a Meal Course that (if has drink) always has drink Potable Liquid that has Full body and which either has Moderate or Strong flavour
– PENG (processable English)

41

Is OWL the only option II? No…

• Natural Language (NL)
– Read results from a web search and transform to a usable form
– Find/filter out inconsistencies, concepts/relations that cannot be represented
• Popular options
– CLCE (Common Logic Controlled English)
– Rabbit, e.g. ShellfishCourse is a Meal Course that (if has drink) always has drink Potable Liquid that has Full body and which either has Moderate or Strong flavour
– PENG (processable English)
• Really need PSCI - process-able science

42

Creating Ontologies - verbal

• Translating use cases

• E.g. Find data which represents the state of the neutral atmosphere above 100km, toward the arctic circle at any time of high geomagnetic activity

• Can this be expressed as an ontology?
– CLCE, Rabbit, PENG, Sydney syntax

• Notice something about the next examples?

43

Sydney syntax

If X has Y as a father then Y is the only father of X.

The class person is equivalent to male or female, and male and female are mutually exclusive.

is equivalent to:

The classes male and female are mutually exclusive. The class person is fully defined as anything that is a male or a female.
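The person/male/female statements can be mimicked on sets of class names in a few lines of Python. This is a loose sketch with made-up function names, and it is not how an OWL reasoner works internally: it only applies the equivalence in the forward direction and flags the disjointness violation.

```python
def infer_person(classes):
    """'Person is equivalent to male or female': anything male or female is also a person."""
    if classes & {"male", "female"}:
        return classes | {"person"}
    return set(classes)


def is_consistent(classes):
    """'Male and female are mutually exclusive': no individual may carry both."""
    return not {"male", "female"} <= classes
```

For example, `infer_person({"male"})` adds person to the set, while `is_consistent({"male", "female"})` is false. The father statement is left out on purpose: under OWL semantics two stated fathers would be inferred to be the same individual, which a simple check like this cannot express.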

44

PENG - Processable English

1. If X is a research programmer then X is a programmer.

2. Bill Smith is a research programmer who works at the CLT.

3. Who is a programmer and works at the CLT?
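The three PENG sentences map onto a rule, two facts and a query. The Python sketch below works through that mapping with a tiny forward-chaining loop; the data layout and the names `apply_rules` and `who` are assumptions for illustration, not PENG's actual machinery.

```python
# Statement 1 as a rule: every research programmer is a programmer.
SUBCLASS_RULES = {"research programmer": "programmer"}

# Statement 2 as facts about one individual.
FACTS = {
    ("Bill Smith", "is a", "research programmer"),
    ("Bill Smith", "works at", "CLT"),
}


def apply_rules(facts, rules):
    """Forward-chain the subclass rules until no new 'is a' fact appears."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(derived):
            if predicate == "is a" and obj in rules:
                new_fact = (subject, "is a", rules[obj])
                if new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


def who(facts, *conditions):
    """Answer 'Who ...?' for a conjunction of (predicate, object) conditions."""
    subjects = {s for s, _, _ in facts}
    return sorted(s for s in subjects if all((s, p, o) in facts for p, o in conditions))
```

Statement 3 becomes `who(apply_rules(FACTS, SUBCLASS_RULES), ("is a", "programmer"), ("works at", "CLT"))`. Running the same query on the raw facts finds nobody, because "Bill Smith is a programmer" is never stated, only derived.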

45

CLCE - Common Logic Controlled English

CLCE: If a set x is the set of (a cat, a dog, and an elephant), then the cat is an element of x, the dog is an element of x, and the elephant is an element of x.

PC:~(∃x:Set)(∃x1:Cat)(∃x2:Dog)(∃x3:Elephant)(Set(x,x1,x2,x3) ∧ ~(x1∈x ∧ x2∈x ∧ x3∈x))

46

Use Case

• Provide a decision support capability for an analyst to determine an individual’s susceptibility to avian flu without having to be precise in terminology (-nyms)

47

48

49

Using ThManager

50

Services

• An ontology of services answers three questions:

– What does the service provide for prospective clients? The answer to this question is given in the "profile," which is used to advertise the service. To capture this perspective, each instance of the class Service presents a ServiceProfile.

– How is it used? The answer to this question is given in the "process model." This perspective is captured by the ServiceModel class. Instances of the class Service use the property describedBy to refer to the service's ServiceModel.

– How does one interact with it? The answer to this question is given in the "grounding." A grounding provides the needed details about transport protocols. Instances of the class Service have a supports property referring to a ServiceGrounding.
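The profile / process model / grounding split can be pictured as a small Python data structure. This is a sketch, not OWL-S itself: the class and property names follow the slide, but the fields inside each class and the example service are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ServiceProfile:
    """What the service provides; used to advertise it."""
    summary: str


@dataclass
class ServiceModel:
    """How the service is used (the process model)."""
    steps: Tuple[str, ...]


@dataclass
class ServiceGrounding:
    """How to interact with the service, e.g. transport protocol details."""
    protocol: str
    endpoint: str


@dataclass
class Service:
    presents: ServiceProfile      # Service presents a ServiceProfile
    described_by: ServiceModel    # Service describedBy a ServiceModel
    supports: ServiceGrounding    # Service supports a ServiceGrounding


# Hypothetical service; every value below is invented for illustration.
reprojection = Service(
    presents=ServiceProfile("Reproject gridded model output"),
    described_by=ServiceModel(("read grid", "transform", "write grid")),
    supports=ServiceGrounding("HTTP", "http://example.org/reproject"),
)
```

A client that is only searching for services needs `reprojection.presents`; one that is about to invoke the service needs `reprojection.supports`. Keeping the three parts separate means each can change without touching the others.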

51

Developing a service ontology

• Use case: find and display in the same projection, sea surface temperature and land surface temperature from a global climate model.
• Classes/concepts:
– Temperature
– Surface (sea/land)
– Model
– Climate
– Global
– Projection
– Display …

52

Service ontology

• Climate model is a model
• Model has domain
• Climate Model has component representation
• Land surface is-a component representation
• Ocean is-a component representation
• Sea surface is part of ocean
• Model has spatial representation (and temporal)
• Spatial representation has dimensions
• Latitude-longitude is a horizontal spatial representation
• Displaced pole is a horizontal spatial representation
• Ocean model has displaced pole representation
• Land surface model has latitude-longitude representation
• Lambert conformal is a geographic spatial representation
• Reprojection is a transform between spatial representations
• ….

53

Service ontology

• A sea surface model has grid representation displaced pole and land surface model has grid representation latitude-longitude and both must be transformed to Lambert conformal for display
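Once the grid representations are recorded, deciding which reprojections are needed becomes a lookup. A minimal Python sketch follows; the dictionary layout and the function name `reprojection_plan` are assumptions, and no actual coordinate transformation is performed.

```python
# Grid representation of each model component, as stated on the slide.
GRID_REPRESENTATION = {
    "sea surface model": "displaced pole",
    "land surface model": "latitude-longitude",
}

DISPLAY_PROJECTION = "Lambert conformal"


def reprojection_plan(models, grids=GRID_REPRESENTATION, target=DISPLAY_PROJECTION):
    """List (model, source, target) for every model whose grid differs from the display projection."""
    plan = []
    for model in models:
        source = grids[model]
        if source != target:
            plan.append((model, source, target))
    return plan
```

`reprojection_plan(["sea surface model", "land surface model"])` returns two steps, one per model, since neither is already in Lambert conformal. A model already in the target projection would simply be skipped.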

54

Best practices

• Ontologies/vocabularies must be shared and reused - swoogle.umbc.edu, www.planetont.org
• Examine ‘core vocabularies’ to start with
– SKOS Core: about knowledge systems
– Dublin Core: about information resources, digital libraries, with extensions for rights, permissions, digital rights management
– FOAF: about people and their organizations
– DOAP: on the descriptions of software projects
– DOLCE seems the most promising to match science ontologies
• Go “Lite” as much as possible, then DL and only if you have to Full - balancing expressibility vs. implementability
• Minimal properties to start, add only when needed

55

Tutorial Summary

• Many different options for ontology development and encoding
• Tools are in reasonable shape, no killer-tool
• Best practices DO exist
– PLEASE DO NOT just start coding OWL!
• Use case should drive the functional requirements of both your ontology and how you will ‘build’ one
• PARTNER with someone already familiar

56

More information

• OWL-S - http://www.w3.org/Submission/OWL-S
• SWSO/F/L - Semantic Web Services Ontology/Framework/Language - http://www.w3.org/Submission/SWSF/
• WSMO/X/L - Web Services Modeling Ontology/Execution/Language - http://www.w3.org/Submission/WSMX/ www.wsmo.org, www.wsmx.org
• SAWSDL - (WSDL-S)

57

Other tools

• Reasoners
– Pellet, Racer, Medius KBS, FACT++, fuzzyDL, KAON2, MSPASS, QuOnto
• Query Languages
– SPARQL, XQUERY, SeRQL, OWL-QL, RDFQuery
• Other Tools for Semantic Web
– Search: SWOOGLE swoogle.umbc.edu
– Collaboration: www.planetont.org
– Other: Jena, SeSAME/SAIL, Mulgara, Eclipse, KOWARI
– Semantic wiki: OntoWiki, SemanticMediaWiki

58

Editors

• Protégé (http://protege.stanford.edu)
• SWOOP (http://mindswap.org/2004/SWOOP)
• Altova SemanticWorks (http://www.altova.com/download/semanticworks/semantic_web_rdf_owl_editor.html)
• SWeDE (http://owl-eclipse.projects.semwebcentral.org/InstallSwede.html), goes with Eclipse
• Medius
• TopBraid Composer and other commercial tools
• Visual Ontology Modeler (VOM) - Sandpiper
• CMAP Ontology Editor (COE) (http://cmap.ihmc.us/coe)

59

What about Earth Science?

• SWEET (Semantic Web for Earth and Environmental Terminology)
– http://sweet.jpl.nasa.gov
– based on GCMD terms
– modular using faceted and integrative concepts
• VSTO (Virtual Solar-Terrestrial Observatory)
– http://vsto.hao.ucar.edu
– captures observational data (from instruments)
– modular using domains
• MMI
– http://marinemetadata.org
– captures aspects of marine data, ocean observing systems
– partly modular, mostly by developed project
• GeoSciML
– http://www.opengis.net/GeoSciML/
– is a GML (Geography ML) application language for Geoscience
– modular, in ‘packages’
