Mapping the Central LOD Ontologies to PROTON Upper-Level Ontology. Mariana Damova, Atanas Kiryakov, Kiril Simov, Svetoslav Petrov. ISWC’2010

Upload: mariana-damova

Post on 11-May-2015


Presentation at the Ontology Matching Workshop held at ISWC'2010


Mapping the Central LOD Ontologies to PROTON Upper-Level Ontology

Mariana Damova, Atanas Kiryakov, Kiril Simov, Svetoslav Petrov

ISWC’2010

Outline

• Introduction

• Problem and Conceptual Solution

• Approaches to Matching Ontologies

• Mapping Methods

• Proton Extension

• Statistics

• Experimentation

• Future Work

Linking Open Data (LOD)

FactForge (http://factforge.net)

• a reason-able view of the web of data

• the biggest and most heterogeneous body of factual knowledge on which inference is performed

• 8 datasets from the LOD cloud (DBPedia, Freebase, UMBEL, CIA World Factbook, MusicBrainz, Wordnet, Lingvoj, Geonames)

• a total of 1.4 billion loaded statements

• 2.2 billion stored statements (indexed)

• 9.8 billion distinct retrievable statements

Access to FactForge

Forest Interface

- keyword search (for molecules in the RDF graph)

- auto-suggestion of URIs

- SPARQL queries using facts from different datasets (formulating SPARQL queries requires in-depth knowledge of the datasets and the schemata in FactForge)

- exploration

Current State

Target State

SELECT * WHERE {
  ?Person dbp-ont:birthPlace ?BirthPlace ;
          rdf:type opencyc:Entertainer .
  ?BirthPlace geo-ont:parentFeature dbpedia:Germany .
}

Target State

SELECT * WHERE {
  ?Person prot:birthPlace ?BirthPlace ;
          rdf:type prot:Entertainer .
  ?BirthPlace prot:subRegionOf dbpedia:Germany .
}

Benefit

• easier and simpler access to the wealth of data (one needs to know a single ontology instead of learning the vocabularies of multiple datasets)

• higher degree of interoperability, offering one single schema

• better integration of the datasets in FactForge (the schemata of the different datasets are mapped through a single ontology)

• information from many datasets via a unified vocabulary (a constraint which uses a single predicate from the ontology can match data from different datasets)

Applications

• semantic search and annotation using the entities from FactForge

• semantic browsing and navigation

• querying FactForge in natural language

• many others?

Solution

[Diagram: PROTON; DBPedia, Freebase, Geonames]

• PROTON: a modular, lightweight, upper-level ontology defining about 300 classes and 100 properties

• DBPedia: the RDFized version of Wikipedia, with a data-driven ontology of 592 classes and 720 properties in total, and about 1,478,000 instances

• Freebase: a social database with no ontology; the class hierarchy is hidden in structured predicate names (19,632 predicates)

• Geonames: a geographical database that covers 6 million geographical features, with an ontology of 5 classes and 26 properties, and about 645 location denotators

Approaches to Matching Ontologies

• syntactic vs. semantic mapping

• harmonizing semantics (interlingua ontology)

• bidirectional vs. unidirectional

• automated vs. manual

• OAEI - ontology matching competitions

  – benchmark with best F-measure result of 80% (2009)

Ontological Heterogeneity

• Semantic matching approaches cannot cope with ontological heterogeneity

• The classes and the properties may be described in different unrelated ontologies

• The algorithms cannot discover hidden relationships that hold between unrelated entities.

FactForge presents exactly such a reality of heterogeneous facts. This makes their automated processing inconvenient.

Our Approach

unidirectional, semantic, manual alignment between PROTON and the schema ontologies of the selected datasets of FactForge

Mapping Methods

• A series of iterations of enrichments at conceptual and at data levels

(a) Subsumption relations for classes and properties

    FactForge entity --subClassOf--> PROTON entity

(b) Extending PROTON with classes and properties

(c) Using OWL class and property construction capabilities to represent classes and properties from FactForge and map them to PROTON classes

    Class restricted from FactForge predicates --subClassOf--> PROTON entity

(d) Extending FactForge with instances to account for the conceptual representation of the matching

    [Diagram: FactForge Instance 1, FactForge Instance 2 (new), FactForge Instance 3 (new)]

Mapping rules

(a) dbp:Place
        rdfs:subClassOf proton:Location .

(b) dbp-prop:location
        rdfs:subPropertyOf proton:locatedIn .

(c) pfb:Location
        rdf:type owl:Restriction ;
        owl:onProperty <http://rdf.freebase.com/ns/type.object.type> ;
        owl:hasValue <http://rdf.freebase.com/ns/location.location> ;
        rdfs:subClassOf proton:Location .

(d) p <rdf:type> <dbp-ont:PrimeMinister>
    ----------------------------------------------------
    p <proton:hasPosition> j
    j <proton:hasTitle> <proton:PrimeMinister>
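As a rough illustration of how rules (a), (b) and (d) act on data, the following Python sketch applies them in a single pass over a handful of toy triples. It is not the BigOWLIM rule engine: the `ex:` instances and the `#position` naming scheme for the new node `j` are assumptions made for the example, and prefixed names are treated as plain strings.

```python
# Toy single-pass sketch of mapping rules (a), (b) and (d).
# The ex:... instances and the "#position" suffix are hypothetical.

SUBCLASS = {"dbp:Place": "proton:Location"}              # rule (a)
SUBPROPERTY = {"dbp-prop:location": "proton:locatedIn"}  # rule (b)


def apply_mappings(triples):
    """Return the input triples plus those entailed by rules (a), (b), (d)."""
    inferred = set(triples)
    for s, p, o in triples:
        # (a) class subsumption: x type C, C subClassOf D  =>  x type D
        if p == "rdf:type" and o in SUBCLASS:
            inferred.add((s, "rdf:type", SUBCLASS[o]))
        # (b) property subsumption: x P y, P subPropertyOf Q  =>  x Q y
        if p in SUBPROPERTY:
            inferred.add((s, SUBPROPERTY[p], o))
        # (d) instance generation: a new node j links the person to the title
        if p == "rdf:type" and o == "dbp-ont:PrimeMinister":
            j = f"{s}#position"  # assumed naming scheme for the new instance
            inferred.add((s, "proton:hasPosition", j))
            inferred.add((j, "proton:hasTitle", "proton:PrimeMinister"))
    return inferred


data = {
    ("ex:Berlin", "rdf:type", "dbp:Place"),
    ("ex:Louvre", "dbp-prop:location", "ex:Paris"),
    ("ex:SomePM", "rdf:type", "dbp-ont:PrimeMinister"),
}
result = apply_mappings(data)
```

The sketch does one pass only, so it does not compute transitive closures over longer subsumption chains; a real reasoner would iterate to a fixed point.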

Conceptual mismatches

dbp:Architect
    rdfs:subClassOf
        [ rdf:type owl:Restriction ;
          owl:onProperty proton:hasProfession ;
          owl:hasValue proton:Architect
        ] .

Expression matching
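The restriction for dbp:Architect can be read as: every instance of the DBPedia class gains the property value proton:hasProfession proton:Architect, so a class on one side is matched to a property expression on the other. A minimal Python sketch of that reading follows; the instance `ex:SomeArchitect` is hypothetical, and the dictionary is only a stand-in for the OWL axiom.

```python
# Sketch of an owl:hasValue restriction used as a superclass:
# x rdf:type C, C subClassOf (P hasValue v)  =>  x P v
RESTRICTIONS = {
    # source class: (property, value), taken from the dbp:Architect mapping
    "dbp:Architect": ("proton:hasProfession", "proton:Architect"),
}


def expand_restrictions(triples):
    """Add the property value implied by each class membership."""
    out = set(triples)
    for s, p, o in triples:
        if p == "rdf:type" and o in RESTRICTIONS:
            prop, value = RESTRICTIONS[o]
            out.add((s, prop, value))
    return out


people = {("ex:SomeArchitect", "rdf:type", "dbp:Architect")}  # hypothetical instance
expanded = expand_restrictions(people)
```

After the expansion, a query for proton:hasProfession proton:Architect finds the instance even though the source data only states its DBPedia class.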

Multiple Matching

[Diagram: a DBPedia predicate, a DBPedia ontology predicate and a Freebase predicate, each mapped to one PROTON predicate]

dbp:place
    rdfs:subPropertyOf proton:locatedIn .

dbp-prop:location
    rdfs:subPropertyOf proton:locatedIn .

<http://rdf.freebase.com/ns/time.event.locations>
    rdfs:subPropertyOf proton:locatedIn .
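The practical effect of multiple matching is that one constraint on proton:locatedIn retrieves facts stated with any of the three source predicates. The Python sketch below mimics this with a lookup table; the subjects and objects in `facts` are invented for illustration, and `match` is a hypothetical helper, not part of any FactForge API.

```python
# Three dataset predicates mapped to one PROTON predicate (from the slide).
SUB_PROPERTY_OF = {
    "dbp:place": "proton:locatedIn",
    "dbp-prop:location": "proton:locatedIn",
    "http://rdf.freebase.com/ns/time.event.locations": "proton:locatedIn",
}


def match(triples, predicate):
    """Return sorted (subject, object) pairs stated with `predicate`
    or with any predicate mapped to it via subPropertyOf."""
    return sorted(
        (s, o)
        for s, p, o in triples
        if p == predicate or SUB_PROPERTY_OF.get(p) == predicate
    )


# Hypothetical facts, one per source predicate.
facts = [
    ("ex:Battle", "dbp:place", "ex:Waterloo"),
    ("ex:Museum", "dbp-prop:location", "ex:Paris"),
    ("ex:Festival", "http://rdf.freebase.com/ns/time.event.locations", "ex:Sofia"),
    ("ex:Museum", "rdfs:label", "Museum"),
]
located = match(facts, "proton:locatedIn")
```

Without the mappings, the same result would need three separate patterns, one per dataset vocabulary.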

PROTON Extension

• Preserve the OntoClean approach in PROTON design

• Obtain coverage of the rich data in FactForge

• Keep an optimal degree of granularity of the concept hierarchy

PROTON was split into modules:

- 19 modules reflecting the conceptual divisions which surfaced during the analysis of the data, e.g. proton event, proton social abstraction, proton location, proton biological substance, etc.

Statistics

• PROTON Extension

  – 141 new Classes

  – 3 new Datatype properties

  – 12 new Object properties

• Mapping PROTON to FactForge

  – 536 subClassOf relations

  – 36 subPropertyOf relations

Experiment

Data Loading

BigOWLIM – the most scalable OWL Engine

http://www.ontotext.com/owlim/

                             FactForge Standard   FactForge with mappings
                             (June 2010)          (June 2010)
NumberOfStatements           1,782,541,506        2,630,453,334
NumberOfExplicitStatements   1,143,317,531        1,942,349,578
NumberOfEntities             354,635,159          404,798,593

FactForge Extension Statistics (October 2010)

Configuration                                                    NumberOfStatements  NumberOfExplicitStatements  NumberOfEntities
FactForge                                                        2,237,550,385       1,357,013,227               524,120,454
FactForge with Inference Rules (IR)                              2,237,550,617       2,027,992,627               524,120,465
FactForge with IR and PROTON                                     2,237,578,643       2,027,995,363               524,121,955
FactForge with IR, PROTON and DBP mappings                       2,255,543,166       2,027,995,651               524,121,996
FactForge with IR, PROTON, DBP mappings and Freebase mappings    2,375,287,183       2,027,995,750               524,122,009

FactForge Extension Statistics: Difference between FactForge and Each Adding Iteration (October 2010)

Compared with plain FactForge                                    NumberOfStatements  NumberOfExplicitStatements  NumberOfEntities
FactForge                                                        0                   0                           0
FactForge with Inference Rules (IR)                              232                 670,979,400                 11
FactForge with IR and PROTON                                     28,258              670,982,136                 1,501
FactForge with IR, PROTON and DBP mappings                       17,992,781          670,982,424                 1,542
FactForge with IR, PROTON, DBP mappings and Freebase mappings    137,736,798         670,982,523                 1,555

FactForge Extension Statistics: Difference between Each Adding Iteration (October 2010)

Step (compared with the previous configuration)                  NumberOfStatements  NumberOfExplicitStatements  NumberOfEntities
FactForge                                                        0                   0                           0
+ Inference Rules (IR)                                           232                 670,979,400                 11
+ PROTON                                                         28,026              2,736                       1,490
+ DBP mappings                                                   17,964,523          288                         41
+ Freebase mappings                                              119,744,017         99                          13
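Both difference tables follow directly from the October 2010 counts, which makes it easy to check them. The short Python sketch below recomputes the differences against plain FactForge and between consecutive configurations; the function names are chosen here for illustration, and the numbers are copied from the slides.

```python
# October 2010 counts, in order: FactForge, + Inference Rules, + PROTON,
# + DBP mappings, + Freebase mappings.
COUNTS = {
    "NumberOfStatements": [
        2_237_550_385, 2_237_550_617, 2_237_578_643, 2_255_543_166, 2_375_287_183,
    ],
    "NumberOfExplicitStatements": [
        1_357_013_227, 2_027_992_627, 2_027_995_363, 2_027_995_651, 2_027_995_750,
    ],
    "NumberOfEntities": [
        524_120_454, 524_120_465, 524_121_955, 524_121_996, 524_122_009,
    ],
}


def vs_baseline(values):
    """Difference between each configuration and plain FactForge."""
    return [v - values[0] for v in values]


def step_by_step(values):
    """Difference between each configuration and the previous one."""
    return [0] + [b - a for a, b in zip(values, values[1:])]


for metric, values in COUNTS.items():
    print(metric, vs_baseline(values), step_by_step(values))
```

Read this way, the Freebase mappings account for the largest jump in total statements (119,744,017) while adding only 99 explicit statements, so almost all of that growth is inferred.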

Experimentation: PROTON query

US non-profit organizations founded after 1950

PREFIX p-ext: <http://proton.semanticweb.org/protonue#>
PREFIX ptop: <http://proton.semanticweb.org/protont#>
PREFIX dbpedia: <http://dbpedia.org/resource/>

SELECT DISTINCT ?s ?date ?l WHERE {
  ?s a p-ext:Non-ProfitOrganisation .
  ?s ptop:establishmentDate ?date .
  ?s ptop:locatedIn ?l .
  ?l ptop:subRegionOf dbpedia:United_States .
  FILTER (?date > "1950")
}

Query: Prime Ministers born in the United Kingdom

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

SELECT ?PrimeMinister WHERE {
  ?PrimeMinister rdf:type dbp-ont:PrimeMinister .
  ?PrimeMinister dbp-ont:birthPlace dbpedia:United_Kingdom .
}

PREFIX pupp: <http://proton.semanticweb.org/protonu#>
PREFIX p-ext: <http://proton.semanticweb.org/protonue#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX ptop: <http://proton.semanticweb.org/protont#>

SELECT ?PrimeMinister WHERE {
  ?PrimeMinister ptop:hasPosition ?pos .
  ?pos pupp:hasTitle dbpedia:British_prime_minister .
  ?PrimeMinister p-ext:birthPlace dbpedia:United_Kingdom .
}

Query: Cities around the world which have Modigliani art work

PREFIX fb: <http://rdf.freebase.com/ns/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbp-prop: <http://dbpedia.org/property/>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
PREFIX umbel-sc: <http://umbel.org/umbel/sc/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ot: <http://www.ontotext.com/>

SELECT DISTINCT ?painting_l ?owner_l ?city_fb_con ?city_db_loc ?city_db_cit
WHERE {
  ?p fb:visual_art.artwork.artist dbpedia:Amedeo_Modigliani ;
     fb:visual_art.artwork.owners [ fb:visual_art.artwork_owner_relationship.owner ?ow ] ;
     ot:preferredLabel ?painting_l .
  ?ow ot:preferredLabel ?owner_l .
  OPTIONAL { ?ow fb:location.location.containedby [ ot:preferredLabel ?city_fb_con ] }
  OPTIONAL { ?ow dbp-prop:location ?loc .
             ?loc rdf:type umbel-sc:City ;
                  ot:preferredLabel ?city_db_loc }
  OPTIONAL { ?ow dbp-ont:city [ ot:preferredLabel ?city_db_cit ] }
}

Query: Cities around the world which have Modigliani art work

PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ot: <http://www.ontotext.com/>
PREFIX ptop: <http://proton.semanticweb.org/protont#>
PREFIX pupp: <http://proton.semanticweb.org/protonu#>
PREFIX p-ext: <http://proton.semanticweb.org/protonue#>

SELECT DISTINCT ?painting ?owner ?city
WHERE {
  ?p p-ext:author dbpedia:Amedeo_Modigliani ;
     p-ext:ownership [ ptop:isOwnedBy ?ow ] ;
     ot:preferredLabel ?painting .
  ?ow ot:preferredLabel ?owner .
  ?ow ptop:locatedIn [ rdf:type pupp:City ;
                       ot:preferredLabel ?city ] .
}

Future Work

• Two-level intermediary layer to access FactForge, mapping PROTON to UMBEL

[Diagram: Query triples (?S P O; ?S P1 ?O1; ?O1 P2 O2); two-level intermediate layer (Upper Level Ontology 2, Upper Level Ontology 1); Ontology 1 to Ontology 4; Dataset1 to Dataset4]

UMBEL - Upper Mapping and Binding Exchange Layer

A lightweight, subject concept reference structure for the Web

20 000 concepts mapped to OpenCyc

Strategic partnership Ontotext – Structured Dynamics

http://www.ontotext.com/news.html#umb_25oct10

Future Work

• Official FactForge release with the presented mapping

• Publish Proton and mapping as LOD

• Cover more datasets from the LOD cloud

• Experiment with the balance between the datasets and the ontologies describing them

• Extend the property mapping

Service available at http://factforge.net

(a version with Proton mapping is currently available at http://ldsr4.ontotext.com)

Thank you for your attention!

Questions?

Contact:

[email protected]