Towards Transfer Learning of Link Specifications


Upload: geoknow

Post on 29-Jun-2015


DESCRIPTION

ICSC 2013 (http://ieee-icsc.org/icsc2013/) presentation on transfer learning

TRANSCRIPT

Page 1: Towards Transfer Learning of Link Specifications

Towards Transfer Learning of Link Specifications

Axel-Cyrille Ngonga Ngomo, Jens Lehmann, Mofeed Hassan

2013-09-16

Ngonga et. al (Univ. Leipzig) Transfer Learning of Link Specs 2013-09-16 1 / 29

Page 2: Towards Transfer Learning of Link Specifications

Outline

1 Motivation

2 Transfer Learning Framework

3 Experimental Setup

4 Results

5 Conclusions and Future Work

Page 4: Towards Transfer Learning of Link Specifications

Why Link Discovery?

1 Fourth Linked Data principle

2 Links are central for

Cross-ontology QA
Data Integration
Reasoning
Federated Queries
...

3 2011 topology of the LOD Cloud:

31+ billion triples
≈ 0.5 billion links
owl:sameAs in most cases

Page 5: Towards Transfer Learning of Link Specifications

Why is it difficult?

Definition (Link Discovery)

Given sets S and T of resources and a relation R
Task: Find M = {(s, t) ∈ S × T : R(s, t)}
Common approaches:

Find M′ = {(s, t) ∈ S × T : σ(s, t) ≥ θ}
Find M′ = {(s, t) ∈ S × T : δ(s, t) ≤ θ}

1 Time complexity

Large number of triples
Quadratic a-priori runtime
69 days for mapping cities from DBpedia to Geonames (1 ms per comparison)
Decades for linking DBpedia and LGD
...
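To make the quadratic cost concrete, the following is a minimal Python sketch of the threshold-based approach M′ = {(s, t) ∈ S × T : σ(s, t) ≥ θ}. The function names, the toy labels and the use of `difflib.SequenceMatcher` as σ are illustrative assumptions; this is not how LIMES computes links.

```python
from difflib import SequenceMatcher
from itertools import product


def sigma(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] (assumption; any string measure works)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def naive_link_discovery(source, target, theta):
    """Return M' = {(s, t) in S x T : sigma(s, t) >= theta} by comparing all pairs."""
    return {(s, t) for s, t in product(source, target) if sigma(s, t) >= theta}


def estimated_days(n_source: int, n_target: int, ms_per_comparison: float = 1.0) -> float:
    """Brute-force runtime in days: |S| * |T| comparisons at a fixed cost each."""
    comparisons = n_source * n_target
    return comparisons * ms_per_comparison / 1000 / 86_400


# Toy data (hypothetical labels, not taken from DBpedia or Geonames)
S = ["Leipzig", "Dresden", "Berlin"]
T = ["leipzig", "Dresden City", "Bonn"]
links = naive_link_discovery(S, T, theta=0.9)
```

At 1 ms per comparison, 69 days correspond to roughly 6 × 10^9 comparisons; for instance, `estimated_days(69_000, 86_400)` evaluates to 69.0 (these two set sizes are made up to reproduce the figure, they are not the actual dataset sizes).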

Page 7: Towards Transfer Learning of Link Specifications

Why is it difficult?

2 Complexity of specifications

Combination of several attributes required for high precision
Tedious discovery of most adequate mapping
Dataset-dependent similarity functions

Page 8: Towards Transfer Learning of Link Specifications

LIMES Framework


Page 9: Towards Transfer Learning of Link Specifications

Link Specification

Detection of an accurate link specification is key
A link specification has three components:

Two sets of restrictions $R^S_1, \ldots, R^S_m$ resp. $R^T_1, \ldots, R^T_k$ that specify the sets S resp. T,
A specification of a complex similarity metric σ via the combination of several atomic similarity measures σ1, ..., σn, and
A set of thresholds τ1, ..., τn such that τi is the threshold for σi.
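The three components can be pictured as a small data structure. The sketch below is a hypothetical in-memory model; the class names, field names, the `combinator` field and the example threshold are assumptions for illustration and do not mirror the LIMES XML schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AtomicMeasure:
    """One atomic similarity measure sigma_i together with its threshold tau_i."""
    name: str              # e.g. "trigrams" (illustrative name)
    source_property: str
    target_property: str
    threshold: float       # tau_i in [0, 1]


@dataclass
class LinkSpec:
    """Hypothetical container for the three components of a link specification."""
    source_restrictions: List[str]   # restrictions that specify S
    target_restrictions: List[str]   # restrictions that specify T
    measures: List[AtomicMeasure] = field(default_factory=list)
    combinator: str = "MAX"          # how atomic measures are combined (assumed)

    def thresholds(self) -> List[float]:
        """Return tau_1, ..., tau_n in the order of the atomic measures."""
        return [m.threshold for m in self.measures]


# Example values are made up; only the class URIs appear in the slides.
spec = LinkSpec(
    source_restrictions=["?x rdf:type dbpedia-owl:Drug"],
    target_restrictions=["?y rdf:type drug:drugs"],
    measures=[AtomicMeasure("trigrams", "rdfs:label", "rdfs:label", 0.9)],
)
```

Keeping thresholds next to their measures reflects the pairing of τi with σi stated above.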

Page 10: Towards Transfer Learning of Link Specifications

Transfer Learning

[Figure: Classical Learning of Link Specs vs. Transfer Learning of Link Specs. Diagram labels: Different Linking Tasks, Learning System, Current Linking Task, Transfer Learning System, Task Repository, spec accuracy: α, class similarity: ζ, property similarity: π]

In our approach we use transductive transfer learning

Class and property matching is assumed to be known already (numerous approaches from ontology matching can be employed); the goal is to find the complex similarity metric

Page 12: Towards Transfer Learning of Link Specifications

Transfer Learning Framework I

Transfer learning of link specifications is reduced to three subproblems:

Restriction/class similarity: $\zeta : 2^{C} \times 2^{C} \to [0, 1]$, e.g. ζ({City, Village}, {Town}) = 0.6

Property similarity: $\xi : 2^{P} \times 2^{P} \to [0, 1]$, e.g. ξ({rdfs:label}, {rdfs:label}) = 1.0

Accuracy of link specifications: $\alpha : Q \to [0, 1]$

Page 13: Towards Transfer Learning of Link Specifications

Transfer Learning Framework II

Overall similarity measure for transfer learning:
$\omega(t, t') = \alpha(q') \cdot \zeta(\psi(q'), C) \cdot \zeta(\psi'(q'), C') \cdot \xi(sp(q'), P_L) \cdot \xi(tp(q'), P'_L)$
(details in paper)

Each similarity measure can be implemented in many ways

Implementations of class similarity function ζ in framework:

label-based similarity
name-based similarity (URI similarity)
data-centric similarity

Property similarities ξ are defined analogously

Similarities between single classes/properties can be extended to sets (e.g. using the arithmetic / geometric mean of the max. similarity)

A spec can be transferred by replacing its properties with the most similar properties in $P_L$ and $P'_L$
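One possible reading of the name-based (URI) similarity and its extension to sets can be sketched as follows. The Dice coefficient over padded character trigrams, the helper names and the direction of the averaging are assumptions; the paper may define these differently, so the values produced here need not match those reported in the slides.

```python
def local_name(uri: str) -> str:
    """Strip the namespace or prefix: 'sider:drugs' -> 'drugs'."""
    for sep in ("#", "/", ":"):
        uri = uri.rsplit(sep, 1)[-1]
    return uri.lower()


def trigrams(text: str) -> set:
    """Character trigrams of the padded string, so short names still yield some."""
    padded = f"  {text} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Dice coefficient over the trigrams of the two local names (assumed variant)."""
    ta, tb = trigrams(local_name(a)), trigrams(local_name(b))
    return 2 * len(ta & tb) / (len(ta) + len(tb))


def set_similarity(xs, ys, sim=trigram_similarity) -> float:
    """Arithmetic mean, over all x in xs, of the best match of x in ys."""
    if not xs or not ys:
        return 0.0
    return sum(max(sim(x, y) for y in ys) for x in xs) / len(xs)


same_class = set_similarity({"dbpedia-owl:Drug"}, {"dbpedia-owl:Drug"})
prefix_only = trigram_similarity("sider:drugs", "drug:drugs")
```

Note that this set extension is asymmetric: it averages over the first argument only. A symmetric variant could average both directions, and the geometric mean mentioned above would replace the sum by a product with a root.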

Page 17: Towards Transfer Learning of Link Specifications

Example (New Link Task)

Example link specification for mapping drugs in two datasets, DBpedia and Drugbank (DBpedia-Drugbank.xml):

Page 18: Towards Transfer Learning of Link Specifications

Example (Restriction part)

Three parts of link specs:

Restrictions part


Page 19: Towards Transfer Learning of Link Specifications

Example (Properties Part)

Three parts of link specs:

Restrictions part

Properties part

Page 20: Towards Transfer Learning of Link Specifications

Example (Similarity Measures Part)

Three parts of link specs:

Restrictions part

Properties part

Similarity Measures part: similarity metric and thresholds


Page 21: Towards Transfer Learning of Link Specifications

Example (Link Repository)

Transfer learning is applied using a repository → restrictions and relevant properties are assumed to be known → find the similarity measure by comparing with all specs in the repository, e.g. DBpedia-SiderDrugs.xml

Page 22: Towards Transfer Learning of Link Specifications

Example (Restriction Similarities)

Restrictions in both specification files

Type     DBpedia-Drugbank.xml         DBpedia-SiderDrugs.xml
Source   rdf:type dbpedia-owl:Drug    rdf:type dbpedia-owl:Drug
Target   rdf:type drug:drugs          rdf:type sider:drugs

Straightforward label/URI similarity
For instance, trigram metric in URI similarity without prefixes:

ζ({dbpedia-owl:Drug}, {dbpedia-owl:Drug}) = 1.0
ζ({sider:drugs}, {drug:drugs}) = 1.0

Data-centric: $\zeta_d(s, s') = \frac{1}{|P(s)|\,|P(s')|} \sum_{x \in P(s)} \sum_{y \in P(s')} \mathrm{sim}(x, y)$ where P(s) = {x : s p x ∧ p rdf:type owl:DatatypeProperty} (extends similarity to instances)
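The data-centric measure averages a literal similarity over all pairs of datatype property values. A minimal sketch, assuming triples are given as plain Python tuples and using `difflib.SequenceMatcher` as a stand-in for sim (both are assumptions, as are the `ex:` identifiers):

```python
from difflib import SequenceMatcher


def sim(x: str, y: str) -> float:
    """Stand-in literal similarity in [0, 1] (assumption; any measure works)."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def datatype_values(triples, subject, datatype_properties):
    """P(s): objects x of triples (s, p, x) where p is a datatype property."""
    return [o for s, p, o in triples if s == subject and p in datatype_properties]


def data_centric_similarity(values_s, values_t) -> float:
    """Mean of sim(x, y) over all pairs in P(s) x P(s')."""
    if not values_s or not values_t:
        return 0.0
    total = sum(sim(x, y) for x in values_s for y in values_t)
    return total / (len(values_s) * len(values_t))


# Toy triples with hypothetical identifiers
triples = [
    ("ex:a", "rdfs:label", "Aspirin"),
    ("ex:a", "ex:related", "ex:b"),      # object property, ignored
    ("ex:b", "rdfs:label", "aspirin"),
]
datatype_props = {"rdfs:label"}
p_a = datatype_values(triples, "ex:a", datatype_props)
p_b = datatype_values(triples, "ex:b", datatype_props)
score = data_centric_similarity(p_a, p_b)
```

Because every pair is compared, the cost grows with |P(s)| · |P(s′)|, which is why sampling values may be advisable for large classes.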

Page 24: Towards Transfer Learning of Link Specifications

Example (Property Similarities)

Type DBpedia-Drugbank.xml DBpedia-SiderDrugs.xml

Source rdfs:label rdfs:label foaf:name

Target rdfs:label rdfs:label drug:genericName

Applying the similarity function to all properties:
For instance, trigram similarity based on URIs with the arithmetic mean as aggregation:
ξ({rdfs:label}, {rdfs:label, foaf:name}) = 0.9
ξ({rdfs:label, drug:genericName}, {rdfs:label}) = 0.8

Page 25: Towards Transfer Learning of Link Specifications

Example (Overall Similarity)

Based on, e.g., the F-score, assign a quality value to q′ = DBpedia-SiderDrugs.xml; in our case α(q′) = 0.89

The final step is calculating the overall similarity measure:
ω(DBpedia-Drugbank.xml, DBpedia-SiderDrugs.xml) = 0.89 · 1.0 · 1.0 · 0.9 · 0.8 ≈ 0.64

The steps are repeated for all link specifications in the repository

The most similar link spec can be transferred by replacing its properties with the most similar ones in the computed property matching
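The arithmetic of this example, and the selection of the most similar specification, can be sketched in a few lines of Python. The function names and the second repository entry (including all of its values) are made up for illustration.

```python
from math import prod


def omega(alpha, zeta_source, zeta_target, xi_source, xi_target):
    """Overall similarity: product of spec accuracy and the four similarities."""
    return prod((alpha, zeta_source, zeta_target, xi_source, xi_target))


def most_similar_spec(scores: dict) -> str:
    """Pick the repository spec with the highest overall similarity."""
    return max(scores, key=scores.get)


# Values from the DBpedia-SiderDrugs.xml example
score = omega(alpha=0.89, zeta_source=1.0, zeta_target=1.0, xi_source=0.9, xi_target=0.8)
print(round(score, 2))  # 0.64

repository_scores = {
    "DBpedia-SiderDrugs.xml": score,
    "hypothetical-other-spec.xml": omega(0.95, 0.2, 0.3, 1.0, 1.0),  # made-up values
}
best = most_similar_spec(repository_scores)
```

Since ω is a plain product, a single low factor (for example a poor class match) pulls the score down even for a highly accurate specification.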

Page 27: Towards Transfer Learning of Link Specifications

Experimental Setup I

The goal of the evaluation is two-fold:

Evaluating whether transfer learning can be used to build templates for link specs

Discovering whether the transferred templates can be used directly

113 specifications were retrieved from LATC; each has a manual evaluation of its links

[Figure: pie chart of the specifications by domain (Persons, Events, Locations, Diseases, Drugs, Organizations, Misc); slice labels: 10%, 2%, 66%, 3%, 1%, 3%, 15%]

Page 28: Towards Transfer Learning of Link Specifications

Experimental Setup II

Leave-one-out evaluation

1.) Take the top-scored (most similar) specification and check whether it uses the same combination of similarity functions: assign 1 for a match and 0 for no match

2.) Compute the F-measure of the learned link specs directly: works only on specs with both endpoints alive (only 12 out of 113)

URI similarity was used
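Both evaluation steps can be sketched as follows. The spec representation (dicts with `domain` and `template` keys), the toy transfer function and the example data are assumptions for illustration, not the evaluation code used for the experiments.

```python
def leave_one_out(specs, transfer):
    """Step 1: hold out each spec, transfer from the rest, score 1 for a match, else 0."""
    outcomes = []
    for i, held_out in enumerate(specs):
        repository = specs[:i] + specs[i + 1:]
        predicted = transfer(held_out, repository)
        outcomes.append(int(predicted == held_out["template"]))
    return sum(outcomes) / len(outcomes)


def precision_recall_f1(found_links, reference_links):
    """Step 2: compare links produced by a transferred spec with reference links."""
    found, reference = set(found_links), set(reference_links)
    tp = len(found & reference)
    precision = tp / len(found) if found else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def toy_transfer(task, repository):
    """Hypothetical transfer: reuse the template of the first spec in the same domain."""
    same_domain = [s for s in repository if s["domain"] == task["domain"]]
    return (same_domain or repository)[0]["template"]


specs = [
    {"domain": "geo", "template": "trigrams(label, label)"},
    {"domain": "geo", "template": "trigrams(label, label)"},
    {"domain": "person", "template": "levenshtein(name, name)"},
]
match_rate = leave_one_out(specs, toy_transfer)
p, r, f1 = precision_recall_f1({("a", "x"), ("b", "y")}, {("a", "x"), ("c", "z")})
```

In this toy run the person spec has no same-domain neighbour in the repository, so its template is mispredicted and the match rate is 2/3.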

Page 30: Towards Transfer Learning of Link Specifications

Results of the First Set of Experiments

The right specification is detected in 81% of all cases

In the geo-spatial domain: 91%

In the persons domain: 58%

[Figure: bar chart, y-axis 0% to 100%, one bar each for Average, Persons, Events, Locations, Diseases, Drugs, Organizations, Misc]

Page 31: Towards Transfer Learning of Link Specifications

Results of the Second Set of Experiments

In the second series of experiments, source and target endpoints need to be alive so that the transferred link spec can be executed (12 out of 113)

In general, low F-measures

[Figure: bar chart of Precision, Recall and F-Measure (0% to 100%) for the 12 specifications: dblp-datasemanticweb-researcher, euraxess-eures-country, rkbcrime-dbpedia-constabularies, dbpedia-datagovUK-city, eventseer-dblp_l3s-person, eventseer-dogfood-event, dbpedia-linkedgeodata-airport, stad-rmon-person, eventseer-dogfood-person, dbpedia-linkedgeodata-university, dbpedia-gutendata-texts, dbpedia-openei-country]

Page 33: Towards Transfer Learning of Link Specifications

Summary

Conclusions:

The right template is detected in 81% of all cases

Transfer learning cannot replace the learning of thresholds in specifications

Future Work:

Combination with machine-learning approaches for link specifications (e.g., EAGLE, COALA), in particular for learning thresholds

More sophisticated class and property similarity approaches

Page 34: Towards Transfer Learning of Link Specifications

The End

Jens Lehmann ([email protected]), Uni Leipzig

Questions GeoKnow

http://geoknow.eu
