Walking Linked Data: a graph traversal approach to explain clusters

Post on 27-Jun-2015

DESCRIPTION

My slides at the Consuming Linked Data workshop (COLD2014) at ISWC2014

TRANSCRIPT

+Walking Linked Data: a graph traversal approach to explain clusters

Ilaria Tiddi, Mathieu d’Aquin, Enrico Motta

+Problem: explaining patterns

Data: women/men literacy rate from UNESCO [1]

In which countries are men more educated than women?

The yellow countries.

[Figure: world map. Legend: Education: Men, Women, Equal]

How do you know?

+Problem: explaining behaviors

We explain thanks to our own (background) knowledge.

Can we do the same with the knowledge from Linked Data?

+Linked Data contain explanations

but where?

[Figure, built up over several slides: the countries :India, :UK, :Ethiopia, :US and :Somalia are each linked by sameAs to a db: resource (db:India, db:UK, db:Ethiopia, db:US, db:Somalia). The db: resources point through dc:subject to categories (db:Category:LeastDevelopedCountries, db:Category:LiberalCountries), with skos:relatedMatch edges also shown, and through dbp:gdp to the values 600/pp, 1,200/pp, 3,800/pp, 36,000/pp and 49,000/pp. In the last build the values are compared against thresholds (≥, ≤).]

+Looking for explanations in graph

Given a graph of Linked Data where

URIs are nodes

RDF properties are edges

Find

the ending value that the most entities in the cluster point to

the best path along which to further expand the graph

[Figure: :India, :Ethiopia and :Somalia are linked by sameAs to resources whose dc:subject, dbp:gdp and skos:related edges lead to values such as 4,000/pp and cat:LeastDevelopedCountries.]
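The first goal on this slide, finding the ending value that the most cluster entities point to, can be sketched on a toy graph. This is a minimal illustration, not the authors' implementation: the dictionary layout, the helper names `follow` and `most_pointed_value`, and the triples themselves are assumptions made for the example.

```python
from collections import Counter

# Toy graph: each node maps to a list of (property, target) edges.
# The triples are illustrative, not taken from the real datasets.
GRAPH = {
    ":India": [("sameAs", "db:India")],
    ":Ethiopia": [("sameAs", "db:Ethiopia")],
    ":Somalia": [("sameAs", "db:Somalia")],
    "db:India": [("dc:subject", "cat:SouthAsianCountries")],
    "db:Ethiopia": [("dc:subject", "cat:LeastDevelopedCountries")],
    "db:Somalia": [("dc:subject", "cat:LeastDevelopedCountries")],
}


def follow(graph, start, path):
    """Return the set of nodes reached from `start` by following `path` edge by edge."""
    frontier = {start}
    for prop in path:
        frontier = {
            target
            for node in frontier
            for (p, target) in graph.get(node, [])
            if p == prop
        }
    return frontier


def most_pointed_value(graph, cluster, path):
    """Return (value, count) for the ending value reached by the most cluster entities."""
    counts = Counter()
    for entity in cluster:
        # Each entity contributes at most once per value, since follow() returns a set.
        counts.update(follow(graph, entity, path))
    return counts.most_common(1)[0] if counts else None


cluster = [":India", ":Ethiopia", ":Somalia"]
best = most_pointed_value(GRAPH, cluster, ("sameAs", "dc:subject"))
print(best)
```

Here two of the three entities reach the same category through the path sameAs, dc:subject, so that category is the candidate ending value for this path.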

+A* algorithm for Linked Data

Best-first search algorithm

Given an initial node and a final node

find the least expensive path between them

Path cost function: f(path) = g(path) + h(path), where g(path) is the actual cost and h(path) the future cost

Without knowledge of the graph

Search in the graph for the best path and explanation

The graph is iteratively built by URI dereferencing

No need to know the Linked Data graph a priori
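For reference, a generic A* search with f = g + h can be sketched as follows. This is the textbook algorithm on a small hand-made graph, not Dedalo itself; the graph, the edge costs and the heuristic values are assumptions chosen for the example. The `neighbors` argument is a callable, so the graph does not have to be known in advance and could be discovered step by step.

```python
import heapq


def a_star(neighbors, start, goal, h):
    """Return (cost, path) of a least expensive path, or None if goal is unreachable.

    neighbors(node) yields (next_node, edge_cost) pairs; h(node) estimates the
    remaining cost and must never overestimate it for the result to be optimal.
    """
    frontier = [(h(start), 0, start, [start])]  # entries: (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found later
        for nxt, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt, path + [nxt]))
    return None


# Illustrative graph and heuristic (assumed values, admissible for this graph).
EDGES = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 1), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
H = {"A": 2, "B": 2, "C": 1, "D": 0}

result = a_star(lambda n: EDGES[n], "A", "D", lambda n: H[n])
print(result)
```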

+Dedalo: an A* process for Linked Data

Building the graph (URI dereferencing)

Choosing the best path

Finding the best explanation

Iteratively building a Linked Data graph and looking for an explanation of the pattern

+Dedalo: an A* process for Linked Data

Dereference URIs through HTTP GET

take an entity

read its properties and values

add them to the graph

[Figure, built up over two slides: db:Ethiopia is reached through owl:sameAs; dereferencing it adds its dc:subject value db:Category:AfricanCountries and its dbp:gdp value 1,200 to the graph.]
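The three steps on this slide (take an entity, read its properties and values, add them to the graph) can be sketched as below. To keep the example runnable offline, the HTTP GET is replaced by a lookup in an in-memory dictionary: `FAKE_WEB` and `fake_dereference` are hypothetical stand-ins, and a real fetcher would issue the request and parse the returned RDF. The triples are illustrative.

```python
from collections import defaultdict

# Stand-in for the Web: what an HTTP GET on each URI would return,
# as (property, value) pairs. The content is illustrative only.
FAKE_WEB = {
    ":Ethiopia": [("owl:sameAs", "db:Ethiopia")],
    "db:Ethiopia": [
        ("dc:subject", "db:Category:AfricanCountries"),
        ("dbp:gdp", 1200),
    ],
}


def fake_dereference(uri):
    """Hypothetical fetcher; a real one would issue an HTTP GET and parse RDF."""
    return FAKE_WEB.get(uri, [])


def expand(graph, uri, dereference):
    """Take an entity, read its properties and values, add them to the graph.

    Returns the values that were newly added, so the caller can expand them next.
    """
    new_values = []
    for prop, value in dereference(uri):
        if (prop, value) not in graph[uri]:
            graph[uri].append((prop, value))
            new_values.append(value)
    return new_values


graph = defaultdict(list)
frontier = expand(graph, ":Ethiopia", fake_dereference)  # first hop
for node in frontier:
    expand(graph, node, fake_dereference)  # second hop

print(dict(graph))
```

Passing the fetcher in as an argument keeps the graph-building logic independent of how URIs are actually resolved.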

+Dedalo: an A* process for Linked Data

Collect new paths (sequences of edges)

add the new property to the previous path: owl:sameAs → dc:subject, owl:sameAs → dbp:gdp

evaluate new paths with Entropy [1]: ent(owl:sameAs → dc:subject), ent(owl:sameAs → dbp:gdp)

add to the pile of paths (the first one is chosen): owl:sameAs → dc:subject, owl:sameAs → dbp:gdp

[1] Tiddi et al., ESWC 2014
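The path-collection step can be sketched with a priority queue. The slide does not give the entropy formula, so this example uses plain Shannon entropy over the values a path reaches as a stand-in, and assumes that the lowest score is popped first; both choices are assumptions, and the measure actually used is the one defined in [1]. The helper names and the value lists are illustrative.

```python
import heapq
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def extend(path, new_properties):
    """Add each new property to the previous path."""
    return [path + (prop,) for prop in new_properties]


def push_all(pile, paths, values_reached):
    """Score each path and add it to the pile; the lowest score pops first (assumption)."""
    for path in paths:
        heapq.heappush(pile, (shannon_entropy(values_reached[path]), path))


pile = []
new_paths = extend(("owl:sameAs",), ["dc:subject", "dbp:gdp"])

# Values each new path reaches from the cluster entities (illustrative).
values_reached = {
    ("owl:sameAs", "dc:subject"): ["cat:A", "cat:A", "cat:A", "cat:B"],
    ("owl:sameAs", "dbp:gdp"): [600, 1200, 3800, 36000],
}

push_all(pile, new_paths, values_reached)
score, chosen = heapq.heappop(pile)  # the first path in the pile is chosen
print(chosen, round(score, 3))
```

With these values, the dc:subject path concentrates on one category and scores lower than the dbp:gdp path, whose four values are all distinct.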

+Dedalo: an A* process for Linked Data

Build explanations (path + final nodes)

Each of the values the new path points to gives an explanation:

e1 = owl:sameAs → dc:subject → db:Category:SouthAsianCountries

e2 = owl:sameAs → dc:subject → db:Category:AfricanCountries

Compare numerical values if the property is a datatype property:

e3 = owl:sameAs → dbp:gdp ≥ 1,200

e4 = owl:sameAs → dbp:gdp ≤ 1,200

Rank explanations according to the F-Measure

[Figure labels: initial URIs (countries); URIs pointing to; URIs in]
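Ranking by F-Measure can be sketched as follows. The example assumes the balanced F1 score (harmonic mean of precision and recall), where precision is the share of entities satisfying the explanation that belong to the cluster and recall is the share of the cluster the explanation covers; the slide does not state the weighting. The entity sets are illustrative.

```python
def f_measure(covered, cluster):
    """F1 of an explanation.

    covered: entities the explanation holds for (among all initial URIs).
    cluster: entities in the pattern to explain.
    """
    covered, cluster = set(covered), set(cluster)
    hits = len(covered & cluster)
    if hits == 0:
        return 0.0
    precision = hits / len(covered)
    recall = hits / len(cluster)
    return 2 * precision * recall / (precision + recall)


def rank(explanations, cluster):
    """Return (score, explanation) pairs, best first."""
    scored = ((f_measure(covered, cluster), name) for name, covered in explanations.items())
    return sorted(scored, reverse=True)


cluster = {":India", ":Ethiopia", ":Somalia"}

# Which initial URIs satisfy each candidate explanation (illustrative).
explanations = {
    "dc:subject = LeastDevelopedCountries": {":Ethiopia", ":Somalia"},
    "dbp:gdp <= 3800": {":India", ":Ethiopia", ":Somalia"},
    "dbp:gdp >= 1200": {":India", ":Ethiopia", ":UK", ":US"},
}

ranked = rank(explanations, cluster)
for score, name in ranked:
    print(f"{score:.3f}  {name}")
```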

+Dedalo: experiments

Countries where men are more educated than women

skos:exactMatch → dbp:hdiRank ≥ 126: 87.8%, 197"

skos:exactMatch → dc:subject → db:Category:Least_Developed_Countries: 74.7%, 524"

skos:exactMatch → dbp:gdpPPPPerCapitaRank ≥ 89: 68.3%, 269"

Countries where women are more educated than men

skos:exactMatch → dbp:hdiRank ≤ 119: 63.4%, 198"

skos:exactMatch → dbp:gdpPPPPerCapitaRank ≤ 56: 62.3%, 236"

Countries where education is equal

skos:exactMatch → dbp:gdpPPPRank ≥ 64: 62.0%, 234"

skos:exactMatch → dbp:gdpPPPPerCapitaRank ≥ 29: 61.0%, 268"

+Conclusions and future work

Dedalo, an A* process to search for explanations within Linked Data

From a pattern to explain

Finds the path to the best explanation

Using Entropy and F-Measure

Focusing on the bias introduced by incomplete data [2]

Combining atomic explanations [3]

Evaluating Dedalo on a large use case: Google Trends

[2, 3] Tiddi et al., EKAW 2014

+Thanks! Questions?
