ranking the linked data: the case of dbpedia - icwe 2010

10th International Conference on Web Engineering, Vienna, July 5-9, 2010

Ranking the Linked Data: the case of DBpedia

Roberto Mirizzi (1), Azzurra Ragone (1,2), Tommaso Di Noia (1), Eugenio Di Sciascio (1)

(1) Politecnico di Bari, Via Orabona 4, 70125 Bari (ITALY)

(2) University of Trento, Via Sommarive 14, 38100 Trento (ITALY)


Outline

• Tags are all around
• NOT (Not Only Tag): what is it?
• NOT: a look behind the curtains
  – Ranking of RDF resources: a hybrid approach
• Evaluation
• Conclusion and Future Work


Tags are all around


Tag cloud

and many more…

http://www.amazon.com/gp/tagging/cloud/

http://delicious.com/search?p=enrico+motta&chk=&context=main%7C&fr=del_icio_us&lc=

http://www.flickr.com/photos/tags/

http://www.faviki.com/tag/Tim%20Berners-Lee

http://www.programmableweb.com/mashup-tag-cloud


Tagging: a double face

Annotation phase Retrieval phase


Problems with annotation

• Insert as many tags as possible (time-consuming):
  – different versions of the same tag to catch all the possible searches
  – multilingual tags


Problem with retrieval

• Exact (syntactic) match among tags: web service is different from web services, webservices, …
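A minimal sketch of the retrieval problem, assuming tags are stored as plain strings and compared by string equality. The resource names and the index below are hypothetical, chosen only to mirror the example on this slide.

```python
# Hypothetical tag index: resource -> set of free-text tags.
tagged = {
    "doc-1": {"web service"},
    "doc-2": {"web services"},
    "doc-3": {"webservices"},
}


def exact_match(query):
    """Return the resources whose tag set contains the query verbatim."""
    return sorted(doc for doc, tags in tagged.items() if query in tags)


# The three documents are about the same topic, but only one is retrieved.
print(exact_match("web service"))
```

With purely syntactic matching, each spelling variant has to be added by hand at annotation time, which is the cost described on the previous slide.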


Why not use semantic tags?

• Plugged into the Web 3.0
• Disambiguation
• Relations among tags
• Machine understandable

NOT: Not Only Tag

http://sisinflab.poliba.it/not-only-tag/



Demo

• Let's imagine tagging the book:


NOT



Smarter tagging

Annotation phase

Retrieval phase


What is behind NOT?

• DBpedia graph exploration
• Computation of a similarity value between each pair of RDF resources using external information sources (search engines, bookmarking systems)


What is behind NOT? (II)


What is behind NOT? (III)


What is behind NOT? (IV)

[Figure: fragment of the DBpedia graph]

Category nodes: Semantic_Web, XML-based_standards, Knowledge_representation, Data_management, Internet_architecture, Triplestores, Folksonomy, XML, Computer_and_telecommunication_standards, Web_services, User_interface_markup_languages, Scalable_Vector_Graphics, Microformats, …

Article nodes: Resource Description Framework, Microformat, RDFa, …

Legend: skos:subject, skos:broader, Category, Article
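The graph exploration step can be illustrated with a depth-bounded breadth-first visit. This is only a sketch: the triples below are made-up toy data that reuse a few node names from the figure, the `max_depth` bound is an assumption, and the real system queries DBpedia rather than an in-memory list.

```python
from collections import deque

# Hypothetical (subject, predicate, object) triples; NOT real DBpedia data.
TRIPLES = [
    ("Resource_Description_Framework", "skos:subject", "Category:Semantic_Web"),
    ("RDFa", "skos:subject", "Category:Semantic_Web"),
    ("Microformat", "skos:subject", "Category:Microformats"),
    ("Category:Semantic_Web", "skos:broader", "Category:Knowledge_representation"),
    ("Category:Microformats", "skos:broader", "Category:Knowledge_representation"),
]


def neighbours(node):
    """Nodes linked to `node` by any triple, in either direction."""
    out = set()
    for s, _, o in TRIPLES:
        if s == node:
            out.add(o)
        elif o == node:
            out.add(s)
    return out


def explore(root, max_depth):
    """Breadth-first exploration; returns {node: depth} up to max_depth."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand nodes at the depth limit
        for nxt in sorted(neighbours(node)):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth


reached = explore("Resource_Description_Framework", max_depth=2)
print(reached)
```

Every pair formed by the root and a reached node is then a candidate for the similarity computation.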


DBpedia-Ranker: hybrid ranking

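?r1 isSimilar ?r2, with hasValue v

External sources-based ranking

$$\mathit{sim}(r_1, r_2) = \frac{f(r_1, r_2)}{f(r_1)} + \frac{f(r_1, r_2)}{f(r_2)}$$

Graph-based ranking

$$\mathit{wikilinkScore}(r_1, r_2) = \begin{cases} 0, & \text{no wikilink between } r_1 \text{ and } r_2 \\ 1, & \text{wikilink between } r_1 \text{ and } r_2 \text{ or vice versa} \\ 2, & \text{wikilink between } r_1 \text{ and } r_2 \text{ and vice versa} \end{cases}$$

$$\mathit{abstractScore}(r_1, r_2) = \frac{l(r_2, r_1)}{l(r_2)}$$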
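A small sketch of two of the scores on this slide: sim(r1, r2) = f(r1, r2)/f(r1) + f(r1, r2)/f(r2), and the 0/1/2 wikilinkScore. It assumes that f is a count returned by an external source (for example the number of results for a query), which the slide does not state explicitly. The zero guard, the function names and the sample counts are illustrative assumptions.

```python
def sim(f1, f2, f12):
    """sim(r1, r2) = f(r1, r2)/f(r1) + f(r1, r2)/f(r2).

    f1, f2: counts for r1 and r2 alone; f12: count for r1 and r2 together.
    Returning 0.0 when a count is zero is an assumption of this sketch.
    """
    if f1 == 0 or f2 == 0:
        return 0.0
    return f12 / f1 + f12 / f2


def wikilink_score(links, r1, r2):
    """0 if no wikilink, 1 if a link exists in one direction, 2 if in both.

    `links` is a set of directed (source, target) pairs.
    """
    return int((r1, r2) in links) + int((r2, r1) in links)


# Hypothetical counts and links, for illustration only.
links = {("RDFa", "Semantic_Web"), ("Semantic_Web", "RDFa"), ("XML", "RDFa")}
print(sim(1000, 500, 100))
print(wikilink_score(links, "RDFa", "Semantic_Web"))
print(wikilink_score(links, "RDFa", "XML"))
```

Note that sim is symmetric in its arguments, so each unordered pair needs to be scored only once per external source.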


Functional Architecture

[Figure: functional architecture]

Components: Back-end, Query engine, Storage, Cloud Generator, GUI, Graph Explorer, Context Analyzer, Ranker, SPARQL, DBpedia Lookup Service

External information sources: Delicious, Yahoo!, Bing, Google

Offline computation
1. Linked Data graph exploration
2. Rank nodes exploiting external information
3. Store results as pairs of nodes together with their similarity

Runtime search
1. Start typing a tag
2. Query the system for relevant tags (corresponding to DBpedia resources)
3. Show the semantic tag cloud
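The split between offline computation and runtime search can be sketched as a precomputed store of resource pairs that is only read at query time. The storage layout, the similarity values and the function names below are hypothetical. The real system relies on its own storage and on the DBpedia Lookup Service.

```python
# Output of the offline phase (assumed layout): (resource, resource) -> similarity.
STORE = {
    ("Semantic_Web", "Resource_Description_Framework"): 0.9,
    ("Semantic_Web", "RDFa"): 0.7,
    ("Semantic_Web", "Folksonomy"): 0.4,
    ("XML", "Web_services"): 0.6,
}


def lookup(prefix):
    """Runtime steps 1-2: resources whose name starts with the typed text."""
    names = {r for pair in STORE for r in pair}
    return sorted(n for n in names if n.lower().startswith(prefix.lower()))


def tag_cloud(resource, top_k=3):
    """Runtime step 3: related tags, most similar first."""
    related = []
    for (a, b), score in STORE.items():
        if a == resource:
            related.append((b, score))
        elif b == resource:
            related.append((a, score))
    # Sort by decreasing similarity, then by name for a stable order.
    return sorted(related, key=lambda item: (-item[1], item[0]))[:top_k]


print(lookup("sem"))
print(tag_cloud("Semantic_Web", top_k=2))
```

Because the expensive ranking happens offline, the runtime path reduces to lookups and a sort.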


Evaluation

We evaluate five different algorithms:
1. DBpediaRanker
2. DBpediaRanker minus Wikipedia info
3. DBpediaRanker minus external information sources
4. Co-occurrence
5. Similarity Distance

$$\mathit{coOcc}(r_1, r_2) = \frac{f(r_1, r_2)}{f(r_1) + f(r_2) - f(r_1, r_2)}$$

$$\mathit{ngd}(r_1, r_2) = \frac{\max\{\log f(r_1), \log f(r_2)\} - \log f(r_1, r_2)}{\log N - \min\{\log f(r_1), \log f(r_2)\}}$$
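The two baseline measures can be sketched as follows: coOcc(r1, r2) = f(r1, r2) / (f(r1) + f(r2) - f(r1, r2)), and ngd as the ratio of max{log f(r1), log f(r2)} - log f(r1, r2) to log N - min{log f(r1), log f(r2)}. The sketch assumes f is a result count from an external source and N is the total number of items that source indexes. The sample numbers are hypothetical, and all counts passed to `ngd` must be positive.

```python
import math


def co_occ(f1, f2, f12):
    """Co-occurrence: joint count divided by the count of either resource."""
    union = f1 + f2 - f12
    return f12 / union if union > 0 else 0.0


def ngd(f1, f2, f12, n):
    """Similarity distance built from log counts; lower means more related.

    Requires f1, f2, f12 > 0 and n larger than min(f1, f2).
    """
    log1, log2, log12 = math.log(f1), math.log(f2), math.log(f12)
    return (max(log1, log2) - log12) / (math.log(n) - min(log1, log2))


# Hypothetical counts: f(r1)=1000, f(r2)=500, f(r1, r2)=100, N=10**6.
print(co_occ(1000, 500, 100))
print(ngd(1000, 500, 100, 10**6))
```

Unlike coOcc, ngd is a distance rather than a similarity: two resources that always appear together get a value of 0.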


Evaluation (II)

http://sisinflab.poliba.it/evaluation

• 50 volunteers
• Researchers in the ICT area
• 244 votes collected (on average 5 votes per user)
• Time to vote: 1 min and 40 secs


Evaluation (III)

http://sisinflab.poliba.it/evaluation/data

3.91 - Good


Conclusion

• NOT *is* useful in the annotation phase:
  – suggestions of semantically related tags
  – tag enrichment
• NOT *is* useful in the retrieval phase:
  – semantic match among tags


Future Work


Impakt Revolution

http://sisinflab.poliba.it/impakt-revolution/


Inspiration: Google Wonder Wheel

Exploratory Search in Google… nice, but there is no “semantics” in it.

You cannot discover new knowledge by exploiting the meaning of a term (keyword/tag/query).


SWOC: Semantic Wonder Cloud

http://sisinflab.poliba.it/semantic-wonder-cloud/index/


Q&A


Thanks for being here on Friday! :-)




Conclusion

• NOT: a tool for smarter tagging
• Ranking algorithm for RDF graphs

Future work
• Test our algorithms with different domains
• Extract more fine-grained contexts
• Enrich the extracted context using also relevant properties
• Integrate our approach with real existing systems
• Use the core system to automatically extract relevant tags (concepts) from a document (or from a collection of documents) exploiting tools for named entity extraction
