From Exploratory Search to Web Search and back - PIKM 2010

PIKM 2010 – Workshop for Ph.D. Students in Information and Knowledge Management

October 30, 2010 – Fairmont Royal York, Toronto, Canada

FROM EXPLORATORY SEARCH TO WEB SEARCH AND BACK

Roberto Mirizzi, Tommaso Di Noia

[email protected], [email protected]

Politecnico di Bari, Via Orabona, 4, 70125 Bari (ITALY)

DESCRIPTION

Search is without doubt one of the main reasons for the success of the Web. Current Web search engines return results with high precision. Nevertheless, if we limit our attention to lookup search alone, we miss another important search task. In exploratory search, the user wants not only to find documents relevant to her query, but also to learn, discover and understand novel knowledge on complex and sometimes unknown topics. In this paper we address this issue by presenting LED, a web-based system that aims to improve (lookup) Web search by enabling users to properly explore the knowledge associated with their query. We rely on DBpedia to explore the semantics of the keywords in the query, and thus suggest potentially interesting related topics/keywords to the user.

TRANSCRIPT

Page 1: From Exploratory Search to Web Search and back - PIKM 2010

PIKM 2010 – Workshop for Ph.D. Students in Information and Knowledge Management

October 30, 2010 – Fairmont Royal York, Toronto, Canada

FROM EXPLORATORY SEARCH TO WEB SEARCH AND BACK

Politecnico di Bari

Via Orabona, 4

70125 Bari (ITALY)

Roberto Mirizzi, Tommaso Di Noia

[email protected], [email protected]

Page 2: From Exploratory Search to Web Search and back - PIKM 2010

Outline

Tags to improve Web Search

Exploratory Search

LED (Lookup Explore Discover): exploratory search in the Web (of Data)

DBpediaRanker: RDF ranking in DBpedia

Conclusion and Future work

Page 3: From Exploratory Search to Web Search and back - PIKM 2010

Why do we use tags?

and many more…

Page 4: From Exploratory Search to Web Search and back - PIKM 2010

What is Exploratory Search?

[Gary Marchionini. Exploratory Search: From Finding to understanding. Communications of the ACM, 49(4): 41-46, 2006]

Page 5: From Exploratory Search to Web Search and back - PIKM 2010

Can Semantic tags support Exploratory search?

Plugged into the Web 3.0

Disambiguation

Relations among tags

Machine understandable

Semantic-aided query refinement

LED: Lookup Explore Discover

http://sisinflab.poliba.it/led/

If semantic tags helped 10% of Internet users save 10 minutes per month on their searches, this would save over 4,000,000 working hours per year globally

Page 6: From Exploratory Search to Web Search and back - PIKM 2010

LED: Lookup Explore Discover

Objectives

Enable users to properly explore the semantics of a keyword

Guide users to refine a query suggesting related topics/keywords

Improve lookup search to explore knowledge

Page 7: From Exploratory Search to Web Search and back - PIKM 2010

What is behind LED? (i)

Page 8: From Exploratory Search to Web Search and back - PIKM 2010

What is behind LED? (ii)

Comments

DBpedia resources are highly interconnected in the RDF graph

Not all the relevant resources for a given node are its direct neighbors

1. Explore the neighborhood of a resource to discover new relevant resources not directly connected to it

2. Rank the results
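The two steps above can be sketched as a bounded breadth-first walk followed by a sort. This is only an illustration: the toy graph, the `max_depth` parameter and the function names are assumptions, and the slides do not say how deep LED actually explores or which traversal it uses.

```python
from collections import deque


def explore_neighborhood(graph, start, max_depth=2):
    """Return {resource: hops} for every resource within max_depth hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]  # the start resource is not a result
    return distances


def rank(candidates, score):
    """Order candidates by a caller-supplied score, best first."""
    return sorted(candidates, key=score, reverse=True)


# Hypothetical three-node graph: RDFa reaches RDF only through a category.
toy_graph = {
    "RDFa": ["Semantic_Web"],
    "Semantic_Web": ["RDFa", "Resource_Description_Framework"],
    "Resource_Description_Framework": ["Semantic_Web"],
}

found = explore_neighborhood(toy_graph, "RDFa", max_depth=2)
ranked = rank(found, score=lambda r: -found[r])  # placeholder score: closer is better
print(ranked)
```

Here the hop count stands in for a real relevance score; any scoring function can be passed to `rank` instead.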

Page 9: From Exploratory Search to Web Search and back - PIKM 2010

DBpedia graph exploration in LED

[Figure: a fragment of the DBpedia graph explored by LED. Legend: nodes are either Category or Article; edges are skos:subject and skos:broader. Node labels: Resource Description Framework, Microformat, RDFa, Semantic_Web, XML-based_standards, Knowledge_representation, Data_management, Internet_architecture, Triplestores, Folksonomy, XML, Computer_and_telecommunication_standards, Web_services, User_interface_markup_languages, Scalable_Vector_Graphics, Microformats, …]

Page 10: From Exploratory Search to Web Search and back - PIKM 2010

The functional architecture

[Figure: functional architecture of LED. Blocks shown: GUI, Interface, Query engine, Back-end, Storage, Graph Explorer, SPARQL, Context Analyzer, Ranker, Tag Cloud Generator, Meta-search engine, DBpedia Lookup Service. External information sources: Delicious, Yahoo!, Bing, Google.]

Offline computation

1. Linked Data graph exploration

2. Rank nodes exploiting external information

3. Store results as pairs of nodes together with their similarity

Runtime search

1. Start typing a query

2. Query the system for relevant tags (corresponding to DBpedia resources) and aggregate results

3. Show the semantic tag cloud and the results
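One way to picture the split between offline computation and runtime search is a small store of (node, node, similarity) triples that is filled ahead of time and only read when the user types. The sketch below is an assumption about the shape of that storage, not LED's actual schema: the class name, the `top_k` parameter and the Microformat score are hypothetical, while the 0.39 value is the Google-based similarity worked out in the slides.

```python
from collections import defaultdict


class SimilarityStore:
    """In-memory stand-in for the storage block: directed pairs with a similarity."""

    def __init__(self):
        self._pairs = defaultdict(dict)

    def add(self, r1, r2, similarity):
        # Offline step 3: store the pair of nodes together with their similarity.
        self._pairs[r1][r2] = similarity

    def related(self, resource, top_k=5):
        # Runtime step 2: fetch the tags related to a resource, best first.
        neighbours = self._pairs.get(resource, {})
        ordered = sorted(neighbours.items(), key=lambda kv: kv[1], reverse=True)
        return ordered[:top_k]


store = SimilarityStore()
store.add("RDFa", "Resource_Description_Framework", 0.39)
store.add("RDFa", "Microformat", 0.20)  # hypothetical value for illustration

tags = store.related("RDFa")
print(tags)
```

Because all scores are computed beforehand, the runtime path is a plain lookup and sort, which is what makes suggestions feasible while the user is still typing.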

Page 11: From Exploratory Search to Web Search and back - PIKM 2010

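DBpediaRanker: ranking

[Figure: the pair ?r1, ?r2 is linked by isSimilar, which carries hasValue v.]

Ranking based on external sources

\[ sim(r_1, r_2)_{info\_source} = \frac{f(r_1, r_2)}{f(r_1)} + \frac{f(r_1, r_2)}{f(r_2)} \]

Graph-based and text-based ranking

\[ wikilinkScore(r_1, r_2) = \begin{cases} 0, & \text{no wikilink between } r_1 \text{ and } r_2 \\ 1, & \text{wikilink between } r_1 \text{ and } r_2 \text{ or vice versa} \\ 2, & \text{wikilink between } r_1 \text{ and } r_2 \text{ and vice versa} \end{cases} \]

\[ abstractScore(r_1, r_2) = \frac{l(r_2, r_1)}{l(r_2)} \]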
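As a rough sketch of two of the measures on this slide: the external-source similarity adds the ratio of the joint count f(r1, r2) to each individual count f(r1) and f(r2), and the wikilink score counts in how many directions a wikilink exists (0, 1 or 2). The function names, the dictionary-of-sets link representation and the zero-count guard are assumptions made for illustration; the slides do not specify them.

```python
def external_sim(f_r1, f_r2, f_both):
    """sim = f(r1, r2) / f(r1) + f(r1, r2) / f(r2) for one information source."""
    if f_r1 == 0 or f_r2 == 0:
        return 0.0  # assumption: no evidence for a resource means zero similarity
    return f_both / f_r1 + f_both / f_r2


def wikilink_score(links, r1, r2):
    """0 if no wikilink, 1 if linked in one direction, 2 if linked in both."""
    forward = r2 in links.get(r1, set())
    backward = r1 in links.get(r2, set())
    return int(forward) + int(backward)


# Hypothetical link table: each key lists the pages it links to.
links = {
    "RDFa": {"Resource_Description_Framework"},
    "Resource_Description_Framework": {"RDFa"},
    "Microformat": {"RDFa"},
}

print(wikilink_score(links, "RDFa", "Resource_Description_Framework"))
print(wikilink_score(links, "RDFa", "Microformat"))
print(external_sim(4, 8, 2))
```

Note that the external similarity can range up to 2, reached when both resources only ever appear together.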

Page 12: From Exploratory Search to Web Search and back - PIKM 2010

DBpediaRanker: an example (i)

wikilinkScore(RDFa, Resource_Description_Framework) = 2

abstractScore(RDFa, Resource_Description_Framework) = 1.0

Page 13: From Exploratory Search to Web Search and back - PIKM 2010

DBpediaRanker: an example (ii)

sim(RDFa, Resource_Description_Framework)_Google = 1.67e5 / 4.42e5 + 1.67e5 / 1.19e7 = 0.39
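The arithmetic in this example can be checked term by term. The snippet assumes that 1.67e5 is the count for the two resources together, 4.42e5 the count for RDFa alone and 1.19e7 the count for Resource_Description_Framework alone; the slide gives the numbers but does not label them, so the variable names are an interpretation.

```python
f_both = 1.67e5  # assumed: results mentioning both resources
f_rdfa = 4.42e5  # assumed: results for RDFa alone
f_rdf = 1.19e7   # assumed: results for Resource_Description_Framework alone

term_rdfa = f_both / f_rdfa  # share of RDFa results that also mention RDF
term_rdf = f_both / f_rdf    # share of RDF results that also mention RDFa
sim_google = term_rdfa + term_rdf

print(f"{term_rdfa:.3f} + {term_rdf:.3f} = {sim_google:.2f}")
```

Almost all of the score comes from the first term: the rarer resource (RDFa) co-occurs with the more common one far more often than the reverse.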

delicious

Page 14: From Exploratory Search to Web Search and back - PIKM 2010

DBpediaRanker: context analysis

The same similarity measure is used in the context analysis

[Figure: ?r1 is linked by belongsTo to the context C = {?c1, ?c2, …, ?cN}; the link carries hasValue v.]

Example:

C = {Programming Languages, Databases, Software}

Does Dennis Ritchie belong to the given context?

Algorithm:

If (v > THRESHOLD) then
    r1 belongs to the context; add r1 to the graph exploration queue
Else
    r1 does not belong to the context; exclude r1 from graph exploration
EndIf
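The threshold test can be sketched as follows. The slides say only that the same similarity measure is reused and that v is compared with a threshold; how the per-topic similarities are combined into v is not stated, so averaging them is an assumption here, as are the function names, the threshold of 0.3 and every similarity value in the toy table.

```python
from collections import deque


def context_value(resource, context, sim):
    """Assumed aggregation: v is the mean similarity between r1 and each context topic."""
    return sum(sim(resource, c) for c in context) / len(context)


def filter_by_context(candidates, context, sim, threshold):
    """Keep only resources whose v exceeds the threshold, in exploration order."""
    queue = deque()
    for resource in candidates:
        if context_value(resource, context, sim) > threshold:
            queue.append(resource)  # belongs to the context: explore it later
        # otherwise the resource is excluded from graph exploration
    return queue


# Hypothetical similarity table; missing pairs default to 0.0.
toy_sims = {
    ("Dennis_Ritchie", "Programming_Languages"): 0.8,
    ("Dennis_Ritchie", "Databases"): 0.1,
    ("Dennis_Ritchie", "Software"): 0.6,
}


def toy_sim(r, c):
    return toy_sims.get((r, c), 0.0)


context = ["Programming_Languages", "Databases", "Software"]
queue = filter_by_context(["Dennis_Ritchie", "Toronto"], context, toy_sim, threshold=0.3)
print(list(queue))
```

With these made-up numbers Dennis Ritchie averages about 0.5 and is queued, while an unrelated resource scores 0 and is dropped, which keeps the exploration from drifting outside the chosen domain.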

Page 15: From Exploratory Search to Web Search and back - PIKM 2010

Evaluation (i)

http://sisinflab.poliba.it/evaluation

Comparison of 5 different algorithms

50 volunteers

Researchers in the ICT area

244 votes collected (on average 5 votes per user)

Average time to vote: 1 min 40 s

Page 16: From Exploratory Search to Web Search and back - PIKM 2010

Evaluation (ii)

http://sisinflab.poliba.it/evaluation/data

3.91 - Good

Page 17: From Exploratory Search to Web Search and back - PIKM 2010

Conclusion

LED: a system for exploratory search and query refinement on the (Semantic) Web

DBpediaRanker: ranking algorithms for resources in DBpedia

Future work

Expose a RESTful API for building novel mashups and for comparing with different systems

Improve ranking algorithms

Deal with cases where a single knowledge base is not sufficient

Combine a content-based recommendation and a collaborative-filtering approach

Page 18: From Exploratory Search to Web Search and back - PIKM 2010

Trick or Treat?

FROM EXPLORATORY SEARCH TO WEB SEARCH AND BACK (PIKM 2010)

If you're interested in learning more…

1. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tags generation and retrieval for online advertising. 19th ACM International Conference on Information and Knowledge Management (CIKM 2010)

2. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Ranking the Linked Data: the case of DBpedia. 10th International Conference on Web Engineering (ICWE 2010)

3. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tag cloud generation via DBpedia. 11th International Conference on Electronic Commerce and Web Technologies (EC-Web 2010)

4. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tagging for crowd computing. 18th Italian Symposium on Advanced Database Systems (SEBD 2010)

5. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic Wonder Cloud: exploratory search in DBpedia. 2nd International Workshop on Semantic Web Information Management (SWIM 2010) - Best Workshop Paper at International Conference on Web Engineering (ICWE 2010)

Roberto Mirizzi - [email protected]

Thanks for your attention!
