From Exploratory Search to Web Search and Back - PIKM 2010
DESCRIPTION
The power of search is without doubt one of the main reasons for the success of the Web. Currently available Web search engines return results with high precision. Nevertheless, if we limit our attention to lookup search, we miss another important search task. In exploratory search, the user wants not only to find documents relevant to her query but also to learn, discover, and understand novel knowledge on complex and sometimes unknown topics. In the paper we address this issue by presenting LED, a web-based system that aims to improve (lookup) Web search by enabling users to properly explore the knowledge associated with their query. We rely on DBpedia to explore the semantics of the keywords within the query, thus suggesting potentially interesting related topics/keywords to the user.
TRANSCRIPT
PIKM 2010 – Workshop for Ph.D. Students in Information and Knowledge Management
October 30, 2010 – Fairmont Royal York, Toronto, Canada
FROM EXPLORATORY SEARCH TO WEB SEARCH AND BACK
Politecnico di Bari
Via Orabona, 4
70125 Bari (ITALY)
Roberto Mirizzi, Tommaso Di Noia
[email protected], [email protected]
Outline
Tags to improve Web Search
Exploratory Search
LED (Lookup Explore Discover): exploratory search in the Web (of Data)
DBpediaRanker: RDF ranking in DBpedia
Conclusion and Future work
Why do we use tags?
and many more…
What is Exploratory Search?
[Gary Marchionini. Exploratory Search: From Finding to Understanding. Communications of the ACM, 49(4): 41-46, 2006]
Can Semantic tags support Exploratory search?
Plugged into the Web 3.0
Disambiguation
Relations among tags
Machine understandable
Semantic-aided query refinement
LED: Lookup Explore Discover
http://sisinflab.poliba.it/led/
If Semantic tags helped 10% of Internet users save 10 minutes per month on their searches, this would save over 4,000,000 working hours per year globally
LED: Lookup Explore Discover
Objectives
Enable users to properly explore the semantics of a keyword
Guide users to refine a query suggesting related topics/keywords
Improve lookup search to explore knowledge
What is behind LED? (i)
What is behind LED? (ii)
Comments
DBpedia resources are highly interconnected in the RDF graph
Not all the relevant resources for a given node are its direct neighbors
1. Explore the neighborhood of a resource to discover new relevant resources not directly connected to it
2. Rank the results
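The two steps above can be sketched as a bounded breadth-first visit followed by a sort. This is only an illustrative sketch: the function names `explore` and `rank`, the depth limit, the toy edges, and the toy scores are all assumptions made for the example and are not taken from LED or from real DBpedia data.

```python
from collections import deque


def explore(graph, start, max_depth):
    """Return {resource: distance} for every resource reachable from
    `start` within `max_depth` hops (the start node itself is excluded)."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]
    return distances


def rank(candidates, score):
    """Order candidate resources by a caller-supplied score, best first."""
    return sorted(candidates, key=score, reverse=True)


# Hand-made toy graph (hypothetical edges, not real DBpedia triples).
toy_graph = {
    "RDFa": ["Semantic_Web"],
    "Semantic_Web": ["RDFa", "Resource_Description_Framework"],
    "Resource_Description_Framework": ["Semantic_Web"],
}
# Made-up relevance scores, standing in for a real ranking function.
toy_scores = {"Semantic_Web": 0.2, "Resource_Description_Framework": 0.9}

found = explore(toy_graph, "RDFa", max_depth=2)
ranked = rank(found, lambda resource: toy_scores[resource])
```

In this toy graph, Resource_Description_Framework is not a direct neighbor of RDFa, yet it is discovered at distance 2 and can then be ranked alongside the direct neighbors.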
DBpedia graph exploration in LED
[Figure: excerpt of the DBpedia graph explored by LED. Legend: Article and Category nodes, connected by skos:subject and skos:broader edges. Article nodes shown: Resource Description Framework, Microformat, RDFa. Category nodes shown: Semantic_Web, XML-based_standards, Knowledge_representation, Data_management, Internet_architecture, Triplestores, Folksonomy, XML, Computer_and_telecommunication_standards, Web_services, User_interface_markup_languages, Scalable_Vector_Graphics, Microformats.]
The functional architecture
[Figure: functional architecture of LED. Blocks shown: GUI, Interface, Back-end, Query engine, Storage, Ext. Info Sources (Delicious, Yahoo!, Bing), DBpedia Lookup Service, Graph Explorer, SPARQL, Context Analyzer, Ranker, Tag Cloud Generator, Meta-search engine. Numbered arrows mark the steps of the two workflows below.]
Offline computation
1. Linked Data graph exploration
2. Rank nodes exploiting external information
3. Store results as pairs of nodes together with their similarity
Runtime search
1. Start typing a query
2. Query the system for relevant tags (corresponding to DBpedia resources) and aggregate results
3. Show the semantic tag cloud and the results
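The split between offline computation and runtime search can be illustrated with a small in-memory store: pairs of resources are saved with their similarity offline, and at runtime the most similar resources for a query term are read back. This is a sketch only. The class `SimilarityStore`, its methods, and the choice to store each pair symmetrically are assumptions made for illustration, not the storage layer LED uses. The 0.39 value is the one computed in the ranking example of these slides; the Microformat value is made up.

```python
from collections import defaultdict


class SimilarityStore:
    """Hypothetical in-memory stand-in for the storage block."""

    def __init__(self):
        self._related = defaultdict(dict)

    def add_pair(self, r1, r2, similarity):
        # Offline step 3: store the pair of nodes with its similarity.
        # Storing both directions is an assumption of this sketch.
        self._related[r1][r2] = similarity
        self._related[r2][r1] = similarity

    def related_tags(self, resource, limit=10):
        # Runtime step 2: fetch the stored neighbours, most similar first.
        neighbours = self._related.get(resource, {})
        ordered = sorted(neighbours.items(), key=lambda item: item[1], reverse=True)
        return ordered[:limit]


store = SimilarityStore()
store.add_pair("RDFa", "Resource_Description_Framework", 0.39)
store.add_pair("RDFa", "Microformat", 0.25)  # made-up value

top = store.related_tags("RDFa", limit=1)
```

Precomputing the pairs keeps the runtime path to a lookup and a sort, which suits a tag cloud that updates while the user types.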
DBpediaRanker: ranking
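[Figure: ?r1 isSimilar ?r2; the relation hasValue v]
Ranking based on external sources
$$\mathrm{sim}(r_1, r_2)_{\mathit{info\_source}} = \frac{f(r_1, r_2)}{f(r_1)} + \frac{f(r_1, r_2)}{f(r_2)}$$
Graph-based and text-based ranking
$$\mathrm{wikilinkScore}(r_1, r_2) = \begin{cases} 0, & \text{no wikilink between } r_1 \text{ and } r_2 \\ 1, & \text{wikilink between } r_1 \text{ and } r_2 \text{ or vice versa} \\ 2, & \text{wikilink between } r_1 \text{ and } r_2 \text{ and vice versa} \end{cases}$$
$$\mathrm{abstractScore}(r_1, r_2) = \frac{l(r_2, r_1)}{l(r_2)}$$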
DBpediaRanker: an example (i)
wikilinkScore(RDFa, Resource_Description_Framework) = 2
abstractScore(RDFa, Resource_Description_Framework) = 1.0
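Reading wikilinkScore as 0, 1, or 2 depending on whether no wikilink, a wikilink in one direction, or wikilinks in both directions exist between two resources, the first value above can be reproduced with a few lines. The function name `wikilink_score` and the representation of wikilinks as a set of (source, target) pairs are assumptions of this sketch; the toy link set simply assumes both directions exist, which is what a score of 2 implies.

```python
def wikilink_score(wikilinks, r1, r2):
    """Count how many of the two possible directed wikilinks exist
    between r1 and r2 (0, 1, or 2)."""
    forward = (r1, r2) in wikilinks
    backward = (r2, r1) in wikilinks
    return int(forward) + int(backward)


# Toy set of directed wikilinks (assumed for illustration).
toy_wikilinks = {
    ("RDFa", "Resource_Description_Framework"),
    ("Resource_Description_Framework", "RDFa"),
    ("RDFa", "Microformat"),
}

score_both = wikilink_score(toy_wikilinks, "RDFa", "Resource_Description_Framework")
score_one = wikilink_score(toy_wikilinks, "RDFa", "Microformat")
score_none = wikilink_score(toy_wikilinks, "Microformat", "Resource_Description_Framework")
```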
DBpediaRanker: an example (ii)
sim(RDFa, Resource_Description_Framework)_Google = 1.67e5 / 4.42e5 + 1.67e5 / 1.19e7 = 0.39
delicious
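The Google-based value above can be checked directly. The sketch below assumes that the three numbers on the slide are counts reported by the external source: one for the two resources together and one for each resource alone. The slide does not say so explicitly, and the parameter names are chosen only for this example.

```python
def sim(f_r1_r2, f_r1, f_r2):
    """Sum of the joint count divided by each of the single counts."""
    return f_r1_r2 / f_r1 + f_r1_r2 / f_r2


# Numbers taken from the slide for RDFa and Resource_Description_Framework.
value = sim(f_r1_r2=1.67e5, f_r1=4.42e5, f_r2=1.19e7)
print(round(value, 2))  # 0.39
```

Because the two fractions are summed, swapping the two single counts gives the same result, so the measure is symmetric in the two resources.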
DBpediaRanker: context analysis
The same similarity measure is used in the context analysis
[Figure: ?r1 belongsTo the context C = {?c1, ?c2, ?c…, ?cN}; the relation hasValue v]
Example:
C = {Programming Languages, Databases, Software}
Does Dennis Ritchie belong to the given context?
Algorithm:
If (v > THRESHOLD) then
  r1 belongs to the context; add r1 to the graph exploration queue
Else
  r1 does not belong to the context; exclude r1 from graph exploration
EndIf
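The threshold test can be sketched in a few lines. The slides do not say how v is aggregated over the context resources, so this example assumes, purely for illustration, that v is the sum of the similarities between r1 and each resource in C. The threshold value, the function names, and all similarity numbers below are made up.

```python
from collections import deque

THRESHOLD = 1.0  # hypothetical value, not taken from the slides


def context_value(similarities):
    """Assumed aggregation: sum of the similarities between a resource
    and each resource of the context C."""
    return sum(similarities)


def filter_by_context(candidates, threshold=THRESHOLD):
    """Return the exploration queue holding only the resources whose
    context value v exceeds the threshold."""
    queue = deque()
    for resource, similarities in candidates.items():
        if context_value(similarities) > threshold:
            queue.append(resource)  # belongs to the context
        # otherwise the resource is excluded from graph exploration
    return queue


# Made-up similarities to C = {Programming Languages, Databases, Software}.
toy_candidates = {
    "Dennis_Ritchie": [0.8, 0.3, 0.6],
    "Bari": [0.1, 0.0, 0.05],
}
exploration_queue = filter_by_context(toy_candidates)
```

With these toy numbers only Dennis_Ritchie passes the test and is queued for further exploration.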
Evaluation (i)
http://sisinflab.poliba.it/evaluation
Comparison of 5 different algorithms
50 volunteers
Researchers in the ICT area
244 votes collected (on average 5 votes per user)
Average time to vote: 1 min and 40 secs
Evaluation (ii)
http://sisinflab.poliba.it/evaluation/data
3.91 - Good
Conclusion
LED: a system for exploratory search and query refinement on the (Semantic) Web
DBpediaRanker: ranking algorithms for resources in DBpedia
Future work
Expose a RESTful API for building novel mashups and for comparing with different systems
Improve ranking algorithms
Deal with cases where a single knowledge base is not sufficient
Combine a content-based recommendation and a collaborative-filtering approach
Trick or Treat?
FROM EXPLORATORY SEARCH TO WEB SEARCH AND BACK (PIKM 2010)
If you're interested in learning more…
1. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tags generation and retrieval for online advertising. 19th ACM International Conference on Information and Knowledge Management (CIKM 2010)
2. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Ranking the Linked Data: the case of DBpedia. 10th International Conference on Web Engineering (ICWE 2010)
3. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tag cloud generation via DBpedia. 11th International Conference on Electronic Commerce and Web Technologies (EC-Web 2010)
4. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tagging for crowd computing. 18th Italian Symposium on Advanced Database Systems (SEBD 2010)
5. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic Wonder Cloud: exploratory search in DBpedia. 2nd International Workshop on Semantic Web Information Management (SWIM 2010) - Best Workshop Paper at International Conference on Web Engineering (ICWE 2010)
Roberto Mirizzi - [email protected]
Thanks for your attention!