Post on 21-Dec-2015
A. Frank
Semantic Search Engines – On the Way to Web 3.0
Ariel Frank
Department of Computer Science
Bar-Ilan University
[email protected]
Contents
• Web 3.0 & Semantic Search
• General Search
• "Natural Language" Search
• Vertical Search
• "Social Networking" Search
• Personalized Search
What is Web 2.0!?
Open Gardens blog, Ajit Jaokar, http://opengardensblog.futuretext.com/archives/2005/12/mobile_Web_20_w.html
“The good, the bad and the…”
Web 1.0, Web 2.0, Web 3.0, Web X.0…
Semantic Search
• Syntactic search – can match the query against
– index of the textual content of the resources
– URIs (URLs, URNs) in the system
– literals in the RDF metadata
– or a combination of these, possibly using:
  • exact, prefix or substring match, stemming, minimal edit distance
• Semantic search – in addition to syntactic search, can use
– index of the meaning of sentences in each resource
– semantic information and analysis
– the graph structure of RDF metadata
– or a combination of these, possibly using:
  • query expansion, classification/categorization, tagging, graph traversal, microformats, RDF & OWL inferencing and reasoning
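The matching strategies listed on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular engine works: the function names, the one-edit threshold and the tiny synonym table are all assumptions made for the example. It covers exact, prefix, substring and edit-distance matching on the syntactic side, and a naive query expansion on the semantic side.

```python
from typing import Dict, List, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def syntactic_match(query: str, term: str, max_edits: int = 1) -> str:
    """Return the name of the first matching strategy, or '' if none match."""
    q, t = query.lower(), term.lower()
    if q == t:
        return "exact"
    if t.startswith(q):
        return "prefix"
    if q in t:
        return "substring"
    if edit_distance(q, t) <= max_edits:
        return "edit-distance"
    return ""


# Hypothetical synonym table standing in for a real thesaurus or ontology.
SYNONYMS: Dict[str, Set[str]] = {"film": {"movie", "picture"}}


def expand_query(query: str) -> Set[str]:
    """Query expansion: the original term plus any known synonyms."""
    return {query} | SYNONYMS.get(query, set())


def search(query: str, index_terms: List[str]) -> List[str]:
    """Index terms that syntactically match any expanded form of the query."""
    expanded = expand_query(query.lower())
    return [t for t in index_terms
            if any(syntactic_match(q, t) for q in expanded)]
```

With this sketch, a query for "film" also retrieves the index term "movie", which a purely syntactic match on the original query would miss.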
Can Semantic SEs answer this? :-)
Types/Examples of Semantic SEs
• General Search
– MetaWeb Freebase, Yahoo! Microsearch, …
• "Natural Language" Search
– Powerset, Hakia, AskMeNow AskWiki, …
• Vertical Search
– Kango, AdaptiveBlue, ReportLinker, …
• "Social Networking" Search
– SemantiNet, Delver, Google Social Graph API, …
• Personalized Search
– Twine, MavinIT PSS, …
MetaWeb Technologies - Freebase
• Based in San Francisco, MetaWeb Technologies was spun out of Applied Minds in July 2005.
• Goal: build a better infrastructure for Web application developers and publishers.
Freebase Rationale
• Open, shared database of the world’s knowledge that collects data from the Web to build a massive, collaboratively-edited database of cross-linked data.
• It is built by the community, for the community.
• Free for anyone to query, contribute to, build applications on top of, or integrate into their Web sites.
• Focus is on organizing and managing complex data structures by use of Semantic Web technologies.
• Enables extraction of ordered knowledge out of the information chaos that is the current Web.
Freebase
Freebase Repository
• Covers millions of topics in hundreds of categories.
• Draws from large open repositories like Wikipedia, MusicBrainz, and the SEC archives.
• Contains structured information on many popular topics, like movies, music, people and locations – all reconciled and freely available via an open API.
• Freebase information is supplemented by the efforts of a passionate global community of users, who are working together to add structured information on everything relevant.
Domains and Types
Google Company
Freebase Help Center
Freebase Semantics
• Freebase spans domains, but requires that a particular topic exist only once, even if it might normally be found in multiple databases.
• For example, Arnold Schwarzenegger would appear in a movie database as an actor, a political database as a governor and a bodybuilder database as a Mr. Universe.
• In Freebase, there is only one topic for Arnold Schwarzenegger, with all three facets of his public persona brought together.
• The unified topic acts as an information hub, making it easy to find and contribute information about him.
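The "one topic, many facets" idea can be illustrated with a small Python sketch. The source names (`movie_db` and so on) and the rule of reconciling on an exact name match are assumptions for illustration only; real reconciliation has to cope with ambiguous and variant names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical records from three separate databases: (source, name, role).
RECORDS: List[Tuple[str, str, str]] = [
    ("movie_db", "Arnold Schwarzenegger", "actor"),
    ("politics_db", "Arnold Schwarzenegger", "governor"),
    ("bodybuilding_db", "Arnold Schwarzenegger", "Mr. Universe"),
]


def reconcile(records: List[Tuple[str, str, str]]) -> Dict[str, Dict[str, str]]:
    """Merge records sharing a name into one topic that holds every facet."""
    topics: Dict[str, Dict[str, str]] = defaultdict(dict)
    for source, name, role in records:
        topics[name][source] = role  # naive assumption: same name, same topic
    return dict(topics)
```

Calling `reconcile(RECORDS)` yields a single topic whose entry lists all three roles, which is the hub behavior the slide describes.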
Arnold Schwarzenegger (1)
Arnold Schwarzenegger (2)
Freebase Dynamics
• For developers, and even for users who are only mildly technical, Freebase offers tools that make it easy to query and integrate the data into Web applications, blogs, wikis, user pages or anything else that would benefit from an injection of structured information.
• In addition to reconciling many facets of one topic, the underlying structure of Freebase lets the user run more complex queries.
• For example, when Freebase is asked for films starring Jennifer Connelly and actors who have appeared in Steven Spielberg movies, it returns a list of 8 movies.
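A compound query of this kind amounts to joining two relations over cross-linked data. The sketch below runs the same shape of query over a tiny in-memory data set. It does not use the Freebase API; the data, the placeholder names and the function name are hypothetical, and it assumes one reading of the query: films that star a given lead together with at least one actor who appeared in a film by a given director.

```python
from typing import Dict, Set

# Hypothetical cast and director data; placeholder names, not real film facts.
CAST: Dict[str, Set[str]] = {
    "Film A": {"Lead X", "Actor 1"},
    "Film B": {"Lead X", "Actor 2"},
    "Film C": {"Actor 1", "Actor 3"},
    "Film D": {"Actor 2"},
}
DIRECTOR: Dict[str, str] = {
    "Film A": "Director P",
    "Film B": "Director Q",
    "Film C": "Director R",
    "Film D": "Director R",
}


def films_with_costars_of_director(lead: str, director: str) -> Set[str]:
    """Films starring `lead` that also feature an actor from any `director` film."""
    director_actors: Set[str] = set()
    for film, who in DIRECTOR.items():
        if who == director:
            director_actors |= CAST[film]
    return {film for film, cast in CAST.items()
            if lead in cast and (cast - {lead}) & director_actors}
```

Because the cast and director relations share film names as keys, the join needs no text matching at all, which is the advantage of structured data over article text.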
Films starring Jennifer Connelly…
Freebase vs. Wikipedia
• The difference lies in the way they store information.
• Wikipedia arranges information in the form of articles.
• Freebase lists facts and statistics. Its list form is good not only for people who like to glance at facts, but also for people who want to use the data to build other Web sites and software. (Information in an article form can’t be reused in the same way.)
• Topics covered by Freebase include subjects that are too obscure for Wikipedia, which strives for notability appropriate to an encyclopedia.
Powerset
• Powerset is a Silicon Valley company.
• Goal: build a transformative consumer search engine based on Natural Language Processing (NLP).
Powerset Rationale
• Unlike conventional search engines that use keywords, Powerset reads and understands every sentence on a Web page and allows users to ask questions in plain English.
• Unique innovations in search are rooted in breakthrough technologies that take advantage of the structure and nuances of natural language.
• Using these advanced techniques, Powerset is building a large-scale search engine that breaks the confines of keyword search.
• By making search more natural and intuitive, Powerset is fundamentally changing how we search the Web, by delivering higher quality results.
Who proved Fermat’s last theorem?
What did Steve Jobs say about the iPod?
What did Bush say about Gore?
Powerlabs
• Powerlabs is a community where users can:
– interact with demonstrations of Powerset’s technology before the search engine launches in 2008
– give feedback to help improve the "Natural Language" indexing
– suggest ideas for the ideal search engine.
• Involving users on such a scale, and at such an early stage of development, reflects Powerset’s recognition that the wisdom of crowds can guide it.
Powerlabs Sign In
Wiki Search Sneak Peek
• Access to the first open search box covering Wikipedia.
• Powerset uses linguistic analyses of both the query and Wikipedia to find the best matches.
• The Miniviewer lets users view highlighted matches in the context of a Wikipedia article without ever having to leave the results page.
• By incorporating semantic information from Powerset’s indexing process into republished Wiki pages, internal page search enables a whole new kind of search: semantic-search-within-the-page.
Explore Wikipedia
Google acquire something
Google acquire company
Search Wikipedia
Companies acquired in 2001
Powerset PowerMouse
• PowerMouse is an application that provides a view into Powerset’s technology, letting users examine how structured information is extracted from open text.
• It is not intended as a search application per se, but lets users search for and navigate through facts encoded in Powerset’s Wikipedia index.
• It shows in dramatic fashion how compactly large amounts of data can be organized and displayed based on a few semantic relationships.
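Facts extracted from text are commonly represented as subject-relation-object triples, and a pattern query then fixes some positions and leaves others open. The sketch below illustrates that idea only; it is not Powerset's implementation, and the facts, the `match` function and the use of `None` for an open position are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts, as if extracted from text; not Powerset's actual index.
FACTS: List[Triple] = [
    ("CompanyA", "acquire", "CompanyB"),
    ("CompanyA", "acquire", "CompanyC"),
    ("rabbit", "eat", "carrot"),
    ("CompanyD", "acquire", "CompanyA"),
]


def match(subject: Optional[str], relation: str, obj: Optional[str],
          facts: List[Triple] = FACTS) -> List[Triple]:
    """Return facts matching the pattern; None plays the role of 'something'."""
    return [(s, r, o) for s, r, o in facts
            if r == relation
            and (subject is None or s == subject)
            and (obj is None or o == obj)]
```

Under these assumptions, `match("CompanyA", "acquire", None)` corresponds to a pattern of the form "X acquire something", and `match(None, "eat", "carrot")` to "something eats carrot" once the verb has been reduced to its base form.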
PowerMouse Examples
Google acquire something
something eats carrot
person won nobel
Kango
• Vertical semantic search engine for personalized travel information.
• Goal: first step to deciding where to go, where to stay or what to do; finds the trip that is right for you.
Kango Rationale
• Kango indexes the collective wisdom on travel from the entire Web.
• Recommendations are based on a gestalt of voices heard in over 20 million reviews, ratings, blogs, journals, and articles collected from over a thousand sources such as Web sites, books and magazines.
• Organizes and presents the most relevant opinions and product details in a "federated" search display based on what’s known about travel preferences.
Kango Repository
• Kango has scoured the Web to collect all kinds of places to go, things to do and places to stay.
• It then analyzed and organized millions of travelers' opinions to enable search based on exact travel requirements and preferences.
• Kango brings together:
– more than a thousand sites
– 400,000 lodging, activity and destination options
– 20 million reviews, ratings and blogs.
How Kango Works
Kango Semantics
• It provides many options for specifying a trip.
• Kango thinks about those options in terms of the “Long Tail” concept to help make the trips distinct and memorable.
• It "understands" the travel lingo, so it helps users make informed decisions about what best fits their specific travel preferences.
• Kango is creating an ontology of global travel content that includes ranking of superlatives within review sites.
Lodging
Things to Do
Kango Dynamics
• Enables new ways of filtering through its collection to get the recommendations that are most relevant to preferences and priorities.
• Based on who the user is traveling with, the kind of destination sought, and the activities planned, it sifts through its information to deliver the right getaway.
• For example, it returns:
– one set of hotel and activity recommendations when traveling to Monterey for a romantic getaway
– a different set when going to Monterey with the family to visit the aquarium and hang out on the beaches.
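The Monterey example can be mimicked with a simple preference-overlap ranking. This is a minimal sketch under stated assumptions: the listings, their tags and the scoring rule (count of matching preference tags) are invented for illustration and say nothing about how Kango actually scores results.

```python
from typing import Dict, List, Set

# Hypothetical listings tagged with traits mined from reviews; not Kango's data.
LISTINGS: Dict[str, Set[str]] = {
    "Quiet Inn": {"romantic", "quiet", "lodging"},
    "Beach Motel": {"family", "beach", "lodging"},
    "Aquarium Visit": {"family", "kids", "activity"},
    "Sunset Dinner Cruise": {"romantic", "activity"},
}


def recommend(preferences: Set[str], kind: str, top: int = 3) -> List[str]:
    """Rank listings of one kind by how many stated preferences they match."""
    scored = [(len(tags & preferences), name)
              for name, tags in LISTINGS.items() if kind in tags]
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top] if score > 0]
```

The same destination thus yields different lodging for `{"romantic"}` than for `{"family", "beach"}`, which is the behavior the slide describes.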
Old Monterey Inn
Campgrounds in Hawaii
SemantiNet
• SemantiNet is a startup, based in Tel Aviv, that is creating a revolutionary new technology based on Semantic Web concepts.
• Goal: leverage Web information in a meaningful way to improve the way users experience the Internet.
SemantiNet Rationale
• SemantiNet makes life easy by allowing users to take advantage of the variety and richness of information and services that exist on the Internet, but in a way that is simple, smart and intuitive.
• SemantiNet leverages Semantic Web concepts to seamlessly integrate information and services, enabling users to achieve more while working less!
SemantiNet Repository
• SemantiNet collects relevant information from common social networks and established Web sites in order to provide users with an efficient, personalized and contextual browsing experience.
• Relevant personal information can be
– entered on their Web site
– provided by users through use of SemantiNet
– or extracted from "traffic data" generated by browser use.
SemantiNet Semantics
• SemantiNet is developing a semantic framework that allows rapid deployment of Web mashups, applications and services, enhancing the way people use the Internet.
• Rather than simply aggregating information, SemantiNet’s technology integrates information and mashes it up as needed.
• The idea is to bring the relevant online content to the user rather than the user to the content.
SemantiNet Demo
Example of Social Graph
Delver
• Delver (formerly Semingo) is headquartered in Herzliya and will officially open U.S. offices in Silicon Valley in the spring of 2008.
• Goal: provide a semantic search engine that allows users to search for information created and referenced by their own social graph.
Delver Rationale
• Delver provides a “connected search engine” that allows users to find content, media and people within their network via a simple search interface.
• Delver organizes and ranks content from the user’s network because social connections are critical for discovering more personally relevant information.
• It indexes the social Web (social networks, blogs, social applications, etc.), and cross-connects the data with users’ social graph.
• Improves the relevancy of Web search results by prioritizing these results based upon the specific searcher’s social network.
Delver Repository
• Delver begins by crawling the Web in order to map users’ social connections.
• It specifically indexes people's social connections on Flickr, MySpace, LinkedIn, YouTube, hi5, Facebook and Blogger, with more sites being added all the time.
• Instead of just looking at a Web site's popularity, Delver looks at information like whether your friends have tagged the site or if it's found on their social network profiles, bookmarking sites, photos and video sharing sites, or on their blogs.
• The results are more relevant because they account for who a person is and what that person finds valuable.
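Ranking by social connections rather than by site popularity alone can be sketched as a re-ordering step over ordinary search results. The code below is a hypothetical illustration, not Delver's algorithm: the user names, the `.example` sites, the tagging data and the scoring rule (number of the searcher's connections who tagged a result) are all assumptions.

```python
from typing import Dict, List, Set

# Hypothetical social graph and tagging data; not Delver's actual index.
FRIENDS: Dict[str, Set[str]] = {
    "alice": {"bob", "carol"},
    "dave": {"erin"},
}
TAGGED_BY: Dict[str, Set[str]] = {
    "site-1.example": {"erin"},
    "site-2.example": {"bob", "carol"},
    "site-3.example": set(),
}


def social_rank(user: str, results: List[str]) -> List[str]:
    """Order results by how many of the user's connections tagged each one."""
    circle = FRIENDS.get(user, set())

    def score(url: str) -> int:
        return len(TAGGED_BY.get(url, set()) & circle)

    # sorted() is stable, so ties keep the original (popularity) order.
    return sorted(results, key=score, reverse=True)
```

Given the same popularity-ordered result list, "alice" and "dave" see different orderings because their social graphs differ.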
Liad Agmon
Venture Funding
Delver Semantics
• Delver knows who a user is and who the user’s friends are, even if the user has not imported an address book or added "Social Networking" profiles.
• Instead, Delver leverages the social graph to map out a user's social connections.
• Since everyone's social graph is unique, like a fingerprint, the same Delver query will yield significantly different results for each user – as reflected through the collective experiences of each person’s contacts.
• The results are more personal and meaningful to users than a generic search using a "normal" search engine.
Delver Dynamics
• When a user performs a query, results from all over his social Web are displayed.
• Even if a user and others are not directly related as "friends" on a social network, the plus sign beneath the picture can still be clicked to add them as a connection.
• This way, a user can view the relevant bookmarks, links, blog posts, photos, and videos of people like him even if he doesn’t know them personally... and they don't have to confirm the connection on their end.
• Alternately, a user can choose to exclude certain connections from his search results.
Roi Carthy
Visit New York
Radar Networks Twine
• Radar Networks, a pioneer of Semantic Web technology, introduced Twine.
• Goal: enable individuals and groups to organize, share and discover information and knowledge around their interests.
Twine Rationale
• Twine is a "knowledge networking" tool designated as a revolutionary Semantic Web application.
• It is a new service that helps organize, share and discover information about user interests, with networks of like-minded people.
• Twine can be used alone, with friends, groups and communities, or even in a company.
• It has aspects of social networking, wikis, blogging, knowledge management systems – but its defining feature is that it's built with Semantic Web technologies.
• It aims to bring a usable and scalable interface to the long-promised dream of the Semantic Web.
Twine Repository
• Using Twine, a user can:
– add content via Wiki functionality (has many post types)
– email content into the system
– and "collect" something (as an object, e.g., a book object).
• Twine ties it all together:
– As information is added to Twine, it is automatically tagged so that it can be easily found.
– Users can connect with individuals and groups, gather and share content, and engage in discussions around interests.
– Twine connects users with new people, content and products that match their interests, and also helps users discover other people and their contributions.
Twine Semantics
• Twine is powered by semantic understanding.
• At first glance it is very much like Wikipedia, but there is a whole lot more smarts to the system.
• It's not based around socializing, but aims to share information and automatically organize it, learn about user interests, and make varied connections and recommendations.
• The more it is used, the better it understands the user’s interests and the more useful it becomes.
• It is a "Semantic Graph", which maps relationships to both people and topics.
Twine Sign In
Twine Dynamics
• Enables user commenting and viewing of related things.
• Allows sharing of tags.
• Enables import and export of users’ own data.
• RSS feeds to track all kinds of things (topics, events, search, etc.).
• Semantic Web technologies are being used: RDF, OWL, SPARQL, XSL, GRDDL.
• An open platform - there will be SPARQL and REST APIs.
Welcome Steve to Twine
Explore Green Business and Investing
Steve Smith’s Twine
Explore Green Tech
Semantically up? :-(
Where does the MetaWeb fit!?