Why Search is the Problem

Search is the Problem: Hobbling a data-centric Internet. Tyler Bell (@twbell)

Upload: tyler-bell

Post on 03-Jul-2015


DESCRIPTION

Presentation for SemTech 2011

TRANSCRIPT

Page 1: Why Search is the Problem

Search is the Problem

Tyler Bell @twbell

Hobbling a data-centric Internet

Page 2: Why Search is the Problem

Hidden Slide Overview

1. Too User Centric
• Inserting people between data and its interpretation
• APIs designed, and enforced, for user consumption only

3. Any Ten Answers
• Idea of correspondence remains unfamiliar
• Creating explicit relationships on-the-fly remains rare

5. Tags Over Semantics
• Tokens are easy, but designed for the search use case
• Changing the state of our knowledge requires change in tags

1. Where things are going
• Changing role of APIs:
• Structured data easier to access via page (FB, Best Buy, etc.)
• Decline of entity search ‘gatekeeper APIs’
• Increase in machine to machine services
• APIs designed, and enforced, for user consumption only

Page 3: Why Search is the Problem

The Problem:

Search is now the norm: human disambiguation is demanded

Page 4: Why Search is the Problem
Page 5: Why Search is the Problem
Page 6: Why Search is the Problem

http://www.fondos-hq.com/upload/DesktopWallpapers/cache/Futurama-fondos-Caricaturas-HQ-dibujos-animados-futurama-caricaturas-1024x768.jpg

HCI Apogee

Page 7: Why Search is the Problem

The Artefact:

APIs are designed for user consumption only

Page 8: Why Search is the Problem

“You are permitted to use the Services only for the purpose of incorporating and displaying results […] as part of a Search Product deployed on Your Web site”

Yahoo! BOSS Services Terms of Use

http://www.flickr.com/photos/auntiep/17135231/

Page 9: Why Search is the Problem
Page 10: Why Search is the Problem

Employs probability matrices to disambiguate placename intent

Contains cross-references to entities in other namespaces

Designed for toponym and business name disambiguation

‘Venue harmonization’ with select data partners

ID cross-reference (one-way street only: ID tweets)

Page 11: Why Search is the Problem

You must not use or display the Content without a corresponding Google map

You are not permitted to use or provide any part of the Service or Content […] in an API that you offer to others

http://www.flickr.com/photos/auntiep/17135231/

Google Maps/Places Terms of Use

Page 12: Why Search is the Problem

http://twitter.com/#!/zachklein/status/55986694289227776

Page 13: Why Search is the Problem

http://continuations.com/post/4365211963/the-web-stp-challenge-making-apis-useful

We need more STP [Straight Through Processing] for the web so that we have fewer stove pipe services and can move to a seamless web instead. The obstacle is no longer a lack of APIs […] the problem is a lack of data mapping/unification services.

- Albert Wenger

http://twitter.com/#!/cdixon/status/49906284492881920

Page 14: Why Search is the Problem

The Problem:

Multiple electronic representations of one conceptual entity

Page 15: Why Search is the Problem
Page 16: Why Search is the Problem

Too much of this…

Not enough of this

21c7c504-537a-47d6-a4cf-e260ccb6620d
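One way to read the contrast on this slide is that every service keeps its own record for the same real-world thing, while a single stable identifier that ties those records together is rare. The sketch below shows what such a mapping could look like as a small in-memory crosswalk. It is an illustration under assumptions, not the method of any service named in this deck: the `Crosswalk` class, the namespace names and the local IDs are all hypothetical.

```python
import uuid
from collections import defaultdict


class Crosswalk:
    """Maps (namespace, local_id) pairs to one canonical entity ID and back.

    Hypothetical illustration: the class and method names are not taken
    from any real service.
    """

    def __init__(self):
        self._to_canonical = {}               # (namespace, local_id) -> canonical ID
        self._references = defaultdict(set)   # canonical ID -> {(namespace, local_id)}

    def link(self, canonical_id, namespace, local_id):
        """Record that a service-specific ID refers to the canonical entity."""
        key = (namespace, local_id)
        existing = self._to_canonical.get(key)
        if existing is not None and existing != canonical_id:
            raise ValueError(f"{key} is already linked to {existing}")
        self._to_canonical[key] = canonical_id
        self._references[canonical_id].add(key)

    def resolve(self, namespace, local_id):
        """Return the canonical ID for a service-specific ID, or None."""
        return self._to_canonical.get((namespace, local_id))

    def translate(self, namespace, local_id, target_namespace):
        """Return the IDs that the same entity has in another namespace."""
        canonical_id = self.resolve(namespace, local_id)
        if canonical_id is None:
            return []
        return sorted(
            lid
            for ns, lid in self._references[canonical_id]
            if ns == target_namespace
        )


# Usage with made-up namespaces and IDs.
pub = str(uuid.uuid4())  # one canonical ID for one conceptual entity
xw = Crosswalk()
xw.link(pub, "service_a", "a-42")
xw.link(pub, "service_b", "b-7")
same_entity = xw.resolve("service_a", "a-42") == xw.resolve("service_b", "b-7")
ids_in_b = xw.translate("service_a", "a-42", "service_b")
```

With a mapping like this, a program can move from one service's record to another's without a person searching by name and choosing among the results.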

Page 17: Why Search is the Problem

“With a single click you can recommend that raincoat, news article or favorite sci-fi movie to friends, contacts and the rest of the world”

Page 18: Why Search is the Problem
Page 19: Why Search is the Problem

“With a single click you can recommend that Webpage to friends, contacts and the rest of the world”

Page 20: Why Search is the Problem

http://tripletalk.wordpress.com/2011/01/25/rdfa-deployment-across-the-web/

Peter Mika, Jan 2011

Page 21: Why Search is the Problem

Webpage URLs are Entity URIs

Identifiers for people, places, things

http://developers.facebook.com/docs/opengraph/
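To make the idea of a page URL doubling as an entity identifier concrete, here is a minimal sketch that reads Open Graph `<meta property="og:..." content="...">` tags from a page using only the Python standard library. The sample HTML, the example.com URL and the helper name `extract_open_graph` are assumptions made for illustration; they are not taken from the slides or from any real page.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also arrive here via handle_startendtag.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        content = attr_map.get("content")
        if prop and prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


# Hypothetical page markup, for illustration only.
SAMPLE_PAGE = """
<html><head>
<meta property="og:type" content="restaurant" />
<meta property="og:url" content="http://www.example.com/biz/sample-pub" />
<meta property="og:title" content="Sample Pub" />
</head><body></body></html>
"""

entity = extract_open_graph(SAMPLE_PAGE)
entity_uri = entity.get("og:url")  # the page URL, used as the entity's identifier
```

The point of the sketch is that a program can read the type and identifier of the thing a page describes directly from the markup, with no search query in between.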

Page 22: Why Search is the Problem

http://www.yelp.com/biz/irish-times-san-francisco-2

Page 23: Why Search is the Problem
Page 24: Why Search is the Problem

enable publishers to give us hints about what things they are describing on their sites… markup [will] amplify the value [webmasters] receive in return

improve how their sites appear in major search engines… powering richer search results and new kinds of applications.

improve the search experience… alignment between search and our Web of Objects program

Page 25: Why Search is the Problem
Page 26: Why Search is the Problem

17.5m entities pointing to over…

1.5b references found across…

4.7m domains

US Local Dataset

Page 27: Why Search is the Problem

Datawire

http://www.flickr.com/photos/tigerplish/250836258/

TL;DR:
• Search: human disambiguation is expected
• Few inputs lead to ‘pull’, not ‘push’
• Plurality of content is a real bugger

The Good News:
• Structured content will do more than improve the look of search results
• Increased recognition of machine-to-machine APIs
• The socially networked world demands understanding across caissons

Page 28: Why Search is the Problem

Tyler Bell @twbell