1 web search from information retrieval to microeconomic modeling prabhakar raghavan yahoo! research

56
1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

Post on 20-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

1

Web SearchFrom Information Retrieval to Microeconomic Modeling

Prabhakar Raghavan

Yahoo! Research

Page 2: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

2Yahoo! Research

What is web search?

• Access to “heterogeneous”, distributed information– Heterogeneous in creation– Heterogeneous in accuracy– Heterogeneous in motives

• Multi-billion dollar business– Source of new opportunities in marketing

• Strains the boundaries of trademark and intellectual property laws

• A source of unending technical challenges

Page 3: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

3Yahoo! Research

The coarse-level dynamics

Content creators Content aggregators

Feeds

Crawls

Content consumers

Adv

erti

sem

ent

Edi

tori

al

Sub

scri

ptio

nT

rans

acti

on

Page 4: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

4Yahoo! Research

Page 5: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

5Yahoo! Research

Brief (non-technical) history

• Early keyword-based engines

– Altavista, Excite, Infoseek, Inktomi, Lycos, ca. 1995-1997

• Paid placement ranking: Goto (morphed into Overture Yahoo!)

– Your search ranking depended on how much you paid

– Auction for keywords: casino was expensive!

Page 6: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

6Yahoo! Research

Brief (non-technical) history

• 1998+: Link-based ranking pioneered by Google

– Blew away all early engines except Inktomi

– Great user experience in search of a business model

– Meanwhile Goto/Overture’s annual revenues were nearing $1 billion

Page 7: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

7Yahoo! Research

Brief (non-technical) history

• Result: Google added “paid-placement” ads to the side, separate from search results

• 2003: Yahoo follows suit, acquiring Overture (for paid placement) and Inktomi (for search)

Page 8: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

8Yahoo! Research

Algorithmic results CPC Advertisements

Page 9: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

9Yahoo! Research

Editorial

User reviews

Ads

Page 10: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

10Yahoo! Research

Types of content

• Editorial content: books, music, professionally-produced websites

• User-generated content: blogs, reviews, bulletin boards, groups, etc.

• Total web growth: 1-3M pages/day– Not “real” growth– Think text content…

Courtesy: Andrew Tomkins

Page 11: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

11Yahoo! Research

Total text content

• 6B people type 4 hrs/day at 100 wpm

• Storage: 52PB/yr = Cost: $25M/yr

• In another 5 years, this looks about like the cost of having 10 people on your payroll

• Conclusion 1: any company with tens of people can store every bit of text produced by every human on the planet

• Conclusion 2: no scale-based differentiation around text content

(of course, not all content is text…)Courtesy: Andrew Tomkins

Page 12: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

12Yahoo! Research

User generated content (UGC)

• New content– 2 billion pages of editorial content– Tiny number of songs, etc– 5-10 billion pages UGC exist (already ~10%

of consumption), growing

• Note: UGC did not exist as a category a couple of years ago!– Rapidly becoming a key growth area of

consumed web content – but we don’t know how to process it!

Courtesy: Andrew Tomkins

Page 13: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

13

Tags The simplest form of UGC

Is the Turing test always the right question?

Page 14: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

14Yahoo! Research

Page 15: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

15Yahoo! Research

The power of social tagging

• Flickr – community phenomenon

• Millions of users share and tag each others’ photographs (why???)

• The wisdom of the crowd can be used to search

• The principle is not new – anchor text used in “standard” search

• Don’t try to pass the Turing test?

Page 16: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

16Yahoo! Research

Anchor text

• When indexing a document D, include anchor text from links pointing to D.

www.ibm.com

Armonk, NY-based computergiant IBM announced today

Joe’s computer hardware linksCompaqHPIBM

Big Blue today announcedrecord profits for the quarter

Page 17: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

17Yahoo! Research

Challenges in tag-based search

• How do we use these tags for better search?

• How do you cope with spam?

• What’s the ratings and reputation system?

• The bigger challenge: where else can you exploit the power of the people?

• What are the incentive mechanisms?– Luis von Ahn (CMU): The ESP Game

Page 18: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

18Yahoo! Research

Page 19: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

19

More UGC: Social search

Indexing the knowledge in people’s heads

Page 20: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

20Yahoo! Research

Page 21: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

21Yahoo! Research

Page 22: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

22Yahoo! Research

Social content Social capital

Page 23: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

23Yahoo! Research

Incentives

Page 24: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

24Yahoo! Research

Incentives

• What assignment of incentives leads to good user behavior?

– What’s “good” user behavior?

– Good questions, good answers, new questions?

• Whom do you trust and why?

– Propagation of trust and distrust.

Page 25: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

25Yahoo! Research

Ratings and reputation

• Node reputation: Given a DAG with– a subset of nodes called GOOD

– another subset called BAD

– Find a measure of goodness for all other nodes.

• Node pair reputation: Given a DAG with a real-valued trust on the edges– Predict a real-valued trust for ordered node

pairs not joined by an edge

Metric labelling

Page 26: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

26

CPC advertisements

What pays the bills

Page 27: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

27Yahoo! Research

Ads

Page 28: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

28Yahoo! Research

Generic questions

• Of the various advertisers for a keyword, which one(s) get shown?

• What do they pay on a click through?

• The answers turn out to draw on insights from microeconomics

Page 29: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

29Yahoo! Research

Ads go in slots like this one

and this one.

Page 30: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

30Yahoo! Research

Advertisers generally prefer this slot

to this one

to this one

to this one.

Page 31: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

31Yahoo! Research

Click through rate r1 = 200 per hour

r2 = 150 per hour

r3 = 100 per hour

etc.

Page 32: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

32Yahoo! Research

Why did witbeckappliance win

over ristenbatt?

Page 33: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

33Yahoo! Research

First-cut assumption

• Click-through rate depends only on the slot, not on the advertisement

• In fact not true; more on this later.

Page 34: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

34Yahoo! Research

Advertiser’s value

• We assume that an advertiser j has a value vj per click through

– Some measure of downstream profit

• Say, click-through followed by• 96% of the time, no purchase

• 0.7% buy Dishwasher, profit $500

• 1.2% buy Vacuum Cleaner, profit $200

• 2.1% buy Cleaning agents, profit $1$ 5.921

Page 35: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

35Yahoo! Research

Example

• For the keyword miele, say an advertiser has a value of $10 per click.

• How much should he bid?

• How much should he be charged?

The value of a slot for an advertiser,

what he bids and

what he is charged, may all be different.

Page 36: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

36Yahoo! Research

Advertiser’s payoff in ad slot i

(Click-through rate) x (Value per click) – (Payment to search engine)

= ri vj – (Payment to Engine)

= ri vj – pij

Payment ofadvertiser j

in slot iFunction of all other bids.

Page 37: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

37Yahoo! Research

Two auction pricing mechanisms

• First price: The winner of the auction is the highest bidder, and pays his bid.

• Second price: The winner is the highest bidder, but pays the second-highest bid.

• Engine decides and announces pricing.

• What should an advertiser bid?

Not truthful.

Page 38: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

38Yahoo! Research

Second-price = Vickrey auction

• Consider first a single advt slot

• Winner pays the second-highest bid

• Vickrey: Truth-telling is a dominant strategy for each player (advertiser)

– No incentive to “game” or fake bids

Page 39: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

39Yahoo! Research

Auctions and pricing: multiple slots

• Overture’s model:

– Ads displayed in order of decreasing bid

– E.g., if advertiser A bids 10, B bids 2, C bids 4 – order ACB

• How do you price slots? Generalized Vickrey?

– Generalized second-price (GSP)

– Vickrey-Clark-Groves (VCG): each advertiser pays the externality he imposes on others

Page 40: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

40Yahoo! Research

Bidder A, $10

Bidder C, $4

Bidder B, $2

Pays 4

Pays 2

Generalized Second Price auction pricing

Page 41: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

41Yahoo! Research

VCG pricing

• Suppose click rates are 200 in the top slot, 100 in the second slot

• VCG payment of the second player (C) is 2 x 100 = 200

• For the first player, 4x(200-100) + 200

Externality on third player B.

Externality on C. Externality on B.

Page 42: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

42Yahoo! Research

VCG and GSP

• Truth-telling is a dominant strategy under VCG …

• Truth-telling not dominant under GSP!

Edelman, Ostrovsky, Schwarz

Page 43: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

43Yahoo! Research

VCG and GSP

• Static equilibrium of GSP is locally envy-free: no advertiser can improve his payoff by exchanging bids with advertiser in slot above.

• Depending on the mechanism, revenue varies: GSP ≥ VCG.

Edelman, Ostrovsky, Schwarz

Locally envy-free mechanisms correspondto Stable Marriage solutions.

Page 44: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

44Yahoo! Research

GSP for bid-ordering

• What’s good about bid-ordering and GSP?

–Advertisers like transparency

• What’s wrong with bid-ordering?

Page 45: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

45Yahoo! Research

Brand advertising?

Bid ordering(former Yahoo! order)

Page 46: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

46Yahoo! Research

Page 47: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

47Yahoo! Research

Revenue ordering

• Simplified version of Google’s ordering

– Each ad j has an expected click-through denoted CTRj

– Advertiser j’s bid is denoted bj

• Then, expected revenue from this advertiser is Rj = bj+1 x CTRj

• Order advertisers by Rj

– Payment by GSP

Page 48: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

48Yahoo! Research

Page 49: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

49Yahoo! Research

“current” Yahoo! ordering

Page 50: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

50Yahoo! Research

“Squashed” ordering

• Overture/Old Yahoo! scheme

– Order ads by bid

• Google (puportedly)

– Order by bid click-through rate (CTR)

• Squashing (Lahaie/Pennock)

• Key – advertisers react to mechanism!

s=0 s=1Order by bid*(CTR)sOverture Google?

Page 51: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

51Yahoo! Research

Page 52: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

52Yahoo! Research

Where do we go next?

• Premise:

– People don’t want to search

– People want to get tasks done

I want to book a vacation in Tuscany.Start Finish

Page 53: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

53Yahoo! Research

What is missing?

• Information integration

– Information extraction

– Schema normalization

• Mining social structure

– Tags, UGC

Welcome to The SavoyLocated on The Strand in the heartof the West End theatre district,

hotel near leicester squareSearch

Page 54: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

54Yahoo! Research

Computational microeconomics

• Reputation and incentive mechanisms

• Matching marketplaces– Jobs, dates, …

– Online matching everywhere• Hardest part is estimating the payoffs, not the

matching algorithm

• “Network effects”– Are 500 million users 500 times as valuable

as a million users? 5000 times?

Page 55: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

55Yahoo! Research

A new convergence

• Monetization and economic value an intrinsic part of system design

– Not an afterthought

– Mistakes are costly!

• Computing meets humanities like never before – sociology, economics, anthropology …

Page 56: 1 Web Search From Information Retrieval to Microeconomic Modeling Prabhakar Raghavan Yahoo! Research

56

Thank you.

[email protected]

http://research.yahoo.com