modular information aggregation with combinatorial prediction markets robin hanson department of...

Modular Information Aggregation with Combinatorial

Prediction Markets

Robin HansonDepartment of EconomicsGeorge Mason University

Weather Forecasting

• A Canonical Forecasting Arena

• Many public standardized forecasts

• Fast feedback, so lots of stats

• Forecasting analysis pioneer – methods, validation, aggregation, etc.

• Place to try out forecasting innovations?

Buy Low, Sell High

pricebuy

buy

sell

sell

Will price rise or fall?

E[ price change | ?? ]

Lots of ?? get tried,price includes all!

“Pays $1 if Bush wins”

(All are “gambling” “prediction” “info”)

Today’s Current Prices

7-19% Bird Flu confirmed in US. By Dec. ’07

11-13% Bin Laden caught by ‘08.

54-60% Gonzales resigns by ‘08

16-18% US or Israel air strike on Iran by ‘08.

21-25% China overt military act on Taiwan by 2011

51-52% Darling next UK Chancellor

56-59% Conservatives win next UK election

TradeSports.com

In direct compare, beats alternatives

• Vs. Public Opinion – I.E.M. beat presidential election polls 451/596 (Berg et al ‘01)– Re NFL, beat ave., rank 7 vs. 39 of 1947 (Pennock et al ’04)

• Vs. Public Experts– Racetrack odds beat weighed track experts (Figlewski ‘79)

• If anything, track odds weigh experts too much! – OJ futures improve weather forecast (Roll ‘84) – Stocks beat Challenger panel (Maloney & Mulherin ‘03)– Gas demand markets beat experts (Spencer ‘04)– Econ stat markets beat experts 2/3 (Wolfers & Zitzewitz ‘04)

• Vs. Private Experts– HP market beat official forecast 6/8 (Plott ‘00)– Eli Lily markets beat official 6/9 (Servan-Schreiber ’05)– Microsoft project markets beat managers (Proebsting ’05)

Hollywood Stock Exchange

2000 Oscar Assessor Error

Feb 18 HSX prices 1.08

Feb 19 HSX prices 0.854

Feb 18 Tom 1.08

Feb 18 Jen 1.25

Feb 18 John 1.22

Feb 18 Fielding 1.04

Feb 18 DPRoberts 0.874

Columnist Consensus 1.05

Science 291:987-988, February 9 2001

Track Odds Beat Handicappers

Figlewski (1979) Journal of Political Economy

14

Estimated on 146 races, tested on 46

Economic Derivatives MarketStandard Error of Forecast

Non-Farm Payrolls

Retail Trade ISM Manuf. Pur. Index

Ave of 50 Experts

99.2 0.55 1.12

Market 97.3 0.58 1.20

Sample size 16 12 11

Wolfers & Zitzewitz “Prediction Markets”(2004) Journal of Economic Perspectives

Prediction Performance of MarketsRelative to Individual Experts

020406080

100120140160180200220240260280300

1 2 3 4 5 6 7 8 9 10 11 12 13 14

Week into the NFL season

Ra

nk NewsFutures

Tradesports

NFL Markets vs Individuals

Servan-Schreiber, Wolfers, Pennock & Galebach (2004)Prediction Markets: Does Money Matter? Electronic Markets, 14(3).

1,947 Forecasters

Average of Forecasts

Item 1988 1992 1996 2000 All

# big polls

59 151 157 229 596

Poll “wins”

25 43 21 56 145

Market “wins”

34 108 136 173 451

% market

58% 72% 87% 76% 76%

P-value 0.148 0.000 0.000 0.000 0.000

“Accuracy and Forecast Standard Error of Prediction Markets”Joyce Berg, Forrest Nelson and Thomas Rietz, July 2003.

Iowa Electronic Markets vs. Polls

“Accuracy and Forecast Standard Error of Prediction Markets”Joyce Berg, Forrest Nelson and Thomas Rietz, July 2003.

Cummulative number of companies that have implemented an internal prediction market

(lower bound estimate)

0

5

10

15

20

25

30

35

1997 1998 1999 2000 2001 2002 2003 2004 2005 2006

Source:

Inputs OutputsPrediction Markets

Status QuoInstitution

Compare! For Same

Not Experts vs. Amateurs

• Forecasting Institution Goal:– Given same participants, resources, topic– Want most accurate institution forecasts

• Separate question: who let participate?– Can limit who can trade in market

• Markets have low penalty for add fools– Hope: get more info from amateurs?

Advantages

• Numerically precise

• Consistent across many issues

• Frequently updated

• Hard to manipulate

• Need not say who how expert on what

• At least as accurate as alternatives

Old Tech Meets New

• To gain info, elicit probs p = {pi}i , Ep[x |A](Let verify state i later, N/Q = people/questions)

• Old tech (~1950+): Proper Scoring RulesN/Q 1: works well, N/Q 1: hard to combine

• New tech (~1990+): Info/Predict MarketsN/Q 1: works well, N/Q 1: thin markets

• The best of both: Market Scoring Rules– modular, lab tests, compute issues, …

Old Tech: Proper Scoring Rules

• When report r, state is i, reward is si(r)

p = argmaxr i pi si(r), i pi si(p) • Quadratic (G. Brier 1950) si 2ri – k rk

2

• Logarithmic (I. Good ‘52) si log(ri)– Unique: reward = likelihood (R. Winkler 1969)

• Offers info effort incentive (R. Clemen ‘02)– Stronger for simul. pick: pivot mech. (T. Page ‘88)

• In principle, can elicit complex joint distributions• Long used in forecasting, test scoring, lab expers

Old Tech Issues

Problems• Incentives• Number shy• Cognitive bias• Non risk-neutral• State-dependent utility• Combo explosion• Disagreements

Solutions• Proper scoring rules• Prob wheel, word menu• Corrections• Lottery payoffs• Insurance game• Dependence network• Dictator per Q, ??

Opinion Pool “Impossibile”

• Task: pool prob. T(A) from opinions pn(A) • Any 2 of IPP, MP, EB dictator (T= pd

) ! IPP = if A,B indep. in all pn, are indep. in TEB = commutes: pool, update on infoMP = commutes: pool, coarsen states field)

(MP T = n=0 wn pn , with wn indep. of A)

• Really want pool via belief origin theory– General solution: let best traders figure it out?

New Tech Issues

Problems• Incentives• Shy, complex utility• Combo explosion• Who expert on what• Cognitive bias, pooling• Thin markets (N/Q <~1)• What is independent

Solutions• Bet with each other • Same solutions• Same solutions • Self-select • Specialist traders• Market scoring rules• ??

Thin Market, No-Trade Problems

• Trade requires coordination– In Time: waiting offers suffer adverse selection– In Assets: far more possible assets than trades

• combo match call markets help some, but matching is NP-hard for all or nothing orders

– If expect traders rare, don’t bother to offer• Most possible markets do not exist (also illegal)

• No-trade among rational, info-motivated– Need fools, risk-hedgers, or outside subsidy

Accuracy

.001 .01 .1 1 10 100

Q/N = Estimates per trader

Market Scoring Rules

Old Tech Meet NewSimple Info Markets

thin marketproblem

Scoring Rulesopinion

poolproblem

q1

Scoring Rule = Demand Curve

• (L. Savage 1971)• Rule is demand qD(p)

– p0 solves q=0

• User sells q1, price discr.– Let p1 be user belief

• Note: = 0 if p1 = p0

– p0 is like rule belief

• Note: can reuse rule as qD(p) - q1 (i.e., p0→p1)– So this is a market maker!

Price

Quantity

1

0

$1 if Ap0

Demand

p1

Commodity:

Market Scoring Rule (MSR)

• MSRs act as scoring rules and info market makers– are sequentially reused combinatorial scoring rules

• Score rule: user t faces $ rule: si = si(pt) - si(pt-1)

“Anyone can use scoring rule if pay off last user”

• Automated market maker: price from net sales s

– Tiny sale fee: pi(s) i (sisi+i)

– Big sale fee: ipi(s(t)) si´(t) dt

– Log score rule gives: pi(s) = exp(si) / k exp(sk)

$ s(1)-s(0)

$ i if i

MSR Usage Concept

• User browses current probabilities, expectations– Can set assumptions, browse other values given them

– Can see market value of portfolio given assumptions

• Upon finding an “odd” current value E(x|A)– Can see value’s price/trade history

– Can see how far long/short are from past trades

• User proposes new value to replace old– Told exact “bet” required to implement change

– Can accept, and/or make book orders & trading agents

The Vision

• Detailed consensus forecasts always visible re many locations, times, conditions, etc.

• Authorized traders can change any estimates.– Possibly conditional on many other estimates.– Could write program to change many estimates.– At least until their account runs out.

• Win if moved estimate closer to truth.• Consistent winners gain more resources.• Could even privatize, and/or let public try.

Laboratory Tests

• Joint work with John Ledyard (Caltech), Takashi Ishida (Net Exchange)

• Caltech students, get ~$30/session• 6 periods/session, 12-15 minutes each• Trained in 3var session, return for 8var• Metric: Kulback-Leibler i qi log(pi /qi)

distance from market prices to Bayesian beliefs given all group info

Mechanisms Compared

• Survey Mechanisms (# cases: 3var, 8var)

– Individual Scoring Rule (72,144)

– Log Opinion Pool (384,144)

• Market Mechanisms– Simple Double Auction (24,18)

– Combined Value Call Market (24,18)

– MSR Market Maker (36,17)

Environments: Goals, Training

• Want in Environment: – Many vars, few related, guess which– Use both theory and data– Few people, specialize in variable sets – Can compute rational group estimates– Explainable, fast, neutral

• Training Environment: – 3 binary variables X,Y,Z, 23 = 8 combos– P(X=0) = .3, P(X=Y) = .2, P(Z=1)= .5 – 3 people, see 10 cases of: AB, BC, AC – Random map XYZ to ABC

Case A B C

1 1 -

1 2 1 -

0

3 1 -

0 4 1 -

0

5 1 -

0 6 1 -

1

7 1 -

1 8 1 -

0

9 1 -

0 10 0 -

0

Sum: 9 -

3

Same A B C

A -- --

4 B -- --

-- C -- --

--

(Actually: X Z Y )

-1

-0.5

0

0.5

1

0 5 10 15

MSR Info vs. Time – 7 prices

0 5 10 15

Minutes

0

-1

1

% Info Agg. =

KL(prices,group)KL(uniform,group)

1-

Experiment Environment

• 8 binary vars: STUVWXYZ• 28 = 256 combinations• 20% = P(S=0) = P(S=T)

= P(T=U) = P(U=V) = … = P(X=Y) = P(Y=Z)

• 6 people, each see 10 cases: ABCD, EFGH, ABEF, CDGH, ACEG, BDFH

• random map STUVWXYZ to ABCDEFGH

Case A B C D E F G H 1 0 1 0 1 - - - - 2 1 0 0 1 - - - - 3 0 0 1 1 - - - - 4 1 0 1 1 - - - - 5 0 1 1 1 - - - - 6 1 0 0 1 - - - - 7 0 1 1 1 - - - - 8 1 0 0 1 - - - - 9 1 0 0 1 - - - - 10 1 0 0 1 - - - -

Sum 6 3 4 10 - - - -

Same A B C D E F G H A -- 1 2 6 -- -- -- -- B -- -- 7 3 -- -- -- -- C -- -- -- 4 -- -- -- -- D -- -- -- -- -- -- -- -- …

(Really: W V X S U Z Y T )

MSR Info vs. Time – 255 prices

-1

-0.5

0

0.5

1

0 5 10 15

0 5 10 15

Minutes

0

-1

1

% Info Agg. =

KL(prices,group)KL(uniform,group)

1-

Combo Market Maker Best of 5 Mechs

3 Variables = 8 States

0.000

0.050

0.100

0.150

0.200

0.250

0.300

KL

Dis

tanc

e

8 Variables = 256 States

0.600

0.800

1.000

1.200

1.400

1.600

KL

Dis

tanc

e

Experiment Conclusions

• Experiments on complex info problems – Bayesian estimates far too high a standard

• 7 indep. prices from 3 people in < 4 minutes– Simple DA < Indiv. Score Rule ~ Opinion Pool

~ Combined Value < Market Scoring Rule

• 255 indep. prices from 6 people in < 4 min. – Combined Value ~ Simple DA ~ Indiv. Score

Rule < Opinion Pool ~ Market Scoring Rule

No Cost For Combos!

• Total cost: C = si true(pfinal) - si true(pinitial)

• Expected cost: E[C] i i (si(1i) - si())

• For log MSR: = S( = -i i log(i)

– Let state i = combination of base var values vi

– S(all var S(var, for var = {var value v}v

– So compared to cost of log rule for each var, all var/value combos cost no more!

How Compute?

• Simple: – store 2N probs, asset values– Can integrate book orders

• Feasible: overlapping var patches– With simple MSR per patch, is Markov field– Allow trade only if all vars in same patch

• How pick/change patch structure?

• Arbitrage to make patches agree?

Arbitrage Is Not Modular

1. Everyone agrees on prices2. Expert on gets new info,

trades3. Arbitrage updates all prices 4. Expert on has no new info, but

must trade to restore old info!

Bayesian/Markov Networks

• Local info trades not require distant corrections if updates follow Bayes’ rule

• Bayesian/Markov net tech does this– Off the shelf exact tech if net forms a tree– Many approx. techs made for non-trees

• Some development needed– Must update user assets as well as prices– Need robust to gaming on errors – stochastic?

Summary

• How elicit informed estimates?– Scoring rules if N/Q 1, info/predict markets if 1

• Market scoring rules do both: – Key insight … reused scoring rules are market makers– Browse billions of estimates, change ones want via bets– Lab tests confirm ability; are growing # groups using

• Log rule has many advantages– If bet on p(A|B), keeps p(B), I(,,I(,,) … – Noisy choices easier to interpret– Costs no more for all var/value combos– Computation simplified, but still issues to explore

modular information aggregation with combinatorial prediction markets robin hanson department of...

Documents

s i p i s i p

market markets

prediction info slide

s i r p

argmax r s i p i s i

state i

s i logr

infopredict markets