
A Dynamic Level-k Model in Games

Teck Ho and Xuanming Su

UC Berkeley

March, 2010 Teck Hua Ho

A 4-Stage Centipede Game

[Figure: 4-stage centipede game. Players A and B move alternately (A, B, A, B). Payoffs (A, B) at outcomes 1 to 5: (4, 1), (2, 8), (16, 4), (8, 32), (64, 16).]

                       Outcome
Round                  1       2       3       4       5
1-5                    6.2%    30.3%   35.9%   20.0%   7.6%
6-10                   8.1%    41.2%   38.2%   10.3%   2.2%
Backward Induction     100%    0%      0%      0%      0%

A 6-Stage Centipede Game

[Figure: 6-stage centipede game. Players A and B move alternately (A, B, A, B, A, B). Payoffs (A, B) at outcomes 1 to 7: (4, 1), (2, 8), (16, 4), (8, 32), (64, 16), (32, 128), (256, 64).]

                       Outcome
Round                  1       2       3       4       5       6       7
1-5                    0.0%    5.5%    17.2%   33.1%   33.1%   9.0%    2.1%
6-10                   1.5%    7.4%    22.8%   44.1%   16.9%   6.6%    0.7%
Backward Induction     100%    0%      0%      0%      0%      0%      0%


Outline

Backward induction and its systematic violations

Dynamic Level-k model and the main theoretical results

Resolving well-known paradoxes:
o Cooperation in finitely repeated prisoner’s dilemma
o Chain-store paradox

An empirical application: The centipede game

Alternative explanations
o Reputation-based story
o Social preferences


Backward Induction Principle

Backward induction is the most widely accepted principle to generate predictions in dynamic games of complete information

Extensive-form games (e.g., Centipede)

Finitely repeated games (e.g., Repeated PD and chain-store paradox)

Multi-person dynamic programming

For the principle to work, every player must be willing to bet on others’ rationality


Violations of Backward Induction

Well-known violations in economic experiments include (http://en.wikipedia.org/wiki/Backward_induction):

Passing in the centipede game

Cooperation in the finitely repeated PD

Chain-store paradox

Likely to be a failure of the mutual consistency condition (different people make different initial bets on others’ rationality)


Standard Assumptions in Equilibrium Analysis

                        Solution Method
Assumptions             Backward Induction    DLk Model
Strategic Thinking      X                     X
Best Response           X                     X
Mutual Consistency      X                     ?
Instant Equilibration   X                     ?

Notations

S: Total number of subgames (indexed by s)

I: Total number of players (indexed by i)

N_s: Total number of players who are active at subgame s

[Figure: 4-stage centipede game. Payoffs (A, B) at outcomes 1 to 5: (4, 1), (2, 8), (16, 4), (8, 32), (64, 16).]

$S = 4,\ I = 2,\ N_1 = N_2 = N_3 = N_4 = 1$

Deviation from Backward Induction

$\delta(L_1, \ldots, L_I, G) = \frac{1}{S} \sum_{s=1}^{S} \left[ \frac{1}{N_s} \sum_{i=1}^{N_s} D_s(L_i, L_\infty) \right]$

$D_s(L_i, L_\infty) = \begin{cases} 1, & a_s(L_i) \neq a_s(L_\infty) \\ 0, & \text{otherwise} \end{cases}$

$0 \leq \delta(\cdot) \leq 1$


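Examples

[Figure: 4-stage centipede game. Payoffs (A, B) at outcomes 1 to 5: (4, 1), (2, 8), (16, 4), (8, 32), (64, 16).]

Ex1:

$L_A = \{P, -, T, -\},\ L_B = \{-, P, -, T\};\ L_\infty = \{T, T, T, T\}$

$\delta(L_A, L_B, G_4) = \frac{1}{4}[1 + 1 + 0 + 0] = \frac{1}{2}$

Ex2:

$L_A = \{P, -, T, -\},\ L_B = \{-, T, -, T\};\ L_\infty = \{T, T, T, T\}$

$\delta(L_A, L_B, G_4) = \frac{1}{4}[1 + 0 + 0 + 0] = \frac{1}{4}$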
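The two examples can be checked mechanically. The following is a minimal sketch, assuming each rule is written as a tuple with one entry per subgame and "-" where the player is not active; the function name `deviation` is made up for this illustration.

```python
from fractions import Fraction

def deviation(rules, bi_rule):
    """Deviation from backward induction:
    (1/S) * sum over subgames s of (1/N_s) * sum over active players of D_s.

    rules:   one tuple of actions per player, "-" where the player is inactive
    bi_rule: the backward-induction action at each subgame (L_infinity)
    """
    S = len(bi_rule)
    total = Fraction(0)
    for s in range(S):
        # Actions of the players who are active at subgame s
        active = [rule[s] for rule in rules if rule[s] != "-"]
        # D_s = 1 when the action differs from the backward-induction action
        mismatches = sum(1 for a in active if a != bi_rule[s])
        total += Fraction(mismatches, len(active))
    return total / S

L_inf = ("T", "T", "T", "T")
ex1 = deviation([("P", "-", "T", "-"), ("-", "P", "-", "T")], L_inf)
ex2 = deviation([("P", "-", "T", "-"), ("-", "T", "-", "T")], L_inf)
print(ex1, ex2)  # 1/2 1/4
```

Exact fractions are used so the results match the slide values without rounding.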

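Systematic Violation 1: Limited Induction

[Figure: the 4-stage and 6-stage centipede games. Payoffs (A, B) at the terminal outcomes are (4, 1), (2, 8), (16, 4), (8, 32), (64, 16) in the 4-stage game; the 6-stage game adds (32, 128) and (256, 64).]

$\delta(L_A, L_B, G_4) < \delta(L_A, L_B, G_6)$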

Limited Induction in Centipede Game

[Figure 1: Deviation in 4-stage versus 6-stage game (1st round)]

Systematic Violation 2: Time Unraveling

[Figure: 4-stage centipede game. Payoffs (A, B) at outcomes 1 to 5: (4, 1), (2, 8), (16, 4), (8, 32), (64, 16).]

$\delta(L_A(t), L_B(t), G) \to 0 \text{ as } t \to \infty$


Time Unraveling in Centipede Game

[Figure 2: Deviation in 1st vs. 10th round of the 4-stage game]

Outline

Dynamic Level-k model and the main theoretical results

Research question

To develop a good descriptive model to predict the probability of player i (i=1,…,I) choosing strategy j at subgame s (s=1,…,S) in any dynamic game of complete information:

$P_{ij}(s)$

Criteria of a “Good” Model

Nests backward induction as a special case

Behaviorally plausible
o Heterogeneous in their bets on others’ rationality

Captures limited induction and time unraveling

Fits data well

Simple (with as few parameters as the data would allow)


Standard Assumptions in Equilibrium Analysis

                        Solution Method
Assumptions             Backward Induction    Hierarchical Strategizing
Strategic Thinking      X                     X
Best Response           X                     X
Mutual Consistency      X                     Heterogeneous Bets
Instant Equilibration   X                     Learning


Dynamic Level-k Model: Summary

Players choose rule from a rule hierarchy

Players make differential initial bets on others’ chosen rules

After each game play, players observe others’ rules (e.g., strategy method)

Players update their beliefs on rules chosen by others

Players always choose a rule to maximize their subjective expected utility in each round


Dynamic Level-k Model: Rule Hierarchy

Players choose rule from a rule hierarchy generated by best-responses

Rule hierarchy: $L_0, L_1, L_2, \ldots$ with $L_k = BR(L_{k-1})$

Restrict $L_0$ to follow behavior proposed in the existing literature

$L_\infty = BI$


Dynamic Level-k Model: Poisson Initial Belief

Different people make different initial bets on others’ chosen rules

Poisson distributed initial beliefs:

$f(K) = \frac{e^{-\lambda} \lambda^{K}}{K!}$

λ: average belief of rules used by opponents

f(k): fraction of players who think that their opponents use the L_{k-1} rule
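To see how the Poisson weights spread across rule levels, the probabilities can be tabulated for a given λ. This is a minimal sketch; the value λ = 1.5 is a hypothetical choice for illustration and is not an estimate from the slides.

```python
import math

def poisson_belief(K, lam):
    """Poisson probability f(K) = exp(-lam) * lam**K / K!."""
    return math.exp(-lam) * lam ** K / math.factorial(K)

lam = 1.5  # hypothetical value, chosen only for illustration
for K in range(6):
    print(K, round(poisson_belief(K, lam), 4))
```

A larger λ shifts weight toward higher levels, so more players start out betting that their opponents use more sophisticated rules.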


Dynamic Level-k Model: Belief Updating at the End of Round t

Initial belief strength: $N_k^i(0) = \beta$

Update after observing which rule opponent chose:

$N_k^i(t) = N_k^i(t-1) + I(k,t) \cdot 1$

$B_k^i(t) = \frac{N_k^i(t)}{\sum_{k'=0}^{S} N_{k'}^i(t)}$

I(k, t) = 1 if opponent chose L_k and 0 otherwise

Bayesian updating involving a multinomial distribution with a Dirichlet prior (Fudenberg and Levine, 1998; Camerer and Ho, 1999)
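The updating rule amounts to adding one to the count of the observed rule and renormalizing. The sketch below assumes, for illustration only, that a player's initial weight β sits entirely on one level (here L1) and that the opponent is observed to play L3 in three consecutive rounds; the function names are hypothetical.

```python
def update_counts(counts, observed_level):
    """N_k(t) = N_k(t-1) + I(k, t): add 1 to the count of the observed rule."""
    new_counts = list(counts)
    new_counts[observed_level] += 1
    return new_counts

def beliefs(counts):
    """B_k(t) = N_k(t) / sum over k' of N_k'(t)."""
    total = sum(counts)
    return [n / total for n in counts]

beta = 0.5
# Assumption: all initial belief weight is placed on L1 (levels L0..L4)
counts = [0.0, beta, 0.0, 0.0, 0.0]
for _ in range(3):
    counts = update_counts(counts, observed_level=3)  # opponent plays L3

print(counts)           # [0.0, 0.5, 0.0, 3.0, 0.0]
print(beliefs(counts))  # weight 1/7 on L1 and 6/7 on L3
```

A small β means the initial bet is quickly outweighed by observations, while a large β makes beliefs slow to move.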

Dynamic Level-k Model: Optimal Rule in Round t+1

Optimal rule k*:

$k^* = \arg\max_{k=1,\ldots,S} \sum_{k'=1}^{S} \left\{ B_{k'}^i(t) \cdot \sum_{s=1}^{S} \pi(a_{ks}, a_{k's}) \right\}$

Let the specified action of rule L_k at subgame s be a_{ks}
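The optimal-rule choice is a belief-weighted argmax. The sketch below assumes the inner sum over subgames has already been collapsed into a table `payoff[k][k2]`, the payoff of own rule k against opponent rule k2; the table entries and belief vector are hypothetical numbers, not taken from the centipede game, and indices start at 0 as usual in Python.

```python
def optimal_rule(beliefs, payoff):
    """Return the index of the rule with the highest subjective expected
    payoff, together with the expected payoff of every rule.

    beliefs[k2]:    belief that the opponent uses rule k2
    payoff[k][k2]:  payoff of own rule k against opponent rule k2
                    (assumed already summed over subgames)
    """
    expected = [
        sum(b * row[k2] for k2, b in enumerate(beliefs))
        for row in payoff
    ]
    best = max(range(len(payoff)), key=lambda k: expected[k])
    return best, expected

# Hypothetical 3-rule payoff table and belief vector, for illustration only
payoff = [
    [3.0, 1.0, 0.0],
    [4.0, 2.0, 1.0],
    [2.0, 3.0, 2.0],
]
best, expected = optimal_rule([0.2, 0.5, 0.3], payoff)
print(best)  # 2
```

When two rules tie, `max` returns the lower index; the slides do not specify a tie-breaking convention, so that choice is an assumption of this sketch.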


The Centipede Game (Rule Hierarchy)

Player A       Player B
(P,-,P,-)      (-,P,-,P)
(P,-,P,-)      (-,P,-,T)
(P,-,T,-)      (-,T,-,P)
(P,-,T,-)      (-,T,-,T)
(T,-,T,-)      (-,T,-,T)


Player A in 4-Stage Centipede Game

Belief counts N_k^i(t), with β = 0.5

Round (t)   L0   L1   L2   L3   L4   Rule Used by Opponent   Optimal Rule (Player A)
0                β                                           L2
1                β         1         L3                      L2
2                β         2         L3                      L2
3                β         3         L3                      L4

Dynamic Level-k Model: Summary

Players choose rule from a rule hierarchy

Players make differential initial bets on others’ chosen rules

After each game play, players observe others’ rules (e.g., strategy method)

Players update their beliefs on rules chosen by others

Players always choose a rule to maximize their subjective expected utility in each round

A 2-parameter extension of backward induction (λ and β)


Main Theoretical Results: Limited Induction

$\delta(L_A, L_B, G_4) < \delta(L_A, L_B, G_6)$


Main Theoretical Results: Time Unraveling

$\delta(L_A(t), L_B(t), G) \to 0 \text{ as } t \to \infty$


Iterated Prisoner’s Dilemma (Rule Hierarchy)

Stage-game payoffs:

3,3   0,5
5,0   1,1

Level   Strategy
0       TFT*
1       TFT,D
2       TFT,D,D
3       TFT,D,D,D
k       TFT,D_1,…,D_k

* Kreps et al. (1982)
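The hierarchy can be simulated directly. The sketch below rests on two assumptions: that "TFT,D_1,…,D_k" means tit-for-tat followed by defection in the last k rounds, and that the payoff matrix lists (row, column) payoffs with cooperation first. The function names and the horizon T = 10 are chosen for illustration.

```python
# Stage-game payoffs (row player, column player); C = cooperate, D = defect
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def action(level, t, T, opp_prev):
    """Level-k rule: tit-for-tat, then defect in the last `level` rounds.
    t is the 1-indexed round; opp_prev is the opponent's previous action."""
    if t > T - level:
        return "D"
    # Tit-for-tat: cooperate first, then copy the opponent's last action
    return "C" if opp_prev is None else opp_prev

def play(level_a, level_b, T):
    """Total payoffs when a level_a rule meets a level_b rule over T rounds."""
    prev_a = prev_b = None
    total_a = total_b = 0
    for t in range(1, T + 1):
        a = action(level_a, t, T, prev_b)
        b = action(level_b, t, T, prev_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        total_a += pay_a
        total_b += pay_b
        prev_a, prev_b = a, b
    return total_a, total_b

print(play(0, 0, 10))  # (30, 30)
print(play(1, 0, 10))  # (32, 27)
```

Under these assumptions the level-1 rule gains against level 0 only by defecting in the final round, which is the sense in which each rule best-responds to the one below it.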


Main Theoretical Results

$\delta(L_A, L_B, G_T) < \delta(L_A, L_B, G_{T'});\ T' > T$

$\delta(L_A(t), L_B(t), G) \to 0 \text{ as } t \to \infty$


Properties of Level-0 Rule

Maximize group payoff: A level-0 player always chooses a decision that, if others do the same, will lead to the largest total payoff for the group (e.g., TFT in RPD)

Protect individual payoff: While maximizing group payoff, a level-0 player also ensures that the chosen decision rule is robust against continued exploitation by others (e.g., TFT in RPD)


Chain-Store Paradox (Rule Hierarchy)

[Figure: stage game. The entrant (E) chooses OUT or IN; after IN, the chain store (CS) chooses FIGHT or SHARE. Payoffs: OUT gives 5 to CS and 1 to E; FIGHT gives 0 to each; SHARE gives 2 to each.]

Level   Chain Store (CS)        Entrant
0       FIGHT (F)               GTR: OUT unless CS is observed to share (then ENTER (E))
1       F,F,F,..,F,F,S          GTR
2       F,F,F,..,F,S,S          GTR,E
3       F,F,..,F,S,S,S          GTR,E,E
k       F,..,F,S_1,..,S_k       GTR,E_1,..,E_{k-1}


Outline

An empirical application: The centipede game

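4-Stage versus 6-Stage Centipede Games

[Figure: the 4-stage and 6-stage centipede games. Payoffs (A, B) at the terminal outcomes are (4, 1), (2, 8), (16, 4), (8, 32), (64, 16) in the 4-stage game; the 6-stage game adds (32, 128) and (256, 64).]
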


Empirical Regularities

4-stage game:

           Outcome
Round      1       2       3       4       5
1-5        6.2%    30.3%   35.9%   20.0%   7.6%
6-10       8.1%    41.2%   38.2%   10.3%   2.2%

6-stage game:

           Outcome
Round      1       2       3       4       5       6       7
1-5        0.0%    5.5%    17.2%   33.1%   33.1%   9.0%    2.1%
6-10       1.5%    7.4%    22.8%   44.1%   16.9%   6.6%    0.7%
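One way to read these tables is to collapse each row into a mean terminal outcome. This summary statistic is not in the slides; it is introduced here only as a rough sketch, using the percentages above, with backward induction corresponding to a mean of 1.

```python
def mean_outcome(shares_pct):
    """Average terminal outcome, weighting outcome number by its share (in %)."""
    return sum(o * p for o, p in enumerate(shares_pct, start=1)) / 100.0

four_stage = {
    "1-5": [6.2, 30.3, 35.9, 20.0, 7.6],
    "6-10": [8.1, 41.2, 38.2, 10.3, 2.2],
}
six_stage = {
    "1-5": [0.0, 5.5, 17.2, 33.1, 33.1, 9.0, 2.1],
    "6-10": [1.5, 7.4, 22.8, 44.1, 16.9, 6.6, 0.7],
}

for name, table in (("4-stage", four_stage), ("6-stage", six_stage)):
    for rounds, shares in table.items():
        print(name, rounds, round(mean_outcome(shares), 3))
```

In both games the mean falls from rounds 1-5 to rounds 6-10, so play ends earlier with experience, while still ending well after the first node that backward induction predicts.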


Dynamic Level-k Model’s Prediction in 4-Stage Game


Dynamic Level-k Model’s Prediction in 6-Stage Game


MLE Model Estimates

Special cases are rejected

Both heterogeneity and learning are important


Model Predictions


Alternative 1: Gang of Four’s Story (Kreps et al., 1982)

θ = proportion of altruistic players (level-0 players)


Gang of Four’s Predictions (LL = -955.7)


Alternative 2: Social Preferences


Conclusions

Dynamic level-k model is an empirical alternative to BI

Captures limited induction and time unraveling

Explains violations of BI in centipede game

Explains paradoxical behaviors in 2 well-known games (cooperation in finitely repeated PD, chain-store paradox)

Dynamic level-k model can be considered a tracing procedure for backward induction (since the former converges to the latter as time goes to infinity)

