

Graduate Microeconomics II
Lecture 9: Repeated Games

Patrick Legros


Outline

Introduction

Example: efficiency wages

Folk Theorems
  - Optimality Principle
  - Dynamic Programming Formulation
  - Folk Theorems
  - Finite Horizon


Introduction

Static vs dynamic relationships:

- At any point in time there is a past and a future.
- From a game-theoretic perspective, agents can condition their play today on what has happened in the past.
- Hence the past may affect the future, and this may induce agents to take actions that are not equilibrium actions in the static situation.

Examples of repeated relationships:

- Marriage
- Coca-Cola vs. Pepsi
- Political parties and voters


Example: Prisoner’s dilemma

Consider the (stage) game $\Gamma$ where each agent has two strategies $\{C, D\}$ and payoffs are given by

          C       D
   C    2, 2    0, 3
   D    3, 0    1, 1

D is a strictly dominant strategy and equilibrium payoffs are (1, 1).


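The feasible set of payoffs is shown in the figure (the unique static Nash equilibrium is the red dot).

[Figure: feasible set of payoffs; axes are 1’s payoff and 2’s payoff, with ticks at 1, 2, 3; the static Nash equilibrium is marked by a red dot and part of the set is shaded.]

We will show that when the game is repeated infinitely often, as long as the players are “patient enough”, they can attain as equilibrium outcomes any payoffs in the shaded region.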


Example: prisoner’s dilemma
Infinite Repetition

The repeated game $\Gamma^\infty$ is defined by:

- Each period, agents choose action $C$ or $D$.
- Each agent has the same discount factor $\delta \in (0, 1)$.
- If an agent receives a flow of payoffs $\{u(1), u(2), \dots, u(t), \dots\}$, his utility is

\[ U = (1-\delta) \sum_{t=1}^{\infty} \delta^{t-1} u(t). \]

Note that when an agent gets the same payoff $u$ each period, his utility is $(1-\delta)\, u \sum_{t=1}^{\infty} \delta^{t-1} = u$, since $\sum_{t=1}^{\infty} \delta^{t-1} = 1/(1-\delta)$.

The $(1-\delta)$ factor serves as a normalization and facilitates the comparison between static and repeated games, since one can compare the per-period payoff in each case.
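The normalization can be checked numerically. The sketch below is only an illustration: the function name `normalized_discounted_utility` is hypothetical, and the infinite sum is truncated to a finite list of payoffs, so the result is an approximation.

```python
def normalized_discounted_utility(payoffs, delta):
    """Return (1 - delta) * sum over t >= 1 of delta**(t-1) * u(t) for a finite payoff list."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    # enumerate starts at 0, which matches the exponent t - 1 for t = 1, 2, ...
    return (1 - delta) * sum(delta ** k * u for k, u in enumerate(payoffs))


# A constant payoff of 2 over a long (truncated) horizon should give roughly 2.
approx = normalized_discounted_utility([2.0] * 2000, delta=0.9)
print(approx)
```

With a constant stream the truncated value approaches the per-period payoff, which is exactly the comparison the normalization is meant to make easy.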


Example: prisoner’s dilemma
Behavioral strategies

At time $t$, an agent observes the sequence of past actions. A history at $t$ is therefore $h(t) \in H(t) = (\{C, D\} \times \{C, D\})^{t-1}$.

- For instance, at time $t = 2$ (tomorrow) the history can be $(C, C)$, $(C, D)$, $(D, C)$ or $(D, D)$.
- At time $t = 3$ (the day after tomorrow) a possible history is $((C, D), (D, C))$, etc.

A history is nothing other than the (common) information set of the two agents. A (behavioral) strategy is then a choice of an action at each history: for agent $i$,

\[ \sigma_i : H(t) \to [0, 1], \]

that is, $\sigma_i(h(t))$ is the probability that agent $i$ chooses action $C$ at information set $h(t)$.
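As a rough illustration of these definitions, histories can be enumerated and a behavioral strategy written as a function from a history to the probability of playing C. The names `histories` and `tit_for_tat` are hypothetical, and tit-for-tat is used only as a familiar example strategy; it does not appear in the lecture.

```python
from itertools import product

ACTIONS = ("C", "D")
# One period's outcome is a pair (action of agent 1, action of agent 2).
PROFILES = list(product(ACTIONS, repeat=2))


def histories(t):
    """All histories h(t): sequences of the t - 1 past action profiles."""
    return list(product(PROFILES, repeat=t - 1))


def tit_for_tat(history):
    """Behavioral strategy of agent 1: probability of playing C at the given history."""
    if not history:          # t = 1: empty history, start by cooperating
        return 1.0
    last_other = history[-1][1]  # agent 2's action in the previous period
    return 1.0 if last_other == "C" else 0.0


print(len(histories(2)), len(histories(3)))
```

The number of histories grows as 4 to the power t - 1, which is one reason the lecture later reduces equilibrium checks to one-period deviations.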


Example: prisoner’s dilemma
Trigger strategy

We show here that playing $C$ each period is part of an equilibrium strategy, and therefore that the per-period payoff of each agent is 2, as long as the discount factor is “large enough.” (A similar argument shows that any feasible payoffs that dominate $(1, 1)$ can be an equilibrium outcome.)

Consider the (trigger) strategy $\sigma$:

- at $t = 1$, play $C$;
- at $t \ge 2$, play $C$ if $h(t) = (C, C)^{t-1}$, otherwise play $D$.

If agents follow this strategy, their per-period payoff is $u(C, C) = 2$. To verify that this strategy is a Nash equilibrium, it is enough to consider one-period deviations (we will prove this later).

Two types of deviations: in and out of equilibrium.


Prisoner’s dilemma
Deviations

In equilibrium: at time $t$ we reach a history that is part of the equilibrium, that is, $h(t) = (C, C)^{t-1}$. Consider agent 1 and suppose that he deviates to $D$. Since the history at $t + 1$ will not be part of the equilibrium, each agent plays $D$ from $t + 1$ on. Hence deviating at $t$ gives an average payoff of

\[ (1-\delta) \cdot 3 + \delta \cdot 1. \]

Since the per-period payoff from not deviating is 2, agent 1 does not want to deviate if and only if

\[ (1-\delta) \cdot 3 + \delta \cdot 1 \le 2 \iff \delta \ge \frac{1}{2}. \]


Out of equilibrium: suppose that the history is out of equilibrium. Then each agent should play $D$.

- Given the definition of $\sigma$, independently of what actions are played at $t$, the history at $t + 1$ will not be $(C, C)^{t}$ and therefore both agents will play $D$ from $t + 1$ on.
- Hence once an out-of-equilibrium history is reached, the situation is strategically equivalent to the static game.
- If agent 1 deviates to $C$, his payoff today is 0 rather than 1, but his per-period payoff after $t$ is the same.

Hence there is no beneficial deviation if $\delta \ge \frac{1}{2}$, proving that the strategy $\sigma$ is a Nash equilibrium.
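The two one-period deviation checks can be sketched numerically. This is a minimal illustration using the stage-game payoffs of the prisoner’s dilemma from the lecture; the names `PAYOFF` and `trigger_is_equilibrium` are hypothetical.

```python
# Stage-game payoff of an agent: PAYOFF[(own action, other agent's action)]
PAYOFF = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}


def trigger_is_equilibrium(delta):
    """Check one-period deviations from the trigger strategy at both kinds of history."""
    v_coop = PAYOFF[("C", "C")]    # per-period payoff on the equilibrium path
    v_punish = PAYOFF[("D", "D")]  # per-period payoff once (D, D) is played forever
    # In equilibrium: deviate to D today, then (D, D) from tomorrow on.
    deviate_in_eq = (1 - delta) * PAYOFF[("D", "C")] + delta * v_punish
    # Out of equilibrium: deviate to C today; (D, D) is played from tomorrow on anyway.
    deviate_out_eq = (1 - delta) * PAYOFF[("C", "D")] + delta * v_punish
    return deviate_in_eq <= v_coop and deviate_out_eq <= v_punish


for d in (0.4, 0.5, 0.9):
    print(d, trigger_is_equilibrium(d))
```

The out-of-equilibrium check never binds, so the threshold is driven entirely by the in-equilibrium deviation, as in the derivation above.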


Example: efficiency wages

(Shapiro and Stiglitz 1984; Ford.) Consider the following stage game:

- The firm offers the worker a wage $w$.
- The worker accepts or rejects the offer.
- If the worker rejects $w$, he becomes self-employed and earns $w_0$.
- If the worker accepts $w$, he chooses either to supply effort $e$ (at cost $e$) or to shirk (at zero cost).
- Effort is not observed but output is: output is either high ($y$) or low ($0$); the probability of high output is 1 if effort is $e$ and is $p$, $p < 1$, if the worker shirks. However, it is not possible to make compensation a function of output.
- Hence if the worker accepts $w$, the (firm, worker) payoffs are

\[
\begin{cases}
(y - w,\; w - e) & \text{if the worker exerts effort,} \\
(py - w,\; w) & \text{if the worker shirks.}
\end{cases}
\]


Example: efficiency wages
Repeated game

Assume that

\[ y - e > w_0 > py. \]

While it is socially efficient for the worker to work in the firm, the unique subgame perfect equilibrium in the stage game is for the worker to accept any wage $w$ greater than $w_0$ and to shirk.

Anticipating this, the firm is willing to offer a wage equal to at most $py$, but since $py < w_0$ the worker will reject this offer. The labor market breaks down.

In a repeated game, the firm can use the observation of the outputto decide whether to retain or fire the worker.


Example: efficiency wages
Strategies

Set $w^* > w_0$; the value of $w^*$ will be chosen so that the strategies below are stable (no agent has a profitable deviation). At time $t$, a history for the firm includes:

- the sequence of wage offers $w(\tau)$, $\tau = 1, \dots, t - 1$;
- the sequence of the worker’s decisions to accept or refuse the offer;
- when the worker accepts the offer, the value of output at the end of the period.

In addition, a history for the worker includes his effort choice after accepting the wage offer.

Say that the history is $H$ if in every previous period the firm offered wage $w^*$, the worker accepted and the output was $y$.

Firm: At $t = 1$ the firm offers $w^*$. For $t \ge 2$, the offer is $w^*$ only if the history is $H$; otherwise the offer is 0.

Worker: Accept any offer $w \ge w_0$; if the history is $H$ and the current offer is $w^*$, exert effort; if the history is not $H$ or the current offer is not $w^*$, shirk.


Efficiency wages
Deviation by the firm

If the firm does not deviate, it gets a per-period payoff of $y - w^*$.

In equilibrium:

- If the firm deviates to $w \ne w^*$ with $w \ge w_0$, the worker accepts the contract and shirks, and later on the firm offers 0; hence the firm gets at most $py - w_0$ today and 0 from tomorrow on. Hence we need $y - w^* \ge (1-\delta)(py - w_0)$, which holds whenever $y \ge w^*$ since by assumption $py < w_0$.
- If the firm deviates to $w < w_0$, the worker chooses self-employment and the firm gets 0; hence we need $y - w^* \ge 0$.

Out of equilibrium (past history is not $H$):

- If the firm offers $w \ge w_0$, the worker will shirk (since the past history is not $H$), the firm gets at most $py < w_0$ in expected output, and the firm makes losses.


Efficiency wages
Deviation by the worker

In equilibrium (the history is $H$ and the firm offers $w^*$ today): if the worker accepts and exerts effort, he gets $w^* - e$.

- Accepting the offer dominates self-employment (obvious).
- Suppose the worker accepts but shirks. Output is $y$ with probability $p$ and 0 otherwise. If output is $y$, the history is still $H$ next period; if it is 0, the history is no longer $H$.
- Let $V$ be the continuation payoff if the history is $H$. If the history is not $H$, we have seen that the worker will choose self-employment.
- If $W$ is the per-period utility if the worker shirks, we have (one-period deviation)

\[ V = (1-\delta)(w^* - e) + \delta V \tag{1} \]

\[ W = (1-\delta)\, w^* + \delta\, [\, pV + (1-p)\, w_0 \,] \tag{2} \]


and (1) and (2) imply

\[ V = w^* - e, \]

and incentive compatibility requires $V \ge W$, or

\[ w^* \ge w_0 + \frac{1-\delta p}{\delta(1-p)}\, e. \]

Hence it is indeed necessary for the firm to give a rent to the worker in order to avoid shirking.

Out of equilibrium: if the history is not $H$, the firm offers 0 and the worker optimally chooses self-employment.
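The algebra behind the worker’s incentive constraint can be spelled out as follows. This only substitutes $V = w^* - e$ from (1) into (2) and rearranges; no assumption beyond $\delta \in (0,1)$ and $p < 1$ is used.

```latex
\begin{align*}
V \ge W
&\iff w^* - e \ge (1-\delta)\, w^* + \delta \bigl[ p\,(w^* - e) + (1-p)\, w_0 \bigr] \\
&\iff \delta (1-p)\, w^* \ge (1 - \delta p)\, e + \delta (1-p)\, w_0 \\
&\iff w^* \ge w_0 + \frac{1 - \delta p}{\delta (1-p)}\, e .
\end{align*}
```

Since $(1-\delta p)/(\delta(1-p)) > 1$ for $\delta < 1$, the wage must exceed $w_0 + e$, which is the rent mentioned above.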


Efficiency wages
Summary

The firm IC is $y \ge w^*$, the worker IC is $w^* \ge w_0 + \frac{1-\delta p}{\delta(1-p)}\, e$, and efficiency wages can indeed lead to workers not shirking in a firm only if

\[ y - w_0 \ge \frac{1-\delta p}{\delta(1-p)}\, e. \]

For instance, if $p = 0$ the right-hand side is $e/\delta$ and the condition is

\[ \delta \ge \frac{e}{y - w_0}. \]
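The two incentive constraints can be combined into a small feasibility check. The sketch below is illustrative only: the function names are hypothetical and the parameter values are made up, chosen to satisfy the maintained assumption $y - e > w_0 > py$.

```python
def min_efficiency_wage(w0, e, p, delta):
    """Smallest wage w* satisfying the worker's incentive constraint."""
    return w0 + (1 - delta * p) / (delta * (1 - p)) * e


def efficiency_wage_feasible(y, w0, e, p, delta):
    """True if some w* satisfies both the worker IC and the firm IC (y >= w*)."""
    return y >= min_efficiency_wage(w0, e, p, delta)


# Made-up numbers with y - e > w0 > p * y  (8 > 4 > 2.5)
y, w0, e, p = 10.0, 4.0, 2.0, 0.25
for delta in (0.2, 0.9):
    print(delta, min_efficiency_wage(w0, e, p, delta), efficiency_wage_feasible(y, w0, e, p, delta))
```

With these numbers a patient worker (delta = 0.9) can be motivated by a wage below output, while an impatient one (delta = 0.2) would require a wage above y, so the relationship cannot be sustained.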


Intertemporal preferences

Consider a stage game $\Gamma = (N, \{S_i\}, \{u_i\})$ and the associated repeated game $\Gamma^\infty$.

We usually assume that agents’ intertemporal preferences are represented by the average discounted utility: if $\delta \in (0, 1)$ is the discount factor, then $U = (1-\delta) \sum_{t=1}^{T} \delta^{t-1} u(t)$, where $T$ could be $\infty$.

Alternatives:

- Finite averaging: $(1/T) \sum_{t=1}^{T} u(t)$ for $T$ finite, and if $T = \infty$, use the infinite time average $\lim^*_{T \to \infty} (1/T) \sum_{t=1}^{T} u(t)$, where $\lim^*$ can be $\liminf$ or $\limsup$. Results can sometimes be quite different with this intertemporal representation of preferences.
- Overtaking criterion: a sequence $\{u(t)\}$ dominates a sequence $\{v(t)\}$ if there exists $T_0$ such that for all $T \ge T_0$, $\sum_{t=1}^{T} u(t) \ge \sum_{t=1}^{T} v(t)$.


The principle of optimality
Perfect information

The history at $t$ is $h(t) \in H(t) = \left( \times_{i=1}^{N} S_i \right)^{t-1}$. A behavioral strategy $\sigma_i$ is a choice of a (possibly mixed) action at each information set. Denote by $E_\sigma$ the expectation operator over histories generated by the $N$-tuple of strategies $\sigma$. Then write the expected payoff to player $i$ when strategy $\sigma$ is played as

\[ U_i(\sigma) = (1-\delta)\, E_\sigma \left[ \sum_{t=1}^{\infty} \delta^{t-1} u_i(\sigma(t)) \right]. \]

Consider the subgame starting at $h(t)$. Denote by $\sigma_i^{h(t)}$ the restriction of the behavioral strategy to information sets in the subgame starting at $h(t)$, and evaluate the expected payoff of each player by using the same normalization as before.

$\sigma$ is a subgame perfect equilibrium if for each $h(t)$, $\sigma^{h(t)}$ is a Nash equilibrium in the subgame originating at $h(t)$.


Notation: write $\sigma_i * s_i(h(t))$ for the strategy where $i$ plays $s_i(h(t))$ at $h(t)$ and $\sigma_i(h)$ at all other information sets.

Principle of optimality (it is enough to consider one-period deviations): $\sigma$ is a subgame perfect equilibrium if and only if for each $i$, each $t$ and each history $h(t)$, there does not exist an action $s_i(h(t))$ feasible at $h(t)$ such that, with payoffs evaluated in the subgame starting at $h(t)$,

\[ U_i(\sigma) < U_i(\sigma_{-i}, \sigma_i * s_i(h(t))). \tag{3} \]


Step 1: necessity is immediate (since $\sigma_i$ is a best response, one-period deviations cannot improve $i$’s payoff).

Step 2: suppose that there is no beneficial one-period deviation but that there is a beneficial deviation $\hat\sigma_i$ in which $i$ deviates finitely many times. Since there are finitely many deviations, there exist $T$ and $h(T)$ such that $i$ uses $\hat\sigma_i(h(T)) \ne \sigma_i(h(T))$ and uses $\sigma_i$ for all $t \ge T + 1$. But then by (3), $i$ must actually (weakly) prefer using $\sigma_i(h(T))$ to $\hat\sigma_i(h(T))$. Repeating the argument for the earlier deviations leads to a contradiction.

Notation: Consider two behavioral strategies $\sigma_i$ and $\hat\sigma_i$. Let $(\sigma_i * \hat\sigma_i)^T$ be the strategy that coincides with $\sigma_i$ for $t \ge T + 1$ and with $\hat\sigma_i$ for $t \le T$.


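Step 3: Suppose that $\hat\sigma_i$ improves on $\sigma_i$ and contains infinitely many deviations. Hence, there exists $\varepsilon > 0$ such that

\[ U_i(\sigma_{-i}, \hat\sigma_i) - U_i(\sigma) > \varepsilon. \tag{4} \]

Since payoffs are bounded, there exists $M < \infty$ such that $0 \le u_i(s) < M$ for all $s$ (normalizing payoffs to be nonnegative), and we have for any $T$

\[ U_i(\sigma_{-i}, \hat\sigma_i) - U_i(\sigma_{-i}, (\sigma_i * \hat\sigma_i)^T) \le \delta^{T-1} M. \tag{5} \]

But then, combining (4) and (5), we have for all $T$,

\[ U_i(\sigma_{-i}, (\sigma_i * \hat\sigma_i)^T) - U_i(\sigma) > \varepsilon - \delta^{T-1} M. \]

Choosing $T$ large enough, the right-hand side of the inequality is positive. But then $(\sigma_i * \hat\sigma_i)^T$, which contains only finitely many deviations, improves on $\sigma_i$, contradicting Step 2.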
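How large $T$ must be in Step 3 can be made concrete. The following sketch simply searches for the smallest $T$ with $\varepsilon - \delta^{T-1} M > 0$; the function name `smallest_truncation` and the numerical values are hypothetical illustrations.

```python
def smallest_truncation(eps, delta, M):
    """Smallest T >= 1 such that eps - delta**(T - 1) * M > 0."""
    if eps <= 0 or M <= 0 or not 0 < delta < 1:
        raise ValueError("need eps > 0, M > 0 and 0 < delta < 1")
    T = 1
    # delta**(T - 1) decreases geometrically to 0, so the loop terminates.
    while eps - delta ** (T - 1) * M <= 0:
        T += 1
    return T


T = smallest_truncation(eps=0.1, delta=0.9, M=3.0)
print(T)
```

The required horizon grows as delta approaches 1 or as eps shrinks, but it is always finite, which is all the argument needs.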


Optimality principle
Leads to a dynamic programming formulation

Let $V \subseteq \mathbb{R}^N$ be the set of subgame perfect equilibrium payoffs: that is, $v \in V \iff \exists\, \sigma$, an SPE, such that $v = U(\sigma)$.

Consider an SPE $\sigma$; by the optimality principle it is enough to verify that for each $h(t)$, each agent uses an optimal action at this information set while keeping the same behavioral strategies in the following subgame.

Let $E_\sigma$ be the expectation operator over $H(t+1)$ when $\sigma$ is played at $h(t)$. For each $h(t+1)$ the continuation utility is $U(\sigma^{h(t+1)})$. Applying the optimality principle, we need to verify that for all $\hat\sigma_i$,

\[
(1-\delta)\, u_i(\sigma_{-i}(h(t)), \hat\sigma_i) + \delta\, E_{\sigma_{-i}(h(t)),\, \hat\sigma_i} U_i(\sigma^{h(t+1)})
\le (1-\delta) \underbrace{u_i(\sigma(h(t)))}_{\text{today}} + \delta \underbrace{E_{\sigma(h(t))} U_i(\sigma^{h(t+1)})}_{\text{tomorrow}} .
\]


Folk theorems

Theorem (Friedman 1971) Let $G$ be a finite, static game of complete information. Let $u$ denote the payoffs from a Nash equilibrium of $G$ and let $v$, with $v_i > u_i$ for all $i$, be a feasible payoff. Then if $\delta$ is sufficiently close to 1, there exists an SPE of the infinitely repeated game $G^\infty(\delta)$ that achieves $v$ as the average payoff.

Theorem (Fudenberg-Maskin 1986) Let $u$ be the vector of minmax payoffs in $G$. Let $v$ be a feasible payoff with $v_i > u_i$ for all $i$, and suppose that the feasible set has full dimension. Then there exists $\underline{\delta} < 1$ such that for all $\delta \in (\underline{\delta}, 1)$, $v$ is an average equilibrium payoff in the repeated game $G^\infty(\delta)$.
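To make the minmax benchmark concrete, the sketch below computes minmax payoffs for a two-player game and tests whether a target payoff strictly exceeds them. It is a simplification under a stated assumption: only pure actions are considered, whereas the theorem’s minmax allows mixed punishments, which can be lower in general. The function names are hypothetical; the game is the prisoner’s dilemma from the earlier slides.

```python
def pure_minmax(payoffs, n_rows, n_cols):
    """Pure-action minmax payoffs of a two-player game.

    payoffs[(r, c)] = (row player's payoff, column player's payoff).
    """
    # Column player picks c to minimize the row player's best-response payoff.
    row = min(max(payoffs[(r, c)][0] for r in range(n_rows)) for c in range(n_cols))
    # Row player picks r to minimize the column player's best-response payoff.
    col = min(max(payoffs[(r, c)][1] for c in range(n_cols)) for r in range(n_rows))
    return row, col


def strictly_individually_rational(v, payoffs, n_rows, n_cols):
    """True if every player's target payoff strictly exceeds his pure minmax payoff."""
    return all(vi > mi for vi, mi in zip(v, pure_minmax(payoffs, n_rows, n_cols)))


# Prisoner's dilemma; index 0 = C, index 1 = D
PD = {(0, 0): (2, 2), (0, 1): (0, 3), (1, 0): (3, 0), (1, 1): (1, 1)}
print(pure_minmax(PD, 2, 2))
```

In the prisoner’s dilemma the minmax point coincides with the static Nash payoffs, so both theorems deliver the same set of sustainable payoffs there.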


Folk theorems
Remarks

- Full dimensionality is needed in the second theorem when there are 3 or more players (the condition can be weakened).
- Any feasible $v$ can be written as a convex combination of payoffs corresponding to pure strategies: $v = \sum_{s \in S} \alpha(s)\, u(s)$, where $\alpha(s) \ge 0$ and $\sum_{s \in S} \alpha(s) = 1$.
- If $\alpha(s)$ is a rational number for each $s$, then the theorems go through. For the first theorem: let the players play each $s$ with frequency $\alpha(s)$, and revert to the Nash equilibrium if they fail to do so. The second theorem is a little more involved, since if $i$ deviates the other players must make him get his minmax payoff in a credible way.
- If $\alpha(s)$ is not rational for some $s$, then one can only approximate $v$ (since rational numbers are dense in $\mathbb{R}$).
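The idea of playing each $s$ with frequency $\alpha(s)$ can be sketched as a finite cycle of action profiles. This is an illustration under assumptions: the function names and the weights are hypothetical, the payoffs are those of the prisoner’s dilemma, and the average computed is the undiscounted average over one cycle (with discounting, the normalized payoff only approaches it as $\delta$ tends to 1). `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def cycle_for_weights(alpha):
    """Finite cycle of action profiles in which each s appears with frequency alpha[s]."""
    if sum(alpha.values()) != 1:
        raise ValueError("weights must sum to 1")
    length = lcm(*(w.denominator for w in alpha.values()))
    cycle = []
    for s, w in alpha.items():
        cycle.extend([s] * int(w * length))  # w * length is an integer by construction
    return cycle


def average_payoff(cycle, u):
    """Undiscounted average payoff vector over one cycle (two players)."""
    n = len(cycle)
    return tuple(sum(Fraction(u[s][i]) for s in cycle) / n for i in range(2))


U = {("C", "C"): (2, 2), ("C", "D"): (0, 3), ("D", "C"): (3, 0), ("D", "D"): (1, 1)}
alpha = {("C", "C"): Fraction(1, 2), ("C", "D"): Fraction(1, 4), ("D", "C"): Fraction(1, 4)}
cycle = cycle_for_weights(alpha)
print(cycle, average_payoff(cycle, U))
```

Exact rational arithmetic is used so that the average matches the convex combination exactly rather than up to rounding.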


Finite horizon

Go back to the prisoner’s dilemma.

- If $G$ is repeated only finitely many times, in the last period $T$ agents use their dominant strategy independently of the past history.
- Hence at $T - 1$, agents cannot change the play at $T$ and also use their dominant strategy.
- Repeating the argument, the agents play $(D, D)$ each period.

Does this imply that agents can improve on static Nash equilibrium payoffs only when $T = \infty$?


Finite horizon
Benoit-Krishna

- Actually not. The specificity of the prisoner’s dilemma game is that there is a unique Nash equilibrium outcome in the stage game.
- If $T$ is finite but there are at least two Nash equilibrium outcomes that are not ranked similarly by two players, then there are equilibrium outcomes of the repeated game that improve on the stage game equilibrium payoffs.
- One can even get the Folk theorem when there are two players (Benoit-Krishna 1985).


Finite horizon
Example 1

          a2      b2      c2
   a1    5, 3    0, 0    2, 0
   b1    0, 0    2, 2    0, 0
   c1    0, 0    0, 0    0, 0

Here there are two (pure-strategy) Nash equilibria in the stage game: (a1, a2) and (b1, b2). If the game is repeated twice, playing (c1, c2) in the first period and (a1, a2) at time 2 is an SPE; deviations are avoided by playing (b1, b2) at time 2 following any deviation at time 1.
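The stage-game equilibria of Example 1 can be confirmed by brute force. The sketch below enumerates pure-strategy Nash equilibria only (mixed equilibria are ignored); the function name `pure_nash_equilibria` is hypothetical and the payoffs are those of the Example 1 table.

```python
def pure_nash_equilibria(payoffs, rows, cols):
    """All pure action profiles at which neither player has a profitable deviation."""
    equilibria = []
    for r in rows:
        for c in cols:
            u1, u2 = payoffs[(r, c)]
            best_for_1 = all(payoffs[(r2, c)][0] <= u1 for r2 in rows)
            best_for_2 = all(payoffs[(r, c2)][1] <= u2 for c2 in cols)
            if best_for_1 and best_for_2:
                equilibria.append((r, c))
    return equilibria


ROWS = ("a1", "b1", "c1")
COLS = ("a2", "b2", "c2")
EXAMPLE_1 = {
    ("a1", "a2"): (5, 3), ("a1", "b2"): (0, 0), ("a1", "c2"): (2, 0),
    ("b1", "a2"): (0, 0), ("b1", "b2"): (2, 2), ("b1", "c2"): (0, 0),
    ("c1", "a2"): (0, 0), ("c1", "b2"): (0, 0), ("c1", "c2"): (0, 0),
}
print(pure_nash_equilibria(EXAMPLE_1, ROWS, COLS))
```

Note that (c1, c2) is not a stage-game equilibrium, since player 1 gains by switching to a1; it can be played in the first period only because the second-period equilibrium depends on first-period behavior.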


Finite horizon
Example 2

          a2      b2      c2      d2
   a1    4, 4    0, 5    0, 0    0, 0
   b1    0, 0    2, 1    0, 0    0, 0
   c1    5, 0    0, 0    1, 2    0, 0
   d1    0, 0    0, 0    0, 0    3, 3

There are three (pure-strategy) Nash equilibria in the stage game: (b1, b2), (c1, c2), (d1, d2). Repeat the game twice: play (a1, a2) at t = 1 and (d1, d2) at t = 2. This is an SPE: if 1 deviates at t = 1, play (c1, c2) at t = 2; if 2 deviates at t = 1, play (b1, b2) at t = 2.
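The first-period incentives in Example 2 can be verified with a short check. This is a sketch under assumptions: the function name is hypothetical, the second-period profiles are taken to be stage-game Nash equilibria (so no second-period deviation is checked), and the discount factor `delta` on the second period is an added parameter that the example itself does not specify (the default 1.0 means no discounting).

```python
ROWS = ("a1", "b1", "c1", "d1")
COLS = ("a2", "b2", "c2", "d2")
# GRID[i][j] = (player 1's payoff, player 2's payoff), from the Example 2 table
GRID = [
    [(4, 4), (0, 5), (0, 0), (0, 0)],
    [(0, 0), (2, 1), (0, 0), (0, 0)],
    [(5, 0), (0, 0), (1, 2), (0, 0)],
    [(0, 0), (0, 0), (0, 0), (3, 3)],
]
U = {(r, c): GRID[i][j] for i, r in enumerate(ROWS) for j, c in enumerate(COLS)}


def no_profitable_first_period_deviation(first, reward, punish_1, punish_2, delta=1.0):
    """Check that neither player gains by deviating in period 1 of the two-period game."""
    r, c = first
    # Player 1: best deviation against c today, followed by punish_1 tomorrow.
    on_path_1 = U[first][0] + delta * U[reward][0]
    deviate_1 = max(U[(r2, c)][0] for r2 in ROWS if r2 != r) + delta * U[punish_1][0]
    # Player 2: best deviation against r today, followed by punish_2 tomorrow.
    on_path_2 = U[first][1] + delta * U[reward][1]
    deviate_2 = max(U[(r, c2)][1] for c2 in COLS if c2 != c) + delta * U[punish_2][1]
    return on_path_1 >= deviate_1 and on_path_2 >= deviate_2


plan = (("a1", "a2"), ("d1", "d2"), ("c1", "c2"), ("b1", "b2"))
print(no_profitable_first_period_deviation(*plan))
```

Each player compares 4 + 3 on the path with 5 + 1 after the best deviation, so cooperation in the first period is sustained; under the added discounting assumption the same comparison holds only for delta of at least 1/2.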
