
Post on 19-Dec-2015


Final Lecture

“Life can only be understood backwards; but it must be lived forwards.”

Søren Kierkegaard

Thoughts on subgame perfection?

Some Problems from Chapter 12

Problem 1, Chapter 12 Find a separating equilibrium for this game

Equilibrium in Signaling games

• In this signaling game, Player 1 is the sender and Player 2 is the receiver.

• In a Bayes’ Nash equilibrium for a signaling game, we need to specify the receiver’s beliefs.

• Then we check whether, when the receiver takes action based on these beliefs, the outcome is consistent with these beliefs.

Getting started

• Since player 1 can be one of two types and there are two possible actions A and B for player 1, there are only two possible strategies for player 1 that result in separating equilibria. These are
– Choose A if type s and B if type t
– Choose B if type s and A if type t

• Let’s see if either or both of these strategies “works”.

Strategies and beliefs

• Let’s see if we can find beliefs for the receiver (player 2) that make for a separating equilibrium where player 1 plays A if type s and B if type t.

• Recall that Player 2 sees what player 1 played, but does not see his type.

• But if senders are using the above strategy, then Player 2 believes that those who play A are type s and those who play B are type t.

Best responses for 2

• Then if player 2 sees action A, he believes that player 1 is type s and his best response given his beliefs is to take action y.

• If player 2 sees action B, he believes that player 1 is type t and his best response given these beliefs is to take action x.

Best responses for 1

• If Player 1 believes that player 2 plays y when he sees action A and x when he sees action B, what will Player 1 do?

• Look at the payoffs. If Player 1 is a type s, then he would rather that Player 2 play y than x. If he is of type t, he would rather Player 2 play x than y.

• So his best response to the way player 2 responds to messages is to send message A when he is type s and B when he is type t.

Beliefs confirmed

• So we see that if the receiver believes that the sender will send message A if he is type s and message B if he is type t, then in the resulting Nash equilibrium the receiver’s beliefs are confirmed. This is what happens.

Another separating equilibrium?

• Suppose Player 2 believes that Player 1 will send message B if he is type s and A if he is type t.
– Then if Player 2 sees message A, he believes he is playing a type t and his best response is x.
– If he sees message B, his best response is y.

• But type s wants player 2 to do y and type t wants player 2 to play x.

• So what is Player 1’s strategy?

Beliefs not confirmed

• Suppose that Player 2 believes that Player 1 will send message B if he is type s and A if he is type t.

• Then we have shown that when Player 2 acts according to these beliefs, the best strategy for Player 1 is to send message A if he is type s and B if he is type t.

• So these beliefs are not confirmed. There is not a separating equilibrium in which Player 2 has these beliefs.

Chapter 12, Problem 2

Nature determines Player 1’s type, which is either t=-1, t=1, t=2, or t=3, each with probability ¼. Sender learns his type and sends one of three possible messages: bumpy, smooth, or slick. Receiver observes the message (but not the type) and chooses one of three actions: a=0, a=5, or a=10. If sender is type t and receiver takes action a, the payoff of sender is a×t and the payoff of receiver is 2a×t. (Typo in textbook: last word should be “action”, not “payoff”.)

Separating equilibrium

• Part a) asks “Find a separating perfect Bayes Nash equilibrium”.

Answer: There isn’t one. There couldn’t be, since in a separating equilibrium each type sends a different message. But there are 4 types and only 3 messages you can send.

I think the author should have asked the question in the form: “Is there a separating PBNE? If so, find it.”

Semi-separating PBNE

• You might be able to guess that it will be fairly easy to separate the type t=-1 from the other types.

• Notice that this type and only this type wants the receiver to take action 0.

• Note also that the receiver will want to take action 0 if and only if the sender’s type is t=-1.

Let’s try this

• Start with receiver’s beliefs. Suppose receiver believes that sender’s strategy is
– Say “bumpy” if you are of type -1 and say “smooth” or “slick” if you are of type 1, 2, or 3.

• If receiver hears “bumpy” and believes that those who say “bumpy” are type -1, then his best response is 0. If receiver hears “smooth” or “slick”, his best response is 10.

Sender’s response

• Suppose that sender believes that receiver’s strategy is
– If sender says “bumpy”, take action 0; if sender says “smooth” or “slick”, take action 10.

• Type -1 senders want receiver to take action 0. Other types of senders want receiver to take action 10.

• So given receiver’s strategy, best response of sender is
– Say “bumpy” if type=-1, say “smooth” or “slick” if type=1, 2, or 3.

Beliefs confirmed

• So we see that if receiver believes that sender will say “bumpy” if of type -1 and otherwise will say “smooth” or “slick”, then in the resulting Nash equilibrium, the receiver’s beliefs about how senders behave will be confirmed.
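One way to double-check the belief-confirmation argument above is to enumerate best responses mechanically. The sketch below uses the payoffs stated for Problem 2 (sender a×t, receiver 2a×t). The particular split of types 1, 2, 3 between “smooth” and “slick” is an assumption made only so that Bayes’ rule has something to work with, and the function names are illustrative, not from the textbook.

```python
from fractions import Fraction

TYPES = [-1, 1, 2, 3]                      # each with prior probability 1/4
MESSAGES = ["bumpy", "smooth", "slick"]
ACTIONS = [0, 5, 10]

def sender_payoff(t, a):
    return a * t

def receiver_payoff(t, a):
    return 2 * a * t

# Assumed split of the positive types across "smooth" and "slick"; the slides
# only say that types 1, 2, 3 use one of those two messages.
sender_strategy = {-1: "bumpy", 1: "smooth", 2: "slick", 3: "slick"}

def posterior(message):
    """Receiver's beliefs after a message, by Bayes' rule with a uniform prior."""
    senders = [t for t in TYPES if sender_strategy[t] == message]
    return {t: Fraction(1, len(senders)) for t in senders}

def receiver_best_response(message):
    beliefs = posterior(message)
    return max(ACTIONS,
               key=lambda a: sum(p * receiver_payoff(t, a) for t, p in beliefs.items()))

receiver_strategy = {m: receiver_best_response(m) for m in MESSAGES}

def sender_is_best_responding(t):
    """True if type t cannot gain by switching to another message."""
    chosen = sender_payoff(t, receiver_strategy[sender_strategy[t]])
    best = max(sender_payoff(t, receiver_strategy[m]) for m in MESSAGES)
    return chosen == best

print(receiver_strategy)
print(all(sender_is_best_responding(t) for t in TYPES))
```

Under these assumptions the receiver answers “bumpy” with 0 and the other two messages with 10, and no sender type gains by switching messages, which is the confirmation step described on the slide.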

Some Problems from Chapter 13

Problem 7 (the doctors)

• n doctors share a practice and share all income from it. Each doctor can exert effort level 1, 2, …, 10. Profit of firm is $2(e_1+e_2+\dots+e_n)$, where $e_i$ is the effort level of Dr. i.

• Payoff to Dr. i is $(1/n)\,2(e_1+e_2+\dots+e_n) - e_i$

• What is a Nash equilibrium?

• What would be a best cooperative outcome?

Clicker question

In the stage game, what is the Nash equilibrium effort level for each doctor?
A) 1
B) 2
C) 5
D) 10
E) There is no pure strategy Nash equilibrium.

Another question

• If one doctor believed that all other doctors would follow her example and work just as hard as she does, what effort level would this doctor choose?

A) 1
B) 3
C) 5
D) 10

• When payoff to each Dr. i is $(1/n)\,2(e_1+e_2+\dots+e_n) - e_i$, what is the payoff to each doctor if all doctors choose effort level e*?

A) ne*
B) 0
C) e*/2
D) e*
E) e*/n

Incentivizing with a grim trigger

• Suppose they all use the following grim trigger strategy. Work at effort level e*>1 so long as all others work that hard. If anybody works less hard at any time, then provide e=1 in all future periods.

• For what discount rates will this strategy be a SPNE?

Let’s try e*=10 and n=3.

• The most interesting of the SPNE is the one where everybody works at e*=10.

• Let’s do this one first.

• Grim trigger strategy: Do e=10 so long as everybody else does e=10. If anybody ever works less, revert to e=1.

Payoffs

• If everybody plays this strategy, all work at e=10 in all periods.
– They all get payoffs of 10 in each period for expected payoff of $10(1+d+d^2+\dots) = 10/(1-d)$

• If somebody works less than 10, say e=1 in the first period, her payoff in period 1 is
– $2 \times (21/3) - 1 = 13$
– Her payoff in all future periods would then be 1.

Comparisons

• Expected value from playing grim trigger with effort 10 is 10/(1-d)

• Expected payoff from defecting in first period is $13 + 1\times(d+d^2+d^3+\dots) = 13 + d/(1-d)$

• Grim trigger is NE if $10/(1-d) > 13 + d/(1-d)$. This is the case when $10 > 13(1-d) + d$, which implies $d > 1/4$.
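The arithmetic on this slide can be reproduced with exact fractions. This is a minimal sketch, assuming the payoff rule (1/n)·2·(sum of efforts) minus own effort from Problem 7; the helper names `doctor_payoff` and `grim_trigger_cutoff` are made up for illustration.

```python
from fractions import Fraction

def doctor_payoff(own_effort, others_efforts):
    """Stage payoff (1/n) * 2 * (sum of all efforts) - own effort."""
    n = 1 + len(others_efforts)
    total = own_effort + sum(others_efforts)
    return Fraction(2 * total, n) - own_effort

def grim_trigger_cutoff(cooperate, deviate, punish):
    """Value of d at which cooperate/(1-d) equals deviate + punish*d/(1-d).

    Rearranging the inequality gives d > (deviate - cooperate) / (deviate - punish).
    """
    return Fraction(deviate - cooperate) / Fraction(deviate - punish)

n, e_star = 3, 10
cooperate = doctor_payoff(e_star, [e_star] * (n - 1))   # everyone works at e*
deviate = doctor_payoff(1, [e_star] * (n - 1))          # shirk to e = 1 once
punish = doctor_payoff(1, [1] * (n - 1))                # everyone at e = 1 afterwards

print(cooperate, deviate, punish)
print(grim_trigger_cutoff(cooperate, deviate, punish))
```

Changing `e_star` to 5 gives a one-period deviation payoff of 19/3 and the same cutoff, which is the case worked on the next slides.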

For n=3 and e=e*>1

• We can show e=e*>1 can be sustained by a grim trigger strategy with 3 doctors so long as d>1/4.

Now let’s try e*=5 and n=3

• Consider grim trigger strategy, Work at effort level 5 so long as nobody shirks and work at level 1 if anybody works less.

• Expected payoff to each player if everybody does this is 5 in every period and 5/(1-d) over whole game.

• If you deviate and work at level 1, your payoff in first period is 2×(11/3)-1=19/3 and then you would get 1 in all future periods.

Comparison

• Grim trigger sustains effort level 5 if $5/(1-d) > 19/3 + d/(1-d)$. This is equivalent to $d > 1/4$.

General e*>1 and n=3

• If all play this strategy, they each get a payoff of 2(3e*)/3-e*=e* in each stage game.

• Discounted total payoff would be $e^*/(1-d)$

• If somebody provides only e=1 in the first period, the best that Dr. could get is $2(2e^*+1)/3 - 1 = (4e^*-1)/3$ in the first period and $2(3)/3 - 1 = 1$ in all future periods. The discounted total value of this stream is $(4e^*-1)/3 + d/(1-d)$.

Comparison

• If the other players are playing the grim trigger strategy sustaining e*, then playing that strategy will be a best response if $e^*/(1-d) > (4e^*-1)/3 + d/(1-d)$.

• This is equivalent to $d > 1/4$.
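For readers who want the algebra behind that equivalence, here is one way to carry it out, multiplying both sides by 3(1-d), which is positive for 0<d<1:

```latex
\begin{align*}
\frac{e^*}{1-d} &> \frac{4e^*-1}{3} + \frac{d}{1-d} \\
3e^* &> (4e^*-1)(1-d) + 3d \\
3e^* &> 4e^* - 1 - 4e^* d + 4d \\
4d\,(e^*-1) &> e^* - 1 \\
d &> \tfrac{1}{4} \qquad \text{(dividing by } e^*-1 > 0\text{)}
\end{align*}
```

The last step uses e*>1, which is why the cutoff does not depend on which effort level above 1 is being sustained.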

Part (b)

• What if doctors can only see the total done by others, but not what each individual did?

• Same grim trigger works, just keys on the total.

Problem 7 (old edition)

The stage game:

• Payoff to player 1 is $V_1(x_1,x_2) = 5 + x_1 - 2x_2$

• Payoff to player 2 is $V_2(x_1,x_2) = 5 + x_2 - 2x_1$

• Strategy set for each player is the interval [1,4]

What is a Nash equilibrium for the stage game?
A) Both players choose 4
B) Both players choose 3
C) Both players choose 2
D) Both players choose 1
E) There is no pure strategy Nash equilibrium

Part b (i)

• If the strategy set is X={2,3}, when is there a subgame perfect Nash equilibrium in which both players play a “grim strategy”: always play 2 so long as nobody has ever played anything else, but play 3 forever if anyone ever plays 3?

• Note that “both play 3” is the only N.E. for the stage game.

• Compare payoff v(2,2) forever with payoff v(3,2) in first period, then v(3,3) ever after.

• That is, compare 3 forever with 4 in the first period and then 2 forever.
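The claims on this slide are small enough to verify by brute force. A minimal sketch, assuming the stage payoff 5 + own − 2·other given for this problem; `V`, `is_nash`, and `stage_nash` are illustrative names.

```python
from itertools import product

def V(own, other):
    """Stage payoff 5 + own - 2*other, as given for Problem 7 (old edition)."""
    return 5 + own - 2 * other

X = [2, 3]   # restricted strategy set for part b(i)

def is_nash(x1, x2):
    """Neither player can gain by switching to another element of X."""
    return (all(V(x1, x2) >= V(alt, x2) for alt in X)
            and all(V(x2, x1) >= V(alt, x1) for alt in X))

stage_nash = [(x1, x2) for x1, x2 in product(X, X) if is_nash(x1, x2)]

print(stage_nash)                    # stage-game Nash equilibria on X
print(V(2, 2), V(3, 2), V(3, 3))     # cooperate, one-shot deviation, punishment
```

Because each player’s payoff rises with her own choice regardless of what the other does, the enumeration finds only the profile where both play 3.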

Payoff if both play 2 always

• Payoff in stage game to either player if both play 2 is

$V(2,2) = 5 + 2 - 2\times 2 = 3$.

• Expected payoff to each if both play 2 forever is $3(1+d+d^2+\dots) = 3/(1-d)$

Payoff from playing 3

• If you play 3 in period 1 where the other player plays 2, and if in all future periods the other player plays 3, the best you can do after period 1 is play 3.

• Expected payoff in first period would be $V(3,2) = 5 + 3 - 2\times 2 = 4$

• Expected payoff in future periods where play continues would be $V(3,3) = 5 + 3 - 2\times 3 = 2$.

• Total expected payoff is then $4 + 2(d+d^2+d^3+\dots) = 4 + 2d/(1-d)$.

Comparison

• If you play the grim trigger strategy, play 2 so long as nobody has played 3, but play 3 forever if anybody ever plays 3, your payoff is the discounted value of 3 forever: 3/(1-d)

• If the other player is playing this grim trigger strategy, then the best you can get by playing something else is $4 + 2d/(1-d)$.

• This grim trigger is a SPNE if $3/(1-d) > 4 + 2d/(1-d)$

• Equivalently, $3 > 4(1-d) + 2d$, or $d > 1/2$.

Part b(ii) X=[1,4]

• When is there a subgame perfect equilibrium where everybody does y so long as nobody has ever done anything differently and everybody does z>y if anyone ever does anything other than y?

• First of all, it must be that z=4, because actions after a violation must be Nash for the stage game.

• When is it true that getting V(y,y) forever is better than getting V(4,y) in the first period and then V(4,4) forever?

Comparison

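• V(y,y) forever is worth $V(y,y)/(1-d) = (5-y)/(1-d)$

• V(4,y) and then V(4,4) forever is worth $9 - 2y + 1d + 1d^2 + \dots = 9 - 2y + d/(1-d)$, since $V(4,y) = 5 + 4 - 2y$ and $V(4,4) = 1$

• Works out that V(y,y) forever beats V(4,y) followed by V(4,4) forever if $d(8-2y) > 4-y$, that is, if $d > 1/2$ for any y<4
– (Of course the problem also requires 1≤y≤4.)
– Notice that the cutoff d>1/2 is the same whether we want to sustain y=1 or y=3 forever, and it matches the cutoff found in part b(i) for X={2,3}.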

Problem 2, Chapter 13

Exploring the problem

• Note that the strategy profile {c, x} yields the highest total payoff for the two players and payoff is equally divided.

• Is this a Nash equilibrium? Why not?

• What are the Nash equilibria?

• Can we sustain repeated play of c, x by subgame perfect grim trigger strategies that revert to a not-so-good Nash equilibrium if anyone fails to play c or x?

Best Responses and the Four Nash Equilibria

Question 2, Part a

• When is there a SPNE where:
– Player 1 plays strategy Cdgrim: Choose c so long as all previous play is c,x but move to d forever if Player 2 ever plays anything but x.
– Player 2 plays strategy Xygrim: Choose x so long as all previous play is c,x but move to y forever if Player 1 ever plays anything but c.

Checking for SPNE

• If Player 2 plays strategy Xygrim, Player 1’s payoff from playing Cdgrim is 7 in every period so long as the game lasts. Expected payoff from this strategy is $7/(1-d)$.

• If Player 1 plays anything other than c at any time, on every later play, Player 2 will play y.

• Best possibility for Player 1 would be to play b and then d forever. Expected payoff from this strategy is $8 + 6d/(1-d)$.

• Note that once 1 has ticked off 2, 2 will always play y and d is a best response to y. And perpetual y is a best response to perpetual d.

Comparing

• Sticking with strategy Cdgrim and continuing to play c is better than any other play if $7/(1-d) > 8 + 6d/(1-d)$. This implies $7 > 8(1-d) + 6d$, which implies that $d > 1/2$.

Other SPNE

Grim trigger strategies that revert to other Nash equilibria are also SPNE for sufficiently large d. For example, suppose Player 1 reverts to b forever and 2 reverts to w forever if anyone fails to do c or x. This works if $7/(1-d) > 8 + 3d/(1-d)$. Equivalently $d > 1/5$.
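Both cutoffs in this problem come from the same rearrangement. Writing C for the per-period payoff from cooperating, D for the one-period payoff from the best deviation, and P for the per-period payoff in the punishment equilibrium (these three labels are introduced here for convenience and are not the textbook’s notation), the comparison can be sketched as:

```latex
\[
\frac{C}{1-d} > D + \frac{P\,d}{1-d}
\;\Longleftrightarrow\;
C > D(1-d) + P\,d
\;\Longleftrightarrow\;
d > \frac{D-C}{D-P}
\qquad (D > C > P).
\]
\[
C = 7,\; D = 8:\qquad
P = 6 \;\Rightarrow\; d > \frac{8-7}{8-6} = \frac12,
\qquad
P = 3 \;\Rightarrow\; d > \frac{8-7}{8-3} = \frac15 .
\]
```

A harsher punishment (smaller P) lowers the cutoff, which is why reverting to b, w supports cooperation for a wider range of d than reverting to d, y.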

Part b of question 2

• Don’t worry about working this one. It involves an intricate pattern of responses that is hard to follow and in my opinion not worth the effort required to work it out.

Getting cooperation in finite games

• Chapter 13, Problem 3 illustrates an important and interesting idea.

• In games that have two (or more) Nash equilibria, it is sometimes possible to get cooperation in early rounds.

• The idea. Although we must end up at a Nash equilibrium, the course of play can determine which one we wind up at.

Chapter 13, Problem 3

• We play the stage game from Problem 2 repeatedly, but only 3 times. Show that some “cooperative behavior” can be sustained in Nash equilibrium.

• This game has more than one Nash equilibrium and one is better for both than the others.

• This is what gives us a shot.

What we learned before.

• If the stage game has only one Nash equilibrium, then a game consisting of a finite number of repetitions has only one SPNE

• In this equilibrium, everybody always plays the Nash equilibrium action from the stage game.

• When there is more than one N.E. for the stage game, we can use the threat of reverting to the worse Nash equilibrium to incentivize good behavior in early rounds.

Proposed SPNE

• Player 1: Strategy A1-- Play c in period 1 and c in period 2 if other played x in period 1. Otherwise play b in periods 2 and 3. If Player 2 plays x in both periods 1 and 2, then play d in period 3.

• Player 2: Strategy A2-- Play x in period 1 and x in period 2 if other played c in period 1. Otherwise play w in periods 2 and 3. If Player 1 plays c in periods 1 and 2, play y in period 3.

Checking that A1,A2 is a SPNE

• Let’s work backwards. For each possible course of play in first two rounds, the third round is a regular subgame. Play in each of these subgames must be a N.E. One of these subgames occurs where 1 has played c twice and 2 has played x twice. Strategies A1 and A2 have player 1 play d and 2 play y in this case. This is a Nash equilibrium.

• In other subgames for last play, someone has done something other than c or x. In this case, strategies A1 and A2 prescribe b for 1 and w for 2. This is a Nash equilibrium as well.

• So the A1 and A2 prescribe Nash equilibria for all of the “last play” subgames.

Subgames after first play

• After the first play of the game, there are 25 different regular subgames corresponding to different actions on first play by the players.

• If on the first play, Player 1 did c and Player 2 did x, then if 1 follows A1 and 2 follows A2, they will play c and x on second round and d and y on third round. They will each get payoff 7+7+6=20.

• Could Player 1 do better in this subgame?

• The best deviation from strategy A1 for Player 1 would be to play b rather than c at this point. Why?

• If Player 1 plays c on round 1 and b on round 2 and player 2 is playing A2, then Player 2 will play x on rounds 1 and 2 and w on round 3.

• Best Player 1 can do then is to play b on round 3 and get total payoff 7+8+3=18

• Since playing A1 gives him 20>18, A1 prescribes Nash equilibrium play on this subgame.

What about the other 24 subgames after first round?

• In the other subgames after the first round, somebody has played something other than c or x.

• In this case, if Player 2 is playing A2, Player 2 will play w in the next two rounds.

• If Player 2 is playing w in next two rounds, best response for Player 1 is to play b in next two rounds, which is what Strategy A1 prescribes.

Conclusion for these subgames

• We have seen that at all subgames starting after the first round, A1 prescribes best responses to A2.

• Symmetric reasoning shows that A2 prescribes best responses to A1.

• Thus we have shown that A1 and A2 prescribe Nash equilibrium play in all regular proper subgames.

Conclusion for Full Game

• We still need to show that A1, A2 is a Nash equilibrium for the full game.

• We saw that payoff to Player 1 from A1 is 20.

• Suppose Player 1 plays something other than c on first round. Then A2 will have 2 play w in the next two rounds.

• Best thing other than c for Player 1 on first round is b. After that given that 2 is playing w, playing b is best in the next two rounds for Player 1.

• So best Player 1 can get by deviating in first round is 8+3+3=14<20.
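The three totals compared in this argument (20, 18, and 14) can be tallied in a few lines. The sketch below includes only the Player 1 payoffs that the slides quote explicitly; the rest of the textbook payoff matrix is not reproduced, and the names `P1_PAYOFF` and `total` are illustrative.

```python
# Player 1's stage payoffs in the cells the slides actually quote; the full
# payoff matrix from the textbook figure is not reproduced here.
P1_PAYOFF = {("c", "x"): 7, ("b", "x"): 8, ("d", "y"): 6, ("b", "w"): 3}

def total(path):
    """Sum Player 1's undiscounted payoffs along a list of (action1, action2) pairs."""
    return sum(P1_PAYOFF[cell] for cell in path)

follow_a1 = total([("c", "x"), ("c", "x"), ("d", "y")])        # both follow A1, A2
deviate_round_2 = total([("c", "x"), ("b", "x"), ("b", "w")])  # cheat in round 2
deviate_round_1 = total([("b", "x"), ("b", "w"), ("b", "w")])  # cheat in round 1

print(follow_a1, deviate_round_2, deviate_round_1)
```

Following A1 gives the largest total, so neither the round 1 nor the round 2 deviation pays against A2.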

Conclusion

• Symmetric reasoning applies to Player 2.

• The strategy profile A1, A2 is a subgame perfect Nash equilibrium since the substrategies prescribed in each subgame are Nash equilibria (best responses to each other).

Understanding what happens

• In a subgame perfect Nash equilibrium for a finitely repeated game, it must be that play in the last round is a Nash equilibrium for the stage game.

• In this example, there is more than one Nash equilibrium we could wind up at.

• We can get cooperation in early rounds by threatening to go to a “bad” last-period equilibrium if others misbehave, while doing your part for a “better” last-period Nash equilibrium if others behave.

Problem 4, Ch 13

a) Define a grim-trigger strategy profile.
b) Derive conditions whereby this strategy profile is a SPNE. (Proposed answer to b: d>3/4)

Hints for Problem 4

• What is a nice outcome for stage game?

• What is a Nash equilibrium for this game?

• Define “grim trigger” strategies in which each player does her part of a nice outcome so long as the other does his part, but if either ever does anything else, both revert to the Nash equilibrium forever.

• Find payoffs from always playing “nice”.

• Find best you can do by “defecting” from nice play when other is playing the grim trigger.

Problem 5, Ch 13

a) Find a SPNE strategy profile that results in an outcome path where both players choose x in every period.

• Note: x,x is not a Nash equilibrium for stage game, but w,w and z,z are.

• We see that x,x is better for both than either w,w or z,z.

• We could construct trigger strategies with either w,w or z,z as the threat.

• For what values of d is there a SPNE trigger strategy with z,z being the threat?

Proposed answer to part a

• With z,z as the reversion “punishment”, we need $6/(1-d) > 10 + 3d/(1-d)$. This means $d > 4/7$.

• There is also a SPNE in which the reversion is to w,w for some values of d.

• For you to figure out: What values of d?

Part b) Find a SPNE strategy profile that results in an outcome path where players choose x in odd numbered periods and y in even periods.

• Try strategies: Continue to abide by the rule “play x in odd periods, y in even” so long as nobody has ever violated this rule. If anybody violates the rule, play z forever.

• Payoff from playing this rule forever is
$6 + 8d + 6d^2 + 8d^3 + 6d^4 + 8d^5 + 6d^6 + \dots$
$= 6(1 + d^2 + d^4 + d^6 + \dots) + 8d(1 + d^2 + d^4 + \dots)$
$= 6/(1-d^2) + 8d/(1-d^2)$

Payoff from violating rule

• Most profitable violation is to deviate at the start. If other is playing the proposed trigger strategy, other will play x on first play and violator will get 10 on first play. But ever after, other will play z, and best violator can do is play z. Payoff from doing this is $10 + 3d/(1-d)$.

• Proposed strategy profile is a SPNE if $6/(1-d^2) + 8d/(1-d^2) > 10 + 3d/(1-d)$. This is true if $7d^2 + 5d > 4$. We see that the left side of this inequality is increasing in d. We also see that the inequality holds for d=1, but not for d=0. (We could solve a quadratic to find exactly which d’s work: the cutoff is $d > (\sqrt{137}-5)/14 \approx 0.479$.)
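The quadratic mentioned above can be solved directly. A small sketch, assuming only the inequality 7d² + 5d > 4 derived on this slide; the function names are illustrative.

```python
import math

def rule_value(d):
    """Discounted payoff from alternating x (6) and y (8) forever."""
    return (6 + 8 * d) / (1 - d**2)

def violation_value(d):
    """Payoff from grabbing 10 once and getting 3 in every later period."""
    return 10 + 3 * d / (1 - d)

def cooperation_gain(d):
    """7d^2 + 5d - 4; positive exactly when following the rule beats violating it."""
    return 7 * d**2 + 5 * d - 4

# Positive root of 7d^2 + 5d - 4 = 0 from the quadratic formula.
cutoff = (-5 + math.sqrt(5**2 + 4 * 7 * 4)) / (2 * 7)

print(cutoff)
```

Values of d slightly above `cutoff` make `rule_value(d)` exceed `violation_value(d)`, and values slightly below reverse the comparison.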

Part c) Find a SPNE strategy profile that results in an outcome path in which players choose x in first 10 periods, then always choose z.

• There ain’t one. Can you see why?

Part d) You should be able to show that the one and only grim trigger strategy that does this is one where players revert to z if someone ever deviates from choosing y.

Final exam

• Exam will ask questions from all chapters.

• Some problems will be easy, some will be harder.

May all your subgames be happy. Even if not always regular and proper.
