

Reputation and the Flow of Information in Repeated Games*

Eduardo Faingold

Department of Economics

Yale University

[email protected]

October 2013

Abstract

This paper derives equilibrium payoff bounds from reputation effects for repeated moral hazard games in which a long-run player interacts frequently with a population of short-run players and the monitoring technology varies with the length of the period of interaction. The bounds depend on the monitoring technology through the flow of information, a measure of signal informativeness (per unit of time) based on relative entropy. Examples are shown where, under complete information, the set of equilibrium payoffs of the long-run player converges, as the period length tends to zero, to the set of static equilibrium payoffs, whereas when the game is perturbed by a small ex ante probability on commitment types, reputation effects remain powerful in the frequent interaction limit.

Keywords: Reputation, commitment, imperfect monitoring, frequent interactions.

JEL Classification: C70, C72.

*I thank Mehmet Ekmekci, Drew Fudenberg, Olivier Gossner and George Mailath for stimulating discussions and helpful comments, and the National Science Foundation (grant SES-1227593) for financial support. This paper subsumes and improves upon the earlier analysis of reputations under frequent interactions in Faingold (2006).


1 Introduction

This paper studies reputation effects (in the sense of Kreps and Wilson (1982), Milgrom and Roberts (1982)) in repeated games in which a long-run player interacts frequently with a population of short-run players. The main result provides upper and lower bounds (akin to those of Fudenberg and Levine (1989, 1992)) on the set of Nash equilibrium payoffs of the long-run player in the limit as the discount rate and the period length tend to zero (in any order).

In typical studies of repeated games, it is common to interpret comparative statics with respect to the discount factor as an exercise in varying the length of the period of interaction. The limit when the discount factor tends to one is often interpreted as the limit when the period length tends to zero and the monitoring technology is independent of the period length. But when players observe noisy signals about each other's actions (as in many applications of interest), it may be more natural to assume that the informativeness of those signals is constant over each unit of real time. Since this requires the signal distributions to vary with the period length (Abreu, Milgrom, and Pearce, 1991), it leads to a comparative statics exercise that is different from the traditional one: as the periods shrink, the discount factor tends to one and the informativeness of the signals (over the period of interaction) deteriorates. It is now well known that allowing the monitoring technology to vary with the period length in this way can have a dramatic effect on the equilibrium analysis of repeated games: in some cases, the set of (perfect public) equilibrium payoffs of the repeated game converges, as the period length tends to zero, to the (convex hull of the) set of static Nash equilibrium payoffs (Abreu, Milgrom, and Pearce (1991), Faingold and Sannikov (2011), Fudenberg and Levine (2007, 2009), Sannikov and Skrzypacz (2007)).

What happens when the repeated game is perturbed by a small ex ante probability that the long-run player may be a commitment type? Do reputation effects survive in the limit as the period length shrinks to zero and the signal informativeness deteriorates? The main result of this paper, Theorem 2, answers this question in the affirmative, by providing upper and lower bounds on the set of Nash equilibrium payoffs of the long-run player in the limit as the period length and the discount rate tend to zero (in any order). Examples are shown in which the bounds are tight and equal the Stackelberg payoff, whereas without uncertainty over types the equilibrium set would collapse to static Nash.

The intuition for the dramatic difference between the complete and incomplete information results is simple. In the perturbed game, due to the learning process of the uninformed short-run players, the relevant time frame for reputation effects is the "long run" (a reputation is not won in a day), and in the long run the many (poorly informative) signals can be aggregated to become arbitrarily informative. By contrast, in the unperturbed complete information game, there is no learning about types and non-myopic incentives can only be created by using each period's signals to adjust continuation play to deter deviations. Thus, unlike the perturbed game, the relevant period for incentives in the complete information game is the period of interaction itself, over which the signals become noisier and noisier; hence the possibility of collapse to static Nash equilibrium.¹

In spite of this clean intuition, the result is more subtle than it may appear at first glance, for it cannot be proved by appealing directly to the classical Fudenberg-Levine bounds (henceforth FL bounds). To see why, let us first briefly review the argument of Fudenberg and Levine (1992). Assume (for simplicity) that there is a unique best-reply to the Stackelberg action in the stage game and that the signal distributions statistically identify the action of the long-run player. Fix an $\varepsilon > 0$ and an arbitrary Nash equilibrium of the perturbed game and consider what happens under a deviation in which the normal type mimics the behavior of the Stackelberg type (i.e., the behavioral type who is committed to playing the Stackelberg action in every period). Fudenberg and Levine (1992) show that, under this deviation, on a set of infinite histories that has probability at least $1 - \varepsilon$, there is an upper bound $K$ on the number of periods in which the short-run players are not playing a best-reply to the Stackelberg action, where $K$ depends on the monitoring technology and on $\varepsilon$, but is independent of the particular equilibrium fixed and hence of the discount factor. Due to this uniformity, by letting the discount factor tend to 1 and $\varepsilon$ tend to zero in this order (assuming a fixed monitoring structure), the expected average discounted payoff of the normal type (under the deviation) converges to the Stackelberg payoff. Hence, the Stackelberg payoff must be a lower bound on (the patient limit of) the set of Nash equilibrium payoffs of the normal type, because it is a lower bound on the expected payoff from a deviation. Similar ideas apply to the upper bound.

The argument above assumes a fixed monitoring technology, and thus a fixed period of interaction $\Delta$. The reason why it is not possible to derive the reputation bounds for games with frequent interactions by taking the limit of the FL bounds is related to how the uniform bound $K$ depends on $\Delta$ (through the monitoring technology). In Section 4, a simple example with Poisson signals is provided which shows that $K$ need not scale with $\Delta$. That is, if the period length is cut in half, one might expect the number of periods of non-best replies to (at least approximately) double, but this is not the case for $K$, which is only an upper bound on the number of non-best replies (albeit a uniform one).

To circumvent this problem, the proof of Theorem 2 relies on a recent result of Gossner (2011), who uses entropy techniques to derive reputation bounds that improve on the FL bounds for fixed discount factor and fixed monitoring. The bounds provided in Theorem 2, which are the limits of the Gossner bounds as the period length tends to zero, rely on a notion of self-confirmed best-reply that is based on an infinitesimal form of relative entropy, which can be interpreted as a measure of the flow of information in the game. Several examples are provided to illustrate how this measure of information flow (and hence the bounds) can be calculated, including an example in which the upper and lower bounds converge to the Stackelberg payoff as the period length and the discount rate tend to zero in this order, while applying the same order of limits to the classical Fudenberg-Levine discrete-time result yields uninformative bounds.²,³ While the Gossner bounds and the FL bounds always coincide in the patient limit when the monitoring technology is fixed, the limits can be very different when the monitoring and the discount factor vary simultaneously to adjust for the period length.⁴

Reputation effects under frequent interactions have also been studied in Faingold and Sannikov (2011), where it is assumed that the period length is exactly zero, i.e., the interaction takes place directly in continuous time. Unlike this note, which focuses on frequent-interaction bounds on equilibrium payoffs for patient long-run players under general assumptions on the monitoring structure and on the type space, Faingold and Sannikov (2011) provide a characterization of equilibrium behavior (directly in continuous time) for fixed discount rates, by restricting attention to games with Brownian signals and assuming that there is a single commitment type. Another closely related paper is Faingold (2013), which derives analogues of the FL bounds for continuous-time games with signals driven by a Lévy process (which includes games with Brownian and Poisson signals as special cases). A pair of examples in Section 3 demonstrates that the reputation result for continuous-time games with Lévy signals does not imply the limit of discrete-time games result of the present paper.⁵

¹ See a recent paper by Bohren (2011) for an alternative channel by which intertemporal incentives can be created in the frequent interaction limit which does not rely on uncertainty over types.

² That is, the FL upper and lower bounds converge to the greatest and least feasible payoff of the long-run player in the stage game, respectively.

³ Such an example is not at all pathological: the gap between the limit of the FL bounds as the period length shrinks and the frequent interaction bounds based on the information flow always arises for (nontrivial) games with signals that are sampled from a Poisson process (as in Abreu, Milgrom, and Pearce (1991)).

⁴ See also Ekmekci, Gossner, and Wilson (2012) for another reputation result that relies crucially on the improvement of the Gossner bound over the FL bound.

⁵ This seemingly pathological failure of upper hemi-continuity (from discrete to continuous time) is due to the nonexistence of a topology on the space of continuous-time behavior strategies under which expected payoffs are continuous and the space of behavior strategies is compact.


2 Baseline Model

2.1. Discrete-time repeated game. We recall the canonical reputation model of Fudenberg and Levine (1992), in which a long-run player faces a sequence of short-run players in an infinitely repeated game with imperfect monitoring in discrete time. At the beginning of period $n = 0, 1, 2, \ldots$, the long-run player and the current short-run player simultaneously, and privately, choose actions $a_1^n \in A_1$ and $a_2^n \in A_2$, respectively, where $A_1$ and $A_2$ are finite action spaces. A publicly observable signal $y^n$ is then drawn from a measurable space $Y$ according to a probability distribution $q(\cdot \mid a_1^n, a_2^n)$, where $q : A \to \Delta(Y)$ and $A := A_1 \times A_2$. At the end of the period, the current short-run player departs from the game, being replaced by a new short-run player at the beginning of the following period. The new short-run player knows the entire past history of public signals.

The long-run player is privately informed of his type $\omega \in \Omega$, where $\Omega$ is a countable type space. The short-run players are uncertain as to which type of long-run player they face and share a common prior $\mu \in \Delta(\Omega)$, which satisfies $\mu(\omega) > 0$ for all $\omega \in \Omega$. The type space $\Omega$ contains strategic (or normal) types and behavioral types. The set of strategic types is denoted $\Omega_s \subseteq \Omega$ and the overall payoff of a strategic type $\omega$ is

$$\sum_{n=0}^{\infty} (1 - \delta)\,\delta^n\, g_1(a_1^n, a_2^n, \omega),$$

where $g_1 : A_1 \times A_2 \times \Omega_s \to \mathbb{R}$ is the stage-game payoff function of the long-run player and $\delta \in (0, 1)$ is his discount factor. A behavioral type is a mechanistic type of long-run player who is committed to playing some mixed action $\alpha_1 \in \Delta(A_1)$ in every period, irrespective of history.⁶ The set of behavioral types, denoted $\Omega_b := \Omega \setminus \Omega_s$, is then naturally identified with a countable subset of $\Delta(A_1)$. Finally, the payoff function of the short-run players is common knowledge and denoted $g_2 : A_1 \times A_2 \to \mathbb{R}$.
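As a small numerical illustration of the normalization in the payoff formula above, the following Python sketch evaluates $(1 - \delta)\sum_n \delta^n g_n$ for a finite stream of stage payoffs. The function name, the truncation at 2000 periods, and the payoff numbers are assumptions made for illustration only; they are not taken from the paper.

```python
def average_discounted_payoff(stage_payoffs, delta):
    """Return (1 - delta) * sum_n delta**n * g_n over a finite list of stage payoffs."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    total = 0.0
    weight = 1.0 - delta  # weight placed on period n = 0
    for g in stage_payoffs:
        total += weight * g
        weight *= delta  # the weight on period n + 1 is delta times the weight on period n
    return total


# Hypothetical streams, truncated after 2000 periods (the truncated tail is negligible here).
constant_stream = average_discounted_payoff([2.0] * 2000, delta=0.95)
one_good_period = average_discounted_payoff([3.0] + [1.0] * 1999, delta=0.95)
```

Because the weights $(1 - \delta)\delta^n$ sum to one, a constant stream of 2 is worth approximately 2, while a single period of 3 followed by 1 forever is worth approximately $0.05 \cdot 3 + 0.95 \cdot 1 = 1.1$: a one-period gain carries weight $1 - \delta$ only.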

2.2. Strategies and equilibrium. The space of $n$-period histories of the long-run player is denoted $H_1^n := \Omega \times (A_1 \times Y)^n$ and is equipped with the product $\sigma$-field, where $H_1^0 := \Omega$. Likewise, the space of $n$-period histories of the short-run players is denoted $H_2^n := Y^n$ and is endowed with the product $\sigma$-field, where $H_2^0$ is the one-point set. A (behavior) strategy of player $i$ ($i = 1, 2$) is a sequence of measurable maps $\sigma_i^n : H_i^n \to \Delta(A_i)$, $n \ge 0$, such that $\sigma_1(\omega, \cdot) = \omega$ for every behavioral type $\omega \in \Omega_b$. The set of strategies of player $i$ is denoted $\Sigma_i$ and the set of strategy profiles is denoted $\Sigma := \Sigma_1 \times \Sigma_2$. Given the prior $\mu$, each strategy profile $\sigma \in \Sigma$ induces a unique probability measure $P_{\mu,\sigma}$ over the space of plays $H^\infty := \Omega \times (A_1 \times A_2 \times Y)^\infty$ endowed with the product $\sigma$-field. For each $\omega \in \Omega$, the symbol $P_{\omega,\sigma}$ designates the induced probability measure over plays conditional on $\omega$. Expectations with respect to $P_{\mu,\sigma}$ and $P_{\omega,\sigma}$ are denoted $E_{\mu,\sigma}$ and $E_{\omega,\sigma}$, respectively. A profile $\sigma \in \Sigma$ is a (Bayes-)Nash equilibrium of the repeated game $\Gamma_\delta = \big((\Omega, \mu), (A_i, g_i)_{i=1,2}, (Y, q), \delta\big)$ if for every $\sigma_1' \in \Sigma_1$,

$$E_{\mu,\sigma}\Big[\sum_{n=0}^{\infty} (1 - \delta)\,\delta^n\, g_1(a_1^n, a_2^n)\Big] \ge E_{\mu,(\sigma_1', \sigma_2)}\Big[\sum_{n=0}^{\infty} (1 - \delta)\,\delta^n\, g_1(a_1^n, a_2^n)\Big],$$

and for every $n \ge 0$ and $\sigma_2' \in \Sigma_2$ such that $\sigma_2'^{\,m} = \sigma_2^m$ for every $m \ne n$,

$$E_{\mu,\sigma}\big[g_2(a_1^n, a_2^n)\big] \ge E_{\mu,(\sigma_1, \sigma_2')}\big[g_2(a_1^n, a_2^n)\big].$$

For each $\omega \in \Omega_s$, let

$$N_1(\omega, \delta) := \Big\{ E_{\omega,\sigma}\Big[\sum_{n=0}^{\infty} (1 - \delta)\,\delta^n\, g_1(a_1^n, a_2^n)\Big] : \sigma \text{ is a Nash equilibrium of } \Gamma_\delta \Big\},$$

the set of Nash equilibrium interim payoffs of type $\omega$. Finally, define

$$\underline{N}_1(\omega, \delta) := \inf N_1(\omega, \delta) \quad \text{and} \quad \overline{N}_1(\omega, \delta) := \sup N_1(\omega, \delta).$$

⁶ The payoff functions of behavioral types are left unmodeled.

2.3. Reputation effects. We review the classical result of Fudenberg and Levine (1992), which provides upper and lower bounds on the set of Nash equilibrium payoffs of patient strategic types. Recall that a mixed action $\alpha_2 \in \Delta(A_2)$ is a self-confirmed best-reply to $\alpha_1 \in \Delta(A_1)$, written $\alpha_2 \in B_2(\alpha_1)$, if $\alpha_2$ is not weakly dominated and $\alpha_2 \in \arg\max_{\alpha_2'} g_2(\alpha_1', \alpha_2')$ for some $\alpha_1' \in \Delta(A_1)$ with $q(\cdot \mid \alpha_1', \alpha_2) = q(\cdot \mid \alpha_1, \alpha_2)$. For each $\omega \in \Omega_s$ define

$$\underline{g}_1(\omega) := \sup_{\alpha_1 \in \Delta(A_1)} \; \inf_{\alpha_2 \in B_2(\alpha_1)} g_1(\alpha_1, \alpha_2, \omega)$$

and

$$\overline{g}_1(\omega) := \sup_{\alpha_1 \in \Delta(A_1)} \; \sup_{\alpha_2 \in B_2(\alpha_1)} g_1(\alpha_1, \alpha_2, \omega).$$

The latter is called the generalized Stackelberg payoff, which can be greater than the traditional Stackelberg payoff due to imperfect observability. Indeed, the set of self-confirmed best-replies can be larger than the set of best-replies, but the two concepts coincide when the long-run player's actions are identified, i.e. when $q(\cdot \mid \alpha_1, \alpha_2) = q(\cdot \mid \alpha_1', \alpha_2)$ implies $\alpha_1 = \alpha_1'$ for all $\alpha_1, \alpha_1' \in \Delta(A_1)$ and all $\alpha_2 \in \Delta(A_2)$ that are not weakly dominated. Finally, the stage game is called non-degenerate if there is no undominated pure action $a_2 \in A_2$ with $g_1(\cdot, a_2) = g_1(\cdot, \alpha_2)$ for some $\alpha_2 \in \Delta(A_2) \setminus \{a_2\}$.

Theorem 1 (Fudenberg and Levine (1992)). Suppose that behavioral types have full support, i.e. $\Omega_b$ is a dense subset of $\Delta(A_1)$. Then, for every strategic type $\omega$ with $\mu(\{\omega\}) > 0$,

$$\underline{g}_1(\omega) \le \liminf_{\delta \to 1} \underline{N}_1(\omega, \delta) \le \limsup_{\delta \to 1} \overline{N}_1(\omega, \delta) \le \overline{g}_1(\omega).$$

Moreover, if the stage game is non-degenerate and the long-run player's actions are identified, then $\overline{g}_1(\omega) = \underline{g}_1(\omega)$ and hence,

$$\lim_{\delta \to 1} N_1(\omega, \delta) = \big\{ \overline{g}_1(\omega) \big\}.$$

3 Reputation under Frequent Interactions

Turning to the analysis of reputation effects in games with frequent interactions, embed the discrete-time model of the previous section in continuous time, so each period of interaction has length $\Delta > 0$. The long-run player discounts his flow payoffs exponentially at rate $r > 0$, so his discount factor across periods is $\delta = e^{-r\Delta}$. As in Abreu, Milgrom, and Pearce (1991), the monitoring structure $(Y^\Delta, q^\Delta)$ is allowed to vary with the period length $\Delta$. In particular, the informativeness of each period's signals may deteriorate as the periods shrink, as when the signals are sampled from an underlying continuous-time process, such as when the noise is driven by a Poisson process (as in Abreu, Milgrom, and Pearce (1991)), a Brownian motion (as in Sannikov and Skrzypacz (2007) and Fudenberg and Levine (2007, 2009)) or, more generally, a Lévy process (as in Sannikov and Skrzypacz (2010)).

The ensuing analysis relies on Gossner (2011), who uses methods from information theory (Cover and Thomas, 2006) to derive payoff bounds that improve on those of Fudenberg and Levine (1992) for fixed discount factor and monitoring structure. Recall the definition of the relative entropy (also called Kullback-Leibler divergence) between two probability measures $P$ and $Q$ on a measurable space $X$:

$$H(Q \,\|\, P) := \begin{cases} \displaystyle\int_X \ln\Big(\frac{dQ}{dP}\Big)\, dQ, & \text{if } Q \ll P, \\ \infty, & \text{otherwise,} \end{cases}$$

where $dQ/dP$ denotes the Radon-Nikodym derivative. The relative entropy is known to be non-negative, equal to zero if $Q = P$ and lower semi-continuous when viewed as a function defined on $\Delta(X) \times \Delta(X)$ mapping into $\mathbb{R} \cup \{+\infty\}$.⁷
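For distributions on a finite set the integral in the definition reduces to a sum, which the following Python sketch computes. Representing distributions as dictionaries mapping outcomes to probabilities is an assumption made here for illustration; the function name is likewise hypothetical.

```python
import math


def relative_entropy(q, p):
    """H(Q || P) for two distributions on a common finite set.

    Both arguments are dicts mapping outcomes to probabilities.
    """
    total = 0.0
    for outcome, q_mass in q.items():
        if q_mass == 0.0:
            continue  # outcomes that Q never produces contribute nothing
        p_mass = p.get(outcome, 0.0)
        if p_mass == 0.0:
            return math.inf  # Q is not absolutely continuous with respect to P
        total += q_mass * math.log(q_mass / p_mass)
    return total


fair = {"a": 0.5, "b": 0.5}
biased = {"a": 0.9, "b": 0.1}
divergence = relative_entropy(fair, biased)  # equals ln(5/3) up to rounding
```

The two branches of the definition show up directly: the sum is finite whenever every outcome charged by $Q$ is also charged by $P$, and the function returns infinity otherwise.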

For each $\alpha_1, \alpha_1' \in \Delta(A_1)$ and $\alpha_2 \in \Delta(A_2)$ define

$$h(\alpha_1, \alpha_1'; \alpha_2) := \liminf_{\Delta \to 0} \frac{1}{\Delta}\, H\big( q^\Delta(\cdot \mid \alpha_1, \alpha_2) \,\big\|\, q^\Delta(\cdot \mid \alpha_1', \alpha_2) \big), \qquad (1)$$

which inherits the properties of the relative entropy and is thus non-negative, equal to zero if $\alpha_1 = \alpha_1'$ for any $\alpha_2$, and lower semi-continuous when viewed as a function defined on $\Delta(A_1) \times \Delta(A_1) \times \Delta(A_2)$ taking values in $\mathbb{R} \cup \{+\infty\}$. The function $h$ quantifies the flow of information that the short-run players receive about the actions of the long-run player in the limit when the interactions become arbitrarily frequent.

⁷ The lower semi-continuity is relative to the norm topology on $\Delta(X)$, not the weak*-topology.

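A mixed action $\alpha_2 \in \Delta(A_2)$ is an information-flow confirmed best-reply to $\alpha_1 \in \Delta(A_1)$, written $\alpha_2 \in B_2^*(\alpha_1)$, if $\alpha_2$ is not weakly dominated and $\alpha_2 \in \arg\max_{\alpha_2'} g_2(\alpha_1', \alpha_2')$ for some $\alpha_1' \in \Delta(A_1)$ with $h(\alpha_1, \alpha_1'; \alpha_2) = 0$. The long-run player's actions are information-flow identified if $h(\alpha_1, \alpha_1'; \alpha_2) = 0$ implies $\alpha_1 = \alpha_1'$ for all $\alpha_1, \alpha_1' \in \Delta(A_1)$ and undominated $\alpha_2 \in \Delta(A_2)$. For each $\omega \in \Omega_s$ define

$$\underline{g}_1^*(\omega) := \sup_{\alpha_1 \in \Delta(A_1)} \; \inf_{\alpha_2 \in B_2^*(\alpha_1)} g_1(\alpha_1, \alpha_2, \omega)$$

and

$$\overline{g}_1^*(\omega) := \sup_{\alpha_1 \in \Delta(A_1)} \; \sup_{\alpha_2 \in B_2^*(\alpha_1)} g_1(\alpha_1, \alpha_2, \omega).$$

Finally, denote by $\underline{N}_1(\omega, r, \Delta)$ and $\overline{N}_1(\omega, r, \Delta)$ the infimum and supremum, respectively, of the set of Nash equilibrium payoffs of type $\omega$ in the repeated game with discount rate $r$ and period length $\Delta$ (i.e. the repeated game of the previous section with discount factor $e^{-r\Delta}$ and monitoring structure $(Y^\Delta, q^\Delta)$).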

The following result is the analogue of Theorem 1 for games with frequent interactions. The proof, relegated to the appendix, appeals to the main result in Gossner (2011).

Theorem 2. Suppose that behavioral types have full support, i.e. $\Omega_b$ is a dense subset of $\Delta(A_1)$. Then, for every strategic type $\omega$ with $\mu(\{\omega\}) > 0$,

$$\underline{g}_1^*(\omega) \le \liminf_{r \to 0} \liminf_{\Delta \to 0} \underline{N}_1(\omega, r, \Delta) \le \limsup_{r \to 0} \limsup_{\Delta \to 0} \overline{N}_1(\omega, r, \Delta) \le \overline{g}_1^*(\omega).$$

Moreover, if the stage game is non-degenerate and the long-run player's actions are information-flow identified, then $\overline{g}_1^*(\omega) = \underline{g}_1^*(\omega)$ and hence,

$$\lim_{r \to 0} \lim_{\Delta \to 0} N_1(\omega, r, \Delta) = \big\{ \overline{g}_1^*(\omega) \big\}.$$

What is learned from this result, especially in conjunction with the examples below, is that the appropriate notion of signal informativeness to study reputation effects in the frequent interaction limit is the one given by the information flow $h$, the infinitesimal version of relative entropy. This is key, for if other notions of informativeness were used, such as the one based on the total variation distance used by Fudenberg and Levine (1992) in discrete time, the resulting payoff bounds may become completely uninformative in the frequent interaction limit, as discussed in the next section.

To illustrate the result, explicit calculations of the information flow $h$ are provided for a few examples of monitoring structures with frequent interactions. The examples will also be useful to clarify the connection between the frequent-interaction limit and repeated games that are played directly in continuous time.

Example 1 (Sampling from a Poisson process). As in Abreu, Milgrom, and Pearce (1991), assume there is an underlying counting process, $N(t)$, whose instantaneous arrival rate is a function of the action profile. The players can adjust their actions at discrete times $t = 0, \Delta, 2\Delta, \ldots$, and at any time $t$ in this grid the value of $N(t)$ is publicly observed before the players get to choose their time $t$ actions. Thus, in the formalism of Section 2, the public signals are the increments $y^n = N(n\Delta) - N((n-1)\Delta)$, so that

$$Y^\Delta = \{0, 1, 2, \ldots\} \quad \text{and} \quad q^\Delta(y \mid a) = \frac{e^{-\lambda(a)\Delta}\,(\lambda(a)\Delta)^y}{y!} \quad \text{for all } y \in \{0, 1, \ldots\},$$

where the function $\lambda : A \to [0, \infty)$ gives the arrival rate. Then, the information flow can be explicitly calculated:

$$h(\alpha_1, \alpha_1'; \alpha_2) = \lambda(\alpha_1', \alpha_2) - \lambda(\alpha_1, \alpha_2) - \lambda(\alpha_1, \alpha_2) \ln\Big( \frac{\lambda(\alpha_1', \alpha_2)}{\lambda(\alpha_1, \alpha_2)} \Big),$$

for every $\alpha_1, \alpha_1' \in \Delta(A_1)$ and $\alpha_2 \in \Delta(A_2)$. By the concavity of the logarithm, $h(\alpha_1, \alpha_1'; \alpha_2)$ is always nonnegative, and it can be readily verified that $h(\alpha_1, \alpha_1'; \alpha_2) = 0$ if and only if $\lambda(\alpha_1', \alpha_2) = \lambda(\alpha_1, \alpha_2)$. In particular, if the long-run player has two pure actions, then the game is information-flow identified if and only if the arrival rates induced by the two pure actions are different regardless of the short-run player's mixed action; if the long-run player has three or more pure actions, then the game cannot be information-flow identified.
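The closed form above can be checked numerically. The Python sketch below computes the relative entropy per unit of time between the one-period signal distributions induced by two mixed actions of the long-run player (holding the short-run player's action fixed) and compares it with the formula for $h$. The arrival rates 1 and 3, the mixing weights, and the truncation of the support at 30 arrivals are illustrative assumptions, not values from the paper.

```python
import math


def poisson_pmf(y, mean):
    """Probability of y arrivals when the expected number of arrivals is `mean`."""
    return math.exp(-mean) * mean ** y / math.factorial(y)


def mixture_pmf(y, rates, weights, dt):
    """One-period signal distribution under a mixed action (a mixture of Poissons)."""
    return sum(w * poisson_pmf(y, lam * dt) for lam, w in zip(rates, weights))


def entropy_per_unit_time(rates, weights_q, weights_p, dt, y_max=30):
    """(1 / dt) * H(q || p), with the support truncated at y_max arrivals."""
    total = 0.0
    for y in range(y_max + 1):
        q = mixture_pmf(y, rates, weights_q, dt)
        p = mixture_pmf(y, rates, weights_p, dt)
        if q > 0.0:
            total += q * math.log(q / p)
    return total / dt


def poisson_information_flow(lam, lam_prime):
    """Closed form from the text: lam' - lam - lam * ln(lam' / lam)."""
    return lam_prime - lam - lam * math.log(lam_prime / lam)


rates = [1.0, 3.0]  # hypothetical arrival rates of the two pure actions
# Mixed action (1/2, 1/2) has expected rate 2; the alternative is the pure action with rate 3.
numeric = entropy_per_unit_time(rates, [0.5, 0.5], [0.0, 1.0], dt=1e-4)
closed_form = poisson_information_flow(2.0, 3.0)
```

For pure actions the ratio is exactly equal to the closed form at every period length, since both signal distributions are Poisson. For mixed actions the signal is a mixture of Poissons and the two quantities agree only in the limit; at `dt = 1e-4` the gap in this example is already well below `1e-3`.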

Thus, if the long-run player has two pure actions, which induce different arrival rates, then the set of Nash equilibrium payoffs of a strategic type will converge to his Stackelberg payoff as the period length and the discount rate tend to zero (assuming a non-degenerate stage game and a prior with full support). By contrast, consider the repeated product-choice game where the long-run player is a firm who chooses whether to provide high or low quality, and the short-run players are consumers who decide whether to buy the high-end or the low-end product of the firm; the stage-game payoffs are:

          High-end    Low-end
High      2, 3        0, 2
Low       3, 0        1, 1

Then, if the Poisson arrivals are “good news,” i.e. $\lambda(\text{High}, a_2) > \lambda(\text{Low}, a_2)$ for every $a_2$, then the only sequential equilibrium of the complete information game (when $\Delta$ is sufficiently small) is the repetition of the static equilibrium (Low, Low-end) after every history, regardless of the discount rate, by an argument similar to that in Abreu, Milgrom, and Pearce (1991).⁸
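To make the contrast with the static equilibrium concrete, the following Python sketch computes commitment payoffs for the product-choice game from the payoff table above. It assumes that the firm's actions are identified, so that confirmed best-replies reduce to ordinary best-replies, and it searches over a grid of probabilities of High; the grid size and the tie-breaking tolerance are illustrative choices. Since the firm's payoff is linear in the consumer's mixed action, only pure best-replies need to be examined.

```python
# Stage-game payoffs (firm, consumer) from the product-choice table.
PAYOFFS = {
    ("High", "High-end"): (2, 3),
    ("High", "Low-end"): (0, 2),
    ("Low", "High-end"): (3, 0),
    ("Low", "Low-end"): (1, 1),
}
CONSUMER_ACTIONS = ("High-end", "Low-end")


def expected_payoffs(p_high, consumer_action):
    """Expected (firm, consumer) payoffs when the firm plays High with probability p_high."""
    hi = PAYOFFS[("High", consumer_action)]
    lo = PAYOFFS[("Low", consumer_action)]
    return tuple(p_high * h + (1.0 - p_high) * l for h, l in zip(hi, lo))


def pure_best_replies(p_high, tol=1e-12):
    """Consumer actions that are optimal against the firm's mixed action."""
    values = {a2: expected_payoffs(p_high, a2)[1] for a2 in CONSUMER_ACTIONS}
    best = max(values.values())
    return [a2 for a2, v in values.items() if v >= best - tol]


def commitment_bounds(grid_size=1000):
    """Grid approximation of sup-inf and sup-sup of the firm's payoff over best-replies."""
    lower = upper = float("-inf")
    for k in range(grid_size + 1):
        p = k / grid_size
        firm_values = [expected_payoffs(p, a2)[0] for a2 in pure_best_replies(p)]
        lower = max(lower, min(firm_values))  # worst best-reply for the firm
        upper = max(upper, max(firm_values))  # most favorable best-reply for the firm
    return lower, upper


lower_bound, upper_bound = commitment_bounds()
```

In this sketch, committing to High for sure yields the firm 2, and mixing with probability just above 1/2 on High yields almost 2.5 (the grid returns 2.499 for the pessimistic bound and 2.5 for the optimistic one), whereas the static equilibrium (Low, Low-end) yields only 1.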

Example 2 (Sampling from a Brownian motion). As in Sannikov and Skrzypacz (2007), assume there is an underlying diffusion process, $X(t)$, which evolves according to

$$dX(t) = \mu(a(t))\, dt + dZ(t), \quad t \in [0, \infty),$$

where $Z$ is a standard Brownian motion and the drift is given by $\mu : A \to \mathbb{R}$. Players can adjust their actions at times $t = 0, \Delta, 2\Delta, \ldots$, and at any time $t$ in this grid the value of $X(t)$ is publicly observed before the players get to choose their actions $a(t) = (a_1(t), a_2(t))$, so that

$$Y^\Delta = \mathbb{R} \quad \text{and} \quad q^\Delta(\cdot \mid a) \sim N\big(\mu(a)\Delta, \Delta\big).$$

Thus, for any $\Delta > 0$ and any mixed-action profile, the induced distribution over public signals is a mixture of Gaussian distributions. Unfortunately, the relative entropy of a pair of mixtures of Gaussians does not admit a representation in terms of the relative entropies of the pairs of Gaussians in the support, and no closed-form expression is known. But we are only interested in the flow of information, that is, the limit as $\Delta \to 0$ of the quotient between the divergence and the period length $\Delta$, and that turns out to have a simple expression:

$$h(\alpha_1, \alpha_1'; \alpha_2) = \frac{1}{2}\big( \mu(\alpha_1', \alpha_2) - \mu(\alpha_1, \alpha_2) \big)^2 \quad \text{for all } \alpha_1, \alpha_1' \in \Delta(A_1),\ \alpha_2 \in \Delta(A_2),$$

as shown in the appendix. Thus, $h(\alpha_1, \alpha_1'; \alpha_2) = 0$ if and only if $\mu(\alpha_1', \alpha_2) = \mu(\alpha_1, \alpha_2)$; hence, if the long-run player has only two pure actions, then the game is information-flow identified if and only if the drifts induced by the two pure actions are different regardless of the mixed action of the short-run player; if the long-run player has three or more actions, then the game cannot be information-flow identified.

Thus, also in this example, the incomplete information game features strong reputation effects, while the equilibrium of the complete information game becomes degenerate as the period length tends to zero, as shown in Fudenberg and Levine (2007, 2009).
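Although the relative entropy between mixtures of Gaussians has no closed form, it is easy to approximate numerically, which gives a quick check of the expression for $h$ in Example 2. The Python sketch below integrates $q \ln(q/p)$ with the trapezoid rule after rescaling the signal by $1/\sqrt{\Delta}$ (a change of variables that leaves relative entropy unchanged). The drifts 0 and 2, the mixing weights, and the integration grid are illustrative assumptions.

```python
import math


def normal_pdf(z, mean):
    """Density at z of a normal distribution with the given mean and unit variance."""
    return math.exp(-0.5 * (z - mean) ** 2) / math.sqrt(2.0 * math.pi)


def gaussian_entropy_per_unit_time(drifts, weights_q, weights_p, dt,
                                   half_width=10.0, step=0.005):
    """(1 / dt) * H(q || p) for mixtures of N(mu * dt, dt), via the trapezoid rule.

    After dividing the signal by sqrt(dt), each component becomes N(mu * sqrt(dt), 1).
    """
    scale = math.sqrt(dt)
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        z = -half_width + i * step
        q = sum(w * normal_pdf(z, mu * scale) for mu, w in zip(drifts, weights_q))
        p = sum(w * normal_pdf(z, mu * scale) for mu, w in zip(drifts, weights_p))
        integrand = q * math.log(q / p) if q > 0.0 else 0.0
        total += integrand * (0.5 if i in (0, n) else 1.0)  # half weight at the endpoints
    return total * step / dt


def brownian_information_flow(mu, mu_prime):
    """Closed form from the text: (1/2) * (mu' - mu)**2."""
    return 0.5 * (mu_prime - mu) ** 2


drifts = [0.0, 2.0]  # hypothetical drifts of the two pure actions
# Mixed action (1/2, 1/2) has expected drift 1; the alternative is the pure action with drift 0.
numeric = gaussian_entropy_per_unit_time(drifts, [0.5, 0.5], [1.0, 0.0], dt=1e-4)
closed_form = brownian_information_flow(1.0, 0.0)
```

For two pure actions the per-unit-time divergence equals $\frac{1}{2}(\mu' - \mu)^2$ at every $\Delta$, while for mixed actions it only approaches the expression evaluated at the expected drifts as $\Delta \to 0$, which is what the sketch exhibits for these parameter values.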

Example 3 (Binomial approximation of Brownian motion I). Consider the monitoring technology $(Y^\Delta, q^\Delta)$, where

$$Y^\Delta = \big\{ \sqrt{\Delta}, -\sqrt{\Delta} \big\} \quad \text{and} \quad q^\Delta\big( \pm\sqrt{\Delta} \,\big|\, a \big) = \frac{1}{2} \pm \mu(a)\sqrt{\Delta},$$

for given functions $\mu : A \to \mathbb{R}$ and $\sigma : A_2 \to (0, \infty)$. This relates to Brownian motion via Donsker's Invariance Principle: for any fixed $a \in A$, the family of stochastic processes $(S^\Delta)_{\Delta > 0}$,

$$S^\Delta(t) = \sum_{n=0}^{[t/\Delta]} y_n^\Delta, \quad t \in [0, \infty),$$

where $y_1^\Delta, y_2^\Delta, \ldots$ are i.i.d. random variables with distribution $q^\Delta(\cdot \mid a)$, converges weakly to a diffusion process with drift $\mu(a)$ and volatility 1, as $\Delta \to 0$. Here, as expected, the information flow $h$ takes exactly the same form as in the previous example, as can be readily verified. However, this is not true for all discrete-time approximations of Brownian motion, as shown by the next two examples.

⁸ Although Abreu, Milgrom, and Pearce (1991) only examine repeated games between two long-run players, similar arguments apply to games between a long-run player and a sequence of short-run players.

Example 4 (Binomial approximation of Brownian motion II). Consider the monitoring technology $(Y^\Delta, q^\Delta)$, where

$$Y^\Delta = \big\{ \mu(a)\Delta \pm \sqrt{\Delta} : a \in A \big\} \quad \text{and} \quad q^\Delta\big( \mu(a)\Delta \pm \sqrt{\Delta} \,\big|\, a \big) = \frac{1}{2}$$

for given $\mu : A \to \mathbb{R}$. Here, too, the cumulative process $S^\Delta$ (defined as in the previous example) satisfies the conditions of Donsker's Theorem, and so it converges weakly to a diffusion with drift $\mu(a)$ and volatility 1. Suppose that the long-run player has only two actions and that those two actions induce different values of $\mu$ irrespective of the short-run player's action. Then, for every positive $\Delta$, the short-run players can perfectly monitor the long-run player (so that $h = \infty$). In particular, the long-run player's actions are identified for every positive $\Delta$. Thus, given any non-degenerate stage game and any prior with full support on behavioral types, the set of Nash equilibrium payoffs of any strategic type converges to his Stackelberg payoff as $\Delta$ tends to 0, irrespective of patience (i.e., for any fixed positive discount rate $r$).

Nonetheless, the continuous-time limit game has full support imperfect monitoring: changes of drift induce absolutely continuous changes of the probability measure over the trajectories of the diffusion (over any finite horizon). Indeed, in the continuous-time game, the set of Nash equilibrium payoffs of a strategic type must converge to his Stackelberg payoff as the discount rate $r$ tends to 0, as shown in Faingold (2013). For fixed discount rates, however, the equilibrium payoffs in the continuous-time game are typically lower than the Stackelberg payoff (Faingold and Sannikov, 2011), unlike the frequent interaction limit discussed above, which does not require $r \to 0$ to yield a reputation effect.

Example 5 (Binomial approximation of Brownian motion III). Suppose $A_1 = \{0, 1\}$ and consider the monitoring technology $(Y^\Delta, q^\Delta)$, where

$$Y^\Delta = \big\{ \pm\sqrt{\Delta},\ \pm\sqrt{\Delta}(1 - \Delta) \big\} \quad \text{and} \quad q^\Delta\big( \pm\sqrt{\Delta}(1 - a_1\Delta) \,\big|\, a \big) = \frac{1}{2} \quad \text{for } a_1 \in \{0, 1\}.$$

Then, the cumulative process $S^\Delta$, defined as in Example 3, converges weakly to a standard Brownian motion irrespective of $a$.⁹ Thus, in the continuous-time limit the signal is completely uninformative, and therefore only static equilibria are possible. Nonetheless, for each positive $\Delta$, the signal distributions under the two pure actions have disjoint support, hence there is perfect monitoring for each $\Delta > 0$. Just like in the previous example, here a reputation result obtains in the limit as the period length $\Delta$ tends to zero for any fixed discount rate. But, unlike the previous example, only static equilibria are possible in the continuous-time game, irrespective of patience. So, here, the discontinuity between discrete and continuous time is even more dramatic.¹⁰
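The two features of Example 5 can be seen side by side in a short Python sketch: for each positive period length the two pure actions generate signal distributions with disjoint supports (so the relative entropy is infinite), while the variance of the signal per unit of time tends to 1 under both actions, so the limiting process carries no information. The helper names and the sample period lengths are assumptions made for illustration.

```python
import math


def example5_signal_distribution(a1, dt):
    """Two-point signal distribution of Example 5 for the pure action a1 in {0, 1}."""
    magnitude = math.sqrt(dt) * (1.0 - a1 * dt)
    return {magnitude: 0.5, -magnitude: 0.5}


def relative_entropy(q, p):
    """H(Q || P) for distributions given as dicts mapping outcomes to probabilities."""
    total = 0.0
    for outcome, q_mass in q.items():
        if q_mass == 0.0:
            continue
        p_mass = p.get(outcome, 0.0)
        if p_mass == 0.0:
            return math.inf  # Q charges an outcome that P never produces
        total += q_mass * math.log(q_mass / p_mass)
    return total


def variance_rate(a1, dt):
    """Variance of the (mean-zero) one-period signal divided by the period length."""
    dist = example5_signal_distribution(a1, dt)
    return sum(prob * y ** 2 for y, prob in dist.items()) / dt


for dt in (0.1, 0.01, 0.001):
    divergence = relative_entropy(example5_signal_distribution(0, dt),
                                  example5_signal_distribution(1, dt))
    print(dt, divergence, variance_rate(0, dt), variance_rate(1, dt))
```

The divergence is infinite at every period length in the loop, so the per-unit-time divergence is infinite as well, while the variance rate under $a_1 = 1$ equals $(1 - \Delta)^2$ and approaches the value 1 obtained under $a_1 = 0$.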

4 Discussion

This section discusses the connection between the frequent interaction bounds based on the information flow $h$ and the limit of the FL bounds as the period length tends to zero. Specifically, an example is provided in which the frequent interaction limit of the FL bounds is completely uninformative, while the frequent interaction bounds based on the information flow $h$ are tight and equal the Stackelberg payoff in the patient limit.¹¹

For the discussion, it is useful to recall how the FL bounds are calculated. Let us first review the main building block in Fudenberg and Levine's (1992) proof of their reputation bounds, which is a Bayesian learning result related to Blackwell and Dubins's (1962) classical merging theorem.¹² Let $(\Omega, \mathcal{F})$ be a measurable space and $(\mathcal{F}_n)_{n \ge 0}$ a filtration. Given probability measures $P$ and $Q$ on $(\Omega, \mathcal{F})$ with $Q \ll P$, define

$$d_n(Q, P) = \operatorname*{ess\,sup}_{B \in \mathcal{F}_{n+1}} \big| P(B \mid \mathcal{F}_n) - Q(B \mid \mathcal{F}_n) \big|,$$

where ess sup denotes the essential supremum of a family of measurable functions.¹³

⁹ It can be readily verified that the conditions of Donsker's Theorem are satisfied.

¹⁰ The reader might be, at first, unsatisfied with this example (and the previous one), because the approximation does not have full support. But the example can be slightly modified to deliver a full support approximation with essentially the same features. Indeed, suppose that when player 1 chooses $a_1 \in \{0, 1\}$ then with probability $1 - \Delta$ the public signal is drawn from $\{\pm\sqrt{\Delta}\,(1 - a_1\Delta)\}$ with equal probabilities, and with probability $\Delta$ it is drawn from $\{\pm\sqrt{\Delta}\,(1 - (1 - a_1)\Delta)\}$ also with equal probabilities. Such a quadrinomial approximation also satisfies the conditions of Donsker's Theorem, but it has support independent of $a_1$ for any $\Delta$. It can be readily verified that $h = \infty$ also in this case, and hence a reputation result obtains in the frequent-interaction limit, although in the continuous-time game the monitoring is pure noise.

¹¹ This is unlike the discrete-time case with fixed period length, where the improvement of the Gossner bounds over the FL bounds can only be useful when the long-run player is impatient.

¹² See Sorin (1999) for further connections between merging and reputation.

¹³ The ess sup operator gives the smallest measurable function that is an almost sure upper bound on a given


Uniform Merging Theorem (Fudenberg and Levine (1992)).¹⁴ For every $\lambda \in (0, 1]$, $\varepsilon > 0$ and $\eta > 0$, there exists a non-negative integer $K$ such that, for all probability measures $P$, $Q$ and $Q'$ on $(\Omega, \mathcal{F})$ with $P = \lambda Q + (1 - \lambda) Q'$, one has

$$Q\big( \#\{n \ge 0 : d_n(Q, P) > \varepsilon\} > K \big) \le \eta. \tag{2}$$

For concreteness, let us review how this result relates to the FL bounds in the context of the product-choice game with Poisson signals of Example 1. In fact, to make things even simpler, let us assume that only the long-run player controls the arrival rate $\lambda$. Fix the period length $\Delta > 0$ and consider a behavioral type $\hat\omega$ committed to playing High with probability strictly greater than 1/2. Denote the mixed action of this behavioral type by $\hat\alpha_1$. Let $\sigma$ be an arbitrary Nash equilibrium of the repeated game with period length $\Delta$. Thus, following the notation of Section 2,

$$P^{\mu,\sigma} = \mu(\hat\omega)\, P^{\hat\omega,\sigma} + (1 - \mu(\hat\omega))\, P^{\neg\hat\omega,\sigma},$$

where

$$P^{\neg\hat\omega,\sigma} = \frac{\sum_{\omega \ne \hat\omega} \mu(\omega)\, P^{\omega,\sigma}}{1 - \mu(\hat\omega)}.$$

Next, choose $\varepsilon(\Delta) > 0$ small enough that $\sup_y |q^\Delta(y \mid \alpha_1) - q^\Delta(y \mid \hat\alpha_1)| \le \varepsilon(\Delta)$ implies $\alpha_1(\text{High}) > 1/2$. A fact that will turn out to be useful later is that for any choice of $\varepsilon(\Delta)$ for which the latter implication holds, one has $\limsup_{\Delta \to 0} \varepsilon(\Delta)/\Delta < \infty$.¹⁵

If the normal firm deviates and mimics the behavioral type $\hat\omega$, the payoff from this deviation must be lower than the payoff the firm receives in equilibrium. The uniform merging theorem will help in finding a lower bound on the firm's payoff under this deviation. Fix $\eta > 0$ and let $K(\Delta, \eta, \hat\omega)$ designate the smallest integer $K$ such that inequality (2) holds for $\varepsilon = \varepsilon(\Delta)$, $\lambda = \mu(\hat\omega)$, $P = P^{\mu,\sigma}$, $Q = P^{\hat\omega,\sigma}$, $Q' = P^{\neg\hat\omega,\sigma}$ and $\mathcal{F}_n$ equal to the product $\sigma$-field on $H_2^n$. Thus, under the deviation, with probability at least $1 - \eta$ there are at most $K(\Delta, \eta, \hat\omega)$ periods in which the consumers are expecting the firm to choose high quality

family of measurable functions. The essential supremum above is well-defined because $Q \ll P$ (Neveu, 1975).

¹⁴ The $Q$-almost sure convergence $d_n(Q, P) \to 0$ is an immediate implication of Blackwell and Dubins's (1962) merging theorem. The uniform merging theorem of Fudenberg and Levine (1992) strengthens the conclusion that $d_n(Q, P) \to 0$ $Q$-a.s., under a stronger form of absolute continuity which came to be known as the grain of truth condition: the theorem establishes the uniformity of the rate of convergence $K$ over all $P$, $Q$ and $Q'$ that satisfy $P = \lambda Q + (1 - \lambda) Q'$ for a fixed size $\lambda$ of the "grain of truth." Merging also plays a central role in the literature on rational learning in games (Kalai and Lehrer (1993)).

¹⁵ This follows from the fact that $\sup_y |q^\Delta(y \mid \alpha_1) - q^\Delta(y \mid \hat\alpha_1)| = |\hat\alpha_1(\text{High}) - \alpha_1(\text{High})| \cdot |\lambda(\text{Low}) - \lambda(\text{High})|\,\Delta + o(\Delta)$ and $\hat\alpha_1(\text{High}) > 1/2$.


with probability less than or equal to 1/2. Hence, with probability at least $1 - \eta$ under the deviation, there are also at most $K(\Delta, \eta, \hat\omega)$ periods in which the consumers are not consuming the high-end product.¹⁶ Assume, conservatively, that such exceptional periods are precisely the $K(\Delta, \eta, \hat\omega)$ initial periods. This yields the following lower bound for the payoff the firm receives in any Nash equilibrium:

$$FL(r, \Delta, \eta, \hat\omega) := (1 - \eta)\,\big(3 - \hat\alpha_1(\text{High})\big)\, e^{-r \Delta K(\Delta, \eta, \hat\omega)}. \tag{3}$$

Recall that $K(\Delta, \eta, \hat\omega)$ depends only on the period length $\Delta$ (through $\varepsilon(\Delta)$), on $\eta$ and on the behavioral type $\hat\omega$. In particular, it is independent of the discount rate $r$. Thus, letting $r \to 0$ first, then $\eta \to 0$ and then $\hat\alpha_1(\text{High}) \to 1/2$, the lower bound $FL(r, \Delta, \eta, \hat\omega)$ converges to the Stackelberg payoff of 2.5. This is the argument of Fudenberg and Levine (1992).

Now, consider what happens to the FL bound, $FL(r, \Delta, \eta, \hat\omega)$, when we send $\Delta$ to 0 before sending $r$ to 0. To figure out this limit, we need to understand how $K(\Delta, \eta, \hat\omega)$ varies with $\Delta$ or, more generally, how $K$ in the uniform merging theorem varies with $\varepsilon$. The following result (proved in the appendix) provides sharp estimates:

Lemma 1. For each $\lambda \in (0, 1]$ and $\varepsilon, \eta > 0$, let $K^*(\varepsilon, \eta, \lambda)$ be the smallest non-negative integer such that inequality (2) holds for all probability measures $P$, $Q$ and $Q'$ with $P = \lambda Q + (1 - \lambda) Q'$. Then, for every $\lambda \in (0, 1]$ and $\eta > 0$,

$$\limsup_{\varepsilon \to 0}\ \varepsilon^2 K^*(\varepsilon, \eta, \lambda) < \infty.$$

Furthermore, if $\mathcal{F}_n \ne \mathcal{F}_{n+1}$ for all $n$, then for every $\gamma > 0$ and $\lambda \in (0, 1]$, there exists $\bar\eta > 0$ such that for every $0 < \eta < \bar\eta$,

$$\lim_{\varepsilon \to 0}\ \varepsilon^{2-\gamma} K^*(\varepsilon, \eta, \lambda) = \infty.$$

Now, consider the FL bound (3) with $K(\Delta, \eta, \hat\omega) := K^*(\varepsilon(\Delta), \eta, \mu(\{\hat\omega\}))$. Then, the second part of the lemma above (with $\gamma = 1$) yields $\varepsilon(\Delta) K(\Delta, \eta, \hat\omega) \to \infty$ as $\Delta \to 0$. But since $\limsup_{\Delta \to 0} \varepsilon(\Delta)/\Delta < \infty$ (cf. footnote 15), it follows that the real time $\Delta \cdot K(\Delta, \eta, \hat\omega)$ tends to $\infty$ as $\Delta$ shrinks to 0, and hence

$$\lim_{\Delta \to 0} FL(r, \Delta, \eta, \hat\omega) = 0,$$

for every $\eta > 0$ small enough and every $\hat\omega$ with $\hat\alpha_1(\text{High}) > 1/2$. Thus, for the product-choice game with Poisson signals of Example 1, the FL bounds become completely uninformative in the frequent-interaction limit.
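To make the order-of-limits point concrete, the following Python sketch evaluates the bound (3) numerically. It is a rough illustration under stated assumptions, not the paper's computation: the exact $K(\Delta, \eta, \hat\omega)$ is not available in closed form, so the hypothetical function `K_stand_in` uses $\varepsilon(\Delta) = c\Delta$ and the growth rate $\lceil \varepsilon^{-(2-\gamma)} \rceil$ with $\gamma = 1/2$, a rate that Lemma 1 says the true $K^*$ eventually exceeds when $\eta$ is small. All parameter values are arbitrary.

```python
import math

def fl_bound(r, delta, eta, alpha_hat_high, K):
    """Lower bound (3): (1 - eta) * (3 - alpha_hat_1(High)) * exp(-r * delta * K)."""
    return (1.0 - eta) * (3.0 - alpha_hat_high) * math.exp(-r * delta * K)

def K_stand_in(delta, c=1.0, gamma=0.5):
    """Hypothetical stand-in for K(delta, eta, omega_hat).

    Assumes eps(delta) = c * delta and K = ceil(eps ** -(2 - gamma)); this is an
    illustrative growth rate only, not the exact constant of the merging theorem.
    """
    eps = c * delta
    return math.ceil(eps ** -(2.0 - gamma))

if __name__ == "__main__":
    r, eta, alpha_hat_high = 0.1, 0.05, 0.6
    for delta in (0.1, 0.01, 0.001, 1e-6):
        K = K_stand_in(delta)
        # delta * K is the real time spent in "exceptional" periods.
        print(delta, delta * K, fl_bound(r, delta, eta, alpha_hat_high, K))
```

Under these assumptions the real time $\Delta \cdot K$ grows like $\Delta^{-1/2}$, so the bound collapses to zero as $\Delta$ shrinks with $r$ fixed, whereas with $\Delta$ fixed and $r = 0$ it equals $(1-\eta)(3 - \hat\alpha_1(\text{High}))$.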

¹⁶ Recall that whenever the consumers assign probability greater than $\tfrac{1}{2}$ to high quality, their best reply is to consume the high-end product.


A Appendix

A.1 Proof of Theorem 2

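The main result of Gossner (2011) yields the following lower bound on the equilibrium payoff of a strategic type $\omega$ of the long-run player, for discount rate $r$ and period length $\Delta$:

$$\sup_{\hat\omega \in \Omega_b} w^\Delta_{\hat\omega}\big((1 - e^{-r\Delta}) \ln \mu(\hat\omega)\big) = \sup_{\hat\omega \in \Omega_b} w^\Delta_{\hat\omega}\big(r\Delta \ln \mu(\hat\omega) + o(\Delta)\big),$$

where, for each $\hat\omega \in \Omega_b$, the function $\varepsilon \mapsto w^\Delta_{\hat\omega}(\varepsilon)$ is the pointwise supremum over all convex functions that are below

$$\varepsilon \mapsto \inf\big\{ g_1(\hat\omega, \alpha_2; \omega) : \alpha_2 \in B^\Delta_{2,\varepsilon}(\hat\omega) \big\},$$

and

$$B^\Delta_{2,\varepsilon}(\hat\omega) = \Big\{ \alpha_2 : \alpha_2 \text{ is undominated and } \exists\, \alpha_1 \text{ s.t. } H\big(q^\Delta(\cdot \mid \hat\omega, \alpha_2) \,\|\, q^\Delta(\cdot \mid \alpha_1, \alpha_2)\big) \le \varepsilon,\ \alpha_2 \in \arg\max_{\alpha_2'} g_2(\alpha_1, \alpha_2') \Big\}.$$

Since for all $\hat\omega$, $\alpha_1$ and $\alpha_2$,

$$H\big(q^\Delta(\cdot \mid \hat\omega, \alpha_2) \,\|\, q^\Delta(\cdot \mid \alpha_1, \alpha_2)\big) \ge h(\hat\omega, \alpha_1; \alpha_2)\,\Delta + o(\Delta),$$

we must have

$$B^\Delta_{2,\,(1 - e^{-r\Delta}) \ln \mu(\hat\omega)}(\hat\omega) \subseteq B^*_{2,\, r \ln \mu(\hat\omega) + o(1)}(\hat\omega),$$

and hence,

$$\sup_{\hat\omega \in \Omega_b} w^\Delta_{\hat\omega}\big((1 - e^{-r\Delta}) \ln \mu(\hat\omega)\big) \ge \sup_{\hat\omega \in \Omega_b} w^*_{\hat\omega}\big(r \ln \mu(\hat\omega) + o(1)\big),$$

where $o(1)$ is a function that converges to zero as $\Delta \to 0$, and $\varepsilon \mapsto w^*_{\hat\omega}(\varepsilon)$ is the pointwise supremum over all convex functions that are below

$$\varepsilon \mapsto \inf\big\{ u_1(\hat\omega, \alpha_2) : \alpha_2 \in B^*_{2,\varepsilon}(\hat\omega) \big\}.$$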

The result then follows from the lower semi-continuity of the function $h$, by taking the limit as $\Delta \to 0$ and $r \to 0$.¹⁷ The proof for the upper bound is analogous, and hence omitted. Also omitted is the proof that if the game is non-degenerate and the action of the long-run player is information-flow identified then $g^*_1(\omega) = \bar g^*_1(\omega)$ for every strategic type $\omega$, as this is similar to the proof of Theorem 3.3 of Fudenberg and Levine (1992).

¹⁷ See Gossner (2011, Lemma 8) for details of a similar continuity argument.


A.2 Proof for Example 1

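Fix $\alpha_1, \alpha_1' \in \Delta(A_1)$ and $\alpha_2 \in \Delta(A_2)$ and write $\alpha = (\alpha_1, \alpha_2)$, $\alpha' = (\alpha_1', \alpha_2)$. We have

$$\begin{aligned}
H\big(q^\Delta(\cdot \mid \alpha) \,\|\, q^\Delta(\cdot \mid \alpha')\big)
&= \sum_{\bar a} \alpha(\bar a)\, e^{-\lambda(\bar a)\Delta} \sum_{k=0}^{\infty} \frac{\lambda(\bar a)^k \Delta^k}{k!} \log \frac{\sum_a \alpha(a)\, \lambda(a)^k e^{-\lambda(a)\Delta}}{\sum_{a'} \alpha'(a')\, \lambda(a')^k e^{-\lambda(a')\Delta}} \\
&= \sum_{\bar a} \alpha(\bar a)\, e^{-\lambda(\bar a)\Delta} \bigg( \log \frac{\sum_a \alpha(a)\, e^{-\lambda(a)\Delta}}{\sum_{a'} \alpha'(a')\, e^{-\lambda(a')\Delta}} + \Delta \lambda(\bar a) \log \frac{\sum_a \alpha(a)\, \lambda(a)\, e^{-\lambda(a)\Delta}}{\sum_{a'} \alpha'(a')\, \lambda(a')\, e^{-\lambda(a')\Delta}} + o(\Delta) \bigg),
\end{aligned}$$

and hence,

$$\begin{aligned}
h(\alpha_1, \alpha_1'; \alpha_2)
&= \sum_{\bar a} \alpha(\bar a) \bigg( \lim_{\Delta \to 0} \frac{1}{\Delta} \log \frac{\sum_a \alpha(a)\, e^{-\lambda(a)\Delta}}{\sum_{a'} \alpha'(a')\, e^{-\lambda(a')\Delta}} + \lambda(\bar a) \log \frac{\lambda(\alpha)}{\lambda(\alpha')} \bigg) \\
&= \lim_{\Delta \to 0} \frac{1}{\Delta} \log \frac{\sum_a \alpha(a)\, e^{-\lambda(a)\Delta}}{\sum_{a'} \alpha'(a')\, e^{-\lambda(a')\Delta}} + \lambda(\alpha) \log \frac{\lambda(\alpha)}{\lambda(\alpha')},
\end{aligned}$$

where, by L'Hôpital's rule,

$$\begin{aligned}
\lim_{\Delta \to 0} \frac{1}{\Delta} \log \frac{\sum_a \alpha(a)\, e^{-\lambda(a)\Delta}}{\sum_{a'} \alpha'(a')\, e^{-\lambda(a')\Delta}}
&= \lim_{\Delta \to 0} \frac{1}{\sum_a \alpha(a)\, e^{-\lambda(a)\Delta} \sum_{a'} \alpha'(a')\, e^{-\lambda(a')\Delta}} \bigg[ -\sum_a \alpha(a)\, \lambda(a)\, e^{-\lambda(a)\Delta} \sum_{a'} \alpha'(a')\, e^{-\lambda(a')\Delta} \\
&\qquad\qquad + \sum_{a'} \alpha'(a')\, \lambda(a')\, e^{-\lambda(a')\Delta} \sum_a \alpha(a)\, e^{-\lambda(a)\Delta} \bigg] \\
&= \lambda(\alpha') - \lambda(\alpha),
\end{aligned}$$

which is the desired result.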
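As a numerical sanity check of the limit just derived, the Python sketch below computes the relative entropy between two Poisson mixtures for a small period length and compares the per-unit-time value with $\lambda(\alpha') - \lambda(\alpha) + \lambda(\alpha)\log(\lambda(\alpha)/\lambda(\alpha'))$. It is only an illustration: action profiles are represented by a list of arrival rates with mixing weights, the infinite sum over $k$ is truncated, and the function names and numbers are hypothetical choices.

```python
import math

def poisson_mixture_pmf(k, alpha, lam, delta):
    """q^Delta(k | alpha): mixture over actions of Poisson(lam[a] * delta) probabilities."""
    return sum(w * math.exp(-l * delta) * (l * delta) ** k / math.factorial(k)
               for w, l in zip(alpha, lam))

def relative_entropy(alpha, alpha_prime, lam, delta, k_max=30):
    """H(q(.|alpha) || q(.|alpha')), with the sum over counts truncated at k_max."""
    total = 0.0
    for k in range(k_max + 1):
        p = poisson_mixture_pmf(k, alpha, lam, delta)
        q = poisson_mixture_pmf(k, alpha_prime, lam, delta)
        if p > 0.0:
            total += p * math.log(p / q)
    return total

def information_flow(alpha, alpha_prime, lam):
    """Closed form from the proof: lam(a') - lam(a) + lam(a) * log(lam(a) / lam(a'))."""
    lam_a = sum(w * l for w, l in zip(alpha, lam))
    lam_ap = sum(w * l for w, l in zip(alpha_prime, lam))
    return lam_ap - lam_a + lam_a * math.log(lam_a / lam_ap)

if __name__ == "__main__":
    lam = (1.0, 3.0)           # hypothetical arrival rates of two action profiles
    alpha, alpha_prime = (0.5, 0.5), (0.9, 0.1)
    for delta in (1e-1, 1e-2, 1e-4):
        print(delta, relative_entropy(alpha, alpha_prime, lam, delta) / delta,
              information_flow(alpha, alpha_prime, lam))
```

For pure actions the two quantities coincide for every period length, since the relative entropy between two Poisson distributions is linear in their common time scale; for mixed actions the agreement holds only in the limit.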

A.3 Proof for Example 2

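Fix $\alpha_1, \alpha_1' \in \Delta(A_1)$ and $\alpha_2 \in \Delta(A_2)$ and write $\alpha = (\alpha_1, \alpha_2)$, $\alpha' = (\alpha_1', \alpha_2)$. We have

$$H\big(q^\Delta(\cdot \mid \alpha) \,\|\, q^\Delta(\cdot \mid \alpha')\big) = \int_{-\infty}^{\infty} \log \frac{f^\Delta(y \mid \alpha)}{f^\Delta(y \mid \alpha')}\, f^\Delta(y \mid \alpha)\, dy,$$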


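where $f^\Delta(\cdot \mid a)$ denotes the density function of a Gaussian random variable with mean $\mu(a)\Delta$ and variance $\Delta$. For any $\bar a \in A$,

$$\begin{aligned}
\frac{f^\Delta(y \mid \alpha)}{f^\Delta(y \mid \alpha')}
&= \frac{\sum_a \alpha(a) \exp\{-(y - \mu(a)\Delta)^2/(2\Delta)\}}{\sum_{a'} \alpha'(a') \exp\{-(y - \mu(a')\Delta)^2/(2\Delta)\}} \\
&= \frac{\sum_a \alpha(a) \exp\{-(y - \mu(\bar a)\Delta + (\mu(\bar a) - \mu(a))\Delta)^2/(2\Delta)\}}{\sum_{a'} \alpha'(a') \exp\{-(y - \mu(\bar a)\Delta + (\mu(\bar a) - \mu(a'))\Delta)^2/(2\Delta)\}} \\
&= \frac{\sum_a \alpha(a) \exp\{(\mu(a) - \mu(\bar a))(y - \mu(\bar a)\Delta) - (\mu(a) - \mu(\bar a))^2 \Delta/2\}}{\sum_{a'} \alpha'(a') \exp\{(\mu(a') - \mu(\bar a))(y - \mu(\bar a)\Delta) - (\mu(a') - \mu(\bar a))^2 \Delta/2\}} \\
&= \frac{\sum_a \alpha(a) \exp\{\sqrt{\Delta}\, s(a, \bar a)(y - \mu(\bar a)\Delta)/\sqrt{\Delta} - \Delta\, s(a, \bar a)^2/2\}}{\sum_{a'} \alpha'(a') \exp\{\sqrt{\Delta}\, s(a', \bar a)(y - \mu(\bar a)\Delta)/\sqrt{\Delta} - \Delta\, s(a', \bar a)^2/2\}},
\end{aligned}$$

where $s(a, a') := \mu(a) - \mu(a')$ for all $a, a' \in A$.

Thus,

$$\begin{aligned}
H\big(q^\Delta(\cdot \mid \alpha) \,\|\, q^\Delta(\cdot \mid \alpha')\big)
&= \sum_{\bar a} \alpha(\bar a) \int \log \Big\{ \sum_a \alpha(a) \exp\big[\sqrt{\Delta}\, s(a, \bar a)(y - \mu(\bar a)\Delta)/\sqrt{\Delta} - \Delta\, s(a, \bar a)^2/2\big] \Big\} f^\Delta(y \mid \bar a)\, dy \\
&\quad - \sum_{\bar a} \alpha(\bar a) \int \log \Big\{ \sum_{a'} \alpha'(a') \exp\big[\sqrt{\Delta}\, s(a', \bar a)(y - \mu(\bar a)\Delta)/\sqrt{\Delta} - \Delta\, s(a', \bar a)^2/2\big] \Big\} f^\Delta(y \mid \bar a)\, dy;
\end{aligned}$$

hence the change of variables $z := (y - \mu(\bar a)\Delta)/\sqrt{\Delta}$ yields

$$\begin{aligned}
H\big(q^\Delta(\cdot \mid \alpha) \,\|\, q^\Delta(\cdot \mid \alpha')\big)
&= \sum_{\bar a} \alpha(\bar a) \int \log \Big\{ \sum_a \alpha(a) \exp\big[\sqrt{\Delta}\, s(a, \bar a) z - \Delta\, s(a, \bar a)^2/2\big] \Big\} f(z)\, dz \\
&\quad - \sum_{\bar a} \alpha(\bar a) \int \log \Big\{ \sum_{a'} \alpha'(a') \exp\big[\sqrt{\Delta}\, s(a', \bar a) z - \Delta\, s(a', \bar a)^2/2\big] \Big\} f(z)\, dz,
\end{aligned}$$

where $f$ denotes the density of a Gaussian random variable with mean 0 and variance 1. To compute the integral

$$I^\Delta_{\alpha, \bar a} := \int g^\Delta_{\alpha, \bar a}(z)\, f(z)\, dz,$$

where

$$g^\Delta_{\alpha, \bar a}(z) := \log \Big\{ \sum_a \alpha(a) \exp\big[\sqrt{\Delta}\, s(a, \bar a) z - \Delta\, s(a, \bar a)^2/2\big] \Big\},$$

consider a second-order expansion of $g^\Delta_{\alpha, \bar a}(z)$ around $z = 0$. Since the standard Gaussian has mean 0 and variance 1, Taylor's formula with the Lagrange form of the remainder gives

$$I^\Delta_{\alpha, \bar a} = g^\Delta_{\alpha, \bar a}(0) + \frac{1}{2} \big(g^\Delta_{\alpha, \bar a}\big)''(0) + \frac{1}{6} \int \big(g^\Delta_{\alpha, \bar a}\big)'''\big(\xi^\Delta(z)\big)\, z^3 f(z)\, dz,$$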


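where $\xi^\Delta(\cdot)$ is a function that satisfies $0 < \xi^\Delta(z)/z < 1$ for every $z \ne 0$. We have

$$g^\Delta_{\alpha, \bar a}(0) = \log \Big\{ \sum_a \alpha(a) \exp\big[-\Delta\, s(a, \bar a)^2/2\big] \Big\},$$

and therefore, by L'Hôpital's rule,

$$\lim_{\Delta \to 0} \frac{g^\Delta_{\alpha, \bar a}(0)}{\Delta} = -\frac{1}{2} \sum_a \alpha(a)\, s(a, \bar a)^2.$$

Moreover, it can be readily verified that

$$\big(g^\Delta_{\alpha, \bar a}\big)''(0) = \frac{\sum_a \alpha(a)\, s(a, \bar a)^2 \exp\big[-\Delta\, s(a, \bar a)^2/2\big]}{\sum_a \alpha(a) \exp\big[-\Delta\, s(a, \bar a)^2/2\big]}\, \Delta - \frac{\Big(\sum_a \alpha(a)\, s(a, \bar a) \exp\big[-\Delta\, s(a, \bar a)^2/2\big]\Big)^2}{\Big(\sum_a \alpha(a) \exp\big[-\Delta\, s(a, \bar a)^2/2\big]\Big)^2}\, \Delta,$$

and

$$\big(g^\Delta_{\alpha, \bar a}\big)'''\big(\xi^\Delta(z)\big) = O(\Delta^{3/2}), \quad \text{uniformly in } z.$$

Thus,

$$\lim_{\Delta \to 0} \frac{\big(g^\Delta_{\alpha, \bar a}\big)''(0)}{\Delta} = \sum_a \alpha(a)\, s(a, \bar a)^2 - \Big( \sum_a \alpha(a)\, s(a, \bar a) \Big)^2,$$

and

$$\frac{\big(g^\Delta_{\alpha, \bar a}\big)'''\big(\xi^\Delta(z)\big)}{\Delta} = O(\Delta^{1/2}), \quad \text{uniformly in } z,$$

and therefore, by Lebesgue's Dominated Convergence Theorem,

$$\begin{aligned}
h(\alpha_1, \alpha_1'; \alpha_2)
&= \lim_{\Delta \to 0} \frac{1}{\Delta} \sum_{\bar a} \alpha(\bar a) \big( I^\Delta_{\alpha, \bar a} - I^\Delta_{\alpha', \bar a} \big) \\
&= \frac{1}{2} \sum_{\bar a} \alpha(\bar a) \bigg[ \Big( \sum_a \alpha'(a)\, s(a, \bar a) \Big)^2 - \Big( \sum_a \alpha(a)\, s(a, \bar a) \Big)^2 \bigg] \\
&= \frac{1}{2} \sum_{\bar a} \alpha(\bar a) \big[ (\mu(\alpha') - \mu(\bar a))^2 - (\mu(\alpha) - \mu(\bar a))^2 \big] \\
&= \frac{1}{2} \sum_{\bar a} \alpha(\bar a)\, (\mu(\alpha') - \mu(\alpha))\, (\mu(\alpha') + \mu(\alpha) - 2\mu(\bar a)) \\
&= \frac{1}{2} (\mu(\alpha') - \mu(\alpha))^2,
\end{aligned}$$

as was to be shown.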
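The closed form $\tfrac{1}{2}(\mu(\alpha') - \mu(\alpha))^2$ can be checked numerically. The Python sketch below approximates the relative entropy between two Gaussian mixtures by the trapezoid rule and divides by the period length. It is a rough check under stated assumptions: the integration range of ten standard deviations, the grid size, the drift values and the function names are all illustrative choices rather than anything taken from the paper.

```python
import math

def mixture_density(y, alpha, mu, delta):
    """Density of the one-period signal under alpha: mixture of N(mu[a] * delta, delta)."""
    kernel = sum(w * math.exp(-(y - m * delta) ** 2 / (2.0 * delta))
                 for w, m in zip(alpha, mu))
    return kernel / math.sqrt(2.0 * math.pi * delta)

def relative_entropy(alpha, alpha_prime, mu, delta, half_width=10.0, steps=4000):
    """Trapezoid-rule approximation of H(q(.|alpha) || q(.|alpha'))."""
    sd = math.sqrt(delta)
    lo = -half_width * sd
    step = 2.0 * half_width * sd / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * step
        p = mixture_density(y, alpha, mu, delta)
        q = mixture_density(y, alpha_prime, mu, delta)
        weight = 0.5 if i in (0, steps) else 1.0  # endpoints get half weight
        if p > 0.0 and q > 0.0:
            total += weight * p * math.log(p / q) * step
    return total

def information_flow(alpha, alpha_prime, mu):
    """Closed form from the proof: (mu(alpha') - mu(alpha))^2 / 2."""
    m = sum(w * x for w, x in zip(alpha, mu))
    mp = sum(w * x for w, x in zip(alpha_prime, mu))
    return 0.5 * (mp - m) ** 2

if __name__ == "__main__":
    mu = (0.0, 1.0, 2.0)   # hypothetical drifts of three action profiles
    alpha, alpha_prime = (0.5, 0.5, 0.0), (0.0, 0.0, 1.0)
    for delta in (0.1, 0.01):
        print(delta, relative_entropy(alpha, alpha_prime, mu, delta) / delta,
              information_flow(alpha, alpha_prime, mu))
```

With pure actions the ratio equals the closed form for every period length, because the relative entropy between two Gaussians with common variance $\Delta$ and means $\mu\Delta$, $\mu'\Delta$ is exactly $(\mu - \mu')^2 \Delta / 2$.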


A.4 Proof of Lemma 1

We begin with a few lemmas that are needed in the proof of Lemma 1. The first lemma below is related to the Borel-Cantelli lemma. Under a stronger assumption than the latter, it asserts that, in addition to the events happening infinitely often with probability one, there is positive probability that the upper density of the indices for which the event obtains is bounded away from 0.

Lemma 2. Let $A_1, A_2, \ldots$ be a sequence of independent events on a probability space $(\Omega, \mathcal{F}, P)$. Suppose that $c = \liminf_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} P(A_k) > 0$. Then $P(A_n \text{ i.o.}) = 1$ and, in addition,

$$P\bigg( \limsup_{N \to \infty} \frac{\#\{n \le N : A_n \text{ obtains}\}}{N} > c \bigg) \ge \liminf_{N \to \infty} P\bigg( \frac{\#\{n \le N : A_n \text{ obtains}\}}{N} > c \bigg) \ge \frac{1}{2}.$$

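Proof. Since the assumption implies that $\sum_n P(A_n) = \infty$, it follows from Borel-Cantelli that $P(A_n \text{ i.o.}) = 1$. Let us turn to the statement concerning the upper density. By the Central Limit Theorem,

$$\frac{\sum_{k=1}^{n} \big( I_{A_k} - P(A_k) \big)}{\sqrt{\sum_{k=1}^{n} P(A_k)}} \longrightarrow \mathcal{N}(0, 1),$$

where the convergence is in distribution. Let $0 < c < \liminf_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} P(A_k)$. We have:

$$\begin{aligned}
P\bigg( \frac{1}{n} \sum_{k=1}^{n} I_{A_k} > c \bigg)
&= P\bigg( \frac{\sum_{k=1}^{n} \big( I_{A_k} - P(A_k) \big)}{\sqrt{\sum_{k=1}^{n} P(A_k)}} > \frac{cn - \sum_{k=1}^{n} P(A_k)}{\sqrt{\sum_{k=1}^{n} P(A_k)}} \bigg) \\
&\ge P\bigg( \frac{\sum_{k=1}^{n} \big( I_{A_k} - P(A_k) \big)}{\sqrt{\sum_{k=1}^{n} P(A_k)}} > 0 \bigg) \quad \text{for } n \text{ sufficiently large}.
\end{aligned}$$

Therefore, denoting by $\Phi$ the standard normal distribution function, we have

$$\liminf_{n} P\bigg( \frac{1}{n} \sum_{k=1}^{n} I_{A_k} > c \bigg) \ge 1 - \Phi(0) = 1/2,$$

as required. □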

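Lemma 3. If $p_k = \frac{1}{2}(1 + 1/k^\alpha)$ with $\alpha > 1/2$, then

$$\sum_{k=1}^{\infty} \ln\Big( \frac{1}{4 p_k (1 - p_k)} \Big) < \infty \quad \text{and} \quad \sum_{k=1}^{\infty} \Big( \ln \frac{p_k}{1 - p_k} \Big)^2 < \infty.$$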


Proof. Follows directly from the definition of $p_k$, the inequality $\ln(1 + x) < x$ for $x > 0$, and the fact that $\sum_{k=1}^{\infty} k^{-\beta} < \infty$ for all $\beta > 1$. □

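Lemma 4. Let $P$, $Q$ and $Q'$ be such that $P = \lambda Q + (1 - \lambda) Q'$. Let $\lambda_n$ be the posterior probability on the event that the process follows $Q$, conditional on $\mathcal{F}_n$. Then, for all $c > 0$,

$$Q(\lambda_n \le c \lambda_0) \le c.$$

Proof. Since the odds ratio $(1 - \lambda_n)/\lambda_n$ is a martingale under $Q$, we have:

$$Q(\lambda_n \le c \lambda_0) = Q\bigg( \frac{1 - \lambda_n}{\lambda_n} \ge \frac{1 - c\lambda_0}{c\lambda_0} \bigg) \le \frac{E_Q[(1 - \lambda_n)/\lambda_n]}{(1 - c\lambda_0)/(c\lambda_0)} = \frac{(1 - \lambda_0)/\lambda_0}{(1 - c\lambda_0)/(c\lambda_0)} \le c,$$

as was to be shown. □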
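Lemma 4 can be verified exactly in a small coin-tossing example. In the Python sketch below, $Q$ is taken to be a fair coin and $Q'$ a coin that lands Heads with a fixed probability `p`; both choices, the prior weight and the horizon are hypothetical and serve only to illustrate the bound. Since the posterior depends on the history only through the number of Heads, the $Q$-probabilities are computed by summing binomial weights rather than by simulation.

```python
import math

def posterior_on_Q(heads, n, p, prior):
    """Posterior weight on Q (fair coin) after `heads` Heads in n tosses,
    against the alternative Q' (i.i.d. coin with Heads probability p)."""
    like_q = 0.5 ** n
    like_q_prime = p ** heads * (1.0 - p) ** (n - heads)
    return prior * like_q / (prior * like_q + (1.0 - prior) * like_q_prime)

def prob_posterior_below(c, n, p, prior):
    """Exact Q-probability of the event {lambda_n <= c * lambda_0}."""
    total = 0.0
    for heads in range(n + 1):
        if posterior_on_Q(heads, n, p, prior) <= c * prior:
            total += math.comb(n, heads) * 0.5 ** n
    return total

def expected_odds_ratio(n, p, prior):
    """E_Q[(1 - lambda_n) / lambda_n]; equals the prior odds by the martingale property."""
    total = 0.0
    for heads in range(n + 1):
        lam_n = posterior_on_Q(heads, n, p, prior)
        total += math.comb(n, heads) * 0.5 ** n * (1.0 - lam_n) / lam_n
    return total

if __name__ == "__main__":
    n, p, prior = 20, 0.8, 0.3
    print("prior odds:", (1.0 - prior) / prior, "E_Q[odds_n]:", expected_odds_ratio(n, p, prior))
    for c in (0.05, 0.1, 0.25, 0.5, 1.0):
        print(c, prob_posterior_below(c, n, p, prior))
```

The first printed line illustrates the martingale property used in the proof, and each subsequent probability should not exceed the corresponding value of $c$.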

Lemma 5. Suppose that $(\mathcal{F}_n)_{n \ge 1}$ is a strictly increasing sequence of atomic $\sigma$-algebras. Then, for all $\beta > 1/2$ and all $\lambda \in (0, 1]$, there exist $P$, $Q$ and $Q'$ with $P = \lambda Q + (1 - \lambda) Q'$, such that for all $c > 0$,

$$\liminf_{N \to \infty} Q\big( \#\{n \ge 1 : d_n(P, Q) > c/N^\beta\} > N \big) \ge 1/2.$$

Proof. First, we prove the result for the case in which the measurable space is the set of all infinite sequences of Heads and Tails and the $\sigma$-algebra $\mathcal{F}_n$ is that generated by all histories of length $n$. Let $Q$ be the probability measure corresponding to i.i.d. draws of an unbiased coin. Let $Q'$ be the probability corresponding to independent draws of a coin such that the probability of Heads in period $n$ is $p_n = \frac{1}{2}(1 + 1/n^\alpha)$, where $1/2 < \alpha < \beta$. Let $P = \lambda Q + (1 - \lambda) Q'$ and $\lambda_n$ be the posterior probability that the stochastic process follows $Q$ conditional on period $n$ information.

Notice that $d_n(P, Q) = (p_n - 1/2)(1 - \lambda_n)$. Denote by $\xi_n$ the $\{0, 1\}$-valued random variable that takes value 1 if and only if the coin turns out Heads at time $n$. Explicit


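computation of Bayes rule yields for all $K > 0$:

$$\begin{aligned}
Q\bigg( \frac{d_n(P, Q)}{\lambda_n} \ge \frac{4K/\lambda_1}{n^\beta} \bigg)
&= Q\bigg( \frac{1 - \lambda_n}{\lambda_n} \ge \frac{8K/\lambda_1}{n^{\beta - \alpha}} \bigg) \\
&= Q\bigg( \prod_{k=1}^{n} (2 p_k)^{\xi_k} (2(1 - p_k))^{1 - \xi_k} \ge \frac{8K/(1 - \lambda_1)}{n^{\beta - \alpha}} \bigg) \\
&= Q\bigg( \sum_{k=1}^{n} \xi_k \ln(2 p_k) + \sum_{k=1}^{n} (1 - \xi_k) \ln(2(1 - p_k)) \ge \underbrace{\ln\big(8K/(1 - \lambda_1)\big)}_{=:\, B} - \underbrace{(\beta - \alpha)}_{=:\, \gamma\, >\, 0} \ln n \bigg) \\
&= Q\bigg( \sum_{k=1}^{n} \ln\Big( \frac{p_k}{1 - p_k} \Big) \xi_k \ge \sum_{k=1}^{n} \ln\Big( \frac{1}{2(1 - p_k)} \Big) + B - \gamma \ln n \bigg) \\
&= Q\bigg( \sum_{k=1}^{n} \ln\Big( \frac{p_k}{1 - p_k} \Big) \Big( \xi_k - \frac{1}{2} \Big) \ge \frac{1}{2} \sum_{k=1}^{n} \ln\Big( \frac{1}{4 p_k (1 - p_k)} \Big) + B - \gamma \ln n \bigg) \\
&\ge Q\bigg( \bigg| \sum_{k=1}^{n} \ln\Big( \frac{p_k}{1 - p_k} \Big) \Big( \xi_k - \frac{1}{2} \Big) \bigg| \le \gamma \ln n - \frac{1}{2} \sum_{k=1}^{n} \ln\Big( \frac{1}{4 p_k (1 - p_k)} \Big) - B \bigg). \qquad (4)
\end{aligned}$$

Chebyshev's inequality yields for all $M > 0$:

$$Q\bigg( \bigg| \sum_{k=1}^{n} \ln\Big( \frac{p_k}{1 - p_k} \Big) \Big( \xi_k - \frac{1}{2} \Big) \bigg| \ge M \bigg) \le \frac{\sum_{k=1}^{\infty} \big( \ln \frac{p_k}{1 - p_k} \big)^2}{4 M^2}.$$

By Lemma 3, $\sum_{k=1}^{\infty} \big( \ln \frac{p_k}{1 - p_k} \big)^2 < \infty$, hence we can choose $M > 0$ large enough so that

$$Q\bigg( \bigg| \sum_{k=1}^{n} \ln\Big( \frac{p_k}{1 - p_k} \Big) \Big( \xi_k - \frac{1}{2} \Big) \bigg| \ge M \bigg) \le \frac{1}{2}. \tag{5}$$

On the other hand, by Lemma 3, also $\sum_{k=1}^{\infty} \ln\big( \frac{1}{4 p_k (1 - p_k)} \big) < \infty$. Therefore, for $n$ sufficiently large,

$$\gamma \ln n - \frac{1}{2} \sum_{k=1}^{n} \ln\Big( \frac{1}{4 p_k (1 - p_k)} \Big) - B \ge M.$$

Therefore, by the inequalities (4) and (5) above, for all $n$ sufficiently large,

$$Q\bigg( \frac{d_n(P, Q)}{\lambda_n} \ge \frac{4K/\lambda_1}{n^\beta} \bigg) \ge \frac{1}{2}.$$

By Lemma 4, $Q\big( \lambda_n \le \frac{1}{4} \lambda_1 \big) \le \frac{1}{4}$. Thus, for $n$ large enough,

$$Q\big( d_n(P, Q) \ge K/n^\beta \big) + \frac{1}{4} \ge Q\big( d_n(P, Q) \ge K/n^\beta \big) + Q\Big( \lambda_n \le \frac{1}{4} \lambda_1 \Big) \ge Q\bigg( \frac{d_n(P, Q)}{\lambda_n} \ge \frac{4K/\lambda_1}{n^\beta} \bigg) \ge \frac{1}{2}.$$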


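We have thus proved that

$$\liminf_{n \to \infty} Q\big( d_n(P, Q) \ge K/n^\beta \big) \ge \frac{1}{4}.$$

Therefore, the events $\{d_n(P, Q) \ge K/n^\beta\}$ satisfy the assumption of Lemma 2 and we have

$$\liminf_{N \to \infty} Q\bigg( \frac{\#\{n \le N : d_n(P, Q) \ge K/n^\beta\}}{N} \ge \frac{1}{4} \bigg) \ge \frac{1}{2}.$$

Since

$$Q\bigg( \frac{\#\{n \le N : d_n(P, Q) \ge K/N^\beta\}}{N} \ge \frac{1}{4} \bigg) \ge Q\bigg( \frac{\#\{n \le N : d_n(P, Q) \ge K/n^\beta\}}{N} \ge \frac{1}{4} \bigg),$$

it follows that

$$\liminf_{N} Q\Big( \#\{n : d_n(P, Q) \ge K/N^\beta\} \ge \frac{1}{4} N \Big) \ge \frac{1}{2},$$

which proves the desired result for the case in which $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{n \ge 1})$ is the canonical coin tossing space.

If $(\Omega, \mathcal{F})$ is an arbitrary space with a strictly increasing atomic filtration $(\mathcal{F}_n)$, then, for every $n$, there are at least two disjoint atomic events $A_n, B_n$ such that $A_n, B_n \in \mathcal{F}_n \setminus \mathcal{F}_{n-1}$. Proceed as in the previous construction, identifying the event $A_n$ with "Heads" and $B_n$ with "Tails". □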

We are now ready to prove Lemma 1.

First, a close look at S. Sorin's proof of the Fudenberg-Levine uniform merging theorem (Sorin, 1999, Lemma 2.5) reveals that the integer $K$ of inequality (2) can be taken as $C/\varepsilon^2$, where $C > 0$ is a constant that depends only on $\lambda$ and $\eta$. Hence, we have $\limsup_{\varepsilon \to 0} \varepsilon^2 K^*(\varepsilon, \eta, \lambda) \le \limsup_{\varepsilon \to 0} \varepsilon^2 K = C < \infty$, which is the first claim.

Consider the second claim. Assume the filtration is strictly increasing. Suppose, towards a contradiction, that for some $q > 0$ and $\lambda \in (0, 1]$ there exist a function $(\varepsilon, \eta) \mapsto N(\varepsilon, \eta)$ and a sequence $\eta_k \to 0$ with

$$\limsup_{\varepsilon \to 0}\ \varepsilon^{2-q} N(\varepsilon, \eta_k) < \infty,$$

and such that for all $P$, $Q$ and $Q'$ with $P = \lambda Q + (1 - \lambda) Q'$ one has:

$$Q\big( \#\{n \ge 1 : d_n(P, Q) > \varepsilon\} > N(\varepsilon, \eta_k) \big) \le \eta_k$$

for all $k \ge 1$ and for all $\varepsilon > 0$ sufficiently small.


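For each $k \ge 1$, let $M_k = 1 + \limsup_{\varepsilon \to 0} \varepsilon^{2-q} N(\varepsilon, \eta_k)$. Define a double sequence $\varepsilon_{m,k} = (M_k/m)^{1/(2-q)}$ for all $k, m \ge 1$. Hence we have for all $k \ge 1$,

$$\limsup_{m \to \infty} N(\varepsilon_{m,k}, \eta_k)/m < 1.$$

By Lemma 5, there exist probability measures $P$, $Q$ and $Q'$, with $P = \lambda Q + (1 - \lambda) Q'$, such that for all $k \ge 1$,

$$\liminf_{m \to \infty} Q\big( \#\{n \ge 1 : d_n(P, Q) > \varepsilon_{m,k}\} > N(\varepsilon_{m,k}, \eta_k) \big) \ge 1/2.$$

Since $\eta_k < 1/2$ for $k$ large enough, it follows that for every $\bar\varepsilon > 0$ and every $k$ sufficiently large there exists $\varepsilon \in (0, \bar\varepsilon]$ such that:

$$Q\big( \#\{n \ge 1 : d_n(P, Q) > \varepsilon\} > N(\varepsilon, \eta_k) \big) > \eta_k,$$

which is a contradiction. The contradiction shows that

$$\lim_{\varepsilon \to 0}\ \varepsilon^{2-q} K^*(\varepsilon, \eta, \lambda) = \infty$$

for every $q > 0$, $\lambda \in (0, 1]$ and $\eta$ sufficiently small, as was to be shown.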

References

Abreu, Dilip, Paul Milgrom, and David Pearce (1991), "Information and timing in repeated partnerships." Econometrica, 59, 1713–1733.

Blackwell, D. and L. Dubins (1962), "Merging of opinions with increasing information." Ann. Math. Statist., 33, 882–886.

Bohren, Aislinn (2011), "Stochastic games in continuous time: Persistent actions in long-run relationships." University of California, San Diego.

Cover, Thomas M. and Joy A. Thomas (2006), Elements of Information Theory. Wiley, second edition.

Ekmekci, Mehmet, Olivier Gossner, and Andrea Wilson (2012), "Impermanent types and permanent reputations." Journal of Economic Theory, 147, 142–178.

Faingold, Eduardo (2006), "Building a reputation under frequent decisions." Mimeo., University of Pennsylvania.

Faingold, Eduardo (2013), "Maintaining a reputation in continuous time." Yale University.


Faingold, Eduardo and Yuliy Sannikov (2011), "Reputation in continuous-time games." Econometrica, 79, 773–876.

Fudenberg, Drew and David K. Levine (1989), "Reputation and equilibrium selection in games with a single long-run player." Econometrica, 57, 759–778.

Fudenberg, Drew and David K. Levine (1992), "Maintaining a reputation when strategies are imperfectly observed." Review of Economic Studies, 59, 561–579.

Fudenberg, Drew and David K. Levine (2007), "Continuous time models of repeated games with imperfect public monitoring." Review of Economic Dynamics, 10, 173–192.

Fudenberg, Drew and David K. Levine (2009), "Repeated games with frequent signals." Quarterly Journal of Economics, 124, 233–265.

Gossner, Olivier (2011), "Simple bounds on the value of a reputation." Econometrica, 79, 1627–1641.

Kalai, Ehud and Ehud Lehrer (1993), "Rational learning leads to Nash equilibrium." Econometrica, 61, 1019–1045.

Kreps, David and Robert Wilson (1982), "Reputation and imperfect information." Journal of Economic Theory, 27, 253–279.

Milgrom, Paul and John Roberts (1982), "Predation, reputation and entry deterrence." Journal of Economic Theory.

Neveu, J. (1975), Discrete-Parameter Martingales. Elsevier Science.

Sannikov, Yuliy and Andrzej Skrzypacz (2007), "Impossibility of collusion under imperfect monitoring with flexible production." American Economic Review, 97, 1794–1823.

Sannikov, Yuliy and Andrzej Skrzypacz (2010), "The role of information in repeated games with frequent actions." Econometrica, 78, 847–882.

Sorin, Sylvain (1999), "Merging, reputation and repeated games with incomplete information." Games and Economic Behavior, 29, 274–308.
