CENTRE FOR DYNAMIC MACROECONOMIC ANALYSIS WORKING PAPER SERIES
* For helpful comments, I thank Herbert Dawid, Martin Ellison, Christian Haefke, Morten Ravn, Michael Reiter, Martin Summer, Peter Tinsley, Stephen Wright and seminar participants at the Bank of England, Birkbeck College London, the CDMA conference in St Andrews and the Institute for Advanced Studies, Vienna. † Department of Economics, University College London, Gower Street, London WC1E 6BT, UK.
CASTLECLIFFE, SCHOOL OF ECONOMICS & FINANCE, UNIVERSITY OF ST ANDREWS, KY16 9AL TEL: +44 (0)1334 462445 FAX: +44 (0)1334 462444 EMAIL: [email protected]
www.st-and.ac.uk/cdma
CDMA11/12
Individual rationality, model-consistent
expectations and learning*
Liam Graham†
20 AUGUST 2011
ABSTRACT
To isolate the impact of the assumption of model-consistent expectations, this paper
proposes a baseline case in which households are individually rational, have full information
and learn using forecast rules specified as in the minimum state variable representation of the
economy. Applying this to the benchmark stochastic growth model shows that the economy with
learning converges quickly to an equilibrium very similar to that with model-consistent
expectations. In other words, if households are individually rational, the assumption that they
can also form model-consistent expectations does not seem a strong one. The mechanism by
which learning affects the model is considered in detail and the implications of relaxing the
assumptions of the baseline case are explored.
JEL Classification: D83; C62; E30.
Keywords: adaptive learning; rational expectations; bounded rationality; expectations
formation.
1 Introduction
The macroeconomic learning literature assumes agents are unable to form model-consistent
("rational") expectations. The question is then how to model the formation of expectations
and whether a particular model of expectation formation will mean the economy
converges to the same equilibrium as an economy with model-consistent expectations or
whether learning adds new dynamics.
Models of learning need to make assumptions in three further areas. Firstly, how
rational are individuals conditional on their expectations of the macroeconomy (Adams
and Marcet, 2011, refer to this as internal rationality)? Some work (e.g. Evans et al, 2009;
see also the discussion in Evans et al, 2011) assumes individuals are rational regarding
their individual decisions (i.e. they are fully forward-looking and know their budget
constraints). Others (e.g. Bullard and Mitra, 2002; Carceles-Poveda and Giannitsarou,
2007; Evans et al, 2011) adopt the "Euler equation learning" approach, which assumes
that agents are boundedly rational with regard to their individual decisions, looking
forward only a single period and ignoring their first-order conditions and budget constraints
beyond that.
Secondly, what information is at agents’ disposal? Some work (e.g. Evans and
Honkapohja, 2001) assumes that agents learn from aggregate consumption, some from
the states (e.g. Carceles-Poveda and Giannitsarou, 2007) and some from other variables
(e.g. Evans et al, 2009).
Thirdly, given this information, how are learning rules specified? Some papers assume
learning rules are specified in the same way as the minimum state variable solution of the
economy with model-consistent expectations (e.g. Carceles-Poveda and Giannitsarou,
2007; such learning rules are also the "saddlepath learning" of Ellison and Pearlman,
2011); some add an intercept to such a rule (e.g. Milani, 2011) and many others make
plausible but apparently arbitrary assumptions about which variables are in the learning
rules.
There is a wide diversity of such assumptions across the literature and it is often
difficult to see the extent to which results arise from the central question of the learning
literature, the inability to form model-consistent expectations, or from assumptions in
other areas. To address this issue, this paper proposes a baseline case in which the
only assumption relaxed is that of model-consistent expectations. This case consists of
individuals who are rational conditional on their expectations; who have full information
about the macroeconomy (in the sense that they observe relevant aggregates without
noise) and who have learning rules specified in terms of the minimum state variable
(MSV) representation of the economy under model-consistent expectations.
Such a baseline case isolates the impact of the assumption of model-consistent expec-
tations and is applicable to any learning model. This paper uses it to investigate the
impact of learning in the stochastic growth model. Then the assumptions of the baseline
case are relaxed in two directions. Firstly, the degree of individual rationality is allowed
to vary by adding the household’s forecast horizon as a parameter. Secondly, different
specifications of the learning rule are investigated¹.
¹ The third assumption of full information is relaxed in a companion paper, Graham (2011).
The main results are as follows:
1. The degree to which households are forward-looking has a dramatic effect on the
speed of convergence. If households look forward only one period (which is similar
to "Euler equation learning") the model can take many tens of thousands of periods
to converge and may not converge at all. On the other hand, if households have
an infinite horizon, convergence is fast and robust: in the case of ordinary least
squares learning a standard theorem can be applied to show the rate of convergence
is at least √t.
2. Constant gain learning has only very small effects on the business cycle properties
of the model. Further, under most parameterizations the effect of learning is to
mute the response of output to technology shocks.
3. If an intercept is included in the learning rules, the effect of learning is somewhat
stronger, but still small and the response of output is still muted.
The first set of results relates to the speed of convergence. This matters for two
reasons. Firstly, fast convergence gives a justification for the hypothesis of model-
consistent expectations. If convergence is fast, a model-consistent expectations equi-
librium (MCEE) can be interpreted as the outcome of a learning process that has already
converged (Grandmont, 1998) without the need for strong assumptions on households’
cognitive ability. Secondly, studies of the implications of learning for business cycle dy-
namics typically initialize the model at the MCEE to avoid arbitrary transition dynamics
contaminating the results. If convergence is fast, this seems justified. If it is slow, this
is endowing households with exactly what they would have difficulty learning.
This paper shows that households’ forecast horizon is the key variable that determines
the speed of convergence under both ordinary least squares and constant gain learning.
With infinite horizons, convergence is fast; with short horizons, it is very slow and may
not occur at all. Some previous work (Dawid, 2005; Branch et al, 2010) has examined
the impact of such horizons for macroeconomic dynamics. This paper shows a further
way in which the horizon matters.
The second set of results relates to business cycle dynamics. With the baseline set
of assumptions, the effects of learning are quite small. A natural metric is the standard
deviation of aggregate consumption, and learning increases this by at most 2% over
its value at the MCEE. While learning makes consumption more volatile, its impact
on labour supply and investment means output is less volatile than at the MCEE. This
stands in contrast to the simple intuition that by increasing the volatility of expectations,
learning increases the volatility of the macroeconomy (it turns out that this intuition can
be recovered only in the case of very short forecast horizons, which correspond to "Euler
equation learning"). Understanding these results requires a careful consideration of
the complicated mechanism by which constant gain learning affects the dynamics of this
economy. To elucidate this mechanism, the paper considers a simple univariate example
with exogenous income and fixed capital and labour before turning to the model economy.
One way of interpreting results (1) and (2) is that if households are individually rational,
the assumption of model-consistent expectations is not all that important. Individual
rationality and constant gain learning are sufficient for the economy to converge quickly
to an equilibrium which would in practice be indistinguishable from the equilibrium with
model-consistent expectations.
This all applies to the baseline case in which learning rules are specified as in the
MSV solution. While there are many other plausible specifications for learning rules
(for example, including aggregate consumption or more lags) a number of recent papers
(e.g. Milani, 2011) have added an intercept to the learning rule, often interpreted as
representing uncertainty about the steady state. This is shown to strengthen the effect
of learning on the standard deviation of consumption, the maximum effect being an
increase of around 4% in the standard deviation, but output is still less volatile than
under model-consistent expectations.
The paper proceeds as follows. Section 2 presents the model and section 3 discusses
its general properties under learning. The properties of the model under least squares
learning are discussed in section 4 and under constant gain learning in section 5. Section
6 concludes. Detailed derivations are provided in the Appendices.
2 The model
This section presents the standard stochastic growth model. Rather than starting from a
representative household, a large number S of identical households are considered. While
in equilibrium households will be identical, it is important to carefully distinguish between
aggregate and individual quantities when modelling an individual household’s decision.
Upper case letters represent levels; lower case letters their linearized equivalents.
2.1 Households
The problem of household $s$ is to choose paths for consumption ($C_t^s$) and labour supply
($N_t^s$) to maximize expected lifetime utility given by
$$E_t^s \sum_{i=0}^{\infty} \beta^i \left[ \ln C_{t+i}^s + \theta \frac{\left(1-N_{t+i}^s\right)^{1-\eta}}{1-\eta} \right] \tag{1}$$
where $1/\eta$ is the intertemporal elasticity of labour supply and $\beta$ the subjective discount
rate. The expectations operator is written as $E_t^s$ since in the general case households will
have model-inconsistent ("non-rational") expectations and expectations will differ across
households. The maximization is subject to a budget constraint
$$R_{kt}^s K_t^s + W_t^s N_t^s = C_t^s + I_t^s \tag{2}$$
where $W_t^s$ is the wage, $R_{kt}^s$ the aggregate return to capital, $I_t^s$ is investment and $K_t^s$ capital,
which evolves according to
$$K_{t+1}^s = (1-\delta) K_t^s + I_t^s \tag{3}$$
where $\delta$ is the depreciation rate.
The household’s first-order conditions consist of an Euler equation
$$\frac{1}{C_t^s} = \beta E_t^s \left[ \frac{R_{t+1}^s}{C_{t+1}^s} \right] \tag{4}$$
where $R_t^s = R_{kt}^s + 1 - \delta$ is the gross return to a one-period investment in capital, and a
labour supply relation
$$\theta \left(1-N_t^s\right)^{-\eta} = \frac{W_t^s}{C_t^s} \tag{5}$$
2.2 Firms
There are also a large number of identical firms which use aggregate capital and labour
in their production function:
$$Y_t = K_t^{1-\alpha} \left(A_t N_t\right)^{\alpha} \tag{6}$$
where $A_t$ is an aggregate productivity shock. The first-order conditions are
$$R_{kt} = (1-\alpha)\frac{Y_t}{K_t}, \qquad W_t = \alpha \frac{Y_t}{N_t} \tag{7}$$
2.3 Aggregates
Aggregates are defined explicitly, for example aggregate consumption is
$$c_t = \frac{1}{S} \sum_{s=1}^{S} c_t^s \tag{8}$$
2.4 Linearisation
As is standard in the learning literature, this paper will study a linearised version of the
above model². The household’s Euler equation (4) is
$$E_t^s \Delta c_{t+1}^s = E_t^s r_{t+1}^s \tag{9}$$
and labour supply (5) is
$$n_t^s = \varsigma \left[ w_t - c_t^s \right] \tag{10}$$
where $\varsigma = \frac{1-N}{N\eta}$ and $N$ is steady state labour. Household capital evolves according to
$$k_{t+1}^s = (1-\delta) k_t^s + \delta i_t^s \tag{11}$$
and the budget constraint (2) is
$$\varpi c_t^s + (1-\varpi) i_t^s = \alpha \left(w_t^s + n_t^s\right) + (1-\alpha)\left(r_{kt}^s + k_t^s\right) \tag{12}$$
where $\varpi$ is the steady state consumption-output ratio.
The first-order conditions for firms are
$$w_t = y_t - n_t \tag{13}$$
$$r_{kt} = y_t - k_t \tag{14}$$
and the production function is
$$y_t = (1-\alpha) k_t + \alpha \left(n_t + a_t\right) \tag{15}$$
To close the model, specify a process for exogenous technology
$$a_t = \rho a_{t-1} + \varepsilon_t \tag{16}$$
where $\varepsilon_t$ is drawn from $N(0, \sigma^2)$.
² More details can be found in Appendix A.
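As a check on the coefficient in (10), the labour supply relation (5) can be log-linearised directly. The lines below are a sketch, not the derivation in Appendix A: they assume lower case letters denote log deviations from steady state and use $W_t^s = W_t$.

```latex
% Log-linearising (5): theta (1 - N_t^s)^{-eta} = W_t / C_t^s
\begin{align*}
\ln\theta - \eta \ln\left(1 - N_t^s\right) &= \ln W_t - \ln C_t^s \\
\eta \frac{N}{1-N}\, n_t^s &= w_t - c_t^s
  && \text{(first-order expansion around steady-state } N\text{)} \\
n_t^s &= \frac{1-N}{N\eta}\left[w_t - c_t^s\right] = \varsigma\left[w_t - c_t^s\right]
\end{align*}
```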
2.5 Equilibrium
Definition 1. (Equilibrium) A competitive equilibrium for the above economy is a
sequence of plans for
• allocations $\left\{c_t^s, n_t^s, k_{t+1}^s\right\}_{t=1:\infty}^{s=1:S}$
• prices $\{r_t, w_t\}_{t=1:\infty}$
• aggregate factor inputs $\{k_t, n_t\}_{t=1:\infty}$
such that
1. Given prices, the allocations solve the utility maximization problem for all households.
2. $\{r_t, w_t\}$ are the marginal products of aggregate and individual capital and labour
i.e. $r_t^s = r_t$, $w_t^s = w_t$ ∀s.
3. All markets clear
2.6 Calibration
Values for most of the parameters are chosen following Campbell (1994): δ = 0.025,
α = 0.6, β = 0.99, N = 0.2. The intertemporal elasticity of labour supply 1/η is chosen
to be 5. The aggregate productivity shock is given standard RBC values, ρ = 0.9,
σ = 0.7% per quarter. This is only a benchmark calibration. Sensitivities are given to
all important parameters.
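The calibration pins down several quantities that appear in the linearised model. The sketch below computes them in Python. The steady-state relations are derived here from equations (2) to (7) rather than quoted from the paper, and the variable names are illustrative assumptions.

```python
# Baseline calibration from section 2.6 (values following Campbell, 1994).
delta, alpha, beta, N = 0.025, 0.6, 0.99, 0.2
eta = 1 / 5.0            # the elasticity of labour supply 1/eta is set to 5
rho, sigma = 0.9, 0.007  # technology process (16), sigma = 0.7% per quarter

# Coefficient on (w - c) in the linearised labour supply relation (10).
varsigma = (1 - N) / (N * eta)

# Steady state implied by the model equations (derived here, not quoted).
R_gross = 1 / beta             # Euler equation (4) with constant consumption
R_k = R_gross - 1 + delta      # because R = R_k + 1 - delta
K_over_Y = (1 - alpha) / R_k   # firm first-order condition (7)
varpi = 1 - delta * K_over_Y   # consumption-output ratio, using I = delta * K

# Unconditional standard deviation of technology under (16).
sd_a = sigma / (1 - rho ** 2) ** 0.5

print(f"varsigma = {varsigma:.2f}, R_k = {R_k:.4f}, K/Y = {K_over_Y:.2f}")
print(f"varpi = {varpi:.4f}, sd(a) = {sd_a:.4f}")
```

Changing any calibrated value at the top and rerunning gives a quick sensitivity check on these derived quantities.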
3 Learning and optimal decisions
With model-consistent expectations, the model presented above is fully specified. With
learning, further modelling choices have to be made in three areas. Firstly, what degree
of individual rationality do we assume? Secondly, what variables are in households’
information set? Thirdly, how are households’ learning rules specified?
The existing literature makes various choices. The "Euler equation" approach (see
Evans et al, 2011) assumes households are boundedly rational in that they use only their
Euler equation to implement consumption, forecasting their own consumption and the
return just one period ahead. This implies that consumption decisions will not satisfy
expected budget constraints (since the household does not look forward more than one
period), though the budget constraints themselves always hold. The learning rules
are written in terms of aggregate consumption³ and the states. Households have full
information on aggregate consumption (or at least they know all households are identical
so their consumption is the aggregate), the return and the states. This is generalised in
Branch et al (2010) to allow households to look forward an arbitrary number of periods.
In contrast, Carceles-Poveda and Giannitsarou (2007) implement a model in which
households only look one period ahead but forecast capital using a learning rule specified
as in the MSV solution (their equation 45). The process for technology is assumed to be
known. They do not make the distinction between individual and aggregate quantities
which is equivalent to assuming that households know they are identical i.e. the forecast
of aggregate capital is taken to also be a forecast of individual capital.
Eusepi and Preston (2011) take a different approach. They assume households are
rational (having infinite horizons and using both budget constraints and Euler equations).
In terms of information, households observe aggregate states and prices. Learning rules
contain the same variables that appear in the minimum state variable representation of
the economy, with the addition of an intercept. They further assume that the process
for technology is known (this is necessary for households to be able to detrend by it)
and that while the innovation to technology is observed by households for the purposes
of implementing consumption, it is not used in the learning process (see the discussion
in their section II). A further example is Evans et al (2009). Here households are
individually rational and the learning rule is specified in terms of prices not states.
³ In this sense, "Euler equation learning" has agents learning about their own consumption decisions. Forecasting choice variables seems a somewhat odd way to model bounded rationality.
These four examples illustrate some of the diversity of assumptions to be found across
the learning literature. This paper proposes a baseline case in which households:
1. Are rational conditional on their expectations of the macroeconomy
2. Have an information set consisting of all aggregates
3. Forecast using rules of the same form as the minimum state variable representation
of the economy with model-consistent expectations
Comparing such an economy with one in which households have model-consistent
expectations gives the cleanest answer to the basic question of the learning literature -
how important is the assumption of model-consistent expectations for the properties of
the macroeconomy?
3.1 Optimal consumption
Assume the household looks forward $T$ periods when making its consumption decision
(Dawid, 2005, refers to this as the "planning" horizon). Clearly $T = \infty$ is the standard
infinite horizon case. With $T = 1$ the structural model is identical to that of Carceles-Poveda
and Giannitsarou (2007) and is closely related to the "Euler equation learning"
of Evans et al (2011)⁴.
⁴ There are cases in which one-period forecasts can be optimal e.g. risk neutrality.
To solve for consumption, substitute for labour from (10) and for capital from (11) in
the budget constraint (12) to obtain
$$k_t^s = \beta k_{t+1}^s + \frac{1}{\gamma_1}\left[\gamma_2 c_t^s - \gamma_3 w_t^s - \gamma_5 r_t^s\right] \tag{17}$$
where the constants are defined as part of the derivation in Appendix B.
Iterate this forward $T$ periods, take expectations at time $t$ then rearrange to give
$$\gamma_2 E_t^s \sum_{j=0}^{T} \beta^j c_{t+j}^s = \left[\gamma_1 k_t^s + \gamma_3 w_t^s + \gamma_5 r_t^s\right] + E_t^s \sum_{j=1}^{T} \beta^j \left(\gamma_3 w_{t+j}^s + \gamma_5 r_{t+j}^s\right) - \gamma_1 \beta^{T+1} E_t^s k_{t+T+1}^s$$
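Then the Euler equation (9) can be used to substitute for expected future consumption
in terms of the return to give
$$c_t^s = \gamma_{ck}\left[\gamma_1 k_t^s + \gamma_3 w_t^s + \gamma_5 r_t^s\right] + \gamma_{cw} E_t^s \sum_{j=1}^{T} \beta^j \left(\gamma_{cw} w_{t+j}^s + \gamma_{cr} r_{t+j}^s\right) \tag{18}$$
$$\qquad + \gamma_{cs} E_t^s \sum_{j=1}^{T} r_{t+j}^s - \gamma_1 \gamma_{ck} \beta^{T+1} E_t^s k_{t+T+1}^s \tag{19}$$
where the constants are again defined in Appendix B.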
The term in square brackets is consumption out of current wealth, consisting of the
households’ capital and their current factor income (the constants on the prices arise from
substituting out for quantities). The other terms represent consumption out of expected
future income. The presence of the final term is a reminder that the problem is that of
an infinitely lived household with a finite forecast horizon.
3.2 The perceived law of motion
Following the "saddlepath learning" of Ellison and Pearlman (2011), learning rules are
assumed to be specified in terms of the variables in the MSV representation of the economy
with model-consistent expectations, which in this model means the aggregate state vector
is
$$X_t = \begin{bmatrix} k_t & a_t \end{bmatrix}'$$
Define matrices $T_k$ and $T_a$ such that $k_t = T_k X_t$, $a_t = T_a X_t$.
Households are assumed to estimate a first-order VAR in the aggregate state vector
$$X_t = \phi_t^s X_{t-1} + \varepsilon_{\phi t} \tag{20}$$
Since aggregate states do not appear in (19), households also need to estimate the relation
between prices and states
$$Z_t^s = \varphi_t^s X_t + \varepsilon_{\varphi t} \tag{21}$$
where $Z_t^s = \begin{bmatrix} w_t^s & r_t^s \end{bmatrix}'$.
Then
$$E_t^s w_{t+i}^s = T_w \varphi_t^s \left(\phi_t^s\right)^i X_t \tag{22}$$
$$E_t^s r_{t+i}^s = T_r \varphi_t^s \left(\phi_t^s\right)^i X_t \tag{23}$$
where $T_w$ and $T_r$ are matrices such that $w_t^s = T_w Z_t^s$; $r_t^s = T_r Z_t^s$.
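The forecasting step in (22) and (23) can be sketched in a few lines of Python. This is an illustration only: it assumes the household's beliefs happen to equal the MCEE values reported in (35) and (36), and the function and variable names are hypothetical.

```python
import numpy as np

# Illustrative beliefs: the MCEE values reported in (35) and (36).
phi = np.array([[0.9635, 0.0585],
                [0.0000, 0.9000]])      # state law of motion, X_t = phi X_{t-1}
varphi = np.array([[0.4550, 0.4886],
                   [-0.0237, 0.0267]])  # prices on states, Z_t = varphi X_t

T_w = np.array([1.0, 0.0])  # selects the wage from Z_t = [w_t, r_t]'
T_r = np.array([0.0, 1.0])  # selects the return from Z_t


def price_forecasts(phi, varphi, X_t, horizon):
    """Return E_t w_{t+i} and E_t r_{t+i} for i = 1..horizon, as in (22)-(23)."""
    wages, returns = [], []
    X_fore = np.asarray(X_t, dtype=float)
    for _ in range(horizon):
        X_fore = phi @ X_fore      # builds (phi)^i X_t recursively
        Z_fore = varphi @ X_fore   # maps forecast states into forecast prices
        wages.append(T_w @ Z_fore)
        returns.append(T_r @ Z_fore)
    return np.array(wages), np.array(returns)


# Example: capital at its steady state, technology 1% above it.
w_path, r_path = price_forecasts(phi, varphi, X_t=[0.0, 0.01], horizon=4)
```

Under learning the same function would be called each period with the household's current estimates in place of the fixed matrices used here.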
A number of papers (e.g. Carceles-Poveda and Giannitsarou, 2007 and Evans et al,
2011) omit this step. This is equivalent to assuming that (a) households know the
relations between prices, states and consumption (28) and (29) below (or can estimate
exactly such relations) and (b) households know that they are identical. In this case
there is no need to estimate processes for the prices, and (19) reduces to an expression
in the aggregate states and expectations thereof.
In the infinite horizon case, the final term in (19) can be dropped using the transversal-
ity condition. With finite forecast horizons, households need to forecast this term, their
own future capital. This is odd, and is analogous to households with "Euler equation
learning" needing to forecast their own consumption (see footnote 3). Since households
are only identical in equilibrium, the most consistent way of addressing this would be
to have households have a separate learning rule for their own capital. But to simplify
things, and to allow comparison with Carceles-Poveda and Giannitsarou (2007), this paper
assumes that for the purposes of forecasting the final term households know that their
capital will always be equal to aggregate capital and so they can use (20) to forecast it.
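Then the consumption function (19) can be written
$$c_t^s = \gamma_{ck}\left(\gamma_1 k_t + \gamma_3 w_t^s + \gamma_5 r_t^s\right) + \gamma_{cX}^s X_t \tag{24}$$
where the expectational terms are captured in
$$\gamma_{cX}^s = \left(\gamma_{cw} T_w + \gamma_{cr} T_r\right)\varphi_t^s \beta\phi_t^s \left(I - \left(\beta\phi_t^s\right)^T\right)\left(I - \beta\phi_t^s\right)^{-1} + \gamma_{cs} T_r \varphi_t^s \phi_t^s \left(I - \left(\phi_t^s\right)^T\right)\left(I - \phi_t^s\right)^{-1} - \gamma_1 \gamma_{ck} T_k \left(\beta\phi_t^s\right)^{T+1} \tag{25}$$
In the case of $T = \infty$, this expression reduces to
$$\gamma_{cX} = \left(\gamma_{cw} T_w + \gamma_{cr} T_r\right)\varphi_t^s \beta\phi_t^s \left(I - \beta\phi_t^s\right)^{-1} \tag{26}$$
and in the case of $T = 1$
$$\gamma_{cX} = \left(\gamma_{cw} T_w + \gamma_{cr} T_r\right)\varphi_t^s \beta\phi_t^s + \gamma_{cs} T_r \varphi_t^s \phi_t^s - \gamma_1 \gamma_{ck} T_k \left(\beta\phi_t^s\right)^2 \tag{27}$$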
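The matrix factors in (25) to (27) are closed forms for discounted sums of powers of the belief matrix. The Python sketch below checks the closed form against a term-by-term sum. It covers only the sum itself, since the γ constants are defined in Appendix B and are not reproduced here. The function names are hypothetical, and the belief matrix is set to the MCEE value from (35) purely for illustration.

```python
import numpy as np


def discounted_sum(phi, beta, T):
    """Closed form for sum_{j=1}^{T} (beta*phi)^j, the matrix factor in (25)."""
    n = phi.shape[0]
    B = beta * phi
    if np.isinf(T):
        # Infinite horizon as in (26); needs eigenvalues of B inside the unit circle.
        return B @ np.linalg.inv(np.eye(n) - B)
    B_pow_T = np.linalg.matrix_power(B, int(T))
    return B @ (np.eye(n) - B_pow_T) @ np.linalg.inv(np.eye(n) - B)


def brute_force_sum(phi, beta, T):
    """Term-by-term version of the same sum, used only as a check."""
    B = beta * phi
    total = np.zeros_like(B)
    term = np.eye(B.shape[0])
    for _ in range(T):
        term = term @ B
        total = total + term
    return total


phi_star = np.array([[0.9635, 0.0585],
                     [0.0000, 0.9000]])  # MCEE beliefs from (35)
beta = 0.99

one_period = discounted_sum(phi_star, beta, 1)         # equals beta * phi, as in (27)
infinite = discounted_sum(phi_star, beta, np.inf)      # the factor in (26)
```

The infinite-horizon branch fails when (I − βφ) is singular, which is the invertibility condition discussed in section 3.7.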
3.3 The actual law of motion
The derivations that follow are from the modeler’s perspective. No agent in the economy
has sufficient knowledge to carry them out (which is another way of saying that they are
unable to form model-consistent expectations).
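In equilibrium, all households are identical i.e. for any variable $x$, $x_t^s = x_t$ ∀s. If
markets clear, prices are:
$$w_t = \lambda_{wk} k_t + \lambda_{wa} a_t + \lambda_{wc} c_t \tag{28}$$
$$r_t = \lambda_{rk} k_t + \lambda_{ra} a_t + \lambda_{rc} c_t \tag{29}$$
and labour is
$$n_t = \nu \left((1-\alpha) k_t + \alpha a_t - c_t\right) \tag{30}$$
Expressions for the coefficients are given in Appendix A.3. Note that in the case of fixed
labour supply ($\eta \to \infty$) prices are independent of aggregate consumption, $\lambda_{wc} = \lambda_{rc} = 0$.
Given households are identical, (24) is also an expression for aggregate consumption
and substituting for current prices in terms of states from (28) and (29) gives
$$c_t = \frac{\gamma_{ck}}{1 - \gamma_{ck}\left(\gamma_3 \lambda_{wc} + \gamma_5 \lambda_{rc}\right)} \left( \begin{bmatrix} \gamma_1 + \gamma_3 \lambda_{wk} + \gamma_5 \lambda_{rk} \\ \gamma_3 \lambda_{wa} + \gamma_5 \lambda_{ra} \end{bmatrix}' + \gamma_{cX} \right) X_t \tag{31}$$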
where γcX is defined in (25).
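Substituting this in the aggregate capital evolution equation allows the economy to
be written in the form
$$X_{t+1} = T(\Phi) X_t + \begin{bmatrix} 0 \\ 1 \end{bmatrix} \varepsilon_t \tag{32}$$
where $\varepsilon_t$ is the innovation to the process for aggregate technology and
$$\Phi = \begin{bmatrix} \phi \\ \varphi \end{bmatrix} \tag{33}$$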
3.4 The model-consistent expectations equilibrium
The model-consistent equilibrium is a fixed point of
$$\Phi = T(\Phi) \tag{34}$$
As in the standard case, there is no closed-form expression for the MCEE so it has to
be calculated numerically⁵. As would be expected, the model-consistent equilibrium is
independent of the forecast horizon. For the baseline calibration the law of motion for
the states is:
$$\phi^* = \begin{bmatrix} 0.9635 & 0.0585 \\ 0.0000 & 0.9000 \end{bmatrix} \tag{35}$$
and for prices
$$\varphi^* = \begin{bmatrix} 0.4550 & 0.4886 \\ -0.0237 & 0.0267 \end{bmatrix} \tag{36}$$
⁵ In practice, this is done by using a numerical equation solver (Matlab’s fsolve function) to find a zero of $T(\Phi) - \Phi$.
3.5 Learning rules
A general updating rule for $\phi$ can be written
$$R_t = R_{t-1} + \gamma_t \left(X_{t-1} X_{t-1}' - R_{t-1}\right) \tag{37}$$
$$\phi_t = \phi_{t-1} + \gamma_t R_t^{-1} X_{t-1} \left(X_t' - X_{t-1}' \phi_{t-1}\right) \tag{38}$$
where $\gamma_t$ is the gain. This paper will consider two cases, ordinary least squares learning
($\gamma_t = 1/t$) and constant gain learning ($\gamma_t = \gamma$). An updating rule of the same form is
specified for $\varphi$. Stacking the rules as in (33) gives an updating rule for $\Phi$.
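The updating rule in (37) and (38) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's simulation code: the data come from a fixed law of motion rather than the self-referential economy, the initial moment matrix is set to σ²I, and the decreasing gain is written 1/(t+1) so that the first update keeps R invertible. All names are hypothetical.

```python
import numpy as np


def rls_update(phi, R, x_prev, x_now, gain):
    """One step of (37)-(38); phi is stored so that x_now' is forecast by x_prev' @ phi."""
    R_new = R + gain * (np.outer(x_prev, x_prev) - R)
    forecast_error = x_now - phi.T @ x_prev  # X_t' - X_{t-1}' phi, held as a vector
    phi_new = phi + gain * np.linalg.solve(R_new, np.outer(x_prev, forecast_error))
    return phi_new, R_new


def simulate_and_learn(A, sigma, periods, gain_fn, seed=0):
    """Generate X_t = A X_{t-1} + [0, eps_t]' and estimate the law of motion recursively."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    phi = np.zeros((n, n))       # uninformative starting beliefs
    R = sigma ** 2 * np.eye(n)   # assumed initial moment matrix
    x_prev = np.zeros(n)
    for t in range(1, periods + 1):
        shock = np.array([0.0, sigma * rng.standard_normal()])
        x_now = A @ x_prev + shock
        phi, R = rls_update(phi, R, x_prev, x_now, gain_fn(t))
        x_prev = x_now
    return phi


# Fixed data-generating law of motion: the MCEE values from (35).
A = np.array([[0.9635, 0.0585],
              [0.0000, 0.9000]])
phi_ols = simulate_and_learn(A, sigma=0.007, periods=20000,
                             gain_fn=lambda t: 1.0 / (t + 1))
```

With the convention in (38) the estimate converges to the transpose of the matrix in (20), so `phi_ols.T` is the object to compare with `A`. Passing a constant function such as `lambda t: 0.01` as `gain_fn` gives constant gain learning.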
3.6 E-stability and learnability
Will the model-consistent expectations equilibria found in section 3.4 be e-stable and
learnable? A standard result (Evans and Honkapohja, 2001) is that the stability of a
system consisting of a PLM, (20), and ALM, (32) and a learning rule (37) and (38) is
related to the stability of an associated ordinary differential equation (ODE)
$$\frac{d\Phi}{d\tau} = h(\Phi), \quad \text{where } h(\Phi) = \lim_{t \to \infty} E\left(T(\Phi) - \Phi\right) \tag{39}$$
The economy with ordinary least squares learning ($\gamma_t = 1/t$) will converge to $\Phi$ only if $\Phi$
is a locally stable fixed point of the associated ODE i.e. the eigenvalues of the Jacobian
of $h(\Phi)$ have negative real parts. An analytical expression is only available for these
eigenvalues in the case of $T = 1$ (see Appendix C.2); in other cases they must be obtained
numerically.
Under constant gain learning ($\gamma_t = \gamma > 0$), things are more complex but Evans and
Honkapohja (2001, p162) show that if the gain is sufficiently close to zero the PLM will
converge to a limiting normal distribution around the MCEE.
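The numerical eigenvalue check described above can be sketched as follows. The code treats the map T as deterministic, so that h(Φ) = T(Φ) − Φ, and approximates the Jacobian by finite differences. The linear map used in the example is hypothetical and chosen only so that the answer is known; it is not the model's T-map, which depends on constants from the Appendices.

```python
import numpy as np


def jacobian_of_h(T_map, Phi_star, step=1e-6):
    """Finite-difference Jacobian of h(Phi) = T(Phi) - Phi with respect to vec(Phi)."""
    Phi_star = np.asarray(Phi_star, dtype=float)
    base = (T_map(Phi_star) - Phi_star).ravel()
    J = np.zeros((base.size, Phi_star.size))
    for j in range(Phi_star.size):
        bumped = Phi_star.ravel().copy()
        bumped[j] += step
        bumped = bumped.reshape(Phi_star.shape)
        J[:, j] = ((T_map(bumped) - bumped).ravel() - base) / step
    return J


def is_e_stable(T_map, Phi_star):
    """Return (flag, eigenvalues); flag is True if all real parts are negative."""
    eigenvalues = np.linalg.eigvals(jacobian_of_h(T_map, Phi_star))
    return bool(np.all(eigenvalues.real < 0)), eigenvalues


# Hypothetical linear T-map, for illustration only: T(Phi) = C + 0.3 * Phi.
C = np.array([[0.5, 0.1],
              [0.0, 0.6]])


def toy_T_map(Phi):
    return C + 0.3 * Phi


Phi_fixed = C / 0.7  # fixed point of the toy map
stable, eigs = is_e_stable(toy_T_map, Phi_fixed)
```

For the toy map every eigenvalue equals 0.3 − 1, so the fixed point is locally stable. Replacing 0.3 with a slope above one makes the check fail.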
3.7 Projection
In the baseline case with an infinite horizon, the consumption function (31) is only de-
fined when (I − βφ) is invertible, see (26). Since the term comes from computing the
discounted sum of the expected future path of prices, the invertibility condition is the
same as requiring the sum to be bounded. This is summarised in the following definition
Definition 2. (stable PLM). A given $\phi^s$ is stable if it results in consumption being
bounded. This will be the case if the eigenvalues of $\phi_t^s$ are less than $\beta^{-1} > 1$ in
absolute value.
Theorem 4 of Ljung (1977, p. 557), which forms the basis of many convergence
results in the learning literature employs a "projection facility" constraining estimates to
remain in a region around the MCEE. This has been widely criticized (e.g. Grandmont
and Laroque, 1991 and Grandmont, 1998) since it involves endowing households with
knowledge of what they are supposed to be learning. Even though a projection facility
has been shown not to be necessary to proofs of convergence and stability in models
with a unique MCEE (Bray and Savin, 1986) or more generally (Evans and Honkapohja,
1998), it is crucial for any numerical implementation of learning. To see this note that
with a non-zero gain there is always a finite probability that a particular sequence of shocks
will lead to a household estimating a PLM that is unstable in the sense of definition 2,
leading forecasts to grow without limit and consumption to be undefined.
The form of the consumption function (31) gives a natural way to define a projection
algorithm which escapes the critiques of Grandmont and Laroque.
Definition 3. (projection facility). After estimating the PLM households check the eigenvalues
of $\phi_t^s$. If they are greater than $q$ the household discards the estimated $\phi_t^s$ and chooses
a different one.
If the projection facility is used there are many ways to pick a $\phi_t^s$ which do not
involve endowing households with knowledge of the RPE. The simplest way is to use the
value from the previous period⁶.
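Definition 3, combined with the fallback to the previous period's value, can be sketched in Python as below. The function name is hypothetical, "greater than q" is interpreted here as the largest eigenvalue modulus exceeding q, and the default q = 1 is the value the paper adopts for the remainder of the analysis.

```python
import numpy as np


def project(phi_new, phi_prev, q=1.0):
    """Keep phi_new unless an eigenvalue exceeds q in modulus; otherwise reuse phi_prev."""
    largest_modulus = np.max(np.abs(np.linalg.eigvals(phi_new)))
    return phi_new if largest_modulus <= q else phi_prev


phi_prev = np.array([[0.9635, 0.0585],
                     [0.0000, 0.9000]])   # last period's beliefs (MCEE values)
explosive = np.array([[1.01, 0.07],
                      [0.00, 0.90]])      # an estimate with an eigenvalue above one
kept = project(explosive, phi_prev)       # the projection facility is triggered
```

Because the check uses only the household's own estimate and its previous value, it needs no knowledge of the equilibrium being learned.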
In the remainder of the paper, q is taken to be unity which can be interpreted as
endowing households with the knowledge that the macroeconomy is stationary. There
are two justifications for this. Firstly, estimating a VAR of the form (20) is problematic
with non-stationary variables. Secondly, the consumption function is strongly non-linear
for PLMs with eigenvalues greater than unity (recall that as eig($\phi^s$) → $\beta^{-1}$, $c^s \to +\infty$)
and allowing beliefs to enter this range means arbitrary amounts of volatility can be
generated in the macroeconomy (see the discussion in section 5.6).
Projection is rarely discussed in the context of numerical analysis. Williams (2003)
and Eusepi and Preston (2011) both mention in footnotes that they discard explosive
values, although it is not clear if this includes rational bubble paths; in the latter
paper at least, the very small gains used mean that such paths will be rare events.
With "Euler equation learning" (Evans et al, 2011), there is no infinite forward sum
in the consumption function so the issue does not arise although Carceles-Poveda and
Giannitsarou (2007, p2673) explicitly exclude non-stationary paths.
4 Ordinary least squares learning
The results on e-stability and learnability discussed in section 3.6 are local and asymptotic.
To investigate the convergence properties of the model it is necessary to turn to
simulations. This section takes the case of OLS learning (something of a benchmark in
the learning literature); the next section deals with constant gain learning.
⁶ From a Bayesian perspective, projection is equivalent to having a truncated prior. When a draw triggers the projection facility, the response of a Bayesian would be to move the posterior in the direction of the non-stationary solution rather than simply ignoring the information. In practice, the method of choosing the "projected" value makes no difference to the properties of the model.
4.1 The speed of convergence
Figure 1 shows the convergence of the model in the two extreme cases of households who
only look forward one period ($T = 1$) and those with an infinite horizon ($T = \infty$). An
uninformative prior is chosen, setting all the elements of $\phi_0$ and $\varphi_0$ to zero. The model
is then simulated many times and the figures show the mean path of each element of
$\phi - \phi^*$ (the difference to the value at the MCEE) along with the range in which 99% of
paths lie.
[FIGURE 1 HERE]
Comparing the two panels of the figure is striking: if households have infinite horizons
($T = \infty$), the model converges very quickly, and within 100 periods or so the elements of $\phi$
are very close to their value at the MCEE. In contrast, if households only look forward
one period ($T = 1$) the economy has not converged within the thousand periods shown
on the diagram and in fact doesn’t converge at all.
To understand this result, take a simple example in which households have learnt a
PLM which implies no persistence for the aggregate states i.e. $\phi = 0_{2 \times 2}$. In this case,
given the baseline calibration, with $T = 1$ the consumption function (31) is approximately
$$c_t \approx 0.5 \left(k_t + \lambda_2 a_t\right) \tag{40}$$
and with $T = \infty$
$$c_t \approx 0.01 \left(k_t + \lambda_2 a_t\right) \tag{41}$$
This is a direct consequence of the limited forecast horizon: in the first instance households
spread their total wealth (the term in parentheses, which since households believe
there is no persistence consists simply of their holdings of capital and the output arising
from current technology) over two periods; in the second instance they spread it over
their infinite horizon.
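The orders of magnitude in (40) and (41) can be reproduced with a back-of-the-envelope calculation. This is a sketch under assumptions that are not the paper's derivation: a household with log utility, working in levels, expects no future income and spreads wealth $W_t$ over $T + 1$ periods, so it consumes the share shown below, evaluated at β = 0.99.

```latex
% Share of wealth consumed today by a log-utility household with horizon T
\begin{align*}
C_t &= \frac{1-\beta}{1-\beta^{T+1}}\, W_t \\
T = 1: \quad \frac{1-\beta}{1-\beta^{2}} &= \frac{1}{1+\beta} = \frac{1}{1.99} \approx 0.50 \\
T \to \infty: \quad \lim_{T\to\infty} \frac{1-\beta}{1-\beta^{T+1}} &= 1-\beta = 0.01
\end{align*}
```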
The resulting ALMs are with $T = 1$
$$k_{t+1} = 0.53 k_t + 0.04 a_t \tag{42}$$
and with $T = \infty$
$$k_{t+1} = 1.01 k_t + 0.07 a_t \tag{43}$$
Recall from (35) that under model-consistent expectations, the law of motion of capital
is
$$k_{t+1} = 0.96 k_t + 0.06 a_t \tag{44}$$
Comparison of these shows that with $T = \infty$ the ALM is very close to the MCEE even
with a PLM so far from the MCEE⁷.
In the next period, the PLM is updated so will move further towards the MCEE in
the case of T = ∞ than in the case of T = 1 and hence convergence is faster. With
OLS learning the gain falls as time passes and with T = 1 the economy gets stuck away
from the MCEE (see section 5.1 below for a similar case with constant gain learning). In
fact, with T = 1 convergence only occurs if the economy is initialized very close to the
MCEE.
To summarise, the speed of convergence is increasing in the forecast horizon because
given a PLM the higher the forecast horizon, the more the resulting ALM resembles the
MCEE so the faster the PLM is updated towards the MCEE.
Plotting the elements of φ provides a useful illustration of the speed of convergence
but doesn’t say much about how the economy along the convergence path compares with
that at the MCEE since in general, different elements of φ will have different impacts
on the equilibrium (and households are also learning ϕ). Figure 2 instead plots impulse
response functions along the convergence path. These are computed by using the same
data as for figure 1 then for each draw running an impulse response function assuming
the law of motion in the economy is fixed at its value at a particular point along the
convergence path.
[FIGURE 2 HERE]
They tell the same story - convergence in terms of the behaviour of the economy is
much quicker for the infinite horizon case - but also show another interesting feature.
The confidence intervals for the T = ∞ case are much wider than with T = 1. This is
because the consumption in the infinite horizon case is much more sensitive to φ than in
the case of T = 1, a simple consequence of the infinite sum in the consumption function.
So a given volatility of φ results in a higher volatility of consumption with T =∞ than
with T = 1. This forms the basis of the formal result used in the following section.
4.2 √t convergence
Theorem 3 of Benveniste et al (1990, p110)8 studies a system of the form of (20) and (32)
under OLS learning (γt = 1/t). It states that if the derivative of h (Φ) defined in (39) has
7 The ALM for capital with T = ∞ is actually explosive, but recall that this is not a steady state but instead just a point along the convergence path.
8Also used by Marcet and Sargent (1995) and Ferrero (2007).
all eigenvalues with real parts less than −0.5 then
$\sqrt{t}\,(\Phi_t - \Phi^*) \xrightarrow{D} N(0, P)$ (45)
where the matrix P satisfies the Lyapunov equation
[I/2 + hΦ (Φ∗)]P + P [I/2 + hΦ (Φ∗)]′ + EH (Φ∗, Xt)H (Φ∗, Xt)′ = 0 (46)
As pointed out by Marcet and Sargent (1995), this means that if the conditions are
satisfied, there is root-t convergence, although the formula for the variance of the estimators
is modified from the classical case. As the eigenvalues become larger, convergence is
slower in the sense that the variance covariance matrix of the limiting distribution P is
larger.
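To see why larger eigenvalues imply slower convergence, it may help to look at the scalar version of (46). This is only an illustrative special case: h′ stands for the scalar derivative hΦ (Φ∗) and s² for the scalar counterpart of EHH′, both symbols being introduced here for the illustration.

```latex
\[
2\left(\tfrac{1}{2} + h'\right) P + s^{2} = 0
\quad\Longrightarrow\quad
P = \frac{s^{2}}{-(1 + 2h')} ,
\]
\[
P > 0 \iff h' < -\tfrac{1}{2},
\qquad
\frac{\partial P}{\partial h'} = \frac{2 s^{2}}{(1 + 2h')^{2}} > 0 .
\]
```

In this scalar case the limiting variance is finite and positive only when h′ is below −0.5, and it grows without bound as h′ approaches −0.5 from below.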
An analytical expression is available for the eigenvalues only in the case of T = 1 (see
Appendix C.2) so in the general case they are calculated numerically. The Jacobian will
have two eigenvalues equal to −1. For T = 1 the other two are −0.074 and −0.042; for
T = ∞, −2.56 and −1.54, so the theorem holds for the latter case and not the former.
Figure 3 plots the largest eigenvalue for a range of forecast horizons. Under the baseline
calibration the theorem holds for T > 12.
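The condition of the theorem is easy to check once the eigenvalues are known. The short sketch below applies it to the eigenvalues reported above; the function name is hypothetical, and the eigenvalues are simply copied from the text rather than recomputed from the model.

```python
def satisfies_root_t_condition(eigenvalues, threshold=-0.5):
    """True if every eigenvalue has real part strictly below the threshold."""
    return all(complex(ev).real < threshold for ev in eigenvalues)


# Eigenvalues reported in the text for the baseline calibration.
eig_T1 = [-1.0, -1.0, -0.074, -0.042]
eig_Tinf = [-1.0, -1.0, -2.56, -1.54]

print(satisfies_root_t_condition(eig_T1))    # False
print(satisfies_root_t_condition(eig_Tinf))  # True
```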
[FIGURE 3 HERE]
Eigenvalues were then calculated for around 15,000 calibrations over a wide grid9.
With T = ∞, the largest eigenvalue increases with α, η, ρ and decreases with β, δ.
The only cases where the theorem is not satisfied are those with very persistent aggregate
technology, ρ = 0.99. On the other hand, with T = 1 there were no cases which satisfy
the theorem.
There are a number of ways convergence could be further studied. One would be to
calculate the variance-covariance matrix P of the limiting distribution in (45). Another
would be to follow Marcet and Sargent (1995) who propose a statistic that allows the
speed of convergence to be studied (Ferrero, 2007 is a more recent application). They
define the rate of convergence δ as
$t^{\delta}\,\lVert \phi_t - \phi^* \rVert \xrightarrow{D} F$ (47)
for some non-degenerate and well-defined distribution F . This has the desirable property
that it captures convergence in a single statistic.
9 The ranges were chosen to encompass values commonly used in the literature. The grid is not particularly fine, but experimentation showed no evidence of any non-linear effects. δ ∈ [0.001, 0.01, 0.025, 0.10, 0.50]; α ∈ [0.4, 0.5, 0.6, 0.7, 0.8]; β ∈ [0.96, 0.97, 0.98, 0.99, 0.999]; η ∈ [0, 1, 5, 10, ∞]; ρ ∈ [0.7, 0.9, 0.95, 0.97, 0.99]; σ ∈ [0.01, 0.5, 0.7, 1, 10]. The bold figure represents the baseline calibration.
A drawback is that, since in general different elements of the PLM will have different
effects on the dynamics of the economy, the statistic may not give much useful information
about how close the behaviour of the economy along the convergence path is to that at the
MCEE. An alternative approach would look at the convergence of the impulse response
functions. Figure 2 shows how the response of the economy to a positive technology
shock changes along the convergence path: the dotted line is the response at the REE,
the solid line is the mean response and 99% of responses lie within the shaded area. A
similar statistic to (47) could be defined in terms of the impulse-response function, and
this would give a more economically meaningful measure of convergence.
Carrying out such exercises for the model of this paper gives little interesting informa-
tion. The impact of different calibrations on the rate of convergence is modest and the
results of the kind reported in figures 1 and 2 are robust to all the calibrations studied.
The forecast horizon is the dominant determinant of the speed of convergence.
5 Constant gain learning
Constant gain learning is often used to study business cycle dynamics since it captures
the idea that learning is perpetual and allows households to respond to changes in the
structure of the economy. The gain parameter can be chosen in various ways. Milani
(2007, 2011) estimates it along with the other parameters of the model. Eusepi and
Preston (2011) use survey data. Evans and Ramey (2006) allow households to choose
it optimally. This paper will study gain parameters in the range [0.001, 0.05], which
encompasses all the values commonly used. A baseline value of 0.01 is chosen.
A simple way to interpret the gain is by noting that the weight on the forecast error
from τ periods ago relative to the weight on the most recent forecast error is given by
(1 − γ)^τ. So a gain of 0.02 (as estimated in Milani, 2007) implies data from around 34
quarters ago is given approximately half the weight of current data. On the other hand,
a gain of 0.002 (the baseline value of Eusepi and Preston, 2011) means households put
half as much weight on data from 84 years ago as they do on current data.
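The half-weight horizons quoted above follow from solving (1 − γ)^τ = 0.5 for τ. A minimal sketch of the calculation, assuming one period is one quarter as in the text (the function names are illustrative):

```python
import math


def relative_weight(gain, periods_ago):
    """Weight on a forecast error from periods_ago relative to the latest one."""
    return (1.0 - gain) ** periods_ago


def half_life(gain):
    """Number of periods after which the relative weight falls to one half."""
    return math.log(0.5) / math.log(1.0 - gain)


for gain in (0.02, 0.002):
    quarters = half_life(gain)
    print(gain, round(quarters, 1), "quarters,", round(quarters / 4, 1), "years")
```

This gives roughly 34 quarters for a gain of 0.02 and a figure in the mid-80s in years for a gain of 0.002, in line with the approximate numbers in the text.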
5.1 The speed of convergence
Figure 4 shows how φkk, the autoregressive term on capital in the PLM, converges for
different values of the gain10. Panel A shows the case of T = 1, panel B the case of
T = ∞. Again the figures in panel A are drawn over 1,000 periods, whereas those in panel B are over 100 periods.
[FIGURE 4 HERE]
10Other elements of φ are not shown to save space, but tell a similar story.
These figures reinforce the message of the previous section. With T =∞ convergence
is fast for all values of the gain. With T = 1, convergence is much slower.
5.2 The distribution of beliefs
Evans and Honkapohja (2001, Theorem 7.8, p165) show that under certain conditions
beliefs converge to a limiting normal distribution with mean at the MCEE and standard
deviation increasing in the gain. In practice, the conditions for the theorem may not be
satisfied and the distribution of beliefs needs to be investigated numerically.
Beliefs in the model economy are characterized by 8 variables defined by (20) and
(21). Figure 5 shows the distribution of the elements of φ (similar graphs can be drawn
for the elements of ϕ but they do not add much to the intuition given here). It is
important to remember that the variables are not independent and in fact the stationary
distribution is an 8-dimensional object. For low gains, the mean of the distribution is the
same as the value of the PLM at the MCEE, and the distribution is symmetric. As the
gain increases, there are three effects. Firstly, the standard deviation of the distribution
increases; secondly, the mean of the distribution falls and thirdly the distribution becomes
more skewed.
[FIGURE 5 HERE]
The first of these is a direct result of the increasing gain and is as expected from the
theorem of Evans and Honkapohja (2001). Higher gains mean more weight in the learning
rule on forecast errors, so the PLM becomes more volatile. The second and third are a
consequence of the interaction between increasing gain and projection. The higher the
gain, the higher the standard deviation of beliefs so the more likely they will be unstable
(in the sense of Definition 2) and so be eliminated by the projection algorithm11. This
truncates the distribution to the right, so reducing the mean and making the distribution
more skewed to the left12.
Table 1 shows the first three moments of these distributions and, in addition to
confirming the observations made in the last paragraph, also shows a further feature: that
the mean of the distribution is lower than at the MCEE even for low values of the gain
for which the projection algorithm is not used.
[TABLE 1 HERE]
11 It might be thought that the likelihood of the projection algorithm being used would depend on the volatility of the driving process for technology. However this is not the case since the weighting matrix R in (20) corrects for this. At the MCEE, R is simply the variance covariance matrix of capital and technology, so when the technology shock has small standard deviation the inverse of the weighting matrix is large.
12 This argument applies to the autoregressive parameters φkk and φaa. It is a priori unclear what effect projection will have on the cross terms φka and φak or on the elements of ϕ.
Understanding the effects of these stationary distributions in the model economy is
complex, so first consider a simple example.
5.3 A simple example
To understand the effect of a stationary distribution of beliefs on the macroeconomy, it
is helpful to consider a simple univariate example13 in which capital and labour are fixed
and income follows an exogenous AR(1) process:
yt = ρyt−1 + εt (48)
Beliefs are parametrized by a scalar φ such that
$E_t y_{t+i} = \phi^{i} y_t$ (49)
then the consumption function is
$c_t = \frac{r}{1+r}\left[(1+r)\,b_t + \frac{1}{1-\phi_t (1+r)^{-1}}\,y_t\right]$ (50)
where bt is current wealth and the second term represents expectations about future
income. Note the first and second derivatives of the second term with respect to φ are positive,
capturing the positive and increasing effect of income persistence on consumption. Although only
the infinite forecast horizon case will be considered here, the second derivative of f is
positive as long as T > 0.
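For the infinite-horizon case the signs of the derivatives can be verified directly. In the block below f denotes the coefficient on income in (50); this identification of f with that coefficient is made here for illustration.

```latex
\[
f(\phi) = \frac{1}{1 - \phi (1+r)^{-1}}, \qquad
f'(\phi) = \frac{(1+r)^{-1}}{\left[1 - \phi (1+r)^{-1}\right]^{2}} > 0, \qquad
f''(\phi) = \frac{2\,(1+r)^{-2}}{\left[1 - \phi (1+r)^{-1}\right]^{3}} > 0
\quad \text{for } \phi < 1 + r .
\]
```

So f is increasing and convex in φ over the range of beliefs for which the discounted sum of expected income is finite.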
When beliefs are model-consistent, i.e. φt = ρ ∀t, consumption will be a random walk and the standard deviation of the first difference of consumption is
$\sigma^{*}_{\Delta c} = \frac{r}{1+r}\,\frac{1}{1-\rho (1+r)^{-1}}\,\sigma_{\varepsilon}$ (51)
Beliefs are updated by a simplified constant gain learning algorithm
φt+1 = φt + γ (yt − φtyt−1) (52)
How does the stationary distribution of φ affect the economy? Firstly, assume that
the distribution has a mean of ρ (the value of beliefs at the MCEE); non-zero standard
deviation and is symmetric. To understand the impact of this distribution on the un-
conditional properties of consumption consider the response of consumption to a positive
innovation to income. Taking ρ = 0.9, figure 6 shows the response in the three cases of
φ0 = φ∗ = ρ; φ0 = 0.95 > ρ and φ0 = 0.85 < ρ.
13Full details are in Appendix D.
[FIGURE 6 HERE]
5.3.1 Case 1: φ = ρ
If households’ beliefs are correct, then the impact response of consumption will be exactly
that at the MCEE. In the second period, beliefs will be revised upwards. This will mean
consumption in period 2 is higher than it would be at the MCEE since households believe
income will be more persistent than it actually is. In the third period, there are two
effects. Firstly, beliefs will be revised downward towards the MCEE. Secondly, household
wealth will be lower than expected. Both of these tend to reduce consumption. As time
passes, these two effects continue, and at some point consumption will fall below its
value at the MCEE and remain there for the rest of history (as is required to satisfy the
intertemporal budget constraint).
To summarise, learning has no impact effect but causes consumption to rise above
its value at the MCEE for a number of periods after the initial one, then fall below this
value for the rest of time.
Proposition 1. If beliefs are initialized at the MCEE, the impulse response function with learning implies a higher volatility of consumption growth than without learning.
Proof. See Appendix D.1
5.3.2 Case 2: φ > ρ
In this case households believe that income is more persistent than it is at the MCEE so
on impact increase their consumption by more than with correct beliefs. In subsequent
periods there are two effects. Firstly, household wealth will be lower than expected,
which will tend to reduce consumption. Secondly, beliefs will be revised, in the second
period upward and in subsequent periods downward back towards the MCEE. In the
second period the second effect dominates so consumption increases further, in subsequent
periods both effects go in the same direction and as time passes, consumption will fall
below its value at the MCEE and stay there for the rest of time. So the overall effect is
higher consumption than at the MCEE for some initial periods, then consumption lower
than at the MCEE for the rest of time.
5.3.3 Case 3: φ < ρ
The intuition for this case is simply the mirror image of that with φ > ρ. However
note the difference in magnitude. Since the derivative of the consumption function is
increasing in φ, the response is much smaller to a lower value of φ than to the higher one
of the previous section.
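The three cases can be traced numerically with a short simulation of (48), (50) and (52). The sketch below is illustrative only: the values r = 0.01 and γ = 0.01, the unit size of the innovation, the function name, and in particular the budget constraint b(t+1) = (1 + r) b(t) + y(t) − c(t) are assumptions made here (the full details of the example are in Appendix D, which is not reproduced).

```python
def consumption_response(phi0, rho=0.9, gamma=0.01, r=0.01, horizon=200, learn=True):
    """Consumption path after a unit income innovation in the first period.

    Beliefs start at phi0; with learn=False they stay fixed at phi0.
    """
    phi, wealth, y_prev = phi0, 0.0, 0.0
    path = []
    for t in range(horizon):
        y = 1.0 if t == 0 else rho * y_prev       # income follows (48), no further shocks
        # Consumption function (50).
        c = r / (1 + r) * ((1 + r) * wealth + y / (1 - phi / (1 + r)))
        path.append(c)
        # Assumed budget constraint: b_{t+1} = (1 + r) b_t + y_t - c_t.
        wealth = (1 + r) * wealth + y - c
        if learn:
            phi += gamma * (y - phi * y_prev)     # belief update (52)
        y_prev = y
    return path


mcee = consumption_response(0.90, learn=False)    # model-consistent beliefs
case1 = consumption_response(0.90)                # phi_0 = rho
case2 = consumption_response(0.95)                # phi_0 > rho
case3 = consumption_response(0.85)                # phi_0 < rho

print("impact:", mcee[0], case1[0], case2[0], case3[0])
print("period-2 gap, case 1:", case1[1] - mcee[1])
print("long-run gap, case 1:", case1[-1] - mcee[-1])
```

Under these assumptions the benchmark path is flat (consumption is a random walk), case 1 matches it on impact, rises above it in the second period and ends below it, and the impact gap for φ0 = 0.95 is larger than the gap of opposite sign for φ0 = 0.85.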
Given these three cases, the unconditional properties of consumption will be the aver-
age of the three cases weighted by the stationary distribution of φ. Since the distribution
is assumed to be symmetric, the larger impact of case 2 will dominate the smaller one of
case 3 and the volatility of consumption will increase.
So the distribution of beliefs will unambiguously increase the volatility of consump-
tion. The higher the gain, the higher will be the standard deviation of beliefs so the
higher will be the standard deviation of consumption. This is the simple intuition that
"learning increases aggregate volatility" though note that the volatility it increases is that
of consumption. As we shall see below, in general the labour supply and investment effects
of this imply a reduction in the volatility of output.
There is a further effect. The theorem of Evans and Honkapohja (2001) that states
the mean of the distribution will be at the MCEE only holds for small values of the gain.
In practice, the mean will often be different from the MCEE. Since the distribution of
beliefs causes the mean response of consumption to be different from that at the MCEE,
the response of capital will also be different (if consumption responds by more, capital
would be expected to be less persistent) and hence the mean ALM will be different from
the MCEE. So the mean of the distribution will be different from the MCEE, in this
case lower.
How do the properties of consumption change if the mean of the distribution is lower
than at the MCEE (either for the reason given in the previous paragraph or due to the
projection facility, as will be discussed in the next section)? If the mean is lower, draws
of φ from case 3 are more likely than those from case 2, and if it is sufficiently low
this will result in the standard deviation of consumption falling below its value at the
MCEE. Similarly, if the distribution is sufficiently skewed to the left this will result in
the standard deviation of consumption falling.
To summarise, this simple example suggests that the stationary distribution of beliefs
will have the following effects:
1. If it is symmetrical, the non-linearity of consumption to beliefs will mean consump-
tion responds by more on impact and be more volatile. This will imply the mean
of the distribution is slightly lower than at the MCEE.
2. If the mean of the distribution is lower, this will offset the effects in (1) and make
consumption respond by less on impact and be less volatile.
3. If the distribution is skewed to the left, this will further offset the effects.
5.4 Inspecting the mechanism
Returning to the model presented in section 2, the statistic used to study the effects of
learning will be the ratio of the standard deviation of a variable in the model with learning
to the standard deviation in the model under model-consistent expectations. Table 2
shows how this statistic varies with the learning gain. First note the key features: learning
increases the standard deviation of consumption, labour and investment but reduces that
of output. Overall the effects are small, at most a few percent. The impact on output
is in contrast to the conventional intuition which says learning increases volatility.
[TABLE 2 HERE]
The final column of the table shows that the percentage of periods in which the projection
facility was used is increasing in the gain. This is because increasing gain implies
increasing volatility of beliefs, so a higher likelihood that beliefs will correspond to "rational
bubble" or explosive paths for which the projection facility is invoked.
To understand these statistics, and the impact of learning it is useful to look at the
impulse response functions to a positive innovation to the process for technology in figure
7 (see footnote 14). These show the difference between the impulse response under learning and the
impulse response at the MCEE. As with all RBC-type models, the dynamics of all the
variables are driven by the response of consumption so start by focussing on this variable.
[FIGURE 7 HERE]
First take the case of a very low gain γ = 0.001. This eliminates the effect of
projection (see the final column of table 2) so the mean of the distribution is very close
to the MCEE (it will be slightly different due to the effect discussed in the previous
section). Then, as discussed in the previous section, because of the non-linearity of the
consumption function to beliefs the impact response of consumption will be higher than
at MCEE. This means labour supply responds by less than at the MCEE (see 30) so
output and investment also respond by less on impact.
In subsequent periods the intuition is similar to that given in the previous section.
Firstly, beliefs are updated and then adjust back to their value at the MCEE, which tends to
lead to higher consumption than at the MCEE. Secondly, household wealth falls below
its value at the MCEE due to lower investment and this tends to reduce consumption.
As time passes, the first effect becomes weaker and the second effect comes to dominate:
consumption remains above its value at the MCEE for the first ten or so periods; then
the effect of lower wealth causes it to fall below the MCEE value. Investment and labour
supply remain below their value at the MCEE for fifteen or so periods; output is always
below its value at the MCEE, explaining its lower standard deviation.
14 Calculating impulse responses is complicated in a model with learning for two reasons. Firstly, the size of the forecast error resulting from a shock (and hence the extent to which beliefs are updated and the form of the IRF) will depend on the current state of the economy, i.e. on Xt−1 in the perceived law of motion (20). Secondly, the effect of beliefs on the subsequent evolution of the economy is non-linear so the economy cannot be understood as the sum of the IRFs in each period as is the case when everything is linear. Impulse responses are therefore obtained by simulating a large amount of data and estimating the impulse response function as a high order MA.
Now consider the case of a higher gain γ = 0.02. Now the projection algorithm
will come into play (see the final column of table 2) and, as discussed in section 5.2, this
implies that mean beliefs correspond to less persistent expectations than at the MCEE.
So on impact consumption responds by less than at the MCEE. In the next period beliefs
are updated, but note the much larger effects of the updating due to the higher gain. In
subsequent periods, the intuition is the same as with a lower gain with the stronger effect
on labour supply and investment meaning output is further below its value at the MCEE.
5.5 Forecast horizon and dynamics
Table 3 shows the impact of varying the forecast horizon, holding the gain constant at
0.01. The table again shows the standard deviations of the variables relative to the
MCEE. The volatility of consumption is increasing in the forecast horizon, whereas the
volatility of output, labour and investment is falling. Note that for low values of the
forecast horizon we recover the simple intuition that learning unambiguously increases
volatility, but the effects are still small.
[TABLE 3 HERE]
To help understand these results, figure 8 shows the difference between the impact
response of consumption for T = 1 and T = ∞. The response with a forecast horizon
of 1 differs in two important respects: firstly, the impact effect is larger; secondly the
"kink" in the second period caused by learning is much smaller, so much so that con-
sumption remains below its value at the MCEE for the first 10 periods. These are both a
consequence of the observation made in section 4.1 that the shorter the forecast horizon,
the less sensitive is consumption to beliefs. The two effects arise because with T = 1
consumption is less affected by mean beliefs being lower (so the impact effect is larger);
and less affected by the updating of beliefs in period 2 (so the kink is smaller).
[FIGURE 8 HERE]
In the case of T = 1, the fact that investment is higher than its value at the MCEE
means the capital stock is higher and so output is higher than at the MCEE in all
periods, i.e. the response of output is unambiguously amplified and the simple intuition
that learning increases volatility is recovered. Note this only happens in the special case
of a low forecast horizon.
5.6 The impact of projection
The results in the previous sections assumed the projection facility is implemented to
prevent households learning non-stationary "rational bubble" paths for the states. This
is a key part of the mechanism by which learning affects the dynamics of the economy.
Table 4 shows the relative standard deviations if this assumption is relaxed, and instead
the projection facility is set to allow PLMs with eigenvalues in the range (1, β^{-1}), i.e.
non-stationary "rational bubble" paths can be learned, but explosive paths are excluded.
[TABLE 4 HERE]
The first two rows are the same as those of table 2 since for small values of the gain
projection is not used. Larger values of the gain can result in explosive paths for the
endogenous variables (recall that both Williams, 2003 and Eusepi and Preston, 2011 write
that they discard such explosive solutions) and the final column of the table shows how
often such explosive paths occur. This issue arises because as the eigenvalues of the PLM
approach β^{-1}, consumption asymptotes to infinity. In a sufficiently long sample of a
model with a sufficiently high gain, PLMs with such eigenvalues will be learnt, resulting
in a response of consumption which is arbitrarily large.
This emphasizes the importance of always simulating learning models for very large
samples. In a short sample, particularly if the gain is in the middle of the range used
in this paper, it is possible that PLMs with eigenvalues close to β^{-1} will not be drawn,
so the economy appears to be stable. A potential way of dealing with this is to choose
to invoke the projection facility at some point within the range (1, β^{-1}). However this
allows any level of consumption volatility to be generated, and there seems to be no way
to support a particular choice.
5.7 Other sensitivities
Table 5 shows how the relative standard deviation varies with η, the intertemporal elastic-
ity of labour supply. As η increases labour supply becomes less elastic and the volatility
of consumption increases. Recall that, other things equal, labour supply varies inversely
with consumption and hence if consumption increases, labour supply falls which reduces
current period output causing investment to fall and thus reducing output and consump-
tion in the next period. The less elastic is labour supply, the weaker is this effect so the
volatility of consumption rises with rising η. The more volatile is consumption the less
volatile will be investment and hence the capital stock, so the volatility of output will
also fall.
[TABLE 5 HERE]
Table 6 shows the effect of ρ, the persistence of the technology process. As ρ falls the
volatility of both consumption and output rises. The reason for this is straightforward.
The lower is ρ, the less likely it is that the projection facility will be used, so the closer
is the mean of the distribution of the PLM to its value at the MCEE. This mutes the
effect described in section 5.4 so increasing the volatility of the variables relative to the
MCEE.
[TABLE 6 HERE]
5.8 An intercept in the learning rule
The previous sections assumed the learning rules were specified in terms of the variables
in the MSV representation of the economy with model-consistent expectations. Many
other specifications are possible, for example including other aggregates on the right hand
side of (20) such as consumption or returns; or including more lags. Such changes to the
learning rule will change the effects of learning on the economy, but it is not clear how
any particular learning rule can be justified.
A number of recent papers (Milani, 2011, Eusepi and Preston, 2011) include an
intercept in the learning rule, interpreted as capturing households’ uncertainty about the
steady state. It is straightforward to augment the model of this paper with an intercept
by changing the state vector in the PLM (20) to $X_t = [\,1 \;\; k_t \;\; a_t\,]'$. Table 7 shows the
effect of this change on relative standard deviations. Comparing this table with table 2
shows that including an intercept strengthens the effect of learning. For example, con-
sidering a gain of 0.01, with the intercept consumption is 4% more volatile than at the
MCEE compared to 2% without the intercept; and output is 4% less volatile, compared
to 2% without the intercept. The final column of the table shows the percentage of runs
that are unstable (in which consumption exceeds machine limits). For a gain of 0.02,
around 10% of runs are unstable, for a gain of 0.05 all runs are unstable.
[TABLE 7 HERE]
Why should uncertainty about the intercept translate into high volatility and insta-
bility? To answer this, modify the simple example of section 5.3 to include an intercept.
Consumption is then
$c_t = \frac{r}{1+r}\left[(1+r)\,b_t + \frac{1}{1-(1+r)^{-1}}\,\phi_1 + \frac{1}{1-\phi_t (1+r)^{-1}}\,y_t\right]$ (53)
where the second term picks up the effect of the intercept, a discounted forward sum of
a constant. For the discount factor of the baseline calibration, $\frac{1}{1-(1+r)^{-1}} \approx 100$ which, if
the persistence of income is 0.9, is around 10 times higher than the coefficient on income.
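The relative size of the two coefficients in (53) can be checked with a quick calculation. The value r = 0.01 is an assumption chosen to be consistent with the "≈ 100" quoted above, and the function names are illustrative.

```python
def intercept_multiplier(r):
    """Coefficient on the intercept belief in (53): discounted sum of a constant."""
    return 1.0 / (1.0 - 1.0 / (1.0 + r))


def income_multiplier(phi, r):
    """Coefficient on current income in (53) for a persistence belief phi."""
    return 1.0 / (1.0 - phi / (1.0 + r))


r = 0.01  # assumed value
on_intercept = intercept_multiplier(r)
on_income = income_multiplier(0.9, r)
print(round(on_intercept, 1), round(on_income, 2), round(on_intercept / on_income, 1))
```

With these values the coefficient on the intercept is about 100 and that on income about 9, a ratio of roughly 10 to 11, so a given error in the estimated intercept moves consumption by an order of magnitude more than the same error in current income.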
Take a case where the household has estimated a positive value of φ1. This means
consumption will be high and labour supply low and so the wage will be higher than the
household expected. This will mean in the next period φ1 is higher, and so on until
the economy explodes. With small values of the gain, deviations in φ1 are small so this
mechanism is dominated by the shocks. With larger gain, the economy becomes unstable.
Note that the instability only arises if households are very forward looking - in models that
take the "Euler equation learning" approach (for example Milani, 2011, which uses an
intercept) it is not an issue since the amplification of the effect of learning a non-zero
intercept is much smaller.
6 Discussion
This paper described a baseline case which allows the impact of the assumption of model-
consistent expectations to be studied in isolation. If households are individually rational,
the assumption of model-consistent expectations does not seem important in the sense
that if households are endowed with simple learning rules, the economy with learning
is very similar to that with model-consistent expectations. The response of output
is actually muted under learning, which shows that the simple intuition that learning
increases volatility does not hold in this benchmark model. The mechanism by which
learning affects the economy is in fact quite complex.
The degree of individual rationality, captured by parametrizing households’ forecast
horizons, has significant effects on the properties of the model; in particular it is the
key parameter which affects the speed of convergence. Also only in the case of very
short forecast horizons (similar to "Euler equation learning") can learning increase the
volatility of all variables, in keeping with simple intuition.
One way of reading the results of this paper is as support for the assumption of
model-consistent expectations. However there are a number of implicit assumptions in
the paper which mean that this reading should be taken with a pinch of salt.
The first is that if households are endowed with sufficient cognitive ability to estimate
VARs and solve infinite horizon problems, it seems strange to assume that they do not
realize they are identical and that solving their consumption problem is the same as
solving the aggregate economy. However a previous version of this paper studied a model
with heterogeneity among households (created by a household specific productivity shock)
and while this greatly complicated the model it did not have any dramatic effect on the
results (though it did mitigate the instability problem with an intercept). So perhaps
the assumption that households do not know they are identical is not as important as it
might initially seem.
The second is that the results are derived under the assumption of full information.
A large literature (see Hellwig, 2006 for a review) argues that this is a very strong as-
sumption and Graham and Wright (2010) solve a model similar to this paper under
incomplete information and model-consistent expectations and find that imperfect infor-
mation has a dramatic effect on the properties of the model. Graham (2011) extends
this to a model with learning but finds that while information matters, the assumption
of model-consistent expectations is not important.
Thirdly, the assumption of linearity means that the household’s problem is a relatively
simple one. Embedding learning rules into a non-linear model (of the style of Krusell
and Smith, 1998) seems an interesting avenue for future research.
References
Adam, K., Marcet, A., 2011. Internal Rationality, Imperfect Market Knowledge and Asset
Prices. Journal of Economic Theory, 146, pp. 1224-1252.
Benveniste, A., Metivier, M., Priouret, P., 1990. Adaptive Algorithms and Stochastic
Approximations. Springer, Berlin.
Branch, W., Evans, G., McGough, B., 2010. Finite Horizon Learning. University of Oregon
Economics Department Working Papers 2010-15.
Bray, M. M., Savin, N. E., 1986. Rational Expectations Equilibria, Learning, and Model
Specification. Econometrica, 54, pp. 1129-1160.
Bullard, J., Mitra, K., 2002. Learning about monetary policy rules. Journal of Monetary
Economics, 49(6), pp. 1105-1129.
Campbell, J.Y., 1994. Inspecting the mechanism: an analytical approach to the stochastic
growth model. Journal of Monetary Economics 33, pp. 463-506.
Carceles-Poveda, E., Giannitsarou, C., 2007. Adaptive learning in practice. Journal of
Economic Dynamics and Control, 31, 8, pp. 2659-2697.
Dawid, H., 2005. Long horizon versus short horizon planning in dynamic optimization
problems with incomplete information. Economic Theory, 25(3), pp. 575-597.
Ellison, M., Pearlman, J., 2011. Saddlepath Learning. Journal of Economic Theory,
forthcoming.
Eusepi, S., Preston, B., 2008. Expectations, Learning and Business Cycle Fluctuations.
NBER Working Papers 14181, National Bureau of Economic Research, Inc.
Evans, G., Honkapohja, S., 2001. Learning and Expectations in Macroeconomics. Prince-
ton University Press, Princeton.
Evans, G., Honkapohja, S., Mitra, K., 2009. Anticipated fiscal policy and adaptive
learning. Journal of Monetary Economics, 56(7), pp. 930-953.
Evans, G., Honkapohja, S., Mitra, K., 2011. Notes on Agents’ Behavioral Rules Under
Adaptive Learning and Studies of Monetary Policy. CDMA Working Paper 11/02.
Evans, G., Ramey, G., 2006. Adaptive expectations, underparameterization and the
Lucas critique. Journal of Monetary Economics, 53(2), pp. 249-264.
Ferrero, G., 2007. Monetary policy, learning and the speed of convergence. Journal of
Economic Dynamics and Control, 31(9), pp. 3006-3041.
Graham, L., 2011. Learning, information and heterogeneity. Working paper.
Graham, L., Wright S., 2010. Information, market incompleteness and heterogeneity.
Journal of Monetary Economics, 57, 2, pp. 164-174.
Grandmont, J.-M., 1998. Expectations formation and stability of large socioeconomic
systems. Econometrica, 66(4), pp. 741-781.
Grandmont, J.-M., Laroque, G., 1991. Economic dynamics with learning: some instability
examples. In: Barnett, W.A. et al. (Eds.), Equilibrium Theory and Applications,
Proceedings of the Sixth International Symposium in Economic Theory and Econometrics.
Cambridge University Press, Cambridge, pp. 247-273.
Hellwig, C., 2006. Monetary Business Cycle Models: Imperfect Information. UCLA
Working Paper
Krusell, P., Smith, A., 1998. Income and Wealth Heterogeneity in the Macroeconomy.
Journal of Political Economy 106(6), pp. 867-96.
Ljung, L., 1977. Analysis of recursive stochastic algorithms. IEEE Trans. Auto. Control
AC 22, pp. 551-575.
Marcet, A., Sargent, T.J., 1995. Speed of convergence of recursive least squares: learning
with autoregressive moving-average perceptions. In: Kirman, A., Salmon, M. (Eds.),
Learning and Rationality in Economics. Basil Blackwell, Oxford, pp. 179-215.
Milani, F., 2007. Expectations, learning and macroeconomic persistence. Journal of
Monetary Economics, 54, pp. 2065-2082.
Milani, F., 2011. Expectation Shocks and Learning as Drivers of the Business Cycle.
Economic Journal, 121 (552), pp. 379-401.
Williams, N., 2003. Adaptive Learning and Business Cycles. Working paper.
Figure 1A: Convergence under OLS learning with T = 1 over 1,000 periods
[Four panels: confidence intervals for φkk, φka, φak and φaa. x-axis: periods (100 to 1,000); y-axis: deviation from MCE.]
x-axis is time; y-axis is the deviation of the element of the PLM from its value at the MCEE.
95% of responses lie within the shaded areas. Graphs taken from 25,000 repetitions.
Figure 1B: Convergence under OLS learning with T = ∞ over 100 periods
[Four panels: confidence intervals for φkk, φka, φak and φaa. x-axis: periods (10 to 100); y-axis: deviation from MCE.]
x-axis is time; y-axis is the deviation of the element of the PLM from its value at the MCEE.
95% of responses lie within the shaded areas. Graphs taken from 25,000 repetitions.
Figure 2A: Convergence under OLS learning with T = 1, impulse responses
[Six panels: confidence intervals for the impulse response at successive dates during learning (plot titles: time 20, 100, 200, 500, 700 and 1,000). x-axis: periods (5 to 50); y-axis: deviation from steady state.]
x-axis shows the number of periods from the impulse; y-axis the deviation of consumption from its
steady state. The solid line is the impulse response at the MCEE, the dotted line the mean response
under learning. 99% of the responses under learning lie within the shaded areas. Graphs taken
from 25,000 repetitions.
Figure 2B: Convergence under OLS learning with T = ∞, impulse responses
[Six panels: confidence intervals for the impulse response at t = 10, 20, 50, 100, 200 and 500. x-axis: periods (5 to 50); y-axis: deviation from steady state.]
x-axis shows the number of periods from the impulse; y-axis the deviation of consumption from its
steady state. The solid line is the impulse response at the MCEE, the dotted line the mean response
under learning. 99% of the responses under learning lie within the shaded areas. Graphs taken
from 25,000 repetitions.
Figure 3: The largest eigenvalue for different forecast horizons
[Single panel. x-axis: forecast horizon, T (0 to 600); y-axis: maximum eigenvalue.]
Figure 4A: Convergence of φkk under constant gain learning with T = 1 over 1,000 periods for different values of the gain
[Four panels: confidence intervals for φkk with gain γ = 0.001, 0.002, 0.01 and 0.02. x-axis: periods; y-axis: deviation from MCE.]
x-axis is time; y-axis is the deviation of the element of the PLM from its value at the MCEE.
95% of responses lie within the shaded areas. Graphs taken from 25,000 repetitions.
Figure 4B: Convergence of φkk under constant gain learning with T = ∞ over 100 periods for different values of the gain
[Four panels: confidence intervals for φkk with gain γ = 0.001, 0.002, 0.01 and 0.02. x-axis: periods; y-axis: deviation from MCE.]
x-axis is time; y-axis is the deviation of the element of the PLM from its value at the MCEE.
95% of responses lie within the shaded areas. Graphs taken from 25,000 repetitions.
Figure 5: The stationary distribution of φ
(best viewed in colour; if viewed in monochrome, note that as the gain rises, the means
of the distributions fall)
[Four panels: stationary distributions of φkk, φka, φak and φaa. Legend: gains 0.02, 0.01, 0.005, 0.002, 0.001 and 0.0001.]
Figure 6: A simple example: impulse responses of consumption with different beliefs
[Single panel. x-axis: periods (0 to 30). Legend: phi0 = phi*, no learning; phi0 < phi*; phi0 = phi*; phi0 > phi*.]
Figure 7: Changing gain: difference between impulse responses with learning and those at the MCEE
[Four panels: consumption, output, labour and investment. x-axis: periods (0 to 50). Legend: gains 0.001 and 0.02.]
Figure 8: Changing forecast horizon: difference between impulse responses with learning and those at the MCEE
[Four panels: consumption, output, labour and investment. x-axis: periods (0 to 50). Legend: T = 1 and T = ∞.]
Table 1: Properties of the stationary distribution of the economy with constant gain learning

φkk (φ∗kk = 0.964)
Gain     Mean     SD      Skewness
0.0001   0.964    0.000    0.000
0.001    0.963    0.001    0.000
0.002    0.963    0.002    0.007
0.005    0.962    0.003    0.026
0.01     0.962    0.005    0.093
0.02     0.962    0.005    0.317

φka (φ∗ka = 0.058)
Gain     Mean     SD      Skewness
0.0001   0.059    0.000    0.000
0.001    0.059    0.001    0.000
0.002    0.059    0.001   −0.157
0.005    0.059    0.002   −0.735
0.01     0.058    0.003   −1.049
0.02     0.058    0.004   −0.944

φak (φ∗ak = 0.000)
Gain     Mean     SD      Skewness
0.0001   0.000    0.004    0.000
0.001   −0.001    0.014    0.000
0.002   −0.002    0.020   −0.012
0.005   −0.004    0.033   −0.077
0.01    −0.007    0.050   −0.106
0.02    −0.014    0.077   −0.159

φaa (φ∗aa = 0.900)
Gain     Mean     SD      Skewness
0.0001   0.900    0.004    0.000
0.001    0.899    0.011   −0.210
0.002    0.898    0.016   −0.283
0.005    0.897    0.025   −0.351
0.01     0.893    0.035   −0.432
0.02     0.886    0.054   −0.558
Table 2: Changing gain: ratio of moments of key variables in the economy with learning to their value at the MCEE
Gain c y n x Proj
0.001 1.002 0.999 1.003 1.001 0.00%
0.002 1.003 0.997 1.004 1.000 0.00%
0.005 1.008 0.993 1.005 0.997 0.00%
0.01 1.017 0.984 1.007 0.993 0.14%
0.02 1.030 0.977 1.019 0.990 1.05%
0.05 1.085 0.975 1.029 0.989 4.12%
Table 3: Changing forecast horizon: ratio of moments of consumption in the economy with learning to their value at the MCEE
T c y n x Proj
1 1.001 1.002 1.009 1.006 0.33%
2 1.001 1.001 1.007 1.005 0.33%
5 1.001 0.999 1.002 1.000 0.31%
10 1.002 0.996 0.997 0.996 0.29%
50 1.007 0.990 0.998 0.995 0.21%
∞ 1.017 0.984 1.007 0.993 0.14%
Table 4: Changing gain with rational bubble paths: ratio of moments of key variables in the economy with learning to their value at the MCEE
Gain c y n x Errors
0.001 1.002 0.999 1.003 1.001 0%
0.002 1.003 0.997 1.004 1.000 0%
0.005 1.011 0.990 1.011 1.000 0%
0.01 1.078 0.983 1.120 1.045 9%
0.02 2.023 1.031 2.685 1.985 31%
0.05 n/a n/a n/a n/a 100%
Table 4: Changing elasticity of labour supply, η: ratio of moments of key variables in the economy with learning to their value at the MCEE
η c y n x
0 1.000 0.981 1.024 1.005
1 1.006 0.981 1.016 0.997
2 1.010 0.982 1.012 0.994
5 1.017 0.984 1.007 0.993
10 1.021 0.986 1.005 0.990
∞ 1.028 0.991 1.003 0.989
Table 5: Changing persistence of technology, ρ: ratio of moments of key variables in the economy with learning to their value at the MCEE
ρ c y n x Proj
0.5 1.024 0.998 1.005 1.002 0.03%
0.6 1.024 0.996 1.004 1.000 0.04%
0.7 1.023 0.994 1.003 0.998 0.05%
0.8 1.022 0.992 1.003 0.996 0.06%
0.9 1.017 0.984 1.007 0.993 0.14%
0.95 1.010 0.977 1.034 0.991 0.37%
Table 6: Changing gain with intercept in learning rule: ratio of moments of key variables in the economy with learning to their value at the MCEE
Gain c y n x Errors
0.001 1.009 0.995 1.009 1.000 0%
0.002 1.019 0.983 1.021 0.998 0%
0.005 1.025 0.974 1.028 0.995 0%
0.01 1.041 0.960 1.031 0.986 4%
0.02 1.063 0.942 1.046 0.979 9%
0.05 n/a n/a n/a n/a 100%
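A Linearisation
A.1 The steady state
From the Euler equation (4):
\[ R = \beta^{-1} \tag{A.1} \]
From (7):
\[ r_k = (1-\alpha)\frac{y}{k} \tag{A.2} \]
so
\[ (1-\alpha)\frac{Y}{K} = \beta^{-1} - (1-\delta) \tag{A.3} \]
\[ \frac{K}{Y} = \frac{1-\alpha}{R - 1 + \delta} \tag{A.4} \]
Then from the capital evolution equation (3):
\[ I = \delta K \tag{A.5} \]
Then from the resource constraint:
\[ \frac{C}{Y} = 1 - \frac{I}{Y} \tag{A.6} \]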
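As a quick numerical illustration of the steady-state relations, the sketch below evaluates them for a hypothetical quarterly calibration. The parameter values are assumptions chosen for illustration and are not taken from the paper; the rental rate is computed as R − 1 + δ, which is what (A.4) and (A.34) imply, and α is treated as the labour share, as in the production function (A.16).

```python
def steady_state_ratios(beta, alpha, delta):
    """Steady-state ratios implied by (A.1)-(A.6).

    alpha is the labour share, matching the production function (A.16).
    """
    R = 1.0 / beta                  # (A.1): gross return
    r_k = R - 1.0 + delta           # rental rate of capital
    k_over_y = (1.0 - alpha) / r_k  # (A.4)
    i_over_y = delta * k_over_y     # (A.5): I = delta * K
    c_over_y = 1.0 - i_over_y       # (A.6)
    return {"R": R, "r_k": r_k, "K/Y": k_over_y, "I/Y": i_over_y, "C/Y": c_over_y}


# Hypothetical calibration (illustration only, not from the paper).
ss = steady_state_ratios(beta=0.99, alpha=0.667, delta=0.025)
for name, value in ss.items():
    print(f"{name}: {value:.4f}")
```

The consumption and investment shares sum to one by construction, which gives a simple sanity check on any calibration passed to the function.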
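A.2 Linearisation
The Euler equation (4):
\[ E^s_t \Delta c^s_{t+1} = E^s_t r^s_{t+1} \tag{A.7} \]
Definition of return to capital:
\[ r_t = \kappa r_{kt} \tag{A.8} \]
\[ \kappa = \frac{R - 1 + \delta}{R} \tag{A.9} \]
FOC for labour (5):
\[ n^s_t = \varsigma \left( w^s_t - c^s_t \right) \tag{A.10} \]
where
\[ \varsigma = \frac{1-n}{N\eta} \tag{A.11} \]
The household’s budget constraint (2):
\[ \frac{C^s}{Y^s} c^s_t + \left(1 - \frac{C^s}{Y^s}\right) i^s_t = \alpha \left( w^s_t + n^s_t \right) + (1-\alpha) \left( r^s_{kt} + k^s_t \right) \tag{A.12} \]
and the capital evolution equation (3):
\[ k^s_{t+1} = (1-\delta) k^s_t + \delta i^s_t \tag{A.13} \]
The firm FOCs (7):
\[ w_t = y_t - n_t \tag{A.14} \]
\[ r_{kt} = y_t - k_t \tag{A.15} \]
The production function (6):
\[ y_t = \alpha a_t + \alpha n_t + (1-\alpha) k_t \tag{A.16} \]
and the aggregate resource constraint is
\[ y_t = \frac{C}{Y} c_t + \left(1 - \frac{C}{Y}\right) i_t \tag{A.17} \]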
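A.3 Market clearing prices
For the next two sections, the derivation is from the modeler’s perspective, i.e. with the
knowledge that in equilibrium all households are identical.
To get an expression for the wage, use (A.10) and (A.14) to write
\[ w_t = \frac{\alpha a_t + (1-\alpha) k_t + (1-\alpha)\varsigma c_t}{1 + (1-\alpha)\varsigma} \tag{A.18} \]
For the rental rate:
\begin{align}
r_{kt} &= \alpha a_t - \alpha k_t + \alpha n_t \tag{A.19} \\
&= \alpha a_t - \alpha k_t + \nu\alpha \left( \alpha a_t + (1-\alpha) k_t - c_t \right) \tag{A.20}
\end{align}
Then using (A.8):
\[ r_t = \kappa \left[ \left( \nu(1-\alpha) - 1 \right) k_t + (1 + \nu\alpha) a_t - \nu c_t \right] \tag{A.21} \]
Write these as
\[ w_t = \lambda_{wk} k_t + \lambda_{wa} a_t + \lambda_{wc} c_t \tag{A.22} \]
\[ r_t = \lambda_{rk} k_t + \lambda_{ra} a_t + \lambda_{rc} c_t \tag{A.23} \]
where
\begin{align}
\lambda_{wk} &= \nu(1-\alpha) \tag{A.24} \\
\lambda_{wa} &= \nu\alpha \tag{A.25} \\
\lambda_{wc} &= \nu(1-\alpha)\varsigma \tag{A.26} \\
\lambda_{rk} &= \lambda_3 \left( \nu(1-\alpha) - 1 \right) \tag{A.27} \\
\lambda_{ra} &= \lambda_3 (1 + \nu\alpha) \tag{A.28} \\
\lambda_{rc} &= -\lambda_3 \nu \tag{A.29}
\end{align}
Then (A.22) can be combined with (A.10) to give a relation between labour supply and
consumption:
\begin{align}
n_t &= \varsigma \left( \frac{\alpha a_t + (1-\alpha) k_t + (1-\alpha)\varsigma c_t}{1 + (1-\alpha)\varsigma} - c^s_t \right) \tag{A.30} \\
&= \varsigma \left( \frac{\alpha a_t + (1-\alpha) k_t - c_t}{1 + (1-\alpha)\varsigma} \right) \tag{A.31}
\end{align}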
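The wage and labour expressions can be checked numerically. The sketch below computes the wage from (A.18), labour from (A.10) and output from (A.16), then confirms that the firm's condition (A.14) holds and that labour equals ν(αa + (1 − α)k − c) with ν as defined in (A.45). The function name and all numerical inputs are hypothetical and chosen only for illustration.

```python
def market_clearing(a, k, c, alpha, varsigma):
    """Wage, labour and output consistent with (A.10), (A.14) and (A.16)."""
    denom = 1.0 + (1.0 - alpha) * varsigma
    # (A.18): market-clearing wage
    w = (alpha * a + (1.0 - alpha) * k + (1.0 - alpha) * varsigma * c) / denom
    n = varsigma * (w - c)                          # (A.10): labour supply
    y = alpha * a + alpha * n + (1.0 - alpha) * k   # (A.16): production function
    return w, n, y


# Hypothetical log-deviations and parameters (illustration only).
alpha, varsigma = 0.667, 2.0
a, k, c = 0.01, -0.02, 0.005
w, n, y = market_clearing(a, k, c, alpha, varsigma)

nu = varsigma / (1.0 + (1.0 - alpha) * varsigma)            # (A.45)
wage_gap = abs(w - (y - n))                                  # (A.14) residual
labour_gap = abs(n - nu * (alpha * a + (1.0 - alpha) * k - c))  # (A.31) residual
print(wage_gap, labour_gap)
```

Both residuals should be zero up to floating-point rounding.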
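A.4 A useful representation
This follows Campbell (1994) to write the capital evolution in terms of states and
consumption. Combining (A.13) and (A.17) gives:
\begin{align}
k_{t+1} &= (1-\delta) k_t + \frac{i}{k} \left( \frac{y}{i} y_t - \frac{c}{i} c_t \right) \tag{A.32} \\
&= \lambda_1 k_t + \lambda_2 \left[ a_t + n_t \right] + \lambda_4 c_t \tag{A.33}
\end{align}
where
\begin{align}
\lambda_1 &= (1-\delta) + \frac{y}{k}(1-\alpha) = \frac{1}{\beta} \tag{A.34} \\
\lambda_2 &= \frac{y}{k}\alpha \tag{A.35} \\
\lambda_4 &= -\frac{c}{k} \tag{A.36}
\end{align}
Then substituting (A.16) into (A.15) gives
\[ r_{kt} = \alpha a_t - \alpha k_t + \alpha n_t \tag{A.37} \]
and using (A.8)
\[ r_t = \kappa \left( a_t - k_t + n_t \right) \]
Next substitute in (A.10) to get
\begin{align}
k_{t+1} &= \lambda_1 k_t + \lambda_2 \left[ a_t + n_t \right] + \lambda_4 c_t \tag{A.38} \\
&= \lambda_1 k_t + \lambda_2 \left[ a_t + \varsigma (w_t - c_t) \right] + \lambda_4 c_t \tag{A.39}
\end{align}
and finally use (A.22) to give
\begin{align}
k_{t+1} &= \frac{1}{\gamma} \left( \lambda_1 k_t + \lambda_2 \left[ a_t + \varsigma \left( \frac{\alpha a_t + (1-\alpha)\varsigma c_t + (1-\alpha) k_t}{1 + (1-\alpha)\varsigma} - c_t \right) \right] + \lambda_4 c_t \right) \tag{A.40} \\
&= \tilde{\lambda}_1 k_t + \tilde{\lambda}_2 a_t + \tilde{\lambda}_4 c_t \tag{A.41}
\end{align}
where
\begin{align}
\tilde{\lambda}_1 &= \lambda_1 + \lambda_2 (1-\alpha)\nu \tag{A.42} \\
\tilde{\lambda}_2 &= \lambda_2 (1 + \alpha\nu) \tag{A.43} \\
\tilde{\lambda}_4 &= \lambda_4 + \lambda_2 \varsigma \left[ 1 - (1-\alpha)\nu \right] = \lambda_4 + \lambda_2 \nu = 1 - \lambda_1 - \lambda_2 \tag{A.44} \\
\nu &= \frac{\varsigma}{1 + (1-\alpha)\varsigma} \tag{A.45}
\end{align}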
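B Optimal consumption given prices
Substituting for investment from (A.13) and labour from (A.10) into (A.12) gives
\[ \frac{c}{y} c^s_t + \frac{k}{y} \left( k^s_{t+1} - (1-\delta) k^s_t \right) = \alpha \left( w^s_t + \varsigma (w^s_t - c^s_t) \right) + (1-\alpha) \left( r_{kt} + k^s_t \right) \tag{B.1} \]
Rearranging, and substituting for the return from (A.8):
\[ k^s_t = \frac{1}{\gamma_1} \frac{k}{y} k^s_{t+1} + \frac{1}{\gamma_1} \left[ \gamma_2 c^s_t - \gamma_3 w^s_t - \frac{1-\alpha}{\kappa} r_t \right] \tag{B.2} \]
where
\begin{align}
\gamma_1 &= (1-\delta)\frac{k}{y} + (1-\alpha) \tag{B.3} \\
\gamma_2 &= \frac{c}{y} + \alpha\varsigma \tag{B.4} \\
\gamma_3 &= \alpha (1 + \varsigma) \tag{B.5} \\
\gamma_5 &= \frac{1-\alpha}{\kappa} \tag{B.6}
\end{align}
and note from (A.4)
\[ \frac{1}{\gamma_1} \frac{k}{y} = \frac{1}{(1-\delta) + (1-\alpha)\frac{y}{k}} = \beta \tag{B.7} \]
so
\[ k^s_t = \beta k^s_{t+1} + \frac{1}{\gamma_1} \left[ \gamma_2 c^s_t - \gamma_3 w^s_t - \gamma_5 r^s_t \right] \tag{B.8} \]
Then solving T periods forward gives
\[ \gamma_1 k^s_t = \gamma_2 \sum_{j=0}^{T} \beta^j c^s_{t+j} - \gamma_3 \sum_{j=0}^{T} \beta^j w^s_{t+j} - \gamma_5 \sum_{j=0}^{T} \beta^j r^s_{t+j} + \gamma_1 \beta^{T+1} k^s_{t+T+1} \tag{B.9} \]
Rearranging, then taking expectations gives
\[ \gamma_2 E^s_t \sum_{j=0}^{T} \beta^j c^s_{t+j} = \gamma_1 k^s_t + \gamma_3 w^s_t + \gamma_5 r^s_t + E^s_t \sum_{j=1}^{T} \beta^j \left( \gamma_3 w^s_{t+j} + \gamma_5 r^s_{t+j} \right) - \gamma_1 \beta^{T+1} E^s_t k^s_{t+T+1} \tag{B.10} \]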
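From the Euler equation (A.7):
\[ E^s_t c^s_{t+j} = c^s_t + E^s_t \sum_{i=1}^{j} r^s_{t+i} \tag{B.11} \]
so
\[ E^s_t \sum_{j=0}^{T} \beta^j c^s_{t+j} = \frac{1-\beta^{T+1}}{1-\beta} c^s_t + E^s_t \sum_{j=1}^{T} \beta^j \sum_{i=1}^{j} r^s_{t+i} \tag{B.12} \]
The second term of this is
\begin{align}
\sum_{j=1}^{T} \beta^j \sum_{i=1}^{j} r^s_{t+i}
&= \beta r_{t+1} + \beta^2 (r_{t+1} + r_{t+2}) + \beta^3 (r_{t+1} + r_{t+2} + r_{t+3}) + \dots \notag \\
&= \beta \left( r_{t+1} + \beta r_{t+2} + \dots + \beta^{T-1} r_{t+T} \right) + \beta^2 \left( r_{t+1} + \beta r_{t+2} + \dots + \beta^{T-2} r_{t+T-1} \right) + \dots + \beta^T r_{t+1} \notag \\
&= \beta \left( 1 + \beta + \dots + \beta^{T-1} \right) r_{t+1} + \beta^2 \left( 1 + \beta + \dots + \beta^{T-2} \right) r_{t+2} + \dots \notag \\
&= \frac{\beta (1-\beta^{T})}{1-\beta} r_{t+1} + \frac{\beta^2 (1-\beta^{T-1})}{1-\beta} r_{t+2} + \dots \notag \\
&= \frac{\beta}{1-\beta} \left( r_{t+1} + \beta r_{t+2} + \dots \right) - \frac{\beta^{T+1}}{1-\beta} \left( r_{t+1} + r_{t+2} + r_{t+3} + \dots \right) \notag \\
&= \frac{1}{1-\beta} \sum_{j=1}^{T} \beta^j r^s_{t+j} - \frac{\beta^{T+1}}{1-\beta} \sum_{j=1}^{T} r^s_{t+j} \tag{B.13}
\end{align}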
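The rearrangement of the double sum in (B.13) is easy to confirm numerically. The sketch below compares the term-by-term double sum with the closed form for an arbitrary sequence of returns; the function names, the discount factor and the randomly drawn returns are assumptions used only for this check.

```python
import random


def double_sum(beta, r):
    """sum_{j=1}^{T} beta^j * sum_{i=1}^{j} r_{t+i}, with r[0] = r_{t+1}."""
    T = len(r)
    return sum(beta ** j * sum(r[:j]) for j in range(1, T + 1))


def closed_form(beta, r):
    """Right-hand side of (B.13)."""
    T = len(r)
    discounted = sum(beta ** j * r[j - 1] for j in range(1, T + 1))
    undiscounted = sum(r)
    return (discounted - beta ** (T + 1) * undiscounted) / (1.0 - beta)


random.seed(0)
beta = 0.99                                              # hypothetical discount factor
returns = [random.gauss(0.0, 0.01) for _ in range(40)]   # r_{t+1}, ..., r_{t+T}
lhs = double_sum(beta, returns)
rhs = closed_form(beta, returns)
print(abs(lhs - rhs))
```

The two expressions agree up to floating-point rounding for any horizon and any sequence of returns.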
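then (B.12) becomes
\[ \gamma_2 E^s_t \sum_{j=0}^{T} \beta^j c^s_{t+j} = \gamma_2 \frac{1-\beta^{T+1}}{1-\beta} c^s_t + \gamma_2 \frac{1}{1-\beta} E^s_t \sum_{j=1}^{T} \beta^j r^s_{t+j} - \gamma_2 \frac{\beta^{T+1}}{1-\beta} E^s_t \sum_{j=1}^{T} r^s_{t+j} \tag{B.14} \]
Substituting this into (B.10):
\begin{align}
& \gamma_2 \frac{1-\beta^{T+1}}{1-\beta} c^s_t + \gamma_2 \frac{1}{1-\beta} E^s_t \sum_{j=1}^{T} \beta^j r^s_{t+j} - \gamma_2 \frac{\beta^{T+1}}{1-\beta} E^s_t \sum_{j=1}^{T} r^s_{t+j} \notag \\
&\quad = \gamma_1 k^s_t + \gamma_3 w^s_t + \gamma_5 r^s_t + E^s_t \sum_{j=1}^{T} \beta^j \left( \gamma_3 w^s_{t+j} + \gamma_5 r^s_{t+j} \right) - \gamma_1 \beta^{T+1} E^s_t k^s_{t+T+1} \notag
\end{align}
and rearranging gives
\begin{align}
c^s_t = \frac{1-\beta}{\gamma_2 (1-\beta^{T+1})} \Bigg[ & \gamma_1 k^s_t + \gamma_3 w^s_t + \gamma_5 r^s_t + \gamma_3 E^s_t \sum_{j=1}^{T} \beta^j w^s_{t+j} + \left( \gamma_5 - \gamma_2 \frac{1}{1-\beta} \right) E^s_t \sum_{j=1}^{T} \beta^j r^s_{t+j} \notag \\
& + \gamma_2 \frac{\beta^{T+1}}{1-\beta} E^s_t \sum_{j=1}^{T} r^s_{t+j} - \gamma_1 \beta^{T+1} E^s_t k^s_{t+T+1} \Bigg] \notag
\end{align}
or
\begin{align}
c^s_t ={}& \gamma_{ck} \left[ \gamma_1 k^s_t + \gamma_3 w^s_t + \gamma_5 r^s_t \right] + \gamma_{cw} E^s_t \sum_{j=1}^{T} \beta^j w^s_{t+j} + \gamma_{cr} E^s_t \sum_{j=1}^{T} \beta^j r^s_{t+j} \tag{B.15} \\
& + \gamma_{cs} E^s_t \sum_{j=1}^{T} r^s_{t+j} - \gamma_1 \gamma_{ck} \beta^{T+1} E^s_t k^s_{t+T+1} \tag{B.16}
\end{align}
where
\begin{align}
\gamma_{ck} &= \frac{1-\beta}{\gamma_2 (1-\beta^{T+1})} \tag{B.17} \\
\gamma_{cw} &= \gamma_3 \gamma_{ck} \tag{B.18} \\
\gamma_{cr} &= \gamma_{ck} \left( \gamma_5 - \gamma_2 \frac{1}{1-\beta} \right) \tag{B.19} \\
\gamma_{cs} &= \gamma_{ck} \gamma_2 \frac{\beta^{T+1}}{1-\beta} \tag{B.20}
\end{align}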
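C Learning
The state vector is
\[ X_t = \begin{bmatrix} k_t & a_t \end{bmatrix}' \]
and the PLM for states is
\[ X_t = \phi^s_t X_{t-1} + \varepsilon_{\phi t} \tag{C.1} \]
and for prices
\[ Z^s_t = \varphi^s_t X_t + \varepsilon_{\varphi t} \tag{C.2} \]
where $Z_t = \begin{bmatrix} w^s_t & r^s_t \end{bmatrix}'$.
Then
\[ E^s_t w^s_{t+i} = T_w \varphi^s_t (\phi^s_t)^i X_t \tag{C.3} \]
\[ E^s_t r^s_{t+i} = T_r \varphi^s_t (\phi^s_t)^i X_t \tag{C.4} \]
$T_w$ and $T_r$ are matrices such that $w^s_t = T_w Z^s_t$ and $r^s_t = T_r Z^s_t$.
Then the expectational terms in (B.15) are
\begin{align}
E_t \sum_{j=1}^{T} \beta^j w_{t+j} &= T_w \varphi_t \sum_{j=1}^{T} \beta^j \phi_t^j X_t \tag{C.5} \\
&= T_w \varphi_t \beta\phi_t \left( I - \beta^T \phi_t^T \right) \left( I - \beta\phi_t \right)^{-1} X_t \tag{C.6}
\end{align}
where $\phi_t^T$ denotes the T-th power of $\phi_t$, not its transpose. Similarly,
\[ E_t \sum_{j=1}^{T} \beta^j r_{t+j} = T_r \varphi_t \beta\phi_t \left( I - \beta^T \phi_t^T \right) \left( I - \beta\phi_t \right)^{-1} X_t \tag{C.7} \]
so the consumption function (B.15) can be written as
\begin{align}
c^s_t ={}& \gamma_{ck} \gamma_1 k^s_t + \gamma_{ck} \left( \gamma_3 w_t + \gamma_5 r_t \right) + \left( \gamma_{cw} T_w + \gamma_{cr} T_r \right) \varphi_t \beta\phi_t \left( I - \beta^T \phi_t^T \right) \left( I - \beta\phi_t \right)^{-1} X_t \tag{C.8} \\
& + \gamma_{cs} T_r \varphi_t \phi_t \left( I - \phi_t^T \right) \left( I - \phi_t \right)^{-1} X_t - \gamma_1 \gamma_{ck} \beta^{T+1} T_k \phi_t^{T+1} X_t \tag{C.9} \\
={}& \gamma_{ck} \left( \gamma_1 k_t + \gamma_3 w_t + \gamma_5 r_t \right) + \gamma_{cX} X_t \tag{C.10}
\end{align}
where
\[ \gamma_{cX} = \left( \gamma_{cw} T_w + \gamma_{cr} T_r \right) \varphi_t \beta\phi_t \left( I - \beta^T \phi_t^T \right) \left( I - \beta\phi_t \right)^{-1} + \gamma_{cs} T_r \varphi_t \phi_t \left( I - \phi_t^T \right) \left( I - \phi_t \right)^{-1} - \gamma_1 \gamma_{ck} T_k \left( \beta\phi_t \right)^{T+1} \]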
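The step from (C.5) to (C.6) uses the finite geometric sum for matrix powers. The sketch below checks it for a 2×2 belief matrix, comparing the term-by-term sum with the closed form. The entries of the belief matrix are the MCEE values reported in Table 1; the discount factor, the horizon and the helper functions are assumptions introduced for this illustration.

```python
I2 = [[1.0, 0.0], [0.0, 1.0]]


def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lin(a, b, cb=1.0):
    """Return a + cb * b elementwise."""
    return [[a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def inv(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def power(a, n):
    """n-th matrix power by repeated multiplication."""
    out = I2
    for _ in range(n):
        out = mul(out, a)
    return out


def discounted_sum_direct(beta, phi, T):
    """sum_{j=1}^{T} beta^j phi^j, accumulated term by term."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(1, T + 1):
        total = lin(total, power(phi, j), beta ** j)
    return total


def discounted_sum_closed(beta, phi, T):
    """beta*phi (I - beta^T phi^T)(I - beta*phi)^{-1}, as in (C.6)."""
    bphi = [[beta * x for x in row] for row in phi]
    return mul(mul(bphi, lin(I2, power(bphi, T), -1.0)), inv(lin(I2, bphi, -1.0)))


beta = 0.99                            # hypothetical discount factor
phi = [[0.964, 0.058], [0.0, 0.900]]   # MCEE values from Table 1
direct = discounted_sum_direct(beta, phi, 50)
closed = discounted_sum_closed(beta, phi, 50)
gap = max(abs(direct[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print(gap)
```

The closed form requires I − βφ to be invertible, which holds here because the eigenvalues of βφ lie inside the unit circle.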
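C.1 Special cases
T = ∞
\begin{align}
c^s_t &= \frac{1-\beta}{\gamma_2} \left( \gamma_1 k_t + \gamma_3 w_t + \gamma_5 r_t \right) + \gamma_{cX} X_t \tag{C.11} \\
\gamma_{cX} &= \left( \gamma_{cw} T_w + \gamma_{cr} T_r \right) \varphi_t \beta\phi \left( I - \beta\phi \right)^{-1} \tag{C.12} \\
\gamma_{cw} &= \frac{\gamma_3 (1-\beta)}{\gamma_2} \tag{C.13} \\
\gamma_{cr} &= \frac{\gamma_5 (1-\beta)}{\gamma_2} - 1 \tag{C.14}
\end{align}
T = 1
\begin{align}
c^s_t &= \frac{1}{\gamma_2 (1+\beta)} \left( \gamma_1 k_t + \gamma_3 w_t + \gamma_5 r_t \right) + \gamma_{cX} X_t \tag{C.15} \\
\gamma_{cX} &= \left( \gamma_{cw} T_w + \gamma_{cr} T_r \right) \varphi_t \beta\phi_t + \frac{\beta^2}{1-\beta^2} T_r \varphi_t \phi_t - \frac{\gamma_1}{\gamma_2 (1+\beta)} T_k \left( \beta\phi_t \right)^2 \tag{C.16} \\
\gamma_{ck} &= \frac{1}{\gamma_2 (1+\beta)} \tag{C.17} \\
\gamma_{cw} &= \frac{\gamma_3}{\gamma_2 (1+\beta)} \tag{C.18} \\
\gamma_{cr} &= \frac{1}{\gamma_2 (1+\beta)} \left( \gamma_5 - \gamma_2 \frac{1}{1-\beta} \right) \tag{C.19} \\
\gamma_{cs} &= \frac{\beta^2}{1-\beta^2} \tag{C.20}
\end{align}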
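The two special cases follow directly from the general coefficients (B.17)-(B.20). As an illustration, the sketch below evaluates the general formulas at T = 1 and at a very large T (as a stand-in for T = ∞) and compares them with the closed forms above. The values of β and of the γ coefficients are hypothetical and serve only to exercise the formulas.

```python
def consumption_coefficients(beta, gamma2, gamma3, gamma5, T):
    """Coefficients (B.17)-(B.20) of the consumption function for horizon T."""
    g_ck = (1.0 - beta) / (gamma2 * (1.0 - beta ** (T + 1)))
    g_cw = gamma3 * g_ck
    g_cr = g_ck * (gamma5 - gamma2 / (1.0 - beta))
    g_cs = g_ck * gamma2 * beta ** (T + 1) / (1.0 - beta)
    return g_ck, g_cw, g_cr, g_cs


# Hypothetical values (illustration only, not from the paper).
beta, gamma2, gamma3, gamma5 = 0.99, 1.1, 1.3, 9.5

# T = 1: compare with (C.17)-(C.20).
t1 = consumption_coefficients(beta, gamma2, gamma3, gamma5, T=1)
t1_closed = (
    1.0 / (gamma2 * (1.0 + beta)),
    gamma3 / (gamma2 * (1.0 + beta)),
    (gamma5 - gamma2 / (1.0 - beta)) / (gamma2 * (1.0 + beta)),
    beta ** 2 / (1.0 - beta ** 2),
)

# Large T approximates T = infinity: compare with (C.11), (C.13), (C.14).
t_inf = consumption_coefficients(beta, gamma2, gamma3, gamma5, T=5000)
t_inf_closed = (
    (1.0 - beta) / gamma2,
    gamma3 * (1.0 - beta) / gamma2,
    gamma5 * (1.0 - beta) / gamma2 - 1.0,
    0.0,  # the terminal-horizon coefficient vanishes as T grows
)

print(max(abs(x - y) for x, y in zip(t1, t1_closed)))
print(max(abs(x - y) for x, y in zip(t_inf, t_inf_closed)))
```

The last coefficient shrinks with β^{T+1}, which is why the undiscounted sum of returns drops out of the consumption function when T = ∞.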
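C.2 Comparison with Carceles-Poveda and Giannitsarou (2007)
Rearranging (A.41) gives
\[ \tilde{\lambda}_4 c_t = k_{t+1} - \tilde{\lambda}_1 k_t - \tilde{\lambda}_2 a_t \tag{C.21} \]
\[ \tilde{\lambda}_4 E_t c_{t+1} = E_t k_{t+2} - \tilde{\lambda}_1 E_t k_{t+1} - \tilde{\lambda}_2 E_t a_{t+1} \tag{C.22} \]
and using (A.23)
\[ E_t r_{t+1} = \theta_{rk} E_t k_{t+1} + \theta_{ra} E_t a_{t+1} + \theta_{rc} E_t c_{t+1} \tag{C.23} \]
Substituting these into the Euler equation (A.7) gives
\[ k_{t+1} = \mu_1 k_t + \mu_2 a_t + \mu_3 E_t a_{t+1} + \mu_4 E_t k_{t+2} \tag{C.24} \]
where
\begin{align}
\mu_1 &= \frac{\tilde{\lambda}_1}{1 + \tilde{\lambda}_1 (1-\theta_{rc}) + \tilde{\lambda}_4 \theta_{rk}} \tag{C.25} \\
\mu_2 &= \frac{\tilde{\lambda}_2}{1 + \tilde{\lambda}_1 (1-\theta_{rc}) + \tilde{\lambda}_4 \theta_{rk}} \tag{C.26} \\
\mu_3 &= \frac{\tilde{\lambda}_2 (1-\theta_{rc}) + \tilde{\lambda}_4 \theta_{ra}}{1 + \tilde{\lambda}_1 (1-\theta_{rc}) + \tilde{\lambda}_4 \theta_{rk}} \tag{C.27} \\
\mu_4 &= \frac{1-\theta_{rc}}{1 + \tilde{\lambda}_1 (1-\theta_{rc}) + \tilde{\lambda}_4 \theta_{rk}} \tag{C.28}
\end{align}
(this is equivalent to equation (20) in Carceles-Poveda and Giannitsarou, 2007).
If the PLM is
\[ k_t = \phi_{kk} k_{t-1} + \phi_{ka} a_{t-1} \tag{C.29} \]
\[ a_t = \phi_{ak} k_{t-1} + \phi_{aa} a_{t-1} \tag{C.30} \]
and
\[ \phi = \begin{bmatrix} \phi_{kk} & \phi_{ka} \\ \phi_{ak} & \phi_{aa} \end{bmatrix} \tag{C.31} \]
then
\begin{align}
E_t k_{t+1} &= \phi_{kk} k_t + \phi_{ka} a_t \tag{C.32} \\
E_t k_{t+2} &= \phi_{kk} E_t k_{t+1} + \phi_{ka} E_t a_{t+1} \tag{C.33} \\
&= \phi_{kk} \left( \phi_{kk} k_t + \phi_{ka} a_t \right) + \phi_{ka} \left( \phi_{ak} k_t + \phi_{aa} a_t \right) \tag{C.34} \\
&= \left( \phi_{kk}^2 + \phi_{ka}\phi_{ak} \right) k_t + \left( \phi_{kk}\phi_{ka} + \phi_{ka}\phi_{aa} \right) a_t \tag{C.35}
\end{align}
Substituting this in the ALM (C.24) gives
\begin{align}
k_{t+1} &= \mu_4 \left[ \left( \phi_{kk}^2 + \phi_{ka}\phi_{ak} \right) k_t + \left( \phi_{kk}\phi_{ka} + \phi_{ka}\phi_{aa} \right) a_t \right] + \mu_1 k_t + \mu_3 \left[ \phi_{ak} k_t + \phi_{aa} a_t \right] + \mu_2 a_t \notag \\
&= \left[ \mu_4 \left( \phi_{kk}^2 + \phi_{ka}\phi_{ak} \right) + \mu_1 + \mu_3 \phi_{ak} \right] k_t + \left[ \mu_4 \phi_{ka} \left( \phi_{kk} + \phi_{aa} \right) + \mu_3 \phi_{aa} + \mu_2 \right] a_t \notag
\end{align}
so the mapping from the PLM to the ALM is
\[ T(\phi) = \begin{bmatrix} \mu_4 \left( \phi_{kk}^2 + \phi_{ka}\phi_{ak} \right) + \mu_1 + \mu_3 \phi_{ak} & \mu_4 \phi_{ka} \left( \phi_{kk} + \phi_{aa} \right) + \mu_3 \phi_{aa} + \mu_2 \\ 0 & \rho \end{bmatrix} \tag{C.36} \]
\[ T(\phi) - \phi = \begin{bmatrix} \mu_4 \left( \phi_{kk}^2 + \phi_{ka}\phi_{ak} \right) + \mu_1 + \mu_3 \phi_{ak} - \phi_{kk} & \mu_4 \phi_{ka} \left( \phi_{kk} + \phi_{aa} \right) + \mu_3 \phi_{aa} + \mu_2 - \phi_{ka} \\ -\phi_{ak} & \rho - \phi_{aa} \end{bmatrix} \]
The Jacobian of $T(\phi) - \phi$ with respect to $(\phi_{kk}, \phi_{ka}, \phi_{ak}, \phi_{aa})$ is
\[ J = \begin{bmatrix} 2\mu_4\phi_{kk} - 1 & \mu_4\phi_{ak} & \mu_4\phi_{ka} + \mu_3 & 0 \\ \mu_4\phi_{ka} & \mu_4 \left( \phi_{kk} + \phi_{aa} \right) - 1 & 0 & \mu_4\phi_{ka} + \mu_3 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} \tag{C.37} \]
Evaluated at $\phi_{ak} = 0$ and $\phi_{aa} = \rho$, the eigenvalues of $J$ other than $-1$ are
\[ \mu_4 \phi_{kk} + \rho\mu_4 - 1 \tag{C.38} \]
\[ 2\mu_4 \phi_{kk} - 1 \tag{C.39} \]
While it would take many pages of algebra to prove that the consumption function
implied by (C.24) is equivalent to (27), it is easy to check numerically that the eigenvalues
(and indeed the coefficients in (C.24)) are identical to those for the model of this paper.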
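A numerical check of this kind can be sketched as follows. The code builds the map from beliefs to the actual law of motion in (C.36), differentiates T(φ) − φ by central finite differences, and compares the result with the analytic Jacobian (C.37) and with the eigenvalue expressions (C.38) and (C.39). The µ values are hypothetical placeholders rather than the paper's calibration, the beliefs are the MCEE values from Table 1, and the function names are introduced only for this illustration.

```python
def t_map(phi, mu, rho):
    """T(phi) from (C.36); phi is ordered (kk, ka, ak, aa)."""
    kk, ka, ak, aa = phi
    mu1, mu2, mu3, mu4 = mu
    t_kk = mu4 * (kk * kk + ka * ak) + mu1 + mu3 * ak
    t_ka = mu4 * ka * (kk + aa) + mu3 * aa + mu2
    return [t_kk, t_ka, 0.0, rho]


def jacobian_analytic(phi, mu):
    """The matrix J in (C.37)."""
    kk, ka, ak, aa = phi
    mu3, mu4 = mu[2], mu[3]
    return [
        [2 * mu4 * kk - 1, mu4 * ak, mu4 * ka + mu3, 0.0],
        [mu4 * ka, mu4 * (kk + aa) - 1, 0.0, mu4 * ka + mu3],
        [0.0, 0.0, -1.0, 0.0],
        [0.0, 0.0, 0.0, -1.0],
    ]


def jacobian_numeric(phi, mu, rho, h=1e-6):
    """Central-difference Jacobian of T(phi) - phi."""
    def g(p):
        return [t - x for t, x in zip(t_map(p, mu, rho), p)]
    cols = []
    for j in range(4):
        up, dn = list(phi), list(phi)
        up[j] += h
        dn[j] -= h
        cols.append([(u - d) / (2 * h) for u, d in zip(g(up), g(dn))])
    return [[cols[j][i] for j in range(4)] for i in range(4)]


mu = (0.5, 0.03, 0.01, 0.48)     # hypothetical mu_1..mu_4 (illustration only)
rho = 0.9
phi = [0.964, 0.058, 0.0, 0.9]   # MCEE values from Table 1

J_a = jacobian_analytic(phi, mu)
J_n = jacobian_numeric(phi, mu, rho)
jac_gap = max(abs(J_a[i][j] - J_n[i][j]) for i in range(4) for j in range(4))

# With phi_ak = 0 the Jacobian is block triangular, so the eigenvalues other
# than -1 are the two leading diagonal entries, matching (C.39) and (C.38).
eig_c39 = J_a[0][0]
eig_c38 = J_a[1][1]
print(jac_gap, eig_c38, eig_c39)
```

Under these placeholder values both eigenvalues are negative, the usual condition for E-stability; with the paper's own coefficients the same check applies unchanged.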
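C.3 Market clearing
The modeler, using the knowledge that in equilibrium all households are identical, can
sum (C.10) over households to give an expression for aggregate consumption
\[ c_t = \gamma_{ck} \left( \gamma_1 k_t + \gamma_3 w_t + \gamma_5 r_t \right) + \gamma_{cX} X_t \tag{C.40} \]
Substituting for market clearing prices from (A.22) and (A.23) gives
\[ c_t = \gamma_{ck} \left( \gamma_1 k_t + \gamma_3 \left( \lambda_{wk} k_t + \lambda_{wa} a_t + \lambda_{wc} c_t \right) + \gamma_5 \left( \lambda_{rk} k_t + \lambda_{ra} a_t + \lambda_{rc} c_t \right) \right) + \gamma_{cX} X_t \tag{C.41} \]
or
\[ c_t = \gamma_X X_t \tag{C.42} \]
where
\[ \gamma_X = \frac{1}{1 - \gamma_{ck} \left( \gamma_3 \lambda_{wc} + \gamma_5 \lambda_{rc} \right)} \left[ \gamma_{ck} \left( \gamma_1 + \gamma_3 \lambda_{wk} + \gamma_5 \lambda_{rk} \right) T_k + \gamma_{ck} \left( \gamma_3 \lambda_{wa} + \gamma_5 \lambda_{ra} \right) T_a + \gamma_{cX} \right] \]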
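C.4 The actual law of motion
Substitute (C.42) into the capital evolution equation (A.41) to give
\[ k_{t+1} = \left( \begin{bmatrix} \tilde{\lambda}_1 & \tilde{\lambda}_2 \end{bmatrix} + \tilde{\lambda}_4 \gamma_X \right) X_t \tag{C.43} \]
Stack this on top of the process for aggregate technology to give
\[ X_{t+1} = \begin{bmatrix} \begin{bmatrix} \tilde{\lambda}_1 & \tilde{\lambda}_2 \end{bmatrix} + \tilde{\lambda}_4 \gamma_X \\ \begin{bmatrix} 0 & \rho \end{bmatrix} \end{bmatrix} X_t + \begin{bmatrix} 0 \\ 1 \end{bmatrix} \varepsilon_t \tag{C.44} \]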
D Simple example
Take a representative household maximizing expected discounted lifetime utility
\[ \max_{\{c_{t+i}\}_{i=0}^{\infty}} E_t \sum_{i=0}^{\infty} \frac{1}{(1+r)^i} \ln c_{t+i} \tag{D.1} \]
subject to a budget constraint
\[ b_{t+1} = (1+r) b_t + y_t - c_t \tag{D.2} \]
where $y_t$ is an exogenous process
\[ y_t = \rho y_{t-1} + \varepsilon_t \tag{D.3} \]
and the innovation is drawn from $N(0, \sigma)$.
Let the household have belief $\phi_t$ about the persistence of the income process, so
\[ E_t y_{t+i} = \phi_t^i y_t \tag{D.4} \]
Note that in the case of model-consistent expectations $\phi_t = \rho$.
The first-order condition for consumption is
\[ c_t = E_t c_{t+1} \tag{D.5} \]
and, using the transversality condition $\lim_{t\to\infty} \frac{1}{(1+r)^t} b_t = 0$, optimal consumption is
\[ c_t = \frac{r}{1+r} \left[ (1+r) b_t + \frac{1}{1 - \phi_t (1+r)^{-1}} y_t \right] \tag{D.6} \]
Let beliefs be updated according to a simple constant gain algorithm
\[ \phi_{t+1} = \phi_t + \gamma \left( y_t - \phi_t y_{t-1} \right) \tag{D.7} \]
Note that at the MCEE consumption is a random walk (taking initial wealth to be zero)
\[ \Delta c^*_t = \frac{r}{1+r} \left[ \frac{1}{1 - \rho (1+r)^{-1}} \right] \varepsilon_t \tag{D.8} \]
and
\[ \sigma^*_{\Delta c} = \frac{r}{1+r} \left[ \frac{1}{1 - \rho (1+r)^{-1}} \right] \sigma \tag{D.9} \]
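The simple example can be traced out numerically. The sketch below computes the response of consumption to a single income innovation using the consumption rule (D.6), the budget constraint (D.2) and the belief update (D.7), with and without learning. The parameter values and the function name are assumptions chosen for illustration; they are not the values behind Figure 6.

```python
def impulse_response(rho, r, gain, eps0, periods, learn=True):
    """Consumption path after one innovation eps0 at t = 0, from (D.2)-(D.7)."""
    b, y_prev, phi = 0.0, 0.0, rho   # zero initial wealth, belief at the MCEE
    path = []
    for t in range(periods):
        shock = eps0 if t == 0 else 0.0
        y = rho * y_prev + shock                                    # (D.3)
        c = r / (1 + r) * ((1 + r) * b + y / (1 - phi / (1 + r)))   # (D.6)
        path.append(c)
        b = (1 + r) * b + y - c                                     # (D.2)
        if learn:
            phi = phi + gain * (y - phi * y_prev)                   # (D.7)
        y_prev = y
    return path


# Hypothetical parameters (illustration only).
rho, r, gain, eps0 = 0.9, 0.01, 0.02, 1.0
mce = impulse_response(rho, r, gain, eps0, periods=30, learn=False)
learning = impulse_response(rho, r, gain, eps0, periods=30, learn=True)

# At the MCEE consumption jumps once and then stays flat, as in (D.8).
impact = r / (1 + r) * eps0 / (1 - rho / (1 + r))
print(mce[0], impact, learning[1] - mce[1])
```

Without learning the path is flat after the impact period, which is the random-walk property in (D.8). With learning the impact response is identical, because beliefs start at the MCEE, but consumption changes in later periods as beliefs are revised.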
D.1 Proof of proposition 1
Starting from $\phi_0 = \rho$, in response to an innovation $\varepsilon_0$ beliefs in period 1 are $\phi_1 = \rho + \gamma\varepsilon_0$
and then are revised back towards the MCEE, i.e. $\phi_t > \rho \ \forall t > 2$. Since $f'(\phi) > 0$ this
implies $c > c^*$ for a number of initial periods, then $c < c^*$ for the rest of history. If
we define the impulse response of consumption as a function IRF, then the standard
deviation of the first difference of consumption is given by
\[ \sigma_{\Delta c} = \sigma \sum_{t=0}^{\infty} \Delta IRF_t^2 \tag{D.10} \]
Since $\phi_0 = \rho = \phi^*$, $\Delta IRF_0 = \Delta IRF^*_0$, so
\[ \sigma_{\Delta c} = \sigma^*_{\Delta c} + \sigma \sum_{t=1}^{\infty} \Delta IRF_t^2 > \sigma^*_{\Delta c} \tag{D.11} \]