Journal of Economic Dynamics and Control 10 (1986) 191-199. North-Holland

TIME INCONSISTENCY AND OPTIMAL POLICIES IN DETERMINISTIC AND STOCHASTIC WORLDS

David CURRIE

Queen Mary College, University of London, London E1 4NS, England

Paul LEVINE

London Business School and Polytechnic of the South Bank, London NW1 4SA, England

1. Introduction

The problem of time inconsistency of optimal policies in models with forward-looking rational expectations has attracted considerable attention in the macroeconomics literature. [See, for example, Kydland and Prescott (1977), Calvo (1978), Driffill (1982), Buiter (1983), Cohen and Michel (1984), Miller and Salmon (1984).] This problem arises because, if optimal policy is formulated by means of standard dynamic programming procedures, the passage of time may lead to an incentive to renege on the initially formulated policy and to adopt a new one. Since the private sector can foresee this, it may well anticipate the switch in policy so that the initial policy lacks credibility.

Appreciation of this problem has motivated the examination of policies that are optimal within the constraint of being time-consistent [Buiter (1983), Levine and Currie (1983), Miller and Salmon (1984), Cohen and Michel (1984)]. In this paper we examine and compare these alternative procedures for deriving time-consistent optimal policies.

This analysis provides the basis for a re-examination of the question of time inconsistency and the sustainability of an optimal (but time-inconsistent) policy. As demonstrated formally elsewhere [Currie and Levine (1985)], the optimal policy is sustainable in a stochastic setting provided that the rate of discount is sufficiently low. The result arises because the stochastic problem has the characteristic of a repeated game, in which the performance of policy in the face of future disturbances is important provided that the future is not too heavily discounted. This result illustrates the important differences between the stochastic and deterministic cases, showing that the class of time-consistent policies is larger in the stochastic setting as compared with the deterministic case.



2. Optimal policy with precommitment

2.1. The control problem

The economic systems considered in this paper have, as a general form, the following linear stochastic differential equation:

\[
\begin{bmatrix} dz \\ dx^e \end{bmatrix} = A\begin{bmatrix} z \\ x \end{bmatrix}dt + Bw\,dt + dv, \qquad (1)
\]

where $z$ is an $(n - m) \times 1$ vector of predetermined variables, $x$ is an $m \times 1$ vector of non-predetermined variables which can freely jump in response to 'news' and $w$ is an $r \times 1$ vector of control instruments. The stochastic environment is modelled by $v$, an $n \times 1$ vector of white noise disturbances with $dv \sim N(0, V\,dt)$, and $dx^e$ denotes the rational expectation of $dx$ formed at time $t$ on the basis of the information set $I(t) = \{z(s), x(s): s < t\}$ and knowledge of the model (1). The matrices $A$, $B$ and $V$ all have time-invariant coefficients and, in addition, $V$ is symmetric and non-negative definite. All variables are measured as deviations from some long-run trend deterministic equilibrium. Exogenous variables that follow ARIMA processes may be incorporated into (1) by a suitable extension of the vector $z$. The initial conditions of the system at time 0 are given by $z(0)$.

The controller is concerned about outputs

\[
s = Cy + Dw, \qquad (2)
\]

where $s$ is an $n_0 \times 1$ vector, $y = \begin{bmatrix} z \\ x \end{bmatrix}$ and $C$ and $D$ are time-invariant matrices. The welfare loss to be minimised at time $t = 0$ is assumed to be of the form $E(W(z(0)))$ where

\[
W(z(0)) = \tfrac{1}{2}\int_0^\infty e^{-\rho t}\left[s^TQs + w^TRw\right]dt, \qquad (3)
\]

where $\rho > 0$ is the discount rate, $Q$ and $R$ have time-invariant coefficients and, in addition, $R$ is symmetric and positive definite and $Q$ is symmetric and non-negative definite.

The control problem is then to minimise (3) subject to (1) and (2). We proceed by decomposing this problem into separate deterministic and stochastic components. To do this let $y = \bar y$ and $w = \bar w$ denote the solution to the deterministic problem with

\[
d\bar y = A\bar y\,dt + B\bar w\,dt, \qquad (4)
\]


where $\bar z(0) = z(0)$ gives the initial conditions and we have used the fact that $x^e = x$ in this deterministic case. Now let $\tilde y = y - \bar y$ and $\tilde w = w - \bar w$. Then subtracting (4) from (1) we obtain

\[
\begin{bmatrix} d\tilde z \\ d\tilde x^e \end{bmatrix} = A\tilde y\,dt + B\tilde w\,dt + dv. \qquad (5)
\]

Substituting $y = \bar y + \tilde y$ and $w = \bar w + \tilde w$ into (3) and noting that $\tilde y$ is of order $dv$ and $E(dv) = 0$, we arrive at

\[
E(W(z(0))) = \bar W(z(0)) + E(\tilde W(\tilde z(0))), \qquad (6)
\]

where $\bar W(z(0))$ and $\tilde W(\tilde z(0))$ are given by (3) and (2) with $(y, w)$ replaced by $(\bar y, \bar w)$ and $(\tilde y, \tilde w)$, respectively. The two parts of the control problem are then the deterministic problem given by the minimisation of $\bar W(z(0))$ subject to (4) and the stochastic problem given by the minimisation of $E(\tilde W(\tilde z(0)))$ subject to (5).
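
To fix ideas, the objects in (1)-(3) can be written down for a small numerical example. The sketch below (in Python; every numerical value is purely hypothetical and chosen only for illustration, not taken from the paper) sets up a model with one predetermined and one jump variable, and forms the composite matrices $U = C^TQD$, $R^* = R + D^TQD$ and $Q^* = C^TQC$ used from section 2.2 onwards.

```python
# A purely illustrative parameterisation of (1)-(3); none of these numbers
# come from the paper.  One predetermined variable z, one jump variable x,
# one instrument w.
import numpy as np

n, m, r = 2, 1, 1
rho = 0.05                              # discount rate, rho > 0

# dy = A y dt + B w dt + dv,  with y = [z, x]'
A = np.array([[-0.5, 0.3],
              [ 0.2, 0.4]])
B = np.array([[0.5],
              [1.0]])
V = np.diag([0.01, 0.0])                # dv ~ N(0, V dt); only z is hit by noise here

# target variables s = C y + D w and the welfare weights in (3)
C = np.eye(2)
D = np.array([[0.0],
              [0.5]])
Q = np.eye(2)                           # symmetric, non-negative definite
R = np.array([[0.1]])                   # symmetric, positive definite

# composite matrices used in section 2.2
U, Rs, Qs = C.T @ Q @ D, R + D.T @ Q @ D, C.T @ Q @ C   # U = C'QD, R* = R + D'QD, Q* = C'QC

# check: the integrand of (3) equals y'Q*y + 2y'Uw + w'R*w
y, w = np.array([1.0, 0.2]), np.array([0.1])
s_vec = C @ y + D @ w
print(np.isclose(s_vec @ Q @ s_vec + w @ R @ w,
                 y @ Qs @ y + 2 * y @ U @ w + w @ Rs @ w))   # True
```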

2.2. The deterministic component

The deterministic problem has been the subject of a number of recent papers [see Calvo (1978), Driffill (1982), Buiter (1983), Miller and Salmon (1984), Levine and Currie (1984)]. The solution, employing Pontryagin's maximum principle, may be outlined as follows. Let us define the Hamiltonian

\[
H = \tfrac{1}{2}e^{-\rho t}\left[\bar s^TQ\bar s + \bar w^TR\bar w\right] + \lambda\left(A\bar y + B\bar w\right), \qquad (7)
\]

where $\bar s = C\bar y + D\bar w$ and $\lambda(t)$ is a $1 \times n$ row vector of costate variables. The first-order conditions for a minimum are $\partial H/\partial\lambda^T = \dot{\bar y}$, $\partial H/\partial\bar w = 0$ and $\dot\lambda = -\partial H/\partial\bar y$. Differentiating (7) and writing $p = e^{\rho t}\lambda^T$, these lead to the following dynamic system under optimal control:

\[
\begin{bmatrix} \dot{\bar y} \\ \dot p \end{bmatrix} =
\begin{bmatrix} A - BR^{*-1}U^T & -BR^{*-1}B^T \\ UR^{*-1}U^T - Q^* & \rho I - A^T + UR^{*-1}B^T \end{bmatrix}
\begin{bmatrix} \bar y \\ p \end{bmatrix}, \qquad (8)
\]

where

\[
U = C^TQD, \qquad R^* = R + D^TQD, \qquad Q^* = C^TQC.
\]

The $2n$ boundary conditions are given by $\bar z(0)$ and $p_2(0) = 0$ (partitioning $p = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix}$ conformably with $\begin{bmatrix} z \\ x \end{bmatrix}$), together with the transversality condition $\lim_{t \to \infty} e^{-\rho t}p(t) = 0$. The optimal policy itself is given by

\[
\bar w = -R^{*-1}\left[B^Tp + U^T\bar y\right]. \qquad (9)
\]


Assuming that the dynamic matrix in (8), denoted by $J$, has the saddlepoint property (exactly $n$ eigenvalues with positive real part), the solution to (8) takes the form

\[
\begin{bmatrix} p_1 \\ \bar x \end{bmatrix} = -N\begin{bmatrix} \bar z \\ p_2 \end{bmatrix}, \qquad (10)
\]

where $N = M_{22}^{-1}M_{21}$,

\[
M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \quad \text{partitioned conformably with } \begin{bmatrix} \bar z \\ p_2 \end{bmatrix} \text{ and } \begin{bmatrix} p_1 \\ \bar x \end{bmatrix},
\]

being the matrix of left eigenvectors of $J$ arranged so that the last $n$ rows are associated with the unstable eigenvalues. Then using (9) and (10) the optimal policy may be expressed in feedback form as

\[
\bar w = \bar G\begin{bmatrix} \bar z \\ p_2 \end{bmatrix}, \qquad (11)
\]

where

\[
\bar G = -R^{*-1}\left[\,U_1^T - B_1^TN_{11} - U_2^TN_{21}, \;\; B_2^T - B_1^TN_{12} - U_2^TN_{22}\,\right],
\]

with

\[
U = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix}, \qquad B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} \quad \text{partitioned conformably with } \begin{bmatrix} z \\ x \end{bmatrix}, \qquad
N = \begin{bmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{bmatrix} \quad \text{partitioned conformably with } \begin{bmatrix} \bar z \\ p_2 \end{bmatrix}.
\]

The welfare loss under optimal policy at time t = 0 can be found as follows. By Pontryagin's maximum principle we have

\[
\frac{\partial \bar W(\bar z(0))}{\partial \bar z(0)} = p_1(0).
\]

Hence, from (10) with $p_2(0) = 0$, integrating and using $\bar W(0) = 0$, we obtain

\[
\bar W(\bar z(0)) = -\tfrac{1}{2}\bar z^T(0)N_{11}\bar z(0) = -\tfrac{1}{2}\operatorname{tr}\left(N_{11}\bar Z(0)\right), \qquad (12)
\]

where $\bar Z(0) = \bar z(0)\bar z^T(0)$.
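
The construction in (8)-(12) is straightforward to carry out numerically. The following sketch (reusing the hypothetical matrices of the earlier example; none of the numbers come from the paper) forms the matrix $J$, reorders the variables as $(\bar z, p_2 \mid p_1, \bar x)$, selects the left eigenvectors associated with the unstable roots, and reads off $N$, the feedback matrix $\bar G$ of (11) and the welfare loss (12).

```python
# A sketch of the saddle-path construction (8)-(12).  The matrices are the
# hypothetical ones from the earlier sketch; nothing here is taken from the paper.
import numpy as np

rho = 0.05
A = np.array([[-0.5, 0.3], [0.2, 0.4]])
B = np.array([[0.5], [1.0]])
C, D = np.eye(2), np.array([[0.0], [0.5]])
Q, R = np.eye(2), np.array([[0.1]])
n, m = 2, 1                                   # one predetermined, one jump variable
U, Rs, Qs = C.T @ Q @ D, R + D.T @ Q @ D, C.T @ Q @ C
Rs_inv = np.linalg.inv(Rs)

# Hamiltonian system (8) in the ordering (z, x, p1, p2)
J = np.block([[A - B @ Rs_inv @ U.T,     -B @ Rs_inv @ B.T],
              [U @ Rs_inv @ U.T - Qs,     rho * np.eye(n) - A.T + U @ Rs_inv @ B.T]])

# reorder rows/columns to (z, p2 | p1, x): effectively-predetermined first, jumping last
idx_z, idx_x = list(range(n - m)), list(range(n - m, n))
idx_p1, idx_p2 = [n + i for i in idx_z], [n + i for i in idx_x]
perm = idx_z + idx_p2 + idx_p1 + idx_x
Jp = J[np.ix_(perm, perm)]

# left eigenvectors of Jp; under the saddlepoint property the n largest roots are unstable
evals, Vr = np.linalg.eig(Jp.T)               # rows of Vr.T are left eigenvectors of Jp
M = Vr.T[np.argsort(evals.real)]              # last n rows <-> unstable eigenvalues
M21, M22 = M[n:, :n], M[n:, n:]
N = np.real(np.linalg.solve(M22, M21))        # (10): [p1; x] = -N [z; p2]

# optimal precommitment rule (11) and initial welfare loss (12)
N11, N12 = N[:n - m, :n - m], N[:n - m, n - m:]
N21, N22 = N[n - m:, :n - m], N[n - m:, n - m:]
B1, B2, U1, U2 = B[:n - m], B[n - m:], U[:n - m], U[n - m:]
G_bar = -Rs_inv @ np.hstack([U1.T - B1.T @ N11 - U2.T @ N21,
                             B2.T - B1.T @ N12 - U2.T @ N22])   # w = G_bar [z; p2]
z0 = np.array([1.0])
W0 = -0.5 * z0 @ N11 @ z0                     # (12)
print(G_bar, W0)
```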


2.3. The combined deterministic and stochastic problem

By an appeal to certainty equivalence [shown to apply for RE models in Levine and Currie (1984)], the optimal feedback rule $\tilde w = \bar G\begin{bmatrix} \tilde z \\ \tilde p_2 \end{bmatrix}$ applies to the stochastic component too. Hence for the combined problem $w = \bar G\begin{bmatrix} z \\ p_2 \end{bmatrix}$ is the optimal policy in feedback form. The welfare loss for the combined deterministic and stochastic problem can be shown to be given by

\[
E(W(z(0))) = -\tfrac{1}{2}\operatorname{tr}\left[N_{11}\left(Z(0) + \rho^{-1}V_{11}\right)\right], \qquad (13)
\]

where $Z(0) = z(0)z^T(0)$ and $V_{11}$ is the block of $V$ associated with the predetermined variables.

2.4. The time inconsistency of the optimal policy

At time $t$ the welfare loss is a function of $z(t)$ and of $p_2(t)$, which is no longer zero. Write the welfare loss as $E(W)$ where

\[
W = W(z(t), p_2(t)) = \tfrac{1}{2}\int_t^\infty e^{-\rho(\tau - t)}\left[s^TQs + w^TRw\right]d\tau.
\]

By Pontryagin's principle the equation analogous to (13) then is

\[
E(W(z(t), p_2(t))) = -\tfrac{1}{2}\operatorname{tr}\left[N_{11}\left(Z(t) + \rho^{-1}V_{11}\right) + N_{22}\,p_2(t)p_2^T(t)\right], \qquad (14)
\]

where it can be shown that $N_{22} < 0$. The optimal policy is time-inconsistent because the controller can, at time $t$, reoptimise by putting $p_2(t) = 0$ and thereby reduce the welfare loss by an amount $-\tfrac{1}{2}\operatorname{tr}\left[N_{22}\,p_2(t)p_2^T(t)\right]$.
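
Formulas (13) and (14) are single trace expressions, so the loss under the optimal policy and the gain from reneging are cheap to evaluate once $N$ is known. The fragment below uses hypothetical placeholder values for $N_{11}$, $N_{22}$, $V_{11}$ and the date-$t$ state; in practice these would come from the construction of section 2.2 (for instance from the previous sketch).

```python
# Evaluating the expected loss (14) under the optimal policy and the gain from
# reneging at date t.  N11, N22, V11 and the date-t state are hypothetical
# placeholders; in practice they come from the construction of section 2.2.
import numpy as np

rho = 0.05
N11 = np.array([[-2.0]])                 # placeholder (negative definite here;
N22 = np.array([[-1.5]])                 #  the paper establishes N22 < 0)
V11 = np.array([[0.01]])                 # covariance block of the predetermined variables
z_t = np.array([0.8])                    # z(t)
p2_t = np.array([0.3])                   # accumulated costate on the jump variables, p2(t)

Z_t = np.outer(z_t, z_t)
loss_OP = -0.5 * np.trace(N11 @ (Z_t + V11 / rho) + N22 @ np.outer(p2_t, p2_t))   # (14)
renege_gain = -0.5 * np.trace(N22 @ np.outer(p2_t, p2_t))   # gain from setting p2(t) = 0
print(loss_OP, renege_gain)
```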

3. Time-consistent policies

3.1. The dynamic programming solution

The following summarises a solution procedure for obtaining an optimal policy which satisfies Bellman's principle and is therefore time-consistent. It generalises a solution of Cohen and Michel (1984) and provides a continuous-time analogue of the solution of Oudiz and Sachs (1985). It leads by a different route to the same time-consistent solution proposed by Miller and Salmon (1984) and Levine and Currie (1983).

A time-consistent rule cannot contain a term involving $p_2(t)$, which can be shown to consist of a discounted linear combination of past values of $z$. We therefore seek a solution $w = G_1z$ for the time-consistent feedback rule.


Substituting into (1) we then have

\[
\begin{bmatrix} dz \\ dx^e \end{bmatrix} = \left(A + B\begin{bmatrix} G_1 & 0 \end{bmatrix}\right)\begin{bmatrix} z \\ x \end{bmatrix}dt + dv, \qquad (15)
\]

which has a solution

\[
x = -N(G_1)z,
\]

where $N = M_{22}^{-1}M_{21}$,

\[
M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \quad \text{partitioned conformably with } \begin{bmatrix} z \\ x \end{bmatrix},
\]

being the matrix of left eigenvectors of the dynamic matrix in (15) arranged so that the last $m$ rows are associated with the unstable eigenvalues. (We assume that $G_1$ is such that $A + B\begin{bmatrix} G_1 & 0 \end{bmatrix}$ in (15) has the saddlepoint property.)

Substituting x = - N z we may write the expected welfare loss at time t as E(W), where

\[
W = W(z(t)) = \tfrac{1}{2}\int_t^\infty e^{-\rho(\tau - t)}\left(z^T\hat Qz + 2z^T\hat Uw + w^TR^*w\right)d\tau, \qquad (16)
\]

where

\[
\hat Q = Q^*_{11} - N^TQ^*_{21} - Q^*_{12}N + N^TQ^*_{22}N, \qquad \hat U = U_1 - N^TU_2,
\]

\[
U = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix}, \qquad Q^* = \begin{bmatrix} Q^*_{11} & Q^*_{12} \\ Q^*_{21} & Q^*_{22} \end{bmatrix} \quad \text{partitioned conformably with } \begin{bmatrix} z \\ x \end{bmatrix}.
\]

A time-consistent solution must satisfy Bellman's 'Principle of Optimality', according to which 'an optimal policy has the property that whatever the initial state and decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision'. It follows from this that the optimal value of $W$ at time $t$, $W^*(z(t))$ say, must satisfy the following 'fundamental recurrence relation':

\[
W^*(z(t)) = \min_{w(t)}\left[\,l(z(t), w(t))\,\Delta t + e^{-\rho\Delta t}\,W^*(z(t + \Delta t))\,\right], \qquad (17)
\]


where

\[
l = \tfrac{1}{2}\left[z^T\hat Qz + 2z^T\hat Uw + w^TR^*w\right].
\]

From (17) the following iterative procedure is obtained for $G_1$:

\[
G_1^{(i)} = -R^{*-1}\left[B_1^TS^{(i)} + \hat U^T\right], \qquad (18)
\]

where $S^{(i)}$ satisfies the Riccati equation

\[
S^{(i)}\left(\hat A - \tfrac{1}{2}\rho I\right) + \left(\hat A - \tfrac{1}{2}\rho I\right)^TS^{(i)} + \hat Q - \left(S^{(i)}B_1 + \hat U\right)R^{*-1}\left(B_1^TS^{(i)} + \hat U^T\right) = 0, \qquad (19)
\]

in which $\hat A = A_{11} - A_{12}N$ is the coefficient matrix on $z$ obtained by substituting $x = -Nz$ into (1). [See, for example, Levine and Currie (1984).] Letting $T \to \infty$ we thus have an infinite sequence $G_1^{(0)}, G_1^{(1)}, \ldots$ which hopefully converges to a stationary value $G_1$.

The rule $w = G_1z$ then satisfies Bellman's Principle of Optimality and is time-consistent. Corresponding to (15) the welfare loss at time $t$ is given by

\[
E(W(z(t))) = -\tfrac{1}{2}\operatorname{tr}\left(S\left(Z(t) + \rho^{-1}V_{11}\right)\right), \qquad (20)
\]

where $S$ is the stationary value of the sequence $S^{(0)}, S^{(1)}, \ldots$
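
The iteration behind (18)-(19) can be sketched as a fixed point between the private sector's saddle-path response $N(G_1)$ and the controller's Riccati best response to that $N$. The code below is only an illustration: the matrices are the hypothetical ones used earlier, the Riccati step uses SciPy's standard continuous-time solver (so signs follow the usual LQR convention rather than the paper's), and, as the text notes, convergence of the iteration is hoped for rather than guaranteed.

```python
# A sketch of the fixed-point iteration behind (18)-(19): alternate between the
# private sector's saddle-path response x = -N(G1) z and the controller's Riccati
# best response to that N.  Hypothetical matrices; signs follow the standard
# LQR/CARE convention.
import numpy as np
from scipy.linalg import solve_continuous_are

rho = 0.05
A = np.array([[-0.5, 0.3], [0.2, 0.4]])
B = np.array([[0.5], [1.0]])
C, D = np.eye(2), np.array([[0.0], [0.5]])
Q, R = np.eye(2), np.array([[0.1]])
n, m, r = 2, 1, 1
U, Rs, Qs = C.T @ Q @ D, R + D.T @ Q @ D, C.T @ Q @ C
A11, A12, B1 = A[:n - m, :n - m], A[:n - m, n - m:], B[:n - m]
U1, U2 = U[:n - m], U[n - m:]
Qs11, Qs12 = Qs[:n - m, :n - m], Qs[:n - m, n - m:]
Qs21, Qs22 = Qs[n - m:, :n - m], Qs[n - m:, n - m:]

def private_sector_N(G1):
    """Saddle-path response x = -N z to an announced rule w = G1 z, as in (15)."""
    A_cl = A + B @ np.hstack([G1, np.zeros((r, m))])
    evals, Vr = np.linalg.eig(A_cl.T)                 # rows of Vr.T are left eigenvectors
    Mrows = Vr.T[np.argsort(evals.real)][n - m:, :]   # rows of the m unstable roots
    return np.real(np.linalg.solve(Mrows[:, n - m:], Mrows[:, :n - m]))

G1 = np.zeros((r, n - m))                      # initial guess for the rule w = G1 z
for _ in range(200):
    N = private_sector_N(G1)
    Qhat = Qs11 - N.T @ Qs21 - Qs12 @ N + N.T @ Qs22 @ N
    Uhat = U1 - N.T @ U2
    Ahat = A11 - A12 @ N - 0.5 * rho * np.eye(n - m)   # discounting folded into the drift
    X = solve_continuous_are(Ahat, B1, Qhat, Rs, s=Uhat)
    G1_new = -np.linalg.solve(Rs, B1.T @ X + Uhat.T)   # controller's best response, cf. (18)
    if np.max(np.abs(G1_new - G1)) < 1e-10:
        G1 = G1_new
        break
    G1 = G1_new

print("time-consistent rule G1:", G1)
```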

3.2. Buiter's solution

The previous time-consistent equilibrium may be obtained if one assumes that the controller chooses $G_1 = f(N)$ for a given private-sector 'reaction function' $x = -Nz$, whilst the private sector chooses $N = g(G_1)$ for a given policy rule $w = G_1z$ [Miller and Salmon (1984)]. Then $G_1$ is obtained as the fixed point of $f \circ g$. Because both government and private sector have closed-loop perceptions of each other's actions, the solution can be considered to be a closed-loop Nash equilibrium.

Suppose instead the controller chooses $w = G_1z$ taking the level of the free variables $x$ as given. The resulting equilibrium can then be regarded as open-loop Nash. The controller's minimisation problem is then to minimise the welfare loss subject to

\[
dz = \left(A_{11}z + A_{12}x\right)dt + B_1w\,dt + dv_1. \qquad (21)
\]

Introducing costate variables q, the equilibrium for this system under control


can then be shown to be

\[
\begin{bmatrix} dz \\ dq \\ dx^e \end{bmatrix} =
\begin{bmatrix} E_{11} & F_{11} & E_{12} \\ G_{11} & H_{11} & G_{12} \\ E_{21} & F_{21} & E_{22} \end{bmatrix}
\begin{bmatrix} z \\ q \\ x \end{bmatrix}dt +
\begin{bmatrix} dv_1 \\ 0 \\ 0 \end{bmatrix}, \qquad (22)
\]

where

\[
E = A - BR^{*-1}U^T, \qquad F = -BR^{*-1}B^T,
\]

\[
G = UR^{*-1}U^T - Q^*, \qquad H = \rho I - A^T + UR^{*-1}B^T.
\]

Eq. (22), it should be noted, can be obtained from (8) by putting $p_2 = 0$ and $p_1 = q$. Solving (22), as for (8), gives the feedback rule $w = G_1z$.
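
Since (22) is obtained from (8) by deleting the rows and columns associated with $p_2$ and relabelling $p_1$ as $q$, it can be assembled directly from the blocks of $E$, $F$, $G$ and $H$ and solved by the same left-eigenvector method. A sketch follows, again with the hypothetical matrices used in the earlier sketches.

```python
# Assembling the open-loop Nash system (22) from the blocks of (8) and solving it
# by the same left-eigenvector method.  Hypothetical matrices, as in earlier sketches.
import numpy as np

rho = 0.05
A = np.array([[-0.5, 0.3], [0.2, 0.4]])
B = np.array([[0.5], [1.0]])
C, D = np.eye(2), np.array([[0.0], [0.5]])
Q, R = np.eye(2), np.array([[0.1]])
n, m = 2, 1
U, Rs, Qs = C.T @ Q @ D, R + D.T @ Q @ D, C.T @ Q @ C
Rs_inv = np.linalg.inv(Rs)

E = A - B @ Rs_inv @ U.T
F = -B @ Rs_inv @ B.T
G = U @ Rs_inv @ U.T - Qs
H = rho * np.eye(n) - A.T + U @ Rs_inv @ B.T
zi, xi = slice(0, n - m), slice(n - m, n)        # index blocks for z and x (and p1, p2)

# (22): dynamics of (z, q, x), i.e. (8) with p2 = 0 and p1 relabelled q
K = np.block([[E[zi, zi], F[zi, zi], E[zi, xi]],
              [G[zi, zi], H[zi, zi], G[zi, xi]],
              [E[xi, zi], F[xi, zi], E[xi, xi]]])

# z is predetermined, (q, x) are free; the saddle path ties them to z
evals, Vr = np.linalg.eig(K.T)
Mu = Vr.T[np.argsort(evals.real)][n - m:, :]     # left eigenvectors of the n unstable roots
qx_on_z = -np.real(np.linalg.solve(Mu[:, n - m:], Mu[:, :n - m]))   # [q; x] = qx_on_z z
q_on_z, x_on_z = qx_on_z[:n - m, :], qx_on_z[n - m:, :]

# open-loop Nash rule w = G1 z, from w = -R*^{-1}[U1' z + U2' x + B1' q]
G1_oln = -Rs_inv @ (U[zi].T + U[xi].T @ x_on_z + B[zi].T @ q_on_z)
print("open-loop Nash rule G1:", G1_oln)
```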

4. The sustainability of the optimal policy

Consider a single government with an infinite life. A government that wishes to pursue the optimal time-inconsistent policy will need to establish a reputation for being a regime that does not renege on earlier commitments. It follows that only a switch to a time-consistent policy is credible, and the incentive to renege at any time may be measured by the welfare loss difference $E(W^{OP}(z(t), p_2(t))) - E(W^{TC}(z(t)))$, where we denote the optimal policy by OP and the time-consistent policy by TC. The welfare losses for these two cases are given by (14) and (20) for the OP and TC policies, respectively (adopting the closed-loop Nash solution as the appropriate TC policy).

The following propositions can then be proved [see Currie and Levine (1985)]:

Proposition 1. In a stochastic context, if $dv$ has a bounded distribution, then if the rate of discount $\rho$ is sufficiently low, there is no incentive to switch from the OP to the TC policy.

Proposition 2. In a stochastic context, if $dv \sim N(0, V\,dt)$, then an incentive to switch from the OP to the TC policy exists in a unit time interval, but with probability $\alpha$, where $\alpha$ can be made as close to zero as we like by choosing $\rho$ to be sufficiently low.

References

Buiter, Willem H., 1983, Optimal and time consistent policies in continuous time rational expectations models, NBER technical working paper no. 29.

Calvo, Guillermo A., 1978, On the time consistency of optimal policy in a monetary economy, Econometrica 46, 1411-1428.


Cohen, Daniel and Philippe Michel, 1984, Toward a theory of optimal precommitment: An analysis of the time-consistent equilibria, Paper presented to the CEPR Manchester Conference on the European Monetary System: Policy Coordination and Exchange Rate Systems, Sept.

Currie, David A. and Paul Levine, 1985, Credibility and precommitment in a stochastic world, PRISM paper no. 36 (Queen Mary College, London).

Driffill, E. John, 1982, Optimal money and exchange rate policy, Greek Economic Review 4.

Kydland, Finn E. and Edward C. Prescott, 1977, Rules rather than discretion: The inconsistency of optimal plans, Journal of Political Economy 85, 473-491.

Levine, Paul and David A. Currie, 1984, The design of feedback rules in linear stochastic rational expectations models, PRISM paper no. 20 (Queen Mary College, London).

Levine, Paul and David A. Currie, 1985, Optimal feedback rules in an open economy macromodel with rational expectations, European Economic Review 27, 141-163.

Miller, Marcus H. and Mark H. Salmon, 1984, Dynamic games and the time inconsistency of optimal policy in open economies, Economic Journal 85, 124-137.

Oudiz, Gilles and Jeffrey Sachs, 1985, International policy coordination in dynamic macroeconomic models, in: Willem H. Buiter and Richard C. Marston, eds., The international coordination of economic policy (Cambridge University Press, Cambridge).