applied macroeconometrics - wordpress.com · slides for lecture 5 ensae 2018/2019 1/74. introducion...

Herve Le Bihan

Banque de France

Direction des Etudes Monetaires et Financieres

[email protected]

Applied Macroeconometrics

Estimation of rational expectations and DSGE models

Slides for Lecture 5

ENSAE 2018/2019

1 / 74

Introduction

Where do we stand in the course?

Lectures 1, 2 and 3 developed the specification and estimation of traditional large-scale macro models and structural VAR models

Lecture 4 has introduced DSGE models, and how to solve them

Lectures 5 and 6: how to estimate Rational Expectation (RE) and DSGE models

Remark: DSGE models are a subset of RE models. RE models cover a broader set: models that are not fully micro-founded, partial equilibrium models

2 / 74

Zoom on the plan of part III of the course, i.e. “Rational Expectation and DSGE models”

Lecture 4: Specification and resolution of DSGE models (done by Aurelien Poissonnier)

Lecture 5 (HLB): Estimation of DSGE models
5.1. GMM approach
5.2. Simulation-based estimators

Lecture 6 (HLB): Estimation of DSGE models: likelihood-based approaches
6.1. Kalman filter and Maximum likelihood
6.2. The Bayesian approach

Lecture 7 (BC): Simulation and use of DSGE models

3 / 74

The issue

Example of expectations in macro models

• Consumption-saving decision:

$U'(c_t) = \beta (1 + r_t) E_t U'(c_{t+1})$

($c_t$ consumption, $r_t$ real interest rate) with e.g. $U(c_t) = c_t^{1-\sigma}/(1-\sigma)$

• “New Keynesian” Phillips curve:

$\pi_t = \omega_f E_t \pi_{t+1} + \omega_b \pi_{t-1} + \mu z_t + \varepsilon_t$ ($\pi_t$ inflation, $z_t$ marginal cost)

• The objective is to estimate the structural parameters $\beta, \sigma$, or $\omega_b, \omega_f, \mu$.
But expectations of future variables, such as $E_t \pi_{t+1}$, are not observed

4 / 74

Overview of various estimation strategies

• The general set-up: $E_t F(\theta, Y_t, Y_{t+1}, X_t, \varepsilon_t) = 0$
or (less general) $Y_t = F(\theta, E_t Y_{t+1}, X_t, \varepsilon_t)$
$\theta$: vector of parameters we seek to estimate.
$E_t Y_{t+1}$: vector of expectation variables, unobserved.

Two main approaches:

• Using instrumental variables to proxy EtYt+1 : GMM (1)

• Solve for $E_t Y_{t+1}$ using the model structure. Several approaches:
- minimize the distance between model-generated moments and data moments (2)
- maximize a likelihood (3)
- use a Bayesian approach (4)

5 / 74

Remark: Course mainly focuses on linear models

• ... or models linearized around the stationary equilibrium

• The GMM approach can be easily applied to non-linear models

• Simulation-based approaches also work, provided one can simulate the non-linear rational expectation model

• However, solving non-linear rational expectation models and estimating them with “full information” methods is more complex

• Alternatives to linearization:
- value function iteration,
- second-order (or higher-order) approximations.
These approaches will not be detailed here. See Canova (2007), De Jong and Dave (2007), Juillard and Ocaktan (2008).

6 / 74

Main reference textbooks

De Jong D., Dave C. (2007) Structural Macroeconomics, Princeton University Press
Canova F. (2007) Methods for Applied Macroeconomic Research, Princeton University Press

Other references

Favero C. (2001) Applied Macroeconometrics, Oxford University Press
Adda J., Cooper R. (2003) Dynamic Economics, MIT Press

In French

Feve P. (2005) La modelisation macroeconomique dynamique, Note d'Etude et de Recherche # 129, Banque de France
Economie et Prevision (2008) La modelisation macroeconomique DSGE (Modeles Dynamiques Stochastiques d'Equilibre General), Numero Special n. 183-184
Economie et Statistique (2012) La modelisation macroeconomique: Continuites, tensions, Numero Special n. 451-453

7 / 74

1. GMM: Generalized Method of Moments
An instrumental variables approach

A simple benchmark model

$y_t = \theta'(E_t x_{t+1}) + \varepsilon_t$

$\theta$: parameter vector of size $(p, 1)$; $\varepsilon_t$: a stochastic shock.
$x_{t+1}$: a vector of size $p$ of forcing variables, $x_{t+1} = (x_{1,t+1}, x_{2,t+1}, ..., x_{p,t+1})'$.

Some remarks:
- Contemporaneous variables (e.g. $x_{i,t+1} = w_t$) and lags or leads of the dependent variable (e.g. $x_{i,t+1} = y_{t-1}$ or $x_{i,t+1} = y_{t+1}$) are allowed
- GMM is valid for non-linear models that can be written as

$y_t = E_t f(\theta, x_{t+1}, \varepsilon_t)$

Note that this does not extend to the other approaches studied in this course (ML and the Bayesian approach as studied here rely on linear models)
- Here we use the structural equations, not the reduced form

8 / 74

→ Starting point: substitute the expectations with the future realizations $x_{t+1}$ (which are observed by the econometrician)

$y_t = \theta' x_{t+1} + \varepsilon_t + \theta'(E_t x_{t+1} - x_{t+1})$

We define the “residual” $u_t = \varepsilon_t + \theta'(E_t x_{t+1} - x_{t+1})$.
$u_t$ is a combination of the structural shock of the model and of an expectation error.

9 / 74

Ordinary Least Squares Bias

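• In the regression in which $E_t x_{t+1}$ (an unobserved variable) is replaced with $x_{t+1}$

$y_t = \theta' x_{t+1} + u_t$

the OLS estimator is biased! This is because $E(x_{t+1} u_t) \neq 0$, as can be seen from

$y_t = \theta' x_{t+1} + \varepsilon_t + \theta'(E_t x_{t+1} - x_{t+1})$

• A simple analytical example (scalar case): $x_{t+1} = m + v_{t+1}$

Then $E_t x_{t+1} = m$ and $u_t = \varepsilon_t - \theta v_{t+1}$

$E(x_{t+1} u_t) = -\theta E(v_{t+1}^2) = -\theta \sigma_v^2$

Evaluating the bias: $\mathrm{plim}\, \hat\theta_{OLS} = \theta \left( \frac{m^2}{m^2 + \sigma_v^2} \right) < \theta$ (for $\theta > 0$)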
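The attenuation result in the simple analytical example can be checked numerically. The sketch below is only an illustration: the parameter values, the Gaussian shocks and the function name `simulate_ols_bias` are assumptions, not part of the lecture. It regresses $y_t$ on the realized $x_{t+1}$ without intercept, as in the formula above.

```python
import random

def simulate_ols_bias(theta=0.8, m=1.0, sigma_v=1.0, sigma_eps=0.5,
                      T=200_000, seed=0):
    """Simulate y_t = theta * E_t x_{t+1} + eps_t with x_{t+1} = m + v_{t+1}.

    Returns the OLS estimate obtained when the realized x_{t+1} replaces
    E_t x_{t+1}, together with the theoretical probability limit.
    All numerical values are illustrative assumptions.
    """
    rng = random.Random(seed)
    sum_xy = 0.0
    sum_xx = 0.0
    for _ in range(T):
        v = rng.gauss(0.0, sigma_v)        # forecast error v_{t+1}
        eps = rng.gauss(0.0, sigma_eps)    # structural shock eps_t
        x_next = m + v                     # realized x_{t+1}
        y = theta * m + eps                # E_t x_{t+1} = m
        sum_xy += x_next * y
        sum_xx += x_next * x_next
    theta_ols = sum_xy / sum_xx            # OLS without intercept
    plim = theta * m**2 / (m**2 + sigma_v**2)
    return theta_ols, plim

theta_ols, plim = simulate_ols_bias()
print(f"OLS estimate: {theta_ols:.3f}, theoretical plim: {plim:.3f}, true theta: 0.8")
```

With these assumed values the probability limit is half of the true parameter, so the bias is far from negligible even in a very large sample.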

10 / 74

The instrumental variables approach
Motivation #1: an analogy with measurement errors (McCallum, 1976)

• $x_{t+1}$ is akin to the expectation contaminated with measurement error: $x_{t+1} = E_t x_{t+1} + v_{t+1}$

→ Thus we look for instrumental variables $w_t$ that are:
- correlated with $E_t x_{t+1}$,
but:
- orthogonal to the shock $\varepsilon_t$
- orthogonal to the forecast error $v_{t+1} = x_{t+1} - E_t x_{t+1}$

11 / 74

Motivation #2: moment condition

• Orthogonality condition derived from the model, under rational expectations

$y_t = \theta'(E_t x_{t+1}) + \varepsilon_t$

⇒ $E_t(y_t - \theta' x_{t+1} - \varepsilon_t) = 0$
⇒ $E(y_t - \theta' x_{t+1} - \varepsilon_t \mid I_t) = 0$, where $I_t$ is the agents' information set
⇒ $E[(y_t - \theta' x_{t+1}) w_t] = 0$ for $w_t \subset I_t$ such that $E(\varepsilon_t w_t) = 0$.

12 / 74

Standard IV estimators: a refresher

Matrix expression of relations between observable variables

$Y = X\theta + u$

$X$ is a matrix of size $(T, p)$ stacking observations; row $t$ of $X$ is $x'_{t+1}$

$u$: vector of “residuals” $u_t$

13 / 74

Instrumental Variables: the just-identified case
(vector of instruments $w_t$ of size $k$, $k = p$)

Moment condition: $E[w_t (y_t - x'_{t+1}\theta)] = 0$
This implies $\theta = (E w_t x'_{t+1})^{-1} (E w_t y_t)$
(note: $w_t x'_{t+1}$ is a square matrix)

“Indirect least squares” / Instrumental Variables estimator

$\hat\theta_{IV} = \left( \frac{1}{T} \sum w_t x'_{t+1} \right)^{-1} \left( \frac{1}{T} \sum w_t y_t \right)$

Matrix formulation

$\hat\theta_{IV} = [W'X]^{-1} [W'Y]$

14 / 74

The over-identified case ($w_t$ of size $k$, $k > p$)
The usual “TSLS” estimator

$\hat\theta_{TSLS} = [X'W(W'W)^{-1}W'X]^{-1} [X'W(W'W)^{-1}W'Y]$

• A “two-step” interpretation

$\hat\theta_{TSLS} = [\hat X'\hat X]^{-1} [\hat X'Y]$

where $\hat X = P_W X$, with $P_W = W(W'W)^{-1}W'$ the projection on the instruments, is interpreted here as a proxy for the unobserved expectation $E_t x_{t+1}$
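The IV and TSLS formulas translate almost line by line into NumPy. The sketch below is a minimal illustration under assumed conditions: the data-generating process (an AR(1) forcing variable with $\rho = 0.8$, true $\theta = 0.7$), the instrument set $(x_t, x_{t-1})$ and the function names are all choices made for the example, not taken from the lecture.

```python
import numpy as np

def iv_just_identified(Y, X, W):
    """theta_IV = [W'X]^{-1} [W'Y], valid when k = p."""
    return np.linalg.solve(W.T @ X, W.T @ Y)

def tsls(Y, X, W):
    """theta_TSLS = [X'P_W X]^{-1} [X'P_W Y], with P_W = W (W'W)^{-1} W'."""
    X_hat = W @ np.linalg.solve(W.T @ W, W.T @ X)   # first stage: P_W X
    # P_W is symmetric and idempotent, so X_hat'X = X'P_W X and X_hat'Y = X'P_W Y
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ Y)

# Illustrative data (assumed DGP): x_{t+1} = rho * x_t + v_{t+1},
# so E_t x_{t+1} = rho * x_t, and y_t = theta * E_t x_{t+1} + eps_t.
rng = np.random.default_rng(0)
T, theta, rho = 5000, 0.7, 0.8
x = np.zeros(T + 2)
for t in range(1, T + 2):
    x[t] = rho * x[t - 1] + rng.normal()
x_t, x_next, x_lag = x[1:T + 1], x[2:T + 2], x[0:T]
y = theta * rho * x_t + 0.5 * rng.normal(size=T)

X = x_next.reshape(-1, 1)            # realized x_{t+1} replaces the expectation
W = np.column_stack([x_t, x_lag])    # instruments dated t and t-1 (k = 2 > p = 1)

theta_ols = np.linalg.lstsq(X, y, rcond=None)[0][0]
theta_tsls = tsls(y, X, W)[0]
print(f"OLS: {theta_ols:.3f}  TSLS: {theta_tsls:.3f}  true: {theta}")
```

In this design OLS is pulled towards $\theta\rho^2$, while TSLS recovers a value close to $\theta$. With a single instrument (`W[:, :1]`) `tsls` and `iv_just_identified` coincide.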

15 / 74

Properties:

• Consistency: $\mathrm{plim}\, \hat\theta_{TSLS} = \theta$

• Asymptotic normality:

$\sqrt{T}(\hat\theta_{TSLS} - \theta) \to N(0, V)$

and $V = \sigma_u^2 [X'W(W'W)^{-1}W'X]^{-1}$ only if errors are independent

• But in general, independence of forecast errors does not hold in rational expectation models!

16 / 74

Generalized Method of Moments (GMM)

• Estimation relying on moment conditions:

$E h_t = 0$

$h_t$ is a function of parameters and observations

• In the case above: $E h_t = E[w_t (y_t - \theta' x_{t+1})] = 0$

• Why GMM is a generalization of the IV-TSLS estimator:
- Several equations or moment conditions
- Few restrictions on the function $h_t$ (e.g. it can be non-linear in parameters)
- But most crucially: autocorrelation/heteroscedasticity of errors is allowed

17 / 74

GMM Estimator: general principle

• Example:
$p = 1$ parameter
$k = 2$ moment conditions (2 instruments $w_{1t}$ and $w_{2t}$)

• Two theoretical moment conditions:

$E[(y_t - \theta x_{t+1}) w_{1t}] = 0$
$E[(y_t - \theta x_{t+1}) w_{2t}] = 0$

18 / 74

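• Two corresponding empirical moment conditions:

$g_1(\theta) = \frac{1}{T} \sum (y_t - \theta x_{t+1}) w_{1t}$
$g_2(\theta) = \frac{1}{T} \sum (y_t - \theta x_{t+1}) w_{2t}$

• Two estimators built from the simple method of moments:

$\hat\theta_i = \left( \sum w_{it} x_{t+1} \right)^{-1} \left( \sum w_{it} y_t \right)$

for $i = 1, 2$. Each is obtained by setting one of the empirical moment conditions to zero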

19 / 74

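GMM Estimator: the main principle

An estimator obtained by minimizing

$Q(\theta) = \begin{bmatrix} g_1(\theta) \\ g_2(\theta) \end{bmatrix}' M \begin{bmatrix} g_1(\theta) \\ g_2(\theta) \end{bmatrix}$

where $M$ is a weighting matrix, assigning a weight to each moment condition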
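For the scalar example with two instruments, the minimizer of $Q(\theta)$ has a closed form: writing $g(\theta) = b - \theta a$ with $a_i = \frac{1}{T}\sum w_{it} x_{t+1}$ and $b_i = \frac{1}{T}\sum w_{it} y_t$, one gets $\hat\theta = a'Mb / a'Ma$ for a symmetric $M$. The sketch below illustrates how $M$ arbitrates between the two method-of-moments estimators; the tiny data set and the name `gmm_scalar` are made up for the illustration.

```python
import numpy as np

def gmm_scalar(y, x_next, W, M):
    """Minimize Q(theta) = g(theta)' M g(theta) for a scalar theta.

    g(theta) = b - theta * a, so the minimizer is a'Mb / a'Ma
    (M is assumed symmetric).
    """
    T = len(y)
    a = W.T @ x_next / T          # (1/T) sum w_t x_{t+1}, one entry per instrument
    b = W.T @ y / T               # (1/T) sum w_t y_t
    theta = (a @ M @ b) / (a @ M @ a)
    g = b - theta * a             # empirical moments at the minimizer
    return theta, g @ M @ g

# Made-up data, for illustration only
y = np.array([1.0, 2.0, 0.5, 1.5])
x_next = np.array([2.0, 3.0, 1.0, 2.5])
W = np.column_stack([np.ones(4), [0.5, 1.5, -1.0, 2.0]])

theta_1 = (W[:, 0] @ y) / (W[:, 0] @ x_next)   # uses only the first moment
theta_2 = (W[:, 1] @ y) / (W[:, 1] @ x_next)   # uses only the second moment

theta_m1, _ = gmm_scalar(y, x_next, W, np.diag([1.0, 0.0]))
theta_m2, _ = gmm_scalar(y, x_next, W, np.diag([0.0, 1.0]))
theta_id, q_id = gmm_scalar(y, x_next, W, np.eye(2))
print(theta_1, theta_2, theta_id, q_id)
```

Putting all the weight on one moment reproduces the corresponding method-of-moments estimator, while the identity matrix gives a weighted average of the two, with a strictly positive $Q$ because both moments cannot be set to zero at once.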

20 / 74

GMM Estimator: formal definition

• Population (theoretical) moment condition: $E h(\theta, \omega_t) = 0$
where $\omega_t = (y_t, x_t, w_t)$ and $h(\cdot)$ is a function $R^p \times R^l \to R^k$, possibly non-linear

• Empirical moment condition: $g(\theta) = \frac{1}{T} \sum_{t=1}^{T} h(\theta, \omega_t)$

• The GMM estimator $\hat\theta_{GMM}$ is the solution of the program:

$\hat\theta_{GMM} = \arg\min_\theta\, g'(\theta) M g(\theta)$

where $M$ is a positive semi-definite weighting matrix of size $(k, k)$

21 / 74

Properties of GMM estimator (Hansen, 1982)

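• Consistency of $\hat\theta_{GMM}$: $\mathrm{plim}\, \hat\theta_{GMM} = \theta$

• Optimal weighting matrix:

$M^* = S^{-1}$ where $S = \lim_{T \to +\infty} T\, E(g g')$

and $S = \sum_{j=-\infty}^{+\infty} \Gamma_j$, with $\Gamma_j = E[h(\theta, \omega_t) h(\theta, \omega_{t-j})']$

$\Gamma_j$: autocovariance of order $j$.

• Asymptotic normality:

$\sqrt{T}(\hat\theta_{GMM} - \theta) \to N(0, V)$

with $V = [D S^{-1} D']^{-1}$ where $D' = \mathrm{plim} \left[ \frac{\partial g}{\partial \theta'} \Big|_{\theta = \theta_0} \right]$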

22 / 74

Properties: some particular cases of GMM

• The case of a linear model - analytical expression of the estimator:

$\hat\theta_{GMM} = [X'W \hat S^{-1} W'X]^{-1} [X'W \hat S^{-1} W'Y]$

where $\hat S$ is an estimator of the “long-run variance” of the moment conditions

• When, in addition, errors are independent and homoscedastic, $\hat\theta_{GMM} = \hat\theta_{TSLS}$ since $\hat S = \hat\sigma_u^2 \frac{1}{T}(W'W)$

• Case with autocorrelation/heteroscedasticity: $\hat\theta_{TSLS}$ is consistent but $\hat\theta_{GMM}$ is more efficient

23 / 74

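Specification test
Test for the validity of over-identifying restrictions (J-stat):
Under $H_0: E h(\theta_0, \omega_t) = 0$,

$[\sqrt{T} g(\hat\theta_{GMM})]' \hat S^{-1} [\sqrt{T} g(\hat\theta_{GMM})] \xrightarrow{L} \chi^2(k - p)$

Main test statistic: the “J-stat”:

$J = T\, g(\hat\theta_{GMM})' \hat S^{-1} g(\hat\theta_{GMM}) = T\, Q(\hat\theta_{GMM})$

Some intuition: if $k = p$, $Q(\hat\theta_{GMM}) = 0$ (just-identified case)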
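Once the empirical moments and an estimate of $S$ are available, the J-stat is a one-line quadratic form. The helper below is a hypothetical sketch (the name `j_statistic` and the numbers in the example are assumptions); it returns the statistic and its degrees of freedom $k - p$, leaving the comparison with a $\chi^2$ critical value to the user.

```python
import numpy as np

def j_statistic(g, S, T, p):
    """J = T * g' S^{-1} g, asymptotically chi2(k - p) under H0.

    g : empirical moments evaluated at the GMM estimate (length k)
    S : estimate of the long-run variance of the moments (k x k)
    T : sample size, p : number of estimated parameters
    """
    g = np.asarray(g, dtype=float)
    S = np.asarray(S, dtype=float)
    J = float(T * g @ np.linalg.solve(S, g))
    dof = g.size - p
    return J, dof

# Made-up numbers: k = 2 moments, p = 1 parameter, T = 100 observations
J, dof = j_statistic(g=[0.1, -0.2], S=np.eye(2), T=100, p=1)
print(f"J = {J:.2f} with {dof} degree(s) of freedom")
# The 5% critical value of a chi2(1) is about 3.84: here H0 would be rejected.
```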

24 / 74

Test on parameters

• Wald test
$H_0: R\theta = 0$, with $R$ a matrix of size $(r, p)$, i.e. $r$ restrictions

$W = (R\hat\theta)' \left[ \frac{R \hat V R'}{T} \right]^{-1} (R\hat\theta) \xrightarrow{L} \chi^2(r)$

with $V = [D S^{-1} D']^{-1}$

• LM tests, “quasi-LR” (see Hamilton, 1994)

• Tests for breaks at a known date (Andrews and Fair, 1988) or at an unknown date (Andrews, 1993: Sup-Wald)
Sup-Wald statistic: $\sup W = \sup_{\lambda \in [\epsilon, 1-\epsilon]} W(\lambda)$

where $W(\lambda)$ is the statistic testing for a break in parameters at date $T_1 = \lambda T$, with $T$ the sample size; $\lambda \in [\epsilon, 1 - \epsilon]$ covers all candidate dates for the break (note that $\epsilon > 0$ is required).
Distribution of $\sup W$: see Andrews (1993). It is not the standard chi-square distribution.

25 / 74

Implementing the GMM

Some important practical problems for implementing GMM

• Choosing the instruments (or the moment conditions)

• Computing the optimal weighting matrix

• Iteration over parameters/weighting matrix

26 / 74

Practical problem #1 Instrument choice

Criteria for selecting instruments :

• Number of instruments must be sufficient (identification)

• Validity (orthogonality between instruments and errors)

• Relevance: correlation with the expected variable.
→ Consequences of “weak” instrumental variables: bias and a non-Gaussian small-sample distribution of the estimator, even for relatively large sample sizes.

27 / 74

Practical problem #2
Computing the optimal weighting matrix

• One wishes to estimate $S = \Gamma_0 + \sum_{j=1}^{\infty} (\Gamma_j + \Gamma'_j)$

where $\Gamma_j = E[h(\theta, \omega_t) h(\theta, \omega_{t-j})']$ (note: $\Gamma'_j = \Gamma_{-j}$)

• $S$ is the “long-run variance” of the moment condition

• Problems:
- the sample size $T$ is finite in practice
- $\hat\Gamma_{T-1}$ is not a consistent estimator of $\Gamma_{T-1}$ (there is only one observation!)

28 / 74

Computing long run variance: motivation

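• A simple example: $x_t = m + \varepsilon_t$
where $\varepsilon_t$ is not i.i.d.: $E \varepsilon_t \varepsilon_{t-k} = \gamma_k$

• OLS estimator: $\hat m = \frac{1}{T} \sum x_t$

$\hat m$ is the moment estimator associated with $E h_t = E(x_t - m) = 0$; the corresponding empirical moment condition is $g = \frac{1}{T} \sum (x_t - m)$

• Variance of $\hat m$?
$E(\hat m - m)^2 = E\left( \frac{1}{T} \sum (m + \varepsilon_t) - m \right)^2 = E \frac{1}{T^2} (\sum \varepsilon_t)(\sum \varepsilon_t) = E gg$

$T \cdot E gg = \frac{1}{T} E(\varepsilon_1 + \varepsilon_2 + ... + \varepsilon_T)(\varepsilon_1 + \varepsilon_2 + ... + \varepsilon_T)$

29 / 74

• Computing the cross-products:
$T \cdot E(\hat m - m)^2 = \frac{1}{T} \left( T E\varepsilon_t^2 + 2(T-1) E\varepsilon_t \varepsilon_{t-1} + 2(T-2) E\varepsilon_t \varepsilon_{t-2} + ... + 2 E\varepsilon_T \varepsilon_1 \right)$

$= \gamma_0 + 2\frac{T-1}{T}\gamma_1 + 2\frac{T-2}{T}\gamma_2 + ... + 2\frac{1}{T}\gamma_{T-1}$

As $T \to \infty$, and since $\gamma_j \to 0$ as $j \to \infty$ (stationarity property), it can be shown that

$\lim T \cdot E gg = \lim T \cdot E(\hat m - m)^2 = \gamma_0 + 2 \sum_{j=1}^{\infty} \gamma_j$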


30 / 74

Estimators of the long term covariance matrix

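• Truncated estimator: $\hat S = \hat\Gamma_0 + \sum_{j=1}^{L} (\hat\Gamma_j + \hat\Gamma'_j)$

for $L$ fixed. $L$ = bandwidth (size of the window)

• But $\hat S$ is potentially not positive semi-definite (risk of ending up with a negative variance!)

• Newey-West procedure (1987):

$\hat S = \hat\Gamma_0 + \sum_{j=1}^{L} w(j) (\hat\Gamma_j + \hat\Gamma'_j)$ where $\hat\Gamma_j = \frac{1}{T} \sum_{t=j+1}^{T} \hat u_t \hat u_{t-j} (w_t w'_{t-j})$

where $w(j) = 1 - \frac{j}{L+1}$ is the “Bartlett kernel” (not to be confused with the instrument vector $w_t$).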

31 / 74

• Example with $L = 3$: $\hat S = \hat\Gamma_0 + \frac{3}{4}(\hat\Gamma_1 + \hat\Gamma'_1) + \frac{2}{4}(\hat\Gamma_2 + \hat\Gamma'_2) + \frac{1}{4}(\hat\Gamma_3 + \hat\Gamma'_3)$

• This procedure guarantees that $\hat S$ is positive semi-definite

• Question: how to choose $L$? In general $L$ has to be an increasing function of $T$, increasing at a rate lower than $T^{1/4}$...
Schwert criterion: $L = 4(T/100)^{2/9}$
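The Newey-West estimator with Bartlett weights can be sketched in a few lines of NumPy. In the sketch below the input `h` is assumed to be the $(T, k)$ array of moment contributions $h_t = \hat u_t w_t$ (not demeaned, since the moments have zero mean under the model). The function names, the MA(1) test data and the integer truncation of the bandwidth rule are assumptions made for the illustration.

```python
import numpy as np

def bartlett_weights(L):
    """Bartlett kernel weights w(j) = 1 - j / (L + 1) for j = 1..L."""
    return [1 - j / (L + 1) for j in range(1, L + 1)]

def rule_of_thumb_bandwidth(T):
    """Bandwidth L = 4 (T / 100)^(2/9) from the slide, truncated to an integer."""
    return int(4 * (T / 100) ** (2 / 9))

def newey_west(h, L):
    """Long-run variance S = Gamma_0 + sum_j w(j) (Gamma_j + Gamma_j')."""
    h = np.asarray(h, dtype=float)
    if h.ndim == 1:
        h = h[:, None]
    T = h.shape[0]
    S = h.T @ h / T                          # Gamma_0
    for j, w_j in enumerate(bartlett_weights(L), start=1):
        gamma_j = h[j:].T @ h[:-j] / T       # (1/T) sum_{t>j} h_t h_{t-j}'
        S += w_j * (gamma_j + gamma_j.T)
    return S

# Illustrative autocorrelated moments: a bivariate MA(1) process
rng = np.random.default_rng(0)
e = rng.normal(size=(301, 2))
h = e[1:] + 0.6 * e[:-1]
S_hat = newey_west(h, L=rule_of_thumb_bandwidth(len(h)))

print(bartlett_weights(3))   # the L = 3 example: weights 3/4, 2/4, 1/4
print(S_hat)
```

The decreasing weights are what keeps $\hat S$ positive semi-definite, which the unweighted truncated estimator does not guarantee.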

32 / 74

Long-run variance: further procedures

• Procedures of Andrews (1991) and Newey-West (1996)
- Data-dependent bandwidth: $L$ is a function of the observations
- “Quadratic spectral” and “Parzen” kernels: alternative choices of $w(j)$

• VAR-HAC (Den Haan and Levin, 1997):
estimate a VAR model on the moment conditions; analytical computation of the long-run variance

• West (1997): in rational expectation models, the moment condition follows an MA(H) process, where H is the forecast horizon.
→ $L$ is known.

33 / 74

Practical issue #3

Iteration between parameters and matrix $S$: two-step GMM, iterated GMM.
Practical problem: $\hat\theta_{GMM}$ depends on $\hat S$ and $\hat S$ depends on $\hat\theta_{GMM}$.
Solution 1: TS-GMM (two-step GMM)

- First step: TSLS, $\hat\theta^{(1)} = \hat\theta_{TSLS}$

- Then compute $\hat S = \hat S(\hat u(\hat\theta^{(1)}))$ through a HAC (Newey-West) procedure

- Second step: GMM using $\hat S$:
$\hat\theta^{(2)} = \hat\theta_{GMM} = [X'W \hat S^{-1} W'X]^{-1} [X'W \hat S^{-1} W'Y]$
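The three steps of two-step GMM can be sketched for the linear model as follows. This is an illustrative implementation under assumptions: the helper names, the Bartlett-weighted HAC estimator with bandwidth `L = 1`, and the simulated AR(1) design (true $\theta = 0.7$, instruments $x_t$ and $x_{t-1}$) are choices made for the example rather than the lecture's own code.

```python
import numpy as np

def linear_gmm(Y, X, W, S):
    """theta = [X'W S^{-1} W'X]^{-1} [X'W S^{-1} W'Y]."""
    A = X.T @ W @ np.linalg.solve(S, W.T @ X)
    b = X.T @ W @ np.linalg.solve(S, W.T @ Y)
    return np.linalg.solve(A, b)

def hac_variance(u, W, L):
    """Newey-West estimate of S from the moments h_t = u_t * w_t."""
    h = W * u[:, None]
    T = len(u)
    S = h.T @ h / T
    for j in range(1, L + 1):
        gamma_j = h[j:].T @ h[:-j] / T
        S += (1 - j / (L + 1)) * (gamma_j + gamma_j.T)
    return S

def two_step_gmm(Y, X, W, L=1):
    T = len(Y)
    theta_1 = linear_gmm(Y, X, W, W.T @ W / T)   # step 1: TSLS (S proportional to W'W)
    u_1 = Y - X @ theta_1                        # first-step residuals
    S_hat = hac_variance(u_1, W, L)              # HAC estimate of S
    theta_2 = linear_gmm(Y, X, W, S_hat)         # step 2: efficient GMM
    return theta_1, theta_2, S_hat

# Illustrative data (assumed DGP): y_t = theta * E_t x_{t+1} + eps_t, x AR(1)
rng = np.random.default_rng(0)
T, theta, rho = 5000, 0.7, 0.8
x = np.zeros(T + 2)
for t in range(1, T + 2):
    x[t] = rho * x[t - 1] + rng.normal()
y = theta * rho * x[1:T + 1] + 0.5 * rng.normal(size=T)
X = x[2:T + 2].reshape(-1, 1)                    # realized x_{t+1}
W = np.column_stack([x[1:T + 1], x[0:T]])        # instruments x_t, x_{t-1}

theta_1, theta_2, S_hat = two_step_gmm(y, X, W)
print(f"TSLS: {theta_1[0]:.3f}  two-step GMM: {theta_2[0]:.3f}")
```

Iterated GMM would simply repeat the last two lines of `two_step_gmm` (recompute residuals and $\hat S$, then re-estimate) until the estimate stops changing.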

34 / 74

Solution 2: iterate the previous procedure

$\hat\theta^{(i)} \to \hat S^{(i)}(\hat\theta^{(i)}) \to \hat\theta^{(i+1)}$

Stopping criterion: $\hat\theta^{(i+1)} - \hat\theta^{(i)}$ “small”

Solution 3: Continuous Updating GMM
Objective function:

$\min_{\theta}\; g(\theta)' (\hat S(\theta))^{-1} g(\theta)$

The three procedures above are asymptotically equivalent, but their finite-sample properties differ.

35 / 74

Applications of GMM - examples

• Consumption and asset pricing:
Hansen and Singleton (1982)
Survey of applications in finance: Jagannathan et al. (2002)

• Central banks' reaction functions:
Clarida, Gali, Gertler (1998, 2002)

• Inflation and the “New Keynesian Phillips curve”:
Gali and Gertler (1999), Eichenbaum and Fisher (2007)

36 / 74

Gali, Gertler (1999) “Inflation dynamics: A structural econometric analysis”, Journal of Monetary Economics, 44, pp. 195-222.
Estimation of a “new hybrid Phillips curve”

37 / 74

The new Phillips curve

$\pi_t = \beta E_t \pi_{t+1} + \lambda\, mc_t + \varepsilon_t$

Variables: $\pi_t$ inflation, $mc_t$ marginal cost
Parameters:
$\beta$: discount factor
$\lambda = (1-\theta)(1-\beta\theta)/\theta$
$\theta$: degree of price stickiness; $(1-\theta)$: probability of a price change
Theoretical foundations: optimal price-setting under monopolistic competition and nominal rigidities.
Aggregation over firms with unchanged prices and firms that have revised their prices.

38 / 74

The new hybrid Phillips curve

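$\pi_t = \gamma_f E_t \pi_{t+1} + \gamma_b \pi_{t-1} + \lambda\, mc_t + \varepsilon_t$

Theoretical foundations: as above.
New ingredient: a fraction $\omega$ of firms is “backward-looking”. They rely on a “rule of thumb”: $p^b_t = p^*_{t-1} + \pi_{t-1}$.
Parameters:
$\lambda = (1-\omega)(1-\theta)(1-\beta\theta)/\phi$
$\gamma_f = \beta\theta/\phi$
$\gamma_b = \omega/\phi$
$\phi = \theta + \omega[1 - \theta(1-\beta)]$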
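The mapping from the structural parameters $(\omega, \theta, \beta)$ to the reduced-form coefficients is easy to tabulate. The small function below is a sketch of that mapping only; its name and the numerical values in the example are illustrative assumptions, not estimates from the paper.

```python
def hybrid_nkpc_coefficients(omega, theta, beta):
    """Reduced-form coefficients of the hybrid Phillips curve.

    omega: share of backward-looking firms
    theta: degree of price stickiness
    beta : discount factor
    """
    phi = theta + omega * (1 - theta * (1 - beta))
    return {
        "phi": phi,
        "lambda": (1 - omega) * (1 - theta) * (1 - beta * theta) / phi,
        "gamma_f": beta * theta / phi,
        "gamma_b": omega / phi,
    }

# Illustrative values only
hybrid = hybrid_nkpc_coefficients(omega=0.3, theta=0.8, beta=0.99)
pure = hybrid_nkpc_coefficients(omega=0.0, theta=0.8, beta=0.99)
print(hybrid)
print(pure)
```

Two sanity checks follow from the formulas: with $\omega = 0$ the hybrid curve collapses to the purely forward-looking one ($\gamma_b = 0$, $\gamma_f = \beta$, $\lambda = (1-\theta)(1-\beta\theta)/\theta$), and with $\beta = 1$ the weights satisfy $\gamma_f + \gamma_b = 1$.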

39 / 74

Questions
• Does the NKPC perform better than the “traditional” Phillips curve?

$\pi_t = \pi_{t-1} + \alpha y_t + \varepsilon_t$

where $y_t$ is the output gap
• What is the share $\omega$ of “backward-looking” agents?

Empirical specification
Quarterly US data, 1960:1-1997:4.
Inflation: growth rate of the GDP deflator
Marginal cost: under a Cobb-Douglas production function, the labor share in value added is a proxy for marginal cost.

40 / 74

Moment conditions
Normalisation (1)

$E[(\phi\pi_t - \beta\theta\pi_{t+1} - \omega\pi_{t-1} - (1-\omega)(1-\theta)(1-\beta\theta) mc_t) z_t] = 0$

Normalisation (2)

$E[(\pi_t - \beta\theta\phi^{-1}\pi_{t+1} - \omega\phi^{-1}\pi_{t-1} - (1-\omega)(1-\theta)(1-\beta\theta)\phi^{-1} mc_t) z_t] = 0$

Instruments $z_t$: four lags of inflation, labor share, interest rate spread, commodity prices, wage growth rate and output gap.

41 / 74

Estimation Results: New Phillips curve

42 / 74

Estimation Results: New hybrid Phillips curve

43 / 74

Estimation Results: robustness

44 / 74

Estimation Results: robustness

45 / 74

Introducing “full information” approaches

• GMM (part 1 of this lecture): a limited information approach.
It uses a moment condition but does not specify the dynamics of the forcing variable $x_t$.
Consequences:
- the efficiency of the estimator could be improved
- simulation of the model is not possible

• Full information approach:
Specify the dynamics of $x_t$, and solve for the term $E_t x_{t+1}$ as a function of observables.
Two steps:
- solving the model
- estimation, by minimizing the distance between data moments and model-generated moments, or by maximizing the likelihood

46 / 74

Minimum Distance Estimators (MDE).

• The approach:
(1) Compute auxiliary parameters $\hat\psi$ from the data.
Examples: moments, autocorrelation coefficients, impulse response functions (IRF), ...
→ Often requires an auxiliary model (e.g. a VAR model).

(2) For each parameter set $\theta$, compute the counterpart $\psi(\theta)$ of the auxiliary parameters in the theoretical model.

Then estimate the parameters by minimizing the distance between the auxiliary parameters $\hat\psi$ and their counterpart $\psi(\theta)$ in the theoretical model.

Note: $\psi(\theta)$ may also be a function of the data (omitted here to simplify the notation)

47 / 74

• Formally
The MDE estimator is the solution of

$\hat\theta_{MDE} = \arg\min_\theta\, (\hat\psi - \psi(\theta))' W (\hat\psi - \psi(\theta))$

where $W$ is a positive semi-definite weighting matrix

• Foundation for the estimator:
Maintained assumption: for the true parameter $\theta_0$,

$\mathrm{plim}\, \hat\psi = E \psi_{\text{Auxiliary Model}} = \psi(\theta_0)$.

48 / 74

• Remark: formally the MDE estimator is a particular case of GMM

• Properties: consistency, asymptotic normality (as with GMM)

$\sqrt{T}(\hat\theta_{MDE} - \theta) \to N(0, V)$

where $V = [D'WD]^{-1} [D'W\Phi WD] [D'WD]^{-1}$ with $D = \mathrm{plim}(\partial\psi(\theta)/\partial\theta')$
and $\Phi$ is the asymptotic variance of the moment (or parameter) $\hat\psi$

• Remark: the use of GMM here is distinct from that in the previous “GMM” section of the course!
Here it is applied to moments predicted by the model.
The moment condition here exploits:
- properties of the reduced form of the model
- not the first-order conditions (as in section 1). Hence there are no explicit instruments here.

49 / 74

• Advantages:

- the MDE approach uses the structure of the model

- computation of the likelihood is not needed (helpful in particular when there is no explicit analytical form of the likelihood)

- in some cases (fitting the IRF to a shock) it is not needed to specify the dynamic processes of all shocks in the model: a parsimony advantage.

• Drawbacks:
- estimates depend on the auxiliary parameters that have been chosen, and on the identification of shocks (when fitting an IRF)
- bias if there is a specification error in the auxiliary model (even if the equation of interest is well specified).

• Example. Christiano, Eichenbaum and Evans (2005).Estimate a DSGE macro model through fitting the Impulse ResponseFunction (IRF) to a monetary policy shock

50 / 74

Christiano L. J., M. Eichenbaum, and C. Evans (2005) Nominal Rigidities and the Dynamic Effects of a Shock to Monetary Policy, Journal of Political Economy, 113(1), 1-45.

A New Keynesian DSGE model
Estimated through MDE (Minimum Distance Estimation), US data
Question addressed: role of nominal and real rigidities in the response to a monetary shock

51 / 74

Effects of a monetary policy shock
• Assessed with a VAR model
• Quarterly US data, 1965:3-1995:3
• 9 variables in the VAR model (GDP, consumption, deflator, investment, wage, productivity, Fed funds rate, M2, profits)
• Recursive identification of the monetary policy shock: the interest rate, M2 and profits react contemporaneously to monetary policy shocks. The other variables do not.
• Stylised fact: “hump-shaped” response of real variables and inflation.

52 / 74

IRF (Impulse Response Functions)
Responses to a monetary shock

53 / 74

IRF (Continued)

54 / 74

A DSGE model with real and financial frictions
Consumption
Euler equation with (internal) habit formation
→ Utility depends on past aggregate consumption
$\psi_{c,t} = E_t(\psi_{c,t+1}) + i_t - E_t \pi_{t+1}$

$\psi_{c,t}$: marginal utility of consumption

$c_t = \left(\frac{b}{1-\beta b}\right) c_{t-1} + \left(\frac{\beta}{1-\beta b}\right) E_t c_{t+1} - \frac{1}{(1-\beta b)\sigma_c} \psi_{c,t}$

55 / 74

Investment:
• Households hold the capital stock and rent it to firms
Accumulation equation

$\bar K_t = (1-\tau)\bar K_{t-1} + \tau I_{t-1}$
$P_{k',t}$: shadow real price of the capital stock
$r^K_{t+1}$: “rental rate” of the capital stock

$P_{k',t} = -(i_t - E_t\pi_{t+1}) + \beta(1-\delta) E_t P_{k',t+1} + (1 - \beta(1-\delta)) r^K_{t+1}$

56 / 74

• Frictions:
- capital adjustment cost (parameter $\kappa$), attached to the growth rate of the capital stock (Tobin's Q).

$I_t = \frac{1}{1+\beta} I_{t-1} + \frac{\beta}{1+\beta} E_t I_{t+1} + \frac{1}{\kappa} P_{k',t}$

- variable capital utilisation rate

$K_t - \bar K_t = \frac{1}{\sigma_a}(w_t - \pi_t + R_t + L_t - K_t)$

where $\bar K_t$ is physical capital and $K_t$ utilized capital

57 / 74

Firms
Monopolistic competition
Working capital: firms have to borrow to finance intermediate consumption and wages
$s_t$: marginal cost
$s_t = \alpha r^K_t + (1-\alpha)(w_t + i_t)$
Labor demand (standard)

$L_t - K_t = -(w_t + i_t) + r^K_t$

58 / 74

Prices and wages:
Calvo-style nominal rigidities: in each period, prices and wages have an exogenous probability of not being re-optimized.
• Inflation process given by the New Keynesian Phillips curve

$\pi_t = \frac{\beta}{1+\beta} E_t\pi_{t+1} + \frac{1}{1+\beta}\pi_{t-1} + \frac{1}{1+\beta}\frac{(1-\beta\xi_p)(1-\xi_p)}{\xi_p} s_t$

$s_t$: marginal cost
• Wages: given by the wage New Keynesian Phillips curve

$\left[\frac{b_w(1+\beta\xi_w^2) - \lambda_w}{b_w \xi_w}\right] w_t = \beta E_t w_{t+1} + w_{t-1} + [\beta(E_t\pi_{t+1} - \pi_t) - (\pi_t - \pi_{t-1})] + \frac{1-\lambda_w}{b_w\xi_w}[-\psi_{c,t} + L_t]$

$w_t$: real wages
Indexation: $W_t$ and $P_t$ are adjusted through an indexation rule when they are not re-optimized

59 / 74

Monetary policy rule
Generalized Taylor rule (equivalent to postulating a growth rate rule for M2)
The monetary policy shock is the only one explicitly introduced in the model (!)
Money demand
$q_t = -\frac{1}{\sigma_q}\left[\frac{R}{1-R} i_t + \psi_{c,t}\right]$

60 / 74

Estimation
Some parameters are calibrated: discount factor, labor share, etc.
7 parameters are estimated:
$b$: habit formation
$\xi_w, \xi_p$: wage and price rigidity
$\lambda_f$: mark-up (or equivalently the elasticity of substitution)
$\sigma_q$: money demand parameter
$\kappa$: investment adjustment costs
$\sigma_a$: capital utilization variation cost parameter

61 / 74

Same data as for the VAR model

Estimation method: MDE, fitting the impulse response functions (IRFs) to a monetary policy shock

$J = \min_\gamma\, [\hat\Psi - \Psi(\gamma)]' V^{-1} [\hat\Psi - \Psi(\gamma)]$

where $\gamma$ is the model parameter vector,
$\Psi(\gamma)$ is the full set of parameters of the model-generated IRFs (a function of $\gamma$),

$\hat\Psi$ is the set of parameters of the estimated empirical IRFs

62 / 74

Estimation results

63 / 74

IRF for variants of the model - role of nominal frictions

... wage rigidity is crucial

64 / 74

IRF for variants of the model - role of real frictions

...a variable capital utilization rate is crucial

65 / 74

Simulated Methods of Moments (SMM)

• As in MDE: use of an auxiliary model or auxiliary moments.
• The difference with MDE: the function $\psi(\theta)$ is evaluated by simulating trajectories of the model
Example: $\psi(\theta) = E_u h(\theta, u)$, where $u$ is a latent (unobserved) variable and $\psi(\theta)$ is not analytically tractable.
Simulation of $S$ trajectories of $u$.

Computation across trajectories: $\tilde\psi(\theta) = \frac{1}{S}\sum_{s=1}^{S} h(\theta, u^s)$

66 / 74

• The SMM estimator solves

$\hat\theta_{SMM} = \arg\min_\theta\, (\hat\psi - \tilde\psi(\theta))' W (\hat\psi - \tilde\psi(\theta))$

• The SMM estimator is consistent even for finite $S$
• The variance of $\hat\theta_{SMM}$ is a decreasing function of $S$:

$V(\hat\theta_{SMM}) = V(1 + T/S)$

where $V$ is the variance associated with the MDE estimator (see above).

67 / 74

Indirect Inference .

• As in MDE and SMM: relies on an auxiliary model or auxiliary moments.
• Indirect Inference: a generalization of SMM

• For each parameter set, the structural model is simulated H times, and the auxiliary model is estimated on the simulated data

• Estimation is performed by minimizing the distance between:
→ the “average” parameters of the auxiliary model estimated on simulated data
and
→ the auxiliary model parameters estimated from the actual data

68 / 74

• Formally
The II (Indirect Inference) estimator is the solution of

$\hat\theta_{II} = \arg\min_\theta\, (\hat\psi - \tilde\psi_H(\theta))' W (\hat\psi - \tilde\psi_H(\theta))$

where $W$ is a weighting matrix, $H$ is the number of simulations and $\tilde\psi_H(\theta) = \frac{1}{H}\sum_{j=1}^{H} \tilde\psi_j(\theta)$

$\tilde\psi_j(\theta)$: auxiliary parameter estimate obtained with simulation # $j$

• Properties of $\hat\theta_{II}$: consistent, asymptotically Gaussian (see De Jong, Dave, 2007, chap. 7, or Gourieroux, Monfort, 1996)

69 / 74

Indirect Inference : example.

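• Estimated model: MA(1)

$y_t = \varepsilon_t + \theta\varepsilon_{t-1}$

$\varepsilon_t \sim N(0, 1)$ i.i.d.
• Estimation by maximum likelihood: relatively slow
• Estimation through simulation: use an AR(p) auxiliary model

$y_t = \rho_1 y_{t-1} + ... + \rho_p y_{t-p} + v_t$

• Example: AR(2).

$\hat\psi = \begin{bmatrix} \hat\rho_1 \\ \hat\rho_2 \end{bmatrix}$ estimated on actual data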

70 / 74

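• For each parameter set $\theta$, simulate $H$ trajectories $y^h_t(\theta)$ of the MA model

All trajectories have the same size $T$ as the actual empirical sample.
• $H$ estimations using OLS:

$y^h_t(\theta) = \rho_1 y^h_{t-1}(\theta) + \rho_2 y^h_{t-2}(\theta) + v_t$

$\tilde\psi_H(\theta) = \frac{1}{H}\sum_{h=1}^{H} \begin{pmatrix} \hat\rho^h_1 \\ \hat\rho^h_2 \end{pmatrix}$: average of the parameters estimated using simulated data
• Estimation carried out through minimization of a relatively simple function

$\hat\theta_{II} = \arg\min_\theta\, (\hat\psi - \tilde\psi_H(\theta))'(\hat\psi - \tilde\psi_H(\theta))$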
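The MA(1) example can be reproduced with a short script. The sketch below makes several assumptions that are not in the slides: a grid search over $\theta \in [-0.95, 0.95]$ instead of a numerical optimizer, $H = 10$ simulated paths, an identity weighting matrix, and simulated shocks drawn once and reused for every candidate $\theta$ (so that the objective is a smooth function of $\theta$). Function names are hypothetical.

```python
import numpy as np

def simulate_ma1(theta, eps):
    """y_t = eps_t + theta * eps_{t-1}; returns len(eps) - 1 observations."""
    return eps[1:] + theta * eps[:-1]

def ar2_ols(y):
    """OLS estimate of (rho_1, rho_2) in y_t = rho_1 y_{t-1} + rho_2 y_{t-2} + v_t."""
    Z = np.column_stack([y[1:-1], y[:-2]])      # regressors y_{t-1}, y_{t-2}
    return np.linalg.lstsq(Z, y[2:], rcond=None)[0]

def indirect_inference(y, H=10, grid=None, seed=1):
    """Grid-search indirect inference estimate of the MA(1) parameter."""
    T = len(y)
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 77)
    rng = np.random.default_rng(seed)
    shocks = rng.normal(size=(H, T + 1))        # drawn once, reused for each theta
    psi_hat = ar2_ols(y)                        # auxiliary parameters on actual data

    def distance(theta):
        # average auxiliary estimate over the H simulated trajectories
        psi_sim = np.mean([ar2_ols(simulate_ma1(theta, e)) for e in shocks], axis=0)
        d = psi_hat - psi_sim
        return d @ d                            # identity weighting matrix

    return float(min(grid, key=distance))

# "Actual" data generated with theta_0 = -0.5 and T = 250, as in the case studied
data_rng = np.random.default_rng(0)
y_obs = simulate_ma1(-0.5, data_rng.normal(size=251))
theta_ii = indirect_inference(y_obs)
print(f"Indirect inference estimate: {theta_ii:.3f} (true value -0.5)")
```

Only OLS regressions and simulations are needed; the MA(1) likelihood is never evaluated, which is the point of the method.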

71 / 74

• Properties of the estimator (studied through a Monte Carlo simulation), Gourieroux, Monfort (1996).
Case studied: $\theta_0 = -0.5$, $H = 1$, $T = 250$

Estimator   Average   S.D.    RMSE
II-AR(1)    -0.481    0.105   0.106
II-AR(2)    -0.491    0.065   0.066
ML          -0.504    0.061   0.061

The estimator is nearly as efficient as Maximum Likelihood for $p = 2$, with a shorter computation time.

72 / 74

• Advantages of indirect inference:

- No need to compute a likelihood (same as GMM, MDE, ...).

- Convenient when there are unobserved variables (all that is needed is to be able to simulate the model)

- Estimation is consistent even if the auxiliary model is misspecified!
The “II” procedure corrects for biases in the auxiliary model, such as small-sample bias, misspecification, ...

• Drawbacks:
- a large number of simulations is needed to obtain a precise estimator.
• Examples of applications of SMM and Indirect Inference: Coenen and Wieland (2005), Kim and Ruge-Murcia (2009), Knotek (2010)

73 / 74

Lecture 5 : Wrapping up

• GMM is easy to implement, but suffers from small-sample bias and a lack of “model consistency”.
• It is useful to get quick/first estimates, or if the model is complex; but it is no longer the standard approach
• SMM and Indirect Inference are closer to full information. They are especially useful if the researcher focuses on fitting specific moments (or IRFs), or if the model is non-linear and not manageable with likelihood approaches

74 / 74
