
FISCAL DEVELOPMENTS AND MONETARY PROGRAMME

The Generalised Method of Moments (GMM)

DR. MOSES SICHEI

A PRESENTATION IN MOZAMBIQUE, 27TH JUNE 2013

Outline

- Basic principle of GMM
- Estimation of GMM
- Hypothesis testing
- Extensions of GMM
- DSGE and GMM
- Some applied tips

Basic principle of GMM

- GMM, introduced by Hansen (1982, Econometrica), is one of two developments in econometrics that revolutionised empirical work in macroeconomics; the other is unit roots and cointegration.
- The starting point of GMM estimation is a theoretical relation that the parameters of interest should satisfy.
- The basic idea is to choose the parameter estimates so that the theoretical relation is satisfied as closely as possible.
- The theoretical relation is replaced by its sample counterpart, and the estimates are chosen to minimise the weighted distance between the theoretical and actual values.

Why GMM?

- GMM encompasses many common estimators of interest in econometrics and provides a useful framework for their comparison and evaluation.
- GMM provides a computationally convenient method for estimating nonlinear dynamic models without the complete specification of the probability distribution of the data that maximum likelihood estimation requires.
- The GMM approach links nicely to economic theory, where orthogonality conditions can serve as moment functions, often arising from the optimising behaviour of agents.

What are population moments?

- We define a population moment as the expectation of some continuous function f of a random variable x:

  m = E[f(x)]

- The mean, or first moment, is the case where f is the identity function:

  mu = E[x]

- Higher moments: the uncentred second moment is given by E[x^2], and the variance of x can be expressed as a function of the two moments E[x] and E[x^2]:

  var(x) = E[x^2] - (E[x])^2

- Functions of moments, such as var(x), are also called moments.

What are sample moments?

- So far we have discussed population moments, but in estimation we need to define sample moments.
- The sample moment is merely the sample version of the population moment in a particular random sample, obtained by replacing the expectations operator with a sample average:

  m_hat = (1/T) sum_{t=1}^{T} f(x_t)

  Moment level     Population moment    Sample moment
  First moment     E[x]                 (1/T) sum x_t
  Second moment    E[x^2]               (1/T) sum x_t^2

Method of moments

- The method of moments goes back to Pearson (1893, 1894, 1895).
- The method of moments entails nothing more than estimating any population moment (or function of population moments) by using the corresponding sample moment (or function of sample moments).
- Suppose two series x_t and y_t have zero means. The following sample moment conditions define the traditional variance and covariance estimators:

  (1/T) sum x_t^2 - sigma_x^2 = 0
  (1/T) sum y_t^2 - sigma_y^2 = 0
  (1/T) sum x_t*y_t - sigma_xy = 0
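The moment estimators above can be sketched in a few lines of NumPy; the data are simulated and the variable names are purely illustrative, not from the presentation:

```python
import numpy as np

# Simulated zero-mean series (illustrative)
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
y = 0.5 * x + rng.standard_normal(1000)
x -= x.mean()  # enforce the zero-mean assumption of the slide
y -= y.mean()

# Method of moments: replace each population expectation E[.]
# with the corresponding sample average
sigma2_x = np.mean(x**2)       # variance of x
sigma2_y = np.mean(y**2)       # variance of y
sigma_xy = np.mean(x * y)      # covariance of x and y
rho_xy = sigma_xy / np.sqrt(sigma2_x * sigma2_y)  # correlation
```

Each line is the sample counterpart of one population moment condition, with the summation sign implemented as `np.mean`.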

Method of moments cont.

- If we want the correlation instead of the covariance, then we change the last moment condition to:

  (1/T) sum x_t*y_t - rho*sigma_x*sigma_y = 0

- This must be estimated jointly with the first two moment conditions.

From method of moments to GMM

The generalised method of moments implies several generalisations, such as:
- Conditional moments may be used as well as unconditional ones;
- Moments may depend on unknown parameters;
- There can be more moment conditions than parameters.

OLS estimator as a MM and GMM estimator

- The usefulness of GMM comes from the fact that the object of interest in many estimation exercises is simply a function of moments.
- For a correctly specified linear regression model

  y_t = x_t'beta + u_t

  the simple moment conditions are:

  Population:  E[x_t*u_t] = 0
  Sample:      (1/T) sum x_t*(y_t - x_t'beta) = 0

OLS estimator as a MM estimator

- The linear regression implies that E[x_t*u_t] = 0, so the sample moment conditions are:

  (1/T) sum x_t*(y_t - x_t'beta_hat) = 0

- The MM estimator can be derived from the above condition as:

  beta_hat = (sum x_t*x_t')^{-1} sum x_t*y_t = (X'X)^{-1} X'y
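A minimal sketch of solving the sample moment condition directly, on simulated data (names and numbers are illustrative):

```python
import numpy as np

# Simulated regression y = X beta + u (illustrative)
rng = np.random.default_rng(1)
T = 500
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.standard_normal(T)

# Solve the sample moment condition (1/T) X'(y - X beta) = 0,
# which rearranges to the familiar OLS formula (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

Setting the sample moments to zero and solving the resulting linear system reproduces OLS exactly, which is the point of the slide.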

IV as a MM estimator

- IV is a MM estimator.
- Suppose E[x_t*u_t] != 0, but there is another variable z_t, referred to as an instrument, with the following characteristics:
  - It is informative or relevant, i.e. the instrument is correlated with the endogenous regressor: E[z_t*x_t] != 0;
  - It is exogenous or valid (exclusion restriction): E[z_t*u_t] = 0.
- The assumptions above (validity and relevance) enable us to identify the parameters of the model.
- If we have data on a valid and relevant instrument, we are in business.

IV as a MM estimator cont.

- The model is exactly identified. The moment (or orthogonality) condition is:

  (1/T) sum z_t*(y_t - x_t'beta_hat) = 0

- Provided the matrix Z'X has full rank, we can invert it and solve for:

  beta_hat_IV = (Z'X)^{-1} Z'y

- The IV estimator is consistent under the assumptions we have made: using Slutsky's theorem, as the sample size goes to infinity beta_hat_IV converges in probability to the true value.
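The consistency point can be seen numerically. A sketch with a simulated endogenous regressor and a valid instrument (the data-generating process and coefficient values are my own illustration):

```python
import numpy as np

# Simulated data: x is correlated with the structural error e,
# z is correlated with x but not with e (valid, relevant instrument)
rng = np.random.default_rng(2)
T = 2000
z = rng.standard_normal(T)                        # instrument
e = rng.standard_normal(T)                        # structural error
x = 0.8 * z + 0.5 * e + rng.standard_normal(T)    # endogenous regressor
y = 2.0 * x + e                                   # true beta = 2.0

X = x.reshape(-1, 1)
Z = z.reshape(-1, 1)

# Solve the sample moment condition (1/T) Z'(y - X beta) = 0:
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
# For comparison, OLS ignores the endogeneity and is biased upward here
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
```

With this design `beta_iv` lands near the true value 2.0, while `beta_ols` is pulled upward by the positive correlation between x and e.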

TSLS and MM

We can obtain the same result by using a two-stage procedure:
- Regress the endogenous variable on the instrument and calculate the predicted value of the endogenous variable;
- Use the predicted value of the endogenous variable (instead of the actual value) from the first regression as the explanatory variable in the structural equation, and estimate by OLS.

If one uses the endogenous variable as its own instrument, one can see that OLS = IV = 2SLS, in which case all variables are assumed exogenous.

TSLS and IV: use of the projection matrix

A common way of writing the two-stage least squares estimator is:

  beta_hat_2SLS = (X'P_Z X)^{-1} X'P_Z y

where P_Z is the projection matrix:

  P_Z = Z(Z'Z)^{-1} Z'

Generalised MM estimator

- Many statistical and economic models feature q moment or orthogonality conditions:

  E[f(x_t, theta)] = 0

  from which we want to estimate the vector of parameters theta.
- The GMM estimator is designed to estimate theta without requiring either distributional assumptions about the data or an explicit solution for the endogenous variables.
- The methodology can be applied to both linear and nonlinear specifications; it can be used in univariate and multivariate setups, and it requires only mild regularity conditions to produce estimators with good properties.

Generalised MM estimator cont.

- Let E[f(x_t, theta)] = 0 be the population moment condition, let g_T(theta) = (1/T) sum f(x_t, theta) be its sample counterpart, and let W be a symmetric positive definite matrix. Then a GMM estimator solves (minimises):

  theta_hat = argmin g_T(theta)' W g_T(theta)

- The GMM estimator makes the sample version of certain orthogonality conditions close to their population counterpart, and the distance or weighting matrix W describes what "close" means.

Generalised MM estimator cont.

The GMM estimator minimises the weighted quadratic form (equivalent to a loss function):

  Q(theta) = g_T(theta)' W g_T(theta)

where g_T(theta) is the sample average of the moment functions f(x_t, theta) and W is some symmetric positive definite weighting matrix.

Another generalisation

Let any (nonlinear) moment conditions be:

  E[f(x_t, z_t, theta)] = 0

The sample counterpart is:

  g_T(theta) = (1/T) sum f(x_t, z_t, theta)

The optimisation problem is to minimise:

  Q(theta) = g_T(theta)' W g_T(theta)

If we have more instruments than coefficients, we need to choose theta to minimise the above problem. What should the weighting matrix W be?

Weighting matrix

- It turns out that any symmetric positive definite matrix W yields consistent estimates of the parameters.
- However, it does not yield efficient ones in terms of the Cramér-Rao lower bound.
- Hansen (1982) derives the necessary (though not sufficient) conditions to obtain asymptotically efficient estimates of the coefficients.
- Hansen (1982) suggests that the appropriate weighting matrix is the inverse of the covariance matrix of the sample moments:

  W = S^{-1}

- Intuition: this matrix is chosen because it means that less weight is placed on the more imprecise moments.

Implementation process

- Implementation suffers from a circularity problem:
  - Before we can estimate theta we need an estimate of W;
  - Before we can estimate the matrix W we need an estimate of theta.
- To deal with this problem the implementation is generally undertaken in two steps:
  - Any symmetric positive definite matrix yields consistent estimates of the parameters, so exploit this fact: use an arbitrary symmetric positive definite matrix (the identity matrix is normally used) to obtain a first consistent estimator;
  - Use these parameters to construct the weighting matrix, and from that undertake the minimisation problem again. This process can then be iterated until convergence is achieved.
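The two-step procedure can be sketched as follows. The problem is my own illustration (estimating the mean and variance of a normal sample from three moment conditions, so q = 3 > k = 2), and the use of `scipy.optimize.minimize` is an implementation choice, not something from the presentation:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative overidentified problem: theta = (mu, sigma2)
rng = np.random.default_rng(5)
x = rng.normal(loc=3.0, scale=2.0, size=5000)

def moments(theta, x):
    mu, sigma2 = theta
    return np.column_stack([
        x - mu,                 # E[x - mu] = 0
        (x - mu)**2 - sigma2,   # E[(x - mu)^2 - sigma2] = 0
        (x - mu)**3,            # E[(x - mu)^3] = 0 (symmetry)
    ])

def gmm_step(x, W, theta0):
    def obj(th):
        g = moments(th, x).mean(axis=0)   # g_T(theta)
        return g @ W @ g                  # Q(theta) = g_T' W g_T
    return minimize(obj, theta0, method="Nelder-Mead").x

# Step 1: identity weighting matrix -> consistent first-round estimate
theta1 = gmm_step(x, np.eye(3), np.array([0.0, 1.0]))

# Step 2: estimate S from the first-round moments (no serial
# correlation in this i.i.d. example), set W = S^{-1}, re-minimise
f1 = moments(theta1, x)
S = f1.T @ f1 / len(x)
theta2 = gmm_step(x, np.linalg.inv(S), theta1)
```

Step 2 can be repeated with the updated parameter estimates until `theta` stops changing, which is the iterated GMM estimator described above.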

Covariance estimators

- Choosing the right weighting matrix is very important for GMM, and many econometric papers have been written on this subject: the estimation results are very sensitive to the choice of weighting matrix.
- What happens when heteroscedasticity and autocorrelation are part of the model, as is usually the case? We need to modify the covariance matrix.
- We should write the covariance matrix of the empirical moments as:

  S = (1/T) sum_t sum_s E[f_t f_s']

where f_t is the t-th row of the matrix of sample moments.

Covariance estimators cont.

- Define the autocovariances:

  Gamma_j = (1/T) sum_{t=j+1}^{T} f_t f_{t-j}'

- We can now express S in terms of the above expressions:

  S = Gamma_0 + sum_{j=1}^{T-1} (Gamma_j + Gamma_j')

- If there is no serial correlation, the Gamma_j for j >= 1 are all zero (since the autocovariances will be zero):

  S = Gamma_0 = (1/T) sum_t f_t f_t'

- This is very similar to White's (1980) heteroscedasticity-consistent estimator.

Covariance estimators cont.

If this looks like a White (1980) heteroscedasticity-consistent estimator, implementation should be straightforward. Take the standard heteroscedastic version:

  S_hat = (1/T) sum_t u_hat_t^2 x_t x_t'

Covariance estimators cont.

- The weighting matrix can be consistently estimated by using any consistent estimator of the model's parameters and substituting the expected values of the squared residuals with the actual residuals.
- The only difference here is that we are generalising the problem by allowing instruments, i.e.:

  S_hat = (1/T) sum_t u_hat_t^2 z_t z_t',   W = S_hat^{-1}

Covariance estimators cont. - autocorrelation

- The problem is that with autocorrelation it is not possible to replace the expected values of the squared residuals with the actual values from the first estimation.
- Why? Because it would lead to an inconsistent estimate of the autocovariance matrices.
- The problem with this approach is that, asymptotically, the number of estimated autocovariances grows at the same rate as the sample size.
- This means that while the resulting estimator is unbiased, it is not consistent in the mean squared error sense.

Covariance estimators cont. - autocorrelation

- Thus we require a class of estimators that circumvents these problems.
- A class of estimators that prevents the autocovariances from growing with the sample size is:

  S_hat = Gamma_0 + sum_{j=1}^{T-1} k(j) (Gamma_j + Gamma_j')

- Parzen termed the weights k(j) the lag window.
- These estimators correspond to a class of kernel (spectral density) estimators evaluated at frequency zero.

Covariance estimators cont. - autocorrelation

- The key is to choose the sequence of weights such that it approaches unity rapidly enough to obtain asymptotic unbiasedness, but slowly enough to ensure that the variances converge to zero.
- The type of weights used in EViews corresponds to a particular class of lag windows termed scale parameter windows.
- The lag window is expressed as:

  k(j) = K(j / b_T)

where b_T is the bandwidth.
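As a concrete instance of such a kernel estimator, here is a sketch of the Newey-West estimator, which uses the Bartlett lag window k(j) = 1 - j/(L+1); the function, bandwidth choice and AR(1) test series are my own illustration:

```python
import numpy as np

def newey_west(f, L):
    """Long-run covariance of the moment series f (T x q) using the
    Bartlett lag window k(j) = 1 - j/(L + 1) with bandwidth L."""
    T = f.shape[0]
    S = f.T @ f / T                       # Gamma_0
    for j in range(1, L + 1):
        Gamma_j = f[j:].T @ f[:-j] / T    # j-th sample autocovariance
        w = 1.0 - j / (L + 1)             # Bartlett weight, zero beyond L
        S += w * (Gamma_j + Gamma_j.T)
    return S

# Example: an AR(1) series u_t = rho*u_{t-1} + e_t has long-run
# variance sigma_e^2 / (1 - rho)^2 = 4 for rho = 0.5, sigma_e = 1
rng = np.random.default_rng(6)
T, rho = 20000, 0.5
u = np.empty(T)
u[0] = rng.standard_normal()
for t in range(1, T):
    u[t] = rho * u[t - 1] + rng.standard_normal()
S_hat = newey_west(u.reshape(-1, 1), L=20)
```

Because the Bartlett weights decline linearly to zero at lag L + 1, only a fixed number of autocovariances enter the sum, which is exactly how this class of estimators keeps the number of estimated autocovariances from growing with the sample size.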

COVARIANCE ESTIMATION

- Hansen (1982) showed that an optimal choice for the weighting matrix is a heteroscedasticity (and autocorrelation) consistent estimate of the inverse of the asymptotic variance matrix:

  W = S^{-1},  where S = asy.Var( sqrt(T) g_T(theta_0) )

  and theta_0 is the true value of the parameter vector.
- The description of the optimal weighting matrix is somewhat circular:
  - Before we can estimate theta we need an estimate of W;
  - Before we can estimate the matrix W we need an estimate of theta.
- How do we solve this problem?

The weighting matrix

We therefore follow a two-step iterative process:
- Estimate the model with some (symmetric and positive definite) weighting matrix; the identity matrix is typically used.
- This gives consistent estimates of the parameters, which can then be used to produce an initial consistent estimate S_hat.
- Use the consistent S_hat to define a new weighting matrix W = S_hat^{-1}.
- Run the algorithm again to give asymptotically efficient estimators of theta.
- Iterate until the point estimates converge.

SUMMARY

- An estimate of S is necessary to:
  - Calculate asymptotic standard errors for the GMM estimator;
  - Utilise the optimal weighting matrix.
- If the disturbances are serially uncorrelated, S can be estimated using White's (1980) heteroscedasticity-consistent estimator.
- What happens with serially correlated disturbances of known or unknown order?
  - With autocorrelation, it is not possible to replace the expected values of squared residuals with the actual values: it would lead to an inconsistent estimate of the autocovariances.
  - The best approach is to use heteroscedasticity and autocorrelation consistent (HAC) covariance matrices.
- HAC estimates are typically of two types: kernel-based (non-parametric) and parametric.
- In both cases a number of choices must be made, and these may influence the properties of the estimates.

Estimating the covariance matrix: choices

- Existing Monte Carlo evidence on estimating S recommends VAR pre-whitening and either the quadratic spectral (QS) or the Parzen kernel, together with Andrews' (1991) automatic bandwidth parameter.
- Although the QS kernel estimator may be preferred to the Parzen kernel estimator on asymptotic grounds, it takes more time to calculate.

HYPOTHESIS TESTING

Because models estimated by GMM are subject to few restrictions, their specification is not demanding. Once you have the moment conditions, the only other things to test are:
- (Non)linear restrictions on the parameters;
- Sets of overidentifying restrictions;
- Parameter constancy.

TESTING RESTRICTIONS WITH GMM

- One of the reasons for the popularity of GMM is that it allows a clear procedure for testing restrictions that come out of a well-specified econometric model.
- If the number of moment restrictions q is greater than the number of parameters k, the GMM estimator is overidentified.
- Under the null that the restrictions are valid, the test of overidentifying restrictions (also known as the J-test)

  J = T * g_T(theta_hat)' W g_T(theta_hat)

  is asymptotically distributed as chi-squared with q - k degrees of freedom.
- This test is often misunderstood. It is not a test of whether the instruments are valid. Instead, the test answers the following question: given that a subset of instruments is valid and exactly identifies the coefficients, are the extra instruments valid?
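A sketch of computing the J-statistic after two-step GMM; the overidentified problem (q = 3 moment conditions, k = 2 parameters) and the use of SciPy are my own illustrative choices:

```python
import numpy as np
from scipy.stats import chi2
from scipy.optimize import minimize

# Illustrative overidentified setup: estimate (mu, sigma2) of a normal
# sample from three moment conditions, leaving one testable restriction
rng = np.random.default_rng(7)
x = rng.normal(loc=3.0, scale=2.0, size=5000)

def moments(theta, x):
    mu, sigma2 = theta
    return np.column_stack([x - mu, (x - mu)**2 - sigma2, (x - mu)**3])

def gmm_step(x, W, theta0):
    def obj(th):
        g = moments(th, x).mean(axis=0)
        return g @ W @ g
    return minimize(obj, theta0, method="Nelder-Mead").x

theta1 = gmm_step(x, np.eye(3), np.array([0.0, 1.0]))    # first step
f1 = moments(theta1, x)
W = np.linalg.inv(f1.T @ f1 / len(x))                    # efficient W
theta2 = gmm_step(x, W, theta1)                          # second step

# J-statistic: T * g_T' W g_T, chi-squared with q - k = 1 d.o.f.
g = moments(theta2, x).mean(axis=0)
J = len(x) * g @ W @ g
p_value = chi2.sf(J, df=3 - 2)
```

Since the moment conditions are correctly specified here, the J-statistic should typically be small and the p-value unremarkable; a surprisingly large J would signal that some of the overidentifying restrictions fail.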

TIPS ON INTERPRETATION OF THE J-STATISTIC

- If the J-statistic is surprisingly large, it means that either the orthogonality conditions or the other assumptions (or both) are likely to be false.
- The finite-sample (actual) size of the J-test in small samples far exceeds the nominal size, i.e. the test rejects too often.
- The small-sample evidence on the reliability of the overidentifying tests is mixed; results depend on:
  - the weighting matrix W, and whether a fully iterated or two-step GMM method is used;
  - whether the instruments carry good information;
  - the number of instruments.
- It is recommended that you experiment with various alternatives, with various estimates of W and various instruments, before drawing conclusions about the parameter estimates and the quality of the model.

EXTENSIONS OF GMM: NONLINEAR GMM ESTIMATION

- The GMM estimator and the corresponding tests are of the same form as in linear GMM.
- For a general nonlinear model

  y_t = h(x_t, theta) + u_t

  recall that the corresponding linear model would be

  y_t = x_t'theta + u_t

- The orthogonality conditions used in linear GMM are:

  E[z_t*(y_t - x_t'theta)] = 0

- The corresponding orthogonality conditions for the nonlinear model are:

  E[z_t*(y_t - h(x_t, theta))] = 0

  where z_t are instruments.

EXTENSIONS OF GMM: NONSTATIONARY DATA

- The maintained assumption with GMM is that the observed vector of variables is strictly stationary.
- Thus, even if the data are nonstationary, they can often be transformed in such a way that stationarity of the transformed model is a reasonable assumption, e.g. by writing the model in growth rates.

EFFICIENCY BOUNDS IN GMM

- In general, the more instruments the more precision, but only up to a certain limit.
- The asymptotic covariance matrix of any GMM estimator that can be constructed from instruments contained in the information set is bounded from below.
- Hansen (1985) provides the greatest lower bounds for the asymptotic covariance matrices of members of the infinite-dimensional class of GMM estimators.
- Hansen and West (2002) provide some theoretical results that can help guide practitioners in the choice of instruments and efficiency bounds, as well as the feasible selection of instruments.

GMM AND DSGE

- DSGE models deliver orthogonality conditions of the form:

  E[u_{t+1} z_t] = 0

  where z_t is a set of instruments in agents' information sets and u_{t+1} are the residuals of an Euler equation.
- This restriction implies not only that E[u_{t+1} z_t] = 0, but also that

  E[u_{t+1} h(z_t)] = 0

  for any measurable function h.
- To minimise the asymptotic variance of the estimator, one should select the function h to maximise its correlation with the derivative of the Euler residual with respect to the parameters; it is optimal to set h(z_t) to the conditional expectation of that derivative given z_t (the optimal instrument).

ESTIMATION OF DSGE MODELS USING GMM

- Estimation of DSGE model parameters was initially done by applying instrumental-variables methods such as GMM to the Euler equations underlying them.
- This approach aimed to account for the presence of endogenous variables and future expectations that appear in these relations.
- But there are pitfalls in using GMM to estimate DSGE models.
- Problem 1: weak instruments.
  - Single-equation GMM estimators of DSGE parameters, which work on the moments coming from Euler equations, utilise the complete system only to the extent of suggesting reasonable instruments.
  - Simulations have shown that this is often inadequate because of weak instruments (Fukac et al. 2006).

ESTIMATION OF DSGE MODELS USING GMM CONT.

- Problem 2: sample sizes.
  - Although the parameters are rarely unidentified, very large samples are generally needed to produce useful inferences.
  - As shown by Hansen and West, first-order asymptotic approximations for the estimators and for the t-tests and J-tests work poorly in samples of typical size.
- Key advantage of GMM: despite these weaknesses, GMM is very useful in some cases as a means of estimating DSGE parameters.
  - In particular, the residuals of the Euler equations of a DSGE model may display serial correlation of known form: the presence of adjustment costs, time non-separability of the utility function, multi-period forecasts or time aggregation may produce orthogonality conditions that display serial correlation of known form.

Applications of GMM

- Nonlinear rational expectations models, Euler equations
- Monetary policy reaction functions
- Asset price models
- Continuous-time interest rate models
- Models of changing volatility
- Dynamic panel data models, e.g. the Arellano and Bond and Arellano and Bover methods

REFERENCES

Fukac, M., Pagan, A. and Pavlov, V. (2006), "Econometric issues arising from DSGE models", mimeo.
Hansen, L.P. (1982), "Large sample properties of generalized method of moments estimators", Econometrica, Vol. 50, No. 4, pp. 1029-1054.
Hansen, L.P. (1985), "A method for calculating bounds on the asymptotic covariance matrices of generalized method of moments estimators", Journal of Econometrics, Vol. 30, No. 1-2, pp. 203-238.
Hansen, L.P. and West, K.D. (2002), "Generalized method of moments and macroeconomics", Journal of Business and Economic Statistics, Vol. 20, No. 4, pp. 460-469.