
  • Structural Break Estimation: A Survey

    Ozan Eksi

    Universitat Pompeu Fabra

    Working Paper

    October 2009

    Abstract

    This paper surveys the literature on structural break estimation with classical econometric methods. I start by classifying different types of structural breaks. Then, I review how to differentiate the presence of structural breaks from a unit root. The two are easily confused because tests looking for the existence of one can give supporting evidence for the presence of the other. Finally, this paper covers both testing for the presence of breaks in the data and estimating their dates with single and multiple break estimators.

    JEL Codes: C10

    Keywords: Structural Breaks (vs Unit Roots)

    I would like to thank Prof. Jesús Gonzalo for useful suggestions.

    E-mail: [email protected]

    Web page: www.econ.upf.edu/~oeksi/


  • 1 Introduction

    There are two well-known problems with structural break estimation. The first is the difficulty of differentiating data that is subject to a structural break (before and after which the data shows stationary and trend stationary patterns) from data having a unit root. The second is that although break locations in data can be estimated consistently, there is no efficiency condition for the limiting distribution of the estimates. Although consistency is a sufficient condition for the purpose of many empirical papers, efficiency could still be of interest if the aim is to obtain the smallest confidence intervals around the break dates. This paper makes a non-technical survey of the literature on both topics.

    The stated reason behind these difficulties of estimating structural breaks is that the problem is nonstandard; a break date only appears under the alternative hypothesis, not under the null of no break. Perron (2005) makes a comprehensive review of both problems; however, it is very technical, and there seems to be a lack of resources summarizing the relevant literature.

    The rest of the paper is organized as follows. The next section summarizes types of structural breaks. Section 3 reviews first the literature on differentiating data with a structural break from data having a unit root, and then the literature on structural break estimation and testing.

    2 Types of Structural Breaks

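    Let's specify a basic AR(1) aggregate income process as

    $$\Delta y_t = c_t + \rho_t \, \Delta y_{t-1} + \varepsilon_t$$

    For this process, a change in the conditional mean at time $k$ means¹

    $$c_t = c_1,\ \rho_t = \rho_1 \ \text{ for } t \le k; \qquad c_t = c_2,\ \rho_t = \rho_2 \ \text{ for } t > k$$

    and a change in the conditional variance at time $l$ means

    $$\operatorname{var}(\varepsilon_t) = \sigma_1^2 \ \text{ for } t \le l; \qquad \operatorname{var}(\varepsilon_t) = \sigma_2^2 \ \text{ for } t > l$$

    Finally, for stationary series, a change in the unconditional variance of $\Delta y_t$ at time $m$ can be shown by

    $$\operatorname{var}(\Delta y_t) = \frac{\sigma^2}{1 - \rho^2} = \gamma_1^2 \ \text{ for } t \le m; \qquad \operatorname{var}(\Delta y_t) = \gamma_2^2 \ \text{ for } t > m$$

    so the change in the final term ($\gamma^2$) includes the changes in both the coefficient ($\rho$) and the variance of the shock ($\sigma^2$).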
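    As a small numerical illustration of the last formula, the sketch below computes the unconditional variance of a stationary AR(1), that is, the shock variance divided by one minus the squared AR coefficient, for a few hypothetical regimes, and simulates a series whose parameters switch after a given period. The function names, the regime values and the seed are assumptions made for illustration only; none of them come from the paper.

```python
import random


def unconditional_variance(rho, sigma2):
    """Unconditional variance of a stationary AR(1): sigma^2 / (1 - rho^2)."""
    if abs(rho) >= 1:
        raise ValueError("the AR(1) process is stationary only if |rho| < 1")
    return sigma2 / (1.0 - rho ** 2)


def simulate_ar1_with_break(n, m, regime1, regime2, seed=0):
    """Simulate an AR(1) whose (c, rho, sigma) switch from regime1 to regime2 after period m."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for t in range(1, n + 1):
        c, rho, sigma = regime1 if t <= m else regime2
        prev = c + rho * prev + rng.gauss(0.0, sigma)
        y.append(prev)
    return y


# Hypothetical regimes in which only the shock variance changes ...
v1 = unconditional_variance(rho=0.5, sigma2=3.0)    # 3 / 0.75 = 4.0
v2 = unconditional_variance(rho=0.5, sigma2=0.75)   # 0.75 / 0.75 = 1.0
# ... and one in which only the AR coefficient differs from the first regime
v3 = unconditional_variance(rho=0.0, sigma2=3.0)    # 3.0

# A simulated path with the variance break after period 100 (illustrative values)
series = simulate_ar1_with_break(200, 100, (1.0, 0.5, 3.0 ** 0.5), (1.0, 0.5, 0.75 ** 0.5))
```

    The first and third calls show that the unconditional variance moves when the AR coefficient changes even though the shock variance is held fixed, which is why a break in the unconditional variance can reflect either source.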

    Searching for a break in the conditional mean term of the income process is straightforward and only requires supplying endogenous and exogenous variables to the programs testing for a break. These programs usually provide the option of keeping the variance of the residuals the same across the segments separated by the break at time $k$. If residual variances are allowed to differ across segments, it means the break in the conditional variance of the data is assumed to occur at the same time as the break in the conditional mean of the data ($k = l$).

    If one wants to allow $k$ and $l$ to be different, first the breaks in the conditional mean of the data can be found. Then the residuals of this regression can be collected and tested for a break in their variance. This can be done, for example, by running the regression below and testing whether the value of $c$ stays constant through time or not

    $$(\varepsilon_t - \bar{\varepsilon})^2 = c_t$$

    where $\bar{\varepsilon}$ is the mean of the error terms. In fact, $\bar{\varepsilon}$ can already be assumed to be zero as long as one allows breaks in the mean of the AR(1) regression that is used to collect these error terms.

    Instead of the regression above, i.e. looking for a change in the squared deviations of the error terms, one can also use the methodology of Stock and Watson (2002) and look for a break in the absolute residuals. This is to avoid giving too much weight to outliers.

    $$|\varepsilon_t - \bar{\varepsilon}| = c_t$$

    Finally, to search for a break in the unconditional variance, a similar approach to the last one can be followed.

    $$|\Delta y_t - \overline{\Delta y}| = c_t$$

    However, this time the presence of a break in the mean of the data, $\overline{\Delta y}$, should be tested in advance, unlike what is done for the error terms. If there is a break in the mean, then the deviations should be calculated from the relevant means. For illustrative purposes we can apply this measure to US GDP growth data.

    ¹ If the change occurs in only one of the coefficients, either in $c$ or in $\rho$, then it is called a partial structural change.

    [Figure 1: two graphs, each plotted over 1945-2005. First graph: "Yearly growth of US GDP" (series D1US). Second graph: "Absolute value of demeaned US GDP growth" (series ADD1US).]

    It is clear that the reduction in the variance of the data in the first graph can be captured by regressing the second on a constant and checking whether the value of the constant remains the same or not. In Eksi (2009), as in many other studies including Stock and Watson (2002), 1984 is found to be a significant break date in GDP volatility, as shown in the second graph.
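    The volatility measure described above can be sketched in a few lines. The code below builds the absolute deviations from the mean (or from regime-specific means when a mean break has been found beforehand) and compares the estimated constant before and after a candidate date. The series is a stylized, made-up example rather than the US GDP data, and the function names and the convention that a break date counts the observations in the first regime are assumptions for illustration.

```python
def abs_demeaned(series, mean_break=None):
    """Absolute deviations from the mean, or from regime means if mean_break is given.

    mean_break is the number of observations in the first mean regime
    (a hypothetical convention chosen for this sketch).
    """
    if mean_break is None:
        segments = [series]
    else:
        segments = [series[:mean_break], series[mean_break:]]
    out = []
    for seg in segments:
        mu = sum(seg) / len(seg)
        out.extend(abs(v - mu) for v in seg)
    return out


def constants_around(z, k):
    """OLS estimates of the constant before and after observation k (subsample means)."""
    return sum(z[:k]) / k, sum(z[k:]) / (len(z) - k)


# Stylized growth series: mean 2 throughout, volatile first half, calm second half
growth = [5.0, -1.0] * 10 + [3.0, 1.0] * 10
z = abs_demeaned(growth)
c_before, c_after = constants_around(z, 20)   # 3.0 and 1.0
```

    A drop in the constant from one subsample to the other is what a formal break test on this regression would then have to judge as significant or not.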


  • 3 Literature Review on Structural Breaks

    3.1 Unit Root vs Structural Breaks

    Data can be found to be non-stationary if it has a unit root, or if it includes a structural break before and after which the data shows different patterns. The first graph in the following figure illustrates the latter case from my paper, Eksi (2009).

    [Figure 2: two graphs, each plotted over 1945-2005. First graph: "US Inequality Index". Second graph: "Growth Rate of US Inequality Index".]

    There it seems that income inequality in the US follows one stationary and one trend stationary pattern; however, when this data is tested for a unit root, the null hypothesis of a unit root cannot be rejected. It is also problematic to estimate the break date in this data, because checking for a break in a series requires stationary residuals, and we cannot just assume that the residuals would be stationary after accounting for the break in the series, since the null hypothesis of the test is that there is no break in the data.²

    As it is sometimes called in the literature, this is part of the intricate play between unit roots and structural breaks (Perron 1989, 2005). Most tests that attempt to distinguish between a unit root and a (trend) stationary process will favor the unit root model when the true process is subject to structural changes but is otherwise (trend) stationary within the regimes specified by the break dates. Also, most tests trying to assess whether a structural change is present will reject the null hypothesis of no structural change when the process has a unit root component but constant model parameters. Accordingly, there is a voluminous literature on testing for a unit root under structural break(s). These tests also give break dates as a by-product, but they are not as efficient as the break estimators. The details follow.

    The early influential paper of Perron (1989) tests the null hypothesis of a unit root under the assumption of a known (exogenous, pre-tested) break date in both the null and alternative hypotheses. Later, Christiano (1992) criticizes Perron's known-date assumption as data mining. He argues that data-based procedures are typically used to determine the most likely location of the break, i.e. by pre-test examination of the data, and that this approach invalidates the distribution theory underlying conventional testing. Zivot and Andrews (1992) and Perron (1997) proposed determining the break point endogenously from the data. However, these endogenous tests were criticized for their treatment of breaks under the null hypothesis. They do not allow for break(s) under the null hypothesis of a unit root and derive their critical values accordingly. So they exclude the possibility that there may be a unit root process with a break, and in this case these tests declare the data stationary with breaks. So it seems the literature on this subject arrives at the approach of Lee and Strazicich (2003, 2004), employing minimum Lagrange Multiplier (LM) tests. One test allows for two breaks in time series data, and the other allows for one. While testing for a unit root, they both estimate the break date(s) endogenously from the data, and also allow for break(s) under both the null and alternative hypotheses. By simulation exercises they show that their test outperforms existing ones. Besides, Glynn, Perera and Verma (2007) recently analysed existing tests and mentioned the superiority of Lee and Strazicich's test. However, they also point out that, instead of univariate models, common feature analysis of unit roots with breaks has more potential, while indicating that the development in this area is very limited. See Hendry and Massmann (2007) for a full discussion of this analysis. In short, whether it is applied to test for unit roots under structural breaks or directly to test for structural breaks, this analysis rests on the principle that there is an appropriate combination of variables, having a break in common, that does not display the breaks any longer. But this very reason also prevents co-feature analysis from always being applicable. Applying it requires using more than one series, which are suspected to have common breaks. For example, in Eksi (2009) I need to deal with breaks in the growth rate of GDP as well as breaks in the income inequality series. These variables require different types of regressions and cannot be tested together with co-breaking analysis. Alternatively, each variable could have been tested for a break independently of the other, in such a way that I could find similar series, such as consumption or investment, to be used with GDP. However, I did not have this option for the inequality index. So I chose univariate models and applied Lee and Strazicich's methodology. For further review of this literature, again please refer to Perron (2005).

    ² One way of overcoming this problem would be to take the log difference of the data, which makes the series stationary, and look for a break in the growth rate of the series. However, in my experience it would be wise to avoid data conversions that smooth the data, especially when the data is not long enough or includes outliers. This is because under these conditions break estimation tends to catch any kind of one-time deviation in the data rather than finding a change in trend or in mean. For example, in the figure above it finds a break around 1994.
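    The confusion between a unit root and a break can be seen in a deliberately stylized example. The sketch below runs the simplest Dickey-Fuller regression (a constant, no lagged differences, which is an assumption made here and not one of the tests surveyed above) on a noise-free series that is constant within two regimes and shifts once in the mean. The helper name and the series are hypothetical; the 5% reference value of about -2.86 is the usual asymptotic Dickey-Fuller critical value for the constant-only case.

```python
import math


def dickey_fuller_t(y):
    """t-statistic on b in the regression  dy_t = a + b * y_{t-1} + e_t."""
    x = y[:-1]                                        # lagged level
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # first difference
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    b = sxy / sxx
    a = dy_bar - b * x_bar
    ssr = sum((di - a - b * xi) ** 2 for xi, di in zip(x, dy))
    se_b = math.sqrt(ssr / (n - 2) / sxx)
    return b / se_b


# Stationary (here even constant) within regimes, with one shift in the mean
step_series = [0.0] * 50 + [10.0] * 50
t_stat = dickey_fuller_t(step_series)     # roughly -0.99
reject_unit_root = t_stat < -2.86         # False: the unit root is not rejected
```

    Even though the series has no stochastic trend at all, the statistic stays far from the rejection region, which is the pattern the tests with breaks discussed above are designed to address.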

    3.2 Structural Break Estimation and Testing

    Structural break tests can be divided into three categories. The Chow test belongs to the first category. It tests whether the series has a break at the tested date. The tests in the second category look for the presence of a break in the series, which may exist at any time within the sample period. Some tests in this category also reveal the most likely break date as a by-product. The tests in the last category are in fact estimators; they first estimate the unknown date of the break, then test it.

    For any type of break I have outlined above (in the conditional mean or in (un)conditional variances), the date of the break, if it exists, is unknown, so it falls into the third category. But to understand the basics of the structural break estimators that are used to find unknown break dates and test them, it is better to start with the Chow test, because unknown-date estimators that use more complicated tests basically rest on the same principles as this test.

    The Chow test asks the following: does splitting the data at the possible break point and estimating the two resulting sub-samples separately by least squares give a significantly better fit than using the whole sample at once? If the answer is yes, the null hypothesis of no break is rejected. The resulting statistic would be an F-statistic, a log likelihood ratio or a Wald statistic.

    Given this information on the Chow test, I will now write about the tests (estimators) which fall into the third category, and mention the tests in the second category when necessary. However, as there can be more than one break in the data, the estimators can be further divided into two categories: single break estimators and multiple break estimators. Actually, it is theoretically proven that consistency of the break date estimates is satisfied for single break estimators even if more than one break exists in the data (Bai 1997b; Bai and Perron 1998). This works by first finding one break in the data, and then splitting the data there and searching for new breaks in the new samples.³ However, as there is no efficiency condition for any estimator, multiple break estimators are used to get more precise estimates, i.e. to find smaller confidence intervals around the breaks, and also to increase the rate of convergence to the break dates. This increases efficiency in the estimation of the parameter values subject to the structural change. However, since efficiency is not always the concern of applied economists, I will review the literature for both types of estimators. Finally, multi-equation systems are used to get more precise estimates for any type of estimator.
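    The Chow test described above can be written down compactly for a regression on a constant and one regressor. The sketch below computes the F-version of the statistic for a known candidate date; the function names and the small data set are made up for illustration, and the model with a single regressor is an assumption chosen to keep the example dependency-free.

```python
def ssr_line(x, y):
    """Sum of squared residuals from OLS of y on a constant and x."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x, y, k):
    """Chow F-statistic for a break after observation k; both coefficients may change."""
    n, p = len(y), 2                      # p = parameters per regime (constant, slope)
    ssr_pooled = ssr_line(x, y)           # one regression on the whole sample
    ssr_split = ssr_line(x[:k], y[:k]) + ssr_line(x[k:], y[k:])
    return ((ssr_pooled - ssr_split) / p) / (ssr_split / (n - 2 * p))


# Made-up data: slope +1 in the first four observations, slope -1 in the last four
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [1.0, 0.0, 1.0, 4.0, 7.0, 4.0, 3.0, 4.0]
f_stat = chow_f(x, y, k=4)                # 25/7, about 3.57
```

    Under the null of no break and i.i.d. normal errors the statistic is compared with an F distribution with p and n - 2p degrees of freedom; this only holds when the candidate date is fixed in advance rather than chosen from the data.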

    Single Break Estimators: For the unknown break date, Quandt (1958, 1960) proposed a likelihood ratio test statistic for an unknown change point, called the Supremum (Max) test, while Andrews (1993) supplied analogous Wald and Lagrange Multiplier test statistics for it. Then Andrews and Ploberger (1994) developed Exponential (LR, Wald and LM) and Average (LR, Wald and LM) tests. These tests are calculated by using individual Chow statistics for each date in the data, except for some trimmed portion at both ends. While the Supremum test finds the date that maximizes the Chow statistic, i.e. the most likely break point, the Average and Exponential tests use all the Chow statistic values and are only informative about the existence of the break, not its date. The deficiencies of the Supremum test are, however, as follows. It only has power if one break occurs under the alternative hypothesis, and it is valid only as long as the residuals from the regression are i.i.d. This means they do not show heterogeneity before and after the break, which is also a necessary condition for the Chow test. But even Figure 1 is quite informative in suggesting that this is not always the case. A heteroskedasticity and autocorrelation robust version of this test (also called the Quandt Likelihood Ratio or Andrews-Quandt statistic, which is the estimator used most commonly in this literature) can be used, even though it still gives the most likely break date (this is so because of small-sample properties). It also suffers strongly from large confidence intervals around the break date. Finally, and again for the single break model, Bai, Lumsdaine and Stock (1998) use quasi-likelihood estimation in a VAR setting and show that, with common breaks across equations, the precision of the estimates increases with the number of equations in the system. However, their methodology obviously can only be carried out as long as the equations are expected to show a break in the same time period. This could be the case when several variables are co-integrated, as in their study (they also use output, consumption and investment data). Besides, this test is designed for a single break, and there could be more than one break date in the data, in which case these tests exhibit a non-monotonic power function (Vogelsang 1997, 1999).
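    To make the construction of the Supremum, Average and Exponential statistics concrete, the sketch below computes a Chow F-value at every admissible date of a simple mean-shift model and then aggregates them. The mean-shift model, the trimming rule, the function names and the data are assumptions made for illustration; the Exponential statistic is taken as the log of the average of exp(F/2), and no critical values are computed, since these come from nonstandard distributions.

```python
import math


def ssr_mean(seg):
    """Sum of squared deviations of a segment from its own mean."""
    mu = sum(seg) / len(seg)
    return sum((v - mu) ** 2 for v in seg)


def break_statistics(y, trim=0.15):
    """Sup-, Ave- and Exp-type statistics built from Chow F-values of a mean-shift model."""
    n = len(y)
    lo = max(1, int(trim * n))            # observations trimmed at each end
    ssr_pooled = ssr_mean(y)
    f_values = {}
    for k in range(lo, n - lo + 1):       # k = observations before the candidate break
        ssr_split = ssr_mean(y[:k]) + ssr_mean(y[k:])
        f_values[k] = (ssr_pooled - ssr_split) / (ssr_split / (n - 2))
    k_hat = max(f_values, key=f_values.get)
    ave_f = sum(f_values.values()) / len(f_values)
    exp_f = math.log(sum(math.exp(f / 2) for f in f_values.values()) / len(f_values))
    return {"sup": f_values[k_hat], "ave": ave_f, "exp": exp_f, "date": k_hat}


# Made-up series whose mean shifts from 2 to 6 after the sixth observation
y = [1.0, 3.0] * 3 + [5.0, 7.0] * 3
stats = break_statistics(y, trim=0.25)    # stats["date"] is 6
```

    Only the Supremum statistic carries a date with it; the other two summarize the whole sequence of F-values, in line with the description above.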

    Multiple Break Estimators: Perron and Qu (2006), following the work of Bai and Perron (1998, 2003a), first define a minimum segment length (in proportion to the total data). Given this constraint, they then search for the optimal partition of all possible segments of the data to obtain global minimizers of the sum of squared residuals. In this way, they obtain the location of the breaks, minimizing their objective function for any possible number of breaks. Then they sequentially test whether an additional break date significantly reduces the sum of squared errors. Their methodology covers both pure and partial structural change models. Though this method consistently identifies the break dates, Perron's (2005) comment on this procedure states that the fact that the method of estimation is based on the least-squares principle implies that, even if changes in the variance of the error terms are allowed, provided they occur at the same dates as the breaks in the parameters of the regression, such changes are not exploited to increase the precision of the break date estimators. This is due to the fact that the least-squares method imposes equal weights on all residuals. Allowing different weights, as needed when accounting for changes in variance, requires adopting a quasi-likelihood framework. Finally, Perron and Qu (2007) bring what I think of as a novel approach to structural change analysis, which I also used in my paper and which is able to find considerably small confidence intervals around the break dates.

    Perron and Qu (2007) use a multiple equation model. They first define the minimum segment length of the data that could be separated by breaks. Given this constraint, they then search for the optimal partition of all possible segments of the data which the model fits, where the objective function being maximized is a quasi-likelihood one based on normal errors. Their methodology identifies the break even if it only exists in one of the equations used.⁴ The reason for using multiple equations is, in their words: We show that the precision of the estimate of a particular break date in one equation can increase when the system includes other equations, even if the parameters of the latter are invariant across regimes. All that is needed is that the correlation between the errors be nonzero. This result is ex post fairly intuitive, because a poorly estimated break in one regression affects the likelihood function through the residual variance of that equation and also via the correlation with the rest of the regressions. Hence, by including ancillary equations without breaks, additional forces are in play to better pinpoint the break dates, for the same reason that efficiency is improved using the SUR estimator compared to ordinary least squares (OLS) equation by equation. Finally, in their work the error process is allowed to be autocorrelated as well as conditionally heteroskedastic.

    I conclude this survey by indicating that, although the Perron and Qu (2007) code met my needs and could be used in my paper, to obtain theoretical results about the consistency and limit distribution of the break dates, some conditions need to be imposed on the regressors, the errors, the set of admissible partitions and the break dates. To my knowledge, the most general set of assumptions is that in Perron and Qu (2005). The need for these assumptions arises from the following. There are many test statistics for the significance of the break dates, the optimality and distribution of which change depending on the type of the breaks and the estimation technique. But when you want to use an estimator supplied by an econometrician, you are bound to use the same test statistics as he or she does when testing the significance of the break dates.

    ³ This is due to the following result. When estimating a single break model in the presence of multiple breaks, the estimate of the break fraction will converge to one of the true break fractions, the one that is dominant in the sense that taking it into account allows the greatest reduction in the sum of squared residuals (in the case of two breaks that are equally dominant, the estimate will converge with probability 1/2 to either break).

    ⁴ I confirmed their result by running simulations.
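    The global minimization of the sum of squared residuals used by the least-squares multiple break estimators discussed above can be sketched with a small dynamic program. The version below handles only a pure mean-shift model with a given number of breaks and a minimum segment length counted in observations; it is an illustrative simplification under these assumptions, not the authors' code, and it does not include the sequential test for an additional break. All names and the data are hypothetical.

```python
def make_segment_ssr(y):
    """Return ssr(i, j): squared deviations of y[i:j] from its mean, via prefix sums."""
    s1, s2 = [0.0], [0.0]
    for v in y:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssr(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    return ssr


def optimal_breaks(y, m, h):
    """Break dates (observations before each break) minimizing total SSR with m breaks."""
    n = len(y)
    ssr = make_segment_ssr(y)
    inf = float("inf")
    # cost[r][j]: minimal SSR of y[:j] split by r breaks, every segment of length >= h
    cost = [[inf] * (n + 1) for _ in range(m + 1)]
    back = [[None] * (n + 1) for _ in range(m + 1)]
    for j in range(h, n + 1):
        cost[0][j] = ssr(0, j)
    for r in range(1, m + 1):
        for j in range((r + 1) * h, n + 1):
            for b in range(r * h, j - h + 1):   # b = position of the last break
                c = cost[r - 1][b] + ssr(b, j)
                if c < cost[r][j]:
                    cost[r][j], back[r][j] = c, b
    if cost[m][n] == inf:
        raise ValueError("sample too short for m breaks with minimum length h")
    breaks, j = [], n
    for r in range(m, 0, -1):                   # walk back through the stored breaks
        j = back[r][j]
        breaks.append(j)
    return sorted(breaks), cost[m][n]


# Made-up series with mean 2, then 6, then 2 again
y = [1.0, 3.0] * 2 + [5.0, 7.0] * 2 + [1.0, 3.0] * 2
dates, total_ssr = optimal_breaks(y, m=2, h=3)   # dates == [4, 8]
```

    Because every admissible partition is compared through the stored sub-problem costs, the result is a global minimizer rather than the outcome of a one-break-at-a-time search.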

  • References

    Andrews, D. W. K. (1993) Tests for parameter instability and structural change with unknown change point, Econometrica 61, 821-856

    Andrews, D. W. K. and Ploberger, W. (1994) Optimal Tests when a Nuisance Parameter is Present Only Under the Alternative, Econometrica, Vol. 62, No. 6 (Nov., 1994), pp. 1383-1414

    Bai, J. (1997b) Estimating multiple breaks one at a time, Econometric Theory 13, 315-352

    Bai, J., Lumsdaine, R. L. and Stock, J. H. (1998) Testing For and Dating Common Breaks in Multivariate Time Series, The Review of Economic Studies, Vol. 65, No. 3, pp. 395-432

    Bai, J. and Perron, P. (1998) Estimating and testing linear models with multiple structural changes, Econometrica 66, 47-78

    Bai, J. and Perron, P. (2003a) Computation and Analysis of Multiple Structural Change Models, Journal of Applied Econometrics 18, 1-22

    Christiano, L. J. (1992) Searching for a Break in GNP, Journal of Business and Economic Statistics, 10, pp. 237-249

    Eksi, O. (2009) Lower Volatility, Higher Inequality: Are They Related? www.econ.upf.edu/~oeksi/

    Glynn, J., Perera, N. and Verma, R. (2007) Unit Root Tests and Structural Breaks: A Survey with Applications, Journal of Quantitative Methods for Economics and Business Administration

    Hendry, D. F. and Massmann, M. (2007) Co-Breaking: Recent Advances and a Synopsis of the Literature, Journal of Business & Economic Statistics, American Statistical Association, vol. 25, pages 33-51, January

    Lee, J. and Strazicich, M. C. (2003) Minimum LM Unit Root Test with Two Structural Breaks, Review of Economics and Statistics, 85, pp. 1082-1089

    Lee, J. and Strazicich, M. C. (2004) Minimum LM Unit Root Test with One Structural Break, Working Paper, Department of Economics, Appalachian State University

    Perron, P. (1989) The great crash, the oil price shock, and the unit root hypothesis, Econometrica, 57, pp. 1361-1401

    Perron, P. (1997) Further Evidence on Breaking Trend Functions in Macroeconomic Variables, Journal of Econometrics, 80 (2), pp. 355-385

    Perron, P. (2005) Dealing with Structural Breaks, Mimeo, forthcoming in the Vol. 1 Handbook of Econometrics: Econometric Theory

    Perron, P. and Qu, Z. (2006) Estimating Restricted Structural Change Models, Journal of Econometrics 134, 373-399

    Perron, P. and Qu, Z. (2007) Estimating and Testing Structural Changes in Multivariate Regressions, Econometrica, Econometric Society, vol. 75(2), pages 459-502

    Quandt, R. E. (1958) The estimation of the parameters of a linear regression system obeying two separate regimes, Journal of the American Statistical Association 53, 873-880

    Quandt, R. E. (1960) Tests of the hypothesis that a linear regression system obeys two separate regimes, Journal of the American Statistical Association 55, 324-330

    Stock, J. H. and Watson, M. W. (2002) Has the Business Cycle Changed and Why? NBER Working Papers 9127

    Vogelsang, T. J. (1997) Wald-type tests for detecting breaks in the trend function of a dynamic time series, Econometric Theory 13, 818-849

    Vogelsang, T. J. (1999) Sources of nonmonotonic power when testing for a shift in mean of a dynamic time series, Journal of Econometrics 88, 283-299

    Zivot, E. and Andrews, D. W. K. (1992) Further evidence on the great crash, the oil price shock and the unit root hypothesis, Journal of Business and Economic Statistics 10, 251-270