Structural Change Tests under Heteroskedasticity:
Joint Estimation versus Two-Steps Methods
Pierre Perron∗
Boston University
Yohei Yamamoto†
Hitotsubashi University
August 22, 2020
Abstract
There has been a recent upsurge of interest in testing for structural changes in
heteroskedastic time series, as changes in the variance invalidate the asymptotic
distribution of conventional structural change tests. Several tests have been proposed
that are robust to general forms of heteroskedastic errors. The most popular use a
two-steps approach: first, estimate the residuals assuming no changes in the regression
coefficients; second, use the residuals to approximate the heteroskedastic asymptotic
distribution or take an entire sample average to construct a test for which the variance
process is averaged out. An alternative approach was proposed by Perron, Yamamoto
and Zhou (2020), who provided a test for changes in the coefficients allowing for changes
in the variance of the error term. We show that it transforms the variance profile into
one that effectively has very little impact on the size of the test. With respect to the
power properties, the two-steps procedures can suffer from non-monotonic power problems
in dynamic models and in static models with a correction for serial correlation in
the error. Most have power equal to size with zero-mean regressors. Even when the
two-steps tests have power, it is generally lower than that of the Perron, Yamamoto
and Zhou (2020) test.
JEL Classification Number: C14, C22
Keywords: heteroskedasticity, structural change test, non-monotonic power, variance
profile, likelihood ratio test, CUSUM test, U-statistic.
∗ Department of Economics, Boston University, 270 Bay State Rd., Boston, MA, 02215 ([email protected]).
† Department of Economics, Hitotsubashi University, 2-1 Naka, Kunitachi, Tokyo, Japan, 186-8601
1 Introduction
There has been a recent upsurge of interest in testing for structural changes in heteroskedastic
time series, as changes in the variance invalidate the asymptotic distribution of the
conventional structural change tests in a conditional mean model, such as the cumulative sum
(CUSUM) and supremum of likelihood ratio ($\sup LR$) tests. Several methods have been
proposed in both the statistics and econometrics literatures to obtain tests that are robust, at
least asymptotically, to general forms of heteroskedastic errors or variance profiles. The most
popular method can be labelled as a two-steps approach: first estimate the residuals assum-
ing the null hypothesis of no change in the regression coefficients; second, use the residuals to
approximate the heteroskedastic asymptotic distribution or take an entire sample average to
construct a test for which the variance process is averaged out. Examples of the first group
include Zhou (2013) who proposed a wild bootstrap procedure to mimic the heteroskedastic
Wiener process, Xu (2015) who used the null residuals to construct the time transformed
Wiener process, and Górecki, Horváth and Kokoszka (2018) who applied a Karhunen-Loève
expansion of the Brownian bridge with the kernels constructed using the null residuals.
Examples of the second group include Dalla, Giraitis and Phillips (2017) who provided a simple
U-statistic for which the heteroskedastic null residuals are averaged out in the entire sample
long-run variance estimate and Zhang and Wu (2019) who proposed a U-statistic avoiding
estimating the long-run variance using a feasible generalized least squares procedure. These
tests control size asymptotically under general variance profiles and are successful in pro-
viding an exact size close to the nominal level in finite samples. However, when a change
in the conditional mean caused by some parameter change occurs, the residuals assuming
the null hypothesis may be a poor approximation to the true errors. Hence, power issues
become a concern; see Pitarakis (2004) and Perron and Yamamoto (2019), who investigated
the power problem when structural changes are present in both the coefficients and the
error variance but one of them is neglected; see also Hansen (2000).¹
¹ Cavaliere and Taylor (2006, 2008) considered a wild bootstrap method for testing changes in the
persistence coefficient. Pein, Sieling, and Munk (2017) also considered structural change tests for both mean
and variance but restricted to occur at the same dates. Bock, Collilieux, Gullamon, Lebarbier and Pascal
(2020) used a model selection approach. Dalla, Giraitis and Robinson (2020) developed asymptotic theory
for a semi-parametric model with changing mean and variance, but did not propose testing methods.
Perron, Yamamoto and Zhou (2020) provided a comprehensive treatment to test for
structural changes in both the coefficients and the variance of the errors in a linear regression
model. An important ingredient of their framework is that it jointly estimates structural
changes in the regression coefficients and error variance, allowing the break dates to be
different or overlap. Related to the topic of this paper, they proposed the $\sup LR_3$ test for
changes in the coefficients allowing for changes in the variance of the error term. Because
the changes in the coefficients are jointly accounted for in constructing the test, the variance
process can be consistently estimated even under the alternative hypothesis. This suggests
that substantial power gains can be expected. Even though, under the null hypothesis of
no change in coefficients, it does allow for a one-time change in the variance, it still has the
correct size asymptotically (and in finite samples). However, a potential drawback, compared to the
tests mentioned above, is that it accounts for heteroskedasticity only partially via a few
major abrupt changes. Hence, some size distortions may be expected in the context of a
general class of variance profiles. This is something we shall address extensively.
In this paper, we carefully assess the size and power properties of the tests discussed
above. With respect to size, our findings are summarized as follows. First, the exact size of
the two-steps tests is close to nominal size. Second, although the size of the $\sup LR_3$ test
is not completely immune to general forms of variance profiles other than abrupt structural
changes, accounting for a few structural changes in the variance can flatten or detrend the
original variance process, thus considerably reducing the size distortions. Another way to state
this important fact is as follows. In many cases, the original $\sup LR$ test that ignores potential
changes in the variance of the errors is very little affected by various types of variance profiles
(e.g., increases early in the sample, periodic variations, etc.). By allowing just a few breaks
in variance (one or two in most cases), the $\sup LR_3$ test in a sense transforms the variance
profile into one that effectively has very little impact on the size of the test. It can either
completely flatten, partially flatten or detrend the variance process by rescaling the pre- and
post-break levels of variances. We provide some analytic explanations about this feature.
With respect to the power properties, we find that the two-steps methods can, in important
practical cases, partially or completely lose power. First, the two-steps methods often
suffer from a non-monotonic power problem, i.e., power going to zero as the magnitude of
the change under the alternative increases. This occurs most notably in dynamic models
with lagged dependent variables and in static models when a correction for serial correlation
in the errors is applied.² Second, many of the two-steps methods have trivial power, i.e.,
power equal to or below size, when the regressors associated with changing coefficients have
zero mean. This is because these tests simply look at the unconditional mean of the dependent
variable when estimating the variance profile since they do not account for parameter
variations in the coefficients. Hence, they are fooled and assign all the variance change in the
dependent variable to a change in the variance of the errors. Third, even when the two-steps
tests have power, they are generally less powerful than the $\sup LR_3$ test.
² This issue can be traced back to Perron (1991) and was further documented by Vogelsang (1999). Crainiceanu
and Vogelsang (2007) investigated the problem when a heteroskedasticity and autocorrelation consistent
variance is used. See Deng and Perron (2008) for some asymptotic analyses in the case of CUSUM tests,
Kim and Perron (2009) for the tests proposed by Andrews (1993) and Andrews and Ploberger (1994), and
Perron and Yamamoto (2016) for Elliott and Müller’s (2006) test.
The rest of the paper is structured as follows. Section 2 presents the model, the
hypotheses and conventional tests valid with constant variance, namely the CUSUM and $\sup LR$
tests, as they are key ingredients of the procedures to be discussed later. We also discuss the
test proposed by Perron, Yamamoto and Zhou (2020). Section 3 provides a review of five
two-steps methods designed to be robust to heteroskedasticity. These are picked as
representative of this approach. In Section 4, we investigate the size of the tests via simulations,
followed by an analytic explanation about why the $\sup LR_3$ test can have good size even
when the variance process is not an abrupt structural change. In (almost) all cases covered,
the $\sup LR_3$ test allowing for one (or sometimes two) changes in variance is remarkably robust
(exact size close to nominal size) to various forms of changes even though it is designed to
model abrupt changes. Section 5 investigates the power properties of the $\sup LR_3$ test
compared with those of the two-steps methods. Both abrupt changes and random walk
parameter variations are considered. Overall, the results show a clear power advantage for the
$\sup LR_3$ test. Section 6 provides brief concluding remarks with a discussion of the main
remaining contentious issues. An appendix contains some technical derivations.
2 Model, hypotheses, and conventional tests
We consider a linear regression model
$$y_t = x_t' \beta_t + u_t \quad \text{for } t = 1, \ldots, T, \qquad (1)$$
where $y_t$ is a scalar dependent variable, $x_t$ is a $p$-dimensional vector of regressors, $\beta_t$ is
a $p$-dimensional vector of coefficients which are possibly time varying and $u_t$ is a scalar
error term that satisfies $E(u_t) = E(u_t | x_t) = 0$ for all $t$. The goal is to test if structural
changes are present in the coefficients, so that the null hypothesis is $H_0: \beta_t = \beta$ for all $t$
versus the alternative hypothesis $H_1: \beta_t \neq \beta$ for some $t$. In particular, we allow for
the unconditional variance of the error term to be time varying such that $E(u_t^2) = \sigma_t^2$. For
simplicity, we assume $\sigma_t^2$ to be a deterministic function. Note that the model above could
be generalized to allow partial structural changes, i.e., some of the elements of $\beta_t$ are not
subject to change. Since this has no impact at all on the message we wish to convey, we
simply keep the pure structural change model (1) for simplicity of exposition.
It is useful to first discuss two leading tests for structural changes, valid with constant
variance ($\sigma_t^2 = \sigma^2$ for all $t$), as they are key ingredients of the procedures to be discussed
later. The first is the CUSUM test proposed by Brown, Durbin, and Evans (1975):
$$CUSUM = \max_{1 \le k \le T} \left[ \frac{S_k - (k/T) S_T}{\sqrt{T}} \right]^2 \Big/ \hat\sigma^2,$$
where $S_k = \sum_{t=1}^{k} \tilde u_t$ and $\tilde u_t = y_t - x_t' (\sum_{s=1}^{T} x_s x_s')^{-1} \sum_{s=1}^{T} x_s y_s$ are the OLS regression residuals
from estimating (1) under $H_0$ and $\hat\sigma^2$ is a consistent estimate of the long run variance of $u_t$
(or $2\pi$ times the spectral density function at frequency zero of $u_t$ when it is stationary)
using the entire sample given by
$$\hat\sigma^2 = \hat\gamma_0 + 2 \sum_{j=1}^{T-1} w(j/b_T)\, \hat\gamma_j, \qquad (2)$$
where $\hat\gamma_j = T^{-1} \sum_{t=j+1}^{T} \tilde u_t \tilde u_{t-j}$ for $j = 0, \ldots, T-1$, with $w(\cdot)$ some weight function and $b_T$ a
bandwidth parameter whose exact choice varies across different papers; we shall specify the
exact form used when discussing the various tests considered. The asymptotic distribution
is given by the supremum of the squared Brownian bridge:
$$CUSUM \Rightarrow \sup_{0 \le r \le 1} [W(r) - r W(1)]^2, \qquad (3)$$
where “⇒” denotes weak convergence under the Skorohod topology and $W(r)$ for $r \in [0, 1]$ is
the standard Wiener process. The CUSUM test as proposed by Brown, Durbin, and Evans
(1975) uses recursive residuals. However, Ploberger and Krämer (1992) showed that using
OLS residuals instead of recursive residuals yields a valid test, though the limit distribution
under the null hypothesis is different, namely as stated in (3) with the Brownian bridge
$W(r) - r W(1)$ instead of a scaled Wiener process. Since the papers we shall review use
the OLS based CUSUM, we only consider this version. The second test is the $\sup LR$ test
proposed by Quandt (1958, 1960) and further analyzed by Andrews (1993):
$$\sup LR = 2 \left[ \sup_{\lambda \in \Lambda_\epsilon} \log \hat L_T(T_1) - \log \tilde L_T \right],$$
where $T_1 = [T\lambda]$ with $\lambda \in (0, 1)$ a constant called the break fraction. As required, $\lambda$ is
restricted to be in a subset of the $[0, 1]$ interval and we assume that $\lambda \in \Lambda_\epsilon = [\epsilon, 1 - \epsilon]$,
called a set of permissible break fractions, with $\epsilon$ being a small positive constant called a
trimming parameter. Throughout, we set $\epsilon = 0.15$ although this does not affect any of the
qualitative results. The log-likelihood function under the alternative hypothesis is
$$\log \hat L_T(T_1) = -(T/2)(\log 2\pi + 1) - (T/2) \log \hat\sigma^2,$$
with $\hat\sigma^2 = T^{-1} \sum_{t=1}^{T} \hat u_t^2$ where $\hat u_t$ are the OLS residuals from estimating (1) assuming a
structural change in the coefficients at date $T_1$. Under $H_0$, the log-likelihood function is
$$\log \tilde L_T = -(T/2)(\log 2\pi + 1) - (T/2) \log \tilde\sigma^2,$$
with $\tilde\sigma^2 = T^{-1} \sum_{t=1}^{T} \tilde u_t^2$ where $\tilde u_t$ are the OLS regression residuals assuming no change in the
coefficients (as previously defined). The asymptotic distribution of the test is
$$\sup LR \Rightarrow \sup_{\lambda \in \Lambda_\epsilon} \frac{[\lambda W_p(1) - W_p(\lambda)]' [\lambda W_p(1) - W_p(\lambda)]}{\lambda (1 - \lambda)}, \qquad (4)$$
where $W_p(\cdot)$ is a $p$-dimensional standard Wiener process. The asymptotic null distributions
of the CUSUM and the $\sup LR$ tests are not valid when heteroskedasticity is present
in the errors. We first review, in Section 3, the test proposed by Perron, Yamamoto, and
Zhou (2020) based on the likelihood approach and, in Section 4, five different proposals using
variants of the CUSUM methodology.
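As a rough illustration of the two conventional statistics just described, the following sketch computes the OLS-based CUSUM (squared, scaled partial sums of null residuals) and the sup LR statistic for a single break in all coefficients. The function names and the use of the plain residual variance in place of a long-run variance estimate are assumptions made for the example, not taken from the paper.

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def cusum_stat(y, X, lrv=None):
    """OLS-based CUSUM: max_k [(S_k - (k/T) S_T) / sqrt(T)]^2 / sigma2."""
    T = len(y)
    u = ols_residuals(y, X)            # residuals under no change
    S = np.cumsum(u)                   # partial sums S_k
    k = np.arange(1, T + 1)
    # Assumption: without a supplied long-run variance, use mean(u^2).
    sigma2 = np.mean(u**2) if lrv is None else lrv
    bridge = (S - (k / T) * S[-1]) / np.sqrt(T)
    return np.max(bridge**2) / sigma2

def sup_lr_stat(y, X, eps=0.15):
    """sup LR for one break in all coefficients with trimming eps."""
    T, p = X.shape
    ssr0 = np.sum(ols_residuals(y, X)**2)
    lo = max(int(np.floor(eps * T)), p)          # keep both segments estimable
    hi = min(int(np.floor((1 - eps) * T)), T - p)
    best = -np.inf
    for T1 in range(lo, hi + 1):
        ssr1 = (np.sum(ols_residuals(y[:T1], X[:T1])**2)
                + np.sum(ols_residuals(y[T1:], X[T1:])**2))
        # 2 * (logL_alt - logL_null) = T * [log(ssr0/T) - log(ssr1/T)]
        best = max(best, T * (np.log(ssr0 / T) - np.log(ssr1 / T)))
    return best
```

Both functions assume homoskedastic, serially uncorrelated errors, which is exactly the setting in which the limit distributions (3) and (4) apply.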
3 Test jointly estimating structural changes in coefficients and in variance
Perron, Yamamoto, and Zhou (2020) (henceforth PYZ) recently provided a comprehensive
framework allowing various tests for multiple structural changes in the coefficients and
variance of the errors in a linear regression model. An important ingredient of their approach
is that it jointly estimates structural changes in the regression coefficients and those in the
error variance, allowing for the break dates in the two components to be different or
overlap. Hence, joint tests can be constructed. Here, the interest is in testing for changes in
the coefficients allowing for changes in the variance. Hence, the relevant test from PYZ is
their $\sup LR_3$ test. Let the coefficients $\beta_t$ be subject to $m$ structural changes at dates
$T_1^c, \ldots, T_m^c$ with $T_j^c = [T \lambda_j^c]$ for $j = 1, \ldots, m$ (where $[\cdot]$ denotes the integer part)
and let the error variance $\sigma_t^2$ be subject to $n$ structural changes at dates $T_1^v, \ldots, T_n^v$
with $T_i^v = [T \lambda_i^v]$; the break dates in coefficients and variance can be different, the same, or
overlap; the only restriction is some separation between the break dates; see below. To estimate
the break dates, the algorithm of Qu and Perron (2007), building on Bai and Perron (2003),
is used. The test is
$$\sup LR_3 = 2 \left[ \sup_{(\lambda_1^c, \ldots, \lambda_m^c;\, \lambda_1^v, \ldots, \lambda_n^v) \in \Lambda_\epsilon} \log \hat L_T(T_1^c, \ldots, T_m^c; T_1^v, \ldots, T_n^v) - \sup_{(\lambda_1^v, \ldots, \lambda_n^v) \in \Lambda_{v,\epsilon}} \log \tilde L_T(T_1^v, \ldots, T_n^v) \right],$$
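where the log-likelihood function under the alternative hypothesis is
$$\log \hat L_T(T_1^c, \ldots, T_m^c; T_1^v, \ldots, T_n^v) = -(T/2)(\log 2\pi + 1) - \sum_{i=1}^{n+1} [(T_i^v - T_{i-1}^v)/2] \log \hat\sigma_i^2, \qquad (5)$$
where $\hat\sigma_i^2 = (T_i^v - T_{i-1}^v)^{-1} \sum_{t=T_{i-1}^v+1}^{T_i^v} (y_t - x_t' \hat\beta_j)^2$ for $i = 1, \ldots, n+1$ with, for $j = 1, \ldots, m+1$,
$$\hat\beta_j = \left( \sum_{t=T_{j-1}^c+1}^{T_j^c} (x_t x_t' / \hat\sigma_t^2) \right)^{-1} \sum_{t=T_{j-1}^c+1}^{T_j^c} (x_t y_t / \hat\sigma_t^2).$$
The log-likelihood under the null hypothesis is
$$\log \tilde L_T(T_1^v, \ldots, T_n^v) = -(T/2)(\log 2\pi + 1) - \sum_{i=1}^{n+1} [(T_i^v - T_{i-1}^v)/2] \log \tilde\sigma_i^2,$$
where $\tilde\sigma_i^2 = (T_i^v - T_{i-1}^v)^{-1} \sum_{t=T_{i-1}^v+1}^{T_i^v} (y_t - x_t' \tilde\beta)^2$ for $i = 1, \ldots, n+1$ with
$\tilde\beta = ( \sum_{t=1}^{T} (x_t x_t' / \tilde\sigma_t^2) )^{-1} \sum_{t=1}^{T} (x_t y_t / \tilde\sigma_t^2)$ the coefficient estimate assuming no structural change.
We let $\hat\sigma_t^2$ denote $\hat\sigma_i^2$ if $T_{i-1}^v + 1 \le t \le T_i^v$, likewise for $\tilde\sigma_t^2$. The set $\Lambda_\epsilon$ is the union of the
permissible break fractions for the coefficients and variance breaks and $\Lambda_{v,\epsilon}$ that for the
variance break fractions. Hence,
$$\Lambda_\epsilon = \{ (\lambda_1^c, \ldots, \lambda_m^c, \lambda_1^v, \ldots, \lambda_n^v) ;\ \text{for } (\lambda_1, \ldots, \lambda_K) = (\lambda_1^c, \ldots, \lambda_m^c) \cup (\lambda_1^v, \ldots, \lambda_n^v),\ |\lambda_{j+1} - \lambda_j| \ge \epsilon\ (j = 1, \ldots, K-1),\ \lambda_1 \ge \epsilon,\ \lambda_K \le 1 - \epsilon \}$$
and
$$\Lambda_{v,\epsilon} = \{ (\lambda_1^v, \ldots, \lambda_n^v) ;\ |\lambda_{j+1}^v - \lambda_j^v| \ge \epsilon\ (j = 1, \ldots, n-1),\ \lambda_1^v \ge \epsilon,\ \lambda_n^v \le 1 - \epsilon \},$$
with $K$ the total number of break dates in coefficients and variance. Note that this test encompasses
the $\sup LR$ test, setting $m = 1$ and $n = 0$. PYZ showed that a bound on the limit distribution
is the asymptotic distribution in Bai and Perron (1998). Namely,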
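$$\sup LR_3 \Rightarrow \sup_{(\lambda_1^c, \ldots, \lambda_m^c) \in \Lambda_{c|v,\epsilon}} \sum_{j=1}^{m} \frac{\| \lambda_j^c W_p(\lambda_{j+1}^c) - \lambda_{j+1}^c W_p(\lambda_j^c) \|^2}{\lambda_j^c \lambda_{j+1}^c (\lambda_{j+1}^c - \lambda_j^c)} \le \sup_{(\lambda_1, \ldots, \lambda_m) \in \Lambda_{c,\epsilon}} \sum_{j=1}^{m} \frac{\| \lambda_j W_p(\lambda_{j+1}) - \lambda_{j+1} W_p(\lambda_j) \|^2}{\lambda_j \lambda_{j+1} (\lambda_{j+1} - \lambda_j)},$$
where
$$\Lambda_{c|v,\epsilon} = \{ (\lambda_1^c, \ldots, \lambda_m^c) ;\ \text{for } (\lambda_1, \ldots, \lambda_K) = (\lambda_1^c, \ldots, \lambda_m^c) \cup (\lambda_1^{v0}, \ldots, \lambda_n^{v0}),\ |\lambda_{j+1} - \lambda_j| \ge \epsilon\ (j = 1, \ldots, K-1),\ \lambda_1 \ge \epsilon,\ \lambda_K \le 1 - \epsilon \}$$
and $\Lambda_{c,\epsilon} = \{ (\lambda_1, \ldots, \lambda_m) ;\ |\lambda_{j+1} - \lambda_j| \ge \epsilon\ (j = 1, \ldots, m-1),\ \lambda_1 \ge \epsilon,\ \lambda_m \le 1 - \epsilon \}$. This implies
that, theoretically, some conservative size distortions are present even asymptotically. They
show, however, that they are very minor and of no important consequences for inference.
Note that this test is designed to account for abrupt structural changes in the error variance,
not for general forms of heteroskedasticity. We shall address this problem carefully later. In
anticipation of the results, it will be shown that the $\sup LR_3$ test is surprisingly robust to
a wide range of possible time variation in the variance $\sigma_t^2$.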
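When allowing and correcting for serial correlation in the errors, we use the following
robust Wald type statistic: $\sup_{(\lambda_1^c, \ldots, \lambda_m^c) \in \Lambda_{c,\epsilon}} W_3(m, n \mid m = 0, n)$, where
$$W_3(m, n \mid m = 0, n) = \hat\beta' R' (R \hat V(\hat\beta) R')^{-1} R \hat\beta, \qquad (6)$$
with $\hat\beta = (\hat\beta_1', \ldots, \hat\beta_{m+1}')'$ the QMLE of $\beta$ under a given partition of the sample, $R$ the
conventional matrix such that $(R\beta)' = (\beta_1' - \beta_2', \ldots, \beta_m' - \beta_{m+1}')$ and $\hat V(\hat\beta)$ an estimate of the
covariance matrix of $\hat\beta$ robust to serial correlation and heteroskedasticity, i.e., a consistent
estimate of
$$V(\hat\beta) = \operatorname{plim}_{T \to \infty} (\bar X' \bar X / T)^{-1}\, \Omega\, (\bar X' \bar X / T)^{-1}, \quad \text{where } \Omega = \lim_{T \to \infty} E(\bar X' \bar U \bar U' \bar X / T),$$
with $\bar X = \mathrm{diag}(\bar X_1, \ldots, \bar X_{m+1})$, $\bar X_j = (\bar x_{T_{j-1}^c+1}, \ldots, \bar x_{T_j^c})'$, $\bar U = (\bar u_1, \ldots, \bar u_T)'$, $\bar x_t = (x_t / \sigma_i)$ and
$\bar u_t = (u_t / \sigma_i)$, for $T_{i-1}^{v0} < t \le T_i^{v0}$ ($i = 1, \ldots, n+1$). In practice, the computation of this
test can be very involved. Following Bai and Perron (1998), we first use the dynamic
programming algorithm to get the break points corresponding to the global maximizers of the
likelihood function (5), then plug the estimates into (6) to construct the test. This will not
affect the consistency of the test since the break fractions are consistently estimated.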
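To make the structure of the joint log-likelihood (5) concrete, here is a minimal sketch that evaluates it for given break dates, with regime-specific coefficients estimated by weighted least squares and regime-specific variances from the resulting residuals. The iteration between the two steps, the function names and the fixed number of iterations are assumptions for illustration; the search over break dates (the dynamic programming step) is not shown.

```python
import numpy as np

def regime_index(T, breaks):
    """Map t = 1..T to regime numbers 0..len(breaks), given sorted break dates."""
    return np.searchsorted(np.asarray(breaks), np.arange(1, T + 1), side="left")

def joint_loglik(y, X, coef_breaks=(), var_breaks=(), n_iter=20):
    """Gaussian log-likelihood with coefficient and variance regimes (dates given)."""
    T = len(y)
    c_reg = regime_index(T, coef_breaks)   # coefficient regime of each t
    v_reg = regime_index(T, var_breaks)    # variance regime of each t
    sig2_t = np.ones(T)                    # start from equal weights (OLS)
    for _ in range(n_iter):
        resid = np.empty(T)
        for j in np.unique(c_reg):
            m = c_reg == j
            w = 1.0 / np.sqrt(sig2_t[m])   # weighted least squares within regime j
            beta, *_ = np.linalg.lstsq(X[m] * w[:, None], y[m] * w, rcond=None)
            resid[m] = y[m] - X[m] @ beta
        # Variance of each variance regime from the current residuals.
        sig2_i = np.array([np.mean(resid[v_reg == i] ** 2)
                           for i in np.unique(v_reg)])
        sig2_t = sig2_i[v_reg]
    n_i = np.bincount(v_reg)               # regime lengths T_i - T_{i-1}
    return -(T / 2) * (np.log(2 * np.pi) + 1) - np.sum(n_i / 2 * np.log(sig2_i))
```

With no breaks at all the function reduces to the usual concentrated Gaussian log-likelihood; the difference between two calls, one with and one without coefficient breaks, gives twice the likelihood ratio after multiplying by 2.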
4 Two-steps methods
We consider five methods recently proposed in this growing literature. They all take what
we call a two-steps approach. In the first step, they estimate the regression residuals under
the null hypothesis of no change in the coefficients. In the second step, they use these
residuals to construct a correction so that the asymptotic distribution of the test is valid
under heteroskedasticity. Some take an entire sample average to construct a test in which
the variance process is averaged out. In no way do we dispute their theoretical results and
the validity of their tests under the null hypothesis.
The works considered are the following. Zhou (2013) proposed a wild bootstrap
procedure for partial sums of the null residuals to mimic the heteroskedastic Wiener process;
Xu (2015) used the null residuals to construct the time indicator for the heteroskedastic
Wiener process; Górecki, Horváth and Kokoszka (2018) applied a Karhunen-Loève
expansion to (the square of) the Brownian bridge with the kernels constructed using the
residuals under the null hypothesis. The last two papers use a two-steps approach to
average the null residuals: Dalla, Giraitis and Phillips (2017) provided a simple U-statistic
in which the heteroskedastic null residuals are averaged out in the entire sample using a
long run variance estimate; Zhang and Wu (2019) proposed a U-statistic avoiding the use
of a long-run variance estimate using a Feasible Generalized Least Squares (FGLS)
procedure. The estimate of the variance (when assuming serially uncorrelated errors) or the
long-run variance (when correcting for possible serial correlation) using the entire sample is
constructed by
$$\hat\sigma^2 = \begin{cases} \hat\gamma_0 & \text{when no serial correlation is accounted for} \\ \hat\gamma_0 + 2 \sum_{j=1}^{T-1} w(j/b_T)\, \hat\gamma_j & \text{when serial correlation is accounted for} \end{cases} \qquad (7)$$
where $\hat\gamma_j = T^{-1} \sum_{t=j+1}^{T} \tilde u_t \tilde u_{t-j}$ for $j = 0, 1, \ldots, T-1$, with $\tilde u_t$ being the regression residuals
under the null hypothesis of no structural change in the coefficients. When it is constructed
using a subsample of $\tilde u_t$ for $t = 1, \ldots, k$, it is denoted by $\hat\sigma_k^2$. Unless otherwise stated, we use
the Quadratic Spectral kernel
$$w(x) = [25 / (12 \pi^2 x^2)] \left( \sin(6\pi x/5) / (6\pi x/5) - \cos(6\pi x/5) \right)$$
with the bandwidth selected via the AR(1) approximation of Andrews (1991), namely
$$b_T = 1.3221 \left[ 4 \tilde\rho^2 T / (1 - \tilde\rho)^4 \right]^{1/5}, \qquad (8)$$
where $\tilde\rho$ is the OLS estimate in the regression $\tilde u_t = \tilde\rho \tilde u_{t-1} + \tilde e_t$. When the authors impose
a specific choice for $b_T$, such as the correction of Andreou (2008) used by Xu (2015) and the
upper bound of $1.25 T^{1/5}$ used by Zhang and Wu (2019), we incorporate them.
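The long-run variance estimate with the Quadratic Spectral kernel and the AR(1)-based bandwidth rule of Andrews (1991) can be sketched as follows. The function names are hypothetical, and the fallback to the plain variance when the bandwidth is zero is an assumption added to keep the example well defined.

```python
import numpy as np

def qs_kernel(x):
    """Quadratic Spectral kernel; equals 1 at x = 0."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.ones_like(x)
    nz = x != 0
    z = 6 * np.pi * x[nz] / 5
    out[nz] = 25 / (12 * np.pi**2 * x[nz]**2) * (np.sin(z) / z - np.cos(z))
    return out

def andrews_ar1_bandwidth(u):
    """1.3221 * [4 rho^2 T / (1 - rho)^4]^(1/5), with rho from an AR(1) fit."""
    T = len(u)
    rho = np.dot(u[1:], u[:-1]) / np.dot(u[:-1], u[:-1])
    return 1.3221 * (4 * rho**2 * T / (1 - rho)**4) ** 0.2

def long_run_variance(u, bandwidth=None):
    """gamma_0 + 2 * sum_j w(j / b) * gamma_j using all T - 1 autocovariances."""
    T = len(u)
    if bandwidth is None:
        bandwidth = andrews_ar1_bandwidth(u)
    gamma = np.array([np.dot(u[j:], u[:T - j]) / T for j in range(T)])
    if bandwidth <= 0:                 # assumption: no correction in this case
        return gamma[0]
    w = qs_kernel(np.arange(1, T) / bandwidth)
    return gamma[0] + 2 * np.sum(w * gamma[1:])
```

Passing the null residuals as `u` gives the full-sample estimate; passing only the first `k` residuals would give a subsample analogue, up to the normalization chosen for the autocovariances.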
4.1 Zhou (2013)
Zhou (2013) considered a test statistic based on the CUSUM test without estimating the
subsample long run variance, given by:
$$CUSUM_Z = \max_{1 \le k \le T} |S_k - (k/T) S_T| / \sqrt{T},$$
where $S_k = \sum_{t=1}^{k} \tilde u_t$. Its asymptotic distribution involves a heteroskedastic Wiener process
and is approximated by $B$ replications of the bootstrapped statistic $\Phi_k^{(b)}$ ($b = 1, 2, \ldots, B$):
$$\Phi_k^{(b)} = [m (T - m + 1)]^{-1/2} \sum_{j=1}^{k} (S_{j,m} - (m/T) S_T)\, R_j^{(b)}$$
for $b = 1, \ldots, B$, where $S_{j,m} = \sum_{t=j}^{j+m-1} \tilde u_t$ and $R_j^{(b)} \sim N(0, 1)$ is an external random
variable independent of all variables entering the construction of the test. We set the
bandwidth $m = 1$ when no serial correlation in the errors is accounted for and use the selection rule
(8) when it is accounted for. We then compute
$$M^{(b)} = \max_{m+1 \le k \le T-m+1} \left| \Phi_k^{(b)} - \frac{k}{T - m + 1} \Phi_{T-m+1}^{(b)} \right|$$
for $b = 1, \ldots, B$. With the ordered statistics $M_{(1)} \le M_{(2)} \le \cdots \le M_{(B)}$, the $p$-value is computed
by $1 - b^*/B$ where $b^* = \max\{ b : M_{(b)} \le CUSUM_Z \}$.
4.2 Xu (2015)
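Xu (2015) also proposed to use the CUSUM test³
$$CUSUM_X = \max_{1 \le k \le T} |S_k - (k/T) S_T| / (\hat\sigma \sqrt{T}),$$
where $S_k = \sum_{t=1}^{k} \tilde u_t$ and $\hat\sigma$ is constructed by the square root of (7). The asymptotic
distribution is approximated by an $N$-discrete-steps heteroskedastic Wiener process $W^{(b)}(\eta(r))$
with $r = 0, 1/N, 2/N, \ldots, 1$ such that
$$M^{(b)} = \max_{0 \le r \le 1} \left| W^{(b)}(\eta(r)) - r\, W^{(b)}(1) \right|$$
with $W^{(b)}(\eta(r)) = N^{-1/2} \sum_{i=1}^{[N \eta(r)]} e_i^{(b)}$ where $e_i^{(b)} \sim N(0, 1)$ for $i = 1, \ldots, N$ and the time
indicator is specified by $\eta(r) = 0$ if $r = 0$, $\eta(r) = 1$ if $r = 1$ and
$$\eta(r) = \left( \sum_{t=1}^{T} \tilde u_t^2 \right)^{-1} \left( \sum_{t=1}^{[Tr]} \tilde u_t^2 \right) \quad \text{if } 0 < r < 1.$$
With the ordered statistics $M_{(1)} \le M_{(2)} \le \cdots \le M_{(B)}$, the $p$-value is computed by $1 - b^*/B$
where $b^* = \max\{ b : M_{(b)} \le CUSUM_X \}$. When serial correlation in the errors is accounted for, as
Xu (2015) suggested, we use the minimum of $\tilde b_T$, obtained from the selection rule (8), and an
upper bound involving the constant 0.85; for more details, see Andreou (2008).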
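The simulation of the time-transformed Wiener process described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the default grid size, the number of draws and the use of the plain residual variance for the scaling are choices made for the example, and the input `u` is assumed to be the vector of null residuals.

```python
import numpy as np

def xu_cusum_pvalue(u, sigma2=None, n_steps=1000, n_boot=499, seed=0):
    """CUSUM statistic and simulated p-value using the time indicator eta(r)."""
    rng = np.random.default_rng(seed)
    T = len(u)
    if sigma2 is None:
        sigma2 = np.mean(u**2)                  # assumption: no HAC correction
    S = np.cumsum(u)
    k = np.arange(1, T + 1)
    stat = np.max(np.abs(S - (k / T) * S[-1])) / np.sqrt(sigma2 * T)

    # eta(r): share of the cumulated squared residuals up to [T r],
    # evaluated on the grid r = 0, 1/N, ..., 1.
    r = np.arange(n_steps + 1) / n_steps
    cum_u2 = np.concatenate(([0.0], np.cumsum(u**2)))
    eta = cum_u2[np.floor(T * r).astype(int)] / cum_u2[-1]
    idx = np.floor(n_steps * eta).astype(int)   # [N eta(r)]

    draws = np.empty(n_boot)
    for b in range(n_boot):
        e = rng.standard_normal(n_steps)
        W = np.concatenate(([0.0], np.cumsum(e))) / np.sqrt(n_steps)
        path = W[idx]                           # W(eta(r)) on the grid
        draws[b] = np.max(np.abs(path - r * W[-1]))
    b_star = np.sum(draws <= stat)              # number of draws not exceeding stat
    return stat, 1 - b_star / n_boot
```

Counting the simulated maxima that do not exceed the statistic is equivalent to locating the statistic among the ordered draws.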
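³ Xu (2015) also proposed a version of the sup-type test; however, its properties are essentially the same
as the CUSUM test considered here.
4.3 Górecki, Horváth and Kokoszka (2018)
Górecki, Horváth and Kokoszka (2018) proposed to use the statistic
$$U_{GHK} = T^{-1} \sum_{k=1}^{T} \left[ \frac{S_k - (k/T) S_T}{\sqrt{T}} \right]^2,$$
where $S_k = \sum_{t=1}^{k} \tilde u_t$. The asymptotic distribution is approximated by (the truncated version
of) the Karhunen-Loève expansion of the squared Brownian bridge
$$\int_{r=0}^{1} \left[ W(r) - r \int_{s=0}^{1} dW(s) \right]^2 dr \approx \sum_{j=1}^{J} (N_j^{(b)} \lambda_j)^2,$$
where $N_j^{(b)} \sim N(0, 1)$ is an external random variable. The kernel $\lambda_j$ for $j = 1, \ldots, J$ is
obtained as the largest eigenvalues of a $T \times T$ matrix whose $(k, l)$ element is
$$c(k, l) = \hat\sigma_{\min(k,l)} - (l/T)\, \hat\sigma_k - (k/T)\, \hat\sigma_l + (k/T)(l/T)\, \hat\sigma,$$
where $\hat\sigma$ is constructed by the square root of (7) and $\hat\sigma_{\min(k,l)}$, $\hat\sigma_k$, and $\hat\sigma_l$ are the subsample
estimates. Using draws for $N_j^{(b)}$ ($j = 1, \ldots, J$), the Brownian bridge is approximated via $B$
replications of $U^{(b)} = \sum_{j=1}^{J} (N_j^{(b)} \lambda_j)^2$. With the ordered statistics $U_{(1)} \le U_{(2)} \le \cdots \le U_{(B)}$, the
$p$-value is computed by $1 - b^*/B$ where $b^* = \max\{ b : U_{(b)} \le U_{GHK} \}$.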
4.4 Dalla, Giraitis and Phillips (2017)
Dalla, Giraitis and Phillips (2017) considered the same U-statistic as Górecki, Horváth and
Kokoszka (2018), except that they scale it by the full sample estimate $\hat\sigma^2$ in order to cancel
the effect of the variance profile, so that
$$U_{DGP} = T^{-1} \sum_{k=1}^{T} \left[ \frac{S_k - (k/T) S_T}{\sqrt{T}} \right]^2 \Big/ \hat\sigma^2,$$
where $S_k = \sum_{t=1}^{k} \tilde u_t$ and $\hat\sigma^2$ is constructed by (7). The asymptotic distribution is given by
$\int_{r=0}^{1} [ W(r) - r \int_{s=0}^{1} dW(s) ]^2 dr$.
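The scaled statistic just described is an average of squared, normalized partial sums of the null residuals. A minimal sketch, in which the function name and the default scaling by the plain residual variance are assumptions for illustration:

```python
import numpy as np

def scaled_u_stat(u, sigma2=None):
    """T^{-1} * sum_k [(S_k - (k/T) S_T) / sqrt(T)]^2 / sigma2 for residuals u."""
    u = np.asarray(u, dtype=float)
    T = len(u)
    S = np.cumsum(u)                          # partial sums S_k
    k = np.arange(1, T + 1)
    bridge = (S - (k / T) * S[-1]) / np.sqrt(T)
    if sigma2 is None:
        sigma2 = np.mean(u**2)                # assumption: no HAC correction
    return np.mean(bridge**2) / sigma2
```

Because numerator and denominator are both quadratic in the residuals, the statistic is invariant to rescaling all residuals by a common constant, which is the sense in which a constant variance level is cancelled.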
4.5 Zhang and Wu (2019)
Zhang and Wu (2019) considered a U-statistic given by:
$$U = T^{-2} \sum_{t=b_T+1}^{T} \sum_{s=b_T+1,\, s \neq t}^{T} w(d_{ts})\, v_t' v_s,$$
where $v_t = x_t \tilde u_t$ and $d_{ts} = |t - s| / b_T$, when no serial correlation is accounted for. When serial
correlation in the errors is accounted for, the residuals are constructed by $\tilde u_t = y_t^* - x_t^{*\prime} \hat\beta$, where
$\hat\beta = ( \sum_{t=2}^{T} x_t^* x_t^{*\prime} )^{-1} ( \sum_{t=2}^{T} x_t^* y_t^* )$ with Cochrane-Orcutt (1949) type transformed variables
$y_t^* = y_t - \hat\rho y_{t-1}$ and $x_t^* = x_t - \hat\rho x_{t-1}$. In this case, $\hat\rho$ is obtained as the OLS estimator
of an AR(1) model⁴ applied to the residuals $u_t^*$. Note that $u_t^*$ is obtained by accounting for
local coefficients variations using the specification:
$$u_t^* = y_t - x_t' \hat\beta_0 - x_t' \left( (t_0 - t) / T^{3/5} \right) \hat\beta_1.$$
The test statistic is $Z = U / \hat\sigma_U$, where
$$\hat\sigma_U^2 = 2 T^{-2} \sum_{t=b_T+1}^{T} \sum_{s=b_T+1,\, s \neq t}^{T} [ w(d_{ts}) (v_t' v_s) ]^2$$
is an estimate of the long-run variance of $U$, whether or not serial correlation in the error
is accounted for. Then, $Z$ asymptotically follows a standard normal distribution under
$H_0$. They recommend the bandwidth $b_T$ to be set at the minimum of the value given by
the rule (8) and $1.25 T^{1/5}$ to avoid inflating it.
⁴ Note that one could use an AR($p$) model with the lag order estimated via information criteria. Doing
so provides almost identical results; hence, we only report results with the AR(1) specification.
5 Simulation design
The simulation design consists of the specifications for the variance profiles, discussed in
Section 5.1, and the conditional mean regression model, discussed in Section 5.2. Note that
the $\sup LR_3$ test depends on the prior specification of the number of changes in variance.
This is undesirable; hence, for practical applications, we use the $\sup LR_{10}(1, n+1 \mid 1, n)$
test of PYZ, which tests the null hypothesis of $n$ breaks in variance versus the alternative
hypothesis of $n+1$ breaks. The relevant limit distribution is presented in PYZ. We start with
$n = 1$ and continue until a non-rejection occurs subject to a maximal value of $n = 3$. Hence,
we basically search for the “best” number of breaks in variance within the set $n \in \{1, 2, 3, 4\}$.
This is enough for all cases considered. Typically, the selected values are $n = 1$ or $n = 2$.
Ignoring the value $n = 0$ allows us to avoid problematic cases for which the test would not
reject when two breaks in variance are needed; see Bai and Perron (2006) for a discussion.
In any event, even if no break in variance is present and one allows a break, the size and
power of the $\sup LR_3$ test are unaffected; see Perron and Yamamoto (2019). This version
of the test is denoted by $\sup LR_3^*$.
5.1 Variance profiles
To assess the finite sample size and power of the tests, the data are generated from model
(1) and the errors are drawn independently from a $N(0, 1)$ multiplied by $\sigma_t$. We
experimented with many specifications for the temporal pattern of $\sigma_t^2$, and choose the following
four functions as the most illustrative. They are sufficient to convey the relevant message.
Throughout, the parameter $\delta$ captures the magnitude of changes. The first case (VC1) has
a single structural change at date $T_0 = [\lambda_0 T]$, so that $\sigma_t^2 = 1$ for $t \le [\lambda_0 T]$ and $\sigma_t^2 = (1 + \delta)^2$
for $t > [\lambda_0 T]$, where we set $\lambda_0 = 0.3$ and $0.75$. The second case (VC2) considers smooth
changes in the form of a trigonometric function $\sigma_t^2 = 2 \sin(\delta \pi t / T)^2$, where we set $\delta = 2$
and $4$. The third case (VC3) considers trending variance; the first growing linearly (VC3L)
so that $\sigma_t^2 = \delta (t/T)$ and the second having a quadratic trend (VC3Q) with $\sigma_t^2 = [\delta (t/T)]^2$.
The fourth case (VC4) has a short but large spike with length 5% of the sample so that the
variance is given by $\sigma_t^2 = (1 + \delta)^2$ for $[(\lambda_0 - 0.025) T] \le t \le [(\lambda_0 + 0.025) T]$, and $\sigma_t^2 = 1$
otherwise. We set $\lambda_0 = 0.3$, $0.5$, $0.7$.
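The two abrupt variance profiles (the single break VC1 and the spike VC4) are simple enough to write down directly. The sketch below generates them and draws errors as standard normal variates multiplied by the standard deviation; the function and argument names are hypothetical, and the smooth and trending profiles are left out of the example.

```python
import numpy as np

def variance_profile(kind, T, delta, lam0=0.3):
    """Return the variance path for t = 1..T under VC1 (break) or VC4 (spike)."""
    t = np.arange(1, T + 1)
    sig2 = np.ones(T)
    if kind == "VC1":
        # Variance 1 up to [lam0 * T], (1 + delta)^2 afterwards.
        sig2[t > int(lam0 * T)] = (1 + delta) ** 2
    elif kind == "VC4":
        # Spike of width 5% of the sample centred at lam0.
        lo, hi = int((lam0 - 0.025) * T), int((lam0 + 0.025) * T)
        sig2[(t >= lo) & (t <= hi)] = (1 + delta) ** 2
    else:
        raise ValueError("only VC1 and VC4 are sketched here")
    return sig2

def draw_errors(sig2, rng):
    """Independent N(0, 1) draws scaled by the standard deviation path."""
    return np.sqrt(sig2) * rng.standard_normal(len(sig2))
```

For example, `draw_errors(variance_profile("VC1", 100, 1.0, lam0=0.75), np.random.default_rng(0))` would give one error sequence with a late fourfold increase in variance.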
5.2 Regression models
We consider the following four linear regression models, all specified by (1) with
$u_t / \sigma_t \sim N(0, 1)$. 1) The “Mean Model” uses $x_t = 1$ for all $t$ and the tests are constructed
without accounting for serial correlation in $u_t$ (i.e., assuming correctly that the errors are
uncorrelated). 2) The “Mean Model (HAC)” also uses $x_t = 1$ for all $t$ but the tests are
constructed allowing for potential serial correlation in the errors. They involve the long-run
(or HAC) variance estimate with the idiosyncratic modifications proposed by the authors
as stated in the previous section; for the $\sup LR_3^*$ test, we use Kejriwal’s (2009) hybrid
correction. The fact that there is no serial correlation in the true errors is inconsequential.
In fact, with serially correlated errors, the results reported below would simply be exacerbated
(i.e., the power would be even lower). 3) The “Zero-Mean Regressor” case sets $x_t = [1, z_t]'$ where $z_t$ is drawn from
a standard normal distribution independent of $u_t$ and the coefficients are $\beta_t = [0, \beta_{2t}]'$. 4)
The “Dynamic Model” includes a lagged dependent variable in $x_t$ so that $x_t = [1, y_{t-1}]'$ and
$\beta_t = [\beta_{1t}, 0]'$. In order to treat all methods on an equal basis, the tests are constructed for a
pure structural change setting, i.e., allowing all elements of $\beta_t$ to change. The sample size is
set to $T = 100, 300$ for size and $T = 100$ for power. The size and the power are computed
using a 5% nominal level based on 1,000 Monte Carlo replications. The number of bootstrap
repetitions is $B = 499$ for the tests of Zhou (2013), Xu (2015), and Górecki, Horváth, and
Kokoszka (2018).
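A data-generating sketch for three of the designs above (mean model, zero-mean regressor and dynamic model) may help fix ideas. The function name, the single time-varying coefficient path `beta_path` and the zero initial value for the lagged dependent variable are assumptions made for illustration.

```python
import numpy as np

def simulate_model(model, T, sig2, beta_path=None, rng=None):
    """Generate (y, X) for one design, given a variance path sig2 of length T."""
    rng = np.random.default_rng() if rng is None else rng
    u = np.sqrt(sig2) * rng.standard_normal(T)       # heteroskedastic errors
    b = np.zeros(T) if beta_path is None else np.asarray(beta_path, dtype=float)
    if model == "mean":
        X = np.ones((T, 1))
        y = b + u                                    # time-varying mean
    elif model == "zero_mean_regressor":
        z = rng.standard_normal(T)                   # regressor with zero mean
        X = np.column_stack([np.ones(T), z])
        y = b * z + u                                # intercept fixed at zero
    elif model == "dynamic":
        y = b + u                                    # true lag coefficient is zero
        ylag = np.concatenate(([0.0], y[:-1]))       # assumption: y_0 = 0
        X = np.column_stack([np.ones(T), ylag])
    else:
        raise ValueError("unknown model")
    return y, X
```

Size experiments correspond to a constant `beta_path`, power experiments to a path that shifts at some date or follows a random walk.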
6 Size properties
For the size properties of the tests, we first present results from simulation experiments in
Section 6.1. These will show that all tests, including the $\sup LR_3^*$, have exact size close to
the nominal 5% level. This may seem surprising for the $\sup LR_3^*$ since it is designed for
abrupt changes in variance. Some analytical explanations are provided in Section 6.2.
6.1 Simulation results
We first examine the finite sample size, focusing on the effect of structural changes in variance
on the sizes of the various tests. We also present the results for the conventional $\sup LR$ and
CUSUM tests to assess whether a particular type of variance change causes size distortions
to tests that do not account for variance changes; i.e., to see if there is anything to correct
in practice. The results for a heteroskedasticity robust version of Xu’s (2015) test are also
presented. The results of the other two-steps tests are in line with Xu (2015) with the
exception of the test of Zhang and Wu (2019), which yields conservative size distortions in
almost all cases, as mentioned in the original paper. Hence, these are not reported and the
results reported for Xu’s (2015) test should be viewed as representative of the five tests
robust to heteroskedasticity discussed in the previous section. We refer to it as the robust CUSUM test.
Table 1 presents the exact size for nominal 5% size tests using the four regression
specifications. Overall, we obtain similar results across the regression specifications. Hence, we
focus on the results for the Mean Model and highlight the main differences for the other model
specifications afterwards. When the variance is constant (VC0), all the tests have an exact size
close to the nominal level. This is also the case when the variance process follows a smooth
trigonometric function (VC2) for which the size distortions of the conventional tests are
negligible, meaning that there is nothing to correct. In fact, the $\sup LR$ test has size close
to 5% in most cases, along with the CUSUM and robust CUSUM tests. The same applies to VC1 with an
early sharp increase. These are all cases where the changes in variance cause no (or little)
size distortions.
The effect of heteroskedasticity is more pronounced when the variance has an abrupt late
increase (VC1 with the break fraction at 0.75) or a trend (VC3), in which case the standard
tests have liberal size distortions. For the case of a spike (VC4) the tests are conservative but
still not far from the exact size of the robust CUSUM test. The size improves somewhat when $T = 300$.
For VC1, the size of the $\sup LR$ test becomes close to 30% when $\lambda_0 = 0.75$; however, this
distortion is completely fixed when the $\sup LR_3^*$ test is used. This is expected as the latter
can account for the structural breaks in the error variance. For VC3L, the conventional
$\sup LR$ and CUSUM tests show size distortions and they become worse with VC3Q.
What is interesting, and somewhat surprising at first sight, is that the $\sup LR_3^*$ test has
a good size in most cases even though the variance process is far from being one of abrupt
structural change. For VC3Q, some liberal size distortions are present with $T = 100$, but
the size is near 5% when $T = 300$. The size of the $\sup LR_3^*$ test is also closer to 5% (less
conservative distortions) when $T = 300$.
When using a HAC variance estimate, the size distortions are considerably reduced for
$\sup LR_3^*$ for VC3Q. For VC4, the $\sup LR$ and the CUSUM tests show conservative size
distortions unless the dynamic model is used. With a dynamic model, they show liberal size
distortions when $T = 100$. However, the size is close to 5% when $T = 300$. For the case of
a dynamic model with a lagged dependent variable, the size distortions for the $\sup LR$ test are
important in all cases. The CUSUM test tends to be conservative, except for a one-time
increase in variance late in the sample. However, the $\sup LR_3^*$ test again has an exact size
very close to 5%. For VC4, it has liberal size distortions when $T = 100$. However, the size
is close to 5% when $T = 300$. The robust CUSUM test has an exact size close to the nominal level, except
for VC4, in which case it is slightly conservative.
The bottom line is the following. The standard sup and tests can have
substantial size distortions, especially in models with a lagged dependent variable. The tests
designed to be robust to general forms of changes in variance have, in general, good size,
though they can be conservative in some cases. However, in (almost) all cases covered (and
others not reported here), the sup∗3 is remarkably robust (exact size close to nominal
size) to various forms of changes even if they are designed to model abrupt changes. The
next section provides some theoretical explanations for this curious feature.
6.2 Analytic explanations
This subsection provides an analytic explanation of why the sup LR3 test
can provide good size even when the variance process does not consist of abrupt structural
changes. To this end, we focus on a single structural change in variance. The
sup LR3 test estimates “the most likely” break date in variance through the maximum
likelihood method. In essence it minimizes the overall sum of squared residuals. If we
disregard the parameter uncertainty in the coefficients for simplicity, the quasi log-likelihood
with a single structural change in variance at date $T_v$ is
$$\log L(T_v) \propto -\left[ T_v \log \hat\sigma_1^2 + (T - T_v) \log \hat\sigma_2^2 \right],$$
where $\hat\sigma_1^2$ and $\hat\sigma_2^2$ are the sample variances of the pre- and post-break samples. Denote
their probability limits by $\bar\sigma_1^2$ and $\bar\sigma_2^2$, respectively. Then, with $\lambda_v = T_v/T$, the probability
limit of the log-likelihood (scaled by $T^{-1}$) is such that
$$\operatorname*{plim}_{T\to\infty} T^{-1} \log L(\lambda_v) \propto -\left[ \lambda_v \log \bar\sigma_1^2 + (1-\lambda_v) \log \bar\sigma_2^2 \right].$$
The maximizer of this function, denoted by $\lambda_v^*$, is what we refer to as “the most likely
variance break fraction”. This, of course, depends on the underlying variance profile. We
derive $\lambda_v^*$ for the cases VC1 to VC4 in Appendix A. The log-likelihood of the sup LR3 test
when the model has variance $\sigma_t^2$ is equivalent to the log-likelihood of the sup LR test when
the model has variance $\sigma_t^{*2}$ given by
$$\sigma_t^{*2} = \begin{cases} \sigma_t^2\,(\bar\sigma^2/\bar\sigma_1^2) & \text{for } t = 1,\ldots,[\lambda_v^* T] \\ \sigma_t^2\,(\bar\sigma^2/\bar\sigma_2^2) & \text{for } t = [\lambda_v^* T]+1,\ldots,T, \end{cases}$$
where $\bar\sigma^2$ is the probability limit of the full-sample variance estimate. Then, $\sigma_t^{*2}$ can be
considered as a modified variance profile; see Appendix B. Figure 1 shows the original variance
profile $\sigma_t^2$ with a solid line and the modified variance profile $\sigma_t^{*2}$ with a broken line for VC1
to VC4, where the magnitude of the variance change is set to 10, although this does not affect
the overall shapes. For VC1, the modified variance profile is completely flattened, as expected
for this case of a one-time abrupt change. For VC2, the modified profile retains the same overall
shape as the original one, with only minor local changes. This is to be expected since the process
is essentially stationary, hence, no breaks are detected. It is the reason why the sup LR test is
little affected by such changes. For VC3, since the original variance process is monotonically
increasing, the variance profile of the post-break regime is divided by a larger value than the
variance profile of the pre-break regime. Hence, it effectively acts as a detrending device via
a rescaling of the levels of the variance profile. For VC4, the spike is flattened as the subperiod
which includes the spike is divided by a larger value. In sum, allowing for a single break
in variance can either completely flatten, partially flatten, or detrend the original variance
function by rescaling the pre- and post-break variance profiles.
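As an illustration of the construction above, the following sketch (Python with NumPy, not part of the original study) locates the most likely variance break fraction by grid search over the limit objective and then rescales the profile regime by regime. The quadratic profile `3 * s**2`, the grid size, the 1% trimming and all function names are assumptions made for the example only.

```python
import numpy as np

def most_likely_break_fraction(profile, grid_size=2000, trim=0.01):
    """Grid search for the fraction minimizing lam*log(s1) + (1-lam)*log(s2),
    i.e. maximizing the limit of the scaled quasi log-likelihood."""
    s = (np.arange(grid_size) + 0.5) / grid_size       # midpoints of [0, 1]
    csum = np.cumsum(profile(s))
    k = np.arange(int(trim * grid_size), int((1 - trim) * grid_size) + 1)
    lam = k / grid_size
    pre = csum[k - 1] / k                               # average variance before the break
    post = (csum[-1] - csum[k - 1]) / (grid_size - k)   # average variance after the break
    objective = lam * np.log(pre) + (1 - lam) * np.log(post)
    return float(lam[np.argmin(objective)])

def modified_profile(profile, lam, grid_size=2000):
    """Rescale each regime so that its average equals the full-sample average."""
    s = (np.arange(grid_size) + 0.5) / grid_size
    original = profile(s)
    k = int(round(lam * grid_size))
    modified = original.copy()
    modified[:k] *= original.mean() / original[:k].mean()
    modified[k:] *= original.mean() / original[k:].mean()
    return original, modified

def quadratic(s):
    # Hypothetical quadratic-trend profile with average variance one.
    return 3.0 * s ** 2

lam_star = most_likely_break_fraction(quadratic)
original, modified = modified_profile(quadratic, lam_star)
print("most likely break fraction:", lam_star)
print("dispersion of profile, original vs modified:", original.std(), modified.std())
```

The rescaling leaves the shape within each regime untouched and only changes the two levels, which is why a trending profile is partially detrended rather than made constant.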
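We now consider the asymptotic distributions of the test statistics. Define a continuous
analogue $\sigma^2(s)$ with $s \in [0,1]$ to $\sigma_t^2$ for $t = 1,\ldots,T$, such that
$T^{-1}\sum_{t=1}^{[\lambda T]} \sigma_t^2 \to \int_0^{\lambda} \sigma^2(s)\,ds$ as $T \to \infty$. Note
that $\sigma^2(s)$ depends on the parameter that measures the magnitude of the variance changes.
Under the assumption that $\sigma^2(s)$ is bounded, it is known that the asymptotic distribution of
the sup LR test is:
$$\sup LR \Rightarrow \sup_{\lambda\in\Lambda} \frac{[W(\eta(\lambda)) - \lambda W(1)]^2}{\lambda(1-\lambda)},$$
where $W(\eta(\lambda))$ is the time-transformed Wiener process with
$\eta(\lambda) \equiv \int_0^{\lambda}\sigma^2(s)\,ds \big/ \int_0^1 \sigma^2(s)\,ds$; see
Hall and Heyde (1980) and Davidson (1994). Similarly, the asymptotic distribution of the
sup LR3 test is obtained simply by replacing $\sigma^2(s)$ with $\sigma^{*2}(s)$ (similarly defined using $\sigma_t^{*2}$):
$$\sup LR_3 \Rightarrow \sup_{\lambda\in\Lambda} \frac{[W(\eta^*(\lambda)) - \lambda W(1)]^2}{\lambda(1-\lambda)},$$
where $\eta^*(\lambda) \equiv \int_0^{\lambda}\sigma^{*2}(s)\,ds \big/ \int_0^1 \sigma^{*2}(s)\,ds$. In Table 2, we evaluate the asymptotic size of the
sup LR and the sup LR3 tests at the 5% nominal level using these distributions with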
the Wiener process approximated by 5,000 discrete steps. The asymptotic size is very close
to the finite sample size presented in Table 1. They show that including a single break in
variance can drastically improve the size of the sup LR3 test in all cases.
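The evaluation of an asymptotic size from a time-transformed Wiener process can be sketched as follows. This is a minimal illustration, not the code used for Table 2: it uses 1,000 steps and 2,000 replications for speed, assumes a 15% trimming of the break fraction, simulates the 5% critical value under constant variance instead of taking a tabulated one, and uses a linearly trending variance (time scale equal to the squared fraction) as the heteroskedastic example.

```python
import numpy as np

def sup_functional(W, eta, lam):
    """sup over lam of [W(eta(lam)) - lam * W(1)]^2 / [lam * (1 - lam)], path by path.

    W has shape (reps, n + 1), with W[:, j] approximating the process at j / n.
    """
    n = W.shape[1] - 1
    idx = np.rint(eta(lam) * n).astype(int)        # grid index closest to eta(lam)
    num = (W[:, idx] - lam * W[:, [-1]]) ** 2
    return (num / (lam * (1 - lam))).max(axis=1)

rng = np.random.default_rng(0)
reps, n, eps = 2000, 1000, 0.15                    # eps is an assumed trimming value
steps = rng.standard_normal((reps, n)) / np.sqrt(n)
W = np.concatenate([np.zeros((reps, 1)), np.cumsum(steps, axis=1)], axis=1)
lam = np.arange(int(eps * n), int((1 - eps) * n) + 1) / n

stat_const = sup_functional(W, lambda l: l, lam)        # constant variance
crit = np.quantile(stat_const, 0.95)                    # simulated 5% critical value
stat_trend = sup_functional(W, lambda l: l ** 2, lam)   # linear trend in variance
size_trend = float(np.mean(stat_trend > crit))
print("simulated critical value:", crit)
print("rejection frequency under the trending variance:", size_trend)
```

Replacing the second time scale by the one implied by a modified profile gives the corresponding size for the test that allows one variance break.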
These results can help understand why accounting for one variance break can yield tests
with better size. Let us consider the difference between the maximands of the heteroskedastic
and homoskedastic asymptotic distributions given by:
$$\begin{aligned}
\Delta(\lambda) &= \left\{ [W(\eta(\lambda)) - \lambda W(1)]^2 - [W(\lambda) - \lambda W(1)]^2 \right\} / [\lambda(1-\lambda)] \\
&= \left\{ W^2(\eta(\lambda)) - 2\lambda W(\eta(\lambda))W(1) - W^2(\lambda) + 2\lambda W(\lambda)W(1) \right\} / [\lambda(1-\lambda)].
\end{aligned}$$
Using $E[W^2(\lambda)] = \lambda$ and $E[W(\lambda)W(1)] = \lambda$, the expected value conditional on $\lambda$ is:
$$\begin{aligned}
E[\Delta(\lambda)\,|\,\lambda] &= [\eta(\lambda) - 2\lambda\eta(\lambda) - \lambda + 2\lambda^2] / [\lambda(1-\lambda)] \\
&= (1-2\lambda)[\eta(\lambda) - \lambda] / [\lambda(1-\lambda)],
\end{aligned}$$
i.e., the distance between $\eta(\lambda)$ and $\lambda$ weighted by $(1-2\lambda)/[\lambda(1-\lambda)]$. If we further assume
that $\lambda$ is chosen at random under $H_0$, e.g., $\lambda \sim U[0,1]$ for simplicity, we can compute the
unconditional counterpart, which acts as a measure of distance between the original and
transformed variance profiles: $\int_0^1 \left| (1-2\lambda)/[\lambda(1-\lambda)]\,[\eta(\lambda) - \lambda] \right| d\lambda$, or simply take the
dominating part $\int_0^1 |\eta(\lambda) - \lambda|\, d\lambda$, denoted by $G$ since it is akin to a Gini coefficient
for $\eta(\lambda)$. Table 3 reports the values of both measures for each variance profile. Clearly, the
larger they are, the more size distortion we observe in Table 2. These measures go a
long way toward getting an intuition about the effect of heteroskedasticity on the size of the
tests. In Figure 2, we illustrate $G$ as the area between the forty-five-degree line and the
heteroskedastic time scale $\eta(\lambda)$. This explains why the size distortions of the sup LR test
are small for VC2 as the area is narrow. It also shows that the larger the area, the more
severe the distortions become; e.g., VC1 (break fraction 0.75), VC3, and VC4. More importantly,
the area shrinks with the modified profile and this illustrates how rescaling the pre- and
post-break levels of the variance profile can reduce the size distortions.
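A small numerical sketch of the two distance measures may help fix ideas. It evaluates both integrals by a midpoint rule for a time scale supplied as a function. The function name, the grid size and the example time scale (the squared fraction, corresponding to a linearly trending variance) are assumptions for illustration. The scaling used for the values reported in Table 3 is not restated in this section, so the numbers produced here are meant for relative comparisons across profiles only.

```python
import numpy as np

def distance_measures(eta, grid_size=100_000):
    """Midpoint-rule approximations of the weighted distance and of the
    Gini-type area between the time scale eta(lam) and the 45-degree line."""
    lam = (np.arange(grid_size) + 0.5) / grid_size     # midpoints avoid lam = 0 and 1
    gap = eta(lam) - lam
    weight = (1 - 2 * lam) / (lam * (1 - lam))
    weighted = float(np.mean(np.abs(weight * gap)))
    gini_type = float(np.mean(np.abs(gap)))
    return weighted, gini_type

flat = distance_measures(lambda l: l)          # constant variance: both measures are zero
trend = distance_measures(lambda l: l ** 2)    # linearly trending variance
print("constant variance:", flat)
print("trending variance:", trend)
```

For the squared-fraction time scale the integrands simplify to |1 - 2λ| and λ(1 - λ), so the unscaled integrals are 1/2 and 1/6, which the midpoint rule reproduces closely.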
7 Power properties
We now investigate via simulations the power properties of the sup LR∗3 test (henceforth
denoted by SLR3) and the two-steps methods outlined in Section 4: Zhou (2013; ZRB),
Xu (2015; XU), Górecki, Horváth and Kokoszka (2018; GHK), Dalla, Giraitis and Phillips
(2017; VS), and Zhang and Wu (2019; ZWU). We consider power against (1) coefficients
with a single structural change and (2) coefficients following a random walk. We include the
latter case because SLR3 is designed to detect abrupt structural changes, while the other
tests are agnostic about the nature of the changes. Hence, this will allow assessing how
robust SLR3 is to non-abrupt changes. We select three variance profiles: VC0, VC1, and
VC3L, as they cover the most important cases and are representative of the main features
that can arise. To enable us to make direct comparisons, we set the average variance levels
to be the same across specifications, that is, we set $\sigma_t^2 = 1$ for all $t$ for VC0; $\sigma_t^2 = 0.5$ for
$t \le [0.5T]$ and $\sigma_t^2 = 1.5$ for $t > [0.5T]$ for VC1; and $\sigma_t^2 = 2(t/T)$ for VC3L. We evaluate the
power functions under the four regression models; (i) the Mean Model without accounting for
serial correlations in the errors, (ii) the Mean Model (HAC) accounting for serial correlations
in the errors, (iii) the Zero-Mean Regressor (without HAC), and (iv) the Dynamic Model
(with HAC). The HAC variance specifications follow the authors’ recommendations and, for
the sup LR∗3 test, we use Kejriwal’s (2009) hybrid correction.
7.1 Structural change in coefficients
We first consider model (1) with one structural change in coefficients at date $T_0$ with a)
$\beta_t = \mu_t$ for the Mean Model, b) $\beta_t = [0, \mu_t]'$ for the Zero-Mean Regressor, and c) $\beta_t = [\mu_t, 0]'$
for the Dynamic Model, where $\mu_t = 0$ for $t \le T_0$, and $\mu_t = \theta$ for $t > T_0$, with $T_0 = [0.25T]$,
without loss of generality at least qualitatively. The break magnitude $\theta$ varies from 0 to 10.
Figure 3.1 shows the power functions of the tests for VC0 (constant variance). Here,
all methods have a similar power function that quickly reaches one as the break magnitude
increases. When serial correlation in the errors is accounted for (Mean Model (HAC)), the
power of all tests initially increases with the break magnitude until it reaches around 2.
Then the power functions of all tests except SLR3 start decreasing as the break magnitude
becomes larger and eventually reach zero (except for
, whose power eventually goes back up). This is the well-known non-monotonic power
problem, which has been documented before as stated earlier but seems to be ignored re-
peatedly. If the structural changes in the coefficients are neglected, they translate into level
shifts in the regression residuals. Then, as shown in Perron (1990), such level shifts inflate
the autocovariance estimates for the residuals. They also inflate the bandwidth used
when constructing the HAC variances as the persistence parameter is biased toward one
(e.g., Deng and Perron, 2008, among others). Some studies use a modified version. For
instance, Xu (2015) proposed using the suggestion of Andreou (2008) to avoid the large
bandwidth. However, it remains that autocovariance estimates are inflated resulting in a
power function that is non-monotonic. The same problem occurs when the regression model
includes a lagged dependent variable. This is again, following Perron (1989, 1990), because
the coefficient estimate of the lagged dependent variable is then biased toward one if struc-
tural changes are present. The estimated model is then akin to using first-differences of
the series, thereby transforming informative structural changes in the conditional mean into
outliers. An exception, is the test which does not show non-monotonic power in the
Mean Model (HAC) because it avoids the HAC variance estimate and uses FGLS. However,
it performs rather miserably when a lagged dependent variable is present. In contrast, SLR3
does not show any non-monotonic power as it accounts for structural changes in the mean
so that it obtains uncontaminated residuals. In all cases, the power increases to one rapidly
without any decrease as the break magnitude increases.
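The mechanism behind the non-monotonic power can be seen in a few lines of simulation. The sketch below is an illustration under assumed settings (i.i.d. standard normal errors, sample size 400, break at one quarter of the sample, hypothetical function name), not the design used for the figures. It computes the first-order autoregressive coefficient of residuals obtained while ignoring a level shift.

```python
import numpy as np

def ar1_of_no_break_residuals(theta, T=400, break_frac=0.25, seed=0):
    """AR(1) coefficient of demeaned data when a level shift of size theta is ignored."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T)                          # serially uncorrelated errors
    mu = np.where(np.arange(T) >= int(break_frac * T), theta, 0.0)
    y = mu + u
    resid = y - y.mean()                                # residuals under "no break"
    return float(resid[1:] @ resid[:-1] / (resid[:-1] @ resid[:-1]))

rhos = {theta: ar1_of_no_break_residuals(theta) for theta in (0, 2, 5, 10)}
for theta, rho in rhos.items():
    print(f"shift = {theta:>2}: estimated AR(1) coefficient = {rho:.2f}")
```

Although the errors are uncorrelated, the estimated persistence rises toward one as the neglected shift grows, which is what inflates data-dependent bandwidths and HAC variance estimates built from such residuals.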
A case of interest that is particularly revealing of the properties of the tests is with a
Zero-Mean Regressor. Here, the change in the parameter causes a change in the variance of
the dependent variable even if no change in the variance of the errors occurs. Since the two-
steps type tests construct the quantities of interest using the residuals assuming no change,
they are fooled and assign all the variance change in the dependent variable to a change in
the variance of the errors. This results in no power at all; i.e., power equal to size. This holds
for all tests, except since it is essentially based on the regression scores . Hence,
in a sense, it has power mostly by luck since the presence of changes in coefficients that are
unaccounted for inflate the residuals. However, as discussed previously completely
fails when a lagged dependent variable is present for the same reasons as discussed above.
SLR3 continues to have the highest power function, monotonic in the break magnitude.
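The zero-mean regressor case can be illustrated with a short simulation. The settings below (standard normal regressor and errors, 4,000 observations, slope moving from 0 to 3 after the first quarter of the sample) are assumptions chosen for the example. The point is only that residuals from a regression fitted without a break show no shift in mean but a marked shift in variance, even though the error variance is constant.

```python
import numpy as np

rng = np.random.default_rng(1)
T, T0, theta = 4000, 1000, 3.0                    # assumed sample size, break date, magnitude
x = rng.standard_normal(T)                        # zero-mean regressor
u = rng.standard_normal(T)                        # homoskedastic errors
beta = np.where(np.arange(T) >= T0, theta, 0.0)   # slope changes at T0
y = beta * x + u

X = np.column_stack([np.ones(T), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)      # full-sample OLS ignoring the break
resid = y - X @ coef

pre, post = resid[:T0], resid[T0:]
mean_gap = abs(pre.mean() - post.mean())
var_ratio = pre.var() / post.var()
print("difference in residual means across regimes:", mean_gap)
print("ratio of residual variances (pre / post):", var_ratio)
```

A procedure that builds its variance profile from these residuals will read the coefficient change as heteroskedasticity in the errors.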
Figure 3.2 presents the power functions for the case with VC1. We observe similar
patterns: non-monotonic power with the Mean Model (HAC) and with the Dynamic Model,
a trivial power with Zero-Mean Regressor as for VC0. For the Mean Model (HAC), all
methods have a very low power function, the only ones increasing with the break magnitude being SLR3 and
. When dealing with a model having a lagged dependent variable, all tests, except the
SLR3, have non-monotonic power functions.
Finally, we consider a linearly trending variance process with the results reported in
Figure 3.3. They show that SLR3 still has the highest power, which is monotonic in all
cases. This is because it detrends or flattens the original variance profile as discussed in
Section 6.2. Again, a non-monotonic power with the Mean Model (HAC) and with the
Dynamic Model is observed for all two-step tests, the test having virtually no power
in the Mean Model (HAC). For the Zero-Mean Regressor case all two-steps tests have trivial
power, except for , which again has near zero power for the Dynamic Model.
7.2 Random walk coefficients
The previous subsection illustrates the power advantage of SLR3 over the two-steps methods.
However, it may be suspected that the advantage comes from the particular pattern of
structural changes in the coefficients, i.e., abrupt ones to which the test is tailored. To
address this concern, we consider coefficients following a random walk, namely $\mu_t = \mu_{t-1} + v_t$,
where $v_t \sim N(0, (\theta/\sqrt{T})^2)$. The models and variance profiles remain the same.
Figure 4.1 shows the power functions for VC0. The general shapes remain qualitatively
similar. With the Mean Model, all tests have similar power functions except that
has slightly lower power. The two-steps tests suffer from the non-monotonic power problem
with Mean Model (HAC) (except for ) and with the Dynamic Model. The power
of SLR3 does not reach one and shows some gradual power decline as the magnitude of the
coefficient variation increases because the unaccounted coefficient variations can inflate the
HAC variance and the persistence parameter estimate. However, SLR3 still dominates the other
tests. An exception is with
the Mean Model (HAC), in which case has higher power than 3 for large ,
though again its power is very low with the Dynamic Model. Figures 4.2-4.3 present the
power functions for VC1 and VC3L, respectively. The results are qualitatively the same.
Hence, overall, the SLR3 test continues to have a strong power advantage, except for the
Mean Model (HAC) for large values of the alternative, in which case the test is more
powerful, though still very deficient with a dynamic model. The fact that the SLR3 test
retains decent power should be expected since tests for an abrupt change in coefficients are
consistent against general parameter variations processes; e.g., Andrews (1993).
8 Conclusion
In this study, we carefully assessed the effect of accounting for heteroskedasticity on the size
and power of structural change tests in linear regression models. In particular, we focused on
the two-steps nature of some representative existing nonparametric methods, which obtain
regression residuals assuming no coefficient change, then use them to fully approximate the
heteroskedastic asymptotic distributions. These methods have good size. However, they
suffer from several deficiencies when it comes to power. First, most are prone to the non-
monotonic power problem discussed at length in the literature, when serial correlation in the
errors is accounted for or when the regressors include lagged dependent variables. Second,
most tests simply look at the unconditional mean of the dependent variable when estimating
the variance profile, exactly because they do not account for parameter variations in the
conditional mean model. Hence, they only have trivial power when the regressors have zero-
mean. Third, even when the power is monotonically increasing, the tests are, in general, not
as powerful as the sup LR∗3 test proposed by Perron, Yamamoto and Zhou (2020), who take a
different approach that accounts jointly for changes in the coefficients and the variance of the error.
The sup LR∗3 test of Perron, Yamamoto and Zhou (2020) has surprising properties.
First, although the size is not completely immune to general forms of heteroskedastic
patterns other than abrupt structural changes, accounting for a few structural changes in the
variance can flatten or detrend the original variance process, thus considerably reducing the
size distortions. Another way to state this important fact is as follows. In many cases, the
original sup LR test that ignores potential changes in the variance of the errors is very little
affected by various types of variance profiles (e.g., increases early in the sample, periodic
variations, etc.). By allowing just a few breaks in variance (one or two in most cases), the
sup LR∗3 test in a sense transforms the variance profile into one that effectively has very
little impact on the size of the test. It can either completely flatten, partially flatten or
detrend the variance process by rescaling the pre- and post-break levels of variances. We
also showed that the sup LR∗3 test is immune to the non-monotonic power problem (since
it treats changes in coefficients and variance jointly) and attains high power even when the
regressors have zero mean. In general, it has the highest power.
It is clear that within the class of abrupt changes in coefficients and variance of the errors,
the sup LR∗3 test has the correct asymptotic size and the best power. Some may argue that
since the sup LR∗3 test does not have precisely an asymptotic size of 5% for all types of
variance profiles (within some broader class) it is an invalid test from the start. This ignores
the general principles of hypothesis testing, which deal with a good balance between type I
(incorrectly rejecting when the null is true) and type II (incorrectly not rejecting when the
null is false) errors. By sticking to this philosophy of considering size only irrespective of
power, we end up with tests that are only marginally better than throwing coins in important
cases of interest. Our argument is that the sup LR∗3 test involves some but very little size
distortion in general, and none when the changes in the variance of the errors are abrupt.
The other tests have the correct size in general but can fail drastically when it comes to
power. Hence, the trade-off should favor the sup LR∗3 test until something better is found.
Appendix A: Derivation of the “most likely” variance break fraction
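In this Appendix, we derive the “most likely” variance break fraction under VC1 to VC4. As in the text, we ignore parameter uncertainty, for simplicity. The log-likelihood function with one variance break at date $T_v = [\lambda_v T]$ is then
$$\ell(T_v) = -\left[ T_v \log \hat\sigma_1^2 + (T - T_v) \log \hat\sigma_2^2 \right].$$
We have $\operatorname{plim}_{T\to\infty} \hat\sigma_1^2 = \lambda_v^{-1} \int_0^{\lambda_v} \sigma^2(s)\,ds$ and $\operatorname{plim}_{T\to\infty} \hat\sigma_2^2 = (1-\lambda_v)^{-1} \int_{\lambda_v}^1 \sigma^2(s)\,ds$, which can
easily be computed given a specific variance profile. For VC1, they are the two regime variances and
$\lambda_v^*$ equals the true break fraction. For VC2, after some algebra, we obtain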
lim→∞ 21 = 2()−1R 0sin2()
=2
2− 2
4sin(2)
and
lim→∞ 22 = 2(1− )−1R 1sin2()
=2
2− 2
4(1− )sin(2)− sin(2)
where 2 =R 102. Then, the probability limit of (minus) the log-likelihood becomes
lim→∞¡−−1¢ = log
∙2 − 2
2sin(2)
¸+(1− ) log
∙2 − 2
2(1− )sin(2)− sin(2)
¸
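Although this is not globally concave in $\lambda_v$, the numerical solutions are $\lambda_v^* = 0.025$ and $0.975$. We present results for $\lambda_v^* = 0.025$ in Figure 1 (using $\lambda_v^* = 0.975$ provides essentially the same results). For VC3L, with $\sigma^2$ denoting the scale of the profile, the limits of the sample variances are
$$\operatorname*{plim}_{T\to\infty} \hat\sigma_1^2 = \sigma^2 \lambda_v^{-1} \int_0^{\lambda_v} s\,ds = \sigma^2 \lambda_v / 2,$$
$$\operatorname*{plim}_{T\to\infty} \hat\sigma_2^2 = \sigma^2 (1-\lambda_v)^{-1} \int_{\lambda_v}^1 s\,ds = \sigma^2 (1+\lambda_v)/2.$$
For VC3Q, we have:
$$\operatorname*{plim}_{T\to\infty} \hat\sigma_1^2 = \sigma^2 \lambda_v^{-1} \int_0^{\lambda_v} s^2\,ds = \sigma^2 \lambda_v^2 / 3,$$
$$\operatorname*{plim}_{T\to\infty} \hat\sigma_2^2 = \sigma^2 (1-\lambda_v)^{-1} \int_{\lambda_v}^1 s^2\,ds = \sigma^2 (1+\lambda_v+\lambda_v^2)/3.$$
Hence, the limit of the log-likelihood for VC3L is,
$$\operatorname*{plim}_{T\to\infty} \left(-T^{-1}\ell\right) = \lambda_v \log\left(\sigma^2\lambda_v/2\right) + (1-\lambda_v) \log\left(\sigma^2(1+\lambda_v)/2\right)$$
and for VC3Q,
$$\operatorname*{plim}_{T\to\infty} \left(-T^{-1}\ell\right) = \lambda_v \log\left(\sigma^2\lambda_v^2/3\right) + (1-\lambda_v) \log\left(\sigma^2(1+\lambda_v+\lambda_v^2)/3\right).$$
Hence, for VC3L, $\lambda_v^* = 0.225$ and for VC3Q, $\lambda_v^* = 0.285$. For VC4, the most likely break fraction is chosen at the end of the spike when the spike is located before 0.5 and at the beginning of the spike when it is located after 0.5. When the spike is located at 0.5, either the end of the spike (location plus 0.025) or its beginning (location minus 0.025) can maximize the probability limit of the log-likelihood. Hence, when reporting results, we use the former, although the results with the latter are essentially the same.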
Appendix B: Derivation of the modified variance profiles
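We show that the log-likelihood of a linear model with variance profile $\sigma_t^2$ which accounts for one variance break is asymptotically the same as the log-likelihood of the linear model with variance profile $\sigma_t^{*2}$ which accounts for no variance break. Let the error term be $u_t = \sigma_t e_t$ where $Var(e_t) = 1$. Again, for simplicity, we ignore the uncertainty due to the estimation of the coefficient parameters. Then, we can simply consider the log-likelihood under the null hypothesis of no coefficient change. If we omit asymptotically negligible terms, the log-likelihood accounting for one variance break at date $T_v$ (see the proof of Theorem 1 (c) of PYZ) is
$$\begin{aligned}
\ell(T_v) &= -T_v (\tilde\sigma_1^2/\bar\sigma_1^2) - (T - T_v)(\tilde\sigma_2^2/\bar\sigma_2^2) \\
&= -\bar\sigma_1^{-2} \sum_{t=1}^{T_v} u_t^2 - \bar\sigma_2^{-2} \sum_{t=T_v+1}^{T} u_t^2 \\
&= -\sum_{t=1}^{T_v} \left(\sigma_t^2/\bar\sigma_1^2\right) e_t^2 - \sum_{t=T_v+1}^{T} \left(\sigma_t^2/\bar\sigma_2^2\right) e_t^2. \qquad \text{(A.1)}
\end{aligned}$$
Similarly, the log-likelihood accounting for no variance break but with $u_t^* = \sigma_t^* e_t$ is
$$\ell = -T(\tilde\sigma^2/\bar\sigma^2) = -\bar\sigma^{-2} \sum_{t=1}^{T} u_t^{*2} = -\sum_{t=1}^{T} \left(\sigma_t^{*2}/\bar\sigma^2\right) e_t^2. \qquad \text{(A.2)}$$
Hence, (A.1), evaluated at $T_v = [\lambda_v^* T]$, and (A.2) yield
$$\sigma_t^{*2}/\bar\sigma^2 = \begin{cases} \sigma_t^2/\bar\sigma_1^2 & \text{for } t = 1,\ldots,[\lambda_v^* T] \\ \sigma_t^2/\bar\sigma_2^2 & \text{for } t = [\lambda_v^* T]+1,\ldots,T \end{cases}$$
or
$$\sigma_t^{*2} = \begin{cases} \sigma_t^2\,(\bar\sigma^2/\bar\sigma_1^2) & \text{for } t = 1,\ldots,[\lambda_v^* T] \\ \sigma_t^2\,(\bar\sigma^2/\bar\sigma_2^2) & \text{for } t = [\lambda_v^* T]+1,\ldots,T. \end{cases}$$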
References
Andreou, E. (2008) “Restoring monotone power in CUSUM test,” Economics Letters, 98, 48-58.
Andrews, D.W.K. (1991) “Heteroskedasticity and autocorrelation consistent covariance matrix estimation,” Econometrica, 59, 817-858.
Andrews, D.W.K. (1993) “Tests for parameter instability and structural change with unknown change point,” Econometrica, 61, 821-856.
Andrews, D.W.K. and W. Ploberger (1994) “Optimal tests when a nuisance parameter is present only under the alternative,” Econometrica, 62, 1383-1414.
Bai, J. and P. Perron (1998) “Estimating and testing linear models with multiple structural changes,” Econometrica, 66, 47-78.
Bai, J. and P. Perron (2003) “Computation and analysis of multiple structural change models,” Journal of Applied Econometrics, 18, 1-22.
Bai, J. and P. Perron (2006) “Multiple structural change models: a simulation analysis,” in Econometric Theory and Practice: Frontiers of Analysis and Applied Research, D. Corbae, S. Durlauf and B.E. Hansen (eds.), Cambridge University Press: Cambridge: 212-237.
Bock, O., X. Collilieux, F. Guillamon, E. Lebarbier, and C. Pascal (2020) “A breakpoint detection in the mean model with heterogenous variance on fixed time-intervals,” Statistics and Computing, 30, 195-207.
Brown, R.L., J. Durbin, and J.M. Evans (1975) “Techniques for testing the constancy of regression relationships over time,” Journal of the Royal Statistical Society B, 37, 149-163.
Cavaliere, G. and A.M.R. Taylor (2006) “Testing for a change in persistence in the presence of a volatility shift,” Oxford Bulletin of Economics and Statistics, 68, 761-781.
Cavaliere, G. and A.M.R. Taylor (2008) “Testing for a change in persistence in the presence of non-stationary volatility,” Journal of Econometrics, 147, 84-98.
Cochrane, D. and G.H. Orcutt (1949) “Applications of least squares regressions to relationships containing autocorrelated error terms,” Journal of the American Statistical Association, 44, 32-61.
Crainiceanu, C.M. and T.J. Vogelsang (2007) “Nonmonotonic power for tests of mean shift in a time series,” Journal of Statistical Computation and Simulation, 77, 457-476.
Dalla, V., L. Giraitis, and P.C.B. Phillips (2017) “Testing mean stability of a heteroskedastic time series,” Unpublished manuscript, Yale University.
Dalla, V., L. Giraitis, and P.M. Robinson (2020) “Asymptotic theory for time series with changing mean and variance,” forthcoming in Journal of Econometrics.
Davidson, J. (1994) Stochastic Limit Theory: An Introduction for Econometricians. Oxford: Oxford University Press.
Deng, A. and P. Perron (2008) “The limit distribution of the CUSUM of squares test under general mixing conditions,” Econometric Theory, 24, 809-822.
Elliott, G. and U.K. Müller (2006) “Efficient tests for general persistent time variation in regression coefficients,” The Review of Economic Studies, 73, 907-940.
Górecki, T., L. Horváth, and P. Kokoszka (2018) “Change point detection in heteroscedastic time series,” Econometrics and Statistics, 7, 63-88.
Hall, P. and C.C. Heyde (1980) Martingale Limit Theory and its Applications. New York: Academic Press.
Hansen, B.E. (2000) “Testing for structural change in conditional models,” Journal of Econometrics, 97, 93-115.
Kejriwal, M. (2009) “Tests for a mean shift with good size and monotonic power,” Economics Letters, 102, 78-82.
Kim, D. and P. Perron (2009) “Assessing the relative power of structural break tests using a framework based on the approximate Bahadur slope,” Journal of Econometrics, 149, 26-51.
Pein, F., H. Sieling, and A. Munk (2017) “Heterogeneous change point inference,” Journal of Royal Statistical Society Series B, 79, 1207-1227.
Perron, P. (1989) “The great crash, the oil price shock and the unit root hypothesis,” Econometrica, 57, 1361-1401.
Perron, P. (1990) “Testing for a unit root in a time series with a changing mean,” Journal of Business and Economic Statistics, 8, 153-162.
Perron, P. (1991) “A test for changes in a polynomial trend function for a dynamic time series,” Econometric Research Program Research Memorandum No. 363, Princeton University. Reprinted in Time Series Econometrics: Volume 2: Structural Change, (Perron, P., ed.), World Scientific, 2019, 1-65.
Perron, P. and Y. Yamamoto (2016) “On the usefulness or lack thereof of optimality criteria for structural change tests,” Econometric Reviews, 35, 782-844.
Perron, P. and Y. Yamamoto (2019) “Pitfalls of two steps testing for changes in the error variance and coefficients of a linear regression model,” Econometrics, 7, 22.
Perron, P., Y. Yamamoto, and J. Zhou (2020) “Testing jointly for structural changes in the error variance and coefficients of a linear regression model,” Quantitative Economics, 11, 1019-1057.
Pitarakis, J.-Y. (2004) “Least-squares estimation and tests of breaks in mean and variance under misspecification,” Econometrics Journal, 7, 32-54.
Ploberger, W. and W. Krämer (1992) “The CUSUM test with OLS residuals,” Econometrica, 60, 271-285.
Qu, Z. and P. Perron (2007) “Estimating and testing multiple structural changes in multivariate regressions,” Econometrica, 75, 459-502.
Quandt, R.E. (1958) “The estimation of the parameters of a linear regression system obeying two separate regimes,” Journal of the American Statistical Association, 53, 873-880.
Quandt, R.E. (1960) “Tests of the hypothesis that a linear regression system obeys two separate regimes,” Journal of the American Statistical Association, 55, 324-330.
Vogelsang, T.J. (1999) “Sources of nonmonotonic power when testing for a shift in mean of a dynamic time series,” Journal of Econometrics, 88, 283-299.
Xu, K.-L. (2015) “Testing for structural change under non-stationary variances,” Econometrics Journal, 18, 274-305.
Zhang, E. and J. Wu (2019) “Testing for structural changes in linear regressions with time-varying variance,” forthcoming in Communication in Statistics-Theory and Methods.
Zhou, Z. (2013) “Heteroscedasticity and autocorrelation robust structural change detection,” Journal of American Statistical Association, 108, 726-740.
Table 1: Exact size of the structural change tests under heteroskedasticity
a) Mean Model
                                 T = 100                            T = 300
Profile  Parameter  Magnitude    supLR  supLR∗3  CUSUM  XU          supLR  supLR∗3  CUSUM  XU
VC0      -          0            5.0    4.0      3.8    6.4         5.0    4.1      4.5    5.3
VC1      0.3        5            6.6    3.7      6.1    6.4         6.6    2.3      5.8    6.0
VC1      0.3        10           5.7    4.5      4.1    5.7         6.4    2.9      5.8    6.1
VC1      0.75       5            27.5   4.6      16.8   7.0         25.9   3.9      15.6   5.9
VC1      0.75       10           26.2   5.7      16.3   5.4         27.7   3.7      18.9   5.8
VC2      2          5            4.8    1.8      3.1    5.6         5.1    2.8      4.4    5.1
VC2      2          10           4.3    1.9      3.4    5.6         5.1    2.5      4.3    5.4
VC2      4          5            3.8    3.5      1.7    2.9         4.7    2.4      3.1    5.1
VC2      4          10           5.8    4.0      3.7    5.4         5.4    2.5      4.1    5.3
VC3L     linear     5            8.6    3.9      5.7    6.1         10.1   3.4      6.6    5.7
VC3L     linear     10           8.1    4.1      4.5    4.8         10.0   3.6      6.6    5.8
VC3Q     quadratic  5            13.6   10.5     7.4    4.4         13.8   5.2      9.8    5.6
VC3Q     quadratic  10           14.3   9.7      8.5    5.9         15.0   5.3      10.6   5.4
VC4      0.3        5            3.3    1.8      2.7    3.8         4.8    2.2      4.7    4.5
VC4      0.3        10           4.2    0.6      3.9    4.0         7.1    3.9      6.8    3.9
VC4      0.5        5            1.7    1.7      2.5    5.9         2.2    3.2      3.4    4.9
VC4      0.5        10           0.3    1.6      0.7    2.4         1.8    4.0      3.1    4.8
VC4      0.7        5            3.1    1.0      2.8    4.0         4.3    2.1      4.4    4.0
VC4      0.7        10           2.9    0.9      2.9    3.0         7.5    3.7      7.3    4.5
b) Mean Model (HAC)
                                 T = 100                            T = 300
Profile  Parameter  Magnitude    supLR  supLR∗3  CUSUM  XU          supLR  supLR∗3  CUSUM  XU
VC0      -          0            6.9    5.7      3.8    6.2         5.5    5.5      3.9    6.2
VC1      0.3        5            9.3    3.2      4.6    5.8         7.0    2.1      5.7    5.8
VC1      0.3        10           8.3    2.8      3.2    4.8         6.8    2.0      5.5    4.8
VC1      0.75       5            30.8   9.1      15.0   4.9         27.0   7.0      15.7   4.9
VC1      0.75       10           31.6   7.9      15.2   3.3         30.3   8.5      19.6   3.3
VC2      2          5            6.4    2.1      3.4    5.0         6.2    2.0      4.3    5.0
VC2      2          10           5.7    2.0      2.7    5.0         5.6    3.0      4.2    5.0
VC2      4          5            6.3    4.9      2.2    3.5         5.5    3.8      3.1    3.5
VC2      4          10           6.4    5.6      2.2    4.4         6.5    3.9      4.3    4.4
VC3L     linear     5            9.6    6.9      3.6    3.9         9.5    4.8      6.2    3.9
VC3L     linear     10           10.6   6.2      5.0    5.3         11.6   4.4      7.2    5.3
VC3Q     quadratic  5            15.7   7.5      7.0    3.9         14.3   5.6      8.2    3.9
VC3Q     quadratic  10           16.9   7.1      7.6    5.0         16.3   6.6      10.2   5.0
VC4      0.3        5            3.2    1.6      1.7    2.7         4.4    1.3      3.5    2.7
VC4      0.3        10           3.1    0.6      2.1    1.7         5.5    0.9      4.1    1.7
VC4      0.5        5            2.8    0.9      1.7    4.4         2.9    1.5      3.6    4.4
VC4      0.5        10           1.1    0.2      1.6    4.0         1.0    1.0      1.4    4.0
VC4      0.7        5            4.1    1.4      2.3    2.8         4.3    1.9      3.9    2.8
VC4      0.7        10           2.7    0.7      1.7    1.8         5.9    1.3      4.8    1.8
Table 1 (cont’d): Exact size of the structural change tests under heteroskedasticity
c) Zero-Mean Regressor
                                 T = 100                            T = 300
Profile  Parameter  Magnitude    supLR  supLR∗3  CUSUM  XU          supLR  supLR∗3  CUSUM  XU
VC0      -          0            3.9    5.4      4.1    5.6         5.1    4.1      3.1    4.1
VC1      0.3        5            7.3    4.9      4.6    6.2         7.4    2.9      4.0    4.2
VC1      0.3        10           8.5    7.1      4.7    6.0         7.5    3.8      4.9    6.0
VC1      0.75       5            35.6   5.8      15.3   5.8         37.0   3.8      15.8   4.1
VC1      0.75       10           39.2   6.7      16.0   5.3         40.9   5.8      19.3   4.9
VC2      2          5            5.4    4.8      4.3    5.7         5.5    2.1      4.0    4.4
VC2      2          10           5.3    2.5      4.2    5.9         5.2    3.1      4.4    5.8
VC2      4          5            5.6    4.6      2.5    4.8         5.4    2.1      3.7    4.5
VC2      4          10           7.2    5.6      4.2    5.7         6.6    3.4      4.5    5.5
VC3L     linear     5            8.7    7.5      3.1    3.4         11.4   4.0      5.1    4.3
VC3L     linear     10           11.0   6.8      4.2    4.9         12.4   4.8      7.0    6.3
VC3Q     quadratic  5            18.3   15.5     6.0    3.4         18.8   7.7      8.6    4.8
VC3Q     quadratic  10           19.5   17.0     7.8    5.2         21.0   9.9      9.2    5.5
VC4      0.3        5            4.9    2.6      3.5    4.6         6.9    1.7      5.8    5.4
VC4      0.3        10           7.1    1.4      2.7    2.6         11.5   4.3      7.5    4.8
VC4      0.5        5            1.9    1.4      2.8    5.3         1.9    2.5      3.0    5.1
VC4      0.5        10           0.9    3.0      0.8    2.9         1.3    4.3      3.1    4.4
VC4      0.7        5            5.9    1.7      2.4    3.9         5.3    2.4      4.4    4.4
VC4      0.7        10           7.0    1.8      3.1    2.9         10.9   4.0      7.5    5.2
d) Dynamic Model
                                 T = 100                            T = 300
Profile  Parameter  Magnitude    supLR  supLR∗3  CUSUM  XU          supLR  supLR∗3  CUSUM  XU
VC0      -          0            3.8    3.4      2.5    4.1         5.2    4.8      4.7    5.8
VC1      0.3        5            10.2   3.5      4.4    5.1         11.4   2.6      5.5    5.4
VC1      0.3        10           9.7    5.2      4.4    5.4         13.7   3.3      5.1    4.6
VC1      0.75       5            37.7   4.6      13.8   4.8         40.6   3.1      17.0   6.0
VC1      0.75       10           46.7   4.8      18.7   5.7         50.7   3.9      18.4   4.2
VC2      2          5            8.7    3.8      2.9    4.7         11.6   4.7      3.9    5.7
VC2      2          10           7.0    4.6      2.5    4.3         12.0   3.7      3.8    4.0
VC2      4          5            7.0    4.6      2.1    3.6         11.0   5.0      4.3    5.7
VC2      4          10           9.0    5.6      2.2    4.3         10.7   5.0      4.8    5.4
VC3L     linear     5            8.0    4.6      3.5    3.8         10.8   3.4      6.2    5.7
VC3L     linear     10           11.1   4.7      4.3    4.3         11.7   3.8      6.6    5.3
VC3Q     quadratic  5            13.7   9.1      6.3    4.3         17.8   7.3      10.2   5.7
VC3Q     quadratic  10           17.8   9.1      7.3    3.8         17.4   5.9      9.2    5.1
VC4      0.3        5            38.3   8.5      2.6    3.1         56.4   7.4      4.2    3.9
VC4      0.3        10           63.2   14.1     1.0    1.3         88.8   6.7      5.5    3.0
VC4      0.5        5            25.5   6.2      1.6    3.7         51.2   6.0      2.8    4.7
VC4      0.5        10           58.3   11.6     0.8    2.0         85.7   5.0      1.7    3.1
VC4      0.7        5            25.6   4.3      2.0    3.2         51.5   5.9      3.9    3.7
VC4      0.7        10           54.7   5.1      0.8    1.0         85.1   6.9      5.6    3.2
Figure 1: The original and the modified variance profiles after accounting for one break
[Panels: VC1 (parameter 0.3 and 0.75); VC2 (parameter 2 and 4); VC3 (Linear and Quadratic); VC4 (parameter 0.3 and 0.5). Legend: original, modified. Vertical axis: 0 to 140.]
Table 2: Asymptotic size of the sup LR and sup LR3 tests (%)
                      sup LR                 sup LR3
Magnitude:            1      5      10       1      5      10
VC0                   5.1                    3.5
VC1   0.3             6.6    6.4    6.3      3.6    3.2    2.9
VC1   0.75            13.8   29.0   29.4     3.3    4.1    2.3
VC2   2               5.6    4.6    6.6      5.6    4.7    6.5
VC2   4               6.0    5.1    5.3      5.7    5.1    4.7
VC3L                  10.1   12.3   9.6      6.3    7.4    6.2
VC3Q                  16.3   15.4   16.7     8.8    8.1    9.2
VC4   0.3             6.1    8.7    15.0     4.4    3.6    4.5
VC4   0.5             4.8    4.1    4.3      4.3    2.7    2.1
VC4   0.7             3.5    9.3    12.4     2.9    2.8    4.6
Table 3: Measures of size distortions
                      Measure                                      Measure
                      sup LR              sup LR3                  sup LR              sup LR3
Magnitude:            1     5     10      1     5     10           1     5     10      1     5     10
VC0   -               0.00                0.00                     0.00                0.00
VC1   0.3             0.63  0.89  0.91    0.00  0.00  0.00         0.37  0.52  0.54    0.00  0.00  0.00
VC1   0.75            1.03  2.15  2.32    0.00  0.00  0.00         0.58  1.21  1.30    0.00  0.00  0.00
VC2   2               0.27  0.27  0.27    0.25  0.25  0.25         0.08  0.08  0.08    0.08  0.08  0.08
VC2   4               0.15  0.15  0.15    0.17  0.17  0.17         0.14  0.14  0.14    0.14  0.14  0.14
VC3L                  1.00  1.00  1.00    0.39  0.39  0.39         0.60  0.60  0.60    0.24  0.24  0.24
VC3Q                  1.50  1.50  1.50    0.66  0.66  0.66         0.90  0.90  0.90    0.35  0.35  0.35
VC4   0.3             0.20  0.95  1.28    0.14  0.36  0.41         0.14  0.67  0.90    0.04  0.12  0.13
VC4   0.5             0.18  0.88  1.18    0.15  0.53  0.64         0.11  0.56  0.75    0.10  0.36  0.43
VC4   0.7             0.20  0.95  1.28    0.14  0.36  0.41         0.14  0.67  0.90    0.04  0.12  0.13
Figure 2: Measure of size distortions
[Panels: VC1 (parameter 0.3 and 0.75); VC2 (parameter 2 and 4); VC3 (Linear and Quadratic); VC4 (parameter 0.3 and 0.7). Legend: original, modified. Both axes: 0.0 to 1.0.]
Figure 3.1: Power of structural change tests when the coefficients have a single abrupt break; VC0
[Panels: Mean Model; Mean Model (HAC); Zero-Mean Regressor; Dynamic Model. Legend: SLR3, ZRB, GHK, XU, VS, ZWU. Horizontal axis: 0 to 10; vertical axis: 0.0 to 1.0.]
Figure 3.2: Power of structural change tests when the coefficients have a single abrupt break; VC1
[Figure not reproduced. Panels: Mean Model; Mean Model (HAC); Zero-Mean Regressor; Dynamic Model. Legend: SLR3, ZRB, GHK, XU, VS, ZWU, (c). Vertical axis: 0.0 to 1.0; horizontal axis: 0 to 10.]
Figure 3.3: Power of structural change tests when the coefficients have a single abrupt break; VC3L
[Figure not reproduced. Panels: Mean Model; Mean Model (HAC); Zero-Mean Regressor; Dynamic Model. Legend: SLR3, ZRB, GHK, XU, VS, ZWU, (c). Vertical axis: 0.0 to 1.0; horizontal axis: 0 to 40.]
Figure 4.1: Power of structural change tests when the coefficients follow a random walk; VC0
[Figure not reproduced. Panels: Mean Model; Mean Model (HAC); Zero-Mean Regressor; Dynamic Model. Legend: SLR3, ZRB, GHK, XU, VS, ZWU, (c). Vertical axis: 0.0 to 1.0; horizontal axis: 0 to 10.]
Figure 4.2: Power of structural change tests when the coefficients follow a random walk; VC1
[Figure not reproduced. Panels: Mean Model; Mean Model (HAC); Zero-Mean Regressor; Dynamic Model. Legend: SLR3, ZRB, GHK, XU, VS, ZWU, (c). Vertical axis: 0.0 to 1.0; horizontal axis: 0 to 10.]
Figure 4.3: Power of structural change tests when the coefficients follow a random walk; VC3L
[Figure not reproduced. Panels: Mean Model; Mean Model (HAC); Zero-Mean Regressor; Dynamic Model. Legend: SLR3, ZRB, GHK, XU, VS, ZWU, (c). Vertical axis: 0.0 to 1.0; horizontal axis: 0 to 40.]