researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · testing...
TRANSCRIPT
![Page 1: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/1.jpg)
Testing for Granger Causality in Large
Mixed-Frequency VARs
Thomas B. Gotz∗1, Alain Hecq†2, and Stephan Smeekes‡2
1Deutsche Bundesbank2Maastricht University
March 3, 2016
Abstract
We analyze Granger causality testing in a mixed-frequency VAR, where the differencein sampling frequencies of the variables is large, implying parameter proliferation problemsin case we attempt to estimate the model unrestrictedly. We propose several tests basedon reduced rank restrictions, including bootstrap versions thereof to account for factor es-timation uncertainty and improve the finite sample properties of the tests, and a BayesianVAR extended to mixed frequencies. We compare these methods to a test based on anaggregated model, the max-test (Ghysels et al., 2016a) and an unrestricted VAR-based test(Ghysels et al., 2016b) using Monte Carlo simulations. An empirical application illustratesthe techniques.JEL Codes: C11, C12, C32JEL Keywords: Granger Causality, Mixed Frequency VAR, Bayesian VAR, Reduced RankModel, Bootstrap Test
∗Correspondence to: Thomas B. Gotz, Deutsche Bundesbank, Macroeconomic Analysis and Projection Divi-sion, Wilhelm-Epstein-Strasse 14, 60431 Frankfurt am Main. Email: [email protected], Tel: +49-69-9566-6991, Fax: +49-69-9566-6734. The views expressed in this paper are ours and do not necessarily reflectthe views of the Deutsche Bundesbank or its staff. Any errors or omissions are our responsibility.
†Maastricht University, School of Business and Economics, Department of Quantitative Economics, P.O. Box616, 6200 MD Maastricht, The Netherlands. Email: [email protected]
‡Maastricht University, School of Business and Economics, Department of Quantitative Economics, P.O. Box616, 6200 MD Maastricht, The Netherlands. Email: [email protected]
1
![Page 2: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/2.jpg)
1 Introduction
Economic time series are published at various frequencies. While higher frequency variables
used to be aggregated (e.g., Silvestrini and Veredas, 2008), it has become more and more popular
to consider models that take into account the difference in frequencies of the processes under
consideration. As argued extensively in the mixed-frequency literature (among others, Ghysels
et al., 2007, Clements and Galvao, 2008, Ghysels et al., 2006, Marcellino and Schumacher, 2010),
working in a mixed-frequency setup instead of a common low-frequency one is advantageous due
to the potential loss of information in the latter scenario and feasibility of the former through
MI(xed) DA(ta) S(ampling) regressions (Ghysels et al., 2004), even in the presence of many
high-frequency variables compared to the number of observations.
Until recently, the MIDAS literature was limited to the single-equation framework, in which
one of the low-frequency variables is chosen as the dependent variable and the high-frequency
ones are in the regressors. Since the work of Ghysels (2015) for stationary series and the ex-
tension of Gotz et al. (2013) and Ghysels and Miller (2015) for the non-stationary and possibly
cointegrated case, we can analyze the link between high- and low-frequency series in a VAR
system treating all variables as endogenous. Note that the VAR vector in the model of Ghy-
sels (2015) is specified in terms of the low frequency by stacking the high-frequency variables
and aligning them with the low-frequency ones. Another approach of incorporating mixed
frequencies into VAR models is to write the entire model in high frequency by treating the
corresponding observations of the low-frequency series as latent. Noteworthy contributions to
this branch of the mixed-frequency VAR literature are, among others, Schorfheide and Song
(2015), Kuzin et al. (2011) or Eraker et al. (2015). We refer the reader to Foroni et al. (2013)
for an excellent survey on VAR models with mixed-frequency data.
An often investigated feature within VAR models is Granger causality, generally introduced
in Granger (1969). It is well known, though, that Granger causality is not invariant to temporal
2
![Page 3: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/3.jpg)
aggregation (Granger, 1988, Granger and Lin, 1995 or Marcellino, 1999). While most work on
this issue focused on cases in which all series are aggregated, Ghysels et al. (2016b) investigate
Granger causality in a setting, where variables are available at varying frequencies. Starting
from an underlying high-frequency process, it is shown that mixed-frequency Granger causality
tests better recover causal patterns in the data generating process than traditional low-frequency
tests do. However, decent size and power properties of their mixed-frequency tests depend on
a relatively small difference in sampling frequencies of the variables involved. Indeed, if the
number of high-frequency observations within a low-frequency period is large, size distortions
and loss of power may be expected.
In this paper we take the model of Ghysels (2015) and analyze the finite sample behavior
of Granger non-causality tests in the spirit of Dufour and Renault (1998) when the number
of high-frequency observations per low-frequency period is large as, e.g., in a month/working
day-example. To avoid the proliferation of parameters we consider two parameter reduction
techniques: reduced rank regressions and a Bayesian mixed-frequency VAR. As far as the
latter is concerned, we show precisely how to extend the dummy observation approach of
Banbura et al. (2010), itself built on the work of Sims and Zha (1998), to the presence of mixed
frequencies.1 Importantly, due to stacking the high-frequency variables in the mixed-frequency
VAR (Ghysels, 2015), their approach cannot be applied directly such that a properly adapted
choice of auxiliary dummy variables is required, which marks a significant contribution to the
respective literature by itself. With respect to reduced rank regressions, the associated factors
are typically not observable and must be estimated, which obviously affects the distribution
of the Wald tests to detect directions of Granger causalities. Consequently, another important
contribution of this paper is the introduction of bootstrap versions of these tests (also for the
unrestricted VAR), which have correct size even for large VARs and a small sample size.
1Banbura et al. (2010) refer to these variables as dummy observations. To avoid confusion between high- andlow-frequency observations and auxiliary variables, we use the term ’auxiliary dummy variables’ henceforth.
3
![Page 4: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/4.jpg)
Independently and simultaneously to this work, Ghysels et al. (2016a) introduced another
approach to testing for Granger causality in a mixed-frequency VAR characterized by a large
discrepancy in frequencies of the series involved. In particular, their method is based on a
sequence of parsimonious regression models where the low-frequency variable is regressed on
its own lags and only one specific lead or lag of the high-frequency series. The test statistic
itself is then computed as the maximum of these individual-regression-based estimators scaled
and weighted properly; hence the name “max-test”. The authors show that, in a mixed-
frequency VAR a la Ghysels (2015), this procedure leads to correct identification of the test
in the presence or absence of Granger causality, which implies consistency against causality of
all forms. Its simplicity and applicability to large dimensional systems makes the max-test the
main competitor for our tests.
Both reduced rank regressions and the Bayesian estimation of the mixed-frequency VAR,
are aiming at reducing the amount of parameters to be estimated before testing for Granger
causality. The former does so by imposing a certain set of restrictions leading to the computation
of a low-dimensional set of factors; the latter uses parameter shrinkage to a prior that is centered
at a restricted version of the VAR of Ghysels (2015). While these methods are very appealing
from an applied perspective, they are, much like the MIDAS polynomials, entirely ad hoc and
possibly misspecified. Indeed, if the restrictions imposed are not satisfied in the true data
generating process, i.e., unless we impose (over)identified restrictions, the resulting model is
not correctly specified. This, in turn, may result in potential inability of our methods to
detect causality, and in particular cause our methods to lose consistency against certain forms
of causality that cannot be detected under the misspecified restrictions. In contrast, Granger
causality tests based on the unrestricted VAR (Ghysels et al., 2016b) and the max-test (Ghysels
et al., 2016a) are not based on (possibly invalid) restrictions implying that they have asymptotic
power of one against all forms of causality. However, in situations where our tests are based
on a correctly specified model, our approaches may very well outperform the test based on
4
![Page 5: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/5.jpg)
the unrestricted VAR or the max-test due to the way in which the parameter proliferation
problem in our large dimensional system is addressed. Moreover, even in situations in which
the restrictions are not valid, in finite samples our methods may still allow one to pick up
causality with high power due to the dimensionality reduction which may offset the effect of
misspecification in small samples.
The rest of the paper is organized as follows. In Section 2 notations are introduced, the
mixed-frequency VAR (MF-VAR hereafter) for our specific case at hand is presented and
Granger (non-)causality is defined. Section 3 discusses the approaches to reduce the num-
ber of parameters to be estimated, whereby reduced rank restrictions (Section 3.1) as well as
Bayesian MF-VARs (Section 3.2) are presented in detail. The finite sample performance of
testing for Granger causality within the aforementioned restricted models is analyzed via a
Monte Carlo experiment in Section 4. We compare our tests with the one based on the unre-
stricted VAR, the max-test, and an asymptotic Wald test based on a common low-frequency
VAR obtained by temporally aggregating the high-frequency variable (Breitung and Swanson,
2002). An empirical example with U.S. data on the monthly industrial production index and
daily volatility in Section 5 illustrates the results. Section 6 concludes.
2 Causality in a Mixed-frequency VAR
2.1 Notation
Let us start from a two variable mixed-frequency system, where yt, t = 1, . . . , T, is the low-
frequency variable and x(m)t−i/m are the high-frequency variables with m high-frequency observa-
tions per low-frequency period t. Throughout this paper we assume m to be rather large as in
a year/month- or month/working day-example. We also assume m to be constant rather than
varying with t.2 The value of i indicates the specific high-frequency observation under con-
2As long as m is deterministic, even time-varying frequency discrepancies do not pose a problem on a theo-retical level. However, the assumption of constant m simplifies the notation greatly (Ghysels, 2015).
5
![Page 6: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/6.jpg)
sideration, ranging from the beginning of each t-period (x(m)t−(m−1)/m) until the end (x
(m)t with
i = 0). These notational conventions have become standard in the mixed-frequency literature
and are similar to the ones in Gotz et al. (2014), Clements and Galvao (2008, 2009) or Miller
(2014).
Furthermore, let Wt = (W ′t−1, W ′t−2, . . ., W ′t−p)′ denote the last p low-frequency lags of
any process W stacked. Finally, 0i×j (1i×j) denotes an (i × j)-matrix of zeros (ones), Ii is an
identity matrix of dimension i, ⊗ represents the Kronecker product and vec corresponds to the
operator stacking the columns of a matrix.
Remark 1 Extensions towards representations of higher dimensional multivariate systems as
in Ghysels et al. (2016b) can be considered, but are left for further research here. Analyzing
Granger causality among more than two variables inherently leads to multi-horizon causality
(see Lutkepohl, 1993 among others). The latter implies the potential presence of a causal chain:
for example, in a trivariate system, X may cause Y through an auxiliary variable Z. To abstract
from that scenario, Ghysels et al. (2016b) often consider cases in which high- and low-frequency
variables are grouped and causality patterns between these groups, viewed as a bivariate system,
are analyzed. They study the presence of a causal chain and multi-horizon causality in a Monte
Carlo analysis though.
2.2 MF-VARs
Considering each high-frequency variable such that
X(m)t = (x
(m)t , x
(m)t−1/m, . . . , x
(m)t−(m−1)/m)′, (1)
a dynamic structural equations model for the stationary multivariate process Zt = (yt, X(m)′t )′
is given by
AcZt = c+A1Zt−1 + . . .+ApZt−p + εt. (2)
6
![Page 7: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/7.jpg)
Note that the parameters in Ac are related to the ones in A1 due to stacking the high-frequency
observations X(m)t in Zt (Ghysels, 2015).3 Explicitly for a lag length of p = 1, the model reads
as:
1 β1 β2 . . . βm
δ1 1 −ρ1 . . . −ρm−1
δ2 0 1 . . . −ρm−2
......
.... . .
...
δm 0 0 . . . 1
yt
x(m)t
x(m)t−1/m
...
x(m)t−(m−1)/m
=
c1
c2
...
cm+1
+
ρy θ1 θ2 . . . θm
ψ2 ρm . . . . . . 0
ψ3 ρm−1 ρm . . . 0
......
.... . .
...
ψm+1 ρ1 ρ2 . . . ρm
yt−1
x(m)t−1
x(m)t−1−1/m
...
x(m)t−1−(m−1)/m
+
ε1t
ε2t
...
ε(m+1)t
.
(3)
In general, we assume the high-frequency process to follow an AR(q) process with q < m
(we set q = m in the equation above; if q < m, one should set ρq+1 = . . . = ρm = 0). For
non-zero values of βj or δj the matrix Ac links contemporaneous values of y and x, a feature
referred to as nowcasting causality (Gotz and Hecq, 2014).4
Pre-multiplying (3) by A−1c we get to the mixed-frequency reduced-form VAR(p) model:
Zt = µ+ Γ1Zt−1 + . . .+ ΓpZt−p + ut
= µ+B′Zt + ut.(4)
3Compared to Ghysels (2015) we simply reverse the mixed-frequency vector Zt and put the low-frequencyvariable first.
4βj 6= 0 implies that yt is affected by incoming observations of X(m)t , whereas δj 6= 0 implies that the
high-frequency observations are influenced by yt (see Gotz and Hecq, 2014). The latter becomes interestingfor studying policy analysis, where the high-frequency policy variable(s) may react to current low-frequencyconditions (see Ghysels, 2015 for details). Note that Gotz et al. (2015) consider a noncausal MF-VAR in whichone allows for non-zero elements in the lead coefficients of the Ac matrix.
7
![Page 8: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/8.jpg)
Consequently, µ = A−1c c, Γi = A−1
c Ai and ut = A−1c εt. Let B = (Γ1,Γ2, . . . ,Γp)
′ and B(z) =
Im+1 −∑p
j=1 Γjzj . We then make the following assumptions on the MF-VAR:
Assumption 2 Zt is generated by the MF-VAR(p) in (4), for which it holds that (i) the roots
of the matrix polynomial B(z) all lie outside the unit circle; (ii) ut is independent and identically
distributed (i.i.d.) with E(ut) = 0, E(utu′t) = Σu, with Σu positive definite, and E(||ut||4) <∞,
where || · || is the Frobenius norm.
Assumption 2(i) ensures that the MF-VAR is I(0), while (ii) is a standard assumption to
ensure validity of the bootstrap for VAR models, see, e.g., Paparoditis (1996), Kilian (1998) or
Cavaliere et al. (2012).
Remark 3 The i.i.d. assumption on ut is not strictly necessary to derive the limit distributions
of the test statistics and could be weakened to a martingale difference sequence (mds) assumption
(see, e.g., Ghysels et al., 2016b). However, it allows us to directly use the aforementioned
literature on the validity of our bootstrap method that uses i.i.d. resampling of the VAR residuals,
as considered in the simulation study in Section 4. Asymptotic validity of the bootstrap can also
be established for heteroskedastic error terms, provided one adapts the bootstrap method to a
wild bootstrap as we do in the empirical application in Section 5. Asymptotic validity of the
wild bootstrap for univariate autoregressions is established in Goncalves and Kilian (2004) and
extended to VAR models by Bruggemann et al. (2016). Section 3.1.4 provides further details
on our bootstrap methods and their validity.
Remark 4 Assumption 2 implies that the data are truly generated at mixed frequencies. Hence,
similar to Ghysels et al. (2016a) but unlike Ghysels et al. (2016b), we do not start with a com-
mon high-frequency data generating process (DGP hereafter).5 The latter would imply causality
5Given such a situation it would seem natural to cast the model in state space form and estimate the param-eters using the Kalman filter. However, this amounts to a ’parameter-driven’ model (Cox et al., 1981), whichcontains latent processes, i.e., the high-frequency observations of y. The latter is a feature we try to avoid in
8
![Page 9: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/9.jpg)
patterns to arise at the high frequency. Consequently, we do not investigate which causal re-
lationships at high frequency get preserved when moving to the mixed- or low-frequency case.
Indeed, an extension of our methods along these lines would demand a careful analysis of the
mixed- and low-frequency systems corresponding to their latent high-frequency counterpart (see
Ghysels et al., 2016b for the unrestricted VAR case).
Equation (4) is easy to estimate for small m, yet becomes a rather large system as the latter
grows. For example, in a MF-VAR(1),
yt
x(m)t
x(m)t−1/m
...
x(m)t−(m−1)/m
=
µ1
µ2
...
µm+1
+
γ1,1 γ1,2 · · · γ1,m+1
γ2,1 γ2,3 · · · γ2,m+1
......
. . ....
γm+1,1 γm+1,2 · · · γm+1,m+1
︸ ︷︷ ︸
Γ1
×
yt−1
x(m)t−1
x(m)t−1−1/m
...
x(m)t−1−(m−1)/m
+
u1t
u2t
...
u(m+1)t
,
(5)
uti.i.d.∼ (0((m+1)×1),Σu), Σu =
σ1,1 σ1,2 . . . σ1,m+1
σ2,1 σ2,2 . . ....
......
. . ....
σm+1,1 . . . . . . σm+1,m+1
, (6)
our MF-VAR: say we are interested in the impact of shocks to one or several variables on the whole system.Using a high-frequency DGP with missing observations implies that shocks to these latent processes are alsolatent and unobservable. This is undesirable given that, e.g., policy shocks are, of course, observable (Foroni andMarcellino, 2014).
9
![Page 10: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/10.jpg)
there are (m + 1)2 parameters to estimate in the matrix Γ1. Additional lags would further
complicate the issue.
2.3 Granger Causality in MF-VARs
Let Ωt represent the information set available at moment t and let ΩWt denote the corresponding
set excluding information about the stochastic process W . With P [X(m)t+h|Ω] being the best linear
forecast of X(m)t+h based on Ω, Granger non-causality is defined as follows (Dufour and Renault,
1998):
Definition 5 y does not Granger cause X(m) if
P [X(m)t+1 |Ω
yt ] = P [X
(m)t+1 |Ωt]. (7)
Similarly, X(m) does not Granger cause y if
P [yt+1|ΩX(m)
t ] = P [yt+1|Ωt]. (8)
In other words, y does not Granger cause X(m) if past information of the low-frequency
variable do not help in predicting current (or future) values of the high-frequency variables and
vice versa. In terms of (5), testing for Granger non-causality implies the following null (and
alternative) hypotheses:
• y does not Granger cause X(m)
H0 : γ2,1 = γ3,1 = . . . = γm+1,1 = 0,
HA : γi,1 6= 0 for at least one i = 2, . . . ,m+ 1.(9)
10
![Page 11: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/11.jpg)
• X(m) does not Granger cause y
H0 : γ1,2 = γ1,3 = . . . = γ1,m+1 = 0,
HA : γ1,i 6= 0 for at least one i = 2, . . . ,m+ 1.(10)
3 Parameter Reduction
This section presents techniques that we have considered, and evaluated through a Monte Carlo
exercise, with the aim to reduce the amount of parameters to be estimated in the MF-VAR
model. Two approaches are discussed in detail, reduced rank restrictions and a Bayesian MF-
VAR.
There are many alternative approaches to reduce the number of parameters among which are
principal components, Lasso (Tibshirani, 1996) or ridge regressions (Hoerl and Kennard, 1970,
among others). However, using principal components does not necessarily preserve the dynamics
of the VAR under the null: nothing prevents, e.g., the first and only principal component to be
loading exclusively on y implying that the remaining dynamics enter the error term. In other
words, the autoregressive matrices in (4) may and will most likely not be preserved, which
naturally affects the block of parameters we test on for Granger non-causality. As for Lasso
and ridge regressions, it is well known that they may be interpreted in a Bayesian context.
In particular, the latter is equivalent to imposing a normally distributed prior with mean zero
on the parameter vector (Vogel, 2002, among others), while the former may be replicated
using a zero-mean Laplace prior distribution (Park and Casella, 2008). Given that our set of
models contains a Bayesian VAR for mixed-frequency data, we abstract from its connection to
regularized versions of least squares at this stage.
11
![Page 12: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/12.jpg)
3.1 Reduced Rank Restrictions
3.1.1 Reduced Rank Regression Model
In order to reduce the number of parameters to estimate in the MF-VAR, we propose the
following reduce rank regression model, for which we make the following assumption:
Assumption 6 Let BX(m) be the matrix obtained from B′ in (4) by excluding the first columns
of Γ1, Γ2, . . ., Γp. The rank of this matrix is smaller than the number of high-frequency
observations within Zt, i.e.,
rk(B′X(m)) = r < m.
The model then reads as follows:
Zt = µ+ γ·,1yt + α∑p
i=1 δ′iX
(m)t−i + νt
= µ+ γ·,1yt + αδ′X(m)t + νt,
(11)
where γ·,1 is the (m + 1) × p matrix containing the first columns of Γ1,Γ2, . . . ,Γp, and α and
δ = (δ′1, . . . , δ′p)′ are (m+ 1)× r and pm× r matrices, respectively. Note that (11) can also be
written in terms of Zt. Let us define Γi ≡ (γ(i)·,1 , αδ
′i), where γ
(i)·,1 , i = 1, . . . , p, corresponds to
the ith column of γ·,1. Then, (11) is equivalent to
Zt = µ+B′Zt + νt, (12)
12
![Page 13: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/13.jpg)
where B = (Γ1, . . . ,Γp)′. For p = 1 the model becomes
yt
x(m)
x(m)t−1/m
...
x(m)t−(m−1)/m
= µ+
γ1,1
γ2,1
...
γm+1,1
yt−1 +
α1
α2
...
αm+1
δ′1
x(m)t−1
x(m)t−1−1/m
...
x(m)t−1−(m−1)/m
+ νt
= µ+
γ1,1
γ2,1
...
γm+1,1
,
α1
α2
...
αm+1
δ′1
︸ ︷︷ ︸
Γ1
×
yt−1
x(m)t−1
x(m)t−1−1/m
...
x(m)t−1−(m−1)/m
+ νt,
(13)
where each αi, i = 1, . . . ,m + 1, is of dimension 1 × r and where δ1 is an m × r matrix.
Hence, we could call δ′X(m)t a vector of r high-frequency factors. Note that r = m − s, where
s represents the rank reduction we are able to achieve within X(m)t . In terms of parameter
reduction, the unrestricted VAR in (4) requires p(m + 1)2 coefficients to be estimated in the
autoregressive matrices, whereas the VAR under reduced rank restrictions in (11) or (12) needs
p(m + 1) + r(m + 1) + prm parameter estimates. As an example, assume p = 1 and m = 20.
Then, if r = 1, 2 or 3, there are, respectively, 62, 103 or 144 coefficients to be estimated in
Γ1 instead of 441 in Γ1. While being reduced by a large amount, the number of parameters
to be estimated remains considerable implying that the tests may still incur small sample size
distortions. This is one of the reasons why we also consider bootstrap versions of these tests.
Note that we do not require yt−1 to be included in the same transmission mechanism as the x
variables.
Remark 7 There are several ways to justify the reduced rank feature of the autoregressive
13
![Page 14: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/14.jpg)
matrix BX(m). First, at the model representation level we may assume that, in the structural
model (3), x follows an AR(q) process with q < m and that the last m − q elements of each
X(m)t−i , i = 1, . . . , p, have a zero coefficient in the equation for yt. Plugging these restrictions
into (4) results in a reduced rank of B′X(m) because the matrices Γi = A−1
c Ai have the rank of
Ai. Second, at the empirical level one can interpret the MF-VAR as an approximation of the
VARMA obtained after block marginalization of a high-frequency VAR for each variable. In
this situation, reduced rank matrices may empirically not be rejected by the data because of the
combinations of many elements. Thus, a small number of dynamic factors can approximate
more complicated (possibly nonlinear) dynamics. Finally, the way one typically restricts the
MF-VAR in (4), i.e., assuming the high-frequency series to follow an ARX-process (Ghysels,
2015), actually implies a reduced rank representation. Looking slightly ahead, consider Equation
(25) or the matrix in (29), which we will use in our Monte Carlo section, for the special case
p = 1. It is obvious that for the first order MF-VAR, Zt = Γ1Zt−1 + ut, with
Γ1 =
γ1,1 γ1,2 γ1,3 . . . γ1,21
γ∗2,1 ρ20 0 . . . 0
γ∗3,1 ρ19... . . . 0
......
.... . .
...
γ∗21,1 ρ 0 . . . 0
we have, due to the block of zeros, a reduced rank matrix BX(m) with two (restricted coefficient)
14
![Page 15: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/15.jpg)
factors:
rk
γ1,2 γ1,3 . . . γ1,21
ρ20 0 . . . 0
ρ19... . . . 0
......
. . ....
ρ 0 . . . 0
= rk
1 0
0 ρ20
0 ρ19
......
0 ρ
AA−1
γ1,2 γ1,3 . . . γ1,21
1 0 . . . 0
= 2,
where A is a 2 × 2 full rank rotation matrix, the identity in this particular example. Hence,
under HA, e.g., two factors (although unrestricted in our setting) will capture the reduced rank
feature. Note that we could impose (overidentifying) restrictions to retain the large block of
zeros in Γ1. In that sense, the model with r = 2 would not be misspecified but “suboptimal”.
On the other hand, with merely one factor we would not be able to impose the block of zeros
implying the model to be misspecified. Under the null hypothesis the model is not misspecified
with either one or two factors, because the test tries to capture some dynamics that is not there.
Note that our focus is to compare the performance of different approaches including their
bootstrap versions (whenever applicable) and that in this light we view reduced rank regressions
as just one particular specification. In particular, next to the Bayesian MF-VAR and the max-
test, our bootstrap implementation has also made the unrestricted MF-VAR (which is not
subject to rank misspecification) a viable option to compare the reduced rank regressions with.
3.1.2 Testing for Granger Non-Causality
Given r and Assumption 6, under the condition that the high-frequency factors δ′X(m)t are
observable, we can estimate (11) by OLS. Letting Γ = (µ, γ·,1, α)′ denote the corresponding
OLS estimates, we can test for Granger non-causality by defining R as the matrix that picks
the set of coefficients we want to do inference on, i.e., Rvec(Γ). For a general construction of
15
![Page 16: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/16.jpg)
the matrix R in the presence of several low- and high-frequency variables,6 we refer the reader
to Ghysels et al. (2016b). The Wald test is constructed as
ξW =[Rvec(Γ)
]′(RΩR′)−1
[Rvec(Γ)
], (14)
with
Ω = (W ′W )−1 ⊗ Σν , (15)
where Σν = 1T
∑Tt=p+1 νtν
′t, νt are the OLS residuals of (11), and W = (W1, . . . ,WT )′ is the
regressor set consisting of Wt = (1, y′t, δ′X
(m)′t )′.7 As illustrated in Ghysels et al. (2016b), ξW is
asymptotically χ2rank(R). A robust version of (14) is also implemented in the empirical section.
Of course, the factors are typically not observable and must be estimated (unless we im-
pose specific factors). Using estimated rather than observed factors in the regression (11) will
obviously affect the distribution of ξW , certainly in small samples. For this reason we consider
bootstrap versions of the tests. Before going into the details on the bootstrap we describe how
we estimate or impose the factors.
3.1.3 Estimating the Factors
We use three ways to estimate the factors, canonical correlation analysis (CCA hereafter),
partial least squares (PLS hereafter) and heterogeneous autoregressive (HAR hereafter) type
restrictions. We briefly present the algorithms that are used to extract those factors.
CCA is based on analyzing the eigenvalues and corresponding eigenvectors of
Σ−1
X(m)
X(m)ΣX
(m)Z
Σ−1ZZ
ΣZX
(m) (16)
6Within the unrestricted MF-VAR in (4), though.7A sample size correction, i.e., using Σν = 1
T−KW
∑Tt=p+1 νtν
′t, where KW is the amount of elements in W ,
may alleviate size distortions in finite samples of tests using asymptotic critical values.
16
![Page 17: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/17.jpg)
or, similarly, of the symmetric matrix
Σ−1/2
X(m)
X(m)ΣX
(m)Z
Σ−1ZZ
ΣZX
(m)Σ−1/2
X(m)
X(m) . (17)
For a detailed discussion we refer the reader to Anderson (1951) or, for the application to com-
mon dynamics, to Vahid and Engle (1993). Note that Σij represents the empirical covariance
matrix of processes i and j. Furthermore, Z and X(m)
indicate Zt and X(m)t , respectively, to
be concentrated out by the variables that do not enter in the reduced rank regression, i.e., the
intercept and yt−1. Denoting by V = (v1, v2, . . . , vr), with v′ivj = 1 for i = j and 0 otherwise,
the eigenvectors corresponding to the r largest eigenvalues of the matrix in (17), we obtain the
estimators of α and δ as:
α = Σ−1ZZ
ΣZX
(m)Σ−1/2
X(m)
X(m) V
δ = Σ−1/2
X(m)
X(m) V .
(18)
Note that the estimation of the eigenvectors obtained from the canonical correlation analysis
in (16) or (17) may, however, perform poorly with high-dimensional systems, because inversions
of the large variance matrices Σ−1ZZ
and Σ−1
X(m)
X(m) are required. As an alternative to CCA we
use a PLS algorithm similar to the one used in Cubadda and Hecq (2011) or Cubadda and
Guardabascio (2012). In order to make the solution of this eigenvalue problem invariant to
scale changes of individual elements, we compute the eigenvectors associated with the largest
eigenvalues of the matrix
D−1/2
X(m)
X(m)ΣX
(m)ZD−1ZZ
ΣZX
(m)D−1/2
X(m)
X(m)
with DX
(m)X
(m) and DZZ being diagonal matrices having the diagonal elements of, respectively,
ΣX
(m)X
(m) and ΣZZ as their entries. The computation of α and δ works in a similar fashion as
with CCA-based factors.
17
![Page 18: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/18.jpg)
Finally, we may impose the presence of r = 3p factors,8 inspired by the HAR-model (Corsi,
2009). For i = 1, . . . , p:
δi =
03(i−1)×m
1 01×(m−1)
11×( 14m) 0(1× 3
4m)
11×m
03(p−i)×m
′
⇒ δ′X(m)t =
x(m)t−1∑ 1
4m−1
i=0 x(m)t−1−i/m∑m−1
i=0 x(m)t−1−i/m...
x(m)t−p∑ 1
4m−1
i=0 x(m)t−p−i/m∑m−1
i=0 x(m)t−p−i/m
. (19)
For p = 1 and m = 20 this corresponds to
δ′X(m)t =
x
(20)t−1∑4
i=0 x(20)t−1−i/20∑19
i=0 x(20)t−1−i/20
≡
xDt−1
xWt−1
xMt−1
, (20)
with xDt , xWt and xMt denoting daily, weekly and monthly measures, respectively.
Remark 8 As noted in Ghysels and Valkanov (2012) and Ghysels et al. (2007), these HAR-
type restrictions are a special case of MIDAS with step functions introduced in Forsberg and
Ghysels (2007). Considering partial sums of regressors x as Xt(K,m) =∑K
i=0 x(m)t−i/m, a MIDAS
regression with M steps reads as yt = µ +∑M
j=1 βjXt(Kj ,m) + εt, where K1 < . . . < KM .
Alternatively, we could use MIDAS restrictions as introduced in Ghysels et al. (2004), for the
difference between the latter and HAR-type restrictions has been shown to be small (Ghysels
and Valkanov, 2012). However, step functions have the advantage of not requiring non-linear
estimation methods, since the distributed lag pattern is approximated by a number of discrete
8To ensure that r < m we assume that p < 13m at this stage.
18
![Page 19: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/19.jpg)
steps, which simplifies the analysis. Furthermore, and more crucially in the context of this paper,
implementing MIDAS restrictions and testing for Granger non-causality implies the well-known
Davies (1987) problem, i.e., the parameters determining the MIDAS weights (see Ghysels et al.,
2004 for details) are not identified under the null hypothesis.9
3.1.4 Bootstrap Tests
Despite the dimensionality reduction achieved by imposing the factor structure, a considerable
number of parameters remains to be estimated for conducting the Wald tests such that the tests
may still be subject to considerable small sample size distortions. Moreover, the estimation of
the factors will have a major effect on the distribution of the Wald test statistic. We therefore
also consider a bootstrap implementation of these tests to improve the properties of the tests
in finite samples.
Let B = Γδ′ denote the estimates of B obtained by one of the reduced rank methods, and
assume that δ is normalized such that its upper r × r block is equal to the identity matrix.10
The first bootstrap method we consider is a standard “unrestricted” bootstrap, that may have
superior power properties in certain cases (cf. Paparoditis and Politis, 2005). Its algorithm
looks as follows.
1. Obtain the residuals from the MF-VAR(p)
νt = Zt − µ− B′Zt, t = p+ 1, . . . , T. (21)
9To properly test for Granger non-causality in this case, a grid for the weight specifying parameter vectorhas to be considered and the corresponding Wald tests for each candidate have to be computed. Subsequently,one can calculate the supremum of these tests (Davies, 1987) and obtain an ’asymptotic p-value’ using bootstraptechniques (see Hansen, 1996 or Ghysels et al., 2007 for details). While this approach is feasible, it is computa-tionally more demanding. Admittedly, HAR-type restrictions provide less flexibility than the MIDAS approach,yet their simplicity makes them very appealing from an applied perspective.
10This ensures that the rotation of the factors is the same for the original sample and the bootstrap sample,which in turn makes sure the correct bootstrap null hypothesis is imposed for the first bootstrap method. In thesimulations we experimented with different normalizations which did not change the results.
19
![Page 20: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/20.jpg)
2. Draw the bootstrap errors ν∗1 , . . . , ν∗T with replacement from νy,p+1, . . . , νy,T .
3. Letting Z∗t = (Z∗′t−1, . . . , Z∗′t−p)
′, construct the bootstrap sample Z∗t recursively as
Z∗t = µ+ B′Z∗t + ν∗t , t = 1, . . . , T, Z∗0 = 0. (22)
Note that the null hypothesis in the bootstrap is not imposed which has consequences for
the next step.
4. Estimate the bootstrap equivalent of (11), where the factors are estimated using the same
method as for the original sample, and obtain the bootstrap Wald test statistic ξ∗W . As the
“true” bootstrap parameters governing Granger causality are different from zero, we need
to adapt the bootstrap Wald test such that it tests the correct null hypothesis. Observing
that those parameters form a subset of the estimates in Γ used in (22), the appropriate
Wald test statistic is
ξ∗W =[Rvec(Γ
∗)− Rvec(Γ)
]′(RΩ∗R′)−1
[Rvec(Γ
∗)− Rvec(Γ)
], (23)
where all quantities with a superscript ‘*’ are calculated analogously to their sample
counterparts but then using the bootstrap sample.
5. Repeat steps 2 to 4 B times, and calculate the bootstrap p-value as the proportion of
bootstrap samples for which ξ∗W > ξW .
While this bootstrap method has the advantage that it can be used for testing Granger
causality in both directions (as only the matrix R needs to change), it also has some drawbacks.
In particular, even with the dimensionality reduction provided by the reduced rank estimation,
it still relies on the generation of a bootstrap sample based on an (m + 1)-dimensional MF-
VAR(p) that requires p(m+ 1) + r(m+ 1) + prm parameters to estimate. This may make the
20
![Page 21: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/21.jpg)
bootstrap unstable and prone to generate outlying samples, with potential size distortions or a
loss of power as a result.
Therefore we consider a second bootstrap method that imposes the null hypothesis of no
Granger causality and in doing so achieves a further significant reduction of dimensionality. This
method is very similar to the parametric wild bootstrap proposed in Ghysels et al. (2016a). A
downside of this method is that it requires separate bootstrap algorithms for testing in the two
opposite directions. We describe the procedure here for the null hypothesis that X(m) does not
Granger cause y, and comment on the test in the opposite direction below.
1. Estimate an AR(p) model for y and obtain the residuals
uy,t = yt − µ−p∑j=1
ρy,jyt−j , t = p+ 1, . . . , T.
2. Draw the bootstrap errors u∗1, . . . , u∗T with replacement from uy,p+1, . . . , uy,T .
3. Construct the bootstrap sample y∗t recursively as
y∗t =
p∑j=1
ρy,jy∗t−1 + u∗t , t = 1, . . . , T, y∗0 = 0,
thus imposing the null of no Granger causality from X(m) to y.
4. Letting X(m)∗t = X
(m)t , use the bootstrap sample to estimate the MF-VAR using the same
estimation method as for the original estimator, and obtain the corresponding Wald test
statistic ξ∗W . As the null hypothesis of no Granger causality has been imposed in the
bootstrap, ξ∗W now has the same form as ξW in (14).
Note that only p parameters need to be estimated in order to generate the bootstrap sam-
ple, most likely making the procedure more stable than the unrestricted bootstrap above. A
similar bootstrap procedure can be implemented for the reverse direction of causality by only
21
![Page 22: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/22.jpg)
resampling X(m)t while keeping yt fixed. As one can resample X
(m)t independently of yt, it can
be done in the original high-frequency as a univariate process. Said differently, it only requires
fitting an AR(q) process to x(m)t , again achieving a large reduction of dimensionality.
Given the simplicity and apparent advantages of the second bootstrap method, one may
wonder why one should consider the first, unrestricted, bootstrap method at all. However,
in our simulation study we find that the unrestricted method has better size properties than
the restricted method for testing causality from y to X(m), in particular in the presence of
nowcasting causality.11
It is also possible in step 2 of both algorithms to use the wild bootstrap to be robust against
heteroskedasticity. In this case the bootstrap errors are generated as u∗t = ξ∗t ut, where we
generate ξ∗1 , . . . , ξ∗T as independent standard normal random variables. This would constitute
the so-called “recursive wild bootstrap” scheme that Goncalves and Kilian (2004) for univariate
autoregressions and Bruggemann et al. (2016) for VAR models prove to be asymptotically valid
under conditionally heteroskedastic errors. In particular, employing the wild bootstrap then
allows us to replace the i.i.d. assumption in Assumption 2 with a martingale difference sequence
assumption. We implement the wild bootstrap version in the empirical application in Section
5.
Remark 9 Kilian (1998) and Paparoditis (1996) prove the asymptotic validity of the unre-
stricted VAR bootstrap under Assumption 2. The validity of the restricted bootstrap can be
derived from the results for AR(p) processes in Bose (1988), and the observation that condi-
tioning is a valid approach in the absence of Granger causality. Van Giersbergen and Kiviet
(1996) provide a detailed examination of the two different types of bootstraps in the context of
autoregressive distributed lag models and demonstrate that the restricted bootstrap even works
well in the absence of strong exogeneity. Dufour et al. (2006) apply the bootstrap to causality
11Simulation results explicitly comparing the two bootstrap methods are available upon request.
22
![Page 23: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/23.jpg)
testing in VAR models and relax the assumption of i.i.d. error terms.12
Remark 10 Goncalves and Perron (2014) consider factor-augmented regression models, with
factors estimated by principal components. In this setting the asymptotic impact of the factor
estimation depends on which asymptotic framework is assumed. Under the asymptotic frame-
work, in which factor estimation has a non-negligible effect on the limit distribution of the
regression estimators, the bootstrap correctly mimics this effect and is therefore asymptotically
valid. Consequently, the bootstrap provides a much more accurate approximation to the finite
sample distribution of the regression estimators than the asymptotic approximation, that as-
sumes a negligible effect of factor estimation. We expect the bootstrap to have similar properties
in our setting where the factors are estimated using CCA or PLS.
3.2 Bayesian MF-VARs
3.2.1 Restricted MF-VARs
Ghysels (2015) discusses the issue of parsimony in MF-VAR models by specifying the high-
frequency process in such a way as to allow a number of parameters that is independent from
m. Indeed, while it seems reasonable to leave the equation for yt unrestricted,13 it is less clear
for the remaining ones. Hence, let us make the following assumption (see Ghysels, 2015 or the
numerical examples in Ghysels et al., 2016a).
Assumption 11 The high-frequency process x(m)t follows an AR(1) model with one lag of the
12If the test statistic (14) is asymptotically pivotal, we may expect the bootstrap to provide asymptoticrefinements as well and, hence, reduce small sample size distortions (see Bose, 1988 or Horowitz, 2001).
13In fact, viewed as single equation, it boils down to an unrestricted MIDAS model (Foroni et al., 2015) withoutcontemporaneous observations of the high-frequency variable. MIDAS restrictions (Ghysels et al., 2004) can beimposed here as an alternative. However, doing so implies leaving the linear framework, which is needed to applythe auxiliary dummy variable approach presented below. Furthermore, even after having drawn the MIDAShyperparameters, one still faces the aforementioned Davies (1987) problem when attempting to test for Grangernon-causality from X(m) to y.
23
![Page 24: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/24.jpg)
low-frequency variable in the regressor set
x(m)t−i/m = µi+2 + ρx
(m)t−(i+1)/m + πyt−1 + υ(i+2)t (24)
for i = 0, . . . ,m− 1.
Completing the system with the equation for yt leads to the following restricted MF-VAR:
yt
x(m)t
x(m)t−1/m
...
x(m)t−(m−1)/m
=
µ1∑m−1i=0 ρiµ2+i∑m−2i=0 ρiµ3+i
...
µm+1
+
γ(1)1,1 γ
(1)1,2 γ
(1)1,3 . . . γ
(1)1,m+1
π∑m−1
i=0 ρi ρm
0m×(m−1)
π∑m−2
i=0 ρi ρm−1
......
π ρ
×
yt−1
x(m)t−1
x(m)t−1−1/m
...
x(m)t−1−(m−1)/m
+∑p
k=2
γ(k)1,1 γ
(k)1,2 . . . γ
(k)1,m+1
0m×(m+1)
×
yt−k
x(m)t−k...
x(m)t−k−(m−1)/m
+
υ1t∑m−1i=0 ρiυ(m+1−i)t∑m−2i=0 ρiυ(m+1−i)t
...
υ(m+1)t
︸ ︷︷ ︸
υ∗t
,
(25)
where γ(k)i,j corresponds to the (i, j)-element of matrix Γk in (4). As for the error terms υit, i =
1, . . . ,m+1, we set E(υitυit) = σHH , E(υ1tυit) = σHL for i = 2, . . . ,m+1, and E(υ1tυ1t) = σLL.
Furthermore, each error term is assumed to possess a zero mean and to be normally distributed.
24
![Page 25: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/25.jpg)
Consequently, υ∗t ∼ N(0((m+1)×1),Συ∗), whereby we refer the reader to Ghysels (2015) for the
exact composition of Συ∗ .
3.2.2 The Auxiliary Dummy Variable Approach for MF Data
As pointed out by Carriero et al. (2011), Bayesian methods allow the imposition of restrictions
such as the ones in Assumption 11, while also admitting influence of the data. Consequently,
Bayesian shrinkage has become a standard tool when being faced with large-dimensional esti-
mation problems such as large VARs (e.g., Banbura et al., 2010, Kadiyala and Karlsson, 1997).
As for MF-VARs, Ghysels (2015) describes a way to sample the MIDAS hyperparameters and
subsequently formulates prior beliefs for the remaining parameters.14 Once these hyperparam-
eters are taken care of, the Bayesian analyses of mixed- and common-frequency VAR models
are quite similar and, hence, traditional Bayesian VAR techniques (e.g., Kadiyala and Karlsson,
1997, Litterman, 1986) can be applied.
We follow the approach of Banbura et al. (2010), which in turn is built on the work of
Sims and Zha (1998), showing that adding a set of auxiliary dummy variables to the system
is equivalent to imposing a normal inverted Wishart prior. The specification of prior beliefs is
derived from the Minnesota prior in Litterman (1986), whereby we center the prior distributions
14McCracken et al. (2015) use a Sims-Zha shrinkage prior and the algorithm in Waggoner and Zha (2003) tosolve the parameter proliferation problem with Bayesian estimation techniques. Bayesian methods within mixed-frequency VARs are also considered by Schorfheide and Song (2015). However, they use a latent high-frequencyVAR instead of the mixed-frequency system a la Ghysels (2015).
25
![Page 26: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/26.jpg)
of the coefficients in B around the restricted MF-VAR in (25):
E[γ(k)i,j ] =
ρm if i = j = k = 1
ρm+j−i if k = 1, j = 2, i > 1
0 else
,
V ar[γ(k)i,j ] =
φλ
2
k2SLH for i = 1, j > 1
φλ2
k2SHL for j = 1, i > 1
λ2
k2else
,
(26)
where all γ(k)i,j are assumed to be a priori independent and normally distributed. The covariance
matrix of the residuals is for now assumed diagonal and fixed, i.e., Σu = Σ = Σd with Σd =
diag(σ2L, σ
2H , . . . , σ
2H) of dimension m + 1. The tightness of the prior distributions around the
AR(1) specification in (24) is determined by λ,15 the influence of low- on high-frequency data
and vice versa is controlled by φ and, finally, Sij =σ2i
σ2j, i, j = L,H, governs the difference in
scaling between y and the x-variables. For µ we take a diffuse prior.
Remark 12 As for the common-frequency case, the expressions for V ar[γ(k)i,j ] in (26) imply
that more recent (low-frequency) lags provide more reliable information than more distant ones.
However, due to the stacked nature of X(m)t , the coefficients could be shrunk according to their
high-frequency time difference instead. This implies specifying the lag associated with each
coefficient in fractions of the low-frequency time index: yt and x(m)t−1−i/m, e.g., are 1 + i/m low-
frequency time periods apart such that the denominator in the corresponding coefficient’s prior
variance would equal (1 + i/m)2.
However, in the context of Granger causality testing such a specification is problematic as
the coefficients to test on are shrunk in ”opposite directions”: yt and x(m)t−1−1/2 are 1.5 t-periods
apart, whereas x(m)t−1−1/2 and yt−1 are separated by only 0.5 low-frequency periods. Consequently,
15λ ≈ 0 results in the posterior coinciding with the prior, whereas λ =∞ causes the posterior mean to coincidewith the OLS estimate of the unrestricted VAR in (4).
26
![Page 27: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/27.jpg)
for a given λ, the coefficients are shrunk much more when testing for Granger causality from
X(m) to y, especially for large m. This makes it difficult to control the size of the respective
tests in one or the other direction. The common-frequency handling of the prior variances in
(26) mitigates this issue, although some loss in power has to be assumed.
Note that a treatment of the mixed-frequency nature of the variables along the lines outlined
above may well be advantageous in other circumstances (e.g., forecasting) such that we present
this approach in detail in Appendix A.16
Given the prior beliefs specified before, the analysis is very similar to the one in Banbura
et al. (2010). In short, let us write the MF-VAR as
Z = ZB∗ + E,
where Z = (Zµ1 , . . . , ZµT )′ with Zµt = (Z ′t, 1)′, Z = (Z1, . . . , ZT )′, E = (u1, . . . , uT )′ and B∗ =
(B′, µ)′. Then, one can show that augmenting the model by two dummy variables, Yd and Xd,
is equivalent to imposing a normal inverted Wishart prior that satisfies the moments in (26).
Finally, estimating the augmented model by ordinary least squares gives us the posterior mean
of the coefficients, on which we can do inference as outlined in the next subsection.
Note that Xd is, in fact, identical to the matrix for the common-frequency VAR (Banbura
et al., 2010); Yd, however, is slightly different due to the prior means being centered around the
restricted VAR in (25):
16When extending the approach we stick to the standard structure of the Minnesota prior in Litterman (1986).It is, however, also feasible to address the aforementioned issue by introducing, say, three hyperparameters: λ1
governing the prior tightness for the coefficients determining Granger causality from y to X(m)t , λ2 for parameters
controlling causality in the reverse direction and λ3 taking care of the remaining coefficients. This strategy is,however, a rather straightforward extension of the theory presented in Appendix A such that we do not presentit here.
27
![Page 28: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/28.jpg)
Yd︸︷︷︸((m+1)(p+1)+1)×(m+1)
=
ρmσL/λ 0 0 . . . 0
0 ρmσH/λ ρm−1σH/λ . . . ρσH/λ
0((m+1)p−2)×(m+1)
diag(σL, σH , . . . , σH)
01×(m+1)
,
Augmenting the model is then achieved by setting Z∗ = (Z ′, Y ′d)′ and Z∗ = (Z ′, X ′d)′.
3.2.3 Testing for Granger Non-Causality
In terms of analyzing the Granger non-causality testing behavior within the Bayesian MF-
VAR, the auxiliary dummy variable approach is quite appealing, as it provides a closed-form
solution for the posterior mean of the coefficients. It is thus straightforward to compare the
testing behavior with the ones from alternative approaches using the Wald test.17 The latter
is derived analogously to the one in (14), and is, given Assumption 11, also asymptotically
χ2rank(R)-distributed.
Remark 13 Because the parameter vector vec(B∗) is not assumed fixed but random, construc-
tion of the test statistic, and especially its interpretation, need to be treated with special care. We
consider Bayesian confidence intervals, in particular highest posterior density (HPD) confidence
intervals (Bauwens et al., 2000), the tightness of which is expressed by α, not coincidentally
the same letter that denotes the significance level in the frequentist’s framework. Here, it is
interpreted such that (1 − α)% of the probability mass falls within the respective interval. In
other words, the probability that a model parameter falls within the bounds of the interval is
equal to (1− α)%. The interval centered at the modal value for unimodal symmetric posterior
17There is a large sample correspondence between classical Wald and Bayesian posterior odds tests (Andrews,1994). For certain choices of the prior distribution, the posterior odds ratio is approximately equal to the Waldstatistic. Andrews (1994) shows that for any significance level α there exist priors such that the aforementionedcorrespondence holds, and vice versa.
28
![Page 29: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/29.jpg)
densities is then called the HPD (Zellner, 1996 or Bauwens et al., 2000). The connection to
Wald tests is now immediate using the equivalence between confidence intervals and respective
test statistics, whereby similar care is demanded when interpreting results.
3.3 Benchmark Models
3.3.1 Low-Frequency VAR
Before the introduction of MIDAS regression models, high-frequency variables were usually
aggregated to the low frequency in order to obtain a common frequency for all variables
appearing in a regression (Silvestrini and Veredas, 2008 or Marcellino, 1999). Likewise for
systems, a monthly variable, for example, was usually aggregated to, say, the quarterly fre-
quency such that a VAR could be estimated in the resulting common low frequency. Formally,
xt = W (L1/m)x(m)t , where W (L1/m) denotes a high-frequency lag polynomial of order A, i.e.,
W (L1/m)x(m)t =
∑Ai=0wix
(m)t−i/m (Silvestrini and Veredas, 2008).18 As far as testing for Granger
non-causality is concerned, we can rely on the asymptotic Wald statistic in (14), where the set
of regressors, the matrices Ω and R as well as the coefficient matrix are suitably adjusted.19
Remark 14 Naturally, such temporal aggregation leads to a great reduction in parameters.
After all, each set of m high-frequency variables per t-period is aggregated into one low-frequency
observation. Of course, this decrease in parameters comes at the cost of disregarding information
embedded in the high-frequency process. As argued in Miller (2011), if the aggregation scheme
employed is different from the true one underlying the DGP, potentially crucial high-frequency
information will be forfeited. Additionally, aggregating a high-frequency variable may lead to
“spurious” (non-)causality in the common low-frequency setup (Breitung and Swanson, 2002),
18This generic specification nests the two dominating aggregation schemes in the literature, Point-in-Time(A = 0, w0 = 1) and Average sampling (A = m− 1, wi = 1/m ∀i), where the former is usually applied to stockand the latter to flow variables. In view of the high-frequency variable we consider in our empirical application,the natural logarithm of bipower variation, we focus on Average sampling in this paper.
19We only consider the asymptotic Wald test in our simulations as it turns out to be correctly sized; hencethere is no need for the bootstrap.
29
![Page 30: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/30.jpg)
since causality is a property which is not invariant to temporal aggregation (Marcellino, 1999
or Sims, 1971).
3.3.2 The max-test
Independently and simultaneously to this work, Ghysels et al. (2016a) have developed a new
Granger non-causality testing framework, whose parsimonious structure makes it very appealing
in a situation, where m is large relative to the sample size. In short, the idea is to focus on the
first line of the MF-VAR in (4), but rather than estimating that (U)-MIDAS equation (Foroni
et al., 2015) and test for Granger non-causality in the direction from X(m) to y, the authors
propose to compute the OLS estimates of βj in the following h separate regression models:
yt = µ+
q∑k=1
αk,jyt−k + βj+1x(m)t−1−j/m + vj , j = 0, . . . , h− 1,
whereby h needs to be set “sufficiently large” (to achieve h > pm). The corresponding max-
test statistic is then the properly scaled and weighted maximum of β21 , . . . , β
2h. Although it
has a non-standard limit distribution under H0, an asymptotic p-value may be obtained in a
similar way as when overcoming the Davies (1987) problem (see Remark 8). Testing for Granger
non-causality in the reverse direction works analogously in the following regression model:
yt = µ+
q∑k=1
αk,jyt−k +h∑k=1
βk,jx(m)t−1−k/m + γjx
(m)t+j/m + vj , j = 1, . . . , l.
Again, the corresponding max-test statistic is the maximum of γ21 , . . . , γ
2l scaled and weighted
properly.20 Ghysels et al. (2016a) show that, in a mixed-frequency VAR a la Ghysels (2015), the
sequential procedure underlying the max-test leads to correct identification of the test in the
20A MIDAS polynomial (Ghysels et al., 2004) on∑hk=1 βk,jx
(m)
t−1−k/m may be imposed to increase parsimony.Note that the aforementioned Davies problem does not occur due to testing for Granger non-causality by lookingat the OLS estimates of γj .
30
![Page 31: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/31.jpg)
presence or absence of Granger causality. Hence, the max-test is consistent against causality.
3.3.3 Unrestricted VARs
Finally, we can attempt to estimate the full MF-VAR in (4) ignoring the possibility that the
amount of parameters may be too large to perform accurate estimations or adequate Granger
non-causality tests. To this end we just estimate the MF-VAR using ordinary least squares. In
this sense the comparison is related to the one of U-MIDAS (Foroni et al., 2015) and MIDAS
regression models for large m. We can test for Granger non-causality using the Wald statistic
in (14), adequately adjusting W , Ω, R and B. We also consider a bootstrap version of the un-
restricted MF-VAR, which we expect to alleviate size distortions, but which cannot solve power
issues due to the proliferation of parameters. Indeed, as shown by Davidson and MacKinnon
(2006), the power of a bootstrap test can only equal the size-corrected power of the original
test; as the parameter proliferation causes the original Wald test to be grossly oversized, the
amount of size-correcting needed implies the remaining power to be low.
4 Monte Carlo Simulations
In order to assess the finite sample performance of our different parameter reduction techniques,
we conduct a Monte Carlo experiment. In light of our empirical investigation we set m = 20,
i.e., as in a month/ working day-example.21 Furthermore, we start by investigating the case
where p = 1 and keep the analysis of higher lag orders for future research.
As far as investigating the size of our Granger non-causality tests is concerned, we assume
21See Section 5 for a justification of the time-invariance of m within this setup. The merits of our methodsshould hold for larger values of m; for smaller values, however, we may assume our approaches to perform worse.Indeed, the smaller m becomes, the less is the need to reduce the amount of parameters via (possibly misspecified)reduction techniques.
31
![Page 32: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/32.jpg)
that the data are generated as a mixed-frequency white noise process, i.e.,
Zt = ut. (27)
Remark 15 Additionally, we have considered three alternative DGPs for size, all based on the
restricted VAR in (25): (i) γ1,i = γ∗i,1 = 0 (’diagonal’), (ii) γ1,i = 0 and γ∗i,1 6= 0 (’only Granger
non-causality from X(m) to y’) and (iii) γ∗i,1 = 0 and γ1,i 6= 0 (’only Granger non-causality
from y to X(m)’) ∀i = 2, . . . , 21, where the parameters γ1,i and γ∗i,1 refer to (29). However,
with the respect to the respective testing direction, the outcomes do not differ qualitatively and
quantitative differences are very small. Results are available upon request.
To analyze power we generate two different DGPs that are closely related to the restricted
VAR in (25):
Zt = ΓPZt−1 + ut (28)
with
ΓP =
γ1,1 γ1,2 γ1,3 . . . γ1,21
γ∗2,1 ρ20 0 . . . 0
γ∗3,1 ρ19... . . . 0
......
.... . .
...
γ∗21,1 ρ 0 . . . 0
, (29)
where γ1,1 = 0.5, ρ = 0.6 and γ1,j = βw∗j−1(−0.01) for j > 1. Note that wi(ψ) = exp(ψi2)∑20i=1 exp(ψi
2)
corresponds to the two-dimensional exponential Almon lag polynomial with the first parameter
set to zero (Ghysels et al., 2007). Now, for the first power DGP we simply set w∗i (ψ) = wi(ψ),
whereas in the second power DGP w∗i (ψ) = wi(ψ)− w(ψ) with the horizontal bar symbolizing
the arithmetic mean. As far as γ∗j,1 is concerned we take γ∗j,1 = γj,1 (first power DGP) and
γ∗j,1 = γj,1 − γ·,1 (second power DGP) for j = 2, . . . , 21, whereby γj,1 = (γj−1,1)1.11 for j =
32
![Page 33: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/33.jpg)
3, . . . , 21.22 As an example, Figure 1 plots βw∗j−1 and γ∗j,1 for β = 2 and γ2,1 = 0.25.
[INSERT FIGURE 1 HERE]
Remark 16 Due to the zero-mean feature of w∗j−1(ψ) and γ∗j,1, j = 2, . . . , 21, in the second
power DGP, we expect the presence of Granger causality to be “hidden” when Average sampling
the high-frequency variable (see Section 3.3.1). It is thus a situation, in which classical temporal
aggregation is expected not to preserve the causality patterns in the data (Marcellino, 1999). The
first power DGP serves as a benchmark in the sense that we do not a priori expect one approach
to be better or worse than the others.
For the methods in Section 3 we analyze size and power for T = 50, 250 and 500, corre-
sponding to roughly 4, 21 and 42 years of monthly data. Note that an additional 100 monthly
observations are used to initialize the process. As far as the error term is concerned, we as-
sume uti.i.d.∼ N(0(21×1),Σu), where Σu has the same structure as the covariance matrix of the
restricted VAR in (25) with σLL = 0.5 and σHH = 1.23
For CCA and PLS we consider r = 1, 2 and leave the analysis of higher factor dimensions
for further research. Recall from Remark 7 that under H0 the model is not misspecified with
either amount of factors. Under HA, however, the model is misspecified for r = 1, but not
for r = 2. In that sense, we implicitly analyze the impact of misspecification, stemming from
choosing r too small, on the behavior of the respective test statistics. As far as the Bayesian
approach is concerned, we experimented with various values for the hyperparameter and found
λ = 0.175 to be a good choice. Furthermore, we approximate ρ by running a corresponding
univariate regression. For the max-test we take q = 1 and h = l = m = 20.
22The parameter values have been chosen to mimic part of the structure of the restricted VAR in (25) and toensure stability of the system. In the first DGP late xt−1-observations have a larger impact on yt, and positivevalues of yt−1 increase x towards the end of the current period, but hardly have an impact at the beginning. Inthe second power DGP a zero-mean feature is imposed on the coefficients w∗j−1(ψ) and γ∗j,1 without changingthe general pattern of their evolvement.
23These values resemble the data in the empirical section, where σLL = 0.55 and σHH = 1.09.
33
![Page 34: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/34.jpg)
We consider different versions of our size and power DGPs by varying the values of σHL
and β. To be more precise, we choose σHL = 0,−0.05,−0.1 when analyzing size properties
of our tests in both directions.24 A value of zero implies the absence of nowcasting causality,
whereas the other two values imply some degree of correlation between the low- and high-
frequency series. For both power DGPs we fix σHL = −0.05 though. When investigating
Granger causality testing from X(m) to y in the first power DGP we consider β = 0.5, 0.8, 2;
in the second power DGP we only take β = 0.5. For causality in the reverse direction we take
β = 0.5 and γ2,1 = 0.2 in both power DGPs, because the results do not seem sensitive in this
respect.
The figures in the tables below represent the percentage amount of rejections at α = 5%.25
All figures are based on 2,500 replications and are computed using GAUSS12. For the bootstrap
versions we use B = 499.
4.1 Size
Tables 1 and 2 contain the size results for Granger non-causality tests within the following
approaches: the low-frequeny VAR (LF), the unrestricted VAR (UNR), reduced rank restric-
tions using canonical correlations analysis (CCA) or partial least squares (PLS), and with three
imposed factors using the HAR model (HAR), the max-test (max) and the Bayesian mixed-
frequency VAR (BMF). With respect to CCA and PLS, the outcomes for r = 2 turn out to
be qualitatively very similar to the r = 1 case, showing that the methods are apparently not
affected by the presence of misspecification. Consequently, we only show the outcomes for r = 1
to save space. Furthermore, aside from the results of the standard, i.e., asymptotic, Wald tests,
we only display the outcomes for the best performing bootstrap variant, which are denoted with
24σHL turns out to be equal to −0.22 in our empirical analysis, yet we decrease its value here so as to achievepositive-definiteness of Σu. We checked the robustness of our outcomes to σHL being positive (we chose thecorresponding values 0.05 and 0.1) as this situation is more common in economic applications. However, theresults remain broadly unchanged and are available upon request.
25Outcomes for α = 10% and α = 1% are available upon request.
34
![Page 35: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/35.jpg)
a superscript ‘∗’ (e.g., CCA∗ is the bootstrap CCA-based Wald test). In the case of testing
causality in the direction from the high- to the low-frequency series this turns out to be the
bootstrap that imposes the null hypothesis, whereas the unrestricted variant dominates for the
reverse direction.26 For both methods we set the lag length within the bootstrap equal to p = 1.
When testing causality from y to X(m) size distortions may occur due to computing a joint
test on mp parameters from m different equations in the system. In order to address this issue,
we take the maximum of the Wald statistics computed equation by equation. We denote these
tests by adding a subscript ’b’, e.g., CCAb. For the tests with asymptotic critical values we
apply a Bonferroni correction to control the size under multiple testing (Dunn, 1961). In similar
spirit as in White (2000), the bootstrap implementations of these tests automatically provide
an implicit Bonferroni-type correction for multiple testing, as we calculate the corresponding
maximum of the single-equation Wald tests within each bootstrap iteration and compare that
to the maximum of the original tests. We will refer to all these tests as ‘Bonferroni-type’ tests
to avoid confusion with the max-test, and present the corresponding outcomes in Table 3.
[INSERT TABLE 1 HERE]
Let us start by analyzing the testing behavior from X(m) to y. First, LF has size close to
the nominal one, which is not surprising given that the flat aggregation scheme is correct in this
white noise DGP (no matter the value of σHL). The unrestricted VAR, however, incurs some
size distortions for small T due to parameter proliferation. Reduced rank restrictions based
on CCA and PLS yield considerable size distortions, whereby the imposed HAR-type factor
structure delivers good results (despite being slightly oversized for T = 50). Finally, max
and BMF show almost perfect size, whereby the former is marginally under- and the latter
marginally oversized.
26The unrestricted bootstrap is seriously oversized when testing causality from X(m) to y, while for the reversedirection the bootstrap under the null is mildly oversized for small samples. Hence, we recommend the use of therestricted bootstrap for testing causality from X(m) to y, but using the unrestricted bootstrap for the oppositedirection.
35
![Page 36: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/36.jpg)
Turning to the outcomes of the bootstrap versions we observe that, independent of the value
of σHL, even for T = 50 the actual size of the Wald tests within the unrestricted VAR and
reduced rank restriction models with CCA- or HAR-type factors is very close to the nominal one,
such that the bootstrap tests clearly dominate their asymptotic counterparts. For PLS-based
factors this conclusion only holds in the absence of nowcasting causality, and size distortions
arise quickly as we increase the contemporaneous correlation between X(m) and y.
[INSERT TABLE 2 HERE]
[INSERT TABLE 3 HERE]
Many of the aforementioned statements carry over to testing causality in the reverse direc-
tion: LF delivering nearly perfect size (as it is the correct model under the null), max performing
very well (being only marginally oversized for small T ) and BMF being a bit oversized (by a
slightly larger degree than in Table 1).27 The size distortions incurred by UNR, CCA, PLS
and HAR are, however, amplified. The use of the Bonferroni-corrected UNRb, CCAb, PLSb and
HARb can only mitigate this effect, yet not fully eradicate it. The bootstrap version does elim-
inate the size distortions, though: actual size of UNR∗, PLS∗ and HAR∗ becomes oftentimes
very close to 5%; only CCA∗ remains a bit oversized for T = 50. The Bonferroni-type tests
UNR∗b , CCA∗b , PLS∗b and HAR∗b all have size very close to the nominal one for all sample sizes.
To sum up, the asymptotic Granger non-causality tests with size close to the nominal one are
LF, max and BMF to some degree. As far as the bootstrap tests are concerned, UNR∗, CCA∗,
PLS∗ (with at most ”mild” nowcasting causality) and HAR∗ when testing the direction from
X(m) to y, and UNR∗, CCA∗, PLS∗ and HAR∗ as well as their Bonferroni-type counterparts
when testing the reverse direction deliver good size outcomes. Consequently, we focus on these
cases when analyzing power (and when dealing with real data in Section 5).
27Strangely, BMF seems to be rejected less and less for growing σHL, with the effect being strongest for smallT .
36
![Page 37: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/37.jpg)
4.2 Power
Tables 4 and 5 contain the corresponding outcomes under the alternative hypothesis. Recall
that for the direction from X(m) to y we consider three different values of β within the first
power DGP, β = 0.5, 0.8 and 2. For the second power DGP we fix β = 0.5. For the reverse
direction we always keep γ2,1 = 0.2.
[INSERT TABLE 4 HERE]
Let us again start with the direction from the high- to the low-frequency variable and first
focus on the outcomes for the first power DGP, i.e., the top three blocks of Table 4. Observe (i)
how power reaches one asymptotically for all approaches, and (ii) how larger values of β imply
an increase in the rejection frequencies for a given T . Naturally, for a large enough β all tests
have power equal to one, irrespective of the sample size (as nearly happens for β = 2). So, in
order to compare our different tests let us focus on β = 0.5.
The asymptotic Wald test within the low-frequency VAR still performs very well, with the
highest rejection frequency for T = 50. Note, however, that the Granger causality feature
in the data does not get averaged out by temporal aggregation in the first power DGP. The
bootstrapped version of the Wald test after HAR-type restrictions have been imposed in a
reduced rank regression perform almost as well as LF. max, BMF and PLS∗ appear to fall a
bit short, but catch up quickly as either T or β grows. A similar observation holds for UNR∗
and CCA∗, whereby they seem to be markedly less powerful for small T .
These conclusions generally carry over to the second power DGP, with two exceptions: first
and foremost, the outcomes of LF show that Average sampling of X(m) annihilates the causality
feature between the series for all sample sizes (power actually almost coincides with the nominal
size of 5%). Second, the rejection frequencies are lower than they were in the corresponding
setting of the first power DGP.
[INSERT TABLE 5 HERE]
37
![Page 38: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/38.jpg)
Focusing on the direction from y to X(m), it suffices to analyze the results of the first power
DGP; the aforementioned statements regarding the second power DGP carry over to the reverse
direction of causality being tested. It turns out that both, max and BMF, have higher power
than the bootstrapped Wald tests corresponding to the reduced rank restrictions approaches.
Recall, though, that BMF and max were somewhat oversized, especially for small T , which
may inflate their power. Also, so far we did not consider the Bonferroni-type tests. Given the
way in which max is constructed it seems more natural to compare it to the Bonferroni-type
counterparts of the bootstrap tests. Indeed, both approaches rely on a statistic that is computed
as the maximum of a set of test statistics.28 We find that for a given T almost all approaches are
more powerful than max. The fact that UNR∗b , PLS∗b and HAR∗b are all slightly undersized for
small T , while max is marginally oversized, even strengthens these conclusions. Also note that
the bootstrap tests based on the reduced rank restrictions seem to be slightly more powerful
than UNR∗b . Hence, while the bootstrap is able to correct size of the unrestricted MF-VAR
approach, the proliferation of parameters continues to have a negative effect on power.
Combining the power and size outcomes, it seems that BMF, HAR∗, max (for large enough
β when T is small) and PLS∗ (in the absence of nowcasting causality) are the dominant Granger
non-causality testing approaches as far as the direction from X(m) to y is concerned. For the
reverse direction of causality CCA∗b , PLS∗b and HAR∗b as well as UNR∗b and max (though both to
a lesser degree) appear superior. Despite the competitiveness of our proposed tests, asymptotic
or bootstrapped, in the situation at hand, one should keep in mind that the underlying param-
eter reduction techniques are ultimately ad hoc. Hence, in contrast to the max-test and testing
within the unrestricted VAR, which do not involve possibly misspecified dimension reduction
approaches, our tests can ultimately not have an asymptotic power of one. As such, the good
performance of our proposed tests may be specific to our simulation setup and should not be
28There is a difference in terms of the underlying regression to obtain the various test statistics, of course.Ghysels et al. (2016a) consider univariate MIDAS regressions with leads, whereas the Bonferroni-type tests arebased on estimated coefficients from different equations, i.e., the ones for X(m), of the corresponding system.
38
![Page 39: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/39.jpg)
generalized to all possible causality patterns for which our imposed restrictions may not hold.
5 Application
We apply the approaches described in Section 3 to a MF-VAR consisting of the monthly growth
rate of the U.S. industrial production index (ipi hereafter), a measure of business cycle fluctua-
tions, and the logarithm of daily bipower variation (bv hereafter) of the S&P500 stock index, a
realized volatility measure more robust to jumps than the realized variance. While the degree
to which macroeconomic variables can help to predict volatility movements has been investi-
gated widely in the literature (see Schwert, 1989b, Hamilton and Gang, 1996, or Engle and
Rangel, 2008, among others), the reverse, i.e., whether the future path of the economy can be
predicted using return volatility, has been granted comparably little attention (examples are
Schwert, 1989a, Mele, 2007, and Andreou et al., 2000). Instead of using an aggregate mea-
sure of volatility such as, e.g., monthly realized volatilities (Chauvet et al., 2015) or monthly
GARCH estimated variances, we use daily bipower variation computed on 5-minute returns ob-
tained from the database in Heber et al. (2009). With the bv-series being available at a higher
frequency than most indicators of business cycle fluctuations, we obtain the mixed-frequency
framework analyzed in this paper.
The sample covers the period from January 2000 to June 2012 yielding a sample size of
T = 150. We take m = 20 as it is the maximum amount of working days that is available
in every month throughout the sample we deal with.29 Consequently, we have BV(20)t =
(bv(20)t , bv
(20)t−1/20, . . . , bv
(20)t−19/20)′. Figure 2 plots the data.
[INSERT FIGURE 2 HERE]
29Whenever a month contains more than 20 working days we disregard the corresponding amount of days atthe beginning of the month. In June 2012, e.g., there are 21 working days such that we do not consider June1. For May 2012 we disregard the first three days. An alternative (balanced) strategy would have been to takethe maximum number of days in a particular month (i.e., 23, usually in July, August or October) and to createadditional values for non-existing days in other months whenever necessary. As far as the treatment of dailydata is concerned we have also taken bv
(20)t = bv
(20)
t−1/20 when there are no quotations for bv(20)t .
39
![Page 40: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/40.jpg)
Table 6 contains the outcomes of Granger non-causality tests for all approaches discussed in
Section 3. As mentioned before, we disregard the cases that showed considerable size distortions
in the Monte Carlo analysis. Note that a lag length of p = 1 and 2 is considered (the same
for the bootstrap) and that the numbers represent p-values (in percentages). For reduced rank
restrictions using CCA and PLS we consider one up to three factors, whereby we only show
the outcomes for r = 1 and r = 2 for representational ease; r = 3 gives similar results. For
the non-bootstrap tests, except the max-test and the Wald-test within the Bayesian mixed-
frequency VAR,30 we also consider a heteroscedasticity consistent variant of (14) by computing
a robust estimator of Ω (see Ravikumar et al., 2000) to account for the potential presence of a
time varying multivariate process:
ΩR = T ((W ′W )−1 ⊗ Im+1)S0((W ′W )−1 ⊗ Im+1), (30)
where
S0 =1
T
T∑t=1
(Wt ⊗ ut)′(Wt ⊗ ut). (31)
For the bootstrap tests we achieve the robustness to heteroskedasticity by implementing the
wild bootstrap version as described in Section 3.1.4.
[INSERT TABLE 6 HERE]
If we consider the test statistics that perform best in our Monte Carlo experiment, the out-
comes above clearly point towards Granger-causality from BV (20) to ipi. This result supports
Andreou et al. (2000) concluding that ”[...] volatilities may also be useful [...] indicators for
both the growth and volatility of industrial production” (p. 15). The situation is less clear
cut when looking at Granger-causality from the low-frequency variable to the high-frequency
30Note that, as discussed in Section 3.2.2, there is a one-to-one correspondence between the OLS estimator ofthe augmented model and the prior setting we consider in our Bayesian framework. As we did not discuss a priorspecification that corresponds to the robust version of the Wald test we disregard from that test variant here.
40
![Page 41: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/41.jpg)
volatility measures. Indeed, max and most Bonferroni-type tests seem to reject the null of no
Granger-causality, whereas the joint tests do not. However, the higher power obtained on the
maximum of the individual Wald-type tests would favor the presence of Granger-causality in
the direction from business cycle movements to financial uncertainties as well. A more careful
investigation, out of the scope of this paper, could be done on different subsamples in order
to analyze whether, e.g., periods before, during or after the financial crises lead to similar
conclusions.
6 Conclusion
We investigate Granger non-causality testing in a mixed-frequency VAR, where the mismatch
between the sampling frequencies of the variables under consideration is large, causing estima-
tion and inference to be potentially problematic. To avoid this issue we discuss two parameter
reduction techniques in detail, reduced rank restrictions and a Bayesian MF-VAR approach,
and compare them to (i) a common low-frequency VAR, (ii) the max-test approach and (iii) the
unrestricted VAR in terms of their Granger non-causality testing behavior. To further improve
their finite sample test properties we also consider two bootstrap variants for the reduced rank
regression approaches (and the unrestricted VAR).
For both directions of causality we find a different set of tests that result in the best Granger
non-causality testing behavior. For the direction from the high- to the low-frequency series,
standard testing within the Bayesian mixed-frequency VAR, the max-test of Ghysels et al.
(2016a), and the restricted bootstrap version of the Wald test in a reduced rank regression
after HAR-type factors have been imposed or PLS-type factors have been computed, the latter
of which being restricted to nowcasting non-causality (Gotz and Hecq, 2014), perform best.
For the reverse direction, the unrestricted bootstrap variants of the Bonferroni-corrected Wald
tests within the following models dominate: the unrestricted VAR and reduced rank regressions
41
![Page 42: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/42.jpg)
with CCA-, PLS- or HAR-based factors.
An application investigating the presence of a causal link between business cycle fluctuations
and uncertainty in financial markets illustrates the practical usefulness of these approaches.
While Granger causality from uncertainty in financial markets to business cycle fluctuations
was clearly supported by the data, evidence for causality in the reverse direction only comes
from a subset of the tests, yet the more powerful ones according to our simulation results.
Acknowledgements
We want to particularly thank Eric Ghysels for many fruitful discussions, comments and sug-
gestions on earlier versions of the paper. Moreover, we thank two anonymous referees, one
of them deserving special gratitude for providing us with MATLAB code for the max-test.
We also thank Daniela Osterrieder, Lenard Lieb, Jorg Breitung, Roman Liesenfeld, Jan-Oliver
Menz, Klemens Hauzenberger, Martin Mandler and participants of the Workshop on Advances
in Quantitative Economics in Maastricht 2013, the CEIS Seminar in Rome 2013, the 33rd Inter-
national Symposium on Forecasting in Seoul 2013, the 24th (EC)2 conference in Cyprus 2013,
the 22th SNDE Annual Symposium in New York 2014, the 8th ECB Workshop on Forecasting
Techniques in Frankfurt 2014, the Graduate School Research Seminar in Cologne 2014, the 68th
European Meeting of the Econometric Society in Toulouse 2014, the 8th International Confer-
ence on Computational and Financial Econometrics in Pisa 2014 and the Norges Bank Research
Seminar in Oslo 2015 for useful suggestions and comments on earlier versions of the paper. The
third author would like to thank the Netherlands Organisation for Scientific Research (NWO)
for financial support.
42
![Page 43: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/43.jpg)
A Bayesian VAR Estimation with MF Prior Variances
Properly accounting for the high-frequency time difference between the variables involved im-
plies the following prior variances:
V ar[γ(k)i,j ] =
φ λ2
(k+(j−2)/m)2SLH for i = 1, j > 1
φ λ2
(k+(2−i)/m)2SHL for j = 1, i > 1
λ2
(k+(j−i)/m)2else
, (32)
Now, define Z = (Zµ1 , . . . , ZµT )′, where Zµt = (Z ′t, 1)′, and let us re-write the MF-VAR in (4)
in the following way:
vec(Z)︸ ︷︷ ︸(m+1)T×1
= Z∗︸︷︷︸(m+1)T×(m+1)n
vec(B∗)︸ ︷︷ ︸(m+1)n×1
+ vec(E)︸ ︷︷ ︸(m+1)T×1
, (33)
where Z = (Z1, . . . , ZT )′, Z∗ = Im+1⊗Z, E = (u1, . . . , uT )′, B∗ = (B′, µ)′ and n = (m+1)p+1.
In order to drop the undesirable feature of a fixed and diagonal covariance matrix Σ, we impose
a normal inverted Wishart prior (at the cost of having to set φ = 1) with the following form
(Kadiyala and Karlsson, 1997):
vec(B∗)|Σ ∼ N(vec(B∗0), [Z∗′0 (Σ−1 ⊗ IT )Z∗0 ]−1) and Σ ∼ iW (V0, v0), (34)
where Z∗0 = Im+1⊗Z0 with Z0 being a (T ×n)-matrix. B∗0 and Z0 have to be chosen so as to let
expectations and variances of the elements in B∗ coincide with the moments in (32). Likewise,
V0 and v0 need to be set such that E[Σ] = Σd.
Similar to Banbura et al. (2010), we can show that adding auxiliary dummy variables Yd and
Xd, the precise composition of which is given in Appendix B, to (33) is equivalent to imposing
43
![Page 44: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/44.jpg)
the normal inverted Wishart prior in (34). To this end, let
B∗aux0 = (X ′dXd)−1X ′dYd,
Zaux0 = Xd
V0 = (Yd −XdB∗aux0 )′(Yd −XdB
∗aux0 ) and
v0 = m+ 3 = Td − naux + 2,
(35)
where naux = m(2p+ 1) and Td = naux + (m+ 1), and subsequently set
vec(B∗0) = S′vec(B∗aux0 ) and
V ar[vec(B∗)] = S′[Z∗aux′0 (Σ−1 ⊗ ITd)Z∗aux0 ]−1S,(36)
with Z∗aux0 = Im+1 ⊗ Zaux0 and where S is an [(m + 1)naux × (m + 1)n]-dimensional selection
matrix, the precise construction of which is given in Appendix C.
Remark 17 Intuitively speaking, Zaux0 is an auxiliary matrix constructed as the “union” of the
Z0 matrices corresponding to the different columns of B∗. The non-random matrix S selects,
for each column of B∗, the corresponding elements of Zaux0 in order to let the variance of each
element in B∗ match the corresponding prior variance. Likewise, B∗aux0 is an auxiliary matrix
from which we derive B∗0 .
In order to augment the model in (33), let us define Zdj = (Z ′, (XdSj)′)′, where Sj is the
jth block of the selection matrix S (see Appendix C for details), j = 1, . . . ,m+ 1.31 Then, the
augmented system becomes
vec(Z∗)︸ ︷︷ ︸(m+1)(T+Td)×1
= Z∗∗︸︷︷︸(m+1)(T+Td)×(m+1)n
vec(B∗)︸ ︷︷ ︸(m+1)n×1
+ vec(E∗)︸ ︷︷ ︸(m+1)(T+Td)×1
, (37)
where Z∗ = (Z ′, Y ′d)′, E∗ = (E′, E′d)′ and Z∗∗ is block diagonal with Z∗∗ = diagZd1, Zd2, . . . , Zdm+1.
31Here, Sj picks the elements of Zaux0 = Xd corresponding to column j of B∗.
44
![Page 45: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/45.jpg)
The posterior then has the form
vec(B∗)|Σ, Z ∼ N(vec(B∗), [Z∗′∗ (Σ−1 ⊗ IT+Td)Z∗∗]−1) and Σ|Z ∼ iW (V , T +m+ 3), (38)
where V = E′∗E∗ with vec(E∗) = vec(Z∗)− Z∗∗vec(B∗).32 Note that the posterior mean of the
coefficients boils down to
vec(B∗) = [Z∗′∗ (Σ−1 ⊗ IT+Td)Z∗∗]−1Z∗′∗ (Σ−1 ⊗ IT+Td)vec(Z∗), (39)
i.e., the GLS estimate of a SUR regression of vec(Z∗) on Z∗∗. As for the common-frequency
case, it can be checked that it also coincides with the posterior mean for the prior setup in (32).
An example with p = 1 and m = 2 illustrates the auxiliary dummy variables approach and is
provided in Appendix D.
B The Auxiliary Dummy Variables with MF Prior Variances
The auxiliary dummy variables that imply a matching of the prior moments turn out to be
Yd︸︷︷︸Td×(m+1)
=
02(m−1)×1
Dρ ⊗ (0, σH/λ)′σLρm/λ
0
0(m(2p−1)−1)×(m+1)
diag(σL, σH , . . . , σH)
01×(m+1)
, (40)
32In practice, we estimate Σ in the standard way, i.e., Σ = 1T+Td
Eols′∗ Eols∗ , where Eols∗ denotes the OLS
residuals of the system in (37). Again, a sample size correction alleviates potential size distortions (see Section3.1.2). The ith column of Eols∗ , i = 1, . . . ,m + 1, corresponds to the residuals of a regression of (Z′·,i, Y
′d,i)′ on
Zdi , where Z·,i and Yd,i denote the ith columns of Z and Yd, respectively.
45
![Page 46: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/46.jpg)
Xd︸︷︷︸Td×naux
=
J1p ⊗ diag(σL, σH)/λ 02pm×(m−1) 02pm×1
0(m−1)×2pm J2pσH/λ 0(m−1)×1
0(m+1)×2pm 0(m+1)×(m−1) 0(m+1)×1
01×2pm 01×(m−1) ε
, (41)
where Dρ = diaga(ρm, m−1m ρm−1, . . . , 2
mρ2, 1mρ), with diaga(·) denoting an anti-diagonal matrix.
Furthermore,
J1p = diag( 1
m ,2m , . . . , 1,
m+1m , . . . , 2, . . . , . . . , p),
J2p = diag(p+ 1
m , p+ 2m , . . . , p+ m−1
m ).(42)
The last line of both, Yd and Xd, corresponds to the diffuse prior for the intercept (ε is a very
small number), the block above imposes the prior for Σ and the remaining blocks set the priors
for the coefficients γ(k)i,j . As in Banbura et al. (2010), we set σ2
i = s2i , i = L,H, where s2
i is the
variance of a residual from an AR(p), respectively an AR(mp), model for yt, respectively for
x(m)t .
C Construction of the Selection Matrix S
Let us investigate the variance of vec(B∗)|Σ more closely. Noting that we have to choose Z0,
or Z∗0 in (34), in such a way so as to let the variances of the corresponding coefficients coincide
with the prior variances in (32), it turns out that we need to set
V ar[vec(B∗)|Σ] =
σ2LΩ
(2)0 0n×n . . . . . . 0n×n
0n×n σ2HΩ
(2)0 0n×n . . . 0n×n
... 0n×n σ2HΩ
(3)0 . . .
...
......
.... . . 0n×n
0n×n 0n×n . . . 0n×n σ2HΩ
(m+1)0
, (43)
46
![Page 47: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/47.jpg)
where
Ω(i)0 = diag( λ2
(1+(2−i)/m)2σ2L, λ2
(1+(2−i)/m)2σ2H, λ2
(1+(3−i)/m)2σ2H, . . . , λ2
(1+1+(1−i)/m)2σ2H, . . . ,
. . . , λ2
(p+(2−i)/m)2σ2L, λ2
(p+(2−i)/m)2σ2H, . . . , λ2
(p+1+(1−i)/m)2σ2H, 1ε2
)(44)
for i = 2, . . . ,m+ 1.
Hence, unlike in the common-frequency case, where Ω(i)0 = Ω0 ∀i (Banbura et al., 2010),
the set of variances changes due to the stacked nature of the vector Zt and the specific lag
structure for each coefficient; see the variances in (32). Let us form an auxiliary matrix Ωaux0
that contains the union of all elements in Ω(2)0 , . . . ,Ω
(m+1)0 . Each matrix Ω
(i)0 contains p + 1
new elements compared to Ω(i−1)0 for i > 2. As there are n elements in Ω
(2)0 , we end up with a
dimension of m(2p+ 1) = naux for the square matrix Ωaux0 :
Ωaux0 = diag( λ2
(1/m)2σ2L, λ2
(1/m)2σ2H, . . . , λ2
12σ2L, λ2
12σ2H, λ2
(1+1/m)2σ2L, λ2
(1+1/m)2σ2H, . . . , λ2
22σ2H,
. . . , . . . , λ2
p2σ2L, λ2
p2σ2H, λ2
(p+1/m)2σ2H, . . . , λ2
(p+1−1/m)2σ2H, 1ε2
).(45)
All that remains is to define a [(m + 1)naux × (m + 1)n]-dimensional selection matrix S such
that S′(Σd ⊗ Ωaux0 )S = V ar[vec(B∗)|Σ] . Note that from Ωaux
0 we can then derive Zaux0 = Xd
by using Ωaux0 = (Zaux
′0 Zaux0 )−1.
Let us denote by 1i an naux-dimensional column vector with a one in row i and zeros
elsewhere. It turns out that S is a block-diagonal matrix, i.e., S = diagS1, S2, . . . , Sm+1,
where each off-diagonal block is 0naux×n. Each Sj , j = 1, . . . ,m+1, can be described as follows:
Sj = (S1j , S
2j , 1naux) for j = 2, . . . ,m+ 1, (46)
47
![Page 48: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/48.jpg)
where
S1j = ( 12m−(2j−3), 12m−(2j−4), 12m−(2j−6), . . . , 14m−(2j−2), 14m−(2j−3), . . . ,
16m−(2j−2), . . . , . . . , 12pm−(2j−3), 12pm−(2j−4), . . . , 12pm),(47)
S2j =
∅ for j = m+ 1
(12pm+1, 12pm+2, . . . , 12pm+m−(j−1)) else(48)
and
S1 = S2. (49)
Each of the indices in Sj reveals which row element of the jth column of Baux0 gets selected by
S. Put differently, the indices that are missing in Sj correspond to elements of Baux0 (in column
j) that will not get chosen by S. This implies that the values of these elements do not play a
role for the computation of vec(B∗0). Consequently, when constructing Baux0 , and subsequently
also Yd, we assign these elements a value of zero for simplicity.
D Example: the Auxiliary Dummy Variables Approach for MF Data
Let p = 1 and m = 2. The prior beliefs in (32) corresponding to this setup, and written in
terms of vec(B∗) = vec((B′, µ)′), are given by
E(vec(B∗)) = vec
ρ2 0 0
0 ρ2 ρ
0 0 0
0 0 0
48
![Page 49: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/49.jpg)
and
V ar(vec(B∗)) = λ2
1 0 . . . 0
0 SLH 0 . . . 0
0 0 1(3/2)2
SLH 0 . . . 0
0 . . . 0σ2L
ε2λ2 0 . . . 0
0 . . . 0 SHL 0 . . . 0
0 . . . 0 1 0 . . . 0
0 . . . 0 1(3/2)2
0 . . . 0
0 . . . 0σ2H
ε2λ2 0 . . . 0
0 . . . 0 1(1/2)2
SHL 0 . . . 0
0 . . . 0 1(1/2)2
0 0
0 . . . 0 1 0
0 . . . 0σ2H
ε2λ2
.
(50)
With Td = 9 and naux = 6 the auxiliary dummy variables look as follows:
Yd =
0 0 0
0 012ρσHλ
ρ2σLλ 0 0
0 ρ2σHλ 0
0 0 0
σL 0 0
0 σH 0
0 0 σH
0 0 0
, Xd =
12σLλ 0 0 0 0 0
012σHλ 0 0 0 0
0 0 σLλ 0 0 0
0 0 0 σHλ 0 0
0 0 0 032σHλ 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 ε
.
Let us demonstrate that these auxiliary dummy variables truly imply a matching of the
prior moments in (34) and the prior beliefs given above. Simple algebra gives us Zaux0 = Xd,
49
![Page 50: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/50.jpg)
v0 = 5 as well as
B∗aux0 = (X ′dXd)−1X ′dYd =
0 0 0
0 0 ρ
ρ2 0 0
0 ρ2 0
0 0 0
0 0 0
,
V0 = (Yd −XdB∗aux0 )′(Yd −XdB
∗aux0 ) =
σ2L 0
0 σ2H 0
0 0 σ2H
.
To obtain vec(B∗0) we require the selection matrix S. Following the guidelines in Appendix C
yields S = diagS1, S2, S3, where the off-diagonal blocks are of dimension 6× 4 and
S1 = S2 =
0 0 0 0
0 0 0 0
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
, S3 =
1 0 0 0
0 1 0 0
0 0 0 0
0 0 1 0
0 0 0 0
0 0 0 1
.
Now it is easy to show that, indeed,
vec(B∗0) = S′vec(B∗aux0 ) = vec
ρ2 0 0
0 ρ2 ρ
0 0 0
0 0 0
= E(vec(B∗))
50
![Page 51: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/51.jpg)
and S′[Σ⊗ (Zaux′0 Zaux0 )−1]S = V ar(vec(B∗)) as in equation (50).
Having shown that Yd and Xd capture the prior moments, we need to augment the MF-VAR.
Let us start by re-writing it into the format presented in equation (33):
vec
y2 x
(2)2 x
(2)2−1/2
......
...
yT x(2)T x
(2)T−1/2
=
I3 ⊗
y1 x
(2)1 x
(2)1/2 1
......
......
yT−1 x(2)T−1 x
(2)T−1−1/2 1
×
vec
γ1,1 γ2,1 γ3,1
γ1,2 γ2,2 γ3,2
γ1,3 γ2,3 γ3,3
µ1 µ2 µ3
︸ ︷︷ ︸
vec(B∗)
+vec(E).
Following the steps outlined after Remark 17 gives us the augmented system:
vec
y2 x(2)2 x
(2)2−1/2
......
...
yT x(2)T x
(2)T−1/2
0 0 0
0 012ρσHλ
ρ2σLλ 0 0
0 ρ2σHλ 0
0 0 0
σL 0 0
0 σH 0
0 0 σH
0 0 0
= Z∗∗vec(B∗) + vec(E∗),
51
![Page 52: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/52.jpg)
Z∗∗ =
y1 x(2)1 x
(2)1/2
1
0(T+9)×4 0(T+9)×4
......
......
yT−1 x(2)T−1 x
(2)T−1−1/2
1
0 0 0 0
0 0 0 0
σLλ
0 0 0
0 σHλ
0 0
0 032σHλ
0
03×4
0 0 0 ε
0(T+9)×4
y1 x(2)1 x
(2)1/2
1
0(T+9)×4
......
......
yT−1 x(2)T−1 x
(2)T−1−1/2
1
0 0 0 0
0 0 0 0
σLλ
0 0 0
0 σHλ
0 0
0 032σHλ
0
03×4
0 0 0 ε
0(T+9)×4 0(T+9)×4
y1 x(2)1 x
(2)1/2
1
......
......
yT−1 x(2)T−1 x
(2)T−1−1/2
112σLλ
0 0 0
012σHλ
0 0
0 0 0 0
0 0 σHλ
0
0 0 0 0
03×4
0 0 0 ε
.
The GLS estimate of vec(B∗) in the regression above is then a closed-form solution for the
posterior mean of the coefficients.
52
![Page 53: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/53.jpg)
References
Anderson, T. W., 1951, Estimating linear restrictions on regression coefficients for multivariate
normal distributions. The Annals of Mathematical Statistics 22, 327–351.
Andreou, E., D. R. Osborn and M. Sensier, 2000, A comparison of the statistical properties of
financial variables in the USA, UK and Germany over the business cycle. Manchester School
68, 396–418.
Andrews, D. W. K., 1994, The large sample correspondence between classical hypothesis tests
and Bayesian posterior odds tests. Econometrica 62, 1207–1232.
Banbura, M., D. Giannone and L. Reichlin, 2010, Large Bayesian vector autoregressions.
Journal of Applied Econometrics 25, 71–92.
Bauwens, L., M. Lubrano and J.-F. Richard, 2000, Bayesian inference in dynamic econometric
models. Oxford University Press.
Bose, A., 1988, Edgeworth correction by bootstrap in autoregressions. Annals of Statistics 16,
1709–1722.
Breitung, J. and N. R. Swanson, 2002, Temporal aggregation and spurious instantaneous
causality in multiple time series models. Journal of Time Series Analysis 23, 651–666.
Bruggemann, R., C. Jentsch and C. Trenkler, 2016, Inference in VARs with conditional het-
eroskedasticity of unknown form. Journal of Econometrics 191, 69–85.
Carriero, A., G. Kapetanios and M. Marcellino, 2011, Forecasting large datasets with bayesian
reduced rank multivariate models. Journal of Applied Econometrics 26, 735–761.
Cavaliere, G., A. Rahbek and A. M. R. Taylor, 2012, Bootstrap determination of the co-
integration rank in vector autoregressive models. Econometrica 80, 1721–1740.
53
![Page 54: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/54.jpg)
Chauvet, M., Z. Senyuz and E. Yoldas, 2015, What does financial volatility tell us about
macroeconomic fluctuations? Journal of Economic Dynamics and Control 52, 340–360.
Clements, M. and A. B. Galvao, 2008, Macroeconomic forecasting with mixed-frequency data.
Journal of Business & Economic Statistics 26, 546–554.
Clements, M. and A. B. Galvao, 2009, Forecasting U.S. output growth using leading indicators:
an appraisal using MIDAS models. Journal of Applied Econometrics 24, 1187–1206.
Corsi, F., 2009, A simple approximate long-memory model of realized volatility. Journal of
Financial Econometrics 7, 174–196.
Cox, D. R., L. Bondesson, G. Gudmundsson, E. Harsaae, K. Juselius, P. Laake, S. L. Lau-
ritzen and G. Lindgren, 1981, Statistical analysis of time series: some recent developments.
Scandinavian Journal of Statistics 8, 93–115.
Cubadda, G. and B. Guardabascio, 2012, A medium-N approach to macroeconomic forecasting.
Economic Modelling 29, 1099–1105.
Cubadda, G. and A. Hecq, 2011, Testing for common autocorrelation in data rich environments.
Journal of Forecasting 30, 325–335.
Davidson, R. and J. G. MacKinnon, 2006, The power of bootstrap and asymptotic tests. Journal
of Econometrics 133, 421–441.
Davies, R. B., 1987, Hypothesis testing when a nuisance parameter is present only under the
alternatives. Biometrika 74, 33–43.
Dufour, J.-M., D. Pelletier and E. Renault, 2006, Short run and long run causality in time
series: inference. Journal of Econometrics 132, 337–362.
Dufour, J.-M. and E. Renault, 1998, Short run and long run causality in time series: theory.
Econometrica 66, 1099–1126.
54
![Page 55: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/55.jpg)
Dunn, O. J., 1961, Multiple comparisons among means. Journal of the American Statistical
Association 56, 52–64.
Engle, R. F. and J. G. Rangel, 2008, The spline-garch model for low-frequency volatility and
its global macroeconomic causes. Review of Financial Studies 21, 1187–1222.
Eraker, B., C. W. J. Chiu, A. T. Foerster, T. B. Kim and H. D. Seoane, 2015, Bayesian mixed
frequency VARs. Journal of Financial Econometrics 13, 698–721.
Foroni, C., E. Ghysels and M. Marcellino, 2013, Mixed-frequency vector autoregressive models,
in: T. B. Fomby, L. Kilian and A. Murphy, (Eds.), Advances in Econometrics, Vol. 32.
Emerald Group Publishing Limited, pp. 247–272.
Foroni, C. and M. Marcellino, 2014, Mixed frequency structural models: identification, estima-
tion, and policy analysis. Journal of Applied Econometrics 29, 1118–1144.
Foroni, C., M. Marcellino and C. Schumacher, 2015, Unrestricted mixed data sampling (MI-
DAS): MIDAS regressions with unrestricted lag polynomials. Journal of the Royal Statistical
Society Series A 178, 57–82.
Forsberg, L. and E. Ghysels, 2007, Why do absolute returns predict volatility so well? Journal
of Financial Econometrics 5, 31–67.
Ghysels, E., 2015, Macroeconomics and the reality of mixed frequency data. Working Paper,
Dept. of Economics, University of North Carolina at Chapel Hill.
Ghysels, E., J. B. Hill and K. Motegi, 2016a, Simple Granger causality tests for mixed frequency
data. Working paper, Dept. of Economics, University of North Carolina at Chapel Hill.
Ghysels, E., J. B. Hill and K. Motegi, 2016b, Testing for Granger causality with mixed frequency
data. Journal of Econometrics 192, 207–230.
55
![Page 56: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/56.jpg)
Ghysels, E. and J. I. Miller, 2015, Testing for cointegration with temporally aggregated and
mixed-frequency time series. Journal of Time Series Analysis 36, 797–816.
Ghysels, E., P. Santa-Clara and R. Valkanov, 2004, The midas touch: Mixed data sampling
regression models. CIRANO Working Papers 2004s-20, CIRANO.
Ghysels, E., P. Santa-Clara and R. Valkanov, 2006, Predicting volatility: getting the most out
of return data sampled at different frequencies. Journal of Econometrics 131, 59–95.
Ghysels, E., A. Sinko and R. Valkanov, 2007, Midas regressions: further results and new
directions. Econometric Reviews 26, 53–90.
Ghysels, E. and R. Valkanov, 2012, Forecasting volatility with MIDAS, in: L. Bauwens, C.
Hafner and S. Laurent, (Eds.), Handbook of Volatility Models and Their Applications, Chap-
ter 16. John Wiley & Sons, Inc., Hoboken, NJ, USA, pp. 383–401.
Goncalves, S. and L. Kilian, 2004, Bootstrapping autoregressions with conditional heteroskedas-
ticity of unknown form. Journal of Econometrics 123, 89–120.
Goncalves, S. and B. Perron, 2014, Bootstrapping factor-augmented regression models. Journal
of Econometrics 182, 156–173.
Gotz, T. B. and A. Hecq, 2014, Nowcasting causality in mixed frequency vector autoregressive
models. Economics Letters 122, 74–78.
Gotz, T. B., A. Hecq and L. Lieb, 2015, Real-time mixed-frequency vars: nowcasting, back-
casting and granger causality. Working Paper.
Gotz, T. B., A. Hecq and J.-P. Urbain, 2013, Testing for common cycles in non-stationary VARs
with varied frecquency data, in: T. B. Fomby, L. Kilian and A. Murphy, (Eds.), Advances in
Econometrics, Vol. 32. Emerald Group Publishing Limited, pp. 361–393.
56
![Page 57: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/57.jpg)
Gotz, T. B., A. Hecq and J.-P. Urbain, 2014, Forecasting mixed frequency time series with
ECM-MIDAS models. Journal of Forecasting 33, 198–213.
Granger, C. W. J., 1969, Investigating causal relations by econometric models and cross-spectral
methods. Econometrica 37, 424–438.
Granger, C. W. J., 1988, Some recent development in a concept of causality. Journal of
Econometrics 39, 199–211.
Granger, C. W. J. and J.-L. Lin, 1995, Causality in the long run. Econometric Theory 11,
530–536.
Hamilton, J. D. and L. Gang, 1996, Stock market volatility and the business cycle. Journal of
Applied Econometrics 11, 573–93.
Hansen, B. E., 1996, Inference when a nuisance parameter is not identified under the null
hypothesis. Econometrica 64, 413–30.
Heber, G., A. Lunde, N. Shephard and K. Sheppard, 2009, Oxford-man institute’s realized
library (version 0.2). Oxford-Man Institute, University of Oxford.
Hoerl, A. E. and R. W. Kennard, 1970, Ridge regression: biased estimation for nonorthogonal
problems. Technometrics 12, 55–67.
Horowitz, J. L., 2001, The bootstrap, in: J. J. Heckman and E. E. Leamer, (Eds.), Handbook
of Econometrics, Vol. 5, Chapter 52. North Holland Publishing, Amsterdam, pp.3159–3228.
Kadiyala, K. R. and S. Karlsson, 1997, Numerical methods for estimation and inference in
Bayesian VAR-models. Journal of Applied Econometrics 12, 99–132.
Kilian, L., 1998, Small-sample confidence intervals for impulse response functions. Review of
Economics and Statistics 80, 218–230.
57
![Page 58: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/58.jpg)
Kuzin, V., M. Marcellino, and C. Schumacher, 2011, MIDAS vs. mixed-frequency VAR: now-
casting GDP in the euro area. International Journal of Forecasting 27, 529–542.
Litterman, R. B., 1986, Forecasting with Bayesian vector autoregressions - Five years of expe-
rience. Journal of Business & Economic Statistics 4, 25–38.
Lutkepohl, H., 1993, Testing for causation between two variables in higher dimensional VAR
models, in: H. Schneewei, Hans und K. F. Zimmermann, (Eds.), Studies in Applied Econo-
metrics. Springer-Verlag, Heidelberg, pp. 75–91.
Marcellino, M., 1999, Some consequences of temporal aggregation in empirical analysis. Journal
of Business & Economic Statistics 17, 129–136.
Marcellino, M. and C. Schumacher, 2010, Factor MIDAS for Nowcasting and Forecasting with
Ragged-Edge Data: a Model Comparison for German GDP. Oxford Bulletin of Economics
and Statistics 72, 518–550.
McCracken, M. W., M. T. Owyang and T. Sekhposyan, 2015, Real-Time forecasting with a
large, mixed frequency, Bayesian VAR. Working Paper 2015-30, Federal Reserve Bank of St.
Louis.
Mele, A., 2007, Asymmetric stock market volatility and the cyclical behavior of expected
returns. Journal of Financial Economics 86, 446–478.
Miller, J. I., 2011, Conditionally efficient estimation of long-run relationships using mixed-
frequency time series. Working Paper 1103, Department of Economics, University of Missouri.
Miller, J. I., 2014, Mixed-frequency cointegrating regressions with parsimonious distributed lag
structures. Journal of Financial Econometrics 12, 584–614.
Paparoditis, E., 1996, Bootstrapping autoregressive and moving average parameter estimates of
infinite order vector autoregressive processes. Journal of Multivariate Analysis 57, 277–296.
58
![Page 59: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/59.jpg)
Paparoditis, E. and D. N. Politis, 2005, Bootstrap hypothesis testing in regression models.
Statistics & Probability Letters 74, 356–365.
Park, T. and G. Casella, 2008, The Bayesian Lasso. Journal of American Statistical Association
103, 681–686.
Ravikumar, B., S. Ray and N. E. Savin, 2000, Robust Wald tests in SUR systems with adding-
up restrictions. Econometrica 68, 715–720.
Schorfheide, F. and D. Song, 2015, Real-time forecasting with a mixed-frequency VAR. Journal
of Business & Economics Statistics 33, 366–380.
Schwert, G. W., 1989a, Business cycles, financial crises, and stock volatility. Carnegie-Rochester
Conference Series on Public Policy 31, 83–125.
Schwert, G. W., 1989b, Why does stock market volatility change over time? Journal of Finance
44, 1115–53.
Silvestrini, A. and D. Veredas, 2008, Temporal aggregation of univariate and multivariate time
series models: a survey. Journal of Economic Surveys 22, 458–497.
Sims, C. A., 1971, Discrete approximations to continuous time distributed lags in econometrics.
Econometrica 39, 545–563.
Sims, C. A. and T. Zha, 1998, Bayesian methods for dynamic multivariate models. International
Economic Review 39, 949–68.
Tibshirani, R., 1996, Regression shrinkage and selection via the Lasso. Journal of the Royal
Statistical Society (Series B) 58, 267–288.
Vahid, F. and Engle, R. F., 1993, Common trends and common cycles. Journal of Applied
Econometrics 8, 341–360.
59
![Page 60: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/60.jpg)
Van Giersbergen, N. P. A. and J. F. Kiviet, 1996, Bootstrapping a stable AD model: weak vs
strong exogeneity. Oxford Bulletin of Economics and Statistics 58, 631–656.
Vogel, C. R., 2002, Computational methods for inverse problems. Society for Industrial and
Applied Mathematics, Philadelphia.
Waggoner, D. F. and T. Zha, 2003, A Gibbs sampler for structural vector autoregressions.
Journal of Economic Dynamics and Control 28, 349–366.
White, H., 2000, A reality check for data snooping. Econometrica 68, 1097–1126.
Zellner, A., 1996, An introduction to bayesian inference in econometrics. John Wiley and Sons,
Inc.
60
![Page 61: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/61.jpg)
E Tables
61
![Page 62: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/62.jpg)
Tab
le1:
Siz
eof
Gra
nge
rN
on-C
ausa
lity
Tes
tsfr
omX
(m)
toy
X(m
)toy
TL
FU
NR
CC
AP
LS
HA
Rmax
BM
FU
NR∗
CC
A∗
PL
S∗
HA
R∗
σHL
=0
505.4
13.4
18.8
11.4
6.9
4.3
5.7
5.0
4.6
4.9
5.2
250
4.7
6.4
28.6
12.4
5.4
4.4
5.2
4.9
4.7
5.3
5.3
500
4.9
5.6
29.2
12.8
5.2
4.1
5.3
5.1
5.2
6.2
5.3
σHL
=−
0.0
550
5.3
12.6
20.9
16.7
6.6
4.6
5.6
4.7
5.6
8.2
5.0
250
4.8
5.8
28.3
17.1
5.5
4.4
5.2
4.7
4.8
9.2
5.4
500
4.8
6.2
30.0
16.9
4.4
4.2
5.3
5.6
5.2
8.5
4.5
σHL
=−
0.1
505.0
12.5
20.0
32.6
6.3
4.6
5.0
4.8
4.8
19.6
4.8
250
4.6
7.2
28.8
35.7
5.5
4.8
5.1
5.8
5.8
23.2
5.2
500
4.7
5.4
28.9
36.5
5.3
4.5
5.2
4.8
6.3
23.0
5.2
Note
:F
or
test
ing
Gra
nger
non-c
ausa
lity
fromX
(m)
toy
the
figure
sre
pre
sent
per
centa
ges
of
reje
ctio
ns
of
the
asy
mpto
tic
Wald
test
stati
stic
sand
the
rest
rict
edb
oots
trap
vari
ants
(indic
ate
dw
ith
asu
per
scri
pt
’*’)
of
the
follow
ing
appro
ach
es:
The
low
-fre
quen
yV
AR
(LF
),th
eunre
stri
cted
appro
ach
(UN
R),
reduce
dra
nk
rest
rict
ions
wit
hone
fact
or
usi
ng
canonic
al
corr
elati
ons
analy
sis
(CC
A)
or
part
ial
least
square
s(P
LS),
and
wit
hth
ree
imp
ose
dfa
ctors
usi
ng
the
HA
Rm
odel
(HA
R),
themax
-tes
t(max
)and
the
Bay
esia
nm
ixed
-fre
quen
cyappro
ach
(BM
F).
The
lag
length
of
the
esti
mate
dV
AR
sis
equal
toone.
The
under
lyin
gD
GP
isfo
und
in(2
7),
wher
eth
eva
riance
-cov
ari
ance
matr
ixof
the
erro
rte
rmis
equal
toΣu
wit
hσHL
=0,−
0.0
5,−
0.1
.
62
![Page 63: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/63.jpg)
Tab
le2:
Siz
eof
Gra
nge
rN
on-C
ausa
lity
Tes
tsfr
omy
toX
(m)
ytoX
(m)
TL
FU
NR
CC
AP
LS
HA
Rmax
BM
FU
NR∗
CC
A∗
PL
S∗
HA
R∗
σHL
=0
505.7
91.1
82.0
58.6
59.8
6.1
7.9
4.6
8.8
5.0
5.2
250
5.5
11.5
11.8
10.6
10.2
4.6
10.4
5.2
5.3
5.0
5.3
500
5.2
6.7
7.4
6.8
6.6
5.9
7.8
4.6
4.8
4.6
4.8
σHL
=−
0.0
550
5.6
92.4
83.8
62.0
60.6
5.7
4.8
5.6
9.4
4.9
5.6
250
5.4
10.9
13.2
11.9
10.3
4.8
8.9
4.7
4.5
4.4
4.8
500
5.0
7.4
9.2
8.7
7.3
5.5
7.3
5.4
5.4
5.3
5.2
σHL
=−
0.1
505.7
92.1
87.1
64.2
58.5
5.7
0.5
5.5
9.2
4.0
4.4
250
4.7
11.0
20.5
14.8
10.5
5.0
5.4
5.2
6.3
5.1
5.3
500
5.3
7.4
13.8
10.7
7.2
5.5
5.0
5.3
5.7
4.2
5.2
Note
:F
or
test
ing
Gra
nger
non-c
ausa
lity
fromy
toX
(m)
the
figure
sre
pre
sent
per
centa
ges
of
reje
ctio
ns
of
the
asy
mpto
tic
Wald
test
stati
stic
sand
the
unre
stri
cted
boots
trap
vari
ants
(indic
ate
dw
ith
asu
per
scri
pt
’*’)
of
the
follow
ing
appro
ach
es:
The
low
-fre
quen
cyV
AR
(LF
),th
eunre
stri
cted
appro
ach
(UN
R),
reduce
dra
nk
rest
rict
ions
wit
hone
fact
or
usi
ng
canonic
al
corr
elati
ons
analy
sis
(CC
A)
or
part
ial
least
square
s(P
LS),
and
wit
hth
ree
imp
ose
dfa
ctors
usi
ng
the
HA
Rm
odel
(HA
R),
themax
-tes
t(max
)and
the
Bay
esia
nm
ixed
-fre
quen
cyappro
ach
(BM
F).
The
lag
length
of
the
esti
mate
dV
AR
sis
equal
toone.
The
under
lyin
gD
GP
isfo
und
in(2
7),
wher
eth
eva
riance
-cov
ari
ance
matr
ixof
the
erro
rte
rmis
equal
toΣu
wit
hσHL
=0,−
0.0
5,−
0.1
.
63
![Page 64: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/64.jpg)
Tab
le3:
Siz
eof
Gra
nge
rN
on-C
ausa
lity
Tes
tsfr
omy
toX
(m)
(Bon
ferr
oni)
ytoX
(m)
TU
NRb
CC
Ab
PL
Sb
HA
Rb
BM
Fb
UN
R∗ b
CC
A∗ b
PL
S∗ b
HA
R∗ b
σHL
=0
5016
.917
.316
.413
.92.
93.
44.
63.
93.
725
012
.713
.613
.112
.75.
14.
65.
05.
04.
950
012
.412
.412
.112
.24.
45.
05.
04.
64.
9
σHL
=−
0.0
550
17.9
18.9
18.3
15.2
2.1
3.8
4.8
4.0
4.0
250
13.4
14.5
14.4
12.8
4.8
5.4
5.2
5.4
5.4
500
12.3
13.7
14.8
12.5
4.0
5.0
5.2
5.0
4.6
σHL
=−
0.1
5017
.220
.221
.615
.40.
43.
35.
04.
34.
425
012
.015
.417
.012
.23.
34.
95.
34.
84.
750
011
.914
.917
.011
.73.
35.
45.
25.
15.
2
Note
:F
or
test
ing
Gra
nger
non-c
ausa
lity
fromy
toX
(m)
the
figure
sre
pre
sent
per
centa
ges
of
reje
ctio
ns
of
the
Bonfe
rroni-
typ
ete
sts
(indic
ate
dw
ith
asu
bsc
ript
’b’)
for
both
the
asy
mpto
tic
Wald
test
stati
stic
sand
the
unre
stri
cted
boots
trap
vari
ants
(indic
ate
dw
ith
asu
per
scri
pt
’*’)
of
the
follow
ing
appro
ach
es:
The
unre
stri
cted
appro
ach
(UN
R),
reduce
dra
nk
rest
rict
ions
wit
hone
fact
or
usi
ng
canonic
al
corr
elati
ons
analy
sis
(CC
A)
or
part
ial
least
square
s(P
LS),
and
wit
hth
ree
imp
ose
dfa
ctors
usi
ng
the
HA
Rm
odel
(HA
R)
and
the
Bay
esia
nm
ixed
-fre
quen
cyappro
ach
(BM
F).
For
the
low
-fre
quen
cyV
AR
(LF
)and
themax
-tes
t(max
)th
eB
onfe
rroni-
typ
ete
stis
not
applica
ble
.T
he
lag
length
of
the
esti
mate
dV
AR
sis
equal
toone.
The
under
lyin
gD
GP
isfo
und
in(2
7),
wher
eth
eva
riance
-cov
ari
ance
matr
ixof
the
erro
rte
rmis
equal
toΣu
wit
hσHL
=0,−
0.0
5,−
0.1
.
64
![Page 65: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/65.jpg)
Tab
le4:
Pow
erof
Gra
nge
rN
on-C
ausa
lity
Tes
tsfr
omX
(m)
toy
X(m
)toy
TL
Fmax
BM
FU
NR∗
CC
A∗
PL
S∗
HA
R∗
Pow
er-D
GP
1,
β=
0.5
5064
.139
.652
.823
.112
.754
.062
.625
099
.910
0.0
99.8
99.8
96.0
98.7
100.
050
010
0.0
100.
010
0.0
100.
010
0.0
100.
010
0.0
Pow
er-D
GP
1,
β=
0.8
5093
.679
.492
.254
.626
.677
.993
.025
010
0.0
100.
010
0.0
100.
010
0.0
100.
010
0.0
500
100.
010
0.0
100.
010
0.0
100.
010
0.0
100.
0
Pow
er-D
GP
1,
β=
2
5010
0.0
100.
010
0.0
100.
085
.398
.410
0.0
250
100.
010
0.0
100.
010
0.0
100.
010
0.0
100.
050
010
0.0
100.
010
0.0
100.
010
0.0
100.
010
0.0
Pow
er-D
GP
2,
β=
0.5
505.
814
.521
.99.
78.
025
.122
.325
05.
075
.979
.271
.258
.879
.789
.450
04.
798
.699
.098
.387
.697
.399
.8
Note
:F
or
test
ing
Gra
nger
non-c
ausa
lity
fromX
(m)
toy
the
figure
sre
pre
sent
per
centa
ges
of
reje
ctio
ns
of
the
asy
mpto
tic
Wald
test
stati
stic
sof
the
low
-fre
quen
yV
AR
(LF
),th
emax
-tes
t(max
)and
the
Bay
esia
nm
ixed
-fre
quen
cyappro
ach
(BM
F).
Furt
her
more
,it
conta
ins
per
centa
ges
of
reje
ctio
ns
of
the
rest
rict
edb
oots
trap
vari
ants
(indic
ate
dw
ith
asu
per
scri
pt
’*’)
of
the
unre
stri
cted
appro
ach
(UN
R),
reduce
dra
nk
rest
rict
ions
wit
hone
fact
or
usi
ng
canonic
al
corr
elati
ons
analy
sis
(CC
A)
or
part
ial
least
square
s(P
LS),
and
wit
hth
ree
imp
ose
dfa
ctors
usi
ng
the
HA
Rm
odel
(HA
R).
The
lag
length
of
the
esti
mate
dV
AR
sis
equal
toone.
The
under
lyin
gD
GP
isfo
und
in(2
8)
or
(29),
wher
eth
eva
riance
-cov
ari
ance
matr
ixof
the
erro
rte
rmis
equal
toΣu
wit
hσHL
=−
0.0
5.
Inp
ower
-DG
P1
the
Gra
nger
causa
lity
det
erm
inin
gco
effici
ents
do
not
sum
up
toze
ro,
wher
eas
they
do
soin
pow
er-D
GP
2.
See
Sec
tion
4fo
rdet
ails.β
=0.5,0.8,2
inp
ower
-DG
P1
andβ
=0.5
inp
ower
-DG
P2.
65
![Page 66: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/66.jpg)
Tab
le5:
Pow
erof
Gra
nge
rN
on-C
ausa
lity
Tes
tsfr
omy
toX
(m)
ytoX
(m)
TL
Fmax
BM
FU
NR∗
CC
A∗
PL
S∗
HA
R∗
UN
R∗ b
CC
A∗ b
PL
S∗ b
HA
R∗ b
Pow
er-D
GP
150
6.9
8.0
15.8
5.6
9.9
6.2
6.2
6.8
11.4
10.7
10.8
250
19.9
24.2
31.0
17.7
20.0
20.3
19.3
51.0
54.3
54.7
53.8
500
36.4
53.2
49.1
39.3
44.7
43.9
42.4
83.1
85.2
85.2
84.6
Pow
er-D
GP
250
4.7
7.9
18.4
5.7
10.6
5.7
5.3
5.9
10.0
8.9
8.0
250
5.0
22.9
25.1
13.5
14.6
14.3
14.3
32.2
34.7
34.2
32.6
500
5.2
49.9
33.7
26.6
29.9
28.4
27.7
57.3
62.2
62.2
59.3
Note
:F
or
test
ing
Gra
nger
non-c
ausa
lity
fromy
toX
(m)
the
figure
sre
pre
sent
per
centa
ges
of
reje
ctio
ns
of
the
asy
mpto
tic
Wald
test
stati
stic
sof
the
low
-fre
quen
yV
AR
(LF
),th
emax
-tes
t(max
)and
the
Bay
esia
nm
ixed
-fre
quen
cyappro
ach
(BM
F).
Furt
her
more
,it
conta
ins
per
centa
ges
of
reje
ctio
ns
of
the
unre
stri
cted
boots
trap
vari
ants
(indic
ate
dw
ith
asu
per
scri
pt
’*’)
of
the
unre
stri
cted
appro
ach
(UN
R),
reduce
dra
nk
rest
rict
ions
wit
hone
fact
or
usi
ng
canonic
al
corr
elati
ons
analy
sis
(CC
A)
or
part
ial
least
square
s(P
LS),
and
wit
hth
ree
imp
ose
dfa
ctors
usi
ng
the
HA
Rm
odel
(HA
R).
For
the
latt
erse
tof
appro
ach
es,
itals
osh
ows
the
outc
om
esfo
rth
eB
onfe
rroni-
typ
ete
sts
(indic
ate
dw
ith
asu
bsc
ript
’b’)
.T
he
lag
length
of
the
esti
mate
dV
AR
sis
equal
toone.
The
under
lyin
gD
GP
isfo
und
in(2
8)
or
(29),
wher
eth
eva
riance
-cov
ari
ance
matr
ixof
the
erro
rte
rmis
equal
toΣu
wit
hσHL
=−
0.0
5.
Inp
ower
-DG
P1
the
Gra
nger
causa
lity
det
erm
inin
gco
effici
ents
do
not
sum
up
toze
ro,
wher
eas
they
do
soin
pow
er-D
GP
2.
See
Sec
tion
4fo
rdet
ails.β
=0.5
andγ2,1
=0.2
inb
oth
pow
er-D
GP
s.
66
![Page 67: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/67.jpg)
Table 6: Testing for Granger Causality between BV (20) and ipi
BV (20) to ipip = 1 p = 2
Wald Robust Wald Robust
LF 0.0 0.2 0.1 2.1max 0.1 ./. 0.0 ./.BMF 14.4 ./. 21.8 ./.
UNR∗ 15.0 17.4 7.3 8.7CCA1∗ 2.9 7.9 12.4 20.4CCA2∗ 14.5 16.2 19.2 25.1PLS1∗ 0.1 0.7 0.2 1.9PLS2∗ 0.6 2.8 0.1 1.4HAR∗ 1.2 2.2 1.5 2.4
ipi to BV (20) p = 1 p = 2Wald Robust Wald Robust
LF 0.1 1.5 0.4 5.0max 4.4 ./. 1.1 ./.BMF 23.5 ./. 42.5 ./.
UNR∗ 28.9 34.2 25.8 26.4CCA1∗ 25.8 33.5 20.5 38.4CCA2∗ 35.9 46.1 17.6 34.3PLS1∗ 15.8 24.6 11.4 24.4PLS2∗ 18.3 26.9 10.3 18.0HAR∗ 19.2 27.0 11.4 18.6
UNR∗b 5.4 8.9 20.0 22.9CCA1∗b 4.1 10.5 7.8 16.9CCA2∗b 5.0 10.8 7.7 15.1PLS1∗b 1.5 4.3 1.5 9.5PLS2∗b 1.2 5.2 1.1 4.0HAR∗b 2.3 6.5 6.2 14.1
Note: For testing Granger non-causality between BV (20) and ipi the figures represent p-values (in percentages)of the asymptotic Wald test statistics, the restricted bootstrap variants for the direction from BV (20) to ipi andthe unrestricted bootstrap variants for the direction from ipi to BV (20) (bootstrap variants are indicated with asuperscript ’*’, the Bonferroni-type tests with a subscript ’b’). For all approaches but the max-test (max) andthe Bayesian MF-VAR (BMF) a robustified version is computed as well. The lag length of the estimated VARsis equal to one or two. For CCA and PLS the results for one and two factors are displayed.
67
![Page 68: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/68.jpg)
F Figures
Figure 1: Parameter values for 2w∗j−1(−0.01) and γ∗j,1 (γ2,1 = 0.25)
68
![Page 69: Researchersresearchers-sbe.unimaas.nl/alainhecq/wp-content/uploads/... · 2016. 3. 8. · Testing for Granger Causality in Large Mixed-Frequency VARs Thomas B. G otz∗1, Alain Hecq†](https://reader036.vdocument.in/reader036/viewer/2022081614/5fc0c57c422d4366f85c8de2/html5/thumbnails/69.jpg)
Figure 2: ipi and BV (20)
Note: This figure shows the monthly growth rate of the U.S. industrial production index (lower line), i.e., ipi,
and the logarithm of daily bipower variation of the S&P500 stock index (top lines), i.e., BV (20), for the time
period from January 2000 to June 2012. The graph for the former is shifted downwards to enable a visual
separation of ipi from the bv-lines.
69