estimating var-mgarch models in multiple...

Estimating VAR-MGARCH Models inMultiple Steps

M. Angeles Carnero� and M. Hakan Eratalayy

Dpt. Fundamentos del Análisis EconómicoUniversidad de Alicante

PRELIMINARY VERSION

June 12, 2009

Abstract

This paper analyzes the performance of multiple steps estimators of Vector Autoregres-sive Multivariate Conditional Correlation GARCH models by means of Monte Carloexperiments. High dimensionality is one of the main problems in the estimation of theparameters of these models and therefore maximizing the likelihood function of thefull model leads commonly to numerical problems. We show that if innovations aregaussian, estimating the parameters in multiple steps is a reasonable alternative to thefull maximization of the likelihood. Our results also suggest that for the sample sizesusually encountered in �nancial econometrics, the di¤erences between the volatilityestimates obtained with the more e¢ cient estimator and the multiple steps estima-tors are negligible. However, this does not seem to be the case if the distribution isa Student-t for some Conditional Correlation GARCH models. Then, multiple stepsestimators lead to volatility estimates which could be far from the ones implied by thetrue parameters. The results obtained in this paper are used for estimating volatilityinteractions in 10 European stock markets.

JEL Classi�cation: C32

Keywords: Volatility Spillovers, Financial Markets.

�E-mail: [email protected] Author. E-mail: [email protected]

1

1 Introduction

Understanding how stock market returns and volatilities move over time has been ofgreat concern to researchers into the time series literature. In addition, as the recent�nancial crisis has shown, it is also very important to realize that stock markets movetogether. There are many examples con�rming this incidence: a very big investmentbank in US, Lehman Brothers, declares bankruptcy and this increases the losses in thestock market of some other country. Barack Obama�s $787 billion economic stimu-lus package has been passed by the US Congress and there were positive reactions tothis in local and global stock markets. Therefore, trying to model stock markets in aunivariate way ignoring this kind of interaction would be insu¢ cient. In this sense, Mul-tivariate generalized autoregressive conditional heteroskedasticity (MGARCH) modelshave been popular tools in the literature to model volatility-covolatility of assets andmarkets; see, for example, Bauwens et al. (2006) and Silvennoinen et al. (2008) for asurvey.The problem with many MGARCHmodels is that it is often di¢ cult to verify if the

conditional variance-covariance matrix is positive de�nite. Engle et al. (1984) providenecessary conditions for the positive de�niteness of the variance-covariance matrix ina bivariate ARCH setting, but extensions of these results to more general models arevery complicated. Moreover, imposing restrictions on the log-likelihood function duringoptimization to obtain such conditions is often di¢ cult.A model that could avoid these problems is the Constant Conditional Correlation

GARCH (CCC-GARCH) model proposed by Bollerslev (1990). He shows that thegaussian maximum likelihood (ML) estimator of the correlation matrix is the samplecorrelation matrix which is always positive de�nite, leading to the result that the op-timization will not fail as long as the conditional variances are positive. On the otherhand, he notes that the correlation matrix can be concentrated out of the log-likelihoodfunction, what simpli�es more the optimization problem. Due to this computationalsimplicity, the CCC-GARCH model has become very popular in the literature. How-ever, the CCC-GARCHmodel has several limitations. One of them is that it fails to ex-plain possible volatility interactions as pointed out by Ling and McAleer (2003). Theyfavor the extension proposed by Jeantheau (1998), ECCC-GARCH, where volatil-ity spillovers can be considered. Another limitation of the CCC-GARCH model is theconstant correlation assumption; see Christodoulakis and Satchell (2002), Engle (2002)and Tse and Tsui (2002) who propose the Dynamic Conditional Correlation GARCH(DCC-GARCH) model where the correlation changes over time.The DCC-GARCH model is computationally more complicated than the CCC-

GARCH since the correlation dynamics require the estimation of more parameters.Fortunately, the variance targeting approach, in which the intercept in the correlationequation is replaced by its sample counterpart, makes the estimation managable.

2

Aielli (2006) points our severe problems with the variance targeting approach usedwhen estimating the DCC-GARCH model in multiple steps. He suggests a correctionto the DCC-GARCH model, which is referred to as Corrected DCC-GARCH (cDCC-GARCH) model. However, as also pointed out in Caporin and McAleer (2009), thecDCC-GARCH model does not allow for variance targeting. Therefore the model ofAielli (2006) su¤ers from the curse of dimensionality since as the number of time seriesconsidered increases, the number of correlation parameters to be estimated increasesat a geometric rate.Pelletier (2006) introduces the Regime Switching Dynamic Correlation GARCH

(RSDC-GARCH) model in which correlations between series are allowed to be constantover time but changing between di¤erent regimes and driven by an unobserved markovswitching chain. This model is, somehow, in between the CCC-GARCH model andDCC-GARCH model, with the problem that the number of correlation parameters tobe estimated increases rapidly with the number of series considered.When dealing with stock market returns, it is not unusual to �nd some dynamics in

the conditional mean, that could be approximated by a VARMA model. One way toestimate the parameters of the full VARMA-MGARCH conditional correlation modelwould be solving the optimization problem of a full gaussian log-likelihood function;therefore obtaining the estimates for all the parameters in one step. If the true datagenerating process (DGP) is also gaussian, then it is known that ML estimators areconsistent and asymptotically normal. In the case that the true DGP is not gaussian,then we would be using quasi-maximum likelihood (QML) estimators. Bollerslev andWooldridge (1992) show that under quite general conditions, these QML estimators areconsistent and asymptotically normal. Bollerslev (1990), Longin and Solnik (1995) andNakatani and Teräsvirta (2008) are few of the papers using one-step estimation. Esti-mating all parameters in one step would be the best one could achieve, however whenthere are many equations and parameters involved, it is very heavy computationally,when feasible. As Longin and Solnik (1995) state, sometimes even the convergence inthe optimization process is only possible for very speci�c starting values for the para-meters. In addition, if some of the coe¢ cients are not strongly signi�cant, it is hardto achieve iterative convergence with a small sample.Under the normality assumption for the DGP, the parameters could also be esti-

mated in two steps by estimating �rst the mean and variance parameters assuming nocorrelation and then, in a second step, estimating the correlation parameters given theestimates from the �rst step; see, for example, Engle (2002). However, as Engle andSheppard (2001) suggest for the DCC-GARCH model, these two-step estimators willbe consistent and asymptotically normal but not fully e¢ cient since they are limitedinformation estimators.The three-steps estimation method is mentioned in Bauwens (2006) and consists

of estimating the mean equation parameters in a �rst step, obtaining the variance

3

equation parameters in a second step and �nally in the third and last step, given allother parameter estimates, estimating the correlation parameters. If the parameters ofthe �rst step are consistently estimated, then the estimators from the second step willbe consistent and therefore the third step estimators will also be consistent. The secondand third steps of the three-steps estimation procedure will be equivalent to the two-steps estimation method for a zero-mean series. Therefore the three-steps estimatorsare also asymptotically normal. Engle and Sheppard (2001) use such a three-stepsprocedure in the empirical part of their paper.We also consider estimating the models using the log-likelihood function based on

Student-t distribution. If the true distribution of the errors is also a t-distribution,then the estimators are ML estimators, if not then they will be QML estimators. Inone-step estimation, all the parameters of the model are estimated maximizing the log-likelihood function based on the multivariate t-distribution; see for example Harveyet al:(1992) and Fiorentini et al: (2003). Although it would not have any theoreticalsupport, we also tried two-step and three-step estimation methods for this case. We�nd that some, although not all, of the parameter estimates obtained in multiple stepscan be very far from their true values.For the MC studies where we obtained unbiased estimators, we checked the robust-

ness of the results to the distribution assumption. For example, since the two-step andthree-step estimators for CCC-GARCH model simulated and estimated with gaussianerrors were unbiased, we checked the robustness of this result by simulating the modelwith t-distribution and estimating with gaussian distribution. Our results show that,the estimators that are unbiased under the correctly speci�ed error distribution arealso unbiased when the true distribution was changed.Although the two-step estimation takes less computational time compared to one-

step estimation, it is still costly compared to three step estimation method, speci�callywhen dealing with big samples, many equations and/or parameters. These three esti-mation methods will be explained later in detail in section 2 of the paper.In this paper, we consider the e¢ ciency loss that Engle (2002) and Engle and

Sheppard (2001) talk about when estimating the correlation parameters separatelyfrom the mean and variance parameters. This type of e¢ ciency loss will be analyzedby comparing the one-step and two-step estimation methods. As we shall see, whenthe errors are assumed to be gaussian, the small sample behavior of one-step and two-steps estimators are very similar. On the other hand, when the estimation is basedon t-distribution, there are cases where two-steps estimators deviate from one-stepestimators.Additionally, we analyze the e¤ects of separating the estimation of mean and vari-

ance parameters in small samples. Bauwens (2006) points out that the conditionalmean parameters may be estimated consistently in the �rst stage for example for aVector Autoregressive Moving Average (VARMA) model, prior to estimating the con-

4

ditional variance parameters. He notes that estimating all the parameters in one stagewouldn�t bring about gains in e¢ ciency in large samples, since the variance-covariancematrix is asymptotically block diagonal between the mean and variance parameters. Inaddition, since one-step estimation is computationally more di¢ cult, what is generallydone is taking a very simple model for the conditional mean of the series or �ttingthe demeaned series, obtained after a mean regression, to estimate the variance para-meters; see for example Engle and Sheppard (2001). We analyze the e¤ects of suchan estimation procedure in small samples by comparing the two-step and three-stepmethods.In this paper we conduct Monte Carlo (MC) studies to �nd out how much e¢ ciency

is lost in small samples when estimating the above-mentioned conditional correlationmodels if the estimation of the parameters are done in two or three steps instead of onestep. Assuming gaussian errors, we �rst analyze the e¤ects of estimation in multiplesteps of a CCC-GARCH model by considering di¤erent values for the parameters tosimulate two time series. To see that the e¤ects are also robust for the number of theseries or parameters considered, we consider also three time series and a study witha di¤erent mean structure. Later, we study the other conditional correlation models,ECCC, DCC, cDCC, RSDC-GARCH. As reported in the results of the MC studies,when the estimations are based on gaussian distribution the di¤erence between the one-step and multiple steps estimators in small samples is negligible. When we change theassumption on the distribution of errors to t-distribution, for some models we concludethat the di¤erences between the estimators could be relevant.Up to our knowledge, this is the �rst work comparing these multiple steps estimators

and analyzing the e¢ ciency lost when employing them instead of one-step estimationprocedure in small samples. Finally, we use the three-steps procedure to estimate thevolatility spillovers in ten major European stock markets.The rest of the paper is structured as follows: in section 2 we present the models that

are to our interest. Di¤erent estimators of the parameters of these models are discussedin section 3. In section 4 we introduce the basis of the MC experiments we conductedand discuss the results we obtained. In section 5, we use our results to estimate aVAR-ECCC-GARCH model with volatility spillovers using the data obtained from 10major stock markets in Europe. Finally in section 6 concludes the paper.

2 Econometric Models

For simplicity we consider a k-variate vector autoregressive (VAR) model of order onefor the mean equation with the following notation:

5

Yt = �+ �Yt�1 + "t

(1)

where V ar("t jYt�1; :::Y1) = Ht, Yt is a kx1 vector of returns, � is a kx1 vector ofconstants, � is a kxk matrix of autoregressive coe¢ cients and "t is a kx1 vector of errorterms as follows:

Yt =�y1t y2t : : ykt

�0, � =

��1 �2 : : �k

�0

� =

2666664�11 �12 : : �1k�21 : :

: : :

: : :

�k1 : : : �kk

3777775, "t =�"1t "2t : : "kt

�0

The stationarity of this VAR(1) model is assured if all the values of z that solvethe following equation are outside of the unit circle: jIk � �zj = 0The coe¢ cient matrix � could be chosen as diagonal which wouldn�t allow for the

return spillovers. In this case there will be 2k mean parameters to estimate. Otherwisethis number will be k(k + 1):The conditional variance Ht is modelled in the following way:

"t = H1=2t �t, where �t is a kx1 vector with E(�t) = 0 and V ar(�t) = Ik

Ht = DtRtDt, where Dt = diag(h1=21t ; h

1=22t ; :::; h

1=2kt ) and Rt is the kxk conditional

correlation matrix such that:

Ht = diag(h1=21t ; h

1=22t ; :::; h

1=2kt )

26666641 �12t : : �1kt�12t 1 :

: : :

: : :

�1kt : : : 1

3777775 diag(h1=21t ; h1=22t ; :::; h1=2kt )

=

2666664h1t �12t

ph1th2t : : �1kt

ph1thkt

�12tph1th2t h2t :

: : :

: : :

�1ktph1thkt : : : hkt

3777775which ensures that as long as the conditional correlation matrix is positive de�nite

and conditional variances are positive, the conditional variance-covariance will be pos-itive de�nite as well. For parsimony, the variances of the error terms of each series aremodelled as GARCH(1,1):

6

~ht = W + A~"2t +G~ht�1

(2)

where ~ht = (h1t; h2t; :::; hkt)0 and ~"2t = ("

21t; "

22t; :::; "

2kt)

0are kx1 vector of conditionalvariances and squared errors respectively and W is a kx1 and A andG are kxk matricesof coe¢ cients. A and G matrices could be restricted to be diagonal, which wouldn�tallow for volatility spillovers, as in Bollerslev (1990) or Engle (2002). On the other hand,it could be non-diagonal as in Jeantheau (1998) or Ling and McAleer (2003) allowingfor volatility spillovers. In the former case there will be 3k variance parameters toestimate, while in the latter this number will be k(2k + 1).If we denote by !i = [W ]i, �ij = [A]i;j and ij = [G]i;j, for the variances to be always

positive, Jeantheau (1998) requires the following trivial set of su¢ cient conditions:

!i > 0, �ij > 0, ij > 0 for all i, j.

Nakatani and Teräsvirta (2008) provide necessary and su¢ cient conditions wherethey allow for negative volatility spillovers. For an empirical work, see Conrad andKaranasos (2008). For the covariance stationarity, the roots of the following determi-nant has to be outside of unit circle is required:

jIk � (A+G)zj = 0

which in the diagonal case implies that the persistence term "�+ " should be:

�ii + ii < 1 for all i.

We consider four di¤erent models for the conditional correlations. The �rst andsimplest is the Constant Conditional Correlation (CCC) GARCH model of Bollerslev(1990) where he restricts the correlations to be constant over time. He shows in hispaper that, under this restriction, gaussian QML estimator of the correlation matrix,R, is equal to the sample correlation of the standardized residuals such that:

[R]ij = b�ij = Pt vitvjtq

(P

t v2it)(P

t v2jt)

(3)

where vt = D�1t "t are the standardized errors. Therefore the number of correlation

parameters to be estimated is only k(k � 1)=2.

7

The second model we consider is the Dynamic Conditional Correlation (DCC)GARCH of Engle (2002) given by:

Rt = PtQtPtPt = diag(Q

�1=2t )

Qt = (1� �1 � �2) �Q+ �1vt�1v0t�1 + �2Qt�1,

where Qt is generally referred to as non-standardized correlations and �Q is the longrun correlation matrix that could be replaced by the correlation matrix of the standard-ized errors which is a consistent estimator of �Q. This latter process is called "variancetargeting" and it makes the estimation easier, reducing the number of parameters toestimate from k(k � 1)=2 + 2 to only 2. vt is the vector of standardized errors de�nedas before. �1 and �2 are non-negative scalars which satisfy �1 + �2 < 1, to have thecorrelation matrix, Rt; positive de�nite. As one can notice, these scalar coe¢ cients im-pose the same dynamics on di¤erent correlations. Hafner and Franses (2003) providea more general de�nition of the model where they consider coe¢ cient matrices insteadof scalar coe¢ cients allowing for di¤erent dynamics on di¤erent correlations, howeverthis increases the number of parameters considerably. For simplicity, we will focus onthe set up with the scalar coe¢ cients.Engle and Sheppard (2001) and later Engle, Sheppard, Shepherd (2007) point out

that when using large systems of equations, i.e. k is large, estimating the correlationparameters of a DCC-GARCH model separately, as in two-steps or three-steps, bringsabout signi�cant biases in the estimates of the dynamic parameters �1 and �2. Aielli(2006) shows that two-step or three-step estimation of DCC-GARCH models with vari-ance targeting is inconsistent since the correlation matrix of the standardized residualsis not a consistent estimator of the long run correlation matrix. He suggests a correctedversion of the DCC-GARCH model, in this paper referred to as cDCC-GARCH, whichreplaces the Qt equation of DCC-GARCH model:

Qt = �+ �1Q�t�1(vt�1v

0t�1 �Rt�1)Q

�t�1 + (�1 + �2)Qt�1

where Q�t = diag(Q1=2t ) = inv(Pt).

Although Aielli (2006) suggested this model to resolve the inconsistency problemin estimating the unconditional correlation matrix with sample correlation matrix, hismodel does not allow variance targeting. Therefore, his model su¤ers from the curse ofdimensionality since the number of parameters to be estimated, including the interceptmatrix, is k(k � 1)=2 + 2 which is O(k2). On the other hand, it is worth mentioningthat Engle, Sheppard and Shepherd (2007) employs Aielli�s model and they use variancetargeting, where they estimate the unconditional correlation matrix through a single-factor model.

8

The last model of conditional correlations we will consider is the Regime Switch-ing Dynamic Correlations GARCH, (RSDC-GARCH) model of Pelletier (2006). Inhis model, the conditional correlations follow a switching regime driven by an unob-served Markov chain such that they are �xed in each regime but may change acrossregimes. For simplicity we assume a two-state Markov process such that Rt, at anytime t, could be equal to either RL or RH , which stand for low and high state cor-relation matrices respectively. The matrix of transition probabilities are given by� = ff�L;L; �H;Lg; f�L;H ; �H;Hgg, where �i;j is the probability of passing from statej to state i. Given that �j;j + �i;j = 1, in this model the number of parameters toestimate is equal to 2k(k � 1) + 2:Next section discusses how the parameters of these models are estimated. The

estimation of RSDC-GARCH models will be discussed separately.

3 Estimation Procedures

For notation purposes, we denote all the mean parameters by � = (�0; vec(�)0)0, allthe variance parameters by � = (W 0; vec(A0); vec(G0))0, and all the correlation para-meters by which changes according to the model considered above. For example,for a CCC-GARCH model = vech(R), while for cDCC-GARCH it becomes: =(vech()0; �1; �2)

0:1

3.1 CCC, DCC and cDCC GARCH models

3.1.1 One-step Estimation

In the one-step estimation procedure, all parameters of the model, � = (�0; �0; 0)0 ,are estimated in one step simultaneously. If the DGP is gaussian (not gaussian), weassume that it is normal, then the estimation is a ML (QML) estimation and it iscarried out by maximizing the multivariate gaussian log-likelihood function:

L(�) = �Tk2log(2�)� 1

2

PTt=1(log jHtj+ "0tH

�1t "t)

Under the assumption that �ijt = �ijt�it�jt, this log-likelihood function boils downto:

L(�) = �Tk2log(2�)� 1

2

PTt=1 log jDtRtDtj � 1

2

PTt=1 "

0t(DtRtDt)

�1"t

= �Tk2log(2�)� 1

2

PTt=1 log jRtj �

PTt=1 log jDtj � 1

2

PTt=1 v

0tR

�1t vt

Since all the parameters are estimated using the full log-likelihood function in onestep, the estimators are consistent, asymptotically normal and if the DGP is also

1vec operator stacks the colums of a matrix while vech operator stacks the columns of the lowertriangular part of a matrix.

9

normal they are asymptotically e¢ cient. The problem with this estimation would bethat in every iteration, the correlation matrix has to be inverted which slows downthe optimization process. On the other hand, when the number of series, k, increases,the number of parameters to be estimated will increase rapidly, which might causeconvergence problems speci�cally in small samples.If we would assume that the errors follow a t-distribution, then the multivariate

t-loglikelihood function to be maximized would be as in Fiorentini et al. (2003):

L(�; �) = T log(�(�k+12�))� T log(�( 1

2�))� Tk

2log(1�2�

�)� Tk

2log(�)

�PT

t=1

n�12log jHtj �

��k+12�

�log(1 + �

1�2�v0tR

�1t vt)

owhere � is the inverse of the degrees of freedom as a measure of tail thickness such

that it is always positive and strictly less than 0:5 for the existence of the second ordermoments. One concern on using a loglikelihood function based on t-distribution isthat, as shown in Newey and Steigerwald (1997), although it will yield more e¢ cientestimators than using gaussian loglikelihood when the assumed distribution is correct,there is the danger that one may end up sacri�cing consistency when it is not. Since thisinconsistency problem will not occur as long as both the true and assumed distributionsare symmetric, it will not be a problem for our Monte Carlo simulations in section 4.

3.1.2 Two-steps Estimation

This is the procedure proposed in Engle (2002) or in Engle and Sheppard (2001) for aDCC-GARCH model. The idea is to separate the estimation of the correlation para-meters from the mean and variance parameters. Assuming gaussian errors, in the �rststep all parameters except the correlation coe¢ cients, � = (�0; �0)0, are estimated bymaximizing the gaussian log-likelihood function mentioned above replacing the correla-tion matrix Rt with the identity matrix of size kxk; therefore maximizing the followingquasi-log-likelihood function:

QL(�) = �Tk2log(2�)�


2

PTt=1 v

0tvt

Notice that in the case where there are no volatility spillovers allowed, this �rststep estimation is equivalent to estimating k univariate models separately. See Engleand Sheppard (2001) for details. However, when there are volatility spillovers, this �rststep estimation must be done in a multivariate fashion.In the second step, given the estimates from the �rst step, �, the correlation coef-

�cients are estimated by:

QL( j�) = �12

PTt=1

�log jRtj � v0tR

�1t vt

10

Where the vt is the standardized residuals obtained in the �rst step. As shown inEngle and Shepperd (2001), this two step estimation yields estimators which are consis-tent and asymptotically normal but not fully e¢ cient since they are limited informationestimators. Their proof follows closely the discussion of the asymptotic properties oftwo-step GMM estimators in Newey and McFadden (1994). Bollerslev (1990) showsthat when the correlations are constant over time, this second step estimators are equalto the sample correlation matrix of the standardized residuals:

b�ij = Pt vitvjtp

(Pt v

2it)(

Pt v

2jt)

Although itdoesn�t have theoretical support, we also consider a two-step estimationusing log-likelihood function based on t-distribution. In the multivariate t-based log-likelihood function, we replace the correlation matrix Rt with Ik in the �rst step:

QL(�; �) = T log(�(�k+12�))� T log(�( 1

2�))� Tk

2log(1�2�

�)� Tk

2log(�)

�PT

t=1

n� log jDtj �

��k+12�

�log(1 + �

1�2�v0tvt)o

Similar to the gaussian log-likelihood case, in the case where there are no volatil-ity spillovers, we employ univariate estimation for each series while when there arevolatility spillovers, we solve the multivariate problem. In the second step of the esti-mation based on t-distribution, we use the second step estimation with the gaussianloglikelihood.

3.1.3 Three-steps Estimation

This procedure is mentioned in Bauwens (2006) for the gaussian log-likelihood estima-tion. In the �rst step, assuming constant variance for each error term, i.e. �2it = �2i 8t, and assuming that the the correlation matrix Rt is equal to the identity matrix forall t; we estimate parameters of the mean equation, �, and the variance parameters �2iby maximizing the following quasi-loglikelihood function:

QL(�; �2i ) = �Tk2log(2�)�

PTt=1 log jDj � 1

2

PTt=1 v

0tvt

whereD is a vector of conditional standard deviations: D = diag(h1=21 ; h

1=22 ; :::; h

1=2k ).

This �rst step estimation is equivalent to doing OLS estimation for the mean equations,equation by equation, given that under the assumptions made, the variance-covariancematrix is diagonal between the mean and variance parameters. Therefore the estima-tors obtained from this �rst step are consistent, disregarding that the errors are actuallyheteroskedastic. Later in the second step, the parameters of the variance equations, �,are estimated, given the estimates of the parameters of the mean equation, b�, imposingto the full loglikelihood function only the assumption that the correlation matrix Rt isreplaced with identity matrix. This gives the following quasi-loglikelihood function:

11

QL(�jb�) = �Tk2log(2�)�


2

PTt=1t "

0tD

�2t "t

where "t are the residuals obtained from the �rst step. After obtaining the estima-tors b� and � from the �rst two steps, in the third step, the correlation parameters areestimated as in the two-step estimation:

QL( jb�; �) = �12

PTt=1

�log jRtj � v0tR

�1t vt

where vt are the standardized residuals obtained from the �rst two steps. Similarly,

as in the two-step estimation, when the correlations are constant over time, this thirdstep estimators could be calculated from the sample correlations of the standardizedresiduals:

b�ij = Pt vitvjtp

(Pt v

2it)(

Pt v

2jt)

In case of a log-likelihood function based on t-distribution, these three steps areperformed in a similar manner. In the �rst step, mean parameters, �, are estimatedalong with the inverse of the degrees of freedom assuming homoskedasticity, i.e. �2it =�2i 8 t, by maximizing the �rst step loglikelihood function:

QL(�; �) = T log(�(�k+12�))� T log(�( 1

2�))� Tk

2log(1�2�

�)� Tk

2log(�)

�PT

t=1

n� log jDtj �

��k+12�

�log(1 + �

1�2�v0tvt)o

The estimated variances, �2i , and degrees of freedom, �, of the �rst step estimationare disregarded. In the second step, the variance parameters, �, and the inverse ofthe degrees of freedom, �, that is reported in the results are estimated given the meanparameter estimates, �, maximizing the following log-likelihood function:

QL(�; �j�) = T log(�(�k+12�))� T log(�( 1

2�))� Tk

2log(1�2�

�)� Tk

2log(�)

�PT

t=1

n� log jDtj �

��k+12�

�log(1 + �

1�2� "0tD

�2t "t)

oAs in the two-step estimation procedure, we use the gaussian log-likelihood function

to estimate the correlation parameters, in the third step of the three-step estimationwith t-based log-likelihood function. Our results from the MC studies didn�t changewhen we chose a t-based log-likelihood function in the estimation of correlation para-meters.

3.2 Estimation with RSDC-GARCH Model

The RSDC-GARCH model of Pelletier (2006) suggests correlations to be constant overtime but di¤erent between regimes where these regimes are driven by an unobserved

12

markov chain. In our paper, we will make use of his unrestricted two regime RSDCmodel. In this model, the correlation matrix at any time t could be either RL or RH .The matrix of transition probabilities is given by � = ff�L;L; �H;Lg; f�L;H ; �H;Hgg,where �i;j is the probability of passing from state j to state i. If we denote by t�1 allthe information up to t� 1, then a one-step estimation, with f(:) being the likelihoodfunction based on either Gaussian or Student-t distribution, would be obtained bymaximizing the following log-likelihood function for estimating � = (�0; �0; 0)0 , themean, variance and correlation parameters respectively is:

L(�) =PT

t=1 log f(Ytjt�1)

f(Ytjt�1) = f(YtjSt = L;t�1) � Pr(St = Ljt�1)+f(YtjSt = H;t�1) � (1� Pr(St = Ljt�1))

where f(YtjSt;t�1) is the likelihood conditional on the state and past values whilef(Ytjt�1) is the likelihood when the state is marginalized out. Pr(Stjt�1) denotesthe probability of being in a certain state given the past information. This probabilityis derived using Hamilton �lter (Hamilton, 1994, Chapter 22).In our two state model, the law of motion of Pr(Stjt�1) will boil down to:

Pr(St = Ljt�1) = (1� �H;H) + (�L;L + �H;H � 1)�h

f(Yt�1jSt�1=L;t�2)�Pr(St�1=Ljt�2)f(Yt�1jSt�1=L;t�2)�Pr(St�1=Ljt�2)+f(Yt�1jSt�1=H;t�2)�(1�Pr(St�1=Ljt�2))

iFor the initial condition of this iterative process, we use the long run probabilities

for each state which are calculated using the transition probabilities and that the sumof the long run probabilities should be equal to one.Since the number of correlation parameters to be estimated is 2k(k � 1) + 2 for k

time series, as Pelletier (2006) also mentions, one-step estimation is not practicable,especially when k is not small. Pelletier (2006) uses a data set consisting of four ex-change rate series. After removing the mean of the data, he estimates the correlationparameters separately from estimation of the variance parameters. In our paper, thiscorresponds to a three-step estimation without paying attention to the mean parame-ters or then a two-step estimation for a zero mean series. The estimation of meanand variance parameters can be processed in a single step or in successive two stepsas discussed in the case of other conditional correlation models. When estimating thecorrelation parameters in a separate step, the log-likelihood function will be maximizedtaking the mean and variance parameter estimates as known. On the other hand, if kis large, estimating the correlation parameters in a separate step could be even trou-blesome in terms of convergence. For this reason, Pelletier (2006) suggests using theEM algorithm of Dempster et al:(1977), noting that EM algorithm gives estimates thatare very close to the numerical maximum. To obtain the exact numerical maximum,he suggests using a limited number of iterations of a Newton type algorithm.

13

3.3 Asymptotic Properties

The one-step gaussian ML/QML estimators for the CCC, DCC and cDCC GARCHmodels are consistent and asymptotically normal such that:

pn(�n � �0) �A N(0; A�10 B0A

�10 )

where A0 is the negative expectation of the Hessian matrix evaluated at the trueparameter vector �0 and B0 is the expectation of the outer product of the score vec-tor evaluated at �0 obtained from the full-likelihood function mentioned in one-stepestimation.The two-step gaussian QML estimators mentioned are consistent as well and under

mild assumptions listed in Engle and Sheppard (2001) for a DCC-GARCH model,they are asymptotically normal such that A0 is a block lower-triangular matrix andtogether with B0, are obtained from the likelihood functions used in each step. Thepart of A0 that corresponds to the mean and variance parameters is block diagonalwhere each block comes from the second derivatives of the log-likelihood functions ofeach corresponding step. To obtain the part of A0 corresponding to the correlationparameters, we use the second derivatives from the full-likelihood function. Since A0 isblock lower triangular, the asymptotic variance of the two-step estimators are weaklygreater than the case of a block diagonal A0, which belongs to the one-step estimators.This loss of e¢ ciency is due to the estimation of the correlation parameters in a separatestep.Three-step QML estimators are consistent and their asymptotic distribution is very

similar to that of the two-step estimators. The part of A0 that corresponds to thecorrelation parameters is obtained in the same manner, however the part for the meanand variance equations parameters come from the second derivatives of the likelihoodfunctions of the corresponding �rst two steps, therefore for the parameters of each seriesforming a block diagonal matrix where the �rst block is for the mean and second blockis for the variance parameters. In the three-step estimation procedure there is loss ofe¢ ciency both because of separating the estimation of variance parameters from themean parameters, as mentioned by Bauwens (2001) and of separating the estimationof correlation parameters from the mean and variance parameters as pointed out inEngle and Sheppard (2001).The asymptotic properties of one-step t-based ML estimators can be found in

Fiorentini et al: (2003). It is well known that if the true distribution of the errorsis also a Student-t, ML estimators based on t-errors are more e¢ cient than QML esti-mators based on gaussian errors. However, as shown by Newey and Steigerwald (1997),care should be taken when computing ML estimations based on t-distribution since ifthe true distribution of the errors is not Student-t, one might end up sacri�cing con-sistency. In our MC studies, as Newey and Steigerwald (1997) requires for consistency

14

of the t-based estimators, the data is either generated from a symmetric gaussian or asymmetric t-distribution.Finally, the asymptotic properties of the one-step and multiple steps estimators of

the RSDC-GARCH model are similar and can be found in Pelletier (2006).

4 Monte Carlo Experiments

We �rst analyze the �nite sample performance of multiple steps estimators in di¤erentsettings using CCC-GARCH model with gaussian errors. Although we could focus onthe performance of each parameter estimator, it is not really practical when we considerdi¤erent models, sample sizes, estimators and parameters. In this sense, later whenwe consider CCC, DCC, cDCC and RSDC GARCH models with gaussian or Student-terrors, we rather look at the in-sample predictions of volatilities and correlations tocompare di¤erent estimators. For RSDC GARCH models the correlations are drivenby an unobservable markov chain and therefore the correlation parameter estimatorswill be analyzed instead of correlation predictions. Finally, for the models that themultiple steps estimation yields estimators which are very close to one-step estimators,we will do a robustness check with respect to the distribution of the errors.We consider a replication size of 1000 and in each replication a data set of k series

is generated according to the relevant model and distribution. We consider two timeseries in all the experiments except one in the �rst part where we check the robustnessof the results when there are three time series. Then we estimate the parameters ofthe model using one-step, two-step and three-step estimators assuming either gaussianor t-distribution for the errors. All simulations are performed by MATLAB computerlanguage. We consider three di¤erent sample sizes 500, 1000 and 5000 for the �rstpart. However, having realized that the 5000 sample size is already too large, in theremaining parts we consider the sample sizes 200, 500 and 1000.In the �rst part only, to compare di¤erent estimators, the true parameter values

and the Monte Carlo average of the parameter estimates together with their corre-sponding standard errors from each procedure are reported. In addition, for eachparameter, average deviations of two-steps from one-step estimators and of three-stepsfrom two-steps estimators are reported with their corresponding Monte Carlo standarddeviations. In all the parts, kernel density estimates of di¤erent estimators of eachparameter are also plotted to compare the performance of multiple steps estimators foreach sample size. Also these density estimates allow us to observe graphically the biasand standard deviations of the di¤erent parameter estimators.

15

Onestep Est Twostep Est Threestep EstDeviation of 2stepest from

1stepestDeviation of 3stepest from

2stepests500 s1000 s5000 s500 s1000 s5000 s500 s1000 s5000 s500 s1000 s5000 s500 s1000 s5000

μ1 0.2 0.20677 0.20410 0.20113 0.20695 0.20413 0.20117 0.20759 0.20382 0.20119 0.00018 0.00003 0.00005 0.00064 0.00031 0.00001 (0.04973) (0.03642) (0.01631) (0.05029) (0.03692) (0.01628) (0.05318) (0.03901) (0.01749) (0.00660) (0.00433) (0.00186) (0.01854) (0.01387) (0.00593)

β1 0.8 0.79324 0.79636 0.79928 0.79313 0.79645 0.79927 0.79200 0.79621 0.79909 0.00011 0.00009 0.00001 0.00113 0.00024 0.00017 (0.02826) (0.02006) (0.00873) (0.02888) (0.02046) (0.00874) (0.03048) (0.02183) (0.00955) (0.00583) (0.00398) (0.00173) (0.01051) (0.00811) (0.00360)

μ2 0.4 0.40322 0.40339 0.40022 0.40311 0.40327 0.40006 0.40302 0.40389 0.40012 0.00010 0.00012 0.00016 0.00009 0.00062 0.00005 (0.06017) (0.04303) (0.01740) (0.06091) (0.04353) (0.01750) (0.06215) (0.04435) (0.01806) (0.00837) (0.00549) (0.00236) (0.01446) (0.01109) (0.00507)

β2 0.6 0.59620 0.59763 0.59980 0.59607 0.59774 0.59994 0.59615 0.59731 0.60008 0.00014 0.00011 0.00014 0.00009 0.00043 0.00013 (0.03807) (0.02621) (0.01135) (0.03893) (0.02683) (0.01150) (0.03943) (0.02743) (0.01190) (0.00762) (0.00516) (0.00222) (0.00966) (0.00745) (0.00348)

ω1 0.1 0.17979 0.12374 0.10339 0.18172 0.12290 0.10334 0.18273 0.12374 0.10335 0.00192 0.00084 0.00005 0.00102 0.00084 0.00001 (0.17909) (0.07936) (0.01929) (0.18270) (0.07195) (0.01947) (0.18413) (0.07482) (0.01948) (0.11269) (0.05054) (0.00322) (0.11759) (0.05370) (0.00025)

α1 0.1 0.10835 0.10306 0.09988 0.10853 0.10298 0.09990 0.10643 0.10231 0.09977 0.00018 0.00008 0.00002 0.00209 0.00068 0.00013 (0.04394) (0.03022) (0.01234) (0.04429) (0.03042) (0.01253) (0.04313) (0.03021) (0.01251) (0.00918) (0.00594) (0.00187) (0.01070) (0.00413) (0.00019)

γ1 0.8 0.70641 0.77192 0.79637 0.70502 0.77285 0.79640 0.70455 0.77244 0.79652 0.00139 0.00093 0.00003 0.00047 0.00041 0.00012 (0.20318) (0.09603) (0.02695) (0.20553) (0.08880) (0.02727) (0.20786) (0.09253) (0.02727) (0.12339) (0.05506) (0.00431) (0.13076) (0.05995) (0.00039)

ω2 0.1 0.26975 0.12014 0.05309 0.27334 0.13184 0.05318 0.29011 0.14590 0.05400 0.00359 0.01170 0.00009 0.01677 0.01407 0.00082 (0.30773) (0.17689) (0.01474) (0.31067) (0.19837) (0.01492) (0.33851) (0.23086) (0.03143) (0.21035) (0.13564) (0.00215) (0.21710) (0.15677) (0.02819)

α2 0.1 0.06083 0.05417 0.05014 0.06092 0.05379 0.05011 0.06058 0.05388 0.05005 0.00009 0.00038 0.00003 0.00034 0.00010 0.00006 (0.03588) (0.02312) (0.00869) (0.03653) (0.02386) (0.00874) (0.03540) (0.02336) (0.00884) (0.01187) (0.00796) (0.00121) (0.00865) (0.00532) (0.00125)

γ2 0.9 0.65975 0.82210 0.89662 0.65620 0.81012 0.89656 0.63727 0.79581 0.89577 0.00355 0.01198 0.00006 0.01893 0.01431 0.00080 (0.32238) (0.19192) (0.02074) (0.32471) (0.21186) (0.02095) (0.35500) (0.24291) (0.03530) (0.21535) (0.14075) (0.00297) (0.22300) (0.15847) (0.02950)

ρ 0.2 0.19921 0.20071 0.19999 0.19751 0.19985 0.19983 0.19760 0.19981 0.19982 0.00170 0.00086 0.00016 0.00010 0.00004 0.00000 (0.04359) (0.03147) (0.01379) (0.04325) (0.03134) (0.01378) (0.04322) (0.03134) (0.01378) (0.00155) (0.00084) (0.00011) (0.00124) (0.00072) (0.00012)

Table 1: Means and standard deviations of different estimators in Basic Setting. Bold: Statistically significant at 5 %.

real

valu

es

4.1 Parameter Estimators of the CCC-GARCH Model

In this �rst part, we consider di¤erent settings where we change either some parametervalues or model structure or the number of time series. In all these settings we simulatedata with gaussian errors and estimate the parameters with a gaussian log-likelihoodfunction. We start our Monte Carlo experiments with a bivariate model given byequations (1) to (3) with a diagonal matrix �. We �x the unconditional mean andvariance equal to 1. The mean and variance persistence is set to be di¤erent one fromeach other but quite high. Therefore in this bivariate model, which we will refer to asbasic setting later, we have 11 parameters to estimate. The true parameter values forthis basic setting, as well as Monte Carlo means and standard deviations of one-stepand multiple steps estimators are given in Table 1. To check the robustness of theresults, in the other settings we change only one assumption or property of this basicsetting keeping others �xed.In the second setting, we set the unconditional variance of one series approximately

six times the other by changing the true parameter values in the second varianceequation to: w2 = 1, �2 = 0:15, 2 = 0:7. In the third setting, we set the unconditionalmean of one series six times the other by changing the true mean parameters for the�rst series to: �1 = 0:3, �1 = 0:8. In the fourth setting we increase the power of a shockthat �rst return series receives on its conditional variance by changing the coe¢ cientsto: �1 = 0:35, 1 = 0:55. We allow for return interactions in the �fth setting bychosing a non diagonal � matrix with the following true parameter values: �1 = 0:1,�11 = 0:8; �12 = 0:1, �2 = 0:3, �21 = 0:1; �22 = 0:6. Finally, in the sixth setting, weconsider a trivariate model where we add another equation to the basic setting withthe following mean and variance parameters: �3 = 0:3, �3 = 0:7, w3 = 0:05, �2 = 0:15, 2 = 0:8. The true values for the correlation parameters are chosen in this sixth setting

16

as: �12 = 0:1, �13 = 0:2, �23 = 0:3. To save space, Table 1 contains the results for onlythe basic setting. Although not included in this paper, similar tables for the results ofother settings are available from the authors upon request.The results obtained are similar in all these settings. First thing to note is that

the Monte Carlo bias and standard error of all estimators considered are decreasingas the sample size increases, as shown in Table 1 for the basic setting. The resultsof coe¢ cient tests in every setting suggest that signi�cant biases in the estimators forsmall samples seem to disappear when we move to larger samples, as expected whenestimators are consistent.In some cases, multiple steps estimators seem to perform better than the one-step

estimator; see, for example Table 1, where the average of the three-step estimates of �1and �2 are closer to the true parameter values compared to one-step estimates. There-fore, we can not conclude that in �nite samples, multiple steps estimators over/underestimate the parameters in a systematic manner.Like for the basic setting, in all these settings we analyzed the deviation of two-

steps estimators from one-step estimators for the loss of e¢ ciency due to estimating thecorrelation separately and also the deviation of three-steps estimators from two-stepestimators for the loss of e¢ ciency due to estimating mean and variance parametersseparately. For all the settings, the results suggest that the two-steps estimators followthe one-step estimators very closely such that even in small samples the deviation isvery small. Similarly, the three-steps estimators are very close to two-steps estimators.We can also con�rm these results visually from Figure 1 for the basic setting where weplot kernel density estimates of the one-step and multiple steps estimators for samplesize 500. Even for this small sample size the modes of the estimated densities of one-stepand multiple steps estimators are very close to the true value of the parameter. There isvery little, if any, di¤erence between the density estimates for di¤erent estimators. Asthe sample size increases, these densities become more and more concentrated aroundtrue parameter values. We don�t include the �gures for higher sample sizes in this �rstpart to save space.

17

00.

20.

40.

60.

81

051015

ω1

00.

20.

40.

60.

81

051015

α1

00.

20.

40.

60.

81

051015

γ 1

00.

20.

40.

60.

81

051015

ω2

00.

20.

40.

60.

81

051015

α2

00.

20.

40.

60.

81

051015

γ 2

00.

20.

40.

60.

81

051015µ 1

1ste

p2s

tep

3ste

ptr

ue v

alue

00.

20.

40.

60.

81

051015

β 1

00.

20.

40.

60.

81

051015

µ 2

00.

20.

40.

60.

81

051015

β 2

00.

20.

40.

60.

81

051015

α1+γ

1

00.

20.

40.

60.

81

051015

α2+γ

2

00.

20.

40.

60.

81

051015ρ

CC

Cn

CC

Cn

Sam

ple

size

: 50

0

Figure1:Kerneldensityestimatesofdi¤erentestimatorsinBasicSetting

18

CCCGARCH

DCCGARCH

cDCCGARCH

ECCCGARCH

RSDCGARCH

μ1 0.20 0.20 0.20 0.20 0.20β11 0.80 0.80 0.80 0.80 0.80μ2 0.40 0.40 0.40 0.40 0.40β22 0.60 0.60 0.60 0.60 0.60ω1 0.10 0.10 0.10 0.10 0.10α11 0.10 0.10 0.10 0.10 0.10α12 0.10 γ11 0.80 0.80 0.80 0.80 0.80γ12 0.80 ω2 0.05 0.05 0.05 0.05 0.05α21 0.05 α22 0.05 0.05 0.05 0.05 0.05γ21 0.90 γ22 0.90 0.90 0.90 0.90 0.90ρ 0.20 0.20 ρL 0.20ρH 0.80λ 0.20 0.20 δ1 0.04 0.04 δ2 0.94 0.94 πL 0.80πH 0.90Table 2. The true parameter values used to simulate thedata used in the MC experiments in part 2. "λ" is theoffdiagonal parameter of the intercept in DCC and cDCCGARCH models.

4.2 Predicted Volatilities and Correlations

Focusing on the di¤erences of one-step and multiple steps estimators for each parameterfor all the models is not practical. Therefore instead, in this part and the next one, wewill focus on the in-sample predictions of volatilities and correlations. Since correlationparameter estimators depend on the standardized residuals which in turn depends onmean parameter estimators through the residuals and variance parameter estimatorsthrough the variances used for standardization, the in-sample predictions of volatilitiesand correlations could be used to see the e¤ects of estimating in multiple steps.As we could obtain the in-sample predictions of volatilities and correlations using

di¤erent estimators, we could also derive the volatilities and correlations implied bythe true parameter values. For a given sample size T , if we call the volatility predictorsh ji;t as the volatilities of series i over time t obtained from estimator j (one-step, two-

steps or three-steps estimator) and if we denote htruei;t as the true volatilities, then theestimator for the deviation in the volatilities of series i will be:

�hji;t =1

1000

PTt=1

nhji;t � htruei;t

oSimilarly, estimators for the deviations in correlations over time between series i

and j, pij;t, can be calculated as:

�pjij;t =1

1000

PTt=1

�pjij;t � ptrueij;t

When the conditional volatility and conditional correlation predictions were calcu-

lated the starting values were taken as the unconditional volatilities and unconditionalcorrelations.

19

Finally we plot the kernel density estimates of these deviation estimators to comparethe performances of one-step, two-steps and three-steps estimators.In this part we consider a range of GARCH models. For the CCC-GARCHmodel of

Bollerslev (1990), we take the parameters and structure of the basic setting of the �rstpart. We also consider DCC-GARCH model of Engle (2002), cDCC-GARCH modelof Aielli (2006), ECCC-GARCH of Jeantheau (1998) and RSDC-GARCH model ofPelletier (2006). The true values of the mean, variance and correlation parametersconsidered are given in Table 2. On the other hand, when simulating the data andestimating the parameters we assume either Gaussian or Student-t distribution. Whenusing the Student-t distribution, we chose the degrees of freedom equal to �ve, whichtranslates to fairly fat tails for the distribution of the errors.For the models where the one-step and multiple steps estimators yield similar

volatility and correlation predictions, we do a robustness check for the distributionin the third part. Checking the robustness of the results to simulating with one modeland estimating with another model is outside of the focus of this paper and a subjectfor future research.

4.2.1 With Gaussian Errors

We �rst assume that the true distribution of the errors is normal and we thereforeestimate using a gaussian loglikelihood function. As mentioned before, the one-stepand multiple steps estimators in this case are consistent asymptotically normal howeverthe standard errors will be weakly greater for the latter estimators. See Engle andSheppard (2001) for details.Figure 2 shows for the CCC-GARCHmodel, the kernel density estimates of "devinh1",

"devinh2" and "devincorr", the deviation estimators for the volatilities of the �rst andsecond series and the correlations, which are obtained from one-step and multiple stepsestimators using sample sizes 200, 500 and 1000. "CCCn�CCCn" is the name givento the �gure to represent that the data is simulated and estimated with a CCC-GARCHmodel with normal errors. The �gure suggests that the deviation estimators are dis-tributed with a mode very close to zero and almost symmetric around zero for eventhe smallest sample size. Also, as the sample size grows, the distributions of thesedeviation estimators become degenerate around zero, suggesting that the volatilityand correlation predictors are consistent estimators of the ones implied by the trueparameters.Similar results were obtained for the DCC-GARCH, cDCC-GARCH, ECCC-GARCH

and RSDC-GARCH models and are available upon request.

20

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr.

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

1s2s3s

CCCnC CCn

Sample s iz e: 200 Sample s iz e: 500 Sample s iz e: 1000

Figure 2: Kernel density estimates of the estimators of deviations of in-sample volatilityand correlation predictors obtained from one-step and multiple steps Gaussian QMLestimators of the parameters of CCC-GARCH model for di¤erent sample sizes.

21

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr.

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

1s2s3s

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

CCCtCCCt


Figure 3: Kernel density estimates of the estimators of deviations of in-sample volatilityand correlation predictors obtained from one-step and multiple steps Student-t QMLestimators of the parameters of CCC-GARCH model for di¤erent sample sizes.

4.2.2 With Student-t Errors

In this second part, we assume that errors follow a t-distribution and we formulate ourloglikelihood functions using Student-t distribution (univariate or multivariate depend-ing on the method). We consider multiple steps estimators as explained in Section 3.Although not supported by theory, multiple steps estimation surprisingly yielded veryclose estimates to the ones obtained from one-step estimation for some models.Figure 3 plots kernel density estimates of the deviation estimators for the CCC-

GARCH model. Since the errors both in the simulation and estimation process follow aStudent-t distribution, the name of the �gure is "CCCt�CCCt". As can be seen fromthe �gure, the deviation estimators, obtained from the di¤erence of the true volatilitiesand correlations and the ones obtained from one-step and multiple steps estimators,follow each other very closely. However, compared to the normal distribution case,the tails of the distributions are thicker. As the sample size grows, these densityestimates of the deviation estimators become condensed around zero. Given that thedata is distributed from a symmetric Student-t distribution, one-step estimation whichis obtained from a t-based loglikelihood yields consistent estimators. In addition, themultiple steps volatility and correlation predictors follow the one-step ones very closely.For the DCC-GARCH similar results were obtained and are available from the authors

22

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1s2s3s

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr.

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

ECCCtECCCt


Figure 4: Kernel density estimates of the estimators of deviations of in-sample volatilityand correlation predictors obtained from one-step and multiple steps Student-t QMLestimators of the parameters of ECCC-GARCH model for di¤erent sample sizes.

upon request.

In the case of ECCC-GARCH, we obtained similar results, as shown in Figure 4,at least when the sample size was large. However, for the sample size 200, the densityestimate for the deviation estimator between the three-steps volatility predictor andthe volatility implied by the true parameters came out to be �at. This could be dueto the extreme shocks generated by the t-distribution.On the other hand, the �gure for the cDCC-GARCH model, Figure 5, shows that

the correlation predictor obtained using the three-step estimators is biased althoughthe volatility predictors are not. When we looked in detail, we realized that the biasmostly comes from the intercept estimator in the correlation equation. As we can seefrom the �gure, the two-steps predictors follow one-step predictors very closely for bothvolatilities and correlations. Therefore although one-step predictors are theoreticallyfeasible but two-steps predictors are not, we can conclude here that, practically byseparating the estimation of correlation parameters from the estimation of mean andvariance parameters, we don�t lose much from the e¢ ciency of the estimators.Similarly, Figure 6 shows the kernel density estimates for the volatility predictors

and for the correlation parameter estimators of the RSDC-GARCH model. Althoughthe density estimates for di¤erent correlation parameter estimators do not follow each

23

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1s2s3s

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr.

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

cDC CtcDCC t


Figure 5: Kernel density estimates of the estimators of deviations of in-sample volatilityand correlation predictors obtained from one-step and multiple steps Student-t QMLestimators of the parameters of cDCC-GARCH model for di¤erent sample sizes.

other very closely, the density estimates for the volatility predictors do. On the otherhand, the densities for the two steps estimators and three-steps estimators are relativelyclose to each other. In addition, although the one-step estimator is consistent in thismodel and setting, we don�t see that the mode of the density estimates for the one-stepestimator is very close to the true value of the parameter. Therefore we can concludethat the sample size could be too small for the results to emerge, but the existingresults suggest that two-steps estimation could be still usable.Therefore in this section, we conclude that when the true distribution of the errors is

Student-t and the estimations are performed using t-loglikelihoods, for the CCC, DCC,ECCC-GARCH models the two-steps or three-steps estimation is feasible although thisis not supported by the theory. For the cDCC and RSDC-GARCH model, according toour �ndings, we don�t suggest using a three-steps estimation, but a two-steps estimationwould yield very close estimates to the ones obtained by one-step estimation.

4.3 Robustness of the Estimators to Distribution

In this section, we changed the true distribution of the errors or the distribution as-sumption used to form the loglikelihood in order to check the robustness of the esti-mators. Whenever we generated a data using Student-t errors, we took the degrees of

24

1 0 10

2

4

6

8

10dev inh1

1s2s3s

1 0 10

2

4

6

8

10dev inh2

0 0.5 10

5

10

15

c orrs t1

0 0.5 10

5

10

15

c orrs t2

0 0.5 10

5

10

15

pc orrs t1

0 0.5 10

5

10

15

pc orrs t2

1 0 10

2

4

6

8

10dev inh1

1 0 10

2

4

6

8

10dev inh2

0 0.5 10

5

10

15

c orrs t1

0 0.5 10

5

10

15

c orrs t2

0 0.5 10

5

10

15

pc orrs t1

0 0.5 10

5

10

15

pc orrs t2

1 0 10

2

4

6

8

10dev inh1

1 0 10

2

4

6

8

10dev inh2

0 0.5 10

5

10

15

c orrs t1

0 0.5 10

5

10

15

c orrs t2

0 0.5 10

5

10

15

pc orrs t1

0 0.5 10

5

10

15

pc orrs t2

RSDCtRSDCt


Figure 6: Kernel density estimates of the estimators of deviations of in-sample volatilityand correlation predictors obtained from one-step and multiple steps Student-t QMLestimators of the parameters of RSDC-GARCH model for di¤erent sample sizes.

freedom as 5. We considered only the models and distributions listed in Table 3, sincefrom the previous part, we know that the three-step estimators based on t-loglikelihooddon�t perform well when estimation cDCC-GARCG and RSDC-GARCH models.Figure 7 shows the kernel density estimates for di¤erent estimators for the CCC-

GARCHmodel when the data is generated using a Student-t distribution and estimatedassuming gaussian errors. Compared to the case when the true distribution and as-sumed distribution are both normal, in this �gure we see that the tails of the estimateddensities are thicker. On the other hand, when we compare the results with the casewhen the true distribution and assumed distribution are both Student-t, we see thatthe �gures are very similar although in the latter case there are relatively more di¤er-ences between one-step and multiple steps estimators. Finally, in this part as well wecon�rm that the results hold that the multiple-steps estimators follow one-step estima-tors very closely and the density estimates of the deviation estimators collapse to zeroas the sample size grows. Similar �gures are obtained for the other models in the caseof simulating the data with Student-t and estimating using gaussian loglikelihood.When we simulate the data with gaussian errors and estimate the model with

Student-t loglikelihood, we see that the �gures look very similar to the case when thetrue distribution and the assumed distribution are both normal. Figure 8 shows theresults for the CCC-GARCH model in this case. This similarity makes sense since

25

Model Simulation EstimationCCCGARCH Studentt NormalCCCGARCH Normal StudenttDCCGARCH Studentt NormalDCCGARCH Normal StudenttcDCCGARCH Studentt NormalECCCGARCH Studentt NormalECCCGARCH Normal StudenttRSDCGARCH Studentt NormalTable 3: The models and distributionsconsidered in the robustness analysisin Section 4.3.

Student-t distribution has another parameter, namely the degrees of freedom, suchthat this distribution could approximate Gaussian distribution when this parameteris su¢ ciently large. In fact, when we were doing our MC experiments for this lastcase, we obtained very large estimates for the degrees of freedom. Similar �gures wereobtained and are available on request from the authors.

5 Empirical Application

This section illustrates how multiple step estimators can be used to estimate volatilityinteractions in several European stock markets. The question we would like to answer isthe following: are there � volatility leading" and/or �volatility lagging" stock markets inEurope?. In other words, among European stock markets, is there a particular market,say M, whose volatility is not a¤ected by the previous volatility of the others marketswhile the volatility of several markets depend on lagged volatility of market M?With this question in mind, we use data containing daily returns on 10 Euro-

pean stock markets: AEX (Amsterdam), ATX (Vienna), BEL-20 (Brussels), CAC-40(Paris), DAX (Frankfurt), FTSE-100 (London), IGBM (Madrid), MIBTEL (Milan),OMXS (Stockolm) and SMI (Switzerland)2 observed from January 8, 2002 to April

30, 2009. Descriptive statistics of the returns series, computed as 100� log�

ptpt�1

�, of

sample size 1770, are given in Table 2.Evidence against normality is provided by di¤erent normality tests. However, we

can still estimate consistently the parameters of the ECCC-GARCH model using theresults in Ling and McAleer (2003). Also, based on the results provided by the MonteCarlo experiments, using multiple steps estimators seems to be reasonable. Therefore,we have estimated the ECCC-GARCH model given in equations (1), (2) and (3) withthe following results:

2Series have been obtained in the webpage http://�nance.yahoo.com.

26

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1s2s3s

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr.

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

CCCtCCCn


Figure 7: Kernel density estimates of the estimators of deviations of in-sample volatilityand correlation predictors obtained from one-step and multiple steps Gaussian QMLestimators of the parameters of CCC-GARCH model for di¤erent sample sizes whenthe data is simulated with Student-t errors.

Table 1: Descriptive statistics of return series

Mean SD Skewness KurtosisAEX �0:04 1:77 0:04 9:02

ATX 0:03 1:62 �0:56 13:4

BEL-20 �0:02 1:45 �0:05 9:12

CAC-40 �0:02 1:65 0:13 8:99

DAX �0:01 1:74 0:15 7:80

FTSE-100 �0:01 1:41 �0:03 10:3

IGBM 0:01 1:42 �0:09 10:1

MIBTEL �0:02 1:33 �0:03 10:7

OMXS �0:01 1:53 0:21 8:11

SMI �0:01 1:39 0:14 8:79

27

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1s2s3s

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr.

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

1 0.5 0 0.5 10

2

4

6

8

10dev inh1

1 0.5 0 0.5 10

2

4

6

8

10dev inh2

0.2 0.1 0 0.1 0.20

20

40

60

80

100

120dev inc orr

CCCnC CCt


Figure 8: Kernel density estimates of the estimators of deviations of in-sample volatilityand correlation predictors obtained from one-step and multiple steps Student-t QMLestimators of the parameters of CCC-GARCH model for di¤erent sample sizes whenthe data is simulated with Gaussian errors.

28

b� = � �0:04 0:03 �0:02 �0:02 �0:01 �0:01 0:01 �0:02 �0:01 �0:01�0

b� = diag��0:03 0:03 0:06 �0:06 �0:05 �0:08 �0:04 �0:01 �0:00 0:01

�cW =

�0:00 0:03 0:02 0:01 0:02 0:00 0:03 0:06 0:16 0:06

�0

bA =

266666666666666664

� � 0:06 � � 0:05 � � � �� 0:14 � � � � � � � �� 0:05 � � 0:11 � � � �� 0:05 � � � � �� 0:02 � � � � 0:07 � �� 0:03 �� 0:06 � � �� 0:05 � � � �

377777777777777775

bG =

266666666666666664

0:81 � � � � � � � � �� 0:85 � � � � � � � �� 0:79 � � � � � � �� 0:45 � � � � � � �� 0:91 � � � � �� 0:48 � � � � � � �� 0:81 0:65 � � � � � � �� 0:39 � � � � � � �� 0:51 � � � � � � �� 0:41 � � � � � � �

377777777777777775

bR =

266666666666666664

1 � � � � � � � � �0:60 1 � � � � � � � �0:84 0:57 1 � � � � � � �0:93 0:60 0:84 1 � � � � � �0:88 0:59 0:79 0:91 1 � � � � �0:84 0:60 0:77 0:86 0:80 1 � � � �0:83 0:59 0:77 0:87 0:84 0:79 1 � � �0:86 0:60 0:79 0:89 0:86 0:81 0:84 1 � �0:80 0:59 0:73 0:82 0:80 0:76 0:76 0:78 1 �0:83 0:57 0:76 0:85 0:80 0:80 0:78 0:79 0:75 1

377777777777777775The standard errors of the parameter estimators are calculated analytically follow-

ing Hafner and Herwartz (2008). The VAR-ECCC-GARCH model could help us to

29

01/08/02 08/30/05 04/30/090

10

20

30

40

50

60

AEXATXBEL20CAC40DAXFTSE100IGBMMIBTELOMXSSMI

Figure 9: Estimated conditional volatilities of di¤erent time series over the periodconsidered.

answer our question given that in this model interactions among volatilities of di¤erentmarkets are allowed through parameters in matrices A and G. The estimated parame-ters suggest that, if there is a leading market in Europe, the �rst candidate would bethe stock market index BEL-20 in Brussels followed by the FTSE-100 in London. Whenestimating the volatility of the 10 European stock markets, the signi�cant coe¢ cientsreported in bA and bG show that previous day returns squared of the BEL-20 index a¤ectin a signi�cant way the volatility of AEX in Amsterdam and FTSE-100. Moreover,previous day volatility of the BEL-20 index a¤ect in a signi�cant way the volatility ofall the other markets with the exception of AEX, ATX and DAX. On the other hand,the volatility of the BEL-20 index depends on its previous day volatility and squaredreturns together with previous day squared returns of the FTSE-100 index.

6 Conclusions

In this paper we have performed several Monte Carlo experiments to study how muche¢ ciency is lost in small samples when estimating Vector Autoregressive MultivariateConditional Correlation GARCH models in multiple steps. The estimation of the cor-relation parameters could be separated from the estimation of the mean and varianceparameters, which we refer to as two-steps estimation. Also, the mean parameterscould be estimated in a �rst step, before doing this two-step estimation and we refer tothis procedure as three-steps estimation. Although one-step estimators are preferable

30

because of their theoretical properties, they are not always feasible and therefore, esti-mating the parameters of a model in multiple steps could be a reasonable alternative.To see how they perform in practice, we start by running several MC experiments usinga CCC-GARCH model with di¤erent model structures, parameters values or number ofseries. Our results show that, when the distribution of the errors is gaussian, multiplesteps estimators have a very good performance even in small samples.When other models are considered, namely ECCC and DCC-GARCH models, we

compute kernel density estimates of the in-sample volatility and correlation predic-tions obtained from one-step and multiple steps estimators, in order to compare theperformance of such estimators when �tting volatilities and correlations. If errors fol-low a gaussian (Student-t) distribution and estimation is based on the true gaussian(Student-t) distribution, we �nd that the volatility and correlation predictions obtainedwith di¤erent estimators are very similar. However, in the case of cDCC and RSDC-GARCH models with Student-t errors, we �nd that two-steps estimators outperformthree-steps estimators which seem to be biased even when the sample size increases.Finally, our results also show that the performance of multiple step estimators in

the models considered is robust to the error distribution. In other words, if the trueerror distribution is Student-t but estimation is based on the gaussian distribution,kernel density estimates of the in-sample volatility and correlation predictors obtainedfrom one-step and multiple steps estimators are quite similar. Analogously, if the trueerror distribution is gaussian but estimation is based on the Student-t distribution, thedensity estimates obtained are very similar and in this case, also similar to the onesobtained when the true and assumed distributions are both gaussian. This is due tothe fact that the Student-t distribution can approximate the gaussian distribution forlarge values of the degrees of freedom parameter.

Acknowledgements

Financial support from projects ECO2008-05721/ECON by the Spanish Governmentand IVIE9-09I by the Instituto Valenciano de Investigaciones Económicas is gratefullyacknowledged.

References

[1] Aielli, G. P. (2006). Consistent estimation of large scale dynamic conditional cor-relations. Unpublished paper: Department of Statistics, University of Florence.

[2] Bollerslev, T. (1987), A conditional heteroskedastic time series model for specula-tive prices and rates of return. Review of Economics and Statistics, 69, 542-547.A

31

[3] Bollerslev, T. (1990), Modelling the Coherence in Short-Run Nominal ExchangeRates: A Multivariate Generalized ARCH Model. The Review of Economics andStatistics, 72, 498-505.

[4] Bollerslev, Tim and Je¤rey M. Wooldridge (1992), "Quasi Maximum LikelihoodEstimation and Inference in Dynamic Models with time Varying Covariances,"Econometric Reviews, 11, 143-172.

[5] Bauwens, L., L. Sebastien and J.V.K. Rombouts (2006), Multivariate GARCHModels: A Survey. Journal of Applied Econometrics, 21, 79-109.

[6] Caporin, M. and M. McAleer (2009a), Do we really need both BEKK and DCC?A tale of two covariance models, Department of Economics, University of Padova(Available at SSRN: http://ssrn.com/abstract=1338190).

[7] Carnero, M. Angeles, Peña Daniel, Ruiz, Esther, 2004, "Persistence and Kurtosisin GARCH and Stochastic Volatility Models", Journal of Financial Econometrics,2, 319-342.

[8] Carnero, M.A., D. Peña and E. Ruiz (2007), E¤ects of outliers on the identi�cationand estimation of GARCH models. Journal of Time Series Analysis, 28(4), 471-497. A

[9] Carnero, M. Angeles, Peña Daniel, Ruiz, Esther, 2008, "Estimating and Forecast-ing GARCH Volatility in the presence of Outliers", Working Paper.

[10] Christodoulakis, G. A. and S.E. Satchell (2002), Correlated ARCH (CorrARCH):Modelling the time-varying conditional correlation between �nancial asset returns.European Journal of Operational Research, 139, 351-370.

[11] Conrad, C., Karanasos, M., 2008a. Negative volatility spillovers in the unrestrictedECCC-GARCH model. ETH Zurich, KOF Working Papers No. 189.

[12] DEMPSTER, A. P., N. M. LAIRD, AND D. B. RUBIN (1977): �Maximum like-lihood from incomplete data via the EM algorithm,�J. Roy. Statist. Soc. Ser. B,39(1), 1�38, With discussion.

[13] Engle, R., C.W.J.Granger and D. Kraft (1984), Combining competing forecastsof in�ation using a bivariate ARCH model. Journal of Economics Dynamics andControl, 8, 151-165.

[14] Engle, Robert, Sheppard, Kevin, 2001, "Theoretical and Empirical Properties ofDynamic Conditional Correlation M,ultivariate GARCH", NBERWorking Paper,No: W8554.

32

[15] Engle, R. (2002), Dynamic Conditional Correlation: A Simple Class of Multivari-ate Generalized Autoregressive Conditional Heteroskedasticity Models. Journal ofBusiness and Economic Statistics, 20, 339-350.

[16] Engle, Robert F., Shephard, Neil and Sheppard, Kevin,Fitting and Testing VastDimensional Time-Varying Covariance Models(October 2007). NYU Working Pa-per No. FIN-07-046. Available at SSRN: http://ssrn.com/abstract=1293629

[17] Fiorentini, G., Sentana, E., Calzolari, G. (2003), Maximum likelihood estima-tion and inference in multivariate conditionally heteroskedastic dynamic regressionmodels with Student-t innovations. Journal of Business and Economic Statistics21, 532�546. A

[18] HAMILTON, J. D. (1994): Time Series Analysis. Princeton University Press.

[19] Harvey AC, Ruiz E, Shephard N. 1992. Unobservable component time series mod-els with ARCH disturbances. Journal of Econometrics 52: 129�158.

[20] Jeantheau, T. (1998), Strong Consistency of Estimators For Multivariate ARCHModels. Econometric Theory, 14, 70-86.

[21] Ling, S. and M. McAleer (2003), Asymptotic Theory for a Vector ARMA-GARCHModel. Econometric Theory, 19, 280-310.

[22] Longin, Francois, Solnik, Bruno, 1995, "Is the correlation in international equityreturns constant: 1960-1990?", Journal of International Money and Finance, 14,1, 3-26.

[23] Nakatani, Tomoaki, Terasvirta, Timo, 2008, "Testing for Volatility Interactions inthe Constant Conditional Correlation GARCH Model", SSE/EFI Working PaperSeries in Economics and Finance, No: 649.

[24] Newey, Whitney K., McFadden, Daniel, 1994, "Large Sample Estimation andHypothesis Testing", Handbook of Econometrics, 4, 2111-2245.

[25] Newey WK, Steigerwald DS. 1997. Asymptotic bias for quasi maximum likelihoodestimators in conditional heteroskedasticity models. Econometrica 3: 587�599.

[26] Pelletier D. (2006), Regime switching for dynamic correlations. Journal of Econo-metrics, 131, 445-473.

[27] Silvennoinen, A. and T. Terasvirta (2008), Multivariate GARCH Models. To ap-pear in T. G. Andersen, R. A. Davis, J.-P. Kreiss and T. Mikosch, eds. Handbookof Financial Time Series. New York: Springer.

33

[28] Tse, Y.K. and A.K.C. Tsui (2000), A Multivariate Generalized AutoregressiveConditional Heteroscedasticity Model with Time-Varying Correlations. Journal ofBusiness and Economic Statistics, 20, 351-362.

34

estimating var-mgarch models in multiple...

Documents