
International Journal of Forecasting 28 (2012) 15–19


Discussion

Comments on ‘‘Kernel density estimation for time series data’’

Ana Pérez

Departamento de Economía Aplicada, Universidad de Valladolid, Spain


I am very pleased to have the opportunity to comment on this article, which makes interesting contributions to the literature on the estimation of evolving distributions and time-varying quantiles, and their applications to financial time series. The motivation of this paper is twofold. First, in univariate analysis, it is already well known that financial returns are non-Gaussian and that their distribution is not stable over time. Hence, features such as the variance, skewness and tail dependence change over time, and these changes can be measured using coefficients which are based on time-varying quantiles. Second, if we move to bivariate analysis, it seems that the co-movements of two assets are not captured adequately by linear correlation, and their relationship may change over time. Therefore, new measures of dependence are needed. In this framework, the probabilities associated with a copula become a useful tool for expressing the dependence in terms of quantiles. This leads to the concept of changing bivariate copulas.

The main achievements of the paper at hand fall under the heading of univariate analysis. In particular, the paper proposes a new method for estimating the whole changing distribution of a time series using non-parametric kernel estimation and exponentially weighted moving average (EWMA) filters. Once the time-varying cumulative distribution function (CDF) has been estimated, any time-varying τ-quantile, 0 < τ < 1, can be retrieved by inverting the estimated CDFs. The authors illustrate their methodology through an empirical application to a series of daily NASDAQ returns and provide quantile-based measures of changing tail dispersion and skewness. They also sketch an application to bivariate data with two series of Korea and Hong Kong return indices.

The proposed methodology seems to be very appealing and easy to apply, but there are certain aspects, both

E-mail address: [email protected].

0169-2070/$ – see front matter © 2011 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.ijforecast.2011.02.001

theoretical and practical, that deserve further explanation or need further development. For instance, the practitioner is left to make a subjective choice of the preset number m of observations needed to initialize the procedure, with no useful theory for guidance. How ‘‘big’’ or ‘‘small’’ should m be? How will the selection of the value of m affect the results? Similar comments apply to the procedure for computing quantiles from the CDF estimates. Some guidance is needed on the choice of the points where the CDF is to be computed, and also on the starting value for solving the inverse transformation numerically. It would also have been nice if the article had compared the new proposal with other methods proposed in the literature for estimating changing distributions and time-varying quantiles.

The first part of my discussion addresses these main concerns, which will be illustrated with some examples. The second part suggests possible applications of the methodology proposed by the authors and possible extensions to multivariate analysis.

1. Estimating changing distributions

The key contribution of this article is to provide a simple method for estimating the whole changing distribution, rather than some selected moments, as in previous papers. The authors propose to carry over the idea of exponential weighting, which is commonly used to estimate the level or the variance, to the estimation of the whole distribution. For instance, in estimating the changing level of a non-stationary time series y1, . . . , yT, we all know that the sample mean of the observations,

$$\hat{\mu} = \frac{1}{T}\sum_{i=1}^{T} y_i,$$


16 A. Pérez / International Journal of Forecasting 28 (2012) 15–19

Fig. 1. Time series simulated from an EGARCH–t7 model.

is not an appropriate choice, but a better estimator of the level at time t is provided by:

$$m_t = \sum_{i=1}^{t} \omega_{t,i}\, y_i,$$

where the weights decline in such a way that more weight is given to the most recent observations and the sum of the weights is one. The authors propose to estimate a time-varying distribution by extending this idea to the kernel estimators. Therefore, if the kernel estimator of the CDF for a sample of i.i.d. observations y1, . . . , yT is given by

$$\hat{F}(y) = \frac{1}{T}\sum_{i=1}^{T} H\!\left(\frac{y - y_i}{h}\right),$$

a time-varying CDF might be estimated at time t by introducing a weighting scheme as:

$$\hat{F}_t(y) = \sum_{i=1}^{t} \omega_{t,i}\, H\!\left(\frac{y - y_i}{h}\right),$$

where H(·) is a kernel. In particular, the authors propose to use either the Epanechnikov or the Gaussian kernel and the simple exponentially declining weights given by¹

$$\omega_{t,i} = \frac{1-\omega}{1-\omega^{t}}\,\omega^{\,t-i}.$$
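As a concrete illustration, the weighted estimator above can be sketched in a few lines of Python. The function names are mine and the Gaussian kernel is used for H, so this is only a sketch under those assumptions, not the authors' Matlab implementation:

```python
import numpy as np
from scipy.stats import norm

def ewma_weights(t, omega):
    """Exponentially declining weights w_{t,i} = (1 - omega)/(1 - omega^t) * omega^(t-i)."""
    i = np.arange(1, t + 1)
    return (1 - omega) / (1 - omega ** t) * omega ** (t - i)

def ewma_kernel_cdf(y, x, omega, h):
    """EWMA-weighted Gaussian-kernel estimate of the time-varying CDF F_t(x),
    using all observations y_1, ..., y_t contained in `y`."""
    w = ewma_weights(len(y), omega)
    return float(np.sum(w * norm.cdf((x - np.asarray(y)) / h)))
```

Since the weights sum to one and the integrated Gaussian kernel is itself a CDF, the estimate is automatically monotone in x and lies between 0 and 1.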

In order to compute Ft(y), feasible values of ω and h are needed. This article provides a rigorous method for estimating these parameters, since a closed expression for the likelihood function is worked out. Hence, the first step in applying this methodology is maximizing the likelihood function, given by:

$$L(\omega, h) = \frac{1}{T-m}\sum_{t=m}^{T-1} \ln \hat{f}_{t+1|t}(y_{t+1}) = \frac{1}{T-m}\sum_{t=m}^{T-1} \ln\left[\frac{1}{h}\sum_{i=1}^{t} K\!\left(\frac{y_{t+1}-y_i}{h}\right)\omega_{t,i}(\omega)\right],$$

where m is some preset number of observations used to initialize the procedure and K is, for example, the Epanechnikov or Gaussian kernel. Then, the estimated parameters, {ω, h}, are put back into the expression of Ft(·), and this is evaluated at any required value. Finally, the estimated CDF at time t can be drawn by plotting and connecting the points (y, Ft(y)).

¹ Note that this formula for the weights is not given in the paper.

Table 1
Parameters estimated by maximum likelihood for different values of m.

Parameter   m = 1     m = 10    m = 50    m = 100
ω           0.9931    0.9932    0.9940    0.9937
h           0.6616    0.6592    0.5951    0.6608
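To make the estimation step concrete, a minimal Python sketch of this likelihood might look as follows; I use the Gaussian kernel for K, a crude grid search stands in for a proper numerical optimizer, and all function names are mine:

```python
import numpy as np

def ewma_weights(t, omega):
    # w_{t,i} = (1 - omega)/(1 - omega^t) * omega^(t-i), summing to one
    i = np.arange(1, t + 1)
    return (1 - omega) / (1 - omega ** t) * omega ** (t - i)

def log_likelihood(y, omega, h, m):
    """Average one-step-ahead log density over t = m, ..., T-1 (Gaussian kernel K)."""
    T = len(y)
    total = 0.0
    for t in range(m, T):                      # y[t] plays the role of y_{t+1}
        u = (y[t] - y[:t]) / h
        k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
        total += np.log(np.sum(ewma_weights(t, omega) * k) / h)
    return total / (T - m)

def fit(y, m, omegas, hs):
    """Pick (omega, h) maximizing the likelihood over a small grid."""
    return max(((log_likelihood(y, o, h, m), o, h) for o in omegas for h in hs))[1:]
```

In practice one would refine the grid-search result with a standard optimizer; the point here is only that the likelihood is cheap to evaluate in closed form.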

Unlike other methods which have been used previously, the proposal by Harvey and Oryshchenko has the advantage of yielding a closed formula for Ft(y) which can be computed easily at any point y and does not even require the recursions which are also provided in the paper. As I will mention later, this will also be an advantage for retrieving time-varying quantiles from the CDF. However, the problem of selecting m receives scant attention in the paper, and the guidelines provided by the authors seem to be ad hoc and not very precise.

To address this point as to the way in which the choice of m will affect the results, I perform the following experiment. I simulate a series of 1000 observations, plotted in Fig. 1, from an EGARCH(1,1) model with Student-t7 errors, given by:

$$y_t = \sigma_t z_t; \qquad \log(\sigma_t^2) = \frac{1+0.5L}{1-0.5L}\left\{-0.3\, z_{t-1} + 0.5\left(|z_{t-1}| - \mu_z\right)\right\},$$

where L denotes the lag operator and μz = E|zt|.
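A replication of this simulation can be sketched as follows. I assume unit-variance (standardized) t7 innovations, so that μz = E|zt| has a closed form; the seed and function names are of course arbitrary:

```python
import math
import numpy as np

NU = 7  # degrees of freedom of the Student-t errors

def mu_abs_t(nu):
    """E|z| for a unit-variance Student-t_nu variable (standardized errors assumed)."""
    return (2 * math.sqrt(nu - 2) * math.gamma((nu + 1) / 2)
            / (math.sqrt(math.pi) * (nu - 1) * math.gamma(nu / 2)))

def simulate_egarch(T, seed=0):
    """Simulate y_t = sigma_t z_t with
    (1 - 0.5L) log sigma_t^2 = (1 + 0.5L) g_{t-1},
    where g_t = -0.3 z_t + 0.5 (|z_t| - mu_z)."""
    rng = np.random.default_rng(seed)
    mu_z = mu_abs_t(NU)
    scale = math.sqrt((NU - 2) / NU)        # rescale raw t draws to unit variance
    z = rng.standard_t(NU, size=T) * scale
    g = -0.3 * z + 0.5 * (np.abs(z) - mu_z)
    logs2 = np.zeros(T)
    for t in range(2, T):                   # ARMA(1,1) recursion for log sigma_t^2
        logs2[t] = 0.5 * logs2[t - 1] + g[t - 1] + 0.5 * g[t - 2]
    return np.exp(0.5 * logs2) * z
```

The recursion rewrites the lag-polynomial ratio as log σ²ₜ = 0.5 log σ²ₜ₋₁ + gₜ₋₁ + 0.5 gₜ₋₂, which is equivalent to the model displayed above.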

With these data, I compute and maximize the likelihood function L(ω, h) for several values of m, using the Gaussian kernel in all cases. The estimated values of ω and h are reported in Table 1. It seems that the weighting parameter ω is stable, but the bandwidth h depends on the value of m.

In order to obtain an estimate of the CDF for each time period, I split the support of my series, which goes from a minimum of −7.928 to a maximum of 7.5468, into N intervals, defined by the sample quantiles. Hence, I come up with a grid of N + 1 points, say x0, . . . , xN. For each t, the estimated CDF is evaluated at these points, yielding Ft(x0), . . . , Ft(xN), for t = 1, . . . , 1000. A visual impression of the changing distribution can be obtained by plotting these 1000 × (N + 1) points, which can be thought of as N + 1 trajectories, each one describing how Ft(xn) is evolving over time t. Fig. 2 plots the trajectories obtained by carrying out maximum likelihood with m = 50, i.e., with ω = 0.994 and h = 0.5951, and taking N = 10 for the grid, so that the points which make up the grid are the sample deciles.

Fig. 2. Estimated values of Ft(x0), Ft(x1), . . . , Ft(x10), t = 1, . . . , 1000, with m = 50.

Fig. 3. Estimated CDF at time t = 100 with m = 1 (solid) and m = 50 (dashed), N = 10.
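The grid evaluation just described can be sketched as follows (Gaussian kernel, my own function names); each column of the returned array is one of the N + 1 trajectories Ft(xn):

```python
import numpy as np
from scipy.stats import norm

def cdf_trajectories(y, grid, omega, h):
    """For each t, evaluate the EWMA-weighted kernel CDF at every grid point,
    returning a (T, len(grid)) array of trajectories F_t(x_n)."""
    y = np.asarray(y)
    grid = np.asarray(grid)
    T = len(y)
    out = np.empty((T, len(grid)))
    for t in range(1, T + 1):
        i = np.arange(1, t + 1)
        w = (1 - omega) / (1 - omega ** t) * omega ** (t - i)
        # (len(grid), t) matrix of kernel CDF values, collapsed by the weights
        out[t - 1] = norm.cdf((grid[None, :] - y[:t, None]) / h).T @ w
    return out

# grid at the sample deciles (N = 10), as in the experiment above:
# x = np.quantile(y, np.linspace(0, 1, 11)); F = cdf_trajectories(y, x, 0.994, 0.5951)
```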

The estimated CDF at the N + 1 points of the grid for a particular time period, namely t = 100, is plotted in Fig. 3 (dashed line), together with the estimated CDF corresponding to the maximum likelihood with m = 1, i.e., with ω = 0.9931 and h = 0.6616 (solid line). The two CDFs plotted are hardly distinguishable, with the one corresponding to m = 50 being slightly above the other for positive data values and slightly lower for negative data values. The same conclusions are drawn if the comparisons are performed with respect to m = 10 and m = 100. Also, a similar result is obtained if I make a finer grid by splitting the support into N = 50 intervals, though the resulting curves are obviously smoother. This seems to indicate that different values of m have very little impact on the estimated CDF. Therefore, it is to be expected that the value of m would not be relevant in obtaining time-varying quantiles by inverting estimated CDFs either.

2. Estimating time-varying quantiles

Time-varying quantiles can be retrieved from the kernel estimate of the CDF by considering ξt(τ) = Ft⁻¹(τ), 0 < τ < 1. Since the authors provide a closed form for computing Ft, the changing quantiles can be obtained by solving the equation Ft(ξt(τ)) = τ by numerical methods, given a starting value of ξt(τ). In order to get such an initial value, the authors propose evaluating Ft at a grid of N points and interpolating linearly between the two points closer to the required quantile. The authors argue that with a large value of N, the precision of the initial estimate will be sufficient for all practical purposes. However, I would have liked to see how the grid and the initial value affected the final solution.
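The two-step procedure (linear interpolation for a starting point, then numerical inversion) can be sketched as follows; I use SciPy's bracketing brentq root-finder rather than whatever routine the authors used, and the function names are mine:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def ewma_kernel_cdf(y, x, omega, h):
    """EWMA-weighted Gaussian-kernel CDF estimate F_t(x) from y_1, ..., y_t."""
    t = len(y)
    w = (1 - omega) / (1 - omega ** t) * omega ** (t - np.arange(1, t + 1))
    return float(np.sum(w * norm.cdf((x - np.asarray(y)) / h)))

def quantile_at_t(y, tau, grid, omega, h):
    """Return (interpolated initial value, exact numerical inverse) for the tau-quantile."""
    F = np.array([ewma_kernel_cdf(y, x, omega, h) for x in grid])
    xi0 = float(np.interp(tau, F, grid))              # linear interpolation step
    lo, hi = grid[0] - 10 * h, grid[-1] + 10 * h       # bracket containing the root
    xi = brentq(lambda x: ewma_kernel_cdf(y, x, omega, h) - tau, lo, hi)
    return xi0, xi
```

Comparing xi0 and xi for a fine grid reproduces the observation below that the interpolated value is already accurate for practical purposes.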

In order to discuss this problem, I perform the following analysis. Using the simulated EGARCH series from the previous section, I first estimate two sets of τ-quantiles by linear interpolation. The first set comes from using the grid defined in the previous section with N = 10, and the other comes from a grid with N = 50. Let us denote such initial quantiles as ξ¹⁰t0(τ) and ξ⁵⁰t0(τ), respectively.


Fig. 4. Simulated series with τ-quantiles, τ = {0.1, 0.5, 0.9}, estimated by linear interpolation from grids of different sizes: N = 10 (top) and N = 50 (bottom).

Fig. 4 represents the time series, together with, in the top panel, the quantiles ξ¹⁰t0(τ), τ = {0.1, 0.5, 0.9}, and, in the bottom panel, the quantiles ξ⁵⁰t0(τ) for the same τ. Clearly, there is not much difference between the 50% quantiles, but as we move to more extreme quantiles, i.e., with τ = {0.1, 0.9}, there is a remarkable difference between the results from interpolation based on a grid made up of 10 intervals and those based on a grid of 50 intervals, with the latter being much smoother, as expected.

The question which follows is whether these differences in the initial values carry over to the final quantiles obtained by inverting the CDF. Fig. 5 plots the estimated quantiles obtained by solving Ft(ξt(τ)) = τ, using as initial values the previous quantiles based on a grid of 10 intervals (top panel) and a grid of 50 intervals (bottom panel). Two conclusions emerge from this figure. First, the results coming from the two sets of initial values seem to be identical. Actually, the maximum difference between them is of an order of 10⁻¹⁴. Hence, the initial solution required to numerically invert the CDF does not seem very relevant, and a coarse grid will be sufficient as a first step. This is a great advantage over the other methods proposed in the literature, where both the size of the grid and the way in which the grid is made up have dramatic effects on the quantiles; see the discussion by Pérez (2009) on the paper by Harvey (2009). Second, this figure confirms that the quantiles estimated by interpolation from a fine grid are accurate enough (compare the plots in Fig. 5 with that in the bottom panel of Fig. 4), rendering the inversion of the CDF redundant, as was pointed out by the authors.

3. Some ideas for further research

Once the time-varying quantiles have been obtained, the methodology of Harvey (2010) can be applied to estimate time-varying copulas, and these can then be used to compute changing measures of association and tail dependence. In particular, this author proposes estimating time-varying copulas from time-varying marginal distributions by adapting a linear filter for binary data from structural time series models. Let y1t, y2t denote the two time series whose evolving relationship is to be measured. The procedure suggested by Harvey (2010) involves the following steps:

• Compute changing τ-quantiles for each of the series, say ξ1t(τ), ξ2t(τ), 0 < τ < 1.

• Define the indicator It(τ, τ) to be equal to 1 if {y1t ≤ ξ1t(τ) and y2t ≤ ξ2t(τ)}, where this event occurs with a probability given by the copula Ct(τ, τ) = p(y1t ≤ ξ1t(τ), y2t ≤ ξ2t(τ)).

• Estimate the copula using these indicators via the following EWMA filter:
Ct(τ, τ) = (1 − ω)It(τ, τ) + ωCt−1(τ, τ), with C0(τ, τ) = τ².

• Compute changing measures of association and tail dependence, as, for example:
p(y1t ≤ ξ1t(τ) | y2t ≤ ξ2t(τ)) = Ct(τ, τ)/τ.
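Assuming the quantile paths ξ1t(τ) and ξ2t(τ) have already been computed, the indicator-EWMA step of this procedure can be sketched as follows (the function name is mine):

```python
import numpy as np

def ewma_copula(y1, y2, q1, q2, tau, omega):
    """EWMA filter for the copula probability C_t(tau, tau):
    C_t = (1 - omega) * I_t + omega * C_{t-1},  with  C_0 = tau**2,
    where I_t = 1 if y_{1t} <= xi_{1t}(tau) and y_{2t} <= xi_{2t}(tau).
    q1, q2 are the precomputed time-varying tau-quantile paths."""
    I = ((np.asarray(y1) <= np.asarray(q1)) &
         (np.asarray(y2) <= np.asarray(q2))).astype(float)
    C = np.empty(len(I))
    prev = tau ** 2                      # independence-copula starting value
    for t, ind in enumerate(I):
        prev = (1 - omega) * ind + omega * prev
        C[t] = prev
    return C

# changing conditional tail probability p(y1 <= xi1 | y2 <= xi2) = C_t / tau:
# tail = ewma_copula(y1, y2, q1, q2, 0.1, 0.99) / 0.1
```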

Harvey (2010) includes an application of these measures using estimated quantiles based on adapting a filter for count data to the framework of structural time series models. It would be interesting to carry out similar work using time-varying quantiles computed from the methodology proposed in the paper at hand and performing some comparisons.

Fig. 5. Simulated series with τ-quantiles, τ = {0.1, 0.5, 0.9}, obtained by inverting the CDF with initial conditions from different grids: N = 10 (top) and N = 50 (bottom).

Finally, I would like to mention possible extensions to multivariate analysis. Generalizing some of the ideas in the paper, it would be interesting to analyze the dependences between more than two assets or more than two financial markets. For instance, what happens in the Asian and European markets when Wall Street is below a given target level? What is the probability of contagion among several markets? In order to answer such questions, we can use a multivariate changing copula, defined, at time t, as:

Ct(τ1, τ2, . . . , τk) = p(y1t ≤ ξ1t(τ1), y2t ≤ ξ2t(τ2), . . . , ykt ≤ ξkt(τk)).

Multivariate dependence measures based on quantiles have been proposed before in the literature for cross-sectional data; see for instance Joe (1990) and Wolff (1980). Among the measures proposed, it is worth mentioning the multivariate Blomqvist's beta, the multivariate Kendall's tau and the multivariate Spearman's rho. An interesting topic for further research would be the extension of these concepts to time series analysis. For instance, does it make sense to define a time-varying multivariate tail dependence coefficient as:

p(y1t ≤ ξ1t(τ), y2t ≤ ξ2t(τ) | y3t ≤ ξ3t(τ)) = Ct(τ, τ, τ)/(1 − τ)?
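By analogy with the bivariate filter, a k-variate version of the indicator-EWMA step could be sketched as follows; the τᵏ starting value is my own guess, extending C0 = τ² from the bivariate case, and the function name is hypothetical:

```python
import numpy as np

def ewma_multivariate_copula(Y, Q, tau, omega):
    """EWMA estimate of C_t(tau, ..., tau) = p(y_1t <= xi_1t, ..., y_kt <= xi_kt)
    from a (T, k) matrix of returns Y and matching quantile paths Q."""
    Y, Q = np.asarray(Y), np.asarray(Q)
    I = np.all(Y <= Q, axis=1).astype(float)   # joint-exceedance indicator
    C = np.empty(len(I))
    prev = tau ** Y.shape[1]                   # independence-copula starting value
    for t, ind in enumerate(I):
        prev = (1 - omega) * ind + omega * prev
        C[t] = prev
    return C
```

Ratios of this quantity to marginal probabilities would then give candidate time-varying multivariate tail dependence measures such as the one displayed above.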

Which properties will this and other possible measures have? Would they be useful in measuring further relationships in finance and also in other fields? These questions remain open, and I hope that they will encourage other researchers to keep on working in this area.

4. Conclusions

The proposed methodology for computing changing distributions and changing quantiles seems an attractive one for measuring both the time-varying features of a series of returns and the time-varying dependence between two assets. The method seems to be easy to apply and lacks some of the drawbacks of other methods. Interesting extensions to the multivariate framework could also be possible.

Acknowledgements

I thank the authors for their prompt replies to my questions and for making their Matlab codes available. Financial support from projects ECO2009-08100, VA092A08 and VA027A08 is also acknowledged.

References

Harvey, A. C. (2009). Dynamic distributions and changing copulas. Presented at the Workshop on Predictability in Financial Markets, International Institute of Forecasting, Lisbon, Portugal (unpublished).

Harvey, A. C. (2010). Tracking a changing copula. Journal of Empirical Finance, 17, 485–500.

Joe, H. (1990). Multivariate concordance. Journal of Multivariate Analysis,35, 12–30.

Pérez, A. (2009). Comments on ‘‘Dynamic distributions and changing copulas’’. Presented at the Workshop on Predictability in Financial Markets, International Institute of Forecasting, Lisbon, Portugal (unpublished).

Wolff, E. F. (1980). N-dimensional measures of dependence. Stochastica,4(3), 175–188.