
Data-driven bandwidth selection in the spectral estimation for the study of the spectrum in time

Mary-Ana ALLEN, J-M. AZAIS, J-R. LEON

October 2008

University of Toulouse (UPS) - Central University of Venezuela (UCV)


1. Spectral density estimation
2. Cross-validation methods
3. Simulations results
   - Cases from some theoretical spectra
   - Table of results: MISE
4. Spectrum in time
   - Using real data
   - Time spectrum derivative


1. Spectral density estimation

Let $\{X_k\}_{k \in \mathbb{Z}}$ be a stationary time series with zero mean and autocovariance function $\gamma(k)$. Suppose the process has a spectral density $f$.

Consider the vector

$$\vec{x} = (x_0, \ldots, x_{T-1})$$

of the sample $x_0, \ldots, x_{T-1}$, where the size $T$ is an even number.


Let $\lambda_j = j/T$ for $j = 0, \ldots, T-1$ be the Fourier frequencies in $[0, 1]$.

The periodogram of $\vec{x} \in \mathbb{C}^T$ is defined at the Fourier frequencies $\lambda_j$ by

$$I(\lambda_j) = \frac{1}{T} \left| \sum_{k=0}^{T-1} x_k e^{-i 2\pi \lambda_j k} \right|^2$$

Remark. It is known that the periodogram is an asymptotically unbiased estimator of the spectral density, but it is not a consistent estimator. Consistency is achieved by smoothing the periodogram.
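As an illustration of the definition above, the periodogram at the Fourier frequencies can be computed with a fast Fourier transform. This is a minimal sketch in Python with NumPy (the slides themselves work with the WAFO toolbox); the simulated white-noise sample and the function name `periodogram` are assumptions made only for the example.

```python
import numpy as np

def periodogram(x):
    """I(lambda_j) = |sum_k x_k exp(-i 2 pi lambda_j k)|^2 / T at lambda_j = j / T."""
    x = np.asarray(x, dtype=float)
    T = x.size
    # np.fft.fft returns sum_k x_k exp(-2 pi i j k / T) for j = 0, ..., T-1,
    # which is exactly the sum inside the modulus in the definition.
    return np.abs(np.fft.fft(x)) ** 2 / T

# Hypothetical sample: Gaussian white noise with an even size T = 512.
rng = np.random.default_rng(0)
x = rng.standard_normal(512)
I = periodogram(x)
freqs = np.arange(x.size) / x.size  # Fourier frequencies lambda_j in [0, 1)
```

For a real-valued sample the periodogram is symmetric, I(λ_j) = I(λ_{T−j}), which is why the cross-validation sums later in the talk only run up to j = T/2.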


In this work we study two types of spectral density estimators.

A. Kernel density estimator. Given by

$$\hat{f}_{K_h}(\lambda_j) = \frac{1}{Th} \sum_{k=0}^{T} K\left( \frac{\lambda_j - \lambda_k}{h} \right) I(\lambda_k)$$

where $K$ is a kernel and $h > 0$ (small) is called the bandwidth.

Remark. In our case we use the triweight kernel

$$K(u) = \frac{35}{32} (1 - u^2)^3 \, \mathbf{1}_{|u| \le 1}.$$
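The kernel estimator can be sketched as a weighted average of periodogram ordinates. The code below is an illustration under stated assumptions, not the authors' implementation: the sum is taken over k = 0, ..., T−1, and frequency differences are wrapped onto [−1/2, 1/2) so that the smoothing is circular, which uses the fact that the periodogram is 1-periodic. The names `triweight` and `kernel_spectral_estimate` are hypothetical.

```python
import numpy as np

def triweight(u):
    """Triweight kernel K(u) = 35/32 (1 - u^2)^3 on |u| <= 1, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 35.0 / 32.0 * (1.0 - u ** 2) ** 3, 0.0)

def kernel_spectral_estimate(I, h):
    """Smooth the periodogram I (given at lambda_j = j/T) with bandwidth h."""
    I = np.asarray(I, dtype=float)
    T = I.size
    lam = np.arange(T) / T
    # Assumption: circular distance between frequencies (periodogram is 1-periodic).
    d = (lam[:, None] - lam[None, :] + 0.5) % 1.0 - 0.5
    W = triweight(d / h)          # W[j, k] = K((lambda_j - lambda_k) / h)
    return W @ I / (T * h)        # 1/(T h) normalisation as in the formula

# Hypothetical usage on the periodogram of simulated white noise.
rng = np.random.default_rng(1)
x = rng.standard_normal(512)
I = np.abs(np.fft.fft(x)) ** 2 / x.size
f_hat = kernel_spectral_estimate(I, h=0.05)
```

The T-by-T weight matrix keeps the sketch close to the formula; for long samples a convolution over the few frequencies inside the kernel support would be cheaper.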


B. Lag window estimator. Defined by

$$\hat{f}_{L_w}(\lambda) = \sum_{k=-(L-1)}^{L-1} \hat{\gamma}(k) \, w(k/L) \, e^{-i 2\pi \lambda k}$$

where $\hat{\gamma}$ is the sample autocovariance function, $L \le T$, and $w(u)$ is an even, piecewise continuous function such that $w(0) = 1$, $|w(u)| \le 1$ for all $u$, and $w(u) = 0$ for $|u| > 1$.

Remark. The WAFO toolbox uses this estimator in a function that calculates the spectrum. Among the options of that function, we work in all cases with the Parzen window. This function uses one method for the selection of $L$.
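The lag window estimator with a Parzen window can be sketched as follows. This is a plain NumPy illustration of the formula, not the WAFO routine: the biased sample autocovariance (division by T) and the value L = 16 are assumptions, and the rule WAFO uses to select L is not reproduced here.

```python
import numpy as np

def parzen(u):
    """Parzen lag window: even, w(0) = 1, |w(u)| <= 1, w(u) = 0 for |u| > 1."""
    u = np.abs(np.asarray(u, dtype=float))
    inner = 1.0 - 6.0 * u ** 2 + 6.0 * u ** 3      # used for |u| <= 1/2
    outer = 2.0 * (1.0 - u) ** 3                   # used for 1/2 < |u| <= 1
    return np.where(u <= 0.5, inner, np.where(u <= 1.0, outer, 0.0))

def sample_autocov(x, max_lag):
    """Biased sample autocovariance for lags 0..max_lag (zero mean assumed)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    return np.array([np.dot(x[:T - k], x[k:]) / T for k in range(max_lag + 1)])

def lag_window_estimate(x, L, freqs):
    """Sum over |k| <= L-1 of gamma_hat(k) w(k/L) exp(-i 2 pi lambda k)."""
    gamma = sample_autocov(x, L - 1)
    k = np.arange(1, L)
    weights = parzen(k / L) * gamma[1:]
    # gamma_hat and w are even, so the complex sum reduces to a cosine sum.
    cosines = np.cos(2.0 * np.pi * np.outer(np.asarray(freqs, dtype=float), k))
    return gamma[0] + 2.0 * cosines @ weights

# Hypothetical usage: white noise, L = 16, evaluated on 64 frequencies in [0, 1).
rng = np.random.default_rng(2)
x = rng.standard_normal(512)
freqs = np.arange(64) / 64
spec = lag_window_estimate(x, L=16, freqs=freqs)
```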


2. Cross-validation methods

One data-driven method for bandwidth selection in the kernel density estimator is cross-validation. We study three different approaches, all of which are defined using the following notion:

We denote by $\hat{f}^{-j}_{K_h}(\lambda_j)$ the same kernel density estimator at frequency $\lambda_j$, computed without the value of the periodogram at that frequency.


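A. Cross-Validatory Log Likelihood (CVLL)

$$\mathrm{CVLL}(\hat{f}_{K_h}) = \frac{1}{T/2} \sum_{j=0}^{T/2} \left[ \ln \hat{f}^{-j}_{K_h}(\lambda_j) + \frac{I(\lambda_j)}{\hat{f}^{-j}_{K_h}(\lambda_j)} \right];$$

B. Wahba-Hurvich Discrete Blackman-Tukey (WHDBT)

$$\mathrm{WHDBT}(\hat{f}_{K_h}) = \frac{1}{T/2} \sum_{j=0}^{T/2} \left\{ \left[ \ln \hat{f}^{-j}_{K_h}(\lambda_j) - \left( \ln I(\lambda_j) + C \right) \right]^2 - \frac{\pi^2}{6} \right\};$$

where $C = 0.577216$.

C. Stuetzle's smoothed estimate (SES)

$$\mathrm{SES}(\hat{f}_{K_h}) = \frac{1}{T/2} \sum_{j=0}^{T/2} \left( \hat{f}^{-j}_{K_h}(\lambda_j) - I(\lambda_j) \right)^2$$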
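The three criteria can be evaluated on a grid of bandwidths and the bandwidth with the smallest score retained. The sketch below rests on several assumptions that are not stated in the slides: the criterion is minimised (the usual cross-validation convention), the leave-one-out estimate is renormalised by the sum of the remaining kernel weights instead of 1/(Th), only λ_j itself is left out (not its mirror frequency λ_{T−j}), and smoothing is circular. The function names and the small grid are hypothetical.

```python
import numpy as np

EULER_C = 0.577216  # the constant C in the WHDBT criterion

def triweight(u):
    return np.where(np.abs(u) <= 1.0, 35.0 / 32.0 * (1.0 - u ** 2) ** 3, 0.0)

def loo_estimate(I, h):
    """Kernel estimate at each lambda_j without the periodogram value I(lambda_j)."""
    T = I.size
    lam = np.arange(T) / T
    d = (lam[:, None] - lam[None, :] + 0.5) % 1.0 - 0.5   # circular distance (assumption)
    W = triweight(d / h)
    np.fill_diagonal(W, 0.0)                              # leave out frequency j
    s = W.sum(axis=1)
    if np.any(s == 0.0):
        raise ValueError("bandwidth too small: no other frequency inside the kernel support")
    return W @ I / s                                      # renormalised weights (assumption)

def cv_scores(I, h):
    """CVLL, WHDBT and SES for bandwidth h, summed over j = 0, ..., T/2."""
    T = I.size
    f = loo_estimate(I, h)
    j = np.arange(T // 2 + 1)
    fj, Ij = f[j], I[j]
    n = T / 2
    return {
        "CVLL": np.sum(np.log(fj) + Ij / fj) / n,
        "WHDBT": np.sum((np.log(fj) - (np.log(Ij) + EULER_C)) ** 2 - np.pi ** 2 / 6) / n,
        "SES": np.sum((fj - Ij) ** 2) / n,
    }

def select_bandwidth(I, grid, criterion="SES"):
    """Return the bandwidth in grid with the smallest value of the chosen criterion."""
    scores = [cv_scores(I, h)[criterion] for h in grid]
    return grid[int(np.argmin(scores))]

# Hypothetical usage: white noise of size 512 and a coarse illustrative grid.
rng = np.random.default_rng(3)
x = rng.standard_normal(512)
I = np.abs(np.fft.fft(x)) ** 2 / x.size
grid = np.linspace(0.005, 0.05, 10)
h_ses = select_bandwidth(I, grid, "SES")
```

The `ValueError` branch makes explicit that a bandwidth narrower than the spacing 1/T of the Fourier frequencies leaves no neighbouring ordinate to average.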


3. Simulations results

We simulate data using a WAFO function from different theoretical spectra, such as JONSWAP and Ochi-Hubble, and we estimate these spectra using the CVLL, WHDBT, SES, and WAFO methods. We work with simulated samples of size 512 and 2048.

We use the integrated square error (ISE) as the criterion to compare the results.

We show some results only for T = 2048. For the three cross-validation methods we use 99 bandwidth values from 0.001 to 0.05.
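The comparison criterion and the bandwidth grid can be written down concretely. This is a sketch under assumptions: the ISE is taken to be the integral of the squared difference between the estimated and the theoretical spectrum, approximated by the trapezoidal rule on the frequency grid, and the 99 bandwidths are assumed to be equally spaced (which gives a step of 0.0005). The function name is hypothetical.

```python
import numpy as np

def integrated_square_error(f_hat, f_true, freqs):
    """Trapezoidal approximation of the integral of (f_hat - f_true)^2 over freqs."""
    diff2 = (np.asarray(f_hat, dtype=float) - np.asarray(f_true, dtype=float)) ** 2
    widths = np.diff(np.asarray(freqs, dtype=float))
    return float(np.sum(0.5 * (diff2[1:] + diff2[:-1]) * widths))

# 99 bandwidth values from 0.001 to 0.05 (equal spacing is an assumption).
bandwidths = np.linspace(0.001, 0.05, 99)

# Toy check: an estimate that is off by 1 everywhere on [0, 0.5] has ISE 0.5.
freqs = np.linspace(0.0, 0.5, 101)
f_true = np.exp(-freqs)
ise = integrated_square_error(f_true + 1.0, f_true, freqs)
```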


Case 1: JONSWAP, with Hm0 = 7 and Tp = 11.

        WAFO     SES      CVLL     WHDBT
ISE     15.279   18.147   41.33    62.565


[Figure: four panels over the frequency range 0 to 0.5, titled "Density with h estimated by CVLL", "Density with h estimated by WHDBT", "Density with h estimated by SES", and "Density estimated by WAFO".]


Case 2: Ochi-Hubble 2, with Hm0 = 7 (def=4).

        WAFO     SES      CVLL     WHDBT
ISE     22.509   14.83    47.453   74.398


[Figure: four panels over the frequency range 0 to 0.5, titled "Density with h estimated by CVLL", "Density with h estimated by WHDBT", "Density with h estimated by SES", and "Density estimated by WAFO".]


Case 3: Ochi-Hubble 3 (def=7).

        WAFO     SES      CVLL     WHDBT
ISE     10.557   5.6506   23.954   42.091


[Figure: four panels over the frequency range 0 to 0.5, titled "Density with h estimated by CVLL", "Density with h estimated by WHDBT", "Density with h estimated by SES", and "Density estimated by WAFO".]


Table of results: MISE

We ran 1000 simulations and calculated the mean integrated square error (MISE).

T = 512             WAFO     SES      CVLL     WHDBT
JONSWAP             47.852   46.769   73.643   105.15
Ochi-Hubble 2       39.455   33.398   58.038   82.119
Ochi-Hubble 3(1)    24.213   17.589   39.854   56.051
Ochi-Hubble 3(2)    19.888   12.526   28.393   40.26

T = 2048            WAFO     SES      CVLL     WHDBT
JONSWAP             19.232   17.534   42.576   70.566
Ochi-Hubble 2       16.01    11.563   35.118   57.616
Ochi-Hubble 3(1)    10.735   5.5139   23.586   38.04
Ochi-Hubble 3(2)    8.1776   3.8169   16.843   27.255
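For reference, the tabulated quantity can be read as a Monte Carlo average of the ISE. The notation below is an assumption introduced for illustration: $R = 1000$ is the number of simulations, $\hat{f}^{(r)}$ is the estimate obtained from the $r$-th simulated sample, and $f$ is the theoretical spectrum. The slides do not state the integration range, so it is left unspecified.

$$\mathrm{ISE}_r = \int \left( \hat{f}^{(r)}(\lambda) - f(\lambda) \right)^2 d\lambda, \qquad \widehat{\mathrm{MISE}} = \frac{1}{R} \sum_{r=1}^{R} \mathrm{ISE}_r$$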


Boxplot JONSWAP

[Figure: boxplot of the ISE for the JONSWAP simulations. Horizontal axis: column number (1 to 4); vertical axis: values.]


Boxplot Ochi Hubble 2

[Figure: boxplot of the ISE for the Ochi-Hubble 2 simulations, T = 2048. Horizontal axis: column number (1 to 4); vertical axis: values.]


Boxplot Ochi Hubble 3

[Figure: boxplot of the ISE for the Ochi-Hubble 3(1) simulations, T = 2048. Horizontal axis: column number (1 to 4); vertical axis: values.]


Best performance

Analyzing the results, the best are obtained with SES, followed by WAFO. The CVLL and WHDBT methods give markedly larger errors in every case.

Remark. Although SES is the best, there are some outliers that worried us. After a detailed study we found that the worst results were obtained when the selected bandwidth was among the smallest of the possible choices. As a solution, we restrict these values.


4. Spectrum in time

Let $N$ be the size of a large sample and let $T \ll N$ be the width of the time window. At every point $x_k$, for $T/2 - 1 \le k \le N - 1 - T/2$, we calculate the periodogram of the points $x_{k-T/2}, \ldots, x_{k+T/2}$ and then smooth it, using the SES method for the bandwidth selection.

Using real data

We use data sets from the web page of CDIP (The Coastal Data Information Program). We have worked with 7-hour samples from some buoy stations where the sampling frequency is 1.28 Hz, which gives 32256 points.

We use a width of 1600 sec for the time window, i.e., 2048 points for the spectral estimation at each time point.
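The sample sizes quoted above follow directly from the sampling frequency, and the windowed periodograms can be sketched with a simple loop. The code is an illustration under assumptions: each window holds exactly T points, the `step` parameter is hypothetical (the slides estimate the spectrum at every time point, which corresponds to a step of 1), and the SES smoothing of each row is omitted.

```python
import numpy as np

fs = 1.28                    # sampling frequency in Hz (from the text)
N = round(7 * 3600 * fs)     # 7 hours of data -> 32256 points
T = round(1600 * fs)         # 1600 s window   -> 2048 points

def sliding_periodograms(x, T, step):
    """Periodogram of every length-T window of x, moving forward by step points."""
    x = np.asarray(x, dtype=float)
    starts = range(0, x.size - T + 1, step)
    out = np.empty((len(starts), T))
    for row, s in enumerate(starts):
        segment = x[s:s + T]
        out[row] = np.abs(np.fft.fft(segment)) ** 2 / T
    return out

# Tiny hypothetical check: a constant series of 10 points, windows of 4, step 2.
demo = sliding_periodograms(np.ones(10), T=4, step=2)
```

With step 1 and the full CDIP record this yields N − T + 1 = 30209 windows, so in practice each row would be smoothed and stored at a reduced frequency resolution, or a larger step would be used.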


Example 1: Spectrum in time for data of station 155 (CDIP)


Example 2: Spectrum in time for data of station 141 (CDIP)


Time spectrum derivative

Finally, we study the time derivative of the spectrum and calculate its norm. It is well known that the derivative represents the rate of change of a function. We can therefore regard the norm of the derivative at each time as a time series, and try to detect change-points in the spectrum in time by applying a test based on the distribution of the maximum of stationary Gaussian processes.
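One way to turn a time-varying spectrum into such a series is to differentiate it numerically in time and take a norm over frequency. The sketch below assumes a finite-difference derivative and an L2 norm over frequency approximated by a Riemann sum; the slides do not say which norm or which discretisation was used, and the function name and array layout are hypothetical.

```python
import numpy as np

def derivative_norm(S, dt, dfreq):
    """Norm over frequency of the time derivative of a spectrum array.

    S[i, j] is the spectral estimate at time i * dt and frequency j * dfreq.
    """
    S = np.asarray(S, dtype=float)
    dS_dt = np.gradient(S, dt, axis=0)                 # finite differences in time
    # Assumption: L2 norm in frequency, approximated by a Riemann sum.
    return np.sqrt(np.sum(dS_dt ** 2, axis=1) * dfreq)

# Toy check: a spectrum growing linearly in time at rate 2 for every frequency.
t = np.arange(6) * 0.5
S = 2.0 * t[:, None] * np.ones((1, 10))
norms = derivative_norm(S, dt=0.5, dfreq=0.1)
```

In the toy example the derivative equals 2 at all ten frequencies, so the norm is 2 · sqrt(10 · 0.1) = 2 at every time.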


Derivative (station 155)

[Figure: two panels. "Contour lines. Derivative of the spectrum in time": horizontal axis freq (Hz), vertical axis time (sec). "Norm of the derivative": horizontal axis time (sec), vertical axis norm.]


Derivative (station 141)

[Figure: two panels. "Contour lines. Derivative of the spectrum in time": horizontal axis freq (Hz), vertical axis time (sec). "Norm of the derivative": horizontal axis time (sec), vertical axis norm.]


References

[1] Beltrão, K. and Bloomfield, P. (1987) Determining the Bandwidth of a Kernel Spectrum Estimate. J. Time Series Anal., Vol. 8, No. 1, pp. 21-38.

[2] Hurvich, C. (1985) Data-Driven Choice of a Spectrum Estimate: Extending the Applicability of Cross-Validation Methods. J. Amer. Statist. Ass., Vol. 80, No. 392, pp. 933-940.

[3] Priestley, M. (1981) Spectral Analysis and Time Series. New York: Academic Press.
