Fourier Series (Spectral) Analysis
1
Fourier Series (Spectral) Analysis
2
Fourier Series Analysis
Suitable for modelling seasonality and/or cyclical patterns
Identifying peaks and troughs
3
A sine wave is a repeating pattern that goes through one cycle every 2π (i.e. 2 × 3.141593 = 6.283186) units of time.
For example,
x             0     1.57    3.14    4.71    6.28
y = sin(x)    0     1       0       -1      0
4
So sin (6.283186 + x) = sin (x)
5
Sine Wave Phase Shifts
Y = sin(x)
Z = sin(x + 1.5708)
0.5π (phase shift)
6
Amplitude
Y = 25 sin (x)
7
Combining Amplitude, Phase, Frequency and Wavelength
$$Y_t = A \sin(2\pi f t + P)$$
where
t = Time (i.e., 1, 2, 3, …, n)
Y_t = Value of time series at time t
A = Amplitude
f = Number of cycles per observation
n = Number of observations in time series
2π = One complete cycle
P = Phase shift
8
Amplitude determines the heights of peaks and depths of troughs
Phase shift determines where the peaks and troughs occur
Let L be the wavelength, i.e. the number of periods from the beginning of one cycle to the next.
$$L = \frac{1}{f}$$
9
[Figure: sine waves plotted over t = 0 to n, illustrating f = 1/n with L = n, and f = 2/n with L = n/2]
10
$$Y_t = 200 \sin\left(2\pi \tfrac{4}{48} t\right)$$
$$Z_t = 300 \sin\left(2\pi \tfrac{4}{48} t + 1.5708\right)$$
Series    A      f       P         n     L
Z         300    4/48    1.5708    48    12
Y         200    4/48    0         48    12
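As a quick numerical check of the two series above, the following Python sketch evaluates the sine-wave formula for Y and Z. The function name `sine_wave` and its parameter names are illustrative choices, not part of the original slides.

```python
import math

def sine_wave(t, amplitude, freq, phase=0.0):
    """Evaluate A * sin(2*pi*f*t + P) at time t."""
    return amplitude * math.sin(2 * math.pi * freq * t + phase)

n = 48
f = 4 / 48
wavelength = 1 / f  # L = 1/f, the number of periods per cycle

# Series Y (A = 200, P = 0) and Z (A = 300, P = 1.5708) for t = 1, ..., n
y = [sine_wave(t, 200, f) for t in range(1, n + 1)]
z = [sine_wave(t, 300, f, 1.5708) for t in range(1, n + 1)]

print("wavelength:", wavelength)
print("peak of Y:", max(y), "peak of Z:", max(z))
```

The peaks equal the amplitudes, and each series repeats every 12 observations, so 4 full cycles fit into the 48 observations.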
11
12
How to fit a single sine wave to a time series?
Consider the following data set:
Year    Quarter
        1       2       3       4
2002                    1.52    0.81
2003    0.63    1.06    1.46    0.80
2004    0.71    0.98    1.50    0.85
2005    0.65    1.04    1.47    0.85
2006    0.72    0.95    1.37    0.91
2007    0.74    0.98
n = 20
Assume one cycle completes itself every year, then f = 5/20 and L = 4
13
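So the model is:
$$Y_t = A \sin\left(2\pi \tfrac{5}{20} t + P\right) + \varepsilon_t = A \sin\left(\tfrac{\pi t}{2} + P\right) + \varepsilon_t$$
This equation is nonlinear in the phase shift P, so it cannot be estimated by standard linear regression techniques.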
14
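But note that
$$A \sin\left(\tfrac{\pi t}{2} + P\right) = A\left[\sin\left(\tfrac{\pi t}{2}\right)\cos P + \cos\left(\tfrac{\pi t}{2}\right)\sin P\right] = b_1 \sin\left(\tfrac{\pi t}{2}\right) + b_2 \cos\left(\tfrac{\pi t}{2}\right),$$
where
$$b_1 = A \cos P \quad\text{and}\quad b_2 = A \sin P.$$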
15
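Since
$$\sin^2 P + \cos^2 P = 1,$$
$$b_1^2 + b_2^2 = A^2\left(\cos^2 P + \sin^2 P\right) = A^2 \quad\Rightarrow\quad A = \sqrt{b_1^2 + b_2^2}.$$
Also,
$$\tan P = \frac{\sin P}{\cos P} = \frac{b_2}{b_1} \quad\Rightarrow\quad P = \tan^{-1}\left(\frac{b_2}{b_1}\right).$$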
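The conversion between (b1, b2) and (A, P) can be sketched in Python as follows. The function names are hypothetical. `math.atan2(b2, b1)` is used in place of a plain inverse tangent: it agrees with tan⁻¹(b2/b1) when b1 > 0 and picks the appropriate quadrant otherwise. The inputs 0.387 and 0.079 are the estimates reported in the regression output of this deck.

```python
import math

def to_amplitude_phase(b1, b2):
    """Convert regression coefficients (b1, b2) to amplitude A and phase P."""
    amplitude = math.hypot(b1, b2)   # sqrt(b1^2 + b2^2)
    phase = math.atan2(b2, b1)       # tan P = b2 / b1
    return amplitude, phase

def to_coefficients(amplitude, phase):
    """Convert (A, P) back to b1 = A cos P and b2 = A sin P."""
    return amplitude * math.cos(phase), amplitude * math.sin(phase)

A, P = to_amplitude_phase(0.387, 0.079)
b1, b2 = to_coefficients(A, P)
print(round(A, 3), round(P, 4))
print(round(b1, 3), round(b2, 3))
```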
16
So the model becomes:
$$Y_t = b_1 \sin\left(\tfrac{\pi t}{2}\right) + b_2 \cos\left(\tfrac{\pi t}{2}\right) + \varepsilon_t$$
t     Y_t     sin(πt/2)    cos(πt/2)
1     1.52     1            0
2     0.81     0           -1
⋮
20    0.98     0            1
17
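/* Read the 20 quarterly observations (2002 Q3 to 2007 Q2) */
data fourier1;
input y @@;
cards;
1.52 0.81 0.63 1.06 1.46 0.80 0.71 0.98 1.50 0.85
0.65 1.04 1.47 0.85 0.72 0.95 1.37 0.91 0.74 0.98
;
run;
/* Add the time index t and the sine/cosine regressors for f = 5/20 */
data fourier2;
set fourier1;
pi=3.1415926;
t+1; /* sum statement: t = 1, 2, ..., 20 */
s5=sin(pi*t/2);
c5=cos(pi*t/2);
run;
/* Regress y on s5 and c5; PROC REG includes an intercept by default */
proc reg data=fourier2;
model y = s5 c5;
output out=out1 predicted=p residual=r;
run;
/* List actual values, fitted values and residuals */
proc print data=out1;
var y p r;
run;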
18
The SAS System
The REG Procedure
Model: MODEL1
Dependent Variable: y
Number of Observations Read 20
Number of Observations Used 20
Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2           1.56010        0.78005      84.52    <.0001
Error              17           0.15690        0.00923
Corrected Total    19           1.71700
Root MSE          0.09607    R-Square    0.9086
Dependent Mean    1.00000    Adj R-Sq    0.8979
Coeff Var         9.60698
Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1               1.00000           0.02148      46.55    <.0001
S5            1               0.38700           0.03038      12.74    <.0001
C5            1               0.07900           0.03038       2.60    0.0187
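The regression output above can be cross-checked without SAS. The Python sketch below is not the SAS procedure itself: it relies on the fact that, over complete cycles, the sine and cosine regressors are orthogonal to each other and to the constant, so the least-squares estimates reduce to simple sums. Variable names are illustrative.

```python
import math

y = [1.52, 0.81, 0.63, 1.06, 1.46, 0.80, 0.71, 0.98, 1.50, 0.85,
     0.65, 1.04, 1.47, 0.85, 0.72, 0.95, 1.37, 0.91, 0.74, 0.98]
n = len(y)
w = 2 * math.pi * 5 / n  # angular frequency for f = 5/20, i.e. pi/2

# Least-squares estimates under orthogonality (valid for 0 < f < 1/2)
b0 = sum(y) / n
b1 = 2 / n * sum(v * math.sin(w * t) for t, v in enumerate(y, start=1))
b2 = 2 / n * sum(v * math.cos(w * t) for t, v in enumerate(y, start=1))

fitted = [b0 + b1 * math.sin(w * t) + b2 * math.cos(w * t)
          for t in range(1, n + 1)]
sse = sum((v - p) ** 2 for v, p in zip(y, fitted))   # error sum of squares
sst = sum((v - b0) ** 2 for v in y)                  # corrected total
ssm = sst - sse                                      # model sum of squares
r_squared = ssm / sst
f_value = (ssm / 2) / (sse / (n - 3))                # 2 and 17 degrees of freedom

print(round(b0, 5), round(b1, 5), round(b2, 5))
print(round(ssm, 5), round(sse, 5), round(r_squared, 4), round(f_value, 2))
```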
19
The SAS System
Obs    y       p          r
1      1.52    1.38700     0.13300
2      0.81    0.92100    -0.11100
3      0.63    0.61300     0.01700
4      1.06    1.07900    -0.01900
5      1.46    1.38700     0.07300
6      0.80    0.92100    -0.12100
7      0.71    0.61300     0.09700
8      0.98    1.07900    -0.09900
9      1.50    1.38700     0.11300
10     0.85    0.92100    -0.07100
11     0.65    0.61300     0.03700
12     1.04    1.07900    -0.03900
13     1.47    1.38700     0.08300
14     0.85    0.92100    -0.07100
15     0.72    0.61300     0.10700
16     0.95    1.07900    -0.12900
17     1.37    1.38700    -0.01700
18     0.91    0.92100    -0.01100
19     0.74    0.61300     0.12700
20     0.98    1.07900    -0.09900
MAPE = 8.525%
20
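So, $\hat{b}_1 = 0.38700$, $\hat{b}_2 = 0.07900$,
$$\hat{A} = \sqrt{0.38700^2 + 0.07900^2} = 0.395$$
$$\hat{P} = \tan^{-1}\left(\frac{0.079}{0.387}\right) = 0.2014$$
and the estimated model is
$$\hat{Y}_t = 1 + 0.395 \sin\left(\tfrac{\pi t}{2} + 0.2014\right)$$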
21
Fitting a set of sine waves to a time series
Amplitudes of peaks and troughs are rarely equal
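General Fourier series model:
$$Y_t = \sum_{j=1}^{h} A_j \sin\left(2\pi f_j t + P_j\right) + \varepsilon_t = \sum_{j=1}^{h} \left[b_{1j} \sin\left(2\pi f_j t\right) + b_{2j} \cos\left(2\pi f_j t\right)\right] + \varepsilon_t$$
So, the general model is a linear combination of cycles at the “harmonic” frequencies, each with its own amplitude ($A_j$) and phase shift ($P_j$).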
22
The highest harmonic, h, is n/2 if n is even and (n-1)/2 if n is odd.
Assume n is even:
Harmonic, j    Harmonic frequency, f_j = j/n    Wavelength, L_j = n/j
1              1/n                              n
2              2/n                              n/2
3              3/n                              n/3
⋮              ⋮                                ⋮
n/2            1/2                              2
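For a concrete series length, the harmonic table can be generated directly. This is a minimal Python sketch; `harmonic_table` is a hypothetical helper name, and n = 20 matches the quarterly example used in these slides.

```python
def harmonic_table(n):
    """Return (j, f_j, L_j) for each harmonic j = 1, ..., h."""
    h = n // 2  # equals n/2 for even n and (n-1)/2 for odd n
    return [(j, j / n, n / j) for j in range(1, h + 1)]

table = harmonic_table(20)
print(" j   f_j     L_j")
for j, f_j, L_j in table:
    print(f"{j:>2}  {f_j:.3f}  {L_j:7.4f}")
```

For n = 20 the 5th harmonic has frequency 0.25 and wavelength 4, i.e. the annual cycle in quarterly data.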
23
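Note that when $h = \frac{n}{2}$, $f_h = \frac{1}{2}$ and
$$\sin\left(2\pi f_h t\right) = \sin(\pi t) = 0.$$
So,
$$Y_t = \sum_{j=1}^{h-1} \left[b_{1j} \sin\left(2\pi f_j t\right) + b_{2j} \cos\left(2\pi f_j t\right)\right] + b_{2h} \cos\left(2\pi f_h t\right) + \varepsilon_t$$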
24
The general Fourier series model contains n unknowns to be estimated with n observations, i.e. no degrees of freedom are left for the error term!
The idea is NOT to use the General Fourier Series model for forecasting, but to use it for identifying significant cycles.
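The lack of degrees of freedom can be seen numerically: with n = 20 observations, an intercept, 9 sine/cosine pairs and one final cosine term reproduce the data exactly. The Python sketch below assumes n is even and uses the orthogonality of the harmonic regressors to compute the coefficients by simple sums instead of calling a regression routine. Names are illustrative.

```python
import math

y = [1.52, 0.81, 0.63, 1.06, 1.46, 0.80, 0.71, 0.98, 1.50, 0.85,
     0.65, 1.04, 1.47, 0.85, 0.72, 0.95, 1.37, 0.91, 0.74, 0.98]
n = len(y)
h = n // 2  # highest harmonic (n is even here)

b0 = sum(y) / n
coef = {}  # j -> (b1j, b2j) for harmonics 1, ..., h-1
for j in range(1, h):
    w = 2 * math.pi * j / n
    b1 = 2 / n * sum(v * math.sin(w * t) for t, v in enumerate(y, start=1))
    b2 = 2 / n * sum(v * math.cos(w * t) for t, v in enumerate(y, start=1))
    coef[j] = (b1, b2)
# Last harmonic: sin(pi*t) = 0, so only the cosine term remains
b2h = sum(v * math.cos(math.pi * t) for t, v in enumerate(y, start=1)) / n

def fitted(t):
    """Value of the full Fourier series model at time t."""
    total = b0 + b2h * math.cos(math.pi * t)
    for j, (b1, b2) in coef.items():
        w = 2 * math.pi * j / n
        total += b1 * math.sin(w * t) + b2 * math.cos(w * t)
    return total

n_params = 1 + 2 * len(coef) + 1
max_resid = max(abs(v - fitted(t)) for t, v in enumerate(y, start=1))
print(n_params, max_resid)
```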
25
data fourier3;
set fourier1;
pi=3.1415926;
t+1;
s1=sin(pi*t/10); c1=cos(pi*t/10);
s2=sin(pi*t/5); c2=cos(pi*t/5);
s3=sin(3*pi*t/10); c3=cos(3*pi*t/10);
s4=sin(2*pi*t/5); c4=cos(2*pi*t/5);
s5=sin(pi*t/2); c5=cos(pi*t/2);
s6=sin(3*pi*t/5); c6=cos(3*pi*t/5);
s7=sin(7*pi*t/10); c7=cos(7*pi*t/10);
s8=sin(4*pi*t/5); c8=cos(4*pi*t/5);
s9=sin(9*pi*t/10); c9=cos(9*pi*t/10);
s10=sin(pi*t); c10=cos(pi*t);
run;
26
proc reg data=fourier3;
model y = s1 c1 s2 c2 s3 c3 s4 c4 s5 c5 s6 c6 s7 c7 s8 c8 s9 c9 c10;
output out=out1 predicted=p residual=r;
run;
proc print data=out1;
var y p r;
run;
/* The P option writes the periodogram (P_01) to the OUT= data set */
proc spectra data=fourier1 out=out2 p;
var y;
run;
/* sq = line spectrum of the harmonics other than the 5th (period 4) */
data out2;
set out2;
sq=p_01;
if period=. or period=4 then sq=0;
if round(freq, .0001)=3.1416 then sq=.5*p_01;
run;
proc print data=out2;
sum sq;
run;
27
The SAS System
The REG Procedure
Model: MODEL1
Dependent Variable: y
Number of Observations Read 20
Number of Observations Used 20
Analysis of Variance
Source DF Sum of Squares Mean Square F Value Pr > F
Model 19 1.71700 0.09037 . .
Error 0 0 .
Corrected Total 19 1.71700
Root MSE . R-Square 1.0000
Dependent Mean 1.00000 Adj R-Sq .
Coeff Var .
28
Parameter Estimates
Variable DF Parameter Estimate Standard Error t Value Pr > |t|
Intercept 1 1.00000 . . .
s1 1 -0.000681000 . . .
c1 1 -0.00026440 . . .
s2 1 0.00339 . . .
c2 1 0.01208 . . .
s3 1 -0.00723 . . .
c3 1 0.01673 . . .
s4 1 -0.02540 . . .
c4 1 0.00964 . . .
s5 1 0.38700 . . .
c5 1 0.07900 . . .
s6 1 0.00974 . . .
c6 1 -0.02258 . . .
s7 1 0.03026 . . .
c7 1 -0.03044 . . .
s8 1 -0.00781 . . .
c8 1 -0.00714 . . .
s9 1 0.00672 . . .
c9 1 -0.00002737 . . .
c10 1 -0.07700 . . .
29
The SAS System
Obs y p r
1 1.52 1.52 2.1684E - 18
2 0.81 0.81 4.944E - 17
3 0.63 0.63 1.6534E - 16
4 1.06 1.06 4.944E - 17
5 1.46 1.46 1.4637E - 16
6 0.80 0.80 2.0914E - 16
7 0.71 0.71 2.6563E - 17
8 0.98 0.98 6.1908E - 17
9 1.50 1.50 1.03E - 16
10 0.85 0.85 6.0011E - 17
11 0.65 0.65 1.3623E - 16
12 1.04 1.04 2.7625E - 16
13 1.47 1.47 1.7445E - 16
14 0.85 0.85 2.7934E - 16
15 0.72 0.72 1.912E - 16
16 0.95 0.95 1.165E - 16
17 1.37 1.37 2.5013E - 16
18 0.91 0.91 2.2503E - 16
19 0.74 0.74 1.4246E - 16
20 0.98 0.98 5.8059E - 17
30
The SAS System
Obs FREQ PERIOD P_01 sq
1 0.00000 . 40.0000 0.00000
2 0.31416 20.0000 0.0000 0.00001
3 0.62832 10.0000 0.0016 0.00157
4 0.94248 6.6667 0.0033 0.00332
5 1.25664 5.0000 0.0074 0.00738
6 1.57080 4.0000 1.5601 0.00000
7 1.88496 3.3333 0.0060 0.00605
8 2.19911 2.8571 0.0184 0.01842
9 2.51327 2.5000 0.0011 0.00112
10 2.82743 2.2222 0.0005 0.00045
11 3.14159 2.0000 0.2372 0.11858
0.15690
31
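So, the estimated intercept is 1.0, and
$$\hat{b}_{11} = -0.000681; \quad \hat{b}_{21} = -0.0002644;$$
$$\hat{b}_{12} = 0.00339; \quad \hat{b}_{22} = 0.01208;$$
$$\vdots$$
$$\hat{b}_{15} = 0.387; \quad \hat{b}_{25} = 0.079;$$
$$\vdots$$
$$\hat{b}_{2,10} = -0.077$$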
32
The Line Spectrum (Periodogram)
To identify significant cyclical patterns.
The Line Spectrum is the amount of total sums of squares explained by the specific frequencies.
33
Line Spectrum can be computed as
$$P_j = \begin{cases} \dfrac{n}{2}\left(b_{1j}^2 + b_{2j}^2\right), & j \neq \dfrac{n}{2} \\[2ex] n\, b_{2j}^2, & j = \dfrac{n}{2} \end{cases}$$
The plot of the $P_j$'s versus the wavelength is called the periodogram. It measures the “intensity” of the specific cycles.
The “P_01” column in the SAS output gives the line spectrum for all harmonics except the last harmonic (j = n/2), where the correct line spectrum is P_01/2.
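The line spectrum of the quarterly example can be computed directly from the formula. The sketch below is a plain-Python illustration, not PROC SPECTRA itself; `line_spectrum` is a hypothetical helper name. For even n it applies the n·b² form at the last harmonic, so no halving of the result is needed.

```python
import math

def line_spectrum(y):
    """Return {j: P_j} for harmonics j = 1, ..., h of the series y."""
    n = len(y)
    h = n // 2
    spectrum = {}
    for j in range(1, h + 1):
        w = 2 * math.pi * j / n
        s = sum(v * math.sin(w * t) for t, v in enumerate(y, start=1))
        c = sum(v * math.cos(w * t) for t, v in enumerate(y, start=1))
        if n % 2 == 0 and j == h:
            b2 = c / n                      # sine term vanishes at j = n/2
            spectrum[j] = n * b2 ** 2
        else:
            b1, b2 = 2 * s / n, 2 * c / n
            spectrum[j] = n / 2 * (b1 ** 2 + b2 ** 2)
    return spectrum

y = [1.52, 0.81, 0.63, 1.06, 1.46, 0.80, 0.71, 0.98, 1.50, 0.85,
     0.65, 1.04, 1.47, 0.85, 0.72, 0.95, 1.37, 0.91, 0.74, 0.98]
P = line_spectrum(y)
total = sum(P.values())
for j in sorted(P):
    print(j, round(P[j], 5))
print("total:", round(total, 5))
```

The values sum to the corrected total sum of squares (1.717), which is why each P_j can be read as the part of the total variation explained by harmonic j.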
34
35
ANOVA test
Source             SS         df    F        Decision
5th harmonic       1.5601      2    84.52    Significant
Other harmonics    0.1569     17
10th harmonic      0.11858     1    49.51    Significant
Other harmonics    0.03832    16
7th harmonic       0.01842     2     6.48    Significant at 5%; not significant at 1%
Other harmonics    0.0199     14
4th harmonic       0.00738     2     3.54    Not significant
Other harmonics    0.01252    12
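The F statistics in the ANOVA table follow a sequential pattern: harmonics are taken in decreasing order of their line spectrum, and each is tested against what remains after removing it and all earlier harmonics. A minimal Python sketch of that arithmetic is given below. The function name is hypothetical, and the inputs are the rounded values from the SAS output. Critical values still have to be looked up in an F table, since the standard library has no F distribution.

```python
def sequential_f_tests(spectrum, df, total_ss, total_df, order):
    """Return rows (harmonic, SS, df, residual SS, residual df, F)."""
    rows = []
    resid_ss, resid_df = total_ss, total_df
    for j in order:
        resid_ss -= spectrum[j]   # sum of squares left for the other harmonics
        resid_df -= df[j]
        f_stat = (spectrum[j] / df[j]) / (resid_ss / resid_df)
        rows.append((j, spectrum[j], df[j], resid_ss, resid_df, f_stat))
    return rows

# Line spectrum values and degrees of freedom taken from the output above
spectrum = {5: 1.5601, 10: 0.11858, 7: 0.01842, 4: 0.00738}
df = {5: 2, 10: 1, 7: 2, 4: 2}  # the last harmonic (10th) has a cosine term only

rows = sequential_f_tests(spectrum, df, total_ss=1.717, total_df=19,
                          order=[5, 10, 7, 4])
for j, ss, d, rss, rdf, f_stat in rows:
    print(f"harmonic {j:>2}: SS={ss:.5f} df={d} "
          f"other SS={rss:.5f} df={rdf} F={f_stat:.2f}")
```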
36
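Thus, the 5th & 10th harmonics (or equivalently, the 4-period & 2-period cycles) are significant, which means a suitable model would be
$$\hat{Y}_t = 1 + 0.387 \sin\left(\tfrac{\pi t}{2}\right) + 0.079 \cos\left(\tfrac{\pi t}{2}\right) - 0.077 \cos(\pi t)$$
or equivalently,
$$\hat{Y}_t = 1 + 0.395 \sin\left(\tfrac{\pi t}{2} + 0.2014\right) - 0.077 \sin(\pi t + 1.5708)$$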
37
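Note:
$$0.077 \cos(\pi t) = 0.077 \sin(\pi t + 1.5708)$$
since
$$\sin\left(\pi t + \tfrac{\pi}{2}\right) = \sin(\pi t)\cos\left(\tfrac{\pi}{2}\right) + \cos(\pi t)\sin\left(\tfrac{\pi}{2}\right) = \cos(\pi t)$$
and $\tfrac{\pi}{2} \approx 1.5708$.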
38
The SAS System
The REG Procedure
Model: MODEL1
Dependent Variable: y
Number of Observations Read 20
Number of Observations Used 20
Analysis of Variance
Source DF Sum of Squares Mean Square F Value Pr > F
Model 3 1.67868 0.55956 233.64 <.0001
Error 16 0.03832 0.00239
Corrected Total 19 1.71700
Root MSE 0.04894 R-Square 0.9777
Dependent Mean 1.00000 Adj R-Sq 0.9735
Coeff Var 4.89387
Parameter Estimates
Variable DF Parameter Estimate Standard Error t Value Pr > |t|
Intercept 1 1.00000 0.01094 91.38 <.0001
s5 1 0.38700 0.01548 25.01 <.0001
c5 1 0.07900 0.01548 5.10 0.0001
c10 1 -0.07700 0.01094 -7.04 <.0001
39
The SAS System
Obs y p r
1 1.52 1.46400 0.056000
2 0.81 0.84400 -0.034000
3 0.63 0.69000 -0.060000
4 1.06 1.00200 0.058000
5 1.46 1.46400 -0.004000
6 0.80 0.84400 -0.044000
7 0.71 0.69000 0.020000
8 0.98 1.00200 -0.022000
9 1.50 1.46400 0.036000
10 0.85 0.84400 0.006000
11 0.65 0.69000 -0.040000
12 1.04 1.00200 0.038000
13 1.47 1.46400 0.006000
14 0.85 0.84400 0.006000
15 0.72 0.69000 0.030000
16 0.95 1.00200 -0.052000
17 1.37 1.46400 -0.094000
18 0.91 0.84400 0.066000
19 0.74 0.69000 0.050000
20 0.98 1.00200 -0.022000
MAPE = 4.113%
40
Out-of-Sample forecasts
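$$\hat{Y}_{21} = 1 + 0.395 \sin\left(\tfrac{\pi}{2}(21) + 0.2014\right) - 0.077 \sin\left(\pi (21) + 1.5708\right) = 1.464016$$
$$\hat{Y}_{24} = 1 + 0.395 \sin\left(\tfrac{\pi}{2}(24) + 0.2014\right) - 0.077 \sin\left(\pi (24) + 1.5708\right) = 1.002016$$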
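The two forecasts can be reproduced with a few lines of Python. This is a sketch using the rounded coefficients shown in these slides; the function names are illustrative. The linear (sine plus cosine) form is included to show that both parameterisations give nearly the same forecasts, with small differences due to rounding of A and P.

```python
import math

def forecast_amp_phase(t):
    """Forecast from the amplitude/phase form of the fitted model."""
    return (1 + 0.395 * math.sin(math.pi * t / 2 + 0.2014)
            - 0.077 * math.sin(math.pi * t + 1.5708))

def forecast_linear(t):
    """Forecast from the sine/cosine (regression) form of the fitted model."""
    return (1 + 0.387 * math.sin(math.pi * t / 2)
            + 0.079 * math.cos(math.pi * t / 2)
            - 0.077 * math.cos(math.pi * t))

y21 = forecast_amp_phase(21)
y24 = forecast_amp_phase(24)
print(round(y21, 6), round(y24, 6))
print(round(forecast_linear(21), 3), round(forecast_linear(24), 3))
```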
41
Example: U.S./New Zealand Foreign Exchange Rate (1986 Q2 to 2008 Q2), Quarterly Average
[Figure: U.S. / New Zealand Exchange Rate, actual and predicted series; x-axis: Date (Q2 1986 to Q4 2008); y-axis: Rate (0 to 0.9)]
42
data Nzea;
set Nzea;
D_USNEWZD =USNEWZD - lag(USNEWZD);
if Date = "Q2 1986" then delete;
run;
data Nzea;
set Nzea;
t+1;
pi=3.1415926;
s1= sin(pi*t/44*1);
c1= cos(pi*t/44*1);
.
.
.
c43= cos(pi*t/44*43);
s44= sin(pi*t/44*44);
c44= cos(pi*t/44*44);
run;
43
proc reg data= Nzea;
model D_USNEWZD = s1 c1 s2 c2 s3 c3 s4 c4 s5 c5 s6 c6 s7 c7 s8 c8 s9 c9 s10 c10 s11 c11 s12 c12 s13 c13 s14 c14
s15 c15 s16 c16 s17 c17 s18 c18 s19 c19 s20 c20 s21 c21 s22 c22 s23 c23 s24 c24 s25 c25 s26 c26 s27 c27
s28 c28 s29 c29 s30 c30 s31 c31 s32 c32 s33 c33 s34 c34 s35 c35 s36 c36 s37 c37 s38 c38 s39 c39 s40 c40
s41 c41 s42 c42 s43 c43 c44 ;
output out=out1 predicted=p residual=r;
run;
quit;
/* The P option writes the periodogram (P_01) to the OUT= data set */
proc spectra data= Nzea out=out2 p;
var D_USNEWZD;
run;
data out2; set out2; sq=p_01;
if period=. then sq=0;
if round (freq, .0001)=3.1416 then sq=.5*p_01;
run;
proc reg data= Nzea outest=outtest3;
model D_USNEWZD = c31 s31 c3 s3 c8 s8 ;
output out=out3 predicted=p residual=r;
run;
quit;
44
ANOVA test
Source SS df F Decision
31st harmonic 0.0090 2 4.8804 Significant
Other harmonics 0.0788 85
3rd harmonic 0.0066 2 3.7920 Significant
Other harmonics 0.0722 83
8th harmonic 0.0062 2 3.8210 Significant
Other harmonics 0.0660 81
9th harmonic 0.0039 2 2.4908 Insignificant
Other harmonics 0.0620 79
45
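Fitted model equation (for the differenced series D_USNEWZD)
$$\hat{Y}_t = 0.0025 - 0.0075 \sin\left(\tfrac{31\pi t}{44}\right) - 0.0122 \cos\left(\tfrac{31\pi t}{44}\right) + 0.0113 \sin\left(\tfrac{3\pi t}{44}\right) + 0.0048 \cos\left(\tfrac{3\pi t}{44}\right) - 0.0061 \sin\left(\tfrac{8\pi t}{44}\right) - 0.0102 \cos\left(\tfrac{8\pi t}{44}\right)$$
Forecasts
$$\hat{Y}_{2008Q3} = 0.0025 - 0.0075 \sin\left(\tfrac{31\pi (89)}{44}\right) - 0.0122 \cos\left(\tfrac{31\pi (89)}{44}\right) + 0.0113 \sin\left(\tfrac{3\pi (89)}{44}\right) + 0.0048 \cos\left(\tfrac{3\pi (89)}{44}\right) - 0.0061 \sin\left(\tfrac{8\pi (89)}{44}\right) - 0.0102 \cos\left(\tfrac{8\pi (89)}{44}\right) = -0.00098$$
Forecast rate for 2008 Q3: 0.7526 - 0.00098 = 0.7516
$$\hat{Y}_{2008Q4} = 0.0025 - 0.0075 \sin\left(\tfrac{31\pi (90)}{44}\right) - 0.0122 \cos\left(\tfrac{31\pi (90)}{44}\right) + 0.0113 \sin\left(\tfrac{3\pi (90)}{44}\right) + 0.0048 \cos\left(\tfrac{3\pi (90)}{44}\right) - 0.0061 \sin\left(\tfrac{8\pi (90)}{44}\right) - 0.0102 \cos\left(\tfrac{8\pi (90)}{44}\right) = 0.012407$$
Forecast rate for 2008 Q4: 0.7516 + 0.012407 = 0.7640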
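The forecast arithmetic can be checked with the Python sketch below. It rests on two assumptions that should be verified against the original slides: the coefficient signs are those that reproduce the two reported predicted changes (-0.00098 and 0.012407), and 0.7526 is taken to be the last observed rate (2008 Q2). Because the model is fitted to the differenced series, each predicted change is added to the previous level to obtain the forecast rate.

```python
import math

INTERCEPT = 0.0025
# (harmonic j, sine coefficient, cosine coefficient); signs are an assumption
TERMS = [(31, -0.0075, -0.0122), (3, 0.0113, 0.0048), (8, -0.0061, -0.0102)]

def predicted_change(t):
    """Predicted quarterly change in the exchange rate at time index t."""
    total = INTERCEPT
    for j, b_sin, b_cos in TERMS:
        angle = j * math.pi * t / 44   # 2*pi*(j/88)*t with n = 88
        total += b_sin * math.sin(angle) + b_cos * math.cos(angle)
    return total

last_rate = 0.7526                      # assumed observed rate for 2008 Q2
change_q3 = predicted_change(89)        # 2008 Q3
rate_q3 = last_rate + change_q3
change_q4 = predicted_change(90)        # 2008 Q4
rate_q4 = rate_q3 + change_q4

print(round(change_q3, 5), round(rate_q3, 4))
print(round(change_q4, 6), round(rate_q4, 4))
```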