Time Series Analysis
Definition
A time series $\{x_t : t \in T\}$ is a collection of random variables, usually parameterized by one of:
1) the real line, $T = \mathbb{R} = (-\infty, \infty)$
2) the non-negative real line, $T = \mathbb{R}^+ = [0, \infty)$
3) the integers, $T = \mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$
4) the non-negative integers, $T = \mathbb{Z}^+ = \{0, 1, 2, \ldots\}$
If $x_t$ is a vector, the collection of random vectors $\{x_t : t \in T\}$ is a multivariate time series or multi-channel time series.
If $t$ is a vector, the collection of random variables $\{x_t : t \in T\}$ is a multidimensional “time” series or spatial series (with $T = \mathbb{R}^k$, $k$-dimensional Euclidean space, or a $k$-dimensional lattice).
Example of spatial time series
The project
• Buoys are located in a grid across the Pacific ocean
• Measuring
  – Surface temperature
  – Wind speed (two components)
  – Other measurements
The data is being collected almost continuously
The purpose is to study El Nino
Technical Note: The probability measure of a time series is defined by specifying the joint distribution (in a consistent manner) of all finite subsets of $\{x_t : t \in T\}$,
i.e. the marginal distribution of a subset of random variables, computed from the joint density of a larger set of variables, should agree with the distribution assigned to that subset.
The time series is Normal if all finite subsets of $\{x_t : t \in T\}$ have a multivariate normal distribution.
Similar statements are true for multi-channel time series and multidimensional time series.
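Definition:
$\mu(t)$ = mean value function of $\{x_t : t \in T\}$
$$\mu(t) = E[x_t] \quad \text{for } t \in T.$$
$\sigma(t,s)$ = covariance function of $\{x_t : t \in T\}$
$$\sigma(t,s) = E[(x_t - \mu(t))(x_s - \mu(s))] \quad \text{for } t, s \in T.$$
For multichannel time series,
$\boldsymbol{\mu}(t)$ = mean vector function of $\{\mathbf{x}_t : t \in T\}$ = $E[\mathbf{x}_t]$ for $t \in T$, and
$\Sigma(t,s)$ = covariance matrix function of $\{\mathbf{x}_t : t \in T\}$
$$\Sigma(t,s) = E[(\mathbf{x}_t - \boldsymbol{\mu}(t))(\mathbf{x}_s - \boldsymbol{\mu}(s))'] \quad \text{for } t, s \in T.$$
The $i$th element of the $k \times 1$ vector $\boldsymbol{\mu}(t)$,
$$\mu_i(t) = E[x_{it}],$$
is the mean value function of the time series $\{x_{it} : t \in T\}$.
The $(i,j)$th element of the $k \times k$ matrix $\Sigma(t,s)$,
$$\sigma_{ij}(t,s) = E[(x_{it} - \mu_i(t))(x_{js} - \mu_j(s))],$$
is called the cross-covariance function of the two time series $\{x_{it} : t \in T\}$ and $\{x_{jt} : t \in T\}$.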
Definition:
The time series $\{x_t : t \in T\}$ is stationary if the joint distribution of $x_{t_1}, x_{t_2}, \ldots, x_{t_k}$ is the same as the joint distribution of $x_{t_1+h}, x_{t_2+h}, \ldots, x_{t_k+h}$ for all finite subsets $t_1, t_2, \ldots, t_k$ of $T$ and all choices of $h$.
Definition:
The multi-channel time series $\{\mathbf{x}_t : t \in T\}$ is stationary if the joint distribution of $\mathbf{x}_{t_1}, \mathbf{x}_{t_2}, \ldots, \mathbf{x}_{t_k}$ is the same as the joint distribution of $\mathbf{x}_{t_1+h}, \mathbf{x}_{t_2+h}, \ldots, \mathbf{x}_{t_k+h}$ for all finite subsets $t_1, t_2, \ldots, t_k$ of $T$ and all choices of $h$.
Definition:
The multidimensional time series $\{x_{\mathbf{t}} : \mathbf{t} \in T\}$ is stationary if the joint distribution of $x_{\mathbf{t}_1}, x_{\mathbf{t}_2}, \ldots, x_{\mathbf{t}_k}$ is the same as the joint distribution of $x_{\mathbf{t}_1+\mathbf{h}}, x_{\mathbf{t}_2+\mathbf{h}}, \ldots, x_{\mathbf{t}_k+\mathbf{h}}$ for all finite subsets $\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_k$ of $T$ and all choices of $\mathbf{h}$.
[Figure: Stationarity. Along the time axis, the distribution of observations at one set of time points is the same as the distribution of observations at the same set of points shifted in time.]
Some Implications of Stationarity
If $\{x_t : t \in T\}$ is stationary then:
1. The distribution of $x_t$ is the same for all $t \in T$.
2. The joint distribution of $x_t, x_{t+h}$ is the same as the joint distribution of $x_s, x_{s+h}$.
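Implication of Stationarity for the mean value function and the covariance function
If $\{x_t : t \in T\}$ is stationary then for $t \in T$
$$\mu(t) = E[x_t] = \mu$$
and for $t, s \in T$
$$\sigma(t,s) = E[(x_t - \mu)(x_s - \mu)] = E[(x_{t+h} - \mu)(x_{s+h} - \mu)] = E[(x_{t-s} - \mu)(x_0 - \mu)] \ (\text{with } h = -s) = \sigma(t-s).$$
If the multi-channel time series $\{\mathbf{x}_t : t \in T\}$ is stationary then for $t \in T$
$$\boldsymbol{\mu}(t) = E[\mathbf{x}_t] = \boldsymbol{\mu}$$
and for $t, s \in T$
$$\Sigma(t,s) = \Sigma(t-s).$$
Thus for stationary time series the mean value function is constant and the covariance function is only a function of the distance in time, $t - s$.
If the multidimensional time series $\{x_{\mathbf{t}} : \mathbf{t} \in T\}$ is stationary then for $\mathbf{t} \in T$
$$\mu(\mathbf{t}) = E[x_{\mathbf{t}}] = \mu$$
and for $\mathbf{t}, \mathbf{s} \in T$
$$\sigma(\mathbf{t},\mathbf{s}) = E[(x_{\mathbf{t}} - \mu)(x_{\mathbf{s}} - \mu)] = \sigma(\mathbf{t}-\mathbf{s}) \quad \text{(called the Covariogram)}.$$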
Variogram
$$V(t,s) = V(t-s) = \mathrm{Var}[x_t - x_s] = E[(x_t - x_s)^2]$$
$$= \mathrm{Var}[x_t] + \mathrm{Var}[x_s] - 2\,\mathrm{Cov}[x_t, x_s]$$
$$= 2[\sigma(0) - \sigma(t-s)]$$
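Definition:
$\rho(t,s)$ = autocorrelation function of $\{x_t : t \in T\}$ = correlation between $x_t$ and $x_s$, for $t, s \in T$:
$$\rho(t,s) = \frac{\mathrm{cov}(x_t, x_s)}{\sqrt{\mathrm{var}(x_t)\,\mathrm{var}(x_s)}} = \frac{\sigma(t,s)}{\sqrt{\sigma(t,t)\,\sigma(s,s)}}$$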
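If $\{x_t : t \in T\}$ is stationary then
$\rho(h)$ = autocorrelation function of $\{x_t : t \in T\}$ = correlation between $x_t$ and $x_{t+h}$:
$$\rho(h) = \frac{\mathrm{cov}(x_{t+h}, x_t)}{\sqrt{\mathrm{var}(x_{t+h})\,\mathrm{var}(x_t)}} = \frac{\sigma(h)}{\sqrt{\sigma(0)\,\sigma(0)}} = \frac{\sigma(h)}{\sigma(0)}$$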
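The definition of ρ(h) = σ(h)/σ(0) can be illustrated with data. The sketch below is an assumption-laden illustration, not part of the original slides: it uses the common sample estimator of the autocovariance with divisor n, and the function name `sample_acf` is hypothetical.

```python
def sample_acf(x, max_lag):
    """Return sample autocorrelations r(0), ..., r(max_lag) of the sequence x.

    Assumption: the sample autocovariance uses divisor n (one common convention).
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]

    def c(h):
        # Sample analogue of sigma(h) = E[(x_{t+h} - mu)(x_t - mu)].
        return sum(dev[t + h] * dev[t] for t in range(n - h)) / n

    c0 = c(0)
    # Sample analogue of rho(h) = sigma(h) / sigma(0).
    return [c(h) / c0 for h in range(max_lag + 1)]


acf = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], 2)
print(acf)
```

The estimate at lag 0 is always 1 by construction, mirroring ρ(0) = 1.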
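Definition:
The time series $\{x_t : t \in T\}$ is weakly stationary if:
$$\mu(t) = E[x_t] = \mu \quad \text{for all } t \in T$$
and
$$\sigma(t,s) = \sigma(t-s) \quad \text{for all } t, s \in T$$
or
$$\rho(t,s) = \rho(t-s) \quad \text{for all } t, s \in T.$$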
Examples
Stationary time series
1. Let $X$ denote a single random variable with mean $\mu$ and standard deviation $\sigma$. In addition $X$ may also be Normal (this condition is not necessary).
Let $x_t = X$ for all $t \in T = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$.
Then $E[x_t] = \mu = E[X]$ for $t \in T$ and
$$\sigma(h) = E[(x_{t+h} - \mu)(x_t - \mu)] = \mathrm{Cov}(x_{t+h}, x_t) = E[(X - \mu)(X - \mu)] = \mathrm{Var}(X) = \sigma^2 \quad \text{for all } h,$$
$$\rho(h) = \frac{\sigma(h)}{\sigma(0)} = 1 \quad \text{for all } h.$$
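2. Suppose $\{x_t : t \in T\}$ are identically distributed and uncorrelated (independent), with $T = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$.
Then $E[x_t] = \mu$ for $t \in T$ and
$$\sigma(h) = E[(x_{t+h} - \mu)(x_t - \mu)] = \mathrm{Cov}(x_{t+h}, x_t) = \begin{cases} \mathrm{Var}(x_t) & h = 0 \\ 0 & h \ne 0 \end{cases} = \begin{cases} \sigma^2 & h = 0 \\ 0 & h \ne 0 \end{cases}$$
The autocorrelation function:
$$\rho(h) = \frac{\sigma(h)}{\sigma(0)} = \begin{cases} 1 & h = 0 \\ 0 & h \ne 0 \end{cases}$$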
Comment:
If $\mu = 0$ then the time series $\{x_t : t \in T\}$ is called a white noise time series.
Thus a white noise time series consists of independent identically distributed random variables with mean 0 and common variance $\sigma^2$.
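A short simulation can make the white noise autocorrelation concrete. This is a sketch under stated assumptions: the noise is taken to be Normal (the definition only requires independence, mean 0 and common variance), the seed, the sample size and the helper name `sample_acf` are illustrative choices, and the sample autocovariance uses divisor n.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r(0..max_lag), divisor-n convention (an assumption)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c = [sum(dev[t + h] * dev[t] for t in range(n - h)) / n
         for h in range(max_lag + 1)]
    return [ch / c[0] for ch in c]


random.seed(0)                      # fixed seed so the run is repeatable
sigma = 2.0                         # hypothetical noise standard deviation
u = [random.gauss(0.0, sigma) for _ in range(5000)]

r = sample_acf(u, 3)
print([round(v, 3) for v in r])     # lag 0 is 1; other lags should be near 0
```

For a series of length n the sample autocorrelations at non-zero lags fluctuate around 0 with standard deviation of roughly 1/sqrt(n).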
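3. Suppose $X_1, X_2, \ldots, X_k$ and $Y_1, Y_2, \ldots, Y_k$ are independent random variables with
$$E[X_i] = E[Y_i] = 0 \quad \text{and} \quad E[X_i^2] = E[Y_i^2] = \sigma_i^2.$$
Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ denote $k$ values in $(0, \pi)$.
For any $t \in T = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$ let
$$x_t = \sum_{i=1}^{k} \left[ X_i \cos(\lambda_i t) + Y_i \sin(\lambda_i t) \right]$$
$$= \sum_{i=1}^{k} \left[ X_i \cos(2\pi \nu_i t) + Y_i \sin(2\pi \nu_i t) \right]$$
$$= \sum_{i=1}^{k} \left[ X_i \cos\!\left(\frac{2\pi t}{P_i}\right) + Y_i \sin\!\left(\frac{2\pi t}{P_i}\right) \right]$$
where $\lambda_i = 2\pi\nu_i = 2\pi / P_i$.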
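Then
$$E[x_t] = E\left[\sum_{i=1}^{k} \left( X_i \cos(\lambda_i t) + Y_i \sin(\lambda_i t) \right)\right] = \sum_{i=1}^{k} \left( E[X_i] \cos(\lambda_i t) + E[Y_i] \sin(\lambda_i t) \right) = 0$$
and
$$\sigma(h) = E[x_{t+h}\, x_t] = E\left[\left(\sum_{i=1}^{k} \left[ X_i \cos \lambda_i (t+h) + Y_i \sin \lambda_i (t+h) \right]\right)\left(\sum_{j=1}^{k} \left[ X_j \cos \lambda_j t + Y_j \sin \lambda_j t \right]\right)\right].$$
Hence
$$\sigma(h) = \sum_{i=1}^{k}\sum_{j=1}^{k} \Big\{ E[X_i X_j] \cos \lambda_i(t+h) \cos \lambda_j t + E[Y_i Y_j] \sin \lambda_i(t+h) \sin \lambda_j t$$
$$\qquad + E[X_i Y_j] \cos \lambda_i(t+h) \sin \lambda_j t + E[Y_i X_j] \sin \lambda_i(t+h) \cos \lambda_j t \Big\}$$
$$= \sum_{i=1}^{k} \sigma_i^2 \left[ \cos \lambda_i(t+h) \cos \lambda_i t + \sin \lambda_i(t+h) \sin \lambda_i t \right]$$
since $E[X_i Y_j] = 0$ for all $i, j$, $E[X_i X_j] = E[Y_i Y_j] = 0$ if $i \ne j$, and $E[X_i^2] = E[Y_i^2] = \sigma_i^2$.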
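Hence using $\cos(A - B) = \cos(A)\cos(B) + \sin(A)\sin(B)$
$$\sigma(h) = \sum_{i=1}^{k} \sigma_i^2 \cos\left(\lambda_i(t+h) - \lambda_i t\right) = \sum_{i=1}^{k} \sigma_i^2 \cos(\lambda_i h)$$
and
$$\rho(h) = \frac{\sigma(h)}{\sigma(0)} = \frac{\sum_{i=1}^{k} \sigma_i^2 \cos(\lambda_i h)}{\sum_{j=1}^{k} \sigma_j^2} = \sum_{i=1}^{k} w_i \cos(\lambda_i h)$$
where
$$w_i = \frac{\sigma_i^2}{\sum_{j=1}^{k} \sigma_j^2}.$$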
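The autocorrelation of this sum-of-sinusoids series is a weighted sum of cosines, which is easy to evaluate directly. The sketch below only illustrates that formula; the function name `harmonic_acf` and the two-component variances and frequencies are hypothetical values chosen for the example.

```python
import math


def harmonic_acf(sigma2, lam, h):
    """rho(h) = sum_i w_i cos(lambda_i h), with w_i = sigma_i^2 / sum_j sigma_j^2."""
    total = sum(sigma2)
    return sum((s / total) * math.cos(l * h) for s, l in zip(sigma2, lam))


sigma2 = [1.0, 3.0]                     # hypothetical component variances
lam = [math.pi / 2, math.pi / 3]        # hypothetical frequencies in (0, pi)

rho = [harmonic_acf(sigma2, lam, h) for h in range(7)]
print([round(v, 4) for v in rho])
```

Unlike the other examples, this autocorrelation does not die out as h grows; it keeps oscillating because each component is perfectly periodic.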
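4. The Moving Average Time series of order q, MA(q)
Let $\alpha_0 = 1, \alpha_1, \alpha_2, \ldots, \alpha_q$ denote $q + 1$ numbers.
Let $\{u_t : t \in T\}$ denote a white noise time series with variance $\sigma^2$:
– independent
– mean 0, variance $\sigma^2$.
Let $\{x_t : t \in T\}$ be defined by the equation
$$x_t = \alpha_0 u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2} + \cdots + \alpha_q u_{t-q} + \mu$$
$$= \mu + u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2} + \cdots + \alpha_q u_{t-q}.$$
Then $\{x_t : t \in T\}$ is called a Moving Average time series of order q, MA(q).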
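To see the moving average construction at work, the sketch below builds an MA(q) series from a given noise sequence. It assumes the model form x_t = mu + alpha_0 u_t + ... + alpha_q u_{t-q} with alpha_0 = 1; the function name `ma_series`, the constant `mu` and the tiny input sequence are illustrative assumptions.

```python
def ma_series(u, alpha, mu=0.0):
    """Build x_t = mu + sum_{i=0}^{q} alpha[i] * u[t-i] for t = q, ..., len(u)-1.

    The first q noise values are used only as lagged inputs, so the output
    is q elements shorter than u.
    """
    q = len(alpha) - 1
    return [mu + sum(alpha[i] * u[t - i] for i in range(q + 1))
            for t in range(q, len(u))]


u = [1.0, -1.0, 2.0, 0.0]               # a hypothetical noise sequence
x = ma_series(u, [1.0, 0.5], mu=10.0)   # MA(1) with alpha_1 = 0.5
print(x)
```

In practice `u` would be drawn from a white noise generator; a fixed sequence is used here so the arithmetic can be followed by hand.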
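The mean
$$E[x_t] = E[\alpha_0 u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2} + \cdots + \alpha_q u_{t-q} + \mu]$$
$$= \alpha_0 E[u_t] + \alpha_1 E[u_{t-1}] + \alpha_2 E[u_{t-2}] + \cdots + \alpha_q E[u_{t-q}] + \mu = \mu$$
The autocovariance function
$$\sigma(h) = E[(x_{t+h} - \mu)(x_t - \mu)]$$
$$= E\left[(u_{t+h} + \alpha_1 u_{t+h-1} + \cdots + \alpha_q u_{t+h-q})(u_t + \alpha_1 u_{t-1} + \cdots + \alpha_q u_{t-q})\right]$$
$$= E\left[\left(\sum_{i=0}^{q} \alpha_i u_{t+h-i}\right)\left(\sum_{j=0}^{q} \alpha_j u_{t-j}\right)\right] = \sum_{i=0}^{q}\sum_{j=0}^{q} \alpha_i \alpha_j E[u_{t+h-i}\, u_{t-j}]$$
$$= \begin{cases} \sigma^2 \displaystyle\sum_{i=0}^{q-h} \alpha_i \alpha_{i+h} & \text{if } 0 \le h \le q \\ 0 & \text{if } h > q \end{cases}$$
since $E[u_i u_j] = 0$ if $i \ne j$ and $E[u_i^2] = \sigma^2$.
The autocovariance function for an MA(q) time series
$$\sigma(h) = \begin{cases} \sigma^2 \displaystyle\sum_{i=0}^{q-h} \alpha_i \alpha_{i+h} & \text{if } 0 \le h \le q \\ 0 & \text{if } h > q \end{cases}$$
The autocorrelation function for an MA(q) time series
$$\rho(h) = \frac{\sigma(h)}{\sigma(0)} = \begin{cases} \dfrac{\sum_{i=0}^{q-h} \alpha_i \alpha_{i+h}}{\sum_{i=0}^{q} \alpha_i^2} & \text{if } 0 \le h \le q \\ 0 & \text{if } h > q \end{cases}$$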
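The MA(q) autocovariance and autocorrelation formulas translate directly into code. The sketch below is a minimal illustration; the function names and the MA(1) coefficients are hypothetical, and `alpha` is assumed to hold alpha_0, ..., alpha_q in order.

```python
def ma_autocov(alpha, sigma2, h):
    """sigma(h) = sigma^2 * sum_{i=0}^{q-|h|} alpha_i alpha_{i+|h|}, and 0 for |h| > q."""
    q = len(alpha) - 1
    h = abs(h)                          # sigma(-h) = sigma(h)
    if h > q:
        return 0.0
    return sigma2 * sum(alpha[i] * alpha[i + h] for i in range(q - h + 1))


def ma_autocorr(alpha, h):
    """rho(h) = sigma(h) / sigma(0); the noise variance cancels."""
    return ma_autocov(alpha, 1.0, h) / ma_autocov(alpha, 1.0, 0)


alpha = [1.0, 0.5]                      # hypothetical MA(1): alpha_0 = 1, alpha_1 = 0.5
print([ma_autocov(alpha, 4.0, h) for h in range(4)])
print([ma_autocorr(alpha, h) for h in range(4)])
```

The cut-off after lag q is the characteristic signature of a moving average series.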
5. The Autoregressive Time series of order p, AR(p)
Let $\beta_1, \beta_2, \ldots, \beta_p$ denote $p$ numbers.
Let $\{u_t : t \in T\}$ denote a white noise time series with variance $\sigma^2$:
– independent
– mean 0, variance $\sigma^2$.
Let $\{x_t : t \in T\}$ be defined by the equation
$$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_p x_{t-p} + \delta + u_t.$$
Then $\{x_t : t \in T\}$ is called an Autoregressive time series of order p, AR(p).
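The autoregressive recursion can be sketched in a few lines. This assumes the model form x_t = beta_1 x_{t-1} + ... + beta_p x_{t-p} + delta + u_t; the function name `ar_series`, the zero starting values and the example inputs are illustrative assumptions.

```python
def ar_series(u, beta, delta=0.0, start=None):
    """Generate x_t = sum_i beta[i-1] * x_{t-i} + delta + u_t, one value per noise term.

    `start` holds the p initial values (oldest first); zeros are assumed if omitted.
    """
    p = len(beta)
    x = list(start) if start is not None else [0.0] * p
    for ut in u:
        # x[-1] is x_{t-1}, x[-2] is x_{t-2}, and so on.
        x.append(delta + sum(beta[i] * x[-1 - i] for i in range(p)) + ut)
    return x[p:]


# With the noise switched off, an AR(1) with beta_1 = 0.5 and delta = 1
# moves towards delta / (1 - beta_1) = 2.
x = ar_series([0.0, 0.0, 0.0], [0.5], delta=1.0, start=[0.0])
print(x)
```

Because each value feeds into later ones, the choice of starting values matters for the first few observations; simulations commonly discard an initial stretch for that reason.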
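Comment:
An Autoregressive time series is not necessarily stationary.
Suppose $\{x_t : t \in T\}$ is an AR(1) time series satisfying the equation
$$x_t = x_{t-1} + u_t$$
where $\{u_t : t \in T\}$ is a white noise time series with variance $\sigma^2$, i.e. $\beta_1 = 1$ and $\delta = 0$. Then
$$x_t = x_{t-1} + u_t = x_{t-2} + u_{t-1} + u_t = \cdots = x_0 + u_1 + u_2 + \cdots + u_t$$
$$E[x_t] = E[x_0] + E[u_1] + E[u_2] + \cdots + E[u_t] = E[x_0]$$
but
$$\mathrm{Var}[x_t] = \mathrm{Var}[x_0] + \mathrm{Var}[u_1] + \cdots + \mathrm{Var}[u_t] = \mathrm{Var}[x_0] + t\sigma^2$$
and is not constant.
A time series $\{x_t : t \in T\}$ satisfying the equation
$$x_t = x_{t-1} + u_t$$
is called a Random Walk.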
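The growth of the variance of a random walk can be checked by simulation. The sketch below is illustrative only: it assumes Normal noise, a start at x_0 = 0 (so Var[x_0] = 0), and arbitrary choices of seed, horizon and number of paths.

```python
import random


def random_walk_end(t, sigma, rng):
    """Return x_t for a random walk x_s = x_{s-1} + u_s started at x_0 = 0."""
    x = 0.0
    for _ in range(t):
        x += rng.gauss(0.0, sigma)
    return x


rng = random.Random(1)                  # fixed seed so the run is repeatable
t, sigma, n_paths = 50, 1.0, 4000       # hypothetical simulation settings

ends = [random_walk_end(t, sigma, rng) for _ in range(n_paths)]
mean = sum(ends) / n_paths
var = sum((e - mean) ** 2 for e in ends) / (n_paths - 1)

# Theory: E[x_t] = 0 and Var[x_t] = t * sigma^2 = 50 here.
print(round(mean, 3), round(var, 3))
```

Repeating the run with a larger `t` should give a proportionally larger variance, which is exactly why the series is not stationary.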
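Derivation of the mean, autocovariance function and autocorrelation function of a stationary Autoregressive time series
We use extensively the rules of expectation.
Assume that the autoregressive time series $\{x_t : t \in T\}$ defined by the equation
$$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_p x_{t-p} + \delta + u_t$$
is stationary.
Let $\mu = E(x_t)$. Then
$$E[x_t] = \beta_1 E[x_{t-1}] + \beta_2 E[x_{t-2}] + \cdots + \beta_p E[x_{t-p}] + \delta + E[u_t]$$
$$\mu = \beta_1 \mu + \beta_2 \mu + \cdots + \beta_p \mu + \delta$$
$$(1 - \beta_1 - \beta_2 - \cdots - \beta_p)\,\mu = \delta$$
or
$$\mu = E[x_t] = \frac{\delta}{1 - \beta_1 - \beta_2 - \cdots - \beta_p}.$$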
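The Autocovariance function, $\sigma(h)$, of a stationary autoregressive time series $\{x_t : t \in T\}$ can be determined by using the equation:
$$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_p x_{t-p} + \delta + u_t$$
Now $\delta = \mu(1 - \beta_1 - \beta_2 - \cdots - \beta_p)$.
Thus
$$x_t - \mu = \beta_1 (x_{t-1} - \mu) + \cdots + \beta_p (x_{t-p} - \mu) + u_t$$
Hence
$$\sigma(h) = E[(x_{t+h} - \mu)(x_t - \mu)]$$
$$= E\left[\left\{\beta_1 (x_{t+h-1} - \mu) + \cdots + \beta_p (x_{t+h-p} - \mu) + u_{t+h}\right\}(x_t - \mu)\right]$$
$$= \beta_1 E[(x_{t+h-1} - \mu)(x_t - \mu)] + \cdots + \beta_p E[(x_{t+h-p} - \mu)(x_t - \mu)] + E[u_{t+h}(x_t - \mu)]$$
$$= \beta_1 \sigma(h-1) + \cdots + \beta_p \sigma(h-p) + \sigma_{ux}(h)$$
where
$$\sigma_{ux}(h) = E[u_{t+h}(x_t - \mu)] = \begin{cases} E[u_t (x_t - \mu)] & h = 0 \\ 0 & h > 0 \end{cases}$$
Now
$$\sigma_{ux}(0) = E[u_t (x_t - \mu)] = E\left[u_t \left\{\beta_1 (x_{t-1} - \mu) + \cdots + \beta_p (x_{t-p} - \mu) + u_t\right\}\right]$$
$$= \beta_1 E[u_t (x_{t-1} - \mu)] + \cdots + \beta_p E[u_t (x_{t-p} - \mu)] + E[u_t^2] = \sigma^2$$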
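The equations for the autocovariance function of an AR(p) time series
$$\sigma(0) = \beta_1 \sigma(-1) + \cdots + \beta_p \sigma(-p) + \sigma^2$$
$$\sigma(1) = \beta_1 \sigma(0) + \cdots + \beta_p \sigma(1-p)$$
$$\sigma(2) = \beta_1 \sigma(1) + \cdots + \beta_p \sigma(2-p)$$
$$\sigma(3) = \beta_1 \sigma(2) + \cdots + \beta_p \sigma(3-p)$$
etc.
Or using $\sigma(-h) = \sigma(h)$
$$\sigma(0) = \beta_1 \sigma(1) + \cdots + \beta_p \sigma(p) + \sigma^2$$
$$\sigma(1) = \beta_1 \sigma(0) + \cdots + \beta_p \sigma(p-1)$$
$$\sigma(2) = \beta_1 \sigma(1) + \cdots + \beta_p \sigma(p-2)$$
$$\vdots$$
$$\sigma(p) = \beta_1 \sigma(p-1) + \cdots + \beta_p \sigma(0)$$
and
$$\sigma(h) = \beta_1 \sigma(h-1) + \cdots + \beta_p \sigma(h-p) \quad \text{for } h > p.$$
Use the first $p + 1$ equations to find $\sigma(0), \sigma(1), \ldots, \sigma(p)$.
Then use
$$\sigma(h) = \beta_1 \sigma(h-1) + \cdots + \beta_p \sigma(h-p)$$
for $h > p$ to compute $\sigma(h)$.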
Summary: The Autoregressive Time series of order p, AR(p)
Let $\beta_1, \beta_2, \ldots, \beta_p$ denote $p$ numbers and let $\{u_t : t \in T\}$ denote a white noise time series (independent, mean 0, variance $\sigma^2$).
The time series $\{x_t : t \in T\}$ defined by the equation
$$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_p x_{t-p} + \delta + u_t$$
is called an Autoregressive time series of order p, AR(p).
If the autoregressive time series is stationary, then:
The mean
$$\mu = E[x_t] = \frac{\delta}{1 - \beta_1 - \beta_2 - \cdots - \beta_p}$$
The autocovariance function, $\sigma(h)$, satisfies the equations (the Yule-Walker Equations):
$$\sigma(0) = \beta_1 \sigma(1) + \cdots + \beta_p \sigma(p) + \sigma^2$$
$$\sigma(1) = \beta_1 \sigma(0) + \cdots + \beta_p \sigma(p-1)$$
$$\sigma(2) = \beta_1 \sigma(1) + \cdots + \beta_p \sigma(p-2)$$
$$\vdots$$
$$\sigma(p) = \beta_1 \sigma(p-1) + \cdots + \beta_p \sigma(0)$$
and
$$\sigma(h) = \beta_1 \sigma(h-1) + \cdots + \beta_p \sigma(h-p) \quad \text{for } h > p.$$
Use the first $p + 1$ equations (the Yule-Walker Equations) to find $\sigma(0), \sigma(1), \ldots, \sigma(p)$.
Then use $\sigma(h) = \beta_1 \sigma(h-1) + \cdots + \beta_p \sigma(h-p)$ for $h > p$ to compute $\sigma(h)$.
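The Autocorrelation function, $\rho(h)$, of a stationary autoregressive time series $\{x_t : t \in T\}$:
$$\rho(h) = \frac{\sigma(h)}{\sigma(0)}$$
The Yule-Walker Equations (divided through by $\sigma(0)$) become:
$$1 = \beta_1 \rho(1) + \cdots + \beta_p \rho(p) + \frac{\sigma^2}{\sigma(0)}$$
$$\rho(1) = \beta_1 + \beta_2 \rho(1) + \cdots + \beta_p \rho(p-1)$$
$$\rho(2) = \beta_1 \rho(1) + \beta_2 + \cdots + \beta_p \rho(p-2)$$
$$\vdots$$
$$\rho(p) = \beta_1 \rho(p-1) + \cdots + \beta_p$$
and
$$\rho(h) = \beta_1 \rho(h-1) + \cdots + \beta_p \rho(h-p) \quad \text{for } h > p.$$
To find $\rho(h)$ and $\sigma(0)$: solve the $p$ equations for $\rho(1), \ldots, \rho(p)$. Then
$$\sigma(0) = \frac{\sigma^2}{1 - \beta_1 \rho(1) - \cdots - \beta_p \rho(p)}$$
and
$$\rho(h) = \beta_1 \rho(h-1) + \cdots + \beta_p \rho(h-p) \quad \text{for } h > p.$$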
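Solving the Yule-Walker equations for ρ(1), …, ρ(p) is a small linear algebra problem. The sketch below is one possible way to set it up, not the method used in the slides: it rearranges the p equations into a linear system and solves it with a hand-written Gaussian elimination so that no external library is assumed. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]    # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def yule_walker_rho(beta):
    """Return [rho(0), ..., rho(p)] from rho(k) = sum_j beta_j rho(|k - j|), rho(0) = 1."""
    p = len(beta)
    a = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    b = [0.0] * p
    for k in range(1, p + 1):
        for j in range(1, p + 1):
            lag = abs(k - j)
            if lag == 0:
                b[k - 1] += beta[j - 1]          # beta_k * rho(0) moves to the right side
            else:
                a[k - 1][lag - 1] -= beta[j - 1]
    return [1.0] + solve_linear(a, b)


def ar_variance(beta, sigma2):
    """sigma(0) = sigma^2 / (1 - beta_1 rho(1) - ... - beta_p rho(p))."""
    rho = yule_walker_rho(beta)
    return sigma2 / (1.0 - sum(bj * rho[j + 1] for j, bj in enumerate(beta)))


rho = yule_walker_rho([0.7, 0.2])       # an AR(2) with beta_1 = 0.7, beta_2 = 0.2
print(rho, ar_variance([0.7, 0.2], 4.0))
```

The function assumes the coefficients describe a stationary series; for non-stationary coefficients the system may be singular or yield meaningless values.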
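Example
Consider the AR(2) time series:
$$x_t = 0.7 x_{t-1} + 0.2 x_{t-2} + 4.1 + u_t$$
where $\{u_t\}$ is a white noise time series with standard deviation $\sigma = 2.0$.
White noise ≡ independent, mean zero (normal).
Find $\mu$, $\sigma(h)$, $\rho(h)$.
To find $\rho(h)$ solve the equations:
$$\rho(1) = \beta_1 + \beta_2 \rho(1)$$
$$\rho(2) = \beta_1 \rho(1) + \beta_2$$
or
$$\rho(1) = 0.7 + 0.2\,\rho(1)$$
$$\rho(2) = 0.7\,\rho(1) + 0.2$$
thus
$$\rho(1) = \frac{0.7}{1 - 0.2} = \frac{0.7}{0.8} = 0.875$$
$$\rho(2) = 0.7(0.875) + 0.2 = 0.8125$$
For $h > 2$
$$\rho(h) = \beta_1 \rho(h-1) + \beta_2 \rho(h-2) = 0.7\,\rho(h-1) + 0.2\,\rho(h-2)$$
This can be used in sequence to find $\rho(3), \rho(4), \rho(5)$, etc.
Results:
h      0       1       2       3       4       5       6       7       8
ρ(h)   1.0000  0.8750  0.8125  0.7438  0.6831  0.6269  0.5755  0.5282  0.4849
To find $\sigma(0)$ use:
$$\sigma(0) = \frac{\sigma^2}{1 - \beta_1 \rho(1) - \beta_2 \rho(2)} = \frac{2.0^2}{1 - 0.70(0.8750) - 0.20(0.8125)} = 17.778$$
To find $\sigma(h)$ use:
$$\sigma(h) = \sigma(0)\,\rho(h)$$
To find $\mu$ use:
$$\mu = \frac{\delta}{1 - \beta_1 - \beta_2} = \frac{4.1}{1 - 0.70 - 0.20} = \frac{4.1}{0.1} = 41$$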
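The steps of this AR(2) example can be reproduced numerically. The sketch below simply follows the recursion and formulas of the example with its stated coefficients (0.7, 0.2, constant 4.1, noise standard deviation 2.0); the variable names are illustrative.

```python
beta1, beta2, delta, sigma = 0.7, 0.2, 4.1, 2.0

# Mean: mu = delta / (1 - beta1 - beta2)
mu = delta / (1 - beta1 - beta2)

# Autocorrelations: rho(0) = 1, rho(1) = beta1 / (1 - beta2), then the recursion.
rho = [1.0, beta1 / (1 - beta2)]
for h in range(2, 9):
    rho.append(beta1 * rho[h - 1] + beta2 * rho[h - 2])

# Variance: sigma(0) = sigma^2 / (1 - beta1 rho(1) - beta2 rho(2))
sigma0 = sigma ** 2 / (1 - beta1 * rho[1] - beta2 * rho[2])

# Autocovariances: sigma(h) = sigma(0) * rho(h)
autocov = [sigma0 * r for r in rho]

print(round(mu, 3), round(sigma0, 3))
print([round(r, 4) for r in rho])
```

Changing the four constants at the top is enough to rerun the calculation for any other stationary AR(2) series.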
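An explicit formula for $\rho(h)$
Auto-regressive time series of order p.
Consider solving the difference equation:
$$\rho(h) - \beta_1 \rho(h-1) - \cdots - \beta_p \rho(h-p) = 0$$
This difference equation can be solved by setting up the polynomial
$$\beta(x) = 1 - \beta_1 x - \cdots - \beta_p x^p = \left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right)\cdots\left(1 - \frac{x}{r_p}\right)$$
where $r_1, r_2, \ldots, r_p$ are the roots of the polynomial $\beta(x)$.
The difference equation
$$\rho(h) - \beta_1 \rho(h-1) - \cdots - \beta_p \rho(h-p) = 0$$
has the general solution (when the roots are distinct):
$$\rho(h) = c_1 \left(\frac{1}{r_1}\right)^h + c_2 \left(\frac{1}{r_2}\right)^h + \cdots + c_p \left(\frac{1}{r_p}\right)^h$$
where $c_1, c_2, \ldots, c_p$ are determined by using the starting values of the sequence $\rho(h)$.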
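Example: An AR(1) time series
$$x_t = \beta_1 x_{t-1} + \delta + u_t$$
$$\sigma(0) = \frac{\sigma^2}{1 - \beta_1 \rho(1)} = \frac{\sigma^2}{1 - \beta_1^2}$$
$$\rho(0) = 1$$
and
$$\rho(1) = \beta_1 \rho(0) = \beta_1$$
$$\rho(h) = \beta_1 \rho(h-1) = \beta_1^h \quad \text{for } h > 1$$
The difference equation
$$\rho(h) - \beta_1 \rho(h-1) = 0$$
can also be solved by setting up the polynomial
$$\beta(x) = 1 - \beta_1 x = 1 - \frac{x}{r_1} \quad \text{where } r_1 = \frac{1}{\beta_1}$$
Then a general formula for $\rho(h)$ is:
$$\rho(h) = c_1 \left(\frac{1}{r_1}\right)^h = \left(\frac{1}{r_1}\right)^h = \beta_1^h \quad \text{since } \rho(0) = 1 \text{ gives } c_1 = 1.$$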
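Example: An AR(2) time series
$$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \delta + u_t$$
$$\rho(0) = 1$$
$$\rho(1) = \beta_1 + \beta_2 \rho(1) \quad \text{or} \quad \rho(1) = \frac{\beta_1}{1 - \beta_2}$$
and
$$\rho(h) = \beta_1 \rho(h-1) + \beta_2 \rho(h-2) \quad \text{for } h > 1$$
Setting up the polynomial
$$\beta(x) = 1 - \beta_1 x - \beta_2 x^2 = \left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right) = 1 - \left(\frac{1}{r_1} + \frac{1}{r_2}\right) x + \frac{1}{r_1 r_2} x^2$$
where
$$r_1 = \frac{-\beta_1 + \sqrt{\beta_1^2 + 4\beta_2}}{2\beta_2} \quad \text{and} \quad r_2 = \frac{-\beta_1 - \sqrt{\beta_1^2 + 4\beta_2}}{2\beta_2}$$
Note:
$$\frac{1}{r_1} + \frac{1}{r_2} = \beta_1 \quad \text{and} \quad \frac{1}{r_1 r_2} = -\beta_2$$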
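Then a general formula for $\rho(h)$ is:
$$\rho(h) = c_1 \left(\frac{1}{r_1}\right)^h + c_2 \left(\frac{1}{r_2}\right)^h$$
For $h = 0$ and $h = 1$:
$$1 = c_1 + c_2$$
$$\rho(1) = \frac{\beta_1}{1 - \beta_2} = \frac{c_1}{r_1} + \frac{c_2}{r_2}$$
Solving for $c_1$ and $c_2$ (using $\beta_1 = 1/r_1 + 1/r_2$ and $\beta_2 = -1/(r_1 r_2)$, so that $\rho(1) = (r_1 + r_2)/(1 + r_1 r_2)$):
$$c_1 = \frac{r_1 (1 - r_2^2)}{(r_1 - r_2)(1 + r_1 r_2)}$$
and
$$c_2 = -\frac{r_2 (1 - r_1^2)}{(r_1 - r_2)(1 + r_1 r_2)}$$
Then a general formula for $\rho(h)$ is:
$$\rho(h) = \frac{r_1 (1 - r_2^2)}{(r_1 - r_2)(1 + r_1 r_2)} \left(\frac{1}{r_1}\right)^h - \frac{r_2 (1 - r_1^2)}{(r_1 - r_2)(1 + r_1 r_2)} \left(\frac{1}{r_2}\right)^h$$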
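The closed form for the AR(2) autocorrelation can be checked against the recursion ρ(h) = β1 ρ(h−1) + β2 ρ(h−2). The sketch below is an illustration under stated assumptions: β2 ≠ 0, the two roots are distinct, and the constants c1 and c2 are written as c1 = r1(1 − r2²)/((r1 − r2)(1 + r1 r2)) and c2 = −r2(1 − r1²)/((r1 − r2)(1 + r1 r2)). Complex arithmetic is used so that real and complex roots are handled alike; the function names are hypothetical.

```python
import cmath


def ar2_roots(beta1, beta2):
    """Roots of 1 - beta1 x - beta2 x^2 (assumes beta2 != 0)."""
    disc = cmath.sqrt(beta1 ** 2 + 4 * beta2)
    return (-beta1 + disc) / (2 * beta2), (-beta1 - disc) / (2 * beta2)


def ar2_rho_closed(beta1, beta2, h):
    """rho(h) = c1 (1/r1)^h + c2 (1/r2)^h, assuming distinct roots."""
    r1, r2 = ar2_roots(beta1, beta2)
    denom = (r1 - r2) * (1 + r1 * r2)
    c1 = r1 * (1 - r2 ** 2) / denom
    c2 = -r2 * (1 - r1 ** 2) / denom
    # The imaginary parts cancel; keep the real part.
    return (c1 * r1 ** (-h) + c2 * r2 ** (-h)).real


def ar2_rho_recursive(beta1, beta2, h):
    """rho(h) from rho(0) = 1, rho(1) = beta1 / (1 - beta2) and the recursion."""
    rho = [1.0, beta1 / (1 - beta2)]
    for k in range(2, h + 1):
        rho.append(beta1 * rho[k - 1] + beta2 * rho[k - 2])
    return rho[h]


for b1, b2 in [(0.7, 0.2), (0.2, -0.5)]:   # real roots, then complex roots
    print([round(ar2_rho_closed(b1, b2, h), 4) for h in range(5)])
```

Agreement between the two functions for several lags is a quick sanity check on the algebra for c1 and c2.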
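If $\beta_1^2 + 4\beta_2 \ge 0$, $r_1$ and $r_2$ are real and
$$\rho(h) = \frac{r_1 (1 - r_2^2)}{(r_1 - r_2)(1 + r_1 r_2)} \left(\frac{1}{r_1}\right)^h - \frac{r_2 (1 - r_1^2)}{(r_1 - r_2)(1 + r_1 r_2)} \left(\frac{1}{r_2}\right)^h$$
is a mixture of two exponentials.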
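If $\beta_1^2 + 4\beta_2 < 0$, $r_1$ and $r_2$ are complex conjugates:
$$r_1 = x + iy = Re^{i\theta}$$
$$r_2 = x - iy = Re^{-i\theta}$$
where
$$R = \sqrt{x^2 + y^2} \quad \text{and} \quad \tan\theta = \frac{y}{x}, \quad \theta = \tan^{-1}\!\left(\frac{y}{x}\right)$$
Some important complex identities
$$e^{i\theta} = \cos\theta + i\sin\theta, \qquad e^{-i\theta} = \cos\theta - i\sin\theta$$
$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \qquad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}$$
The above identities can be shown using the power series expansions:
$$e^u = 1 + u + \frac{u^2}{2!} + \frac{u^3}{3!} + \frac{u^4}{4!} + \cdots$$
$$\cos u = 1 - \frac{u^2}{2!} + \frac{u^4}{4!} - \frac{u^6}{6!} + \cdots$$
$$\sin u = u - \frac{u^3}{3!} + \frac{u^5}{5!} - \frac{u^7}{7!} + \cdots$$
Some other trig identities:
1. $\cos(u + v) = \cos u \cos v - \sin u \sin v$
2. $\cos(u - v) = \cos u \cos v + \sin u \sin v$
3. $\sin(u + v) = \sin u \cos v + \cos u \sin v$
4. $\sin(u - v) = \sin u \cos v - \cos u \sin v$
5. $\cos 2u = \cos^2 u - \sin^2 u$
6. $\sin 2u = 2 \sin u \cos u$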
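With $r_1 = Re^{i\theta}$ and $r_2 = Re^{-i\theta}$ we have $r_1 - r_2 = R(e^{i\theta} - e^{-i\theta}) = 2iR\sin\theta$ and $1 + r_1 r_2 = 1 + R^2$, so
$$c_1 = \frac{r_1 (1 - r_2^2)}{(r_1 - r_2)(1 + r_1 r_2)} = \frac{Re^{i\theta}(1 - R^2 e^{-2i\theta})}{R(e^{i\theta} - e^{-i\theta})(1 + R^2)} = \frac{e^{i\theta} - R^2 e^{-i\theta}}{2i(1 + R^2)\sin\theta}$$
$$c_2 = -\frac{r_2 (1 - r_1^2)}{(r_1 - r_2)(1 + r_1 r_2)} = -\frac{Re^{-i\theta}(1 - R^2 e^{2i\theta})}{R(e^{i\theta} - e^{-i\theta})(1 + R^2)} = \frac{R^2 e^{i\theta} - e^{-i\theta}}{2i(1 + R^2)\sin\theta}$$
Hence
$$\rho(h) = c_1 \left(\frac{1}{r_1}\right)^h + c_2 \left(\frac{1}{r_2}\right)^h = \frac{e^{i\theta} - R^2 e^{-i\theta}}{2i(1 + R^2)\sin\theta} \cdot \frac{e^{-ih\theta}}{R^h} + \frac{R^2 e^{i\theta} - e^{-i\theta}}{2i(1 + R^2)\sin\theta} \cdot \frac{e^{ih\theta}}{R^h}$$
$$= \frac{R^2\left(e^{i(h+1)\theta} - e^{-i(h+1)\theta}\right) - \left(e^{i(h-1)\theta} - e^{-i(h-1)\theta}\right)}{2i\,R^h (1 + R^2)\sin\theta}$$
$$= \frac{R^2 \sin((h+1)\theta) - \sin((h-1)\theta)}{R^h (1 + R^2)\sin\theta}$$
$$= \frac{R^2\left[\sin h\theta \cos\theta + \cos h\theta \sin\theta\right] - \left[\sin h\theta \cos\theta - \cos h\theta \sin\theta\right]}{R^h (1 + R^2)\sin\theta}$$
$$= \frac{(R^2 + 1)\cos h\theta \sin\theta + (R^2 - 1)\sin h\theta \cos\theta}{R^h (1 + R^2)\sin\theta}$$
$$= \frac{1}{R^h}\left[\cos h\theta + \frac{R^2 - 1}{R^2 + 1}\cot\theta\,\sin h\theta\right]$$
$$= \frac{1}{R^h}\left[\cos h\theta + \tan\phi\,\sin h\theta\right] \quad \text{if } \tan\phi = \frac{R^2 - 1}{R^2 + 1}\cot\theta$$
Hence
$$\rho(h) = \frac{1}{R^h}\left[\cos h\theta + \tan\phi\,\sin h\theta\right] = \frac{1}{R^h \cos\phi}\left[\cos h\theta \cos\phi + \sin h\theta \sin\phi\right] = \frac{D\cos(h\theta - \phi)}{R^h}$$
where
$$D = \frac{1}{\cos\phi} = \sqrt{\frac{\cos^2\phi + \sin^2\phi}{\cos^2\phi}} = \sqrt{1 + \tan^2\phi},$$
a damped cosine wave.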
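The damped cosine form ρ(h) = D cos(hθ − φ)/R^h can be evaluated numerically and compared with the recursion. The sketch below assumes the convention r1 = x + iy = R e^{iθ} with y > 0 (so tan θ = y/x), tan φ = ((R² − 1)/(R² + 1)) cot θ and D = sqrt(1 + tan² φ); the function names are hypothetical.

```python
import cmath
import math


def ar2_damped_cosine(beta1, beta2):
    """Return (R, theta, phi, D) for an AR(2) whose polynomial roots are complex."""
    disc = beta1 ** 2 + 4 * beta2
    if disc >= 0:
        raise ValueError("roots are real; this form needs beta1^2 + 4*beta2 < 0")
    root = (-beta1 + cmath.sqrt(disc)) / (2 * beta2)
    x, y = root.real, abs(root.imag)        # use the root with positive imaginary part
    R = math.hypot(x, y)
    theta = math.atan2(y, x)                # tan(theta) = y / x
    tan_phi = (R * R - 1) / (R * R + 1) * math.cos(theta) / math.sin(theta)
    phi = math.atan(tan_phi)
    D = math.sqrt(1 + tan_phi ** 2)
    return R, theta, phi, D


def rho_damped(h, R, theta, phi, D):
    """rho(h) = D cos(h theta - phi) / R^h."""
    return D * math.cos(h * theta - phi) / R ** h


R, theta, phi, D = ar2_damped_cosine(0.2, -0.5)
print(round(R, 6), round(theta, 6), round(phi, 6), round(D, 6))
print([round(rho_damped(h, R, theta, phi, D), 4) for h in range(6)])
```

The period of the oscillation is 2π/θ and the envelope shrinks by a factor 1/R per lag, so roots close to the unit circle give slowly decaying waves.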
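Example
Consider the AR(2) time series:
$$x_t = 0.7 x_{t-1} + 0.2 x_{t-2} + 4.1 + u_t$$
where $\{u_t\}$ is a white noise time series with standard deviation $\sigma = 2.0$.
The correlation function found before using the difference equation
$$\rho(h) = 0.7\,\rho(h-1) + 0.2\,\rho(h-2)$$
h      0       1       2       3       4       5       6       7       8
ρ(h)   1.0000  0.8750  0.8125  0.7438  0.6831  0.6269  0.5755  0.5282  0.4849
Alternatively setting up the polynomial
$$\beta(x) = 1 - \beta_1 x - \beta_2 x^2 = 1 - 0.7x - 0.2x^2 = \left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right)$$
where
$$r_1 = \frac{-\beta_1 + \sqrt{\beta_1^2 + 4\beta_2}}{2\beta_2} = \frac{-0.7 + \sqrt{0.7^2 + 4(0.2)}}{2(0.2)} = 1.089454$$
and
$$r_2 = \frac{-\beta_1 - \sqrt{\beta_1^2 + 4\beta_2}}{2\beta_2} = \frac{-0.7 - \sqrt{0.7^2 + 4(0.2)}}{2(0.2)} = -4.58945$$
Thus
$$\rho(h) = \frac{r_1 (1 - r_2^2)}{(r_1 - r_2)(1 + r_1 r_2)} \left(\frac{1}{r_1}\right)^h - \frac{r_2 (1 - r_1^2)}{(r_1 - r_2)(1 + r_1 r_2)} \left(\frac{1}{r_2}\right)^h$$
with
$$(r_1 - r_2)(1 + r_1 r_2) = -22.7156$$
$$r_1 (1 - r_2^2) = -21.8578 \quad \text{and} \quad r_2 (1 - r_1^2) = 0.85782$$
$$\frac{r_1 (1 - r_2^2)}{(r_1 - r_2)(1 + r_1 r_2)} = 0.962237 \quad \text{and} \quad -\frac{r_2 (1 - r_1^2)}{(r_1 - r_2)(1 + r_1 r_2)} = 0.037763$$
so that
$$\rho(h) = 0.962237 \left(\frac{1}{1.089454}\right)^h + 0.037763 \left(\frac{1}{-4.58945}\right)^h$$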
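Another Example
Consider the AR(2) time series:
$$x_t = 0.2 x_{t-1} - 0.5 x_{t-2} + 4.1 + u_t$$
where $\{u_t\}$ is a white noise time series with standard deviation $\sigma = 2.0$.
The correlation function found using the difference equation
$$\rho(h) = 0.2\,\rho(h-1) - 0.5\,\rho(h-2)$$
with $\rho(0) = 1$ and $\rho(1) = \beta_1/(1 - \beta_2) = 0.2/1.5 = 0.1333$:
h      0       1       2        3        4       5       6        7        8
ρ(h)   1.0000  0.1333  -0.4733  -0.1613  0.2044  0.1215  -0.0779  -0.0764  0.0237
Alternatively setting up the polynomial
$$\beta(x) = 1 - \beta_1 x - \beta_2 x^2 = 1 - 0.2x + 0.5x^2 = \left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right)$$
where the roots are
$$r_{1,2} = \frac{-\beta_1 \pm \sqrt{\beta_1^2 + 4\beta_2}}{2\beta_2} = \frac{-0.2 \pm \sqrt{0.2^2 - 4(0.5)}}{2(-0.5)} = 0.2 \pm \sqrt{1.96}\,i = 0.2 \pm 1.4i$$
Thus
$$r_1 = 0.2 + 1.4i = Re^{i\theta}$$
$$r_2 = 0.2 - 1.4i = Re^{-i\theta}$$
where
$$R = \sqrt{x^2 + y^2} = \sqrt{0.2^2 + 1.96} = \sqrt{2}$$
and
$$\tan\theta = \frac{y}{x} = \frac{1.4}{0.2} = 7, \quad \text{thus } \theta = \tan^{-1}(7) = 1.428899$$
Now
$$\tan\phi = \frac{R^2 - 1}{R^2 + 1}\cot\theta = \frac{2 - 1}{2 + 1}(0.142857) = 0.047619$$
Thus $\phi = \tan^{-1}(0.047619) = 0.047583$.
Also
$$D = \sqrt{1 + \tan^2\phi} = \sqrt{1 + 0.047619^2} = 1.001133$$
Finally
$$\rho(h) = \frac{D\cos(h\theta - \phi)}{R^h} = \frac{1.001133\,\cos(1.428899\,h - 0.047583)}{\left(\sqrt{2}\right)^h},$$
a damped cosine wave. As a check, $h = 1$ gives 0.1333 and $h = 2$ gives -0.4733, in agreement with the difference equation.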
Conditions for stationarity
Autoregressive Time series of order p, AR(p)
Consider first the AR(1) case. If $|\beta_1| > 1$ and $\delta = 0$, the time series $\{x_t : t \in T\}$ satisfies the equation:
$$x_t = \beta_1 x_{t-1} + u_t$$
The value of $x_t$ increases in magnitude and $u_t$ eventually becomes negligible, i.e.
$$x_t \approx \beta_1 x_{t-1}$$
The time series $\{x_t : t \in T\}$ exhibits deterministic behaviour.
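Let $\beta_1, \beta_2, \ldots, \beta_p$ denote $p$ numbers.
Let $\{u_t : t \in T\}$ denote a white noise time series with variance $\sigma^2$:
– independent
– mean 0, variance $\sigma^2$.
Let $\{x_t : t \in T\}$ be defined by the equation
$$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_p x_{t-p} + \delta + u_t.$$
Then $\{x_t : t \in T\}$ is called an Autoregressive time series of order p, AR(p).
Consider the polynomial
$$\beta(x) = 1 - \beta_1 x - \cdots - \beta_p x^p = \left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right)\cdots\left(1 - \frac{x}{r_p}\right)$$
with roots $r_1, r_2, \ldots, r_p$.
Then $\{x_t : t \in T\}$ is stationary if $|r_i| > 1$ for all $i$.
If $|r_i| < 1$ for at least one $i$ then $\{x_t : t \in T\}$ exhibits deterministic behaviour.
If $|r_i| \ge 1$ for all $i$ and $|r_i| = 1$ for at least one $i$ then $\{x_t : t \in T\}$ exhibits non-stationary random behaviour.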
Special Cases: The AR(1) time series
Let $\{x_t : t \in T\}$ be defined by the equation
$$x_t = \beta_1 x_{t-1} + \delta + u_t$$
Consider the polynomial
$$\beta(x) = 1 - \beta_1 x = 1 - \frac{x}{r_1}$$
with root $r_1 = 1/\beta_1$.
1. $\{x_t : t \in T\}$ is stationary if $|r_1| > 1$ or $|\beta_1| < 1$.
2. If $|r_1| < 1$ or $|\beta_1| > 1$ then $\{x_t : t \in T\}$ exhibits deterministic behaviour.
3. If $|r_1| = 1$ or $|\beta_1| = 1$ then $\{x_t : t \in T\}$ exhibits non-stationary random behaviour.
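Special Cases: The AR(2) time series
Let $\{x_t : t \in T\}$ be defined by the equation
$$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \delta + u_t$$
Consider the polynomial
$$\beta(x) = 1 - \beta_1 x - \beta_2 x^2 = \left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right)$$
where $r_1$ and $r_2$ are the roots of $\beta(x)$.
1. $\{x_t : t \in T\}$ is stationary if $|r_1| > 1$ and $|r_2| > 1$.
2. If $|r_i| < 1$ for at least one $i$ then $\{x_t : t \in T\}$ exhibits deterministic behaviour.
3. If $|r_i| \ge 1$ for $i = 1, 2$ and $|r_i| = 1$ for at least one $i$ then $\{x_t : t \in T\}$ exhibits non-stationary random behaviour.
Condition 1 (stationarity) is true if $\beta_1 + \beta_2 < 1$, $\beta_2 - \beta_1 < 1$ and $\beta_2 > -1$. These inequalities define a triangular region for $\beta_1$ and $\beta_2$.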
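The root condition and the triangular region for an AR(2) series can be compared directly. The sketch below is illustrative: it checks |r1| > 1 and |r2| > 1 using the quadratic formula, and separately checks the three inequalities β1 + β2 < 1, β2 − β1 < 1 and β2 > −1. The function names and the test points are hypothetical, and the β2 = 0 branch treats the series as an AR(1).

```python
import cmath


def ar2_stationary_by_roots(beta1, beta2):
    """True if both roots of 1 - beta1 x - beta2 x^2 lie outside the unit circle."""
    if beta2 == 0:
        # Degenerate case: an AR(1) with root 1/beta1, stationary iff |beta1| < 1.
        return abs(beta1) < 1
    disc = cmath.sqrt(beta1 ** 2 + 4 * beta2)
    roots = ((-beta1 + disc) / (2 * beta2), (-beta1 - disc) / (2 * beta2))
    return all(abs(r) > 1 for r in roots)


def ar2_in_triangle(beta1, beta2):
    """True if (beta1, beta2) lies inside the triangular stationarity region."""
    return beta1 + beta2 < 1 and beta2 - beta1 < 1 and beta2 > -1


points = [(0.7, 0.2), (0.2, -0.5), (1.0, 0.0), (0.5, 0.6), (0.0, -1.2)]
for b1, b2 in points:
    print((b1, b2), ar2_stationary_by_roots(b1, b2), ar2_in_triangle(b1, b2))
```

The two checks should agree away from the boundary of the triangle; points exactly on the boundary correspond to roots on the unit circle and are sensitive to rounding.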
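[Figure: Patterns of the ACF and PACF of AR(2) Time Series. The triangular stationarity region in the ($\beta_1$, $\beta_2$) plane is divided into regions I, II, III and IV, each with a sketch of $\rho_h$ and $\phi_{kk}$. In the shaded region the roots of the AR operator are complex.]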