
Page 1: Chapter 3 Prediction and model selection

Introduction to time series (2008) 1

Chapter 3

Prediction and model selection

Page 2: Chapter 3 Prediction and model selection

Chapter 3. Contents.

3.1. Properties of MMSE of prediction.

3.2. The computation of ARIMA forecasts.

3.3. Interpreting the forecasts from ARIMA models.

3.4. Prediction confidence intervals.

3.5. Forecast updating.

3.6. Model selection criteria.

Page 3: Chapter 3 Prediction and model selection

• We assume an observed sample Z_T = (z_1, z_2, …, z_T)'.

• We want to generate predictions of the future values z_{T+1}, z_{T+2}, …, z_{T+k} given the observations.

• T is the forecast origin and k the forecast horizon.

Page 4: Chapter 3 Prediction and model selection


Three components

• 1. Estimation of new values: prediction.

• 2. Measure of the uncertainty: prediction intervals.

• 3. Arrival of new data: updating.

Page 5: Chapter 3 Prediction and model selection


Chapter 3. Prediction and model selection.

3.1. Properties of MMSE of prediction.

 

Page 6: Chapter 3 Prediction and model selection

Properties of MMSE of prediction

• Prediction by the conditional expectation.

• We have T observations (Z_T) of a zero-mean stationary time series, and we want to forecast the value of z_{T+k}.

• In order to compare alternative forecasting procedures, we need a criterion of optimality.

Page 7: Chapter 3 Prediction and model selection

Properties of MMSE of prediction

• Minimum Mean Square Error Forecasts (MMSEF). Forecasts that minimize this criterion can be computed as follows.

• Let g_T(k) be the forecast we want to generate; this forecast must minimize

MSE(z_{T+k}, g) = E[z_{T+k} − g_T(k)]²

– where the expected value is taken over the joint distribution of z_{T+k} and Z_T.

Page 8: Chapter 3 Prediction and model selection

Properties of MMSE of prediction

• Using the well-known property E(y) = E_x[E_{y|x}(y)],

• we obtain

MSE(z_{T+k}|Z_T) = E[z²_{T+k}|Z_T] − 2 g_T(k) E[z_{T+k}|Z_T] + g_T(k)²

Page 9: Chapter 3 Prediction and model selection

Properties of MMSE of prediction

• and taking the derivative with respect to g_T(k), we obtain

g_T(k) = E[z_{T+k}|Z_T] = ẑ_T(k)

• This result indicates that, conditioning on the observed sample, the MMSEF is obtained by computing the conditional expectation of the random variable given the available information.

Page 10: Chapter 3 Prediction and model selection

Properties of MMSE of prediction

• Linear predictions.

• Conditional expectations can be, in some cases, difficult to compute.

– We restrict our search to forecasting functions that are linear functions of the observations.

• General equation for a linear predictor:

ẑ_T(k) = b_{k0} z_T + … + b_{k,T−1} z_1 = b_k' Z_T

Page 11: Chapter 3 Prediction and model selection

Properties of MMSE of prediction

• Calling MSEL the mean square error of a linear forecast,

MSEL(z_{T+k}|Z_T) = E[z_{T+k} − b_k' Z_T]²

• minimizing this expression with respect to the parameters, we have

E[(z_{T+k} − b_k' Z_T) Z_T] = 0

Page 12: Chapter 3 Prediction and model selection

Properties of MMSE of prediction

• This implies that the best linear forecast must be such that the forecast error is uncorrelated with the set of observed variables.

• This property suggests the interpretation of the linear predictor as a projection.

Page 13: Chapter 3 Prediction and model selection

Properties of MMSE of prediction

• That is, finding the coefficients of the best linear predictor is equivalent to regressing z_{T+k} on Z_T,

• then

b_k = Γ_T⁻¹ γ_k

• where Γ_T is the covariance matrix of Z_T and γ_k is the covariance vector between z_{T+k} and Z_T.
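As a numerical check, the projection b_k = Γ_T⁻¹ γ_k can be evaluated for an AR(1) process, where the answer is known in closed form (ẑ_T(k) = φ^k z_T). A minimal sketch; the parameter values are illustrative assumptions, and Z_T is ordered with the most recent observation first:

```python
import numpy as np

phi, sigma2, T, k = 0.8, 1.0, 10, 3   # illustrative values

def gamma(h):
    # AR(1) autocovariance: gamma(h) = sigma2 * phi^|h| / (1 - phi^2)
    return sigma2 * phi ** np.abs(h) / (1 - phi ** 2)

idx = np.arange(T)
Gamma = gamma(idx[:, None] - idx[None, :])   # Gamma_T, covariance matrix of Z_T
gamma_k = gamma(k + idx)                     # gamma_k, cov(z_{T+k}, Z_T)
b = np.linalg.solve(Gamma, gamma_k)          # b_k = Gamma_T^{-1} gamma_k

expected = np.zeros(T)
expected[0] = phi ** k                       # AR(1): z_hat_T(k) = phi^k z_T
print(np.allclose(b, expected))              # True
```

All the weight falls on the last observation, as the Markov structure of the AR(1) predicts.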

Page 14: Chapter 3 Prediction and model selection


Chapter 3. Prediction and model selection.

3.2. The computation of ARIMA forecasts.

 

Page 15: Chapter 3 Prediction and model selection

The computation of ARIMA forecasts

• Suppose we want to forecast a time series that follows an ARIMA(p,d,q) model. First, we will assume that the parameters are known and the prediction horizon is one (k = 1):

z_{T+1} = φ₁ z_T + … + φ_h z_{T+1−h} + a_{T+1} − θ₁ a_T − … − θ_q a_{T+1−q}

• where h = p + d.

Page 16: Chapter 3 Prediction and model selection

The computation of ARIMA forecasts

• The one-step-ahead forecast will be

ẑ_T(1) = E[z_{T+1}|Z_T]

• and because the expected values of the observed sample data and of the past errors are the values themselves, and the only unknown is a_{T+1},

ẑ_T(1) = φ₁ z_T + … + φ_h z_{T+1−h} − θ₁ a_T − … − θ_q a_{T+1−q}

Page 17: Chapter 3 Prediction and model selection

The computation of ARIMA forecasts

• Therefore, the one-step prediction error is

a_{T+1} = z_{T+1} − ẑ_T(1)

– remember this is considering that the parameters are known, and therefore the innovations are also known, because we can compute them recursively from the observations.

Page 18: Chapter 3 Prediction and model selection

The computation of ARIMA forecasts

• Multiple-steps-ahead forecast:

ẑ_T(k) = φ₁ ẑ_T(k−1) + … + φ_h ẑ_T(k−h) + â_T(k) − θ₁ â_T(k−1) − … − θ_q â_T(k−q)

• where

ẑ_T(j) = E[z_{T+j}|Z_T], j = 1, 2, …, k (and ẑ_T(j) = z_{T+j} for j ≤ 0)

â_T(j) = E[a_{T+j}|Z_T], j = 1, 2, …, k (= 0 for j > 0, and = a_{T+j} for j ≤ 0)
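The recursion above can be sketched for an ARMA(1,1) with known parameters: the innovations are first recovered recursively from the observations, then future a's are set to zero. The data values and the startup convention a_1 = 0 are illustrative assumptions:

```python
phi, theta = 0.7, 0.4                     # assumed known parameters
z = [1.0, 0.5, 1.2, 0.8]                  # illustrative sample z_1..z_T

# innovations computed recursively: a_t = z_t - phi z_{t-1} + theta a_{t-1}
a = [0.0]                                 # a_1 = 0 (a common startup assumption)
for t in range(1, len(z)):
    a.append(z[t] - phi * z[t - 1] + theta * a[-1])

# forecasts: future a's are replaced by 0; past z's and a's by their values
K = 5
zhat = []
for k in range(1, K + 1):
    prev_z = zhat[-1] if k > 1 else z[-1]          # z_hat_T(k-1), or z_T at k = 1
    ma_term = -theta * a[-1] if k == 1 else 0.0    # theta a_T only enters at k = 1
    zhat.append(phi * prev_z + ma_term)

print([round(v, 4) for v in zhat])
```

Note how the MA part contributes only at the first step; from k = 2 on the forecasts decay geometrically through the AR coefficient.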

Page 19: Chapter 3 Prediction and model selection

The computation of ARIMA forecasts

• This expression has two parts:

– The first one, which depends on the AR coefficients, will determine the form of the long-run forecast (eventual forecast equation).

– The second one, which depends on the moving average coefficients, will disappear for k > q.

Page 20: Chapter 3 Prediction and model selection

The computation of ARIMA forecasts

• AR(1) model:

ẑ_T(1) = φ z_T
ẑ_T(2) = φ ẑ_T(1) = φ² z_T
…
ẑ_T(k) = φ ẑ_T(k−1) = φ^k z_T

• For large k, the term φ^k z_T → 0, and therefore the long-run forecast (for any ARMA(p,q)) will go to the mean of the process.

Page 21: Chapter 3 Prediction and model selection

The computation of ARIMA forecasts

• Random walk with constant: z_t = c + z_{t−1} + a_t

ẑ_T(1) = c + z_T
ẑ_T(2) = c + ẑ_T(1) = 2c + z_T
…
ẑ_T(k) = c + ẑ_T(k−1) = kc + z_T

• The forecasts follow a straight line with slope c. If c = 0, all forecasts are equal to the last observed value.
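The two cases can be contrasted numerically; a minimal sketch with illustrative values for φ, z_T and c:

```python
phi, zT, c, K = 0.6, 2.0, 0.5, 10   # illustrative assumptions

# AR(1): z_hat_T(k) = phi^k z_T -> 0 (the zero mean) as k grows
ar1 = [phi ** k * zT for k in range(1, K + 1)]

# random walk with constant: z_hat_T(k) = k c + z_T, a line with slope c
rw = [k * c + zT for k in range(1, K + 1)]

print(round(ar1[-1], 4))   # already close to the zero mean
print(rw[-1])              # 2.0 + 10 * 0.5 = 7.0
```

The AR(1) path flattens toward the mean; the random-walk path keeps climbing by c per step.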

Page 22: Chapter 3 Prediction and model selection


Chapter 3. Prediction and model selection.

3.3. Interpreting the forecasts from ARIMA models.

 

Page 23: Chapter 3 Prediction and model selection

Interpretation of the forecasts

Nonseasonal models.

• The eventual forecast function of a nonseasonal ARIMA model verifies, for k > q,

φ(B)(∇^d ẑ_T(k) − μ) = 0

• where μ = mean(∇^d z_t), the mean of the stationary transformation.

Page 24: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• Espasa and Peña (1995) proved that the general solution of this equation can be written as

ẑ_T(k) = P_T(k) + t_T(k),   k > max(q − p − d, 0)

• where the permanent component P_T(k) is a polynomial of order d,

Page 25: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• and the transitory component t_T(k) satisfies

φ(B) t_T(k) = 0

• The permanent component will be given by

P_T(k) = β₀^(T) + β₁^(T) k + … + β_d^(T) k^d

• with β_d = μ/d! determined by the mean of the stationary process.

Page 26: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• whereas the rest of the parameters β_i^(T) depend on the initial values and change with the forecast origin.

• Examples:

d = 0:  P_T(k) = β₀^(T)
d = 1:  P_T(k) = β₀^(T) + μ k
d = 2:  P_T(k) = β₀^(T) + β₁^(T) k + (μ/2) k²

Page 27: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• 1. For d = 0, the forecast will be constant for all horizons.

• 2. For d = 1, we obtain a deterministic linear trend with slope μ; if μ = 0, then the permanent component is just a constant.

• 3. For d = 2, the solution is a quadratic trend with the leading term determined by μ/2. If μ = 0, the equation reduces to a linear trend, but now the slope depends on the origin of the forecast.

Page 28: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• In summary, the long-run forecast from an ARIMA model is the mean if the series is stationary, and a polynomial for nonstationary models.

– In this last case, the leading coefficient of the polynomial is a constant determined by the mean when the mean is different from zero, whereas it depends on the forecast origin (adaptive) when the mean is zero.

Page 29: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• Transitory component. It can be given by

t_T(k) = Σ_{i=1}^{p} A_i G_i^k

• where the G_i⁻¹ are the roots of the AR polynomial and the A_i are coefficients depending on the forecast origin.

Page 30: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• Example. Consider the model

(1 − φB)∇z_t = a_t

• then G₁ = φ, and the forecasts must have the form

ẑ_T(k) = c_T + A φ^k

• where c_T, the constant that appears as the solution of ∇P_T(k) = 0, and A, the constant in the transitory equation, must

Page 31: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• be determined from the initial conditions, and can be obtained from

ẑ_T(1) = c_T + Aφ  = z_T + φ(z_T − z_{T−1})
ẑ_T(2) = c_T + Aφ² = z_T + (φ + φ²)(z_T − z_{T−1})

• The solution of these two equations is

c_T = z_T + φ(z_T − z_{T−1}) / (1 − φ)

Page 32: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• and

A = −φ(z_T − z_{T−1}) / (1 − φ)

• These results indicate that the forecasts slowly approach the long-run forecast c_T.

• Note that as A φ^k goes to zero, the adjustment made by the transitory component decreases exponentially (the sign pattern of the adjustment depends on the sign of φ).
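A quick numerical check that the eventual forecast function c_T + Aφ^k reproduces the direct recursion ẑ_T(k) = (1 + φ)ẑ_T(k−1) − φẑ_T(k−2) implied by (1 − φB)∇ẑ_T(k) = 0. The value of φ and the last two observations are illustrative assumptions:

```python
phi, zT, zT1 = 0.5, 3.0, 2.0              # zT1 stands for z_{T-1}
d = zT - zT1
cT = zT + phi * d / (1 - phi)             # long-run forecast
A = -phi * d / (1 - phi)                  # transitory coefficient

# direct recursion, started from z_{T-1}, z_T
f = [zT1, zT]
for _ in range(10):
    f.append((1 + phi) * f[-1] - phi * f[-2])
recursion = f[2:]

closed_form = [cT + A * phi ** k for k in range(1, 11)]
print(all(abs(x - y) < 1e-9 for x, y in zip(recursion, closed_form)))  # True
```

With these numbers c_T = 4 and A = −1, so the forecasts rise from 3.5 toward the limit 4 at an exponential rate governed by φ.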

Page 33: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• Seasonal models. For seasonal processes the forecasts will satisfy the equation

Φ(B^s) φ(B) (∇_s^D ∇^d ẑ_T(k) − μ) = 0

• Let us assume that D = 1; then the seasonal difference factorizes as

(1 − B^s) = (1 + B + B² + … + B^{s−1})(1 − B) = S(B)(1 − B)

Page 34: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• and therefore

Φ(B^s) φ(B) (S(B) ∇^{d+1} ẑ_T(k) − μ) = 0

• which has the property that the operators involved do not share roots in common. The solution is given by

ẑ_T(k) = T_T(k) + E_T(k) + t_T(k)

Page 35: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• The permanent component has been split into two terms: the trend component, which satisfies

∇^{d+1} T_T(k) = μ/s

• and the seasonal component, which satisfies

S(B) E_T(k) = 0

Page 36: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• Finally, the transitory component, which will die out for large horizons, satisfies

Φ(B^s) φ(B) t_T(k) = 0

• The trend component has the same form as for nonseasonal data, but the order is d + 1, and therefore the leading coefficient is

β_{d+1} = μ / (s (d+1)!)

Page 37: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• The seasonal component will be given by

S(B) E_T(k) = 0,  with E_T(j) = E_T(j + s) and Σ_{j=1}^{s} E_T(j) = 0

• that is, the solution of this equation is a function of period s whose values sum to zero over each s consecutive lags. The coefficients are called seasonal coefficients and depend on the forecasting origin.

Page 38: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• Example: the airline model

∇∇₁₂ z_t = (1 − θB)(1 − ΘB¹²) a_t

• The equation of the forecast is

ẑ_T(k) = ẑ_T(k−1) + ẑ_T(k−12) − ẑ_T(k−13) − θ â_T(k−1) − Θ â_T(k−12) + θΘ â_T(k−13)

Page 39: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• This equation can be written as

ẑ_T(k) = β̂₀^(T) + β̂₁^(T) k + Ŝ_k^(T)

• that is, a linear trend plus a seasonal component, with coefficients that change over time. In order to determine the parameters, we need 13 initial conditions:

ẑ_T(j) = β̂₀^(T) + β̂₁^(T) j + Ŝ_j^(T),   j = 1, 2, …, 13

Page 40: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• With Ŝ_j^(T) = Ŝ_{j−12}^(T), we obtain that the slope is

β̂₁^(T) = (ẑ_T(13) − ẑ_T(1)) / 12

• and calling

z̄_T = (1/12) Σ_{j=1}^{12} ẑ_T(j) = β̂₀^(T) + (13/2) β̂₁^(T)

Page 41: Chapter 3 Prediction and model selection

Interpretation of the forecasts

• we have that

β̂₀^(T) = z̄_T − (13/2) β̂₁^(T)

• The seasonal coefficients are

Ŝ_j^(T) = ẑ_T(j) − β̂₀^(T) − β̂₁^(T) j

• and will be given by the deviations of the forecasts from the trend component.
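The recovery of the slope, intercept and seasonal coefficients from 13 initial forecasts can be sketched as follows. The forecast values are made up for illustration; the check exploits the fact that the 12 seasonal coefficients sum to zero by construction:

```python
import numpy as np

# hypothetical forecasts z_hat_T(1), ..., z_hat_T(13) of an airline-type model
zhat = np.array([10.2, 9.8, 11.0, 12.1, 12.5, 13.4,
                 14.8, 14.6, 12.9, 11.7, 10.9, 11.5, 11.4])

b1 = (zhat[12] - zhat[0]) / 12       # slope, since S_13 = S_1
zbar = zhat[:12].mean()
b0 = zbar - 6.5 * b1                 # intercept, since (1 + ... + 12)/12 = 13/2
S = [zhat[j - 1] - b0 - b1 * j for j in range(1, 13)]   # seasonal coefficients

print(round(b1, 4))                  # slope of the eventual linear trend
print(abs(sum(S)) < 1e-9)            # True: seasonal coefficients sum to zero
```

Any set of 13 forecasts yields seasonal coefficients summing to zero, because the identity Σ Ŝ_j = 12 z̄_T − 12 β̂₀ − 78 β̂₁ = 0 holds by construction of β̂₀.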

Page 42: Chapter 3 Prediction and model selection

Prediction confidence intervals

• Known parameter values. Let us write the process in MA(∞) form,

z_t = ψ(B) a_t

• then we can write

z_{T+k} = Σ_{j=0}^{∞} ψ_j a_{T+k−j}

• and taking expected values conditional on the data,

ẑ_T(k) = Σ_{j=0}^{∞} ψ_{k+j} a_{T−j}

Page 43: Chapter 3 Prediction and model selection

Prediction confidence intervals

• The forecast error is

e_T(k) = z_{T+k} − ẑ_T(k) = a_{T+k} + ψ₁ a_{T+k−1} + … + ψ_{k−1} a_{T+1}

• with variance

Var(e_T(k)) = σ² (1 + ψ₁² + … + ψ²_{k−1})

• This equation indicates that the uncertainty of the long-run forecast is different for stationary and nonstationary models.

Page 44: Chapter 3 Prediction and model selection

Prediction confidence intervals

• For a stationary model the series of variances converges, since ψ_k → 0 as k → ∞.

• For an AR(1) model, for instance,

Var(e_T(k)) → σ² / (1 − φ²)

• The long-run forecast goes to the mean, and the uncertainty is finite.

Page 45: Chapter 3 Prediction and model selection


Prediction confidence intervals

• When the model is nonstationary, the variance of the forecast grows without bounds. This means that we cannot make useful long run forecasts.

• If the distribution of the forecast error is known, we can compute confidence intervals for the forecast.

Page 46: Chapter 3 Prediction and model selection

Prediction confidence intervals

• Assuming normality, the 95% confidence interval for the random variable z_{T+k} is

ẑ_T(k) ± 1.96 σ (1 + ψ₁² + … + ψ²_{k−1})^{1/2}

• We may also need the covariances between forecast errors; for h > 0,

cov(e_T(i), e_T(i+h)) = σ² Σ_{j=0}^{i−1} ψ_j ψ_{h+j}
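A sketch of the interval half-width for an AR(1), where ψ_j = φ^j; the values of φ and σ are illustrative assumptions. It also shows the width converging to the stationary limit 1.96 σ / (1 − φ²)^{1/2}:

```python
import math

phi, sigma, K = 0.8, 1.0, 50   # illustrative values

def half_width(k):
    # 1.96 * sigma * (1 + psi_1^2 + ... + psi_{k-1}^2)^{1/2}, psi_j = phi^j
    s = sum(phi ** (2 * j) for j in range(k))
    return 1.96 * sigma * math.sqrt(s)

print(round(half_width(1), 3))                         # 1.96 at one step ahead
print(round(half_width(K), 3))                         # near the limit by k = 50
print(round(1.96 * sigma / math.sqrt(1 - phi ** 2), 3))  # stationary limit
```

For a nonstationary model the sum of squared ψ weights would diverge instead, so the bands widen without bound, matching the remark on the next slide.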

Page 47: Chapter 3 Prediction and model selection

Prediction confidence intervals

• Unknown parameter values. It can be shown that the uncertainty introduced in the forecast by this additional source is small for moderate sample sizes, and can be ignored in practice.

• Suppose an AR(1) model, with forecast

ẑ_T(1) = φ̂ z_T

Page 48: Chapter 3 Prediction and model selection

Prediction confidence intervals

• The theoretical forecast error is e_T(1) = a_{T+1}; it is related to the observed forecast error

e*_T(1) = z_{T+1} − φ̂ z_T

• by

e*_T(1) = e_T(1) + (φ − φ̂) z_T

• assuming that z_T is fixed, and using that

Var(φ̂) ≈ σ² / Σ_{t=1}^{n} z_t²

Page 49: Chapter 3 Prediction and model selection

Prediction confidence intervals

• we have that

Var(e*_T(1)) = σ² (1 + z_T² / Σ_{t=1}^{n} z_t²)

• This equation indicates that the forecast error has two components:

– the uncertainty due to the random behavior of the observation;

– the parameter uncertainty, because the parameters are estimated from the sample (of order 1/n, so it can be ignored for large n).

Page 50: Chapter 3 Prediction and model selection

Chapter 3. Prediction and model selection.

3.5. Forecast updating.

 

Page 51: Chapter 3 Prediction and model selection

Forecast updating

• Computing updated forecasts. When a new observation z_{T+1} becomes available, note that

ẑ_{T+1}(k) = ψ_k a_{T+1} + ψ_{k+1} a_T + ψ_{k+2} a_{T−1} + …
ẑ_T(k+1) = ψ_{k+1} a_T + ψ_{k+2} a_{T−1} + …

• which leads to

ẑ_{T+1}(k) = ẑ_T(k+1) + ψ_k a_{T+1}

Page 52: Chapter 3 Prediction and model selection

Forecast updating

• where a_{T+1} = z_{T+1} − ẑ_T(1) is the one-step-ahead forecast error,

• and so the forecasts are updated by

ẑ_{T+1}(k) = ẑ_T(k+1) + ψ_k a_{T+1}

– Note that the forecasts are updated by adding to the previous forecast a fraction of the last observed forecast error; the ψ weights are the coefficients for forecast updating.
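The updating rule can be verified on an AR(1), where ψ_k = φ^k and every quantity has a closed form: ẑ_T(k) = φ^k z_T and a_{T+1} = z_{T+1} − φ z_T. The numbers are illustrative assumptions:

```python
phi, zT, z_new = 0.7, 1.5, 0.9   # z_new plays the role of z_{T+1}

a_new = z_new - phi * zT         # one-step-ahead forecast error a_{T+1}
for k in range(1, 6):
    updated = phi ** (k + 1) * zT + phi ** k * a_new  # z_hat_T(k+1) + psi_k a_{T+1}
    direct = phi ** k * z_new                          # forecast from origin T+1
    assert abs(updated - direct) < 1e-12
print("updating identity holds")
```

The identity holds exactly here because φ^k z_{T+1} = φ^{k+1} z_T + φ^k (z_{T+1} − φ z_T) is an algebraic rearrangement.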

Page 53: Chapter 3 Prediction and model selection

Forecast updating

• Testing model stability (Box and Tiao, 1976). If the model is correct, the statistic

Q(h) = Σ_{j=1}^{h} â²_{T+j} / σ²

follows a χ²_h distribution.

• Because we need an estimate of the variance, in practice we use

Q*(h) = (Σ_{j=1}^{h} â²_{T+j} / h) / σ̂²  ~  F_{h, n−p−q}
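A sketch of the statistic with estimated variance. The post-sample one-step errors and σ̂² below are made-up numbers; in practice the computed Q*(h) would be compared with a critical value from the F(h, n−p−q) distribution:

```python
# hypothetical post-sample one-step-ahead forecast errors a_hat_{T+1..T+h}
a_new = [0.9, -1.4, 0.3, 1.1, -0.2]
sigma2_hat = 1.0                       # assumed in-sample variance estimate

h = len(a_new)
Q_star = (sum(e ** 2 for e in a_new) / h) / sigma2_hat
print(round(Q_star, 3))                # compare with an F(h, n-p-q) critical value
```

A value of Q*(h) well above the critical value signals that the post-sample errors are too large for the fitted model, i.e., model instability.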

Page 54: Chapter 3 Prediction and model selection


Chapter 3. Prediction and model selection.

3.6. Model selection criteria.

 

Page 55: Chapter 3 Prediction and model selection

Model selection criteria.

• The FPE and AIC criteria. Suppose we want to select the order of an AR(p) model in such a way that the out-of-sample one-step-ahead prediction mean square error is minimized. This MSE is given by

MSE(z_{T+1}) = E[z_{T+1} − φ̂' Z_p]²

• with

Z_p = (z_T, …, z_{T−p+1})'

Page 56: Chapter 3 Prediction and model selection

Model selection criteria.

• The forecast error can be decomposed as

e_{T+1} = (z_{T+1} − φ' Z_p) + (φ − φ̂)' Z_p

• and so

MSE(z_{T+1}) = σ² + E[(φ̂ − φ)' Z_p Z_p' (φ̂ − φ)]

• which decomposes the forecast error as the sum of the variable uncertainty and the parameter uncertainty.

Page 57: Chapter 3 Prediction and model selection

Model selection criteria.

• This expectation can be approximated by

MSE(z_{T+1}) ≈ σ² (1 + p/n)

• An unbiased estimate of σ² is σ̂² n/(n − p). Inserting it, we obtain an estimate of the out-of-sample forecast error; minimizing this value implies that the order p must be chosen by minimizing the FPE criterion.

Page 58: Chapter 3 Prediction and model selection

Model selection criteria.

• The Final Prediction Error (FPE) combines fit with parsimony, due to the penalty introduced by the term (n+p)/(n−p):

FPE = σ̂² (n+p)/(n−p)

Page 59: Chapter 3 Prediction and model selection

Model selection criteria.

• An equivalent form of this criterion is

log FPE = log σ̂² + log(1 + p/n) − log(1 − p/n) ≈ log σ̂² + 2p/n

• Multiplying by n, we obtain the AIC criterion

AIC = n log σ̂² + 2p

– It tends to overestimate the number of parameters.

Page 60: Chapter 3 Prediction and model selection

Model selection criteria.

• Bayesian Information Criterion (BIC):

BIC = n log σ̂² + p log(n)

• In this criterion the penalty is greater than in AIC, so BIC tends to select simpler models.

• In terms of the deviance, for an ARMA(p,q) model,

AIC = deviance + 2(p + q)
BIC = deviance + (p + q) log(n)
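The three criteria can be compared in a short experiment: fit AR(p) by least squares for several p and pick the minimizer. Everything here (the simulated AR(2) data, sample size, and candidate orders) is an illustrative assumption, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
z = np.zeros(n)
for t in range(2, n):                    # simulate an AR(2) series
    z[t] = 0.6 * z[t - 1] - 0.3 * z[t - 2] + rng.standard_normal()

def fit_ar(p):
    # OLS fit of an AR(p); returns the residual variance estimate sigma_hat^2
    Y = z[p:]
    X = np.column_stack([z[p - j:n - j] for j in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    return np.mean(resid ** 2)

crit = {}
for p in range(1, 6):
    s2 = fit_ar(p)
    crit[p] = {"FPE": s2 * (n + p) / (n - p),
               "AIC": n * np.log(s2) + 2 * p,
               "BIC": n * np.log(s2) + p * np.log(n)}

print("order chosen by BIC:", min(crit, key=lambda p: crit[p]["BIC"]))
```

Note the penalty gap: BIC − AIC = p(log n − 2), positive whenever n > e², which is why BIC leans toward simpler models.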