Energy Infrastructure Planning: Forecasting (E4210)
TRANSCRIPT
Who am I?
PhD student in the IEOR department
Advisors: Prof. Vijay Modi and Prof. Garud Iyengar
Research:
Robust control algorithms for solar micro-grids
Control, signal detection, and forecasting methods for managing DR programs
References
Hyndman, R. J. & Athanasopoulos, G. (2013). Forecasting: Principles and Practice. www.otexts.org/fpp/
R package: fpp
Outline
1 Time series in R
2 Simple forecasting methods
3 Measuring forecast accuracy
4 Seasonality and stationarity
5 ARIMA forecasting
6 Exponential smoothing
Time series data
A time series consists of a sequence of observations collected over time.
We will assume the time periods are equally spaced.
Time series examples
Hourly electricity demand
Daily maximum temperature
Weekly wind generation
Monthly rainfall
Forecasting is estimating how the sequence of observations will continue into the future.
Time series in R
Main package used in this course:
> library(fpp)
This loads:
some data for use in examples and exercises
the forecast package (forecasting functions)
the tseries package (a few time series functions)
the fma package (lots of time series data)
the expsmooth package (more time series data)
the lmtest package (some regression functions)
Forecasting using R Time series data 34
Time series in R
Other packages:
> library(xts)
Order time series by timestamp
Nicer plots
Easier time aggregation
Notation
y_t: observed value at time t
ŷ_{T+h|T}: forecast for time T+h, made at time T with historical information up to time T
Some simple forecasting methods
Average method
Forecast of all future values is equal to the mean of the historical data {y_1, ..., y_T}.
Forecasts: ŷ_{T+h|T} = ȳ = (y_1 + ... + y_T)/T
Naïve method (for time series only)
Forecasts equal to the last observed value.
Forecasts: ŷ_{T+h|T} = y_T
A consequence of the efficient market hypothesis.
Seasonal naïve method
Forecasts equal to the last value from the same season.
Forecasts: ŷ_{T+h|T} = y_{T+h−km}, where m = seasonal period and k = ⌊(h−1)/m⌋ + 1
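The three methods above can be sketched in a few lines. This is an illustrative Python sketch, not the fpp/forecast implementation; the function names are our own.

```python
def mean_forecast(y, h):
    """Average method: every forecast equals the historical mean."""
    return [sum(y) / len(y)] * h

def naive_forecast(y, h):
    """Naive method: every forecast equals the last observed value."""
    return [y[-1]] * h

def seasonal_naive_forecast(y, h, m):
    """Seasonal naive: yhat_{T+h} = y_{T+h-km}, k = floor((h-1)/m) + 1."""
    T = len(y)
    return [y[T + step - ((step - 1) // m + 1) * m - 1]  # -1 for 0-based index
            for step in range(1, h + 1)]
```

For the seasonal naive method with m = 4, the forecasts simply replay the last observed year of quarterly values.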
Drift method
Forecasts equal to the last value plus the average historical change.
Forecasts:
ŷ_{T+h|T} = y_T + (h/(T−1)) Σ_{t=2}^{T} (y_t − y_{t−1}) = y_T + (h/(T−1)) (y_T − y_1)
Equivalent to extrapolating the line drawn between the first and last observations.
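The algebra above can be checked in code: the sum of successive changes telescopes, so the average change is exactly the slope of the line through the first and last observations. An illustrative Python sketch (names are ours):

```python
def drift_forecast(y, h):
    """Drift method: last value plus h times the average change.
    The sum of changes telescopes, so the slope is (y_T - y_1)/(T - 1)."""
    T = len(y)
    slope = (y[-1] - y[0]) / (T - 1)
    return [y[-1] + step * slope for step in range(1, h + 1)]
```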
Some simple forecasting methods
Mean: meanf(x, h=20)
Naive: naive(x, h=20) or rwf(x, h=20)
Seasonal naive: snaive(x, h=20)
Drift: rwf(x, drift=TRUE, h=20)
Forecasting residuals
Residuals in forecasting: the difference between an observed value and its forecast based on all previous observations: e_t = y_t − ŷ_{t|t−1}
Assumptions:
1. {e_t} are uncorrelated. If they aren't, there is information left in the residuals that should be used in computing the forecasts.
2. {e_t} have mean zero. If they don't, the forecasts are biased.
Useful properties (for prediction intervals):
3. {e_t} have constant variance.
4. {e_t} are normally distributed.
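A quick check of the first two properties can be sketched as follows (illustrative Python; here the naive method is the forecaster, so e_t = y_t − y_{t−1}):

```python
def naive_residuals(y):
    """One-step residuals of the naive forecast: e_t = y_t - y_{t-1}."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

def residual_checks(e):
    """Return (mean, lag-1 autocorrelation); both should be near zero
    for a well-behaved forecast method."""
    n = len(e)
    m = sum(e) / n
    c0 = sum((x - m) ** 2 for x in e) / n
    c1 = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, n)) / n
    return m, c1 / c0
```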
Measures of forecast accuracy
Let y_t denote the t-th observation and ŷ_{t|t−1} denote its forecast based on all previous data, where t = 1, ..., T. Then the following measures are useful.
MAE = (1/T) Σ_{t=1}^{T} |y_t − ŷ_{t|t−1}|
MSE = (1/T) Σ_{t=1}^{T} (y_t − ŷ_{t|t−1})²
RMSE = √( (1/T) Σ_{t=1}^{T} (y_t − ŷ_{t|t−1})² )
MAPE = (100/T) Σ_{t=1}^{T} |y_t − ŷ_{t|t−1}| / |y_t|
MAE, MSE, and RMSE are all scale dependent.
MAPE is scale independent, but it is only sensible if y_t ≫ 0 for all t and y has a natural zero.
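The measures above translate directly into code. An illustrative Python sketch (y are observations, f the aligned one-step forecasts; names are ours):

```python
def mae(y, f):
    """Mean absolute error of forecasts f against observations y."""
    return sum(abs(a - b) for a, b in zip(y, f)) / len(y)

def rmse(y, f):
    """Root mean squared error."""
    return (sum((a - b) ** 2 for a, b in zip(y, f)) / len(y)) ** 0.5

def mape(y, f):
    """Mean absolute percentage error; only sensible when y_t >> 0."""
    return 100 * sum(abs(a - b) / abs(a) for a, b in zip(y, f)) / len(y)
```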
Measures of forecast accuracy
Mean Absolute Scaled Error:
MASE = (1/T) Σ_{t=1}^{T} |y_t − ŷ_{t|t−1}| / Q
where Q is a stable measure of the scale of the time series {y_t}.
For non-seasonal time series,
Q = (1/(T−1)) Σ_{t=2}^{T} |y_t − y_{t−1}|
works well. Then MASE is equivalent to MAE relative to a naïve method.
For seasonal time series,
Q = (1/(T−m)) Σ_{t=m+1}^{T} |y_t − y_{t−m}|
works well. Then MASE is equivalent to MAE relative to a seasonal naïve method.
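Both cases of MASE fit in one function, since the non-seasonal scaling is the seasonal one with m = 1. An illustrative Python sketch (forecasts are the one-step forecasts ŷ_{t|t−1} aligned with y; names are ours):

```python
def mase(y, forecasts, m=1):
    """MASE: MAE of one-step forecasts, scaled by Q = the in-sample MAE
    of the (seasonal) naive method with period m."""
    T = len(y)
    q = sum(abs(y[t] - y[t - m]) for t in range(m, T)) / (T - m)
    return sum(abs(y[t] - forecasts[t]) for t in range(T)) / (T * q)
```

A MASE below 1 means the method beat the in-sample naive benchmark on average.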
Training and test sets
The available data are split chronologically into a training set (e.g., the first 80%) and a test set (e.g., the last 20%).
The test set must not be used for any aspect of model development or calculation of forecasts.
Forecast accuracy is based only on the test set.
Beware of over-fitting
A model which fits the data well does not necessarily forecast well.
A perfect fit can always be obtained by using a model with enough parameters. (Compare R².)
Over-fitting a model to data is as bad as failing to identify the systematic pattern in the data.
These problems can be overcome by measuring true out-of-sample forecast accuracy: the total data are divided into a "training" set and a "test" set. The training set is used to estimate parameters, forecasts are made for the test set, and accuracy measures are computed for errors on the test set only.
Time series graphics
Time plots (R command: plot or plot.ts)
Seasonal plots (R command: seasonplot)
Seasonal subseries plots (R command: monthplot)
Lag plots (R command: lag.plot)
ACF plots (R command: Acf)
Seasonal plots
Data are plotted against the individual "seasons" in which the data were observed. (In this case a "season" is a month.)
Something like a time plot, except that the data from each season are overlapped.
Enables the underlying seasonal pattern to be seen more clearly, and also allows any substantial departures from the seasonal pattern to be easily identified.
In R: seasonplot
Seasonal subseries plots
Data for each season are collected together in a time plot as separate time series.
Enables the underlying seasonal pattern to be seen clearly, and changes in seasonality over time to be visualized.
In R: monthplot
Time series patterns
Trend pattern: exists when there is a long-term increase or decrease in the data.
Seasonal pattern: exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or the day of the week).
Cyclic pattern: exists when the data exhibit rises and falls that are not of fixed period (duration usually of at least 2 years).
Seasonal or cyclic?
Differences between seasonal and cyclic patterns:
A seasonal pattern has constant length; a cyclic pattern has variable length.
The average length of a cycle is longer than the length of a seasonal pattern.
The magnitude of a cycle is more variable than the magnitude of a seasonal pattern.
The timing of peaks and troughs is predictable with seasonal data, but unpredictable in the long term with cyclic data.
Time series patterns
[Figure: Australian electricity production (GWh), 1980–1995]
Time series patterns
[Figure: Australian clay brick production (million units), 1960–1990]
Stationarity
Definition: If {y_t} is a stationary time series, then for all s, the distribution of (y_t, ..., y_{t+s}) does not depend on t.
A stationary series is roughly horizontal, has constant variance, and has no patterns that are predictable in the long term.
Stationary?
[Figure: Dow-Jones index, daily, over 300 days]
Stationary?
[Figure: daily change in the Dow-Jones index over 300 days]
Stationarity
Transformations help to stabilize the variance.
For ARIMA modelling, we also need to stabilize the mean.
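The standard device for stabilizing the mean (the "I" in ARIMA, not shown explicitly on this slide) is differencing. A minimal sketch in Python, with our own function name:

```python
def difference(y, lag=1):
    """Differenced series y'_t = y_t - y_{t-lag}; lag=1 removes a trend,
    lag=m removes a seasonal pattern of period m."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]
```

For a quadratic trend, one difference leaves a linear trend and a second difference leaves a constant series.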
Non-stationarity in the mean
Identifying non-stationary series:
the time plot
the ACF of stationary data drops to zero relatively quickly
the ACF of non-stationary data decreases slowly
for non-stationary data, the value of r_1 is often large and positive
Autocorrelation
Covariance and correlation measure the extent of the linear relationship between two variables (y and X).
Autocovariance and autocorrelation measure the linear relationship between lagged values of a time series y.
We measure the relationship between y_t and y_{t−1}, y_t and y_{t−2}, y_t and y_{t−3}, etc.
Autocorrelation
We denote the sample autocovariance at lag k by c_k and the sample autocorrelation at lag k by r_k. Then define
c_k = (1/T) Σ_{t=k+1}^{T} (y_t − ȳ)(y_{t−k} − ȳ)
and r_k = c_k / c_0.
r_1 indicates how successive values of y relate to each other.
r_2 indicates how y values two periods apart relate to each other.
r_k is almost the same as the sample correlation between y_t and y_{t−k}.
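The formula above can be sketched directly (illustrative Python; the function name is ours, and indices are 0-based in code versus 1-based on the slide):

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations r_1..r_max_lag, with
    c_k = (1/T) * sum_{t=k+1..T} (y_t - ybar)(y_{t-k} - ybar), r_k = c_k/c_0."""
    T = len(y)
    ybar = sum(y) / T
    def c(k):
        return sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, T)) / T
    c0 = c(0)
    return [c(k) / c0 for k in range(1, max_lag + 1)]
```

An alternating series illustrates the sign pattern: adjacent values anti-correlate (r_1 < 0) while values two apart correlate positively (r_2 > 0).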
Recognizing seasonality in a time series
If there is seasonality, the ACF at the seasonal lag (e.g., 12 for monthly data) will be large and positive.
For seasonal monthly data, a large ACF value will be seen at lag 12 and possibly also at lags 24, 36, ...
For seasonal quarterly data, a large ACF value will be seen at lag 4 and possibly also at lags 8, 12, ...
Example: White noise
Forecasting using R White noise 5
White noise
Time
x
0 10 20 30 40 50
−3
−2
−1
01
2
Example: White noise
Forecasting using R White noise 5
White noise
Time
x
0 10 20 30 40 50
−3
−2
−1
01
2
White noise data is uncorrelated acrosstime with zero mean and constant variance.(Technically, we require independence aswell.)
Example: White noise
Forecasting using R White noise 5
[Time plot of a simulated white noise series, 50 observations]
White noise data is uncorrelated across time with zero mean and constant variance. (Technically, we require independence as well.)
Think of white noise as completely uninteresting with no predictable patterns.
Example: White noise
r1 = 0.013, r2 = −0.163, r3 = 0.163, r4 = −0.259, r5 = −0.198, r6 = 0.064, r7 = −0.139, r8 = −0.032, r9 = 0.199, r10 = −0.240
Sample autocorrelations for a white noise series. For uncorrelated data, we would expect each autocorrelation to be close to zero.
Forecasting using R White noise 6
[ACF plot of the white noise series, lags 1–15]
Sampling distribution of autocorrelations
Sampling distribution of rk for white noise data is asymptotically N(0, 1/T).
95% of all rk for white noise must lie within ±1.96/√T.
If this is not the case, the series is probably not white noise.
Common to plot lines at ±1.96/√T when plotting the ACF. These are the critical values.
Forecasting using R White noise 7
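The critical-value check above can be sketched in a few lines of Python (the slides use R; the rk values below are the ones quoted on the earlier slide):

```python
# White-noise check: do all sample autocorrelations lie within
# the critical limits +/- 1.96 / sqrt(T)?
import math

def wn_limit(T):
    return 1.96 / math.sqrt(T)

limit = wn_limit(50)  # roughly 0.277 for T = 50

# First five sample autocorrelations from the slide's example.
r = [0.013, -0.163, 0.163, -0.259, -0.198]
outside = [k + 1 for k, rk in enumerate(r) if abs(rk) > limit]
# An empty `outside` list means the data are consistent with white noise.
```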
Autocorrelation
Forecasting using R White noise 8
[ACF plot of the white noise series, lags 1–15]
Example: T = 50, so the critical values are at ±1.96/√50 = ±0.28. All autocorrelation coefficients lie within these limits, confirming that the data are white noise. (More precisely, the data cannot be distinguished from white noise.)
ACF of residuals
We assume that the residuals are white noise (uncorrelated, mean zero, constant variance). If they aren't, then there is information left in the residuals that should be used in computing forecasts.
So a standard residual diagnostic is to check the ACF of the residuals of a forecasting method.
We expect these to look like white noise.
Dow-Jones naive forecasts revisited
ŷt|t−1 = yt−1
et = yt − yt−1
Forecasting using R White noise 12
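Computing the naive-forecast residuals is a one-liner; the diagnostic is then to run the white-noise check on them. A small Python sketch (the slides use R; the series below is made up):

```python
# Naive forecast: yhat_{t|t-1} = y_{t-1}, so the residual is
# e_t = y_t - y_{t-1}. These residuals should look like white noise.

y = [100.0, 102.0, 101.0, 104.0, 103.0, 105.0]  # illustrative series
e = [y[t] - y[t - 1] for t in range(1, len(y))]
# Next step (not shown): compute the ACF of e and compare each r_k
# against the critical values +/- 1.96 / sqrt(len(e)).
```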
Non-stationarity in the mean
Identifying non-stationary series
Look at the time plot.
The ACF of stationary data drops to zero relatively quickly.
The ACF of non-stationary data decreases slowly.
For non-stationary data, the value of r1 is often large and positive.
Forecasting using R Stationarity 14
Differencing
Differencing helps to stabilize the mean.
The differenced series is the change between consecutive observations in the original series: y′t = yt − yt−1.
The differenced series will have only T − 1 values since it is not possible to calculate a difference y′1 for the first observation.
Forecasting using R Ordinary differencing 20
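First differencing is worth seeing on a trending series, where it removes the trend entirely. A minimal Python sketch (in R this is simply `diff(y)`; the data here are illustrative):

```python
def difference(y):
    # y'_t = y_t - y_{t-1}; the result has T - 1 values.
    return [y[t] - y[t - 1] for t in range(1, len(y))]

y = [1.0, 3.0, 5.0, 7.0, 9.0]  # linear trend
dy = difference(y)             # constant series: the trend is gone
```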
Second-order differencing
Occasionally the differenced data will not appear stationary and it may be necessary to difference the data a second time:
y″t = y′t − y′t−1 = (yt − yt−1) − (yt−1 − yt−2) = yt − 2yt−1 + yt−2.
y″t will have T − 2 values.
In practice, it is almost never necessary to go beyond second-order differences.
Forecasting using R Ordinary differencing 24
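The identity y″t = yt − 2yt−1 + yt−2 can be verified by differencing twice. A short Python check (illustrative data; in R, `diff(y, differences = 2)` does the same):

```python
def difference(y):
    return [y[t] - y[t - 1] for t in range(1, len(y))]

y = [2.0, 3.0, 7.0, 6.0, 10.0]
ddy = difference(difference(y))  # difference the data a second time
# Direct form of the same identity:
direct = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]
```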
Seasonal differencing
A seasonal difference is the difference between an observation and the corresponding observation from the previous year:
y′t = yt − yt−m
where m = number of seasons.
For monthly data m = 12.
For quarterly data m = 4.
Forecasting using R Seasonal differencing 26
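A seasonal difference compares each observation with the same season a year earlier. A small Python sketch with two "years" of quarterly data (m = 4; in R, `diff(y, lag = 4)`; values are illustrative):

```python
def seasonal_difference(y, m):
    # y'_t = y_t - y_{t-m}; e.g. m = 12 for monthly, m = 4 for quarterly.
    return [y[t] - y[t - m] for t in range(m, len(y))]

y = [10.0, 20.0, 30.0, 40.0,   # year 1
     12.0, 22.0, 33.0, 44.0]   # year 2
sd = seasonal_difference(y, 4)  # year-on-year changes per quarter
```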
Seasonal differencing
When both seasonal and first differences are applied...
it makes no difference which is done first; the result will be the same.
If seasonality is strong, we recommend that seasonal differencing be done first, because sometimes the resulting series will be stationary and there will be no need for a further first difference.
It is important that if differencing is used, the differences are interpretable.
Forecasting using R Seasonal differencing 35
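The order-invariance claim is easy to check numerically. A small Python sketch (m = 4, illustrative data):

```python
# Applying the seasonal difference and the first difference in either
# order gives the same series.

def difference(y):
    return [y[t] - y[t - 1] for t in range(1, len(y))]

def seasonal_difference(y, m):
    return [y[t] - y[t - m] for t in range(m, len(y))]

y = [5.0, 8.0, 6.0, 9.0, 7.0, 11.0, 8.0, 13.0, 9.0, 14.0]
a = difference(seasonal_difference(y, 4))  # seasonal first
b = seasonal_difference(difference(y), 4)  # first difference first
```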
Interpretation of differencing
first differences are the change between one observation and the next;
seasonal differences are the change from one year to the next.
But taking lag 3 differences for yearly data, for example, results in a model which cannot be sensibly interpreted.
Forecasting using R Seasonal differencing 36
Outline
1 Time series in R
2 Simple forecasting methods
3 Measuring forecast accuracy
4 Seasonality and stationarity
5 ARIMA forecasting
6 Exponential smoothing
Autoregressive models
Autoregressive (AR) models:
yt = c + φ1yt−1 + φ2yt−2 + · · · + φpyt−p + et,
where et is white noise. This is a multiple regression with lagged values of yt as predictors.
Forecasting using R Non-seasonal ARIMA models 3
[Simulated AR(1) and AR(2) series, 100 observations each]
AR(1) model
yt = c + φ1yt−1 + et
When φ1 = 0, yt is equivalent to white noise.
When φ1 = 1 and c = 0, yt is equivalent to a random walk.
When φ1 = 1 and c ≠ 0, yt is equivalent to a random walk with drift.
When φ1 < 0, yt tends to oscillate between positive and negative values.
Forecasting using R Non-seasonal ARIMA models 4
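These special cases can be explored by simulating the AR(1) recursion directly. A minimal Python sketch (the slides use R, e.g. `arima.sim`; parameters and seed here are illustrative):

```python
# Simulate y_t = c + phi1 * y_{t-1} + e_t with Gaussian white-noise e_t.
# phi1 = 0 gives white noise; phi1 = 1, c = 0 gives a random walk;
# phi1 < 0 produces a series that tends to oscillate in sign.
import random

def simulate_ar1(c, phi1, n, seed=1):
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(c + phi1 * y[-1] + rng.gauss(0.0, 1.0))
    return y

oscillating = simulate_ar1(c=0.0, phi1=-0.8, n=200)
random_walk = simulate_ar1(c=0.0, phi1=1.0, n=200)
```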
Moving Average (MA) models
Moving Average (MA) models:
yt = c + et + θ1et−1 + θ2et−2 + · · · + θqet−q,
where et is white noise. This is a multiple regression with past errors as predictors. Don't confuse this with moving average smoothing!
Forecasting using R Non-seasonal ARIMA models 5
[Simulated MA(1) and MA(2) series, 100 observations each]
ARIMA models
Autoregressive Moving Average models:
yt = c + φ1yt−1 + · · · + φpyt−p + θ1et−1 + · · · + θqet−q + et.
Predictors include both lagged values of yt and lagged errors.
ARMA models can be used for a huge range of stationary time series.
They model the short-term dynamics.
An ARMA model applied to differenced data is an ARIMA model.
Forecasting using R Non-seasonal ARIMA models 6
ARIMA models
Autoregressive Integrated Moving Average models
ARIMA(p,d,q) model
AR: p = order of the autoregressive part
I: d = degree of first differencing involved
MA: q = order of the moving average part.
White noise model: ARIMA(0,0,0)
Random walk: ARIMA(0,1,0) with no constant
Random walk with drift: ARIMA(0,1,0) with constant
AR(p): ARIMA(p,0,0)
MA(q): ARIMA(0,0,q)
Forecasting using R Non-seasonal ARIMA models 7
Understanding ARIMA models
If c = 0 and d = 0, the long-term forecasts will go to zero.
If c = 0 and d = 1, the long-term forecasts will go to a non-zero constant.
If c = 0 and d = 2, the long-term forecasts will follow a straight line.
If c ≠ 0 and d = 0, the long-term forecasts will go to the mean of the data.
If c ≠ 0 and d = 1, the long-term forecasts will follow a straight line.
If c ≠ 0 and d = 2, the long-term forecasts will follow a quadratic trend.
Forecasting using R Non-seasonal ARIMA models 11
ACF and PACF plots
Recall that the k-th autocorrelation rk measures the linear relationship between yt and yt−k.
Now, if yt and yt−1 are correlated, then yt−1 and yt−2 must also be correlated; but then yt and yt−2 may be correlated simply because both are connected to yt−1.
What is the correlation between yt and yt−2 after removing the correlation between yt and yt−1?
αk: the k-th partial autocorrelation
αk: the linear relationship between yt and yt−k after removing the effects of lags 1, 2, . . . , k − 1
αk = the estimate of φk in the autoregression model
yt = c + φ1yt−1 + φ2yt−2 + . . . + φkyt−k + et
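For k = 2, the partial autocorrelation has a closed form obtained by solving the Yule-Walker equations of the AR(2) regression above: α2 = (r2 − r1²)/(1 − r1²). A minimal Python check (in R, `pacf(y)` computes all αk; the r values here are illustrative):

```python
# Lag-2 partial autocorrelation from r1 and r2: the estimate of phi_2
# in an AR(2) fit, i.e. the relationship between y_t and y_{t-2}
# after removing the effect of y_{t-1}.

def partial_autocorr_lag2(r1, r2):
    return (r2 - r1 ** 2) / (1 - r1 ** 2)

# For an AR(1) process with phi_1 = 0.8, r1 = 0.8 and r2 = 0.64,
# so the lag-2 partial autocorrelation vanishes, as expected.
alpha2 = partial_autocorr_lag2(0.8, 0.64)
```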
ACF and PACF plots
If data follow an ARIMA(p, d, 0) or an ARIMA(0, d, q) model, ACF and PACF plots can help to determine the value of p or q.
If both p and q are positive, ACF and PACF plots are not helpful.
Data may follow an ARIMA(p, d, 0) model if
the ACF is exponentially decaying or sinusoidal;
there is a significant spike at lag p in the PACF, but none beyond lag p.
Data may follow an ARIMA(0, d, q) model if
the PACF is exponentially decaying or sinusoidal;
there is a significant spike at lag q in the ACF, but none beyond lag q.
Akaike's Information Criterion
AIC = −2 log(Likelihood) + 2p
where p is the number of estimated parameters in the model.
Minimizing the AIC gives the best model for prediction.
AIC corrected (for small-sample bias):
AICc = AIC + 2(p + 1)(p + 2)/(n − p)
Schwarz's Bayesian IC:
BIC = AIC + p(log(n) − 2)
Forecasting using R Exponential smoothing state space models 18
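Given the AIC, the corrected and Bayesian criteria above are simple arithmetic. A Python sketch mirroring the slide's formulas (p = number of estimated parameters, n = number of observations; values below are made up):

```python
# AICc and BIC from AIC, using the formulas on the slide.
import math

def aicc(aic, p, n):
    return aic + 2 * (p + 1) * (p + 2) / (n - p)

def bic(aic, p, n):
    return aic + p * (math.log(n) - 2)

# Illustrative comparison for a model with p = 3 parameters, n = 50:
a = aicc(100.0, 3, 50)
b = bic(100.0, 3, 50)
```

Remember that these values only mean something relative to another model fitted to the same data set.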
Akaike’s Information Criterion
Value of AIC/AICc/BIC given in the R output.
AIC does not have much meaning by itself. It is only useful in comparison to the AIC value for another model fitted to the same data set.
Consider several models with AIC values close to the minimum.
A difference in AIC values of 2 or less is not regarded as substantial and you may choose the simpler but non-optimal model.
AIC can be negative.
Forecasting using R Exponential smoothing state space models 19
Backshift notation
A very useful notational device is the backward shift operator, B, which is used as follows:
Byt = yt−1.
In other words, B, operating on yt, has the effect of shifting the data back one period. Two applications of B to yt shift the data back two periods:
B(Byt) = B2yt = yt−2.
For monthly data, if we wish to shift attention to "the same month last year," then B12 is used, and the notation is B12yt = yt−12.
Forecasting using R Backshift notation 3
Backshift notation

First difference: 1 − B.
Double difference: (1 − B)^2.
dth-order difference: (1 − B)^d y_t.
Seasonal difference: 1 − B^m.
Seasonal difference followed by a first difference: (1 − B)(1 − B^m).

Multiply terms together to see the combined effect:

(1 − B)(1 − B^m) y_t = (1 − B − B^m + B^{m+1}) y_t = y_t − y_{t-1} − y_{t-m} + y_{t-m-1}.
Forecasting using R Backshift notation 4
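The combined-difference identity can be checked numerically with base R's `diff()`; the series and the period m = 12 below are arbitrary illustrative choices.

```r
# Check (1 - B)(1 - B^m) y_t = y_t - y_{t-1} - y_{t-m} + y_{t-m-1} numerically.
set.seed(1)
y <- cumsum(rnorm(48))           # any series will do
m <- 12                          # seasonal period (monthly data)
d1 <- diff(diff(y, lag = m))     # seasonal difference, then first difference
t <- (m + 2):length(y)
d2 <- y[t] - y[t - 1] - y[t - m] + y[t - m - 1]
all.equal(d1, d2)                # TRUE: the two computations agree
```

Note the order of operations does not matter here: differencing operators commute, so a first difference followed by a seasonal difference gives the same result.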
Backshift notation for ARIMA

ARMA model:

y_t = c + φ_1 y_{t-1} + ··· + φ_p y_{t-p} + e_t + θ_1 e_{t-1} + ··· + θ_q e_{t-q}
    = c + φ_1 B y_t + ··· + φ_p B^p y_t + e_t + θ_1 B e_t + ··· + θ_q B^q e_t

φ(B) y_t = c + θ(B) e_t

where φ(B) = 1 − φ_1 B − ··· − φ_p B^p
and θ(B) = 1 + θ_1 B + ··· + θ_q B^q.

ARIMA(1,1,1) model:

(1 − φ_1 B)(1 − B) y_t = c + (1 + θ_1 B) e_t

Here (1 − φ_1 B) is the AR(1) part, (1 − B) is the first difference, and (1 + θ_1 B) is the MA(1) part.
Forecasting using R Backshift notation 5
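An ARIMA(1,1,1) model can be fitted with base R's `arima()`; the simulated series below (true φ_1 = 0.5, θ_1 = 0.3) is only an illustration. Note that base `arima()` drops the constant c when d ≥ 1.

```r
# Simulate and fit an ARIMA(1,1,1); the parameter values are illustrative.
set.seed(42)
y <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.5, ma = 0.3), n = 200)
fit <- arima(y, order = c(1, 1, 1))
coef(fit)    # estimates of phi_1 ("ar1") and theta_1 ("ma1")
```

The `order = c(p, d, q)` argument mirrors the (p,d,q) notation on the slides.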
Seasonal ARIMA models

ARIMA (p,d,q) (P,D,Q)_m

where (p,d,q) is the non-seasonal part of the model, (P,D,Q) is the seasonal part of the model, and m = number of periods per season.
Forecasting using R Seasonal ARIMA models 7
Seasonal ARIMA models

E.g., ARIMA(1,1,1)(1,1,1)_4 model (without constant):

(1 − φ_1 B)(1 − Φ_1 B^4)(1 − B)(1 − B^4) y_t = (1 + θ_1 B)(1 + Θ_1 B^4) e_t.

Here (1 − φ_1 B) is the non-seasonal AR(1), (1 − Φ_1 B^4) the seasonal AR(1), (1 − B) the non-seasonal difference, (1 − B^4) the seasonal difference, (1 + θ_1 B) the non-seasonal MA(1), and (1 + Θ_1 B^4) the seasonal MA(1).
Forecasting using R Seasonal ARIMA models 8
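Base R's `arima()` accepts the seasonal part via its `seasonal` argument. As a sketch, here is the classic monthly "airline" model ARIMA(0,1,1)(0,1,1)_12 fitted to the built-in `AirPassengers` series (a different order and period than the quarterly example above, chosen because it is a standard, well-behaved fit).

```r
# Seasonal ARIMA fit with base R: the "airline" model on log(AirPassengers).
fit <- arima(log(AirPassengers), order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
coef(fit)    # "ma1" is the non-seasonal MA(1), "sma1" the seasonal MA(1)
```

The `seasonal = list(order = c(P, D, Q), period = m)` argument mirrors the (P,D,Q)_m notation.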
ACF and PACF plots

The seasonal part of an AR or MA model will be seen in the seasonal lags of the ACF and PACF.

An ARIMA(0,0,0)(1,0,0)_12 model will show:
Exponential decay in the seasonal lags of the ACF: 12, 24, 36, ...
A single significant spike at lag 12 in the PACF.

An ARIMA(0,0,0)(0,0,1)_12 model will show:
Exponential decay in the seasonal lags of the PACF: 12, 24, 36, ...
A single significant spike at lag 12 in the ACF.
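These signatures can be reproduced by simulation; the sketch below builds a seasonal AR(1) at period 12 (y_t = 0.8 y_{t-12} + e_t, i.e., an ARIMA(0,0,0)(1,0,0)_12 with an illustrative coefficient of 0.8) and inspects the PACF.

```r
# Seasonal AR(1): expect a single large PACF spike at lag 12.
set.seed(123)
y <- arima.sim(model = list(ar = c(rep(0, 11), 0.8)), n = 500)
p <- pacf(y, lag.max = 36, plot = FALSE)
p$acf[12]    # the sample PACF at lag 12, well outside +/- 2/sqrt(500)
```

Calling `pacf(y, lag.max = 36)` without `plot = FALSE` draws the plot, where the lag-12 spike and the near-zero values elsewhere are visible directly.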
Regression with ARIMA errors

Regression models:

y_t = b_0 + b_1 x_{1,t} + ··· + b_k x_{k,t} + n_t

y_t is modeled as a function of k explanatory variables x_{1,t}, ..., x_{k,t}.
Usually, we assume that n_t is white noise.
Now we want to allow n_t to be autocorrelated.

Example: n_t ~ ARIMA(1,1,1)

y_t = b_0 + b_1 x_{1,t} + ··· + b_k x_{k,t} + n_t
where (1 − φ_1 B)(1 − B) n_t = (1 + θ_1 B) e_t
and e_t is white noise.
Forecasting using R Regression with ARIMA errors 3
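Base R's `arima()` fits regression with ARMA errors through its `xreg` argument. The sketch below uses simulated data with AR(1) errors; the true intercept 2, slope 3, and AR coefficient 0.7 are illustrative values.

```r
# Regression with AR(1) errors: y_t = 2 + 3 x_t + n_t, n_t autocorrelated.
set.seed(7)
x <- rnorm(200)
n <- arima.sim(model = list(ar = 0.7), n = 200)   # autocorrelated errors n_t
y <- 2 + 3 * x + n
fit <- arima(y, order = c(1, 0, 0), xreg = x)
coef(fit)    # "ar1", the intercept, and the coefficient on x
```

Fitting the same data with `lm(y ~ x)` would give similar point estimates but understated standard errors, which is exactly the problem described in the modeling-procedure slide below.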
Residuals and errors

Example: n_t ~ ARIMA(1,1,1)

y_t = b_0 + b_1 x_{1,t} + ··· + b_k x_{k,t} + n_t
where (1 − φ_1 B)(1 − B) n_t = (1 + θ_1 B) e_t

Be careful to distinguish n_t from e_t: n_t are the "errors" and e_t are the "residuals". In ordinary regression, n_t is assumed to be white noise, and so n_t = e_t.

After differencing all variables:

y'_t = b_1 x'_{1,t} + ··· + b_k x'_{k,t} + n'_t.

Now a regression with ARMA(1,1) errors.
Forecasting using R Regression with ARIMA errors 4
Regression with ARIMA errors

Any regression with an ARIMA error can be rewritten as a regression with an ARMA error by differencing all variables with the same differencing operator as in the ARIMA model.

Original data:

y_t = b_0 + b_1 x_{1,t} + ··· + b_k x_{k,t} + n_t
where φ(B)(1 − B)^d n_t = θ(B) e_t

After differencing all variables:

y'_t = b_1 x'_{1,t} + ··· + b_k x'_{k,t} + n'_t
where φ(B) n'_t = θ(B) e_t
and y'_t = (1 − B)^d y_t, etc.
Forecasting using R Regression with ARIMA errors 5
Modeling procedure

Problems with OLS and autocorrelated errors:

1 OLS is no longer the best way to compute coefficients, as it does not take account of time relationships in the data.
2 Standard errors of coefficients are incorrect — most likely too small. This invalidates tests and prediction intervals.

The second problem is more serious because it can lead to misleading results. If standard errors obtained using OLS are too small, some explanatory variables may appear to be significant when, in fact, they are not. This is known as "spurious regression."
Forecasting using R Regression with ARIMA errors 6
Modeling procedure

Estimation only works when all predictor variables are deterministic or stationary and the errors are stationary.
So difference stochastic variables as required until all variables appear stationary. Then fit a model with ARMA errors.
auto.arima() will handle order selection and differencing (but only checks that the errors are stationary).
Forecasting using R Regression with ARIMA errors 7
Outline
1 Time series in R
2 Simple forecasting methods
3 Measuring forecast accuracy
4 Seasonality and stationarity
5 ARIMA forecasting
6 Exponential smoothing
Time series decomposition

Y_t = f(S_t, T_t, E_t)

where Y_t = data at period t,
S_t = seasonal component at period t,
T_t = trend-cycle component at period t,
E_t = remainder (or irregular or error) component at period t.

Additive decomposition: Y_t = S_t + T_t + E_t.
Multiplicative decomposition: Y_t = S_t × T_t × E_t.
Forecasting using R Time series decomposition 21
Time series decomposition

An additive model is appropriate if the magnitude of the seasonal fluctuations does not vary with the level of the series.
If the seasonal fluctuations are proportional to the level of the series, then a multiplicative model is appropriate.
Multiplicative decomposition is more prevalent with economic series.
Logs turn a multiplicative relationship into an additive relationship:

Y_t = S_t × T_t × E_t  ⇒  log Y_t = log S_t + log T_t + log E_t.
Forecasting using R Time series decomposition 22
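The log identity can be checked directly; the components below are arbitrary positive series constructed purely for illustration.

```r
# Logs turn a multiplicative decomposition into an additive one.
set.seed(4)
St <- rep(c(1.2, 0.8, 1.1, 0.9), 6)     # seasonal component (period 4)
Tt <- seq(10, 20, length.out = 24)      # trend-cycle component
Et <- exp(rnorm(24, sd = 0.05))         # remainder, centred near 1
Yt <- St * Tt * Et
all.equal(log(Yt), log(St) + log(Tt) + log(Et))   # TRUE
```

This is why an additive decomposition of log(Y_t) is equivalent to a multiplicative decomposition of Y_t.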
Seasonal adjustment

A useful by-product of decomposition: an easy way to calculate seasonally adjusted data.
Additive decomposition: seasonally adjusted data given by

Y_t − S_t = T_t + E_t

Multiplicative decomposition: seasonally adjusted data given by

Y_t / S_t = T_t × E_t
Forecasting using R Seasonal adjustment 42
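In base R, a classical decomposition and the corresponding seasonal adjustment take two lines; the monthly `co2` series from the built-in datasets package is used here as an example.

```r
# Classical additive decomposition and seasonal adjustment with base R.
d <- decompose(co2)              # additive by default; d$seasonal is S_t
adj <- co2 - d$seasonal          # seasonally adjusted data: T_t + E_t
```

For a multiplicative series, use `decompose(x, type = "multiplicative")` and divide: `x / d$seasonal`.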
Forecasting and decomposition

Forecast the seasonal component by repeating the last year.
Forecast the seasonally adjusted data using a non-seasonal time series method, e.g.:
Holt's method — next topic
Random walk with drift model
Combine forecasts of the seasonal component with forecasts of the seasonally adjusted data to get forecasts of the original data.
Sometimes a decomposition is useful just for understanding the data before building a separate forecasting model.
Forecasting using R Forecasting and decomposition 46
Simple methods

Random walk forecasts:

ŷ_{T+1|T} = y_T

Average forecasts:

ŷ_{T+1|T} = (1/T) Σ_{t=1}^{T} y_t

We want something in between that weights the most recent data more highly.
Simple exponential smoothing uses a weighted moving average with weights that decrease exponentially.
Forecasting using R Simple exponential smoothing 3
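The two benchmark forecasts are one-liners in R; the series below is an arbitrary illustration.

```r
# Random walk and average forecasts of y_{T+1}.
set.seed(1)
y <- cumsum(rnorm(30))
rw_fc   <- tail(y, 1)    # random walk: last observation
mean_fc <- mean(y)       # average: mean of all observations
```

Simple exponential smoothing, introduced next, interpolates between these two extremes: α = 1 recovers the random walk forecast, while α near 0 approaches an equally weighted average.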
Simple Exponential Smoothing

Forecast equation:

ŷ_{T+1|T} = α y_T + α(1−α) y_{T−1} + α(1−α)^2 y_{T−2} + ···,

where 0 ≤ α ≤ 1.

Weights assigned to observations:

Observation | α = 0.2      | α = 0.4      | α = 0.6      | α = 0.8
y_T         | 0.2          | 0.4          | 0.6          | 0.8
y_{T−1}     | 0.16         | 0.24         | 0.24         | 0.16
y_{T−2}     | 0.128        | 0.144        | 0.096        | 0.032
y_{T−3}     | 0.1024       | 0.0864       | 0.0384       | 0.0064
y_{T−4}     | (0.2)(0.8)^4 | (0.4)(0.6)^4 | (0.6)(0.4)^4 | (0.8)(0.2)^4
y_{T−5}     | (0.2)(0.8)^5 | (0.4)(0.6)^5 | (0.6)(0.4)^5 | (0.8)(0.2)^5
Forecasting using R Simple exponential smoothing 4
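One column of the weights table can be reproduced directly: the weight on y_{T−j} is α(1−α)^j.

```r
# Weights on y_T, y_{T-1}, y_{T-2}, y_{T-3} for alpha = 0.2.
alpha <- 0.2
w <- alpha * (1 - alpha)^(0:3)
w    # 0.2000 0.1600 0.1280 0.1024, matching the first column of the table
```

Larger α puts more weight on recent observations; smaller α spreads the weight further into the past.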
Simple Exponential Smoothing

Weighted average form:

ŷ_{t+1|t} = α y_t + (1−α) ŷ_{t|t−1}

for t = 1, ..., T, where 0 ≤ α ≤ 1 is the smoothing parameter.

The process has to start somewhere, so we let the first forecast of y_1 be denoted by ℓ_0. Then

ŷ_{2|1} = α y_1 + (1−α) ℓ_0
ŷ_{3|2} = α y_2 + (1−α) ŷ_{2|1}
ŷ_{4|3} = α y_3 + (1−α) ŷ_{3|2}
...
ŷ_{T+1|T} = α y_T + (1−α) ŷ_{T|T−1}
Forecasting using R Simple exponential smoothing 5
Simple Exponential Smoothing

ŷ_{t+1|t} = α y_t + (1−α) ŷ_{t|t−1}

Substituting each equation into the following equation:

ŷ_{3|2} = α y_2 + (1−α) ŷ_{2|1}
        = α y_2 + (1−α)[α y_1 + (1−α) ℓ_0]
        = α y_2 + α(1−α) y_1 + (1−α)^2 ℓ_0

ŷ_{4|3} = α y_3 + (1−α)[α y_2 + α(1−α) y_1 + (1−α)^2 ℓ_0]
        = α y_3 + α(1−α) y_2 + α(1−α)^2 y_1 + (1−α)^3 ℓ_0
...
ŷ_{T+1|T} = α y_T + α(1−α) y_{T−1} + α(1−α)^2 y_{T−2} + ··· + (1−α)^T ℓ_0

Exponentially weighted average:

ŷ_{T+1|T} = Σ_{j=0}^{T−1} α(1−α)^j y_{T−j} + (1−α)^T ℓ_0
Forecasting using R Simple exponential smoothing 6
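The recursion and the exponentially weighted average are two forms of the same forecast, which can be verified numerically; α = 0.3 and ℓ_0 = y_1 below are arbitrary choices.

```r
# Recursive form vs. exponentially weighted average: same one-step forecast.
set.seed(2)
y <- rnorm(20, mean = 10)
alpha <- 0.3
l0 <- y[1]
f <- l0
for (t in seq_along(y)) f <- alpha * y[t] + (1 - alpha) * f   # recursion
Tn <- length(y)
f2 <- sum(alpha * (1 - alpha)^(0:(Tn - 1)) * rev(y)) + (1 - alpha)^Tn * l0
all.equal(f, f2)    # TRUE
```

The recursive form is what one implements in practice: it needs only the previous forecast, not the full history.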
Simple exponential smoothing

Initialization

The last term in the weighted moving average is (1−α)^T ℓ_0.
So the value of ℓ_0 plays a role in all subsequent forecasts.
Its weight is small unless α is close to zero or T is small.
It is common to set ℓ_0 = y_1, but it is better to treat it as a parameter to be estimated, along with α.
Forecasting using R Simple exponential smoothing 7
Simple exponential smoothing

Optimization

We can choose α and ℓ_0 by minimizing the MSE:

MSE = (1/(T−1)) Σ_{t=2}^{T} (y_t − ŷ_{t|t−1})^2

Unlike regression, there is no closed-form solution — use numerical optimization.
Forecasting using R Simple exponential smoothing 10
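A minimal sketch of this optimization with base R's `optim()`, estimating α and ℓ_0 jointly by minimizing the MSE above; the simulated series is illustrative, and this is not the implementation used by the fpp/forecast packages.

```r
# Choose alpha and l0 by numerically minimizing MSE = (1/(T-1)) sum_{t=2}^T errors^2.
set.seed(3)
y <- 10 + arima.sim(model = list(ar = 0.5), n = 60)
ses_mse <- function(par, y) {
  alpha <- par[1]; l0 <- par[2]
  f <- l0                                   # one-step forecast of y[1]
  sse <- 0
  for (t in seq_along(y)) {
    if (t >= 2) sse <- sse + (y[t] - f)^2   # sum runs over t = 2, ..., T
    f <- alpha * y[t] + (1 - alpha) * f     # update to forecast of y[t+1]
  }
  sse / (length(y) - 1)
}
opt <- optim(c(0.5, y[1]), ses_mse, y = y, method = "L-BFGS-B",
             lower = c(0, -Inf), upper = c(1, Inf))
opt$par    # estimated alpha and l0
```

Base R's `HoltWinters(y, beta = FALSE, gamma = FALSE)` performs an equivalent SES fit, though it initializes the level from the data rather than optimizing ℓ_0.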
Simple exponential smoothing

Multi-step forecasts:

ŷ_{T+h|T} = ŷ_{T+1|T},  h = 2, 3, ...

A "flat" forecast function.
Remember, a forecast is an estimated mean of a future value.
So with no trend, no seasonality, and no other patterns, the forecasts are constant.
Forecasting using R Simple exponential smoothing 13
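The flat forecast function can be seen with base R's `HoltWinters()`, which reduces to simple exponential smoothing when its trend and seasonal terms are switched off; the built-in `lh` series is used as an example.

```r
# SES via HoltWinters (trend and seasonality disabled): flat multi-step forecasts.
fit <- HoltWinters(lh, beta = FALSE, gamma = FALSE)
fc <- predict(fit, n.ahead = 5)
fc    # all five forecasts equal the one-step forecast
```

All horizons share the same point forecast, although the prediction intervals (not shown) widen with h.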