Part II – TIME SERIES ANALYSIS
C3 Exponential Smoothing Methods
© Angel A. Juan & Carles Serrat - UPC 2007/2008
2.3.1: Simple TS and Smoothing Methods
The simple forecasting and smoothing methods model components in a series that are usually easy to see in a time series plot of the data.
These methods decompose the data into its trend and seasonal components, and then extend the estimates of the components into the future to provide forecasts.
Static methods have patterns that do not change over time; dynamic methods have patterns that do change over time and estimates are updated using neighboring values.
You may use two methods in combination. That is, you may choose a static method to model one component and a dynamic method to model another component.
A disadvantage of combining methods is that the confidence intervals for forecasts are not valid.
STATIC (SIMPLE) METHODS
• Trend Analysis
• Decomposition
DYNAMIC (SMOOTHING) METHODS
• Moving Average
• Single Exponential Smoothing
• Double Exponential Smoothing
• Winters’ Method (Triple Exp. Smoothing)
2.3.2: Selecting an Exp. Smoothing Method
SINGLE EXP. SMOOTHING
Series without trend and without seasonal components.
DOUBLE EXP. SMOOTHING
Series with trend but without seasonal component.
TRIPLE EXP. SMOOTHING (WINTERS’ METHOD)
Series with trend and seasonal components.
2.3.3: Measures of Accuracy
One major difference between MSD and MAD is that the MSD measure is influenced much more by large fitting errors than by small errors (since for the MSD measure the errors are squared).
• Mean Absolute Percentage Error (MAPE)
• Mean Absolute Deviation (MAD)
• Mean Squared Deviation (MSD)
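The three measures can be sketched in a few lines of Python. The formulas used here are the usual textbook definitions (mean absolute percentage error, mean absolute error, mean squared error over the fitted values); the slides do not print them, so treat the exact form as an assumption. The function name `accuracy_measures` and the sample numbers are made up for illustration.

```python
def accuracy_measures(actual, fitted):
    """Return (MAPE, MAD, MSD) for paired actual and fitted values.

    Assumed definitions (not printed in the slides):
      MAPE = 100 * mean(|(y - f) / y|)
      MAD  = mean(|y - f|)
      MSD  = mean((y - f)^2)
    """
    if len(actual) != len(fitted) or not actual:
        raise ValueError("actual and fitted must be non-empty and equal in length")
    n = len(actual)
    errors = [y - f for y, f in zip(actual, fitted)]
    # MAPE is undefined when an actual value is zero.
    mape = 100.0 * sum(abs(e / y) for e, y in zip(errors, actual)) / n
    mad = sum(abs(e) for e in errors) / n
    msd = sum(e * e for e in errors) / n
    return mape, mad, msd


# Illustrative data: two small errors (2 units) and one large error (10 units).
actual = [10.0, 20.0, 40.0]
fitted = [12.0, 18.0, 30.0]
mape, mad, msd = accuracy_measures(actual, fitted)
print(mape, mad, msd)
```

In this made-up example the single large error contributes 100 of the 108 units of total squared error, but only 10 of the 14 units of total absolute error, which is the sensitivity of MSD to large errors described above.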
2.3.4: Single Exp. Smoothing (1/4)
Weighted Moving Averages (WMA):
In the moving averages method, each observation in the MA calculation receives the same weight.
One variation, known as weighted moving averages, involves selecting a different weight for each data value and then computing a weighted average of the most recent m values as the forecast. In most cases, the most recent observation receives the most weight, and the weight decreases for older data values.
Note that for the WMA the weights must sum to 1.
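A minimal sketch of a weighted moving average forecast, assuming the weights are listed from the most recent observation backwards. The function name `wma_forecast` and the sample data are hypothetical.

```python
def wma_forecast(series, weights):
    """Forecast the next value as a weighted average of the last m observations.

    weights[0] applies to the most recent observation, weights[1] to the one
    before it, and so on. The weights must sum to 1.
    """
    m = len(weights)
    if m == 0 or len(series) < m:
        raise ValueError("need at least as many observations as weights")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    recent_first = series[-m:][::-1]  # last m values, most recent first
    return sum(w * y for w, y in zip(weights, recent_first))


# Illustrative data: the most recent value (23) gets the largest weight.
forecast = wma_forecast([17, 21, 19, 23], [0.5, 0.3, 0.2])
print(forecast)  # 0.5*23 + 0.3*19 + 0.2*21
```

Setting every weight to 1/m recovers the plain moving average, in which each observation receives the same weight.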
Single Exponential Smoothing (SES or EWMA):
SES is a special case of the WMA method in which we select only one weight, α, the weight for the most recent observation.
The weights for the other data values are computed automatically and become exponentially smaller as the observations move farther into the past.
With the time series data and the forecasting formulas in a spreadsheet, you can experiment with different values of α (or MA weights) and choose the value(s) of α providing the smallest MSD or MAD.
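The same experiment can be run outside a spreadsheet. The sketch below tries a grid of α values and keeps the one with the smallest MSD of the one-step-ahead forecasts. It assumes the first observation is used as the initial forecast (a simplification, not Minitab's default), and the series and the helper name `ses_msd` are made up for illustration.

```python
def ses_msd(series, alpha):
    """MSD of one-step-ahead SES forecasts for a given alpha.

    Assumption: the first observation serves as the initial forecast.
    """
    forecast = series[0]
    squared_errors = []
    for y in series[1:]:
        squared_errors.append((y - forecast) ** 2)
        # New forecast = weighted average of current observation and old forecast.
        forecast = alpha * y + (1 - alpha) * forecast
    return sum(squared_errors) / len(squared_errors)


# Hypothetical series without trend or seasonality.
series = [40.0, 42.0, 41.0, 39.0, 43.0, 44.0, 42.0, 41.0, 40.0, 42.0]
candidates = [k / 10 for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9
best_alpha = min(candidates, key=lambda a: ses_msd(series, a))
print(best_alpha, ses_msd(series, best_alpha))
```

Replacing the squared error with the absolute error in `ses_msd` would select α by MAD instead; the two criteria need not agree.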
S. Makridakis has conducted research showing that the SES method usually outperforms more complex procedures for short-term forecasting.
2.3.4: Single Exp. Smoothing (2/4)
SES recursive formulation:
$$\hat{Y}_{t+1} = \alpha Y_t + (1-\alpha)\,\hat{Y}_t$$
where:
• $Y_t$ = time series value
• $\hat{Y}_t$ = forecasted or fitted value
• $\alpha$ = weight, $0 < \alpha < 2$ (usually $0 < \alpha < 1$)
Expanding the recursion:
$$\hat{Y}_{t+1} = \alpha Y_t + (1-\alpha)\,\hat{Y}_t = \alpha Y_t + \alpha(1-\alpha)\,Y_{t-1} + (1-\alpha)^2\,\hat{Y}_{t-1} = \alpha Y_t + \alpha(1-\alpha)\,Y_{t-1} + \alpha(1-\alpha)^2\,Y_{t-2} + \dots$$
Notes:
When applied recursively to each successive observation in the series, each new smoothed value (forecast) is computed as the weighted average of the current observation and the previous smoothed observation.
Each smoothed value is the weighted average of the previous observations, where the weights decrease exponentially depending on the value of parameter α.
If α = 1: previous observations are ignored entirely (short memory).
If α = 0: the current observation is ignored entirely (long memory).
The most straightforward way of evaluating the accuracy of the forecasts based on a particular α value is to simply plot the observed values and the one-step-ahead forecasts.
The initial value for the smoothing recursive process can affect the quality of the forecasts for many observations. In practice, when there are many leading observations prior to a crucial actual forecast, the initial value will not affect that forecast by much, since its effect will have long "faded" from the smoothed series.
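The recursion and the role of the initial value can be sketched as follows. Each new forecast is a weighted average of the current observation and the previous forecast. As an assumption, the default initial value here is the average of the first six observations, the default these slides attribute to Minitab; this is an illustration, not Minitab's actual code. The names `ses_forecasts` and `ses_weights` and the data are hypothetical.

```python
def ses_forecasts(series, alpha, initial=None):
    """Return the one-step-ahead SES forecasts.

    Element 0 is the initial smoothed value (the forecast for the first
    observation); the last element is the forecast for the next, unseen period.
    """
    if initial is None:
        k = min(6, len(series))           # assumed default: mean of first six values
        initial = sum(series[:k]) / k
    forecasts = [initial]
    for y in series:
        # New forecast = alpha * current observation + (1 - alpha) * previous forecast.
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


def ses_weights(alpha, count):
    """Weights SES implicitly gives to the latest `count` observations, newest first."""
    return [alpha * (1 - alpha) ** k for k in range(count)]


temps = [40.0, 42.0, 41.0, 39.0, 43.0, 44.0, 42.0, 41.0]  # made-up readings
smooth = ses_forecasts(temps, alpha=0.1)    # slow to react, long memory
reactive = ses_forecasts(temps, alpha=0.9)  # follows the data closely
print(ses_weights(0.5, 4))  # [0.5, 0.25, 0.125, 0.0625]
```

After n observations the initial value carries a weight of (1 − α)ⁿ, so with many leading observations (or a large α) its influence on the latest forecast becomes negligible.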
2.3.4: Single Exp. Smoothing (3/4)
This worksheet shows the observed values overlaid with a one-parameter exponentially smoothed curve. The smoothing factor is set at 0.1. In the lower-right corner, the worksheet contains an area curve that indicates the relative weight assigned to prior observations. The most recent observation has the most weight, with earlier observations decreasing exponentially in importance. The chosen smoothing factor results in a forecast curve that is much less variable than the observed one. The smoothed values are also less susceptible to the influence of large outlying values. Using this value for the smoothing factor, the final forecasted value is 0.069.
With a larger value for the smoothing factor, 0.9 in this case, the forecasted values are much more variable, almost as variable as the data. This is because the forecasted values track the data closely, and less recent observations receive hardly any weight in the calculation. If an observation has a large upward swing, then the forecasted value for the next observation tends to be high, whereas the actual value might revert to a lower value. So even though the smoothed line better resembles the shape of the data, it does not necessarily forecast the data values better. Note that the standard error of the forecast has increased to 6.176.
[Figure: Single Exponential Smoothing Plot for Temp. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constant: Alpha = 0.1. Accuracy measures: MAPE = 5.91316, MAD = 2.39461, MSD = 7.46692.]
[Figure: Single Exponential Smoothing Plot for Temp. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constant: Alpha = 0.9. Accuracy measures: MAPE = 1.78556, MAD = 0.72877, MSD = 0.80979.]
2.3.4: Single Exp. Smoothing (4/4)
File: RIVERC.MTW
Stat > Time Series > Single Exp Smoothing…
Three measures of the accuracy of the fitted values are provided: Mean Squared Deviation (MSD), Mean Absolute Deviation (MAD), and Mean Absolute Percentage Error (MAPE).
The larger the weight α, the more the smoothed values follow the data. Thus, small weights are usually recommended for a series with a high noise level around the signal or pattern.
α = 0.1
α = 0.9
By default, Minitab uses the average of the first six observations as the initial smoothed value.
2.3.5: Other Exponential Smoothing Models
More complex ES models (double ES and Winters’ method) have been developed to accommodate time series with trend and seasonal components.
The general idea here is that forecasts are not only computed from consecutive previous observations (as in SES), but an independent (smoothed) trend and seasonal component can be added.
SEASONAL
• None
• Additive
• Multiplicative
DOUBLE EXP SMOOTHING
TRIPLE EXP SMOOTHING (WINTERS’ METHOD)
2.3.6: Double Exp. Smoothing (1/2)
The weights are the smoothing parameters. You can have Minitab supply some optimal weights (the default) or you can specify values between 0 and 2 for the level weight α and between 0 and [ 4 / α – 2 ] for the trend weight γ.
When a trend component is included in the ES process, an independent trend component T is computed for each time, and modified as a function of the forecast error and the respective trend parameter, γ.
• If γ = 0 the trend component is constant across all values of the time series (and for all forecasts).
• If γ = 1 the trend component is modified "maximally" from observation to observation by the respective forecast error.
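The level and trend updates can be sketched as below. The slides do not print the update equations, so the code assumes the standard Holt formulation: the level is a weighted average of the current observation and the previous level plus trend, and the trend is a weighted average of the latest change in level and the previous trend. The initial level and trend are supplied by the caller (how to choose them is left open here), and all names and data are hypothetical.

```python
def double_exp_smoothing(series, alpha, gamma, level0, trend0):
    """Double exponential smoothing (assumed Holt formulation).

    Returns (fits, level, trend): the one-step-ahead fitted values and the
    final level and trend components.
    """
    level, trend = level0, trend0
    fits = []
    for y in series:
        fits.append(level + trend)  # one-step-ahead fit for this period
        new_level = alpha * y + (1 - alpha) * (level + trend)
        # gamma = 0 keeps the trend fixed; gamma = 1 replaces it with the
        # latest change in level.
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
    return fits, level, trend


def forecast_ahead(level, trend, steps):
    """Extend the final level and trend linearly into the future."""
    return [level + h * trend for h in range(1, steps + 1)]


# Made-up series with an upward trend of roughly 2 units per period.
series = [7.0, 9.5, 10.5, 13.0, 15.5, 16.5]
fits, level, trend = double_exp_smoothing(series, alpha=0.1, gamma=0.2,
                                          level0=5.0, trend0=2.0)
print(forecast_ahead(level, trend, 3))
```

With γ = 0 the returned trend equals `trend0` whatever the data, matching the first bullet above.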
[Figure: Double Exponential Smoothing Plot for Temp. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constants: Alpha (level) = 0.1, Gamma (trend) = 0.2. Accuracy measures: MAPE = 7.4288, MAD = 3.0280, MSD = 11.9993.]
[Figure: Double Exponential Smoothing Plot for Temp. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constants: Alpha (level) = 0.9, Gamma (trend) = 0.8. Accuracy measures: MAPE = 1.17378, MAD = 0.48481, MSD = 0.36849.]
2.3.6: Double Exp. Smoothing (2/2)
File: RIVERC.MTW
Stat > Time Series > Double Exp Smoothing…
α = 0.9, γ = 0.8
α = 0.1, γ = 0.2
The larger the weights α and γ, the more the smoothed values follow the data. Thus, small weights are usually recommended for a series with a high noise level around the signal or pattern.
2.3.7: Winters’ Method (1/2)
Seasonal weight: Many time series follow recurring seasonal patterns. For example, monthly sales of toys will probably peak in December. This pattern will likely repeat every year; however, the relative size of the December increase may change slowly from year to year. Thus, it may be useful to smooth the seasonal component independently with an extra parameter, usually denoted as δ.
Parameter δ can assume values between 0 and 1.
• If δ = 0 the seasonal component for a particular point in time is predicted to be identical to the predicted seasonal component for the respective time during the previous seasonal cycle.
• If δ = 1 the seasonal component is modified "maximally" at every step by the respective forecast error.
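A minimal sketch of the additive version, assuming the standard additive Holt-Winters updates (the slides do not print the equations): the level is smoothed on the deseasonalized observation, the trend on the change in level, and each seasonal index on the gap between the observation and the new level. Initial level, trend, and seasonal indices are passed in by the caller; the names and data are hypothetical.

```python
def winters_additive(series, period, alpha, gamma, delta,
                     level0, trend0, seasonal0):
    """Additive Winters' method (assumed standard Holt-Winters updates).

    seasonal0 holds one starting index per position in the seasonal cycle.
    Returns (fits, level, trend, seasonal).
    """
    if len(seasonal0) != period:
        raise ValueError("seasonal0 must contain one index per period position")
    level, trend = level0, trend0
    seasonal = list(seasonal0)
    fits = []
    for t, y in enumerate(series):
        pos = t % period
        s_prev = seasonal[pos]  # index from the previous seasonal cycle
        fits.append(level + trend + s_prev)
        new_level = alpha * (y - s_prev) + (1 - alpha) * (level + trend)
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        # delta = 0 carries the previous cycle's index forward unchanged.
        seasonal[pos] = delta * (y - new_level) + (1 - delta) * s_prev
        level = new_level
    return fits, level, trend, seasonal


# Made-up series: trend of 1 per period plus a 4-period seasonal pattern.
series = [12.0, 11.0, 15.0, 12.0, 16.0, 15.0, 19.0, 16.0]
fits, level, trend, seasonal = winters_additive(
    series, period=4, alpha=0.1, gamma=0.2, delta=0.3,
    level0=10.0, trend0=1.0, seasonal0=[1.0, -1.0, 2.0, -2.0])
print(fits)
```

The multiplicative version divides by and multiplies with the seasonal index instead of subtracting and adding it; it is not shown here.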
[Figure: Winters' Method Plot for Temp, Additive Method. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constants: Alpha (level) = 0.1, Gamma (trend) = 0.2, Delta (seasonal) = 0.3. Accuracy measures: MAPE = 2.43018, MAD = 0.98114, MSD = 1.52484.]
[Figure: Winters' Method Plot for Temp, Additive Method. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constants: Alpha (level) = 0.9, Gamma (trend) = 0.8, Delta (seasonal) = 0.7. Accuracy measures: MAPE = 1.00330, MAD = 0.41934, MSD = 0.38474.]
2.3.7: Winters’ Method (2/2)
File: RIVERC.MTW
Stat > Time Series > Winters’ Method…
α = 0.9, γ = 0.8, δ = 0.7
α = 0.1, γ = 0.2, δ = 0.3