Part II – TIME SERIES ANALYSIS

C3 Exponential Smoothing Methods

© Angel A. Juan & Carles Serrat - UPC 2007/2008

2.3.1: Simple TS and Smoothing Methods

The simple forecasting and smoothing methods model components in a series that are usually easy to see in a time series plot of the data.

These methods decompose the data into its trend and seasonal components, and then extend the estimates of the components into the future to provide forecasts.

Static methods assume patterns that do not change over time; dynamic methods allow patterns to change over time, and their estimates are updated using neighboring values.

You may use two methods in combination. That is, you may choose a static method to model one component and a dynamic method to model another component.

A disadvantage of combining methods is that the confidence intervals for forecasts are not valid.

STATIC (SIMPLE) METHODS

• Trend Analysis

• Decomposition

DYNAMIC (SMOOTHING) METHODS

• Moving Average

• Single Exponential Smoothing

• Double Exponential Smoothing

• Winters’ Method (Triple Exp. Smoothing)

2.3.2: Selecting an Exp. Smoothing Method

SINGLE EXP. SMOOTHING

Series without trend and without seasonal components.

DOUBLE EXP. SMOOTHING

Series with trend but without seasonal component.

TRIPLE EXP. SMOOTHING (WINTERS’ METHOD)

Series with trend and seasonal components.

2.3.3: Measures of Accuracy

One major difference between MSD and MAD is that the MSD measure is influenced much more by large fitting errors than by small errors (since for the MSD measure the errors are squared).

• Mean Absolute Percentage Error (MAPE)

• Mean Absolute Deviation (MAD)

• Mean Squared Deviation (MSD)
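The three measures can be sketched in a few lines of Python. The definitions below are the usual ones (mean of absolute percentage errors times 100, mean of absolute errors, mean of squared errors) and are assumed here rather than taken from the slides; the function name and the sample data are hypothetical.

```python
def accuracy_measures(actual, fitted):
    """Return (MAPE, MAD, MSD) for paired actual and fitted values."""
    n = len(actual)
    errors = [y - f for y, f in zip(actual, fitted)]
    # MAPE is undefined when an actual value is 0; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / y) for e, y in zip(errors, actual)) / n
    mad = sum(abs(e) for e in errors) / n
    msd = sum(e * e for e in errors) / n
    return mape, mad, msd


# Hypothetical data: both sets of fits have the same total absolute error.
actual = [10.0, 20.0, 40.0, 50.0]
four_small_errors = [11.0, 19.0, 41.0, 49.0]  # every error has size 1
one_large_error = [10.0, 20.0, 40.0, 46.0]    # a single error of size 4

print(accuracy_measures(actual, four_small_errors))
print(accuracy_measures(actual, one_large_error))
```

Both sets of fits give the same MAD, but the set with the single large error gives an MSD four times larger, which illustrates why MSD is more sensitive to large fitting errors.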

2.3.4: Single Exp. Smoothing (1/4)

Weighted Moving Averages (WMA):

In the moving averages method, each observation in the MA calculation receives the same weight.

One variation, known as weighted moving averages, involves selecting a different weight for each data value and then computing a weighted average of the most recent m values as the forecast. In most cases, the most recent observation receives the most weight, and the weight decreases for older data values.

Note that for the WMA the weights sum to 1.
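As a minimal sketch of the WMA forecast, the function below takes the weights with the most recent observation first and checks that they sum to 1. The function name, the series, and the weights are hypothetical choices for illustration.

```python
def weighted_moving_average(series, weights):
    """Forecast the next value as a weighted average of the last m values.

    weights[0] applies to the most recent observation, weights[1] to the
    one before it, and so on.
    """
    m = len(weights)
    if len(series) < m:
        raise ValueError("need at least as many observations as weights")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    recent = series[-m:][::-1]  # last m observations, most recent first
    return sum(w * y for w, y in zip(weights, recent))


# Hypothetical series; the most recent value (18) gets the largest weight.
series = [17, 21, 19, 23, 18]
forecast = weighted_moving_average(series, [0.5, 0.3, 0.2])
print(forecast)
```

Here the forecast is 0.5·18 + 0.3·23 + 0.2·19.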

Single Exponential Smoothing (SES or EWMA):

SES is a special case of the WMA method in which we select only one weight, α, the weight for the most recent observation.

The weights for the other data values are computed automatically and become exponentially smaller as the observations move farther into the past.
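To illustrate how the remaining weights follow from α alone, the sketch below assumes the standard SES weighting, in which the observation k steps in the past receives weight α(1 − α)^k, with 0 < α < 1. The function name is hypothetical.

```python
def ses_weights(alpha, n):
    """Weights SES places on the n most recent observations, newest first."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


weights = ses_weights(0.2, 5)
print(weights)       # each weight is (1 - alpha) times the previous one
print(sum(weights))  # less than 1: the remainder belongs to older values
```

The weights over the n most recent observations sum to 1 − (1 − α)^n, so the share left for older values (and for the initial smoothed value) shrinks as n grows.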

With the time series data and the forecasting formulas in a spreadsheet, you can experiment with different values of α (or MA weights) and choose the value(s) of α providing the smallest MSD or MAD.
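The same experiment can be sketched outside a spreadsheet as a simple grid search over α. This is only an illustration under stated assumptions: the first observation is used as the initial smoothed value (not Minitab's default), MSD is the selection criterion, and the series and candidate grid are hypothetical.

```python
def ses_msd(series, alpha):
    """MSD of one-step-ahead SES forecasts for a given alpha."""
    forecast = series[0]  # assumed initial smoothed value
    squared_errors = []
    for y in series[1:]:
        squared_errors.append((y - forecast) ** 2)
        forecast = alpha * y + (1 - alpha) * forecast
    return sum(squared_errors) / len(squared_errors)


def best_alpha(series, candidates):
    """Return the candidate alpha with the smallest MSD."""
    return min(candidates, key=lambda a: ses_msd(series, a))


# Hypothetical temperature-like series and a coarse grid of alpha values.
series = [38.0, 41.0, 39.5, 42.0, 40.0, 43.5, 41.0, 39.0]
candidates = [k / 10 for k in range(1, 10)]
alpha = best_alpha(series, candidates)
print(alpha, ses_msd(series, alpha))
```

Replacing the squared error with the absolute error in `ses_msd` would select α by MAD instead; the two criteria need not agree.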

S. Makridakis has conducted research showing that the SES method usually outperforms more complex procedures for short-term forecasting.

2.3.4: Single Exp. Smoothing (2/4)

SES recursive formulation:

Notes:

When applied recursively to each successive observation in the series, each new smoothed value (forecast) is computed as the weighted average of the current observation and the previous smoothed observation.

Each smoothed value is the weighted average of the previous observations, where the weights decrease exponentially depending on the value of parameter α.

• If α = 1, previous observations are ignored entirely (short memory).

• If α = 0, the current observation is ignored entirely (long memory).
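The recursion and its two extreme cases can be checked with a short sketch. It assumes the one-step-ahead form, in which the new forecast is α times the current observation plus (1 − α) times the previous forecast; the function name, the initial value, and the data are hypothetical.

```python
def ses_forecasts(series, alpha, initial):
    """One-step-ahead SES forecasts.

    Returns len(series) + 1 values: entry t is the forecast for series[t],
    and the last entry is the forecast for the next, not yet observed value.
    """
    forecasts = [initial]
    for y in series:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


series = [3.0, 5.0, 4.0, 6.0]
print(ses_forecasts(series, 1.0, 3.0))  # each forecast is the last observation
print(ses_forecasts(series, 0.0, 3.0))  # every forecast stays at the initial value
print(ses_forecasts(series, 0.5, 3.0))  # equal weight on observation and forecast
```

With α = 1 the forecasts simply repeat the previous observation, and with α = 0 they never move away from the initial value, matching the two cases listed above.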

The most straightforward way of evaluating the accuracy of the forecasts based on a particular α value is to simply plot the observed values and the one-step-ahead forecasts.

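$$\hat{Y}_{t+1} = \alpha Y_t + (1 - \alpha)\,\hat{Y}_t$$

where

$Y_t$ = time series value

$\hat{Y}_t$ = forecasted or fitted value

$\alpha$ = weight, $0 < \alpha < 2$ (usually $0 < \alpha < 1$)

Applying the recursion repeatedly:

$$\begin{aligned}
\hat{Y}_{t+1} &= \alpha Y_t + (1 - \alpha)\,\hat{Y}_t \\
&= \alpha Y_t + (1 - \alpha)\left[\alpha Y_{t-1} + (1 - \alpha)\,\hat{Y}_{t-1}\right] \\
&= \alpha Y_t + \alpha (1 - \alpha)\, Y_{t-1} + \alpha (1 - \alpha)^2\, Y_{t-2} + \dots
\end{aligned}$$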

The initial value for the smoothing recursive process can affect the quality of the forecasts for many observations. In practice, when there are many leading observations prior to a crucial actual forecast, the initial value will not affect that forecast by much, since its effect will have long "faded" from the smoothed series.
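A small sketch shows how the effect of the initial value fades. Because the recursion is linear, two runs that differ only in their initial value end up apart by the initial gap times (1 − α)^n after n observations. The function name and the numbers are hypothetical.

```python
def final_forecast(series, alpha, initial):
    """Run the SES recursion and return the forecast after the last observation."""
    forecast = initial
    for y in series:
        forecast = alpha * y + (1 - alpha) * forecast
    return forecast


# Hypothetical flat series of 30 observations, two very different initial values.
series = [40.0] * 30
alpha = 0.1
gap = final_forecast(series, alpha, 50.0) - final_forecast(series, alpha, 30.0)

# The initial gap of 20 has shrunk by a factor (1 - alpha) per observation.
print(gap, 20.0 * (1 - alpha) ** len(series))
```

With a small α the fading is slow, so the choice of initial value matters for more observations than it would with a large α.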

This worksheet shows the observed values overlaid with a one-parameter exponentially smoothed curve. The smoothing factor is set at 0.1. In the lower-right corner, the worksheet contains an area curve that indicates the relative weight assigned to prior observations. The most recent observation has the most weight, with older observations decreasing exponentially in importance. The chosen smoothing factor results in a forecast curve that is much less variable than the observed one. The smoothed values are also less susceptible to the influence of large outlying values. Using this value for the smoothing factor, the final forecasted value is 0.069.

With a larger value for the smoothing factor, 0.9 in this case, the forecasted values are much more variable, almost as variable as the data. This is because the forecasted values are more like the data, and less recent observations receive hardly any weight at all in the calculation. If an observation has a large upward swing, then the forecasted value for the next observation tends to be high, whereas the actual value might revert to a lower value. So even though the smoothed line better resembles the shape of the data, it does not necessarily forecast the data values better. Note that the standard error of the forecast has increased to 6.176.

[Figure: Single Exponential Smoothing Plot for Temp. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constant: Alpha 0.1. Accuracy measures: MAPE 5.91316, MAD 2.39461, MSD 7.46692.]

[Figure: Single Exponential Smoothing Plot for Temp. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constant: Alpha 0.9.]

File: RIVERC.MTW

Stat > Time Series > Single Exp Smoothing…

Three measures of the accuracy of the fitted values are provided: Mean Squared Deviation (MSD), Mean Absolute Deviation (MAD), and Mean Absolute Percentage Error (MAPE).

The larger the weight α, the more the smoothed values follow the data. Thus, small weights are usually recommended for a series with a high noise level around the signal or pattern.

α = 0.1

α = 0.9

By default, Minitab uses the average of the first six observations for the initial smoothed value.

2.3.5: Other Exponential Smoothing Models

More complex ES models (double ES and Winters’ method) have been developed to accommodate time series with trend and seasonal components.

The general idea is that forecasts are not computed only from consecutive previous observations (as in SES); an independent (smoothed) trend component and seasonal component can also be added.

SEASONAL

• None: DOUBLE EXP SMOOTHING

• Additive: TRIPLE EXP SMOOTHING (WINTERS’ METHOD)

• Multiplicative: TRIPLE EXP SMOOTHING (WINTERS’ METHOD)

2.3.6: Double Exp. Smoothing (1/2)

The weights are the smoothing parameters. You can have Minitab supply optimal weights (the default), or you can specify values between 0 and 2 for the level weight α and between 0 and [ 4 / α – 2 ] for the trend weight γ.

When a trend component is included in the ES process, an independent trend component T is computed for each time, and modified as a function of the forecast error and the respective trend parameter, γ.

• If γ = 0 the trend component is constant across all values of the time series (and for all forecasts).

• If γ = 1 the trend component is modified "maximally" from observation to observation by the respective forecast error.
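The role of γ can be illustrated with a minimal sketch of double exponential smoothing. The update equations below are one common (Holt-style) formulation and are assumed here, since the slides do not give them; the function names and the initial level and trend supplied by the caller are hypothetical.

```python
def double_exp_smoothing(series, alpha, gamma, level0, trend0):
    """Holt-style double exponential smoothing.

    Returns (fits, level, trend): fits[t] is the one-step-ahead forecast of
    series[t]; level and trend are the final component estimates.
    """
    level, trend = level0, trend0
    fits = []
    for y in series:
        fits.append(level + trend)
        new_level = alpha * y + (1 - alpha) * (level + trend)
        # The trend moves toward the latest change in level at rate gamma.
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
    return fits, level, trend


def forecast_ahead(level, trend, steps):
    """Extend the final level and trend linearly into the future."""
    return [level + k * trend for k in range(1, steps + 1)]


# Hypothetical noisy upward series with gamma = 0: the trend never changes.
series = [2.0, 4.5, 5.5, 8.5, 9.0]
fits, level, trend = double_exp_smoothing(series, 0.5, 0.0, 0.0, 2.0)
print(trend)
print(forecast_ahead(level, trend, 3))
```

With γ = 0 the trend stays at its initial value for all fits and forecasts, as stated above; values of γ closer to 1 let the trend react more strongly to each new change in level.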

[Figure: Double Exponential Smoothing Plot for Temp. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constants: Alpha (level) 0.1, Gamma (trend) 0.2.]

[Figure: Double Exponential Smoothing Plot for Temp. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constants: Alpha (level) 0.9, Gamma (trend) 0.8.]

2.3.6: Double Exp. Smoothing (2/2)

File: RIVERC.MTW

Stat > Time Series > Double Exp Smoothing…

α = 0.9 γ = 0.8

α = 0.1 γ = 0.2

The larger the weights α and γ, the more the smoothed values follow the data. Thus, small weights are usually recommended for a series with a high noise level around the signal or pattern.

2.3.7: Winters’ Method (1/2)

Seasonal weight: Many time series follow recurring seasonal patterns. For example, sales of toys will probably peak in December each year. This pattern will likely repeat every year; however, the relative size of the December increase may slowly change from year to year. Thus, it may be useful to smooth the seasonal component independently with an extra parameter, usually denoted as δ.

Parameter δ can assume values between 0 and 1.

• If δ = 0 the seasonal component for a particular point in time is predicted to be identical to the predicted seasonal component for the respective time during the previous seasonal cycle.

• If δ = 1 the seasonal component is modified "maximally" at every step by the respective forecast error.
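The effect of δ can be seen in a minimal sketch of the additive form of Winters' method. The update equations are a common textbook formulation and are an assumption here, as the slides do not state them; the function name, the initial level, trend, and seasonal indices, and the data are all hypothetical.

```python
def winters_additive(series, period, alpha, gamma, delta,
                     level0, trend0, seasonal0):
    """Additive Winters-style smoothing.

    seasonal0 holds one initial seasonal index per position in the cycle
    (len(seasonal0) == period). Returns (fits, level, trend, seasonal).
    """
    level, trend = level0, trend0
    seasonal = list(seasonal0)
    fits = []
    for t, y in enumerate(series):
        s_prev = seasonal[t % period]  # index from the previous cycle
        fits.append(level + trend + s_prev)
        new_level = alpha * (y - s_prev) + (1 - alpha) * (level + trend)
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
        seasonal[t % period] = delta * (y - level) + (1 - delta) * s_prev
    return fits, level, trend, seasonal


# Hypothetical series with a cycle of length 2; delta = 0 freezes the indices.
series = [13.5, 9.0, 15.5, 12.5, 16.0, 14.5]
fits, level, trend, seasonal = winters_additive(
    series, 2, 0.3, 0.2, 0.0, 10.0, 1.0, [2.0, -2.0])
print(seasonal)
```

With δ = 0 each seasonal index is carried over unchanged from the previous cycle, as described above; with δ closer to 1 the index is pulled toward the latest deviation of the observation from the level.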

[Figure: Winters' Method Plot for Temp, Additive Method. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constants: Alpha (level) 0.1, Gamma (trend) 0.2, Delta (seasonal) 0.3.]

[Figure: Winters' Method Plot for Temp, Additive Method. Axes: Temp vs. Hour. Series: Actual, Fits, Forecasts, 95.0% PI. Smoothing constants: Alpha (level) 0.9, Gamma (trend) 0.8, Delta (seasonal) 0.7.]

2.3.7: Winters’ Method (2/2)

File: RIVERC.MTW

Stat > Time Series > Winters’ Method…

α = 0.9 γ = 0.8 δ = 0.7

α = 0.1 γ = 0.2 δ = 0.3
