
Last updated 2008 Jan 28

Forecasting Time Series within Excel

Forecasting via Open Source VBA Macros

by Hugh E. Warren

2008

Notes to Accompany the Workbook

“Forecast Time Series.xls”


Contents

1. Introduction
    1.1 Required Programs and Files
    1.2 Entering Time Series Data
    1.3 The “TimeSeries” Menu
2. Analyzing a Time Series
    2.1 Graphing
        2.1.1 Housekeeping
    2.2 Seasonal Time Series and the Periodogram
    2.3 Autocorrelation
    2.4 Smoothing to See Patterns
        2.4.1 Moving Averages
        2.4.2 Oscillation Filtering
    2.5 Box Jenkins Models
        2.5.1 Optimal Parameters
        2.5.2 Likelihood Plots
3. Transforming a Time Series
    3.1 Aggregate
    3.2 Deviation from Linear
    3.3 Difference
    3.4 Extend linearly
    3.5 Logarithm
    3.6 Sample
    3.7 Seasonal Difference
4. Non-seasonal Forecasting
    4.1 Structural Models
        4.1.1 The Naïve Model
        4.1.2 Naïve Trend
    4.2 Smoothing Models
        4.2.1 Base Average
        4.2.2 Moving Average
        4.2.3 Moving Trend
        4.2.4 Exponential Smoothing
        4.2.5 Double Exponential Smoothing
    4.3 Statistical Models
        4.3.1 Box Jenkins (p,d,q)
5. Seasonal Models
    5.1 Structural
        5.1.1 Naïve
        5.1.2 Naïve Trend
        5.1.3 Base Additive
        5.1.4 Base Multiplicative
    5.2 Smoothing Models
        5.2.1 Holt Winters Additive
        5.2.2 Holt Winters Multiplicative
    5.3 Statistical Models
        5.3.1 Box Jenkins (0,1,1)x(0,1,1)
6. Tips and More Examples
    6.1 Locations of Data and Screen Objects
        6.1.1 Data
        6.1.2 Screen Objects
    6.2 Customizing Charts
    6.3 Using Data Tables for Further Analysis
    6.4 Input Data in Other Workbooks
7. Sources of Time Series Data
    7.1 Your Own Records
    7.2 U. S. Government
    7.3 Internet
    7.4 Trade Associations
    7.5 Libraries
8. Special Cases of Forecasting Models

Forecasting Time Series within Excel

Forecasting via Open Source VBA Macros

by Hugh E. Warren

2007

Notes to Accompany the Workbook

“Forecast Time Series.xls”

1. Introduction

A time series is a sequence of counts or measurements, for example

    births day by day at a hospital
    yearly rainfall in a given location
    monthly sales at a selected store
    a closing stock price each business day

A reliable forecast of future values of a time series provides the opportunity to either improve the future or prepare for the inevitable. The accompanying Excel workbook “Forecast Time Series.xls” has a number of built-in tools for working with time series. It includes different ways of making forecasts, and of measuring how good those forecasts are. The results are written as tables and charts in Excel worksheets. The tables and charts can be used directly, or further manipulated with the standard features of Excel.

It will be useful to have algebraic notation for the values in a time series. Here we will write

$z_1, z_2, z_3, \ldots, z_n$

for the first n values of a time series. The subscript denotes the time at which the count or measurement is made.

1.1 Required Programs and Files

The macros in “Forecast Time Series.xls” were written and tested in Excel 2002 running under Windows XP. The following image files need to be in the same folder as the Excel workbook:

    background200by8.bmp
    background200by600.bmp

Copy them from this same web site.

1.2 Entering Time Series Data

The Excel workbook is made up of several worksheets, each identified by a sheet tab at the bottom of the screen. Each worksheet can have information about a different time series written in the first column as follows:

    Cell   Contents                     Example 1        Example 2
    A1     title for the time series    LA Rain          Births in United States
    A2     unit of measurement          inches           registered live births
    A3     time interval                July thru June   month
    A4     first time interval          1877             1953 Jan
    A5     last time interval           2006             1980 Dec
    A6     first measurement            21.26            322,488
    An     succeeding measurements, followed by an empty cell

1.3 The “TimeSeries” Menu

Section 7 below tells where to find data for many time series. The “Forecast Time Series.xls” workbook comes with five sample time series, including the two examples above on worksheets with the tabs “LA Rain” and “Births”. To work with a time series, select the worksheet with the data you want. Click “TimeSeries” in the menu bar at the top of the worksheet. A drop-down menu will show the options

    Analyze
    Transform
    Forecast
    Cleanup

2. Analyzing a Time Series

Eventually you will want to make forecasts; however, it is best to first analyze a new time series. Move the cursor over the “Analyze” option to display the submenu

    Graph
    Periodogram
    AutoCorrelations
    ---------
    Moving Average
    Oscillation Filter
    ---------
    Box Jenkins Suite

2.1 Graphing

Click “Graph” to show a graph of the time series. Components of the graph are

    Component                                      Location
    title                                          top middle
    first time interval                            left top
    last time interval                             right top
    measurement unit and time interval             top middle, below the title
    vertical scale, in the measurement units       left
    horizontal scale, counting the measurements    bottom
    dots plotting the data                         plot area
    connecting lines to show the time sequence     plot area

The underlying data are in Column A just to the left of the chart. Look for patterns in the data such as

    steps in the general level of the data points
    data points that trend up or down in an approximate straight line
    data points that trend up or down in a curve
    up and down patterns that repeat
    different behavior of the graph over different stretches of time

Use the “Analyze | Graph” tool on each of the five time series included in the workbook.

    Time series   Patterns
    LA Rain       No apparent pattern. Sometimes statistical tools can uncover a pattern or characterize the type of randomness present.
    Births        There is a repeating short period up and down pattern, and there are longer term trends.
    Inventories   There are apparent straight line trends that persist for a while, then change.
    Oil Price     The early and late parts of the time series are very different.
    CO2           A zigzag with a steady upward trend.

The importance of patterns is that all forecasting methods use some pattern in past values to predict future values. The “Analyze” options are ways of detecting patterns.

2.1.1 Housekeeping

The “Cleanup” option on the “TimeSeries” menu removes everything from the worksheet except the original data in Column A. Clicking on “Cleanup” before using another time series tool keeps the worksheet from getting cluttered. To have available the results of several operations, insert new blank worksheets, copy the time series data to the top of Column A on each sheet, and run each operation on its own sheet.

2.2 Seasonal Time Series and the Periodogram

A seasonal time series is one that has a repeating up and down pattern. The name comes from seasons in a year. Measurements of temperature and precipitation show a yearly cycle of seasons, although there is often variation from year to year as well. For example, if rainfall totals are measured quarterly, that is, every three months, then we would have the traditional four seasons per cycle: winter, spring, summer and fall. If totals were taken every month, we extend the terminology and speak of 12 seasons per cycle. Some time series have more than one underlying rhythm. Retail sales often show both weekly and annual cycles.

Good forecasting depends on knowing whether a time series has seasonality. Choose the “CO2” time series and use the “Analyze | Periodogram” tool to quantify the cyclic behavior. Two graphs appear. The top one is the traditional “periodogram.” A tall, isolated spike in the periodogram indicates a corresponding cycle in the underlying time series. In the CO2 example there is one tall spike. Let the cursor hover over the top of the spike to see the “frequency” of the cycle, namely, 0.0833. In general, the number of time periods in the cycle is approximately one divided by the frequency. For the CO2 data 1 / 0.0833 = 12. That is plausible, because the time interval is one month and there are 12 months in a year. The lower graph, with the horizontal axis labeled “time periods”, does the “one over frequency” calculation for us. Note the spike directly above 12.

The CO2 series measures the atmospheric concentration of carbon dioxide atop the Hawaiian mountain Mauna Loa. The behavior is very nearly a linear increase plus a sinusoidal fluctuation with the seasons. The small overtone at frequency 2 × 0.0833 makes the variation a bit sharper than a pure sine wave.

Next choose the “Births” time series. The “Analyze | Periodogram” tool again brings out the monthly frequency of the data with a large peak at 0.0833. There is an annual pattern of more babies born in certain months. The small overtone spikes at two, three, four, and five times the fundamental frequency indicate that the variation is more complex than a simple sine wave. The spikes at very low frequencies are a rumble from long term fluctuations in birth rate that are not predictable from the given data.

The periodogram for LA Rain has a jumble of spikes. For this series there is no cycle that is useful for forecasting.

2.3 Autocorrelation

Autocorrelation measures how closely a future value of a time series is related to past values. Use “Analyze | AutoCorrelations” on the “Inventories” time series. This draws six graphs and writes several tables on the worksheet. The top graph shows the time series. In the second graph the green bars show the autocorrelations. A value of 1.0 means perfect correlation.
The bar at 1 on the horizontal axis is for correlation with the immediate prior value of the time series. The bar at 2 is for correlation two periods back, and so on. The time between the future value and the past value is called the lag. The “Inventories” series has high autocorrelation at short lags, declining to zero correlation at lag 30, that is, at two and a half years back.

The third graph plots the changes in inventory values from one period to the next. The changes are a new time series whose values are differences

$z_t - z_{t-1}$

Differences are also called “deltas”. Note the prefix “delta” in the titles of the third and fourth graphs. The fourth graph shows that the differences have some autocorrelation at short lags, declining to zero at lag 13. Statistically the correlation becomes close to zero at lag 10. Noise in the time series can create a correlation bar when no real correlation exists. Such a zero correlation bar would be within the white band 80% of the time, but 20% of the time could push higher or lower. About 5% of the time a zero correlation bar extends into the gray region.
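As a rough illustration of the quantities plotted in these graphs, the sketch below computes sample autocorrelations of a series and of its first differences in Python with NumPy. It is not the workbook's VBA code. The function name `autocorrelations` and the random-walk test series are assumptions made for the example, and the estimator shown (lagged products of deviations divided by the total sum of squared deviations) is one common choice that the workbook may or may not use.

```python
import numpy as np

def autocorrelations(z, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the sequence z."""
    z = np.asarray(z, dtype=float)
    dev = z - z.mean()                 # deviations from the mean
    denom = np.sum(dev ** 2)           # total sum of squared deviations
    return np.array([np.sum(dev[k:] * dev[:-k]) / denom
                     for k in range(1, max_lag + 1)])

# Hypothetical stand-in for a slowly wandering series such as "Inventories".
rng = np.random.default_rng(0)
z = np.cumsum(rng.normal(size=200))    # a random walk
deltas = np.diff(z)                    # the differences z_t - z_{t-1}

r_level = autocorrelations(z, 5)       # autocorrelations of the series itself
r_delta = autocorrelations(deltas, 5)  # autocorrelations of the "delta" series
print(np.round(r_level, 2))
print(np.round(r_delta, 2))
```

Comparing the two printed rows shows the effect described above: differencing a wandering series typically removes most of the autocorrelation at short lags.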

The fifth graph down shows the second differences, that is, the differences of the differences. The sixth graph shows the autocorrelations of these second differences. Statistically this is a simple graph. The green bars are close to zero, except for the first one. The yellow bars are also close to zero, except the first few, which rapidly move towards zero. The yellow bars show what are called partial autocorrelations, which measure how much of the corresponding autocorrelation is really new information. Notice that the partial autocorrelation and autocorrelation at lag 1 are always the same. Simple patterns are useful in choosing forecasting models. See in particular the discussion of Box Jenkins models.

The first difference time series is written in Column A below the original series. The second difference series is written below the first differences. The table in columns O through R and rows 5 through 11 summarizes the statistics of the three series. The autocorrelations and partial autocorrelations are in columns P and Q. When you click on a graph, Excel puts a border around the source data.

2.4 Smoothing to See Patterns

Many time series have ups and downs from random causes that do not repeat. These changes in values are also called random shocks, or noise. There are ways to decrease noise so that true patterns can be found. These can be called smoothing techniques.

2.4.1 Moving Averages

A moving average can be an effective smoothing technique. The n-period moving average at time t = j is the sum of the n consecutive values ending at time j, all divided by n. Algebraically that is

$(z_{j-n+1} + z_{j-n+2} + \cdots + z_j) / n$

Try this on the “Births” time series. Use the “Analyze | Moving Average” tool with 24 data points in the moving average. Taking out the month to month swings and other short term fluctuations emphasizes the longer term pattern at the end of the Baby Boom. If the number of data points in the average is the same as or a multiple of the number of periods in an underlying cycle, then the effects of that cycle are “averaged out”. In general, the more data points, the smoother the moving average.

Use “Analyze | Moving Average” with 12 data points on the “Inventories” series, and compare the original series to the moving average. It is generally true that when there are trends, the moving average will fall behind the movement of the original series.

2.4.2 Oscillation Filtering

Another way to “average out” the effects of cycles is to explicitly filter out the corresponding frequencies. The time series is analyzed into components at various frequencies using Fourier methods. Only the desired frequencies are kept. The process is analogous to an audio filter that selects from bass, intermediate, or treble frequency ranges.
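The filtering idea described above can be sketched in Python with NumPy's FFT routines. This is only an illustration under stated assumptions, not the workbook's macro: the function name `oscillation_filter` is hypothetical, the mean level (zero oscillations) is always kept, and the synthetic series merely stands in for real data.

```python
import numpy as np

def oscillation_filter(z, keep):
    """Keep only the Fourier components whose oscillation counts are in `keep`.

    Component k completes k full oscillations over the length of the series.
    Assumption: component 0 (the mean level) is always kept.
    """
    z = np.asarray(z, dtype=float)
    spectrum = np.fft.rfft(z)                    # one-sided Fourier transform
    mask = np.zeros(len(spectrum), dtype=bool)
    mask[0] = True                               # keep the mean level
    mask[np.asarray(list(keep), dtype=int)] = True
    return np.fft.irfft(np.where(mask, spectrum, 0.0), n=len(z))

# Synthetic series with 186 observations: a level, one slow wave, one fast wave.
n = 186
t = np.arange(n)
slow = 400 + 30 * np.sin(2 * np.pi * t / n)      # one oscillation, wavelength 186
fast = 5 * np.sin(2 * np.pi * 20 * t / n)        # twenty oscillations
smoothed = oscillation_filter(slow + fast, keep=[1, 2, 3])
```

Because the fast wave completes exactly 20 oscillations, it falls outside the kept components and the smoothed series reproduces the slow wave.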

For example, use “Analyze | Oscillation Filter” on the “Inventories” time series. A table appears:

         oscillations   wavelength   frequency
    □    1              186          0.005
    □    2              93           0.011
    □    3              62           0.016
    □    4              47           0.022
    □    5 to 6         27 to 37     0.027 to 0.038
    □    7 to 9         19 to 27     0.038 to 0.054
    □    10 to 19       9 to 19      0.054 to 0.108
    □    20 to 49       4 to 19      0.108 to 0.269

Click on the first three boxes at the left. Then click on the “Filter” button. The graph of the time series appears along with the smoothed series. The smoothed series is the sum of three components. Think of the box around the graph as the glass side on a swimming pool. A one oscillation wave in the pool has just one up and down from one end of the pool to the other. A two oscillation wave goes up and down twice, and so on. The smoothed series is a combination of one, two and three oscillation waves.

The “Inventories” time series has 186 observations. That makes a one oscillation wave 186 periods long. Another way to say this is that the wavelength is 186 periods. A two oscillation wave has 93 periods in one up and down cycle, that is, a wavelength of 93. In the right column each frequency is 1.0 divided by the corresponding wavelength. For example, 1.0 / 186 = 0.005376, which to three decimal places is 0.005. The number of oscillations, the wavelength, and the frequency are three ways of describing the same cyclic behavior.

2.5 Box Jenkins Models

From their own research and the work of others, George E. P. Box and Gwilym M. Jenkins presented a family of statistical models with which to forecast time series. Box and Jenkins treat both time series taken one at a time, so-called univariate series, and multivariate series, that is, combinations of related time series. Here we deal with time series only one at a time.

The Box Jenkins models for non-seasonal time series are characterized by two groups of parameters. First there are three parameters that measure the degree of complexity, namely

d    the degree of differencing used to simplify the series. Roughly, “simplify” means to take differences of the original time series values until the autocorrelation graph shows only a few non-zero autocorrelation values and the partial autocorrelation values rapidly tend to zero, or the same behavior with autocorrelations and partial autocorrelations reversed. For most time series d = 0 (no differencing needed), d = 1, or d = 2.

p    the degree of autocorrelation, or how far back to build in explicit autocorrelation. Roughly, p is the number of non-zero partial autocorrelation values, when the autocorrelation values tend rapidly to zero. Commonly p = 0 or p = 1 or p = 2.

q    the “moving average” degree, or how far back to build in explicit dependence on random shocks to the series. Roughly, q is the number of non-zero autocorrelation values, when the partial autocorrelation values tend rapidly to zero. Commonly q = 0 or q = 1 or q = 2.

When we specify p, d, and q, we say we have a (p,d,q) model. The second group of parameters consists of p values called autoregressive coefficients and q values called moving average coefficients. Box and Jenkins use the following symbols

    Greek phi, $\phi_1, \phi_2, \ldots, \phi_p$, for the autoregressive coefficients
    Greek theta, $\theta_1, \theta_2, \ldots, \theta_q$, for the moving average coefficients

For example, a (2,0,1) model has two autoregressive coefficients and one moving average coefficient.

2.5.1 Optimal Parameters

The best parameters for a forecasting model are those that minimize forecast errors. We can only compute past errors, so we select a model and apply it to a sequence of past values

$z_1, z_2, \ldots, z_n$

The subscripts denote times 1 through n. Call these times the Base Interval. The model and its parameters generate forecasts

$f_1, f_2, \ldots, f_n$

and forecast errors $e_i = z_i - f_i$ for i = 1 through n. We say we have an optimal set of parameters for the model if some function of the forecast errors is as small as possible. Examples of “error functions” are

    sum of squared errors (SSR)              $e_1^2 + e_2^2 + \cdots + e_n^2$
    root mean square error (RMSE)            $((e_1^2 + e_2^2 + \cdots + e_n^2) / n)^{1/2}$
    mean absolute deviation (MAD)            $(|e_1| + |e_2| + \cdots + |e_n|) / n$
    mean absolute percentage error (MAPE)    $100 (|e_1 / z_1| + |e_2 / z_2| + \cdots + |e_n / z_n|) / n$
    maximum absolute error                   $\max(|e_1|, |e_2|, \ldots, |e_n|)$

Optimal parameters are always with respect to a particular time series and a chosen Base Interval. Our hope is that optimal for the Base Interval remains optimal for future intervals of time. The statistical assumptions behind a Box Jenkins (p,d,q) model imply that optimal parameters minimize the sum of squared errors, also called the Sum of Squared Residuals (SSR).

The procedure “Analyze | Box Jenkins Suite” looks at the 27 Box Jenkins (p,d,q) models, where p, d, and q each vary over 0, 1, and 2. For each of these general models the procedure searches for the optimal parameters, and corresponding optimal SSR, with the Base Interval taken to be all of the time series values. The search is done two ways.

The first search uses the Levenberg-Marquardt (L-M) algorithm to find the parameters that minimize SSR. This method is a combination of steepest descent and quadratic optimization that is particularly appropriate to minimizing a sum of squares. Occasionally the search ends at a suboptimal local minimum or boundary point.

The second search evaluates SSR on a grid of parameter points that cover the region of all theoretically feasible combinations of parameters. When the minimum on the grid is less than 99% of the L-M minimum, the latter is shaded blue to mark it as suboptimal.

Run the “Analyze | Box Jenkins Suite” for the time series “LA Rain”. The 27 models are listed in Column J. To the left side are

    Column   Contains the Levenberg-Marquardt value, if any, for the
    E        first Auto-Regressive parameter
    F        second Auto-Regressive parameter
    G        first Moving Average parameter
    H        second Moving Average parameter
    I        sum of squared errors (SSR)
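The error functions defined in this section translate directly into code. The following Python sketch is an illustration only; the function name `error_measures`, the dictionary keys, and the sample numbers are assumptions for the example and are not taken from the workbook.

```python
import numpy as np

def error_measures(z, f):
    """Error functions for actual values z and forecasts f of equal length."""
    z = np.asarray(z, dtype=float)
    f = np.asarray(f, dtype=float)
    e = z - f                                     # forecast errors e_i = z_i - f_i
    ssr = np.sum(e ** 2)
    return {
        "SSR": ssr,                               # sum of squared errors
        "RMSE": np.sqrt(ssr / len(e)),            # root mean square error
        "MAD": np.mean(np.abs(e)),                # mean absolute deviation
        "MAPE": 100.0 * np.mean(np.abs(e / z)),   # mean absolute percentage error
        "MaxAbs": np.max(np.abs(e)),              # maximum absolute error
    }

# Made-up values and forecasts, for illustration only.
measures = error_measures([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(measures)
```

Note that MAPE divides by the actual values, so it is undefined for a series that contains zeros.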

2.5.2 Likelihood Plots

The second search in the “Box Jenkins Suite” computes the SSR over a grid of the feasible parameters for each (p,d,q) model. Feasible means the parameters generate forecasts that do not grow wildly. Parameters with small SSR values are more likely to give a good forecast model. To the right of Column J are

    Column   Contents
    K        optimal SSR from the feasible region grid
    L        first Auto-Regressive parameter, if any, from the feasible region grid
    M        second Auto-Regressive parameter, if any, from the feasible region grid
    N        first Moving Average parameter, if any, from the feasible region grid
    O        second Moving Average parameter, if any, from the feasible region grid

The smallest SSR in Column I and Column K is shaded pink. One might think that the model in Column J on the same row would be the best Box Jenkins model. However, noise in the data makes the pink SSR only a point estimate of an uncertain value. The gray shaded SSR values are within the 95% confidence limit of the pink value. Apply the principle of economy to the models with gray shaded SSR values. Prefer a model with the least number of parameters, that is, with smallest p + q. For “LA Rain” preference goes to the (0,0,0) model, which means the best forecast for annual rainfall in Los Angeles is simply the long term average rainfall, slightly over 15 inches a year.

Run the “Analyze | Box Jenkins Suite” for the time series “Inventories”. The many computations may take 30 seconds on a 2 gigahertz computer. Pink shading marks the (2,0,2) model as the one with the lowest computed SSR. However, this model has four parameters. The model (0,2,2) has only two parameters and a statistically close SSR.

The likelihood plot for the model (0,2,2) starts in cell DN256. The feasible region for two moving average parameters in this model is a triangle defined by the inequalities

$-2 \le \theta_1 \le 2$
$\theta_1 + \theta_2 \le 1$
$\theta_2 \le 1 + \theta_1$

The grid steps the theta values across the feasible region in increments of 0.1. Each grid point is represented by a cell in the worksheet. For example, the cell EP270 represents grid point $\theta_1 = 0.7$ and $\theta_2 = -0.2$. The SSR value for these parameters, 505.1, is written in the cell, and the cell is shaded pink (technically magenta) to mark it as the lowest SSR value on the grid. Cells with SSR values in the 95% confidence region of the minimum are shaded gray. Blue marks cells with SSR values up to twice the minimum, green up to three times, and so on. A color key to the likelihood plots starts in cell W2. Cells with white background have an SSR at least six times the minimum and correspond to parameters that are far from optimal.

The feasible region for a Box Jenkins (p,d,q) model has p + q dimensions. See the one dimensional feasible region for the (0,1,1) model in Column AK starting in Row 2. The magenta curve in the “theta” chart is another way of showing how the SSR value changes with the theta parameter. Right click on the vertical axis of this chart and set the maximum to 5000 to better see the minimum point. In one dimensional grids the parameter steps by 0.01.

The three dimensional feasible region of the (2,0,1) model starts in cell AD286. The grid is shown as 11 triangular slices going down the sheet. Each triangle is a plot of the two autoregressive parameters, $\phi_1$ and $\phi_2$, for a single value of the moving average parameter theta. The step size is 0.2. In the Excel menu bar click on View | Zoom | 25% to see all the slices at once. Note that the minimal, magenta cell is on a boundary. This implies that a simpler Box Jenkins model should be used. For example

    Model with optimum on boundary                 Implied simpler model
    (2,0,1) with $\phi_1 + \phi_2 = 1$             (1,1,1) with $\phi = -\phi_2$
    (2,0,2) with $\phi_1 + \phi_2 = 1$             (1,1,2) with $\phi = -\phi_2$
    (1,1,2) with $\phi = 1$                        (0,2,2)
    (0,2,2) with $\theta_1 + \theta_2 = 1$         (0,1,1) with $\theta = -\theta_2$
    (0,1,1) with $\theta = 1$                      (0,0,0), i.e., random noise about the mean

The algebraic technique for simplifying and finding equivalent Box Jenkins models is called “factoring the auto regressive and moving average operators”.
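The one dimensional grid search for the (0,1,1) model can be sketched in a few lines of Python. This is a hedged illustration rather than the workbook's procedure: it assumes the usual Box Jenkins form in which the one-step forecast is $f_t = z_{t-1} - \theta a_{t-1}$ with $a_t = z_t - f_t$, sets the unknown starting shock to zero, and takes the grid to be $-1 \le \theta \le 1$ in steps of 0.01. The function names and the random-walk test series are hypothetical.

```python
import numpy as np

def ssr_011(z, theta):
    """Sum of squared one-step forecast errors for a (0,1,1) model.

    Forecast: f_t = z_{t-1} - theta * a_{t-1}, where a_t = z_t - f_t.
    Assumption: the shock before the first forecast is taken to be zero.
    """
    a_prev = 0.0
    ssr = 0.0
    for t in range(1, len(z)):
        forecast = z[t - 1] - theta * a_prev
        a_prev = z[t] - forecast          # forecast error at time t
        ssr += a_prev ** 2
    return ssr

def grid_search_011(z, step=0.01):
    """Evaluate SSR on a grid of theta values (assumed range -1 to 1)."""
    n_points = int(round(2.0 / step)) + 1
    thetas = np.linspace(-1.0, 1.0, n_points)
    ssrs = np.array([ssr_011(z, th) for th in thetas])
    best = int(np.argmin(ssrs))
    return thetas[best], ssrs[best], thetas, ssrs

# Hypothetical test series: a random walk.
rng = np.random.default_rng(1)
z = np.cumsum(rng.normal(size=120))
best_theta, best_ssr, thetas, ssrs = grid_search_011(z)
print(best_theta, best_ssr)
```

Plotting `ssrs` against `thetas` gives a curve of the same kind as the “theta” chart described above. With theta equal to zero the model reduces to forecasting each value by the previous one.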

3. Transforming a Time Series

Good forecasting depends on detecting patterns. There are a number of mathematical transformations that are useful in this regard. The “TimeSeries | Transform” submenu lists seven procedures for changing time series. Each procedure writes a new time series on a new worksheet in the existing Excel workbook. The descriptive information for the new time series is automatically modified to show where the data came from and what was done.

3.1 Aggregate

This procedure adds up successive values of the original time series. Use this for converting monthly data to quarterly or annual data, or hourly data to daily, and so on. As an exercise aggregate the “Births” time series by blocks of 12 starting at time value number one. The new series is written on a new worksheet called “<agg 12;1> Births”. The time interval is labeled “<agg 12;1> month”.

3.2 Deviation from Linear

Linear means like a straight line. If we suspect that a time series has an overall linear trend, then the deviations from straight line behavior are key to accurate forecasting. This procedure fits a straight line to the data, then takes the differences from that straight line. As an example create the deviation from linear for the time series “CO2 Concentration”. Note the prefix “<dev>” on the new worksheet tab and on the unit of measure. The original unit of measure is “ppm”, parts per million.

3.3 Difference

Differencing takes the differences between successive values of the original series. Suppose we are interested in monthly electricity usage, but what we have are monthly meter readings in cumulative kilowatt hours. The difference series will give us the kilowatt hours used per month. This procedure uses the prefix “<delta>”.

Take the difference of the “LA Rain” series. As an extended exercise type the number -30 into cell D1, -25 into D2, and so on up to 30 in cell D13. Select the cells E1:E14. Type

    =FREQUENCY(A6:A134,D1:D13)

then press CTRL+SHIFT+ENTER. With E1:E14 still selected click “Insert” on the menu bar, and then choose “Chart | Column | Finish”. The chart shows a ragged approximation to a bell curve. Use Excel Help to learn about the FREQUENCY function.
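For readers who want to check the FREQUENCY exercise outside Excel, the counting rule can be mimicked in Python. The sketch below is an approximation under stated assumptions: the function name `excel_frequency` is hypothetical, the bin values are assumed to be in ascending order, and the sample values are made up rather than taken from the “LA Rain” data.

```python
import numpy as np

def excel_frequency(values, bins):
    """Counts in the style of Excel's FREQUENCY function.

    Returns len(bins) + 1 counts: values <= bins[0], then values in
    (bins[i-1], bins[i]] for each later bin, and finally values > bins[-1].
    Assumption: bins are sorted in ascending order.
    """
    bins = np.asarray(bins, dtype=float)
    values = np.asarray(values, dtype=float)
    # Index of the first bin value that is >= each data value.
    idx = np.searchsorted(bins, values, side="left")
    return np.bincount(idx, minlength=len(bins) + 1)

bins = np.arange(-30, 31, 5)      # -30, -25, ..., 30: the 13 values in D1:D13
sample = [-31, -30, -29.9, 0, 0.1, 30, 31]   # made-up differences
counts = excel_frequency(sample, bins)
print(counts)                     # 14 counts, matching the 14 cells E1:E14
```

The extra fourteenth count holds values above the last bin, which is why the exercise selects one more cell in Column E than there are bin values in Column D.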

3.4 Extend linearly

This fits a straight line to the original series, then appends an equal number of values along the fitted line. This simple type of forecast can be compared with more complex methods. Apply this to “LA Rain”, graph the new series, and note the gradual decline.

3.5 Logarithm

Data from economics and population studies often show compound growth. Taking logarithms converts this to linear growth, which is simpler to forecast. The transformation prefix is “<log>”. Take the logarithm of the “Inventories” series. Then take the difference of “<log> Inventories”. Graph the “<delta> <log> Inventories” series. The numbers approximate the fractional month to month changes. During the year 2001 manufacturing inventories were declining about 1% per month.

3.6 Sample

For the “Births” time series, apply “TimeSeries | Transform | Sample” and sample every 12th value starting at value 3. The series “<sample 12;3> Births” is births in the month of March from 1953 through 1980.

3.7 Seasonal Difference

For the “Births” time series, apply “TimeSeries | Transform | Seasonal Difference” and leave the Seasons per Cycle at the default value of 12. Graph the differenced data, and note that even during the first 90 months, corresponding overall to a rise to the crest of the baby boom, there are many months that show year to year declines. Now apply the same transformation to the CO2 data. In the nearly 40 year history there are only two months with a year to year decline. The trend in CO2 concentration has been steadier than trends in the birth rate.
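Several of the transformations in this section are simple array operations. The Python sketch below illustrates four of them on a made-up series. The function names are hypothetical, and details such as dropping an incomplete final block when aggregating are assumptions that may differ from what the workbook does.

```python
import numpy as np

def aggregate(z, block, start=1):
    """Sum successive blocks of `block` values, beginning at value number `start`.
    Assumption: an incomplete block at the end is dropped."""
    z = np.asarray(z, dtype=float)[start - 1:]
    n_blocks = len(z) // block
    return z[:n_blocks * block].reshape(n_blocks, block).sum(axis=1)

def sample(z, every, start=1):
    """Take every `every`-th value, beginning at value number `start`."""
    return np.asarray(z, dtype=float)[start - 1::every]

def seasonal_difference(z, seasons=12):
    """Difference between each value and the value one cycle earlier."""
    z = np.asarray(z, dtype=float)
    return z[seasons:] - z[:-seasons]

def log_difference(z):
    """Difference of logarithms, approximating fractional period-to-period change."""
    return np.diff(np.log(np.asarray(z, dtype=float)))

z = np.arange(1, 25)                  # 24 made-up "monthly" values: 1, 2, ..., 24
yearly = aggregate(z, 12, 1)          # two annual totals
third_month = sample(z, 12, 3)        # the third value of each 12-value cycle
year_over_year = seasonal_difference(z, 12)
decline = log_difference([100.0, 99.0])   # close to -0.01, a 1% decline
```

The last line shows why the differenced logarithm reads as a fractional change: a drop from 100 to 99 gives a value close to -0.01.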

4. Non-seasonal Forecasting

Click “TimeSeries | Forecast | Non Seasonal” to bring up a graph of the time series and a dialog box with various ways to make a forecast. The dialog box has radio buttons for selecting a forecast procedure and a large control button labeled

    Set the Base Interval, Test Interval and Leadtime

The Base Interval is history that is fed to the forecasting procedure. The forecast is then continued into the Test Interval. Leadtime is how far ahead we are forecasting. Forecast errors are measured for both the Base Interval and the Test Interval. By default the Base Interval is the first half of the time series data, and the Test Interval is the second half. The default Leadtime is one period.

Start the “Forecast” tool on the “Inventories” time series, and click the “Set …” button. Another dialog box appears, just below the graph. The sliders in this box can be used to reset the Base Interval and Test Interval, and change the Leadtime. For example, drag the Base Interval “End” slider until the number below it equals 100. The start of the Test Interval automatically adjusts to 101. The start and end values are related by

    1 ≤ Base Start ≤ Base End ≤ Base End + Leadtime ≤ Test Start ≤ Test End ≤ n

where n is the number of time series values in Column A. Use the spin button in the Leadtime box, at the lower left, to change the Leadtime to 6. Then press the “Record Values” button. The Start and End and Leadtime values are written to the worksheet in Row 2 above the graph. After verifying this, click “Exit” in the first dialog box, do a “TimeSeries | Cleanup”, and start the “Forecast | Non Seasonal” over again with the default values in Row 2.

The numerical values of the forecast are written in columns B, C, and D: B for the Base Interval, D for the Test Interval, and C for any gaps.

4.1 Structural Models

These simple models serve as a basis of comparison. Anything more complicated should deliver lower forecast errors.

4.1.1 The Naïve Model

The forecast for time t at Leadtime l is the time series value at time t – l, that is

$f_t = z_{t-l}$
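The Naïve Level rule and its percentage error on a test stretch can be sketched as follows. This is a minimal Python illustration, not the workbook's macro; the function names `naive_level_forecast` and `mape`, the use of NaN for times with no forecast, and the four made-up values are all assumptions for the example.

```python
import numpy as np

def naive_level_forecast(z, leadtime=1):
    """Forecast f_t = z_{t-leadtime}; times with no forecast are NaN."""
    if leadtime < 1:
        raise ValueError("leadtime must be at least 1")
    z = np.asarray(z, dtype=float)
    f = np.full(len(z), np.nan)
    f[leadtime:] = z[:-leadtime]       # shift the series forward by the leadtime
    return f

def mape(z, f):
    """Mean absolute percentage error over the times that have a forecast."""
    z = np.asarray(z, dtype=float)
    f = np.asarray(f, dtype=float)
    ok = ~np.isnan(f)
    return 100.0 * np.mean(np.abs((z[ok] - f[ok]) / z[ok]))

z = np.array([100.0, 110.0, 121.0, 133.1])   # made-up series growing 10% per period
f = naive_level_forecast(z, leadtime=1)
test_mape = mape(z[2:], f[2:])               # treat the last two values as the test stretch
print(f, test_mape)
```

Because the naïve forecast always lags the series by the leadtime, a series that grows steadily, as in this example, produces a persistent one-sided error.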

Click the radio button next to “Naïve Level”. The orange squares of the forecast on the Base Interval cover the black diamonds of the time series itself. On the Test Interval open red diamonds represent the forecast. Again, the forecast closely follows the time series values, because for this series there is not much change from one period to the next. The forecast model and the forecast errors are summarized in cells P29:R38. Anything more complicated than the Naïve forecast should have lower errors at the same leadtime. In particular, at leadtime 1 we should expect a Mean Absolute Percentage Error on the Test Interval less than 0.57%.

4.1.2 Naïve Trend

Create the Naïve Trend forecast by clicking the respective radio button. In essence the Naïve Trend draws a straight line between the most recent two data points and forecasts the future along that line. At Leadtime l the forecast is

$f_t = z_{t-1} + l \times (z_{t-1} - z_{t-2})$

The Mean Absolute Percentage Error on the Test Interval is now 0.41%. It does a little better than the Naïve Level forecast because there are some stretches in the time series with straight line movement. For time series with frequent changes of direction the Naïve Level forecast will often do better than the Naïve Trend.

4.2 Smoothing Models

Forecasts built on smoothing assume that to some degree the short term changes in the time series are only random noise.

4.2.1 Base Average

This assumes that all the changes in the time series are random noise. The forecast is always the mean value of the time series over the Base Interval. Click the radio button for “Base Average”. The level line at 414.8 is the average inventory value over the first 93 values. The Mean Absolute Percentage Error on the Test Interval is a poor 8.46%.

4.2.2 Moving Average

The number of terms in the average is a parameter, call it m. The forecast is

$f_t = (z_{t-1} + z_{t-2} + \cdots + z_{t-m}) / m$

Run the “Moving Average” forecast with six terms in the average. Note the Test MAPE of 1.81%. Rerun the “Moving Average” forecast and click the “Optimize” button, then the “Forecast” button. The value of m that minimizes the Base SSR is m = 1. That puts us back at the Naïve Level forecast.

4.2.3 Moving Trend

This fits a straight line to the most recent m points and forecasts along the line. Run the “Moving Trend” forecast and optimize. The resulting three point moving trend has Test MAPE 1.06%.

4.2.4 Exponential Smoothing

The forecast is

$f_t = (1 - \alpha) f_{t-1} + \alpha z_{t-1}$
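The exponential smoothing recursion can be written out in a few lines of Python. This is a sketch under stated assumptions, not the workbook's code: the function name `exponential_smoothing` is hypothetical, and the recursion is started by setting the first forecast equal to the first observation, which is one common convention among several.

```python
import numpy as np

def exponential_smoothing(z, alpha):
    """One-step forecasts f_t = (1 - alpha) * f_{t-1} + alpha * z_{t-1}.

    Assumption: the recursion starts with f equal to the first observation.
    """
    z = np.asarray(z, dtype=float)
    f = np.empty(len(z))
    f[0] = z[0]                        # assumed starting value
    for t in range(1, len(z)):
        f[t] = (1.0 - alpha) * f[t - 1] + alpha * z[t - 1]
    return f

z = [10.0, 20.0, 30.0]                 # made-up values
smooth = exponential_smoothing(z, alpha=0.5)
naive = exponential_smoothing(z, alpha=1.0)   # each forecast is the previous value
print(smooth, naive)
```

With alpha set to 1.0 the weight on the previous forecast vanishes, so each forecast is simply the preceding observation.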

The parameter alpha (α) is a number between 0 and 2. For small alpha this is akin to a moving average with m = 1/α terms. Try the “Exponential Smoothing” forecast with an alpha between 0.2 and 0.5, then use the “Optimize” button. The value 1.38 produces a Test MAPE of 0.47%. Values of alpha greater than one extrapolate a trend in the data. Alpha equal to 1.0 reproduces the Naïve Level model.

4.2.5 Double Exponential Smoothing

This model is useful when a time series has stretches approximating straight lines. There are two smoothing parameters: alpha for level, and beta for slope. Small values give smoothly varying forecasts. Run this for the “Inventories” series, with optimization of alpha and beta, to see a Test MAPE of 0.38%. Notice that the name of the forecast model and the values of the parameters are written in cells P29:Q31.

4.3 Statistical Models

Statistical models assume an explicit contribution from random effects. For example

z_t = deterministic function of past information + a_t

assumes an additive random effect. Typically the random variables a_1, a_2, and so on are assumed to have a common probability distribution with zero mean.

4.3.1 Box Jenkins (p,d,q)

These models were first discussed in Section 2.5. As statistical models they have the form

z_t = F(z_{t-1}, z_{t-2}, … , a_{t-1}, a_{t-2}, …) + a_t

where F is a linear function of its arguments and a_t, a_{t-1}, and so on are independent, identically distributed normal random variables. The forecast is the F function. The a’s are the forecast errors, and their common variance is a measure of forecast accuracy.

From Section 2.5 we know that (0,2,2) is a likely candidate for forecasting the “Inventories” time series. With the “Forecast” tool we can see how well this model performs when its parameters are taken from a Base Interval and applied to a Test Interval. With the “Inventories” time series run “TimeSeries | Forecast | Non Seasonal”, keep the default selections

    Base Interval [1, 93]
    Test Interval [94, 186]
    Leadtime 1

and click the radio button for Box Jenkins models. In the next dialog box use the spin buttons to change the Differencing degree to d = 2 and the Moving Average degree to q = 2. Click the “Optimize” button, then the “Forecast” button, then the “Exit” button. The forecast is shown graphically and summarized in the block P29:R38, namely

| row | column P | column Q | column R | description |
|---|---|---|---|---|
| 29 | Model: | Box Jenkins (0,2,2) | | model name |
| 30 | MA parm1 | 0.701 | | parameter value |
| 31 | MA parm2 | 0.022 | | parameter value |
| 32 | | | | |
| 33 | | | | |
| 34 | Error Type | Base | Test | |
| 35 | SSR | 1.30E+02 | 4.18E+02 | sum of squared errors |
| 36 | RMSE | 1.18E+00 | 2.12E+00 | root mean square error |
| 37 | MAD | 9.14E-01 | 1.72E+00 | mean absolute deviation |
| 38 | MAPE | 0.22% | 0.38% | mean absolute percentage error |

Cell Q29 identifies the type of model. Cells Q30 and Q31 are the values of the q = 2 moving average parameters. These values were calculated in the optimization routine to minimize the sum of squared forecast errors (SSR) on the Base Interval. The algebraic definitions of the error measures are in Section 2.5.1 above.

SSR and RMSE are both closely related to the statistical assumptions behind Box Jenkins models. RMSE is a point estimate of the standard deviation of the forecast errors on the Base Interval. Both SSR and RMSE give more weight to larger errors. Minimizing SSR and RMSE gives preference to forecasting methods that have many small errors rather than a few large errors. RMSE has the same unit of measure as the underlying time series, for example, dollars or inches.

Since the model parameters were chosen to minimize SSR on the Base Interval, it is natural that the Test Interval SSR is larger, even though the number of points is the same. The “optimal” parameter values are only point estimates from the data in the Base Interval. Even when the Base Interval is truly representative of the longer time series, the gray region in the associated Likelihood plot indicates the uncertainty in the “optimal” values.

MAD gives the same weight to small errors as to large errors, for example, errors of 1.5 and 2.5 together count as much as one error of 4.0. MAD has the same unit of measure as the underlying time series.

MAPE is a dimensionless percentage. It won’t work if the underlying time series has zero values. Otherwise MAPE can be a good way to compare forecasting models over different intervals and different time series.

The statistical assumptions of the Box Jenkins models make it possible to construct confidence intervals. For each moment in time the green interval, centered on the forecast, marks the 50% confidence interval. Blue extends the confidence interval to 95%. To hide the confidence intervals, see Section 6.2.
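The four error measures can be computed outside the workbook as a cross-check. The sketch below assumes the usual textbook definitions (errors are actual minus forecast, and MAPE averages the absolute error relative to the actual value); the workbook's own definitions are in Section 2.5.1 and may differ in detail. The function name and the sample numbers are made up.

```python
import math


def error_measures(actual, forecast):
    """Return SSR, RMSE, MAD and MAPE for paired actual and forecast values."""
    errors = [z - f for z, f in zip(actual, forecast)]
    n = len(errors)
    ssr = sum(e * e for e in errors)            # sum of squared errors
    rmse = math.sqrt(ssr / n)                   # same units as the series
    mad = sum(abs(e) for e in errors) / n       # same units as the series
    if any(z == 0 for z in actual):
        mape = None                             # undefined when the series has zeros
    else:
        mape = 100.0 * sum(abs(e / z) for e, z in zip(errors, actual)) / n
    return {"SSR": ssr, "RMSE": rmse, "MAD": mad, "MAPE": mape}


actual = [100.0, 200.0, 400.0]       # made-up values
forecast = [90.0, 220.0, 400.0]
measures = error_measures(actual, forecast)
```

Returning None for MAPE when the series contains a zero is a choice made for this sketch; it mirrors the remark that MAPE won't work for such series.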

5. Seasonal Models

All these models incorporate a repeating pattern of “seasons”. Running “TimeSeries | Forecast | Seasonal” brings up a dialog box with “Seasons per Cycle” at the top. The default value is 12. Adjust the value up or down with the adjacent spin button. The number of “seasons” is often known ahead of time.

| Seasons per Cycle | type of data |
|---|---|
| 12 | monthly |
| 4 | quarterly |
| 2 | semi annual |
| 7 | daily, when there is a weekly cycle |

The Periodogram discussed in Section 2.2 can uncover other frequencies and cycle lengths.

Explore the seasonal models using “CO2 Concentration”. This is a long series of monthly measurements of carbon dioxide in the atmosphere made at the summit of Mauna Loa, Hawaii, far from industrial emission sources. Having more growing plants in the northern hemisphere presumably causes the month to month variation.

5.1 Structural

Seasonality enters these models in relatively simple ways.

5.1.1 Naïve

The forecast is the value at the same season in the preceding cycle. Let the number of seasons per cycle be s. At leadtimes less than or equal to s the forecast for time t is the time series value at time t – s, that is

f_t = z_{t–s}

To emphasize how the seasonal pattern is propagated, select the “CO2 Concentration” series and run “TimeSeries | Forecast | Seasonal” with 12 seasons per cycle. Set the leadtime to “increasing from 1”, and click the “Record Values” button. Click the radio button next to “Naïve Level” to see the forecast on the graph. Note the repeating pattern on the Test Interval. This setting for leadtime means the forecast uses only the information from the Base Interval. We are forecasting farther into the future with each month into the Test Interval.

5.1.2 Naïve Trend

The forecast is the level at the same season in the immediate prior cycle plus the most recent cycle to cycle change

f_t = z_{t–s} + z_{t–1} – z_{t–1–s}
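Both structural seasonal formulas are easy to check by hand or with a short script. The Python sketch below computes them at leadtime 1 for a made-up quarterly series; the function names are illustrative, and positions are counted from 0 rather than from 1 as in the notes.

```python
def seasonal_naive(z, s):
    """f[t] = z[t - s]; positions without a full prior cycle get None."""
    return [z[t - s] if t >= s else None for t in range(len(z))]


def seasonal_naive_trend(z, s):
    """f[t] = z[t - s] + z[t - 1] - z[t - 1 - s]; needs s + 1 prior values."""
    return [z[t - s] + z[t - 1] - z[t - 1 - s] if t >= s + 1 else None
            for t in range(len(z))]


# Made-up quarterly series (s = 4) whose pattern repeats 10 units higher each cycle
series = [1, 5, 3, 2, 11, 15, 13, 12, 21]
level = seasonal_naive(series, 4)
trend = seasonal_naive_trend(series, 4)
```

For this artificial series the Naïve forecast lags every value by the cycle to cycle rise of 10, while the Naïve Trend forecast picks up that rise and matches the later values exactly.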

With “CO2 Concentration” and leadtime still set to “increasing from 1”, click the radio button next to “Naïve Trend”. In this situation the forecast from a simple model is quite good.

5.1.3 Base Additive

The forecast model is

f_t = b_0 + b_1 t + Index_k

where k is a number between 1 and s, corresponding to the season of time t. The numbers b_0 and b_1 and additive indices Index_1, Index_2, … , Index_s are computed from the Base Interval. To see the values of the additive indices, click the button “Write Seasonal Indices” in the dialog box. The indices are written in Column I starting in Row 29. For the “CO2 Concentration” series Season 1 is January, and for the Base Interval [1, 93] the January index is 0.0945 parts per million below the annual average.

5.1.4 Base Multiplicative

The forecast model is

f_t = (b_0 + b_1 t) (Index_k)

where k is a number between 1 and s, corresponding to the season of time t. The numbers b_0 and b_1 and multiplicative indices Index_1, Index_2, … , Index_s are computed from the Base Interval. To see the values of the multiplicative indices, click the button “Write Seasonal Indices” in the dialog box. The indices are written in Column J starting in Row 29. For the “CO2 Concentration” series Season 5 is May, and for the Base Interval [1, 93] May is 0.87% above the annual average. Similarly October is 0.97% below the average.
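The notes do not say how the workbook computes b_0, b_1 and the seasonal indices. One plausible recipe, assumed here purely for illustration, is to fit a least-squares line over the Base Interval and then average, season by season, either the differences from the line (additive) or the ratios to the line (multiplicative). The function names and the sample series are hypothetical.

```python
def fit_line(z):
    """Least-squares line z ~ b0 + b1 * t for t = 1..n (assumed method)."""
    n = len(z)
    t_mean = (n + 1) / 2
    z_mean = sum(z) / n
    sxx = sum((t - t_mean) ** 2 for t in range(1, n + 1))
    sxz = sum((t - t_mean) * (z[t - 1] - z_mean) for t in range(1, n + 1))
    b1 = sxz / sxx
    return z_mean - b1 * t_mean, b1


def seasonal_indices(z, s, multiplicative=False):
    """Average the deviation from the line (or the ratio to it) by season."""
    b0, b1 = fit_line(z)
    sums, counts = [0.0] * s, [0] * s
    for t in range(1, len(z) + 1):
        line = b0 + b1 * t
        k = (t - 1) % s                      # season of time t, counted from 0
        sums[k] += z[t - 1] / line if multiplicative else z[t - 1] - line
        counts[k] += 1
    return b0, b1, [total / count for total, count in zip(sums, counts)]


def forecast(t, b0, b1, indices, multiplicative=False):
    """Evaluate the Base Additive or Base Multiplicative model at time t."""
    line = b0 + b1 * t
    index = indices[(t - 1) % len(indices)]
    return line * index if multiplicative else line + index


series = [103, 101, 104, 108, 111, 109, 112, 116]   # made-up, two cycles of s = 4
b0, b1, add_idx = seasonal_indices(series, 4)
next_value = forecast(9, b0, b1, add_idx)           # first season of the next cycle
```

Under this recipe the additive indices sum to zero when the Base Interval holds whole cycles, and multiplicative indices cluster around 1, which is consistent with reading them as amounts or percentages above and below the average.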

For some time series we know that seasonal effects are multiplicative rather than additive. For “CO2 Concentration” there is little practical difference in the forecasts.

5.2 Smoothing Models

The Holt Winters models are generalizations of exponential smoothing to seasonal time series.

5.2.1 Holt Winter Additive

Continuing with “CO2 Concentration”, run “TimeSeries | Forecast | Seasonal”, and click the radio button for “Holt Winter Additive”. A dialog box appears for setting the values of the smoothing parameters alpha, beta, and gamma. There are default values that can be manually changed with spin buttons, and there is a command button for automatic optimization using the data in the Base Interval. The “Forecast” button plots the forecast and writes a summary in cells P29:R38.

5.2.2 Holt Winters Multiplicative

This is the same as the additive Holt Winters model, except the seasonal component has a multiplicative effect. Whether that makes a difference in forecasting depends on the particular time series.

5.3 Statistical Models

As in the non-seasonal case, statistical seasonal models assume an explicit contribution from random effects.

5.3.1 Box Jenkins (0,1,1)x(0,1,1)

This two parameter model often outperforms the three parameter Holt Winters models. Clicking the radio button adjacent to the model name brings up a dialog box for the parameters, called theta and Theta. The “Optimize” button will find the parameter pair that minimizes the sum of squared errors on the Base Interval. The steps in the optimization calculation are shown in Columns T through AC, starting in Row 43. A likelihood plot for a grid of theta by Theta values is shown in cells S19:AO41. For “CO2 Concentration” note the optimal grid cell AG21 and the surrounding gray area that represents the 95% confidence region for the optimal parameters.

6. Tips and More Examples

6.1 Locations of Data and Screen Objects

6.1.1 Data

Many data values are written to the worksheet. The following long table summarizes the cell locations of various data. The data actually on the worksheet depend on what procedures have been run. If the data are plotted, the approximate chart location is given. Locations can vary based on these two parameters.

    n   number of data points in the original series
    s   number of seasons per cycle

| Category of Data | Specific Description | Data Location | Chart |
|---|---|---|---|
| time series | differencing | | |
| | original data | A1:A(5+n) | A4:O27 |
| | first difference | A(5+n+2):A(10+2n-1) | |
| | second difference | A(10+2n+1):A(15+3n-1) | |
| time intervals | type | | |
| | Base Interval | D1:E2 | A4:O27 |
| | Test Interval | K1:L2 | A4:O27 |
| | Leadtime | H1:H2 | A4:O27 |
| forecast | aspect | | |
| | model | K30:L30 | |
| | parameters | K31:M34 | |
| | errors | K35:M39 | |
| | Base Interval values | B: | A4:O27 |
| | intermediate values | C: | A4:O27 |
| | Test Interval values | D: | A4:O27 |
| | residual values | E: | |
| auto correlations | differencing | | |
| | original data | P13:Q(13+min(36,n\4)) | B5:N36 |
| | first difference | P(13+5+n+1):Q | B38:N70 |
| | second difference | P(13+10+2n+1):Q | B71:N103 |
| | summary | O5:R11 | |
| likelihood maps | model | | |
| | (1,0,0) | AD3:AE203 | |
| | (1,1,0) | AF3:AF203 | |
| | (1,2,0) | AG3:AG203 | |
| | (0,0,1) | AI3:AJ203 | |
| | (0,1,1) | AK3:AK203 | |
| | (0,2,1) | AL3:AL203 | |
| | (2,0,0) | AD206:BT229 | |
| | (2,1,0) | BV206:DL229 | |
| | (2,2,0) | DN206:FD229 | |
| | (1,0,1) | AD231:AZ254 | |
| | (1,1,1) | BB231:BX254 | |
| | (1,2,1) | BZ231:CV254 | |
| | (0,0,2) | AD256:BT279 | |
| | (0,1,2) | BV256:DL279 | |
| | (0,2,2) | DN256:FD279 | |
| | (0,2,2) in lambda coordinates | FF236:GA279 | |
| | (2,0,1) | AD286:AZ418 | |
| | (2,1,1) | BB286:BX418 | |
| | (2,2,1) | BZ286:CV418 | |
| | (1,0,2) | AD426:AZ558 | |
| | (1,1,2) | BB426:BX558 | |
| | (1,2,2) | BZ426:CV558 | |
| | (2,0,2) | AD566:DN617 | |
| | (2,1,2) | AD621:DN672 | |
| | (2,2,2) | AD676:DN727 | |
| likelihood maps | (0,1,1)x(0,1,1)s | S19:AO41 | |
| likelihood maps | suite summary | J2:O30 | |
| optimization | suite summary | E2:J30 | |
| | (p,d,q) steps | I81:T112 | |
| | (0,1,1)x(0,1,1)s steps | T43:AC64 | |
| moving average | | F26:K(25+n) | |
| moving trend | | F27:K(25+n) | |
| periodogram | | E5:G(5+n/2) | |
| psi weights | | U100:X(100+600+1) | |
| seasonal indices | | H28:J(28+s+1) | |

6.1.2 Screen Objects

The screen objects are forms, for selecting what to do and what parameters to use, and charts that summarize results. These objects are essentially rectangular. Location is specified by the left top corner of the rectangle, and by the size in terms of width and height. The screen objects are best viewed on a screen at least 1280 pixels wide by 1024 pixels high. At the common Windows resolution of 96 pixels per inch this is equivalent to a screen that is 13.3 inches wide and 10.7 inches high.

The tables below list the forms and charts and their locations. Numbers are in points. One point is 1/72 of an inch.

In the case of forms, Left and Top in pure numbers are positions to the right and down from the left top of the screen.

| Visual Basic Form | Left | Top | Width | Height |
|---|---|---|---|---|
| BaseTestLTForm | 50.25 | 440.25 | 720 | 196.5 |
| BJParameters | 150 | 450 | 200.25 | 197.25 |
| FilterForm | 150 | 150 | 269.25 | 285.75 |
| ModelParameters | 150 | 450 | 200.25 | 200.25 |
| NonSeasonalForm | 0 | 0 | 135 | 313.5 |
| SeasonalForm | 0 | 0 | 135 | 350.25 |
| SingleParameter | 150 | 150 | 375.75 | 100.5 |
| TransformParameters | 150 | 150 | 216 | 125.25 |

In the case of charts, Left and Top in pure numbers are positions to the right and down from the left top of Cell A1 on the worksheet. The position of a chart on the screen depends on the position of Cell A1. It will change as the Excel window is changed, or when changing to full screen mode or back. A plus sign means successive charts add an increment to space them out.

| Excel Chart | Left | Top | Width | Height |
|---|---|---|---|---|
| Analyze \| Graph | B6 Left | B6 Top | 800 | 400 |
| auto correlation, series | 60 | 51 + | 600 | 200 |
| auto correlation | 60 | 256 + | 600 | 200 |
| forecast | A4 Left | A4 Top | 720 | 300 |
| periodogram, by frequency | H1 Left | H1 Top | 500 | 300 |
| periodogram, by time | H1 Left | H1 + | 500 | 300 |
| smoothing, moving average | C6 Left | C6 Top | 700 | 400 |
| smoothing, oscillation | C13 Left | C13 Top | 700 | 400 |
| Box Jenkins suite, (1,d,0) models | C35 Left | C35 Top | 385 | 500 |
| Box Jenkins suite, (0,d,1) models | C35 + | C35 Top | 385 | 500 |

6.2 Customizing Charts

Charts can often be enhanced by changing the vertical or horizontal scaling, background shading, or data markers. Indeed, the object of writing data and charts into Excel worksheets is to have the wide range of Excel tools immediately available.

As an example, start with the “LA Rain” series, and run “TimeSeries | Forecast | Non Seasonal”. Generate the Box Jenkins forecast with p = 1, d = 0, q = 0 and the parameter equal to 0.5. Click “Exit” in both dialog boxes. This is a poor forecast, and the green and blue confidence intervals obscure what is going on. To turn the confidence intervals off, right click on the chart. Below “Area” click the radio button next to “None”, then click “OK”. Use the “Undo” and “Redo” control buttons in Excel to toggle back and forth between showing the confidence intervals and not.

As a second example start with the “Inventories” time series. Run “TimeSeries | Analyze | Box Jenkins Suite”. There are two charts below the numerical comparison of the 27 Box Jenkins models. The chart on the right has “theta” printed at the bottom. Theta is the single parameter in each of the (0,0,1), (0,1,1) and (0,2,1) model types. Each model type and each value of theta generate a forecast of the time series. The dark blue, magenta and yellow curves show how SSR, the sum of squared forecast errors, varies with the parameter theta. See the legend at the top of the chart. To make the behavior of the curves apparent, right click on the vertical axis of the chart. Choose “Format Axis | Scale” and change the scale maximum to 75000. Now we can see that the minimum SSR for the (0,0,1) model type occurs near theta = –1. Further reduce the maximum of the vertical scale to 5000. Let the cursor hover along the yellow curve to see that the low SSR for the (0,2,1) model type occurs near theta = 0.6. The data for the curve are in Column AL. The gray 95% confidence region around the minimum actually extends from theta = 0.46 to theta = 0.66.

Use the series “LA Rain” for a third example. Run “TimeSeries | Forecast | Non Seasonal | Naïve Trend”. Click the “Exit” button to close the dialog box and so return to the normal Excel mode. The Naïve Trend forecast often shoots off the scale. Right click on the vertical axis, choose “Format Axis | Scale”, and check the Auto option for both Minimum and Maximum. Pick several forecast values. By visual inspection in each case verify that the forecast is the preceding time series value plus the preceding change in values. Of course this is a poor forecast for rain in Los Angeles.

6.3 Using Data Tables for Further Analysis

The data in any of the tables listed in Section 6.1 can be combined, transformed or graphed. As an example we will investigate the suitability of the seasonal Box Jenkins forecast for the “Births” time series.

Starting with the “Births” series, run “TimeSeries | Transform | Extend Linearly”, then with the extended series run “TimeSeries | Forecast | Seasonal” with the default Base Interval, which is the full range of the original series. Choose the Box Jenkins forecast type, “Optimize” the parameters, then “Forecast”, then click the “Exit” button. Click and drag the graph to the right, to expose Column E. Column A has the original births data in Rows 6 through 341. The same rows in Column E contain the forecast errors, also called the forecast residuals. If the residuals are random noise, the original forecast is as good as we can do. A non-random pattern could be refined into a better forecast.

Insert a new worksheet into the workbook. Copy

    Birth Residuals
    registered live births X 1000
    month
    1953 Jan
    1980 Dec

into the top five rows of Column A. From the worksheet “<ext> Births” select and copy the range of residual values, E6:E341. In the new worksheet select cell A6 and do “Edit | Paste”. Now run “TimeSeries | Analyze | AutoCorrelations”. The low autocorrelations at lags 1 through 36 support a hypothesis that the successive forecast errors are independent. Scroll the screen over to view Columns T through AH. The AutoCorrelation procedure compares the distribution of time series values to a normal distribution with the same mean and standard deviation. Cell AB3 has the probability that the chi squared statistic would be greater than or equal to the observed value. In this case there is a 43% chance that normally distributed forecast errors would generate a pattern with a chi squared statistic this big. The forecast errors are like white noise.
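The autocorrelation check on the residuals can also be reproduced outside Excel. The sketch below uses the standard sample autocorrelation formula, which is assumed rather than confirmed to match what the workbook computes. The function name and the short alternating test series are made up for illustration.

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)             # lag-0 sum of squared deviations
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(1, max_lag + 1)]


# An alternating series is strongly non-random: large negative lag-1 value
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
r = autocorrelations(alternating, 2)
```

A common rule of thumb, also an assumption here, is that for n independent residuals the sample autocorrelations should mostly stay within about 2/sqrt(n) of zero. For the 336 births residuals that bound is roughly 0.11.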

6.4 Input Data in Other Workbooks

The forecast routines in the workbook “Forecast Time Series” can be applied to data in other workbooks. Time series should be written on worksheets as described in Section 1.2. See for example the workbook “time series 0”, which contains test series whose graphs are simple geometric patterns.

Before working on the series in the new workbook, open “Forecast Time Series” in Excel and press ALT+F11. This opens the Visual Basic Editor. The project explorer window should be visible, usually in the upper left corner. If not, press CTRL+R. In the project explorer window find “VBA Project (Forecast Time Series)”. Under that find “Microsoft Excel Objects”. Under that find and double click “ThisWorkbook”. In the code window type a single apostrophe in front of the line that reads Run "DeleteTSMenu". Open the workbook with the new time series data you want. Proceed as before, using “TimeSeries” in the menu bar.

7. Sources of Time Series Data

7.1 Your Own Records

Your own organization, even your own household, should generate many time series. The only challenge is to consistently record and preserve measurements. A few examples of internally generated time series are

| Organization | Time Series |
|---|---|
| retailer | daily, weekly, or monthly dollar sales |
| manufacturer | monthly factory labor, by hours or cost |
| manufacturer | quarterly sales of a product, in dollars or units |
| government | monthly sales tax revenues |
| government | weekly building permits issued |
| church | weekly attendance at services |
| college | percentage each term of admitted students who enroll |
| bank | new loans made each week, by number or dollar total |
| farm | crop yield per acre year by year |
| hospital | patient days of service provided each month |
| household | monthly electricity usage in kilowatt hours |

7.2 U. S. Government

The following are treasure troves.

    United States Department of Commerce
        Bureau of the Census
        Bureau of Economic Analysis
        National Oceanic and Atmospheric Administration
    United States Bureau of Labor Statistics

7.3 Internet

Searching for terms and names mentioned elsewhere in this commentary will turn up many time series. The internet also has many articles on forecasting.

7.4 Trade Associations

Associations exist for hundreds of fields in commerce and science. For example

    building
    energy
    explosives
    farming
    gaming
    manufacturing
    newspapers
    pet food
    publishing
    railroads
    resorts

7.5 Libraries

Not everything is on the internet. Even if it is, some things may be more accessible in a library. Some libraries are official repositories of information, or they hold special collections. Librarians are the original search engines. They know how to look for information.

8. Special Cases of Forecasting Models

Box Jenkins (p,d,q) models
    p is the autoregressive order
        the p autoregressive parameters are commonly denoted by the Greek letter phi, with subscripts: φ_1, φ_2, … , φ_p
    d is the degree of differencing
    q is the moving average order
        the q moving average parameters are commonly denoted by the Greek letter theta, with subscripts: θ_1, θ_2, … , θ_q

Box Jenkins (1,1,1)
    theta = 1: (1,0,0)
        in general, the forecast is phi*(Prior Level – Mean) + Mean
        phi = 1: (0,1,0), Naïve Level
        phi = 0: (0,0,0), forecast is the mean of the base interval
        phi = –1: Naïve reflection about the mean
    theta = 0: (1,1,0)
        in general, the forecast is Naïve Level + phi*(Prior Trend)
        phi = 1: (0,2,0), Naïve Trend
        phi = 0: (0,1,0), Naïve Level
        phi = –1: Naïve Level at Leadtime 2
    phi = 1: (0,2,1)
        theta = 1: (0,1,0), Naïve Level
        theta = 0: (0,2,0), Naïve Trend
        theta = –1: Naïve Trend plus Error Correction
    phi = 0: (0,1,1)
        theta = 1, lambda = 0: (0,0,0), forecast is the mean of the base interval
        theta = 0, lambda = 1: (0,1,0), Naïve Level
        theta = –1, lambda = 2: Naïve Error Correction
    phi = –1:
        theta = 1: Naïve oscillation about the mean
        theta = 0: Naïve Level at Leadtime 2
        theta = –1: (0,1,0), Naïve Level
    phi = theta: (0,1,0), Naïve Level

Box Jenkins (0,2,2)
    lambda1 = 0: (0,1,1), θ_1 + θ_2 = 1, (1 – θ_1 B – θ_2 B^2) = (1 – B)(1 + θ_2 B)
        lambda0 = 0: (0,0,0), white noise about the mean
        lambda0 = 1: (0,1,0), Naïve Level
        lambda0 = 2: Naïve Error Correction
    lambda0 = 0, θ_2 = –1:
        lambda1 = 4
    lambda0 = 1, θ_2 = 0:

Box Jenkins (2,0,0)
    phi1 = 2 and phi2 = –1: (0,2,0), Naïve Trend
    phi1 = –2 and phi2 = –1:
    phi1 = 0 and phi2 = 1: Naïve at Leadtime 2
    phi2 = 0: (1,0,0)

Box Jenkins (2,1,0)
    phi1 = 2 and phi2 = –1: (0,3,0), Naïve parabolic
    phi2 = 0: (1,1,0)
        phi = 0: (0,1,0), Naïve Level

Box Jenkins (2,2,0)
    phi2 = 0: (1,2,0)
        phi = 0: (0,2,0), Naïve Trend

Moving Average
    equivalent to autoregressive (n,0,0) with first n phi’s equal 1/n

Moving Linear Fit
    n = 2: Naïve Trend
    equivalent to autoregressive (n,0,0) with phi’s
        n = 2: 2, –1
        n = 3: 4/3, 1/3, –2/3
        n = 4: 1, 0.5, 0, –0.5 [Note that phi3 is zero.]
        n = 5: 0.8, 0.5, 0.2, –0.1, –0.4
        general n: phi_i = 2(3(n + 1 – i) – n – 2) / (n(n – 1))
        e.g., phi_1 is 2(3(n + 1 – 1) – n – 2) / (n(n – 1)) = 4/n
    The coefficients in y = a + bx are, with i in time order, 1 to n
        a = Σ y_i (2n + 1 – 3i) · 2/(n(n – 1))
        b = Σ y_i (2i – 1 – n) · 2/(n(n – 1))
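The general formula for the Moving Linear Fit weights can be checked against a direct straight-line fit. The Python sketch below is only an illustration with made-up numbers and hypothetical function names. It builds the weights with exact fractions and compares the weighted sum with an ordinary least-squares line fitted at x = 1 to n and evaluated at x = n + 1.

```python
from fractions import Fraction


def moving_trend_weights(n):
    """phi_i = 2(3(n + 1 - i) - n - 2) / (n(n - 1)), i = 1 (most recent) .. n."""
    return [Fraction(2 * (3 * (n + 1 - i) - n - 2), n * (n - 1))
            for i in range(1, n + 1)]


def line_extrapolation(y):
    """Fit a least-squares line to y at x = 1..n and evaluate it at x = n + 1."""
    n = len(y)
    x_mean = Fraction(n + 1, 2)
    y_mean = Fraction(sum(y), n)
    sxx = sum((x - x_mean) ** 2 for x in range(1, n + 1))
    sxy = sum((x - x_mean) * (y[x - 1] - y_mean) for x in range(1, n + 1))
    slope = sxy / sxx
    return y_mean + slope * (n + 1 - x_mean)


recent = [3, 7, 4, 9]                        # made-up values, oldest first
weights = moving_trend_weights(len(recent))
# phi_1 multiplies the most recent value, so pair weights with reversed data
weighted = sum(w * z for w, z in zip(weights, reversed(recent)))
```

For these four points the weights are 1, 1/2, 0 and –1/2, and both routes give a forecast of 9.5. The weights always sum to one, so a constant series is forecast unchanged.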
