Forecasting - Georgia State University (MGS3100 lecture slides)
TRANSCRIPT
-
Module 4. Forecasting
MGS3100
-
Forecasting
-
Quantitative Forecasting
Causal models: Year 2000 Sales is predicted from other variables, such as Price, Population, and Advertising.
Time series models: Year 2000 Sales is predicted from its own past values: Sales 1999, Sales 1998, Sales 1997.
Forecasting is based on data and models.
-
Causal forecasting
Regression
Find a straight line that fits the data best.
y = Intercept + slope * x (= b0 + b1x)
Slope = change in y / change in x
[Chart: scatter of Shoe Size (Y) vs. Age, with the best-fit line and the intercept marked]
Raw Data
Example of Simple Regression - Does Shoe Size among teenagers depend on Age?
(Can you predict the shoe size if you know the age?)
Age   Shoe Size
11    5
12    6
12    5
13    7.5
13    6
13    8.5
14    8
15    10
15    7
17    8
18    11
18    8
19    11
Simple
Example of Simple Regression - Does Shoe Size among teenagers depend on Age?
(Can you predict the shoe size if you know the age?)
Age (X)   Shoe Size (Y)   Deviation from the Mean   Squared Deviation
11        5               -2.7692                   7.6686
12        6               -1.7692                   3.1302
12        5               -2.7692                   7.6686
13        7.5             -0.2692                   0.0725
13        6               -1.7692                   3.1302
13        8.5              0.7308                   0.5340
14        8                0.2308                   0.0533
15        10               2.2308                   4.9763
15        7               -0.7692                   0.5917
17        8                0.2308                   0.0533
18        11               3.2308                   10.4379
18        8                0.2308                   0.0533
19        11               3.2308                   10.4379

Mean Shoe Size = 7.7692        Sum of Squared Deviations = 48.8077
(The Sum of Squared Deviations is shown in the ANOVA table as the SS Total.)

SUMMARY OUTPUT

Regression Statistics
Multiple R:          0.7985
R Square:            0.6376    (R-Squared = SSR/SST = 31.119/48.807, from the ANOVA table below)
Adjusted R Square:   0.6047
Standard Error:      1.2681    (Std. Error is the square root of the Mean Squared Error)
Observations:        13        (n, the number of observations)
ANOVA
(The Mean Squares (MS) are computed by dividing SS (Sum of Squares) by the degrees of freedom. The df for Regression is k, the number of independent (predictor) variables, in this case just 1 (Age).)

                   df    SS        MS        F         Significance F
Regression         1     31.1197   31.1197   19.3531   0.0011
Residual (Error)   11    17.6880   1.6080
Total              12    48.8077

(The Significance F shows that the overall model is significant. There is only a 0.1% chance that the relationship is non-existent and that we falsely believe the model.)
            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   -1.1759        2.0635           -0.5699   0.5802    -5.7178     3.3659
Age         0.6120         0.1391           4.3992    0.0011    0.3058      0.9182
These coefficients are computed using a formula that guarantees that this is the best-fitting line. You do not have to know the formulas; they are available in every basic statistics book.

RESIDUAL OUTPUT

Observation   Predicted Shoe Size   Residuals   Squared Residuals
1             5.5565                -0.5565     0.3097
2             6.1685                -0.1685     0.0284
3             6.1685                -1.1685     1.3654
4             6.7806                 0.7194     0.5176
5             6.7806                -0.7806     0.6093
6             6.7806                 1.7194     2.9565
7             7.3926                 0.6074     0.3689
8             8.0046                 1.9954     3.9815
9             8.0046                -1.0046     1.0093
10            9.2287                -1.2287     1.5097
11            9.8407                 1.1593     1.3439
12            9.8407                -1.8407     3.3883
13            10.4528                0.5472     0.2995

Sum of Squared Residuals (Errors) = 17.6880, shown in the ANOVA table.
How and Why are the Sum of Squares shown in the ANOVA table calculated?
The basic idea behind regression is to see if there is a relationship between X and Y, and if so, how well does X help predict Y.
If there was no info on X (Age), but all we had was a sample of shoe sizes, our best estimate of shoe sizes would be the mean
shoe size of about 7.7, shown at the top. However, this estimate would have a lot of error, since actual sizes deviate quite a bit from
the mean. These deviations are computed and squared (to avoid + and - cancelling each other) and summed, to get a SST (Sum
of Squares Total) value of 48.807.
Now, when we consider the info provided by age, we can better estimate shoe size than simply using the mean size. We can now say
that shoe size depends on Age according to the equation Y= 0.612 X - 1.1759. Now, our new estimates are better, but they are still
not perfect. There are still errors (residuals), shown in the Excel output at the bottom. If we square each of these errors and add them,
we get the SSE (Sum of Squared Errors) value of 17.687.
This means that by using Age to do the regression, we reduced our error squares by 48.807-17.687, or by a value of 31.119,
shown in the ANOVA table as SSR (Sum of Squares Regression). SSR is thus the reduction in SST brought about by the regression.
In other words, the regression helped to explain away 31.119 out of the total of 48.807 of error. Thus, the proportion of variability in
Y that is explained by the regression is 31.119/48.807 = 0.6376, which is the R-Squared value shown at the top.
What are Degrees of Freedom?
Once the SS are computed, the Mean Squares are computed by dividing by the degrees of freedom. Normally, a mean is simply
the sum of n numbers divided by n. Here, however, when we find the mean, we must compensate for the fact that we are averaging
errors, and even though there are n numbers, not all of them contribute to the error.
For example, if there is only 1 data point, there is no chance (freedom) for any variation at all to occur. Hence, total degrees of
freedom are always n-1. Thus, if there are 2 data points, there is one degree of freedom for variation to occur.
Next, suppose there are 2 points of data. Even though they could be different values of Y, the process of using a variable X to do
a regression means that we draw the best line through them. Now no matter what the points are, we can always draw a straight
line perfectly through those points. Thus, there is no freedom for error to occur, since the variable X "used up" the single degree of
freedom that Y had. In general, the number of independent variables used (K) is the number of degrees of freedom that are used up
from the total available (n-1), leaving n-k-1 degrees available for error to occur. Thus the SS Error is divided by n-k-1 to find the mean
squared error, instead of dividing by n.
What is F-value? When is a model significant?
F-value is the ratio of MSR/MSE = 31.119/1.6079. This shows the ratio of the average error that is explained by the regression to the average
error that is still unexplained. Thus, the higher the F, the better the model, and the more confidence we have that the model that we
derived from sample data actually applies to the whole population, and is not just an aberration found in the sample.
In this case, the level of confidence is around 99.9%, reflected in the significance value of 0.00106 shown in the ANOVA table.
That value was computed by looking at standardized tables that consider the F-value and your sample size to make that determination.
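The walkthrough above (slope, intercept, SST, SSE, SSR, and R-Squared) can be reproduced outside Excel. A minimal Python sketch using the shoe-size data from the worksheet:

```python
# Shoe-size example from the slides: Age (X) vs. Shoe Size (Y)
ages = [11, 12, 12, 13, 13, 13, 14, 15, 15, 17, 18, 18, 19]
sizes = [5, 6, 5, 7.5, 6, 8.5, 8, 10, 7, 8, 11, 8, 11]
n = len(ages)

# Least-squares slope and intercept
mean_x = sum(ages) / n
mean_y = sum(sizes) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, sizes)) / \
    sum((x - mean_x) ** 2 for x in ages)
a = mean_y - b * mean_x

# Sums of squares, as in the ANOVA table
sst = sum((y - mean_y) ** 2 for y in sizes)                      # SS Total
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(ages, sizes))   # SS Residual
ssr = sst - sse                                                  # SS Regression
r_squared = ssr / sst

print(round(b, 4), round(a, 4))   # 0.612 -1.1759
print(round(sst, 3), round(sse, 3), round(ssr, 3), round(r_squared, 4))
# 48.808 17.688 31.12 0.6376
```

The printed values match the Excel SUMMARY OUTPUT and ANOVA table above.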
[Chart: "Shoe Sizes of Teens" - Shoe Size (Y) plotted against Age in Years]
Multiple-initial
Example of Multiple Regression: Can Shoe Size (Y) be predicted by
the independent variables X1 through X4?
Y (Shoe Size)   X1 (Age)   X2 (Weight)   X3 (Sex)   X4 (IQ Score)
5               11         75            0          100
6               12         85            1          80
5               12         88            0          50
7.5             13         135           1          120
6               13         80            0          115
8.5             13         180           0          106
8               14         140           0          96
10              15         200           0          88
7               15         110           0          78
8               17         120           0          65
11              18         150           1          101
8               18         125           0          105
11              19         165           1          130

(Sex: Female = 0, Male = 1)
SUMMARY OUTPUT

Regression Statistics
Multiple R:          0.9805
R Square:            0.9614
Adjusted R Square:   0.9422
Standard Error:      0.4850
Observations:        13

ANOVA
            df    SS        MS        F         Significance F
Regression  4     46.9260   11.7315   49.8765   0.0000107
Residual    8     1.8817    0.2352
Total       12    48.8077

            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   -1.7743        0.9298           -1.9083   0.0928    -3.9184     0.3698
Age         0.3463         0.0623           5.5556    0.0005    0.2025      0.4900
Weight      0.0304         0.0042           7.2675    0.0001    0.0207      0.0400
Sex         0.7919         0.3227           2.4540    0.0397    0.0478      1.5360
IQ Score    0.0040         0.0071           0.5545    0.5944    -0.0125     0.0204
IQ is not significantly related to Shoe Size (P-value = 0.59).
Multiple-revised
Multiple Regression with the insignificant variable (IQ) dropped
SUMMARY OUTPUT

Regression Statistics
Multiple R:          0.9798
R Square:            0.9600
Adjusted R Square:   0.9466
Standard Error:      0.4660
Observations:        13

ANOVA
            df    SS        MS        F         Significance F
Regression  3     46.8537   15.6179   71.9345   0.0000013
Residual    9     1.9540    0.2171
Total       12    48.8077

            Coefficients   Standard Error   t Stat    P-value   Lower 95%
Intercept   -1.5104        0.7674           -1.9682   0.0806    -3.2465
Age         0.3475         0.0598           5.8058    0.0003    0.2121
Weight      0.0310         0.0039           7.9690    0.0000    0.0222
Sex         0.8582         0.2879           2.9804    0.0154    0.2068
RESIDUAL OUTPUT

Observation   Predicted Shoe Size   Residuals
1             4.6340                 0.3660
2             6.1493                -0.1493
3             5.3840                -0.3840
4             8.0451                -0.5451
5             5.4837                 0.5163
6             8.5803                -0.0803
7             7.6891                 0.3109
8             9.8946                 0.1054
9             7.1076                -0.1076
10            8.1122                -0.1122
11            10.2468                0.7532
12            8.6145                -0.6145
13            11.0588               -0.0588
-
Causal Forecasting Models
Curve Fitting: Simple Linear Regression
One independent variable (X) is used to predict one dependent variable (Y): Y = a + bX
Given n observations (Xi, Yi), we can fit a line to the overall pattern of these data points. The Least Squares Method in statistics gives us the best a and b in the sense of minimizing Σ(Yi - a - bXi)².
(The regression formula is an optional learning objective.)
-
Curve Fitting: Simple Linear Regression - Find the regression line with Excel
Use functions:
a = INTERCEPT(Y range, X range)
b = SLOPE(Y range, X range)
Use Solver
Use Excel's Tools | Data Analysis | Regression

Curve Fitting: Multiple Regression
Two or more independent variables are used to predict the dependent variable:
Y = b0 + b1X1 + b2X2 + … + bpXp
Use Excel's Tools | Data Analysis | Regression
-
Time Series Forecasting Process
Look at the data (Scatter Plot)
Forecast using one or more techniques
Evaluate the technique and pick the best one.
Observations from the scatter plot / Techniques to try / Ways to evaluate:
Data is reasonably stationary (no trend or seasonality): try heuristics (averaging methods) - Naive, Moving Averages, Simple Exponential Smoothing; evaluate with MAD, MAPE, Standard Error, BIAS.
Data shows a consistent trend: try Regression - Linear (non-linear regressions are not covered in this course); evaluate with MAD, MAPE, Standard Error, BIAS, R-Squared.
Data shows both a trend and a seasonal pattern: try classical decomposition - find the Seasonal Index, then use regression analysis to find the trend component; evaluate with MAD, MAPE, Standard Error, BIAS, R-Squared.
-
Evaluation of Forecasting Model
BIAS - The arithmetic mean of the errors
n is the number of forecast errors
Excel: =AVERAGE(error range)
Mean Absolute Deviation - MAD
No direct Excel function to calculate MAD
-
Evaluation of Forecasting Model
Mean Square Error - MSE
Excel: =SUMSQ(error range)/COUNT(error range)
Standard error is the square root of MSE
Mean Absolute Percentage Error - MAPE
R² - only for curve-fitting models such as regression
In general, the lower the error measure (BIAS, MAD, MSE) or the higher the R², the better the forecasting model.
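These error measures can be computed directly from the forecast errors. A minimal sketch in Python; the actual and forecast numbers here are made up for illustration:

```python
# Hypothetical actuals and forecasts, for illustration only
actual   = [10, 12, 11, 13, 12]
forecast = [11, 11, 12, 12, 11]
errors = [a - f for a, f in zip(actual, forecast)]
n = len(errors)

bias = sum(errors) / n                           # mean error (sign matters)
mad  = sum(abs(e) for e in errors) / n           # mean absolute deviation
mse  = sum(e ** 2 for e in errors) / n           # mean squared error
std_err = mse ** 0.5                             # square root of MSE
mape = 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / n

print(round(bias, 2), mad, mse)   # 0.2 1.0 1.0
```

Note that BIAS can be near zero even when MAD is large, because positive and negative errors cancel; that is exactly why the absolute and squared measures exist.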
-
Stationary data forecasting
Naïve
I sold 10 units yesterday, so I think I will sell 10 units today.
n-period moving average
For the past n days, I sold 12 units on average. Therefore, I think I will sell 12 units today.
Exponential smoothing
I predicted at the beginning of yesterday that I would sell 10 units; at the end of yesterday, I found out I had in fact sold 8 units. So, I will adjust yesterday's forecast of 10 by adding the adjusted error (α * error). This compensates for yesterday's over- (or under-) forecast.
-
Naïve Model
The simplest time series forecasting model.
Idea: what happened last time (last year, last month, yesterday) will happen again this time.
Naïve Model:
Algebraic: Ft = Yt-1
Yt-1 : actual value in period t-1
Ft : forecast for period t
Spreadsheet: B3: = A2; Copy down
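The naïve rule Ft = Yt-1 is the spreadsheet "copy down" in code form; the demand values here are made up:

```python
# Hypothetical actual demand for periods 1-4
demand = [10, 12, 11, 13]

# Forecast for period t is the actual of period t-1 (no forecast for period 1)
naive_forecast = [None] + demand[:-1]
print(naive_forecast)   # [None, 10, 12, 11]
```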
-
Moving Average Model
Simple n-Period Moving Average
Issues of the MA Model:
The Naïve model is a special case of MA with n = 1
The idea is to reduce random variation, i.e., to smooth the data
All previous n observations are treated equally (equal weights)
Suitable for relatively stable time series with no trend or seasonal pattern
-
Smoothing Effect of MA Model
Longer-period moving averages (larger n) react to actual changes more slowly
-
Moving Average Model
Weighted n-Period Moving Average
Typically, the weights are decreasing: w1 > w2 > … > wn
Sum of the weights: Σwi = 1
Flexible weights reflect the relative importance of each previous observation in forecasting
Optimal weights can be found via Solver
-
Weighted MA: An Illustration
Month       Weight   Data
August      17%      130
September   33%      110
October     50%      90
November forecast:
FNov = (0.50)(90)+(0.33)(110)+(0.17)(130)
= 103.4
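A quick check of the November forecast from the illustration:

```python
# Weighted 3-period moving average: most recent month gets the largest weight
weights = [0.50, 0.33, 0.17]   # October, September, August
data    = [90, 110, 130]

f_nov = sum(w * y for w, y in zip(weights, data))
print(round(f_nov, 1))   # 103.4
```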
-
Exponential Smoothing
Concept is simple!
Make a forecast, any forecast
Compare it to the actual
Next forecast is: previous forecast plus an adjustment
The adjustment is a fraction of the previous forecast error
Essentially:
Not really a forecast as a function of time
Instead, a forecast as a function of the previous actual and forecasted values
-
Simple Exponential Smoothing
A special type of weighted moving average:
Includes all past observations
Uses a unique set of weights that weight recent observations much more heavily than very old observations:
-
Simple ES: The Model
New forecast = weighted sum of last period's actual value and last period's forecast:
Ft = α·Yt-1 + (1 - α)·Ft-1
α: smoothing constant (0 ≤ α ≤ 1)
Ft: forecast for period t
Ft-1: last period's forecast
Yt-1: last period's actual value
-
Simple Exponential Smoothing
Properties of Simple Exponential Smoothing:
Widely used and successful model
Requires very little data
Larger α: more responsive forecast; smaller α: smoother forecast (see Table 13.2)
The best α can be found by Solver
Suitable for relatively stable time series
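One smoothing step, using the numbers from the "Stationary data forecasting" slide (yesterday's forecast 10, actual 8); the value α = 0.2 is assumed here for illustration:

```python
alpha = 0.2                 # smoothing constant, assumed for illustration
f_prev, y_prev = 10, 8      # yesterday's forecast and yesterday's actual

# New forecast = weighted sum of last actual and last forecast
f_today = alpha * y_prev + (1 - alpha) * f_prev

# Equivalently: previous forecast plus a fraction of the previous error
f_today_alt = f_prev + alpha * (y_prev - f_prev)

print(round(f_today, 1))   # 9.6
```

The two forms are algebraically identical; the second one is the "adjust by α * error" phrasing used earlier in the slides.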
-
Time Series Components
Trend: persistent upward or downward pattern in a time series
Seasonal Variation: dependent on the time of year; each year shows the same pattern
Cyclical: up & down movement repeating over a long time frame; each year does not show the same pattern
Noise or random fluctuations: follow no specific pattern; short duration and non-repeating
-
Time Series Components
[Charts: Demand vs. Time panels illustrating Trend, Random movement, Cycle, Seasonal pattern, and Trend with seasonal pattern]
-
Trend Model
Curve fitting method used for time series data (also called a time series regression model)
Useful when the time series has a clear trend
Cannot capture seasonal patterns
Linear Trend Model: Yt = a + bt, where t is the time index for each period, t = 1, 2, 3, …
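The linear trend model is just the simple regression above with the time index t = 1, 2, 3, … as X. A sketch on made-up demand data:

```python
# Hypothetical demand with an upward trend
demand = [102, 108, 115, 119, 126, 131]
t = list(range(1, len(demand) + 1))     # time index 1..n
n = len(t)

# Least-squares fit of Yt = a + b*t
mean_t = sum(t) / n
mean_y = sum(demand) / n
b = sum((ti - mean_t) * (y - mean_y) for ti, y in zip(t, demand)) / \
    sum((ti - mean_t) ** 2 for ti in t)
a = mean_y - b * mean_t

# Forecast the next period by extending the trend line
next_period = a + b * (n + 1)           # t = 7
print(round(b, 2), round(next_period, 1))   # 5.8 137.1
```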
[Chart: time series data with a fitted linear trend line]
-
Pattern-based forecasting - Trend
Regression: recall the independent variable X, which is now a time variable (e.g., days, months, quarters, years).
Find a straight line that fits the data best.
y = Intercept + slope * x (= b0 + b1x)
Slope = change in y / change in x
[Chart: data points with the best-fit line and the intercept marked]
-
Pattern-based forecasting - Seasonal
Once data turn out to be seasonal, deseasonalize the data. The methods we have learned (heuristic methods and regression) are not suitable for data with pronounced seasonal fluctuations.
Make the forecast based on the deseasonalized data.
Reseasonalize the forecast. A good forecast should mimic reality; therefore, we need to give the seasonality back.
-
Pattern-based forecasting - Seasonal

Actual data → Deseasonalize → Deseasonalized data → Forecast → Reseasonalize

Example (SI + Regression)
-
Pattern-based forecasting - Seasonal
Deseasonalization
Deseasonalized data = Actual / SI
Reseasonalization
Reseasonalized forecast
= deseasonalized forecast * SI
-
Seasonal Index
What's an index? A ratio.
SI = the ratio between actual and average demand.
Suppose the SI for a quarter's demand is 1.20. What does that mean? Demand in that quarter runs about 20% above the average quarter.
Use it to forecast demand for next fall.
So, where did the 1.20 come from?!
-
Calculating Seasonal Indices
Quick and dirty method of calculating SI:
For each year, calculate average demand
Divide each demand by its yearly average
This creates a ratio, and hence a raw index
For each quarter, there will be as many raw indices as there are years
Average the raw indices for each of the quarters
The result will be four values, one SI per quarter
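The quick-and-dirty recipe above, sketched on two years of made-up quarterly demand:

```python
# Hypothetical quarterly demand (Q1..Q4), two years
years = [
    [80, 95, 120, 105],    # year 1
    [84, 101, 128, 111],   # year 2
]

raw = []
for y in years:
    avg = sum(y) / len(y)                  # that year's average demand
    raw.append([q / avg for q in y])       # raw index = demand / yearly average

# Average the raw indices across years -> one SI per quarter
si = [sum(r[q] for r in raw) / len(raw) for q in range(4)]
print([round(s, 3) for s in si])   # [0.796, 0.951, 1.204, 1.049]
```

Note the four SIs average to 1 (they sum to 4), since each year's raw indices are ratios to that year's own average.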
-
Classical decomposition
Start by calculating seasonal indices.
Then, deseasonalize the demand: divide actual demand values by their SI values
y' = y / SI
This results in transformed data (a new time series) with the seasonal effect removed.
Forecast:
Regression if the deseasonalized data shows a trend
Heuristic methods if the deseasonalized data is stationary
Reseasonalize with SI.
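Putting the decomposition steps together. The SIs, the demand values, and the trend line here are all assumed for illustration, not taken from the course data:

```python
# Assumed quarterly seasonal indices and one year of actual demand
si     = [0.80, 0.95, 1.20, 1.05]
actual = [85, 100, 130, 110]

# 1. Deseasonalize: y' = y / SI
deseason = [y / s for y, s in zip(actual, si)]

# 2. Forecast the deseasonalized series; here a hypothetical fitted trend line
def trend(t):
    return 100 + 2 * t

next_q = len(actual) + 1                  # t = 5, the next quarter (a Q1)
deseason_forecast = trend(next_q)         # 110

# 3. Reseasonalize: multiply back by that quarter's SI
forecast = deseason_forecast * si[0]      # next quarter is a Q1
print(forecast)   # 88.0
```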
-
Causal or Time series?
What are the differences?
Which one to use?
-
Can you
describe the general forecasting process?
compare and contrast trend, seasonality, and cyclicality?
describe the forecasting method when data is stationary?
describe the forecasting method when data shows a trend?
describe the forecasting method when data shows seasonality?
Formulas:

BIAS = Σ(Actual - Forecast) / n = Σ(Error) / n

MAD = Σ|Actual - Forecast| / n = Σ|Error| / n

MSE = Σ(Actual - Forecast)² / n = Σ(Error)² / n

MAPE = [Σ(|Actual - Forecast| / Actual) / n] * 100%

Least squares coefficients:
b = (n ΣXiYi - ΣXi ΣYi) / (n ΣXi² - (ΣXi)²)
a = (ΣYi - b ΣXi) / n

Simple n-period moving average:
Ft = (Yt-1 + Yt-2 + … + Yt-n) / n = (sum of actual values in previous n periods) / n

Weighted n-period moving average:
Ft = w1·Yt-1 + w2·Yt-2 + … + wn·Yt-n

Simple exponential smoothing weights (0 < α < 1):
α, α(1-α), α(1-α)², α(1-α)³, …