Time series analysis and forecasting solutions
CHAPTER 2
A REVIEW OF BASIC STATISTICAL CONCEPTS
ANSWERS TO ODD NUMBERED PROBLEMS
1. Descriptive Statistics
Variable   N    Mean   Median  Tr Mean  StDev  SE Mean
C1        28   21.32   17.00    20.69   13.37     2.53

Variable   Min    Max     Q1     Q3
C1        5.00  54.00  11.25  28.75
a. X̄ = 21.32
b. S = 13.37
c. S² = 178.76
d. If the policy is successful, smaller orders will be eliminated and the mean will increase.
e. If the change causes all customers to consolidate a number of small orders into large orders, the standard deviation will probably decrease. Otherwise, it is very difficult to tell how the standard deviation will be affected.
f. The best forecast over the long-term is the mean of 21.32.
3. a. Point estimate:
b. 1 − α = .95, Z = 1.96, n = 30: (5.85%, 15.67%)
c. df = 30 − 1 = 29, t = 2.045: (5.64%, 15.88%)
d. We see that the 95% confidence intervals in b and c are not much different. This explains why a sample of size n = 30 is often taken as the cutoff between large and small samples.
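The comparison in parts b and c can be sketched numerically. The sample mean and standard deviation below are hypothetical stand-ins (the raw data are not reproduced here); only n = 30 and the two critical values come from the solution.

```python
import math

# Hypothetical sample mean and standard deviation; only n = 30 and the
# critical values (z = 1.96, and t = 2.045 with df = 29) come from the solution.
def conf_interval(mean, sd, n, critical):
    half_width = critical * sd / math.sqrt(n)
    return (mean - half_width, mean + half_width)

n = 30
mean, sd = 10.76, 13.7                         # assumed values for illustration
z_interval = conf_interval(mean, sd, n, 1.96)  # large-sample (normal) interval
t_interval = conf_interval(mean, sd, n, 2.045) # small-sample (t) interval
print(z_interval, t_interval)
```

The t interval is only slightly wider than the z interval, which is the point of part d: n = 30 is commonly taken as the large-sample cutoff.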
5. H0: μ = 12.1   n = 100   α = .05
H1: μ > 12.1   S = 1.7   X̄ = 13.5

Reject H0 if Z > 1.645

Z = (13.5 − 12.1)/(1.7/√100) = 8.235
Reject H0 since the computed Z (8.235) is greater than the critical Z (1.645). The mean has increased.
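The z statistic in Problem 5 can be checked with a few lines of Python; all numbers are taken from the solution above.

```python
import math

# One-sample z test from Problem 5: H0: mu = 12.1 vs H1: mu > 12.1.
mu0, xbar, s, n = 12.1, 13.5, 1.7, 100
z = (xbar - mu0) / (s / math.sqrt(n))
reject = z > 1.645          # one-sided critical value at alpha = .05
print(round(z, 3), reject)
```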
7. n = 60, two-sided test, α = .05, critical value: |Z| = 1.96

Test statistic: Z = −2.67

Since |−2.67| = 2.67 > 1.96, reject H0 at the 5% level. The mean satisfaction rating is different from 5.9.

p-value: P(Z < −2.67 or Z > 2.67) = 2·P(Z > 2.67) = 2(.0038) = .0076, very strong evidence against H0.
9. H0: μ = 700   H1: μ ≠ 700   n = 50   S = 50   X̄ = 715   α = .05

Reject H0 if Z < −1.96 or Z > 1.96

Z = (715 − 700)/(50/√50) = 2.12

Since the calculated Z is greater than the critical Z (2.12 > 1.96), reject the null hypothesis. The forecast does not appear to be reasonable.

p-value: P(Z < −2.12 or Z > 2.12) = 2·P(Z > 2.12) = 2(.017) = .034, strong evidence against H0.
11. a.
b. Positive
c. ΣY = 6058, ΣY² = 4,799,724, ΣX = 59, ΣX² = 513, ΣXY = 48,665, r = .938
13. This is a good population for showing how random samples are taken. If three-digit random numbers are generated from Minitab as demonstrated in Problem 10, the selected items for the sample can be easily found. In this population the parameter value is 0.06, so most students will not reject a null hypothesis stating this value. The few students who erroneously reject H0 demonstrate Type I error.
15. n = 175. Point estimate: X̄. 98% confidence interval: 1 − α = .98, Z = 2.33
(43.4, 47.0)
Hypothesis test: two-sided, α = .02, critical value: |Z| = 2.33

Test statistic: |Z| = 1.54

Since |Z| = 1.54 < 2.33, do not reject H0 at the 2% level.

As expected, the results of the hypothesis test are consistent with the confidence interval for μ; μ = 44 is not ruled out by either procedure.
CHAPTER 3
EXPLORING DATA PATTERNS AND CHOOSING A FORECASTING TECHNIQUE
ANSWERS TO ODD NUMBERED PROBLEMS
1. Qualitative forecasting techniques rely on human judgment and intuition. Quantitative forecasting techniques rely more on manipulation of past historical data.
3. The secular trend of a time series is the long-term component that represents the growth or decline in the series over an extended period of time. The cyclical component is the wave-like fluctuation around the trend. The seasonal component is a pattern of change that repeats itself year after year. The irregular component measures the variability of the time series after the other components have been removed.
5. The autocorrelation coefficient measures the correlation between a variable, lagged one or more periods, and itself.
7. a. nonstationary series b. stationary series c. nonstationary series d. stationary series
9. Naive methods, simple averaging methods, moving averages, simple exponential smoothing, and Box-Jenkins methods. Examples are: the number of breakdowns per week on an assembly line having a uniform production rate; the unit sales of a product or service in the maturation stage of its life cycle; and the number of sales resulting from a constant level of effort.
11. Classical decomposition, Census II, Winters' exponential smoothing, time series multiple regression, and Box-Jenkins methods. Examples are: electrical consumption, summer/winter activities (sports like skiing), clothing, agricultural growing seasons, and retail sales influenced by holidays, three-day weekends, and school calendars.
13.
Year   Value   Change
1985   2,413      *
1986   2,407     -6
1987   2,403     -4
1988   2,396     -7
1989   2,403      7
1990   2,448     45
1991   2,371    -77
1992   2,362     -9
1993   2,334    -28
1994   2,362     28
1995   2,336    -26
1996   2,344      8
1997   2,384     40
1998   2,244   -140
Yes! The original series has a decreasing trend.
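A short script can verify the period-to-period changes tabulated in Problem 13. The 1990 value is taken as 2,448, which is consistent with the printed changes of +45 and −77.

```python
# Period-to-period changes for the Problem 13 series. The 1990 value is
# taken as 2,448, consistent with the printed changes +45 and -77.
values = [2413, 2407, 2403, 2396, 2403, 2448, 2371,
          2362, 2334, 2362, 2336, 2344, 2384, 2244]
changes = [b - a for a, b in zip(values, values[1:])]
print(changes)
```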
15. a. MPE b. MAPE c. MSE
17. a. Ȳ = 4465.7   Σ(Yt−1 − Ȳ) = 2396.1   Σ(Yt − Ȳ)² = 55,912,891
Σ(Yt − Ȳ)(Yt−1 − Ȳ) = 50,056,722   Σ(Yt − Ȳ)(Yt−2 − Ȳ) = 44,081,598

r1 = 50,056,722/55,912,891 = .895

H0: ρ1 = 0   H1: ρ1 ≠ 0
Reject H0 if t < −2.069 or t > 2.069

SE(r1) = √(1/n) = √(1/24) = .204

t = r1/SE(r1) = .895/.204 = 4.39

Since the computed t (4.39) is greater than the critical t (2.069), reject the null hypothesis: ρ1 ≠ 0.

H0: ρ2 = 0   H1: ρ2 ≠ 0
Reject H0 if t < −2.069 or t > 2.069

r2 = 44,081,598/55,912,891 = .788

SE(r2) = √((1 + 2r1²)/n) = √((1 + 2(.895)²)/24) = .33

t = r2/SE(r2) = .788/.33 = 2.39

Since the computed t (2.39) is greater than the critical t (2.069), reject the null hypothesis: ρ2 ≠ 0.
b. The data are nonstationary.
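The lag-k autocorrelation and its t statistic used in Problem 17 can be sketched as follows. The series below is an illustrative trending sequence (the loan data themselves are not reproduced here); the formulas match the solution: r_k = Σ(Yt − Ȳ)(Yt−k − Ȳ)/Σ(Yt − Ȳ)², with SE(r1) = √(1/n).

```python
import math

# Lag-k autocorrelation r_k and the t statistic for H0: rho_1 = 0, as in
# Problem 17. The series here is an illustrative trending sequence, not
# the loan data.
def autocorr(y, k):
    n = len(y)
    ybar = sum(y) / n
    num = sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, n))
    den = sum((v - ybar) ** 2 for v in y)
    return num / den

y = [float(v) for v in range(1, 25)]   # n = 24, strongly trending
r1 = autocorr(y, 1)
t1 = r1 / math.sqrt(1 / len(y))        # SE(r1) = sqrt(1/n)
print(round(r1, 3), round(t1, 2))
```

As with the loan series, a strong trend produces a large r1 and a significant t statistic.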
Autocorrelation Function for Loans

Lag  Corr    T     LBQ
 1   0.90   4.39   21.74
 2   0.79   2.39   39.37
 3   0.67   1.68   52.85
 4   0.56   1.25   62.57
 5   0.43   0.92   68.73
 6   0.31   0.63   72.04
19. Figure 3.18 - The data are nonstationary. (Trended data) Figure 3.19 - The data are random. Figure 3.20 - The data are seasonal. (Monthly data) Figure 3.21 - The data are stationary and have some pattern that could be modeled.
21. a. Time series plot follows
b. The sales time series appears to vary about a fixed level so it is stationary.
c. The sample autocorrelation function for the sales series follows
The sample autocorrelations die out rapidly. This behavior is consistent with a stationary series. Note that the sales data are not random: sales in adjacent weeks tend to be positively correlated.
Since, in this case, the residuals differ from the original observations by a constant, the residual autocorrelations show the same pattern as the autocorrelations of the original series.
CHAPTER 4
MOVING AVERAGES AND SMOOTHING METHODS
ANSWERS TO ODD NUMBERED PROBLEMS
1. Exponential smoothing
3. Moving average
5. Winters' three-parameter linear and seasonal model
7.
Price   AVER1    FITS1    RESI1
19.39      *        *        *
18.96      *        *        *
18.20   18.8500     *        *
17.89   18.3500  18.8500  -0.96000
18.43   18.1733  18.3500   0.08000
19.98   18.7667  18.1733   1.80667
19.51   19.3067  18.7667   0.74333
20.63   20.0400  19.3067   1.32333
19.78   19.9733  20.0400  -0.26000
21.25   20.5533  19.9733   1.27667
21.18   20.7367  20.5533   0.62667
22.14   21.5233  20.7367   1.40333

Accuracy Measures
MAPE: 4.63186   MAD: 0.94222   MSD: 1.17281
The naïve approach is better.
a. 221.2 is forecast for period 9
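The AVER1/FITS1/RESI1 columns in the Problem 7 output can be reproduced directly: AVER1 is the trailing 3-period mean, and FITS1 is that average shifted forward one period to serve as the one-step-ahead forecast.

```python
# 3-period moving average as in the Problem 7 output: AVER1 is the
# trailing mean, FITS1 is AVER1 shifted one period, RESI1 the residual.
prices = [19.39, 18.96, 18.20, 17.89, 18.43, 19.98,
          19.51, 20.63, 19.78, 21.25, 21.18, 22.14]
k = 3
aver = [None] * (k - 1) + [round(sum(prices[i - k + 1:i + 1]) / k, 4)
                           for i in range(k - 1, len(prices))]
fits = [None] + aver[:-1]              # one-step-ahead forecast
resi = [None if f is None else round(y - f, 5)
        for y, f in zip(prices, fits)]
print(aver[2], fits[3], resi[3])
```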
9. 3-month moving-average
Period  Yield   AVER1    FITS1    RESI1
   1     9.29      *        *        *
   2     9.99      *        *        *
   3    10.16    9.8133     *        *
   4    10.25   10.1333   9.8133   0.436667
   5    10.61   10.3400  10.1333   0.476667
   6    11.07   10.6433  10.3400   0.730000
   7    11.52   11.0667  10.6433   0.876667
   8    11.09   11.2267  11.0667   0.023333
   9    10.80   11.1367  11.2267  -0.426667
  10    10.50   10.7967  11.1367  -0.636667
  11    10.86   10.7200  10.7967   0.063333
  12     9.97   10.4433  10.7200  -0.750000

Accuracy Measures
MAPE: 4.58749   MAD: 0.49111   MSD: 0.31931   MPE: .6904
Row  Period  Forecast   Lower    Upper
 1     13     10.4433   9.33579  11.5509
[Moving average (length 3) fit for Yield: plot of actual, predicted, and forecast values. MAPE: 4.58749, MAD: 0.49111, MSD: 0.31931]
b. 5-month moving-average
Period  Yield   MA      Predict  Error
   1     9.29     *        *       *
   2     9.99     *        *       *
   3    10.16     *        *       *
   4    10.25     *        *       *
   5    10.61   10.060     *       *
   6    11.07   10.416   10.060   1.010
   7    11.52   10.722   10.416   1.104
   8    11.09   10.908   10.722   0.368
   9    10.80   11.018   10.908  -0.108
  10    10.50   10.996   11.018  -0.518
  11    10.86   10.954   10.996  -0.136
  12     9.97   10.644   10.954  -0.984
Row Period Forecast Lower Upper 1 13 10.644 9.23041 12.0576
Accuracy Measures
MAPE: 5.58295   MAD: 0.60400   MSD: 0.52015   MPE: .7100
[Moving average (length 5) fit for Yield: plot of actual, predicted, and forecast values. MAPE: 5.58295, MAD: 0.60400, MSD: 0.52015]
g. Use 3-month moving average forecast: 10.4433
11.
Time  Demand  Smooth   Predict  Error
  1    205    205.000  205.000    0.0000
  2    251    228.000  205.000   46.0000
  3    304    266.000  228.000   76.0000
  4    284    275.000  266.000   18.0000
  5    352    313.500  275.000   77.0000
  6    300    306.750  313.500  -13.5000
  7    241    273.875  306.750  -65.7500
  8    284    278.938  273.875   10.1250
  9    312    295.469  278.938   33.0625
 10    289    292.234  295.469   -6.4688
 11    385    338.617  292.234   92.7656
 12    256    297.309  338.617  -82.6172
Accuracy Measures MAPE: 14.67 MAD: 43.44 MSD: 2943.24
Row  Period  Forecast  Lower    Upper
 1     13     297.309  190.879  403.738
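The smoothed values in the Problem 11 output follow the usual recursion S_t = αY_t + (1 − α)S_{t−1} with α = .5, initialized at the first observation; the forecast for period 13 is the final smoothed value.

```python
# Single exponential smoothing with alpha = 0.5, matching the Problem 11
# output; the forecast for period 13 is the last smoothed value.
demand = [205, 251, 304, 284, 352, 300, 241, 284, 312, 289, 385, 256]
alpha = 0.5

smooth = [float(demand[0])]            # initialize at the first observation
for y in demand[1:]:
    smooth.append(alpha * y + (1 - alpha) * smooth[-1])

forecast = smooth[-1]
print(round(forecast, 3))
```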
[Single exponential smoothing (Alpha: 0.500) fit for Demand: plot of actual, predicted, and forecast values. MAPE: 14.67, MAD: 43.44, MSD: 2943.24]
13. a. α = .4
Accuracy Measures MAPE: 14.05 MAD: 24.02 MSD: 1174.50
Row Period Forecast Lower Upper
1 57 326.367 267.513 385.221
b. α = .6
Accuracy Measures MAPE: 14.68 MAD: 24.56 MSD: 1080.21
Row Period Forecast Lower Upper 1 57 334.07 273.889 394.251
c. A smoothing constant of approximately .6 appears to be best.
d. No! When the residual autocorrelations shown above are examined, some of them were found to be significant.
15. The autocorrelation function shows that the data are seasonal with a slight trend. Therefore, the Winters’ model is used to forecast revenues.
Smoothing Constants - Alpha (level): 0.8 Beta (trend): 0.1 Gamma (seasonal): 0.1
An examination of the autocorrelation coefficients for the residuals of this Winters’ model shown below indicates that none of them are significantly different from zero.
CHAPTER 5
TIME SERIES AND THEIR COMPONENTS
ANSWERS TO ODD NUMBERED PROBLEMS
1. The purpose of decomposing a time series variable is to observe its various elements in isolation. By doing so, insights into the causes of the variability of the series are
frequently gained. A second important reason for isolating time series components is to facilitate the forecasting process.
3. The basic forces that affect and help explain the trend-cycle of a series are population growth, price inflation, technological change, and productivity increases.
5. Weather and calendar events such as holidays affect the seasonal component.
7. a.
Year    Y       T        C
1980   11424   11105.2  102.871
1981   12811   12900.6   99.306
1982   14566   14695.9   99.116
1983   16542   16491.3  100.307
1984   19670   18286.7  107.565
1985   20770   20082.1  103.426
1986   22585   21877.4  103.234
1987   23904   23672.8  100.977
1988   25686   25468.8  100.855
1989   26891   27263.6   98.633
1990   29073   29059.0  100.048
1991   28189   30854.3   91.362
1992   30450   32649.7   93.263
1993   31698   34445.1   92.025
1994   35435   36240.5   97.777
1995   37828   38035.8   99.454
1996   42484   39831.2  106.660
1997   44580   41626.6  107.095
b. T = 9309.81 + 1795.38X

c. T = 9309.81 + 1795.38(19) = 43,422.03

d. If there is a cyclical effect, it is very slight.
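The T and C columns in part a, and the forecast in part c, can be reproduced from the fitted trend: T = 9309.81 + 1795.38X with X = 1 for 1980, and C = 100·Y/T.

```python
# Trend and cyclical index from Problem 7: T = 9309.81 + 1795.38*X
# (X = 1 for 1980) and C = 100*Y/T.
def trend(x):
    return 9309.81 + 1795.38 * x

y_1980 = 11424
c_1980 = 100 * y_1980 / trend(1)
print(round(trend(1), 1), round(c_1980, 3), round(trend(19), 2))
```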
9. Y = TS = 850(1.12) = $952
11. All of the statements are correct except d.
13. a. The regression equation and seasonal indexes are shown below. The trend is the most important but both trend and seasonal should be used to forecast.
Trend Line Equation = 2241.77 + 25.5111X
Seasonal Indices

Period  Index
  1     0.96392
  2     1.02559
  3     1.00323
  4     1.00726
Accuracy of Model MAPE: 3.2 MAD: 87.3 MSD: 11114.7
b. Forecasts
Row  Period  Forecast
 1     43    3349.53
 2     44    3388.67
c. The forecast for the third quarter is very accurate (3,349.5 versus 3,340). The forecast for the fourth quarter is high compared to Value Line (3,388.7 versus 3,300). Computer-generated graphs for this problem follow:
[Decomposition Fit for Sales: plot of actual, predicted, and forecast values. MAPE: 3.2, MAD: 87.3, MSD: 11114.7]
15. a. Additive Decomposition of Ln(Sales)
Data: LnCavSales   Length: 77   NMissing: 0

Trend Line Equation
Yt = 4.61322 + 2.19E-02*t
Seasonal Indices
Period  Index
   1     0.33461
   2    -0.01814
   3    -0.40249
   4    -0.63699
   5    -0.71401
   6    -0.57058
   7    -0.27300
   8    -0.00120
   9     0.46996
  10     0.72291
  11     0.74671
  12     0.34220
b. Pronounced trend and seasonal components. Would use both to forecast.
c. Forecasts
Date       Period  Forecast (Ln(Sales))  Forecast (Sales)
Jun. 2000    78         5.74866               314
Jul. 2000    79         6.06811               432
Aug. 2000    80         6.36178               579
Sep. 2000    81         6.85482               948
Oct. 2000    82         7.12964              1248
Nov. 2000    83         7.17532              1307
Dec. 2000    84         6.79267               891
d. See last column of table in part c.

e. Forecasts of sales developed from the additive decomposition are higher (for all months, June 2000 through December 2000) than those developed from the multiplicative decomposition. Forecasts from the multiplicative decomposition appear to be a little more consistent with the recent behavior of the Cavanaugh sales time series.
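The part c forecasts are assembled by adding the fitted trend value and the month's seasonal index on the log scale and then exponentiating. Because the printed slope (2.19E-02) is rounded, the result matches the printed forecast only approximately.

```python
import math

# Rebuild the June 2000 forecast (period t = 78) from the additive
# decomposition of Ln(Sales): fitted trend plus the June seasonal index
# (seasonal period 6), then exponentiate. The printed slope 2.19E-02 is
# rounded, so the match with the printed 5.74866 is approximate.
intercept, slope = 4.61322, 2.19e-02
june_index = -0.57058

t = 78
ln_forecast = intercept + slope * t + june_index
sales_forecast = math.exp(ln_forecast)
print(round(ln_forecast, 5), round(sales_forecast))
```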
17. a.
Variation appears to be increasing with the level of the series. Multiplicative decomposition may be appropriate, or additive decomposition with the logarithms of demand.
b. Neither a multiplicative decomposition nor an additive decomposition with a linear trend works well for this series. This time series is best modeled with other methods. The multiplicative decomposition is pictured below.
c. Seasonal Indices (Multiplicative Decomposition for Demand)

Period  Index      Period  Index     Period  Index
  1     0.951313     5     1.00495     9     1.04368
  2     0.955088     6     1.00093    10     0.98570
  3     0.966953     7     1.02748    11     0.99741
  4     0.988927     8     1.07089    12     1.00669
Demand tends to be relatively high in the summer months.
d. Forecasts derived from a multiplicative decomposition of demand (see plot below).
Date  Period  Forecast
Oct.    130   172.343
Nov.    131   175.773
Dec.    132   178.803
19. a. JAN: 600/1.20 = 500

500(1.37) = 685 people estimated for FEB
b. T = 140 + 5t; for JAN 2003, t = 72: T = 140 + 5(72) = 500

JAN = (140 + 5(72))(1.20) = (500)(1.20) = 600
FEB = (140 + 5(73))(1.37) = (505)(1.37) = 692
MAR = (140 + 5(74))(1.00) = (510)(1.00) = 510
APR = (140 + 5(75))(0.33) = (515)(0.33) = 170
MAY = (140 + 5(76))(0.47) = (520)(0.47) = 244
JUN = (140 + 5(77))(1.25) = (525)(1.25) = 656
JUL = (140 + 5(78))(1.53) = (530)(1.53) = 811
AUG = (140 + 5(79))(1.51) = (535)(1.51) = 808
SEP = (140 + 5(80))(0.95) = (540)(0.95) = 513
OCT = (140 + 5(81))(0.60) = (545)(0.60) = 327
NOV = (140 + 5(82))(0.82) = (550)(0.82) = 451
DEC = (140 + 5(83))(0.97) = (555)(0.97) = 538
c. 5
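The part b forecasts are each a trend value times the month's seasonal index (t = 72 for January 2003); a quick check:

```python
# Part b forecasts: trend value (140 + 5t, t = 72 for JAN 2003) times
# the month's seasonal index.
def trend(t):
    return 140 + 5 * t

indices = {"JAN": 1.20, "FEB": 1.37, "MAR": 1.00, "APR": 0.33,
           "MAY": 0.47, "JUN": 1.25, "JUL": 1.53, "AUG": 1.51,
           "SEP": 0.95, "OCT": 0.60, "NOV": 0.82, "DEC": 0.97}

forecasts = {month: round(trend(72 + i) * s)
             for i, (month, s) in enumerate(indices.items())}
print(forecasts)
```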
23. 1289.73(2.847) = 3,671.86
CHAPTER 6
REGRESSION ANALYSIS
ANSWERS TO ODD NUMBERED PROBLEMS
1. Option b is inconsistent because the regression coefficient and the correlation coefficient must have the same sign.
3. Correlation of Sales and Expend. = 0.848
The regression equation is
Sales = 828 + 10.8 Expend.

Predictor  Coef    StDev  T     P
Constant   828.1   136.1  6.08  0.000
Expend.    10.787  2.384  4.52  0.000
S = 67.19 R-Sq = 71.9% R-Sq(adj) = 68.4%
Analysis of Variance
Source      DF  SS      MS     F      P
Regression   1  92432   92432  20.47  0.000
Error        8  36121    4515
Total        9  128552
a. Yes, because r = .848 and t = 4.52.
b. Y = 828 + 10.8X
c. Y = 828 + 10.8(50) = $1368
d. 72% since r2 = .719
e. Unexplained sum of squares = 36,121. Divide by df = (n − 2) = 8 to get 4515.

√4515 = 67.19 = sy.x

f. Total sum of squares is 128,552. Divide by df = (n − 1) = 9 to get the variance, 14,283.6. The square root is Y's standard deviation, sy = 119.5.
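Parts e and f amount to converting sums of squares into standard deviations, which is easy to verify:

```python
import math

# Parts e and f: standard error of estimate from the error sum of
# squares, and Y's standard deviation from the total sum of squares.
sse, sst, n = 36121, 128552, 10

s_yx = math.sqrt(sse / (n - 2))    # = sqrt(4515.125)
s_y = math.sqrt(sst / (n - 1))     # = sqrt(14283.6)
print(round(s_yx, 2), round(s_y, 1))
```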
5. The regression equation is
Cost = 208 + 70.9 Age
Predictor  Coef    StDev  T     P
Constant   208.20  75.00  2.78  0.027
Age        70.918  9.934  7.14  0.000

S = 111.6   R-Sq = 87.9%   R-Sq(adj) = 86.2%

Analysis of Variance

Source      DF  SS      MS      F      P
Regression   1  634820  634820  50.96  0.000
Error        7   87197   12457
Total        8  722017
a.
[Scatter plot of Cost versus Age]
b. Positive
c. ΣY = 6058, ΣY² = 4,799,724, ΣX = 59, ΣX² = 513, ΣXY = 48,665, r = .938
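The part c correlation can be recomputed from the summary sums. The sample size n = 9 is inferred from the ANOVA table above (Total df = 8).

```python
import math

# Part c: correlation from summary sums; n = 9 is inferred from the
# ANOVA table (Total df = 8).
n = 9
sum_y, sum_y2 = 6058, 4_799_724
sum_x, sum_x2 = 59, 513
sum_xy = 48_665

num = n * sum_xy - sum_x * sum_y
den = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = num / den
print(round(r, 3))
```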
d. Y = 208.2033 + 70.9181X
e. H0: β1 = 0   H1: β1 ≠ 0
Reject H0 if t < -2.365 or t > 2.365
Reject the null hypothesis. Age and maintenance cost are linearly related in the population.
f. Y = 208.2033 + 70.9181(5) = 562.7938 or $562.80
7. The regression equation is
Orders = 15.8 + 1.11 Catalogs

Predictor  Coef    StDev   T     P
Constant   15.846  3.092   5.13  0.000
Catalogs   1.1132  0.3596  3.10  0.011
S = 5.757 R-Sq = 48.9% R-Sq(adj) = 43.8%
a. r = .7 and t = 3.1 so a significant relationship exists between Y and X.
b. Y = 15.846 + 1.1132X
c. sy.x = 5.757
d. Analysis of Variance

Source      DF  SS      MS      F     P
Regression   1  317.53  317.53  9.58  0.011
Error       10  331.38   33.14
Total       11  648.92
e. 49% since r2 = .489
f. H0: β1 = 0   H1: β1 ≠ 0
Reject H0 if t < -3.169 or t > 3.169
Fail to reject the null hypothesis. Orders received and catalogs distributed are not linearly related in the population.
g. H0: β1 = 0   H1: β1 ≠ 0
Reject H0 if F > 10.04
Yes and Yes
h. Ŷ = 15.846 + 1.1132(10) = 26.98, or about 26,978 orders
9. a. The two firms seem to be using very similar rationale since r = .959.
b. If ABC bids 101, the prediction for Y becomes COMP = −3.257 + 1.03435(101) = 101.212, the point estimate. For an interval estimate using 95% confidence the appropriate standard error is sy.x, since only the variability of the data points around the (population) regression line need be considered: 101.205 ± 1.96(.7431) = 101.205 ± 1.456, or 99.749 to 102.661.

c. If the data constitute a sample, the point estimate for Y would be the same, but the interval estimate would use the standard error of the forecast instead of sy.x, since the variability of sample regression lines around the population regression line would have to be considered along with the scatter of sample Y values around the calculated line. The probability of winning the bid would then involve the same calculation as in part b, except that sf would be used:

Z = (101 − 101.212)/.768 = −.276, where sf = .768, giving an area of .1103. (.5 − .1103) = .3897.
11. The regression equation is
Permits = 2217 - 145 Rate

Predictor  Coef     StDev  T      P
Constant   2217.4   316.2   7.01  0.000
Rate       -144.95   27.96  -5.18  0.000

S = 144.3   R-Sq = 79.3%   R-Sq(adj) = 76.4%

Analysis of Variance

Source      DF  SS      MS      F      P
Regression   1  559607  559607  26.88  0.000
Error        7  145753   20822
Total        8  705360
a.
b. ΣY = 5375, ΣY² = 3,915,429, ΣX = 100.6, ΣX² = 1151.12, ΣXY = 56,219.8, r = −0.8907
c. Reject H0 if t < −2.365 or t > 2.365.

Computed t = −5.1842. Reject the null hypothesis: interest rates and building permits are linearly related in the population.
d. 44.9474
e. r2 = .7934
f. Using knowledge of the linear relationship between interest rates and building permits (r = -0.8907) we can explain 79.3% of the building permit variable variance.
13. The regression equation is
Defects = -17.7 + 0.355 Size

Predictor  Coef     StDev    T      P
Constant   -17.731  4.626   -3.83   0.003
Size       0.35495  0.02332  15.22  0.000

S = 7.863   R-Sq = 95.5%   R-Sq(adj) = 95.1%

Analysis of Variance
Source      DF  SS     MS     F       P
Regression   1  14331  14331  231.77  0.000
Error       11    680     62
Total       12  15011
a.
b. Defects = - 17.7 + 0.355 Size
c. The slope coefficient, .355, is significantly different from zero.
d.
Examination of the residuals shows a curvilinear pattern.
e. The best model transforms the predictor variable and uses X2 as its independent variable.
The regression equation is
Defects = 4.70 + 0.00101 Sizesqr

Predictor  Coef        StDev       T      P
Constant   4.6973      0.9997      4.70   0.000
Sizesqr    0.00100793  0.00001930  52.22  0.000
S = 2.341 R-Sq = 99.6% R-Sq(adj) = 99.6%
Analysis of Variance
Source      DF  SS     MS     F        P
Regression   1  14951  14951  2727.00  0.000
Error       11     60      5
Total       12  15011
Unusual Observations
Obs  Sizesqr  Defects  Fit     StDev Fit  Residual  St Resid
  3    5625    6.000   10.367    0.920     -4.367    -2.03R
f. The slope coefficient, .00101, is significantly different from zero.
g.
Examination of the residuals shows a random pattern.
h. R denotes an observation with a large standardized residual
Fit     StDev Fit  95.0% CI            95.0% PI
95.411    1.173    (92.828, 97.994)    (89.645, 101.177)
i. Defects = 4.70 + 0.00101 Sizesqr
b. The regression equation is: Market = 60.7 + 0.414 Assessed
c. About 38% of the variation in market prices is explained by assessed values (the predictor variable). There is a considerable amount of unexplained variation.
15. a. The regression equation is: OpExpens = 18.9 + 1.30 PlayCosts

b. About 75% of the variation in operating expenses is explained by player costs.

c. p-value = .000 < .10. The regression is clearly significant at the 10% level.

d. The coefficient on player costs is 1.3. Is a coefficient of 1 reasonable? The hypothesis β1 = 1 is not supported by the data. It appears that operating expenses have a fixed-cost component, represented by the intercept, and then run about 1.3 times player costs.

e. The resulting interval is (47.6, 69.6).
f. Unusual Observations

Obs  PlayCost  OpExpens  Fit    SE Fit  Residual  St Resid
  7    18.0     60.00    42.31   1.64    17.69     3.45R

R denotes an observation with a large standardized residual. Team 7 has unusually low player costs relative to operating expenses.