doe hw 11-26_copy (1)
8/10/2019 DOE HW 11-26_Copy (1)
Question 10.15
A university career center conducted a study to determine whether there is an association
between starting salaries Y (in thousands of dollars) and grade point average X1,
age upon completion X2, and gender X3 for students in the school of engineering.
The career center obtained the following sample data:
Y (Starting Salary)   X1 (GPA)   X2 (Age)   X3 (Gender)
50.5                  2.95       22         F
52                    3.2        23         M
53.1                  3.4        22         M
54.4                  3.6        23         M
53.2                  3.5        24         M
47                    2.85       24         F
50                    3.1        25         F
50.8                  3.2        26         F
47.7                  3.05       23         M
46.4                  2.7        24         F
47.5                  2.75       28         F
49.2                  3.1        22         M
51                    3.15       22         M
49.2                  2.95       23         F
48.8                  0.75       26         M
(a) Fit an appropriate regression model to these data, evaluate it,
and revise it as suggested by your evaluation.
(b) Think of another potential predictor variable that could further
explain the variation in the sample starting salaries.
Solution
(a)
(X3: 0=Female, 1=Male)
Y      X1     X2   X3
50.5   2.95   22   0
52     3.2    23   1
53.1   3.4    22   1
54.4   3.6    23   1
53.2   3.5    24   1
47     2.85   24   0
50     3.1    25   0
50.8   3.2    26   0
47.7   3.05   23   1
46.4   2.7    24   0
47.5   2.75   28   0
49.2   3.1    22   1
51     3.15   22   1
49.2   2.95   23   0
48.8   0.75   26   1
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.703494178
R Square            0.494904058
Adjusted R Square   0.35715062
Standard Error      1.931858531
Observations        15
ANOVA
             df   SS        MS        F         Significance F
Regression   3    40.2245   13.4082   3.59268   0.04982
Residual     11   41.0529   3.73208
Total        14   81.2773
            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   40.3640753     10.34532195      3.90167   0.00247   17.59418    63.134      17.5942       63.134
X1          1.8664914      0.883585363      2.11241   0.05833   -0.07827    3.81125     -0.07827      3.81125
X2          0.11969999     0.360596758      0.33195   0.74617   -0.67397    0.91337     -0.67397      0.91337
X3          2.50171596     1.120612565      2.23245   0.04732   0.035264    4.96817     0.03526       4.96817
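The summary figures for the three-predictor fit follow from the ANOVA sums of squares by simple arithmetic. The sketch below is only a cross-check, not part of the original spreadsheet: it takes the two sums of squares and the sample size from the output above (the variable names are illustrative) and recomputes F, R Square, Adjusted R Square, and the Standard Error.

```python
# Values copied from the ANOVA table of the three-predictor fit.
ss_regression = 40.2245
ss_residual = 41.0529
n_obs = 15          # observations
n_predictors = 3    # X1, X2, X3

df_regression = n_predictors
df_residual = n_obs - n_predictors - 1
ss_total = ss_regression + ss_residual

# Mean squares and the overall F statistic.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# R Square, its degrees-of-freedom adjusted version, and the
# standard error of the regression (root mean square residual).
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
std_error = ms_residual ** 0.5

print(f"F = {f_stat:.5f}")
print(f"R Square = {r_squared:.6f}, Adjusted R Square = {adj_r_squared:.6f}")
print(f"Standard Error = {std_error:.6f}")
```

The adjusted value penalizes each added predictor, which is why it is the more useful figure when comparing this fit with a model that uses fewer predictors.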
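Age (X2) has a P-value of 0.74617 in the fit above, so it is dropped and the model is refit with X1 and X3:
Y      X1     X3
50.5   2.95   0
52     3.2    1
53.1   3.4    1
54.4   3.6    1
53.2   3.5    1
47     2.85   0
50     3.1    0
50.8   3.2    0
47.7   3.05   1
46.4   2.7    0
47.5   2.75   0
49.2   3.1    1
51     3.15   1
49.2   2.95   0
48.8   0.75   1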
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.69989
R Square            0.48984
Adjusted R Square   0.40482
Standard Error      1.85885
Observations        15
ANOVA
             df   SS        MS        F          Significance F
Regression   2    39.8132   19.9066   5.761116   0.01763
Residual     12   41.4641   3.45534
Total        14   81.2773
            Coefficients   Standard Error   t Stat    P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   43.70409143    2.31443          18.8833   2.73E-10   38.6614     48.7468     38.6614       48.74
X1          1.730310245    0.753            2.29789   0.040352   0.08966     3.37096     0.08966       3.370
X3          2.334050035    0.96252          2.42493   0.032028   0.23689     4.43121     0.23689       4.431
The model would be:
Y=43.704+1.73031X1+2.33405X3
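The reduced fit of salary on GPA and gender can be reproduced outside the spreadsheet. The following is a minimal sketch, assuming the 15 rows listed above and the coding 0 = female, 1 = male; the helper names `solve_linear_system` and `fit_ols` are illustrative and not from the original workbook. It solves the normal equations with plain Gaussian elimination, so it needs nothing beyond the standard library.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations update both sides together.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(predictor_rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    design = [[1.0] + list(row) for row in predictor_rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


salary = [50.5, 52, 53.1, 54.4, 53.2, 47, 50, 50.8, 47.7, 46.4, 47.5, 49.2, 51, 49.2, 48.8]
gpa = [2.95, 3.2, 3.4, 3.6, 3.5, 2.85, 3.1, 3.2, 3.05, 2.7, 2.75, 3.1, 3.15, 2.95, 0.75]
male = [0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1]

b0, b_gpa, b_male = fit_ols(list(zip(gpa, male)), salary)

# R Square from the residual and total sums of squares.
fitted = [b0 + b_gpa * g + b_male * s for g, s in zip(gpa, male)]
mean_salary = sum(salary) / len(salary)
sse = sum((y - f) ** 2 for y, f in zip(salary, fitted))
sst = sum((y - mean_salary) ** 2 for y in salary)
r_squared = 1 - sse / sst

print(f"intercept = {b0:.5f}, GPA = {b_gpa:.5f}, gender = {b_male:.5f}")
print(f"R Square = {r_squared:.5f}")
```

The last data row has a GPA of 0.75, far below every other value, so the fit is worth repeating with and without that row before relying on the coefficients.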
Question 10.17
An insurance executive wished to estimate the relationship between the number of days
of work lost by auto accident victims Y and the age X1 and gender X2 of the victim. A representative
sample of 25 loss reports was selected, resulting in the least squares equation
Y-Bar=21.4-0.0072X1-2.5X2. For this equation, SST = 4.750, SSE = 3.180, SD(b1)=0.11, SD(b2)=0.99.
(a) Do you detect an association between the response variable Y and the two predictor
variables as a group? Support your answer.
(b) Is the incremental contribution of age discernible, given the person's gender? Explain.
(c) Is the incremental contribution of gender discernible, given the person's age? Explain.
(d) What do your conclusions in parts (a)-(c) suggest about basing premiums for income
replacement on the age and gender of the insured when work time is lost due to an
automobile accident?
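The test statistics that parts (a)-(c) ask about can be computed directly from the summary figures in the question. The sketch below is one hedged way to do that, not the spreadsheet approach used in the solution that follows. It assumes n = 25 and two predictors, and it assumes the second standard deviation reads SD(b2) = 0.99 (the value is split across a page break in the source). The critical values are the usual 5% table values for 2 and 22 degrees of freedom and are typed in by hand rather than computed.

```python
# Summary figures from the question.
n_obs = 25
n_predictors = 2
sst = 4.750
sse = 3.180
b_age, sd_age = -0.0072, 0.11
b_gender, sd_gender = -2.5, 0.99   # ASSUMPTION: SD(b2) reconstructed as 0.99

df_error = n_obs - n_predictors - 1   # 22

# (a) Overall F test for the two predictors as a group.
ssr = sst - sse
f_stat = (ssr / n_predictors) / (sse / df_error)
r_squared = ssr / sst

# (b), (c) t ratios for each coefficient, given the other predictor.
t_age = b_age / sd_age
t_gender = b_gender / sd_gender

# Table values at the 5% level (hand-entered assumptions, not computed here).
F_CRITICAL = 3.44    # F(0.05; 2, 22)
T_CRITICAL = 2.074   # two-sided t(0.025; 22)

group_discernible = f_stat > F_CRITICAL
age_discernible = abs(t_age) > T_CRITICAL
gender_discernible = abs(t_gender) > T_CRITICAL

print(f"F = {f_stat:.4f}, R Square = {r_squared:.4f}")
print(f"t(age) = {t_age:.4f}, t(gender) = {t_gender:.4f}")
print(group_discernible, age_discernible, gender_discernible)
```

Under these assumptions the group F ratio and the gender t ratio exceed their table values while the age t ratio does not, which bears on part (d).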
Solution:
Y-Bar=21.4-0.0072X1-2.5X2
(a) The data sample for analysis can be considered as
(X2: 0=Female, 1=Male)
X3 = X1*X2
Y         X1   X2   X3
21.2416   22   0    0
18.7344   23   1    23
18.7416   22   1    22
18.7344   23   1    23
18.7272   24   1    24
21.2272   24   0    0
21.22     25   0    0
21.2128   26   0    0
18.7344   23   1    23
21.2272   24   0    0
21.1984   28   0    0
18.7416   22   1    22
18.7416   22   1    22
21.2344   23   0    0
18.7128   26   1    26
Y         X3
21.2416   0
18.7344   23
18.7416   22
18.7344   23
18.7272   24
21.2272   0
21.22     0
21.2128   0
18.7344   23
21.2272   0
21.1984   0
18.7416   22
18.7416   22
21.2344   0
18.7128   26
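The Y values in the tables above agree with the least squares equation from the question evaluated at each (X1, X2) pair. The sketch below rebuilds that column, forms X3 = X1*X2, and fits Y on X3 with the usual simple regression formulas. It illustrates the arithmetic only; the list names are hypothetical and the ages and gender codes are copied from the table.

```python
ages = [22, 23, 22, 23, 24, 24, 25, 26, 23, 24, 28, 22, 22, 23, 26]
male = [0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1]

# Y evaluated from the stated equation Y = 21.4 - 0.0072*X1 - 2.5*X2.
days_lost = [21.4 - 0.0072 * a - 2.5 * g for a, g in zip(ages, male)]

# Interaction column X3 = X1 * X2 (age for males, 0 for females).
interaction = [a * g for a, g in zip(ages, male)]

n = len(days_lost)
mean_x = sum(interaction) / n
mean_y = sum(days_lost) / n

# Simple linear regression: slope = Sxy / Sxx, intercept from the means.
sxx = sum((x - mean_x) ** 2 for x in interaction)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(interaction, days_lost))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"Y = {intercept:.5f} + ({slope:.5f}) X3")
```

Because Y here is generated from the equation rather than observed, the very high fit statistics describe how well X3 tracks the equation, not how well the equation describes the 25 loss reports.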
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.997180093
R Square            0.994368138
Adjusted R Square   0.993934918
Standard Error      0.100126163
Observations        15
ANOVA
             df   SS            MS           F
Regression   1    23.0109068    23.0109068   2295.295382
Residual     13   0.130328232   0.01002525
Total        14   23.14123503
            Coefficients   Standard Error   t Stat         P-value      Lower 95%
Intercept   21.21514683    0.037779413      561.5531057    6.8402E-30   21.13352937
X3          -0.107014068   0.002233683      -47.90924109   5.211E-16    -0.111839647
The model can be defined as
Y=21.21515-0.10701X3
(b) Considering the incremental contribution of age, given gender
Y X1 X2
21.2416 22 0
18.7344 22 1
18.7416 22 0 0.354433637
18.7344 23 0 4.82175E-26
18.7272 24 0
21.2272 24 0
21.22 25 0
21.2128 26 1
18.7344 23 1
21.2272 24 1
21.1984 28 1
18.7416 22 1
18.7416 22 1
21.2344 23 1
18.7128 26 1
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.413975326
R Square            0.171375571
Adjusted R Square   0.033271499
Standard Error      1.264100229
Observations        15
ANOVA
             df   SS            MS           F
Regression   2    3.965842359   1.98292118   1.240916135
Residual     12   19.17539267   1.59794939
Total        14   23.14123503
            Coefficients   Standard Error   t Stat       P-value
Intercept   12.6565445     4.625799429      2.73607723   0.018062093
X1          0.306936126    0.197591481      1.55338744   0.146295239
X2          -0.12434555    0.681414417      -0.1824815   0.858251378
SD(b2)=0.99
Significance F
5.21105E-16
            Upper 95%      Lower 95.0%    Upper 95.0%
Intercept   21.29676429    21.13352937    21.29676429
X3          -0.102188488   -0.111839647   -0.102188488
-0.005756376 0.002226964 -0.005756376 0.00223
-2.502595094 -2.47506373 -2.502595094 -2.47506
Significance F
0.323702756
            Lower 95%      Upper 95%     Lower 95.0%    Upper 95.0%
Intercept   2.57779336     22.73529565   2.57779336     22.7353
X1          -0.123578729   0.73745098    -0.123578729   0.73745
X2          -1.609020024   1.360328925   -1.609020024   1.36033
Question 10.23
A manufacturing firm wishes to predict the manufacturing unit cost Y (in dollars) of one
of its products as a function of fluctuating production rate X1 and material and labor costs X2.
(X1 is measured as a percentage of rated capacity, and X2 is a standard index that combines
the costs of material and labor.) Representative data were collected over a 20-month span
during which the production rate and labor costs fluctuated considerably.
Y       X1    X2     Y       X1    X2
13.59   87    80     15.93   102   116
15.71   78    95     16.45   82    117
15.97   81    106    19.02   74    127
20.21   65    115    18.16   85    133
24.64   51    128    18.57   86    135
21.25   62    128    17.01   90    136
18.94   70    115    18.03   93    140
14.85   91    92     19.22   81    142
15.18   94    93     21.12   72    148
16.3    100   111    23.32   60    150
Fit an appropriate regression model to these data, evaluate the resulting least squares
equation, and revise it as necessary.
Solution
Month #   Y       X1    X2
1         13.59   87    80
2         15.71   78    95
3         15.97   81    106
4         20.21   65    115
5         24.64   51    128
6         21.25   62    128
7         18.94   70    115
8         14.85   91    92
9         15.18   94    93
10        16.3    100   111
11        15.93   102   116
12        16.45   82    117
13        19.02   74    127
14        18.16   85    133
15        18.57   86    135
16        17.01   90    136
17        18.03   93    140
18        19.22   81    142
19        21.12   72    148
20        23.32   60    150
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.95601
R Square            0.91396
Adjusted R Square   0.90384
Standard Error      0.89419
Observations        20
ANOVA
             df   SS        MS        F          Significance F
Regression   2    144.387   72.1937   90.28915   8.8E-10
Residual     17   13.5929   0.79958
Total        19   157.98
            Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   20.28126941    2.12525          9.543      3.1E-08   15.79738    24.7652     15.7974       24.7652
X1          -0.137695838   0.01585          -8.68549   1.2E-07   -0.17114    -0.10425    -0.17114      -0.10425
X2          0.074245424    0.01096          6.77134    3.3E-06   0.051112    0.09738     0.05111       0.09738
The model would be:
Y=20.2813-0.1377X1+0.07425X2
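With only two predictors, the least squares coefficients have a closed form in terms of centered sums of squares and cross products. The sketch below applies those formulas to the 20 monthly observations as a cross-check on the spreadsheet output. The variable names and the helper `centered_cross_sum` are illustrative, not taken from the workbook.

```python
unit_cost = [13.59, 15.71, 15.97, 20.21, 24.64, 21.25, 18.94, 14.85, 15.18, 16.3,
             15.93, 16.45, 19.02, 18.16, 18.57, 17.01, 18.03, 19.22, 21.12, 23.32]
production_rate = [87, 78, 81, 65, 51, 62, 70, 91, 94, 100,
                   102, 82, 74, 85, 86, 90, 93, 81, 72, 60]
cost_index = [80, 95, 106, 115, 128, 128, 115, 92, 93, 111,
              116, 117, 127, 133, 135, 136, 140, 142, 148, 150]

n = len(unit_cost)
mean_y = sum(unit_cost) / n
mean_1 = sum(production_rate) / n
mean_2 = sum(cost_index) / n


def centered_cross_sum(u, mean_u, v, mean_v):
    """Sum of products of deviations from the means."""
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


s11 = centered_cross_sum(production_rate, mean_1, production_rate, mean_1)
s22 = centered_cross_sum(cost_index, mean_2, cost_index, mean_2)
s12 = centered_cross_sum(production_rate, mean_1, cost_index, mean_2)
s1y = centered_cross_sum(production_rate, mean_1, unit_cost, mean_y)
s2y = centered_cross_sum(cost_index, mean_2, unit_cost, mean_y)

# Solve the 2x2 centered normal equations by Cramer's rule.
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean_y - b1 * mean_1 - b2 * mean_2

# Regression sum of squares and R Square.
sst = centered_cross_sum(unit_cost, mean_y, unit_cost, mean_y)
ssr = b1 * s1y + b2 * s2y
r_squared = ssr / sst

print(f"Y = {b0:.4f} + ({b1:.4f}) X1 + ({b2:.5f}) X2, R Square = {r_squared:.5f}")
```

The negative cross sum `s12` shows that the two predictors moved in opposite directions over the period, which is worth keeping in mind when reading each coefficient as a separate effect.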
Question 10.21
How well can a taxpayer's taxes Y, as a percentage of gross income, be predicted from his or her
gross income X (in thousands of dollars)? The following represents a random sample of 14 federal
income tax returns in a given year:
Income X   % tax Y
45.6       10.4
62.2       11.8
77.6       14.7
118.8      16.7
30.4       5.8
50.1       10.2
60         13.9
49.3       10.9
36.1       7
38         9.1
108.2      16.1
54         12.6
42.1       9.8
90         16.6
Fit a simple linear regression model to these data, evaluate the resulting least squares
equation (including a residual analysis) and revise it as necessary.
Linear Regression Equation:
y = 0.1157x + 4.7037
R² = 0.8363
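The question also asks for a residual analysis, which the chart output alone does not show. The sketch below refits the line from the 14 returns listed above and prints the residual for each one. It is a minimal illustration with hypothetical variable names. On these data the 0.8363 figure comes out as the coefficient of determination, so the correlation coefficient itself is about 0.914.

```python
income = [45.6, 62.2, 77.6, 118.8, 30.4, 50.1, 60, 49.3, 36.1, 38, 108.2, 54, 42.1, 90]
tax_pct = [10.4, 11.8, 14.7, 16.7, 5.8, 10.2, 13.9, 10.9, 7, 9.1, 16.1, 12.6, 9.8, 16.6]

n = len(income)
mean_x = sum(income) / n
mean_y = sum(tax_pct) / n

# Least squares slope and intercept for the simple linear model.
sxx = sum((x - mean_x) ** 2 for x in income)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, tax_pct))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residuals (observed minus fitted) and the coefficient of determination.
residuals = [y - (intercept + slope * x) for x, y in zip(income, tax_pct)]
sse = sum(e ** 2 for e in residuals)
sst = sum((y - mean_y) ** 2 for y in tax_pct)
r_squared = 1 - sse / sst
correlation = r_squared ** 0.5   # slope is positive, so r is the positive root

print(f"y = {slope:.4f}x + {intercept:.4f}, R Square = {r_squared:.4f}, r = {correlation:.4f}")
# Residuals listed in order of income, to look for a pattern.
for x, e in sorted(zip(income, residuals)):
    print(f"income {x:6.1f}  residual {e:+.3f}")
```

If the residuals run negative at both ends of the income range and positive in the middle, that would suggest a curved relationship and a revised model, for example one using a transformation of income.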
[Chart: % tax Y, axis scale 0 to 20]