Univariate Linear Regression Problem
• Model: Y = β0 + β1X + ε
• Test: H0: β1=0.
• Alternative: H1: β1>0.
• The distribution of Y is normal under both null and alternative.
• Under null, var(Y) = σ0².
• Under alternative, β1 > 0 and var(Y) = σ1².
Step 1: Choose the test statistic and specify its null distribution
• Use conditions of the null to find:
$$\hat{\beta}_1 \sim N\!\left(0,\ \frac{\sigma_0^2}{\sum_{i=1}^{n}(x_i-\bar{x}_n)^2}\right).$$
Bringing sample size into regression design
• The sample size n is hidden in the regression results. That is, let:
$$\sum_{i=1}^{n}(x_i-\bar{x}_n)^2 = n\,\sigma_X^2.$$
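The identity can be checked numerically. The sketch below uses made-up x values (an assumption, not data from these slides) and reads σX² as the population variance with divisor n, which is what the definition above implies.

```python
from statistics import fmean, pvariance

# Hypothetical design points; any values would do.
x = [10.0, 20.0, 35.0, 50.0, 80.0]
n = len(x)
x_bar = fmean(x)

# Left side: sum of squared deviations from the sample mean.
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Right side: n times the population variance (divisor n, not n - 1).
n_sigma_x_sq = n * pvariance(x)

print(sxx, n_sigma_x_sq)
```

Using the divisor n - 1 instead would break the identity, so σX² here is not the usual sample variance.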
Step 2: Define the critical value
• For the univariate linear regression test:
$$CV = 0 + |z_\alpha|\sqrt{\frac{\sigma_0^2}{n\,\sigma_X^2}} = 0 + |z_\alpha|\,\frac{\sigma_0/\sigma_X}{\sqrt{n}}.$$
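A minimal sketch of the critical value calculation, assuming the right-sided test above. The function name `critical_value` is hypothetical. The inputs are borrowed from the example problem later in these slides (σ = 400, n = 200, sum of squares 400,000, so σX² = 2000); that example is left-sided, so its critical value carries the opposite sign.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(alpha, sigma0, sigma_x, n):
    """Right-sided critical value for the slope estimate under H0: beta1 = 0."""
    z_alpha = abs(NormalDist().inv_cdf(alpha))  # |z_alpha|
    return 0.0 + z_alpha * (sigma0 / sigma_x) / sqrt(n)


cv = critical_value(alpha=0.01, sigma0=400.0, sigma_x=sqrt(2000.0), n=200)
print(round(cv, 3))
```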
Step 3: Define the Rejection Rule
• Each test is a right sided test, and so the rule is to reject when the test statistic is greater than the critical value.
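Step 4: Specify the Distribution of Test Statistic under Alternative
• Use conditions of the alternative to find (E1 is the value of β1 under the alternative):
$$\hat{\beta}_1 \sim N\!\left(E_1,\ \frac{\sigma_1^2}{n\,\sigma_X^2}\right).$$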
Step 5: Define a Type II Error
• For the univariate linear regression test:
$$\hat{\beta}_1 \le CV = 0 + |z_\alpha|\,\frac{\sigma_0/\sigma_X}{\sqrt{n}}.$$
Step 6: Find β
• For a univariate linear regression test:
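$$\beta = \Pr\nolimits_1\left\{\frac{\hat{\beta}_1 - E_1(\hat{\beta}_1)}{\sigma_1(\hat{\beta}_1)} \le \frac{0 + |z_\alpha|\,\frac{\sigma_0/\sigma_X}{\sqrt{n}} - E_1}{\frac{\sigma_1/\sigma_X}{\sqrt{n}}}\right\}.$$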
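The probability above can be evaluated with the standard normal CDF. This is a sketch under the assumptions of Steps 1 to 5 (right-sided test, E0 = 0). The function name `type_ii_error` and all input values are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, e1, sigma0, sigma1, sigma_x, n):
    """Pr{accept H0} for the right-sided test when the true slope is e1 > 0."""
    std_normal = NormalDist()
    # Critical value from Step 2 (E0 = 0).
    cv = abs(std_normal.inv_cdf(alpha)) * (sigma0 / sigma_x) / sqrt(n)
    # Standard deviation of the slope estimate under the alternative.
    se1 = (sigma1 / sigma_x) / sqrt(n)
    # Standardize the critical value under the alternative.
    return std_normal.cdf((cv - e1) / se1)


# Illustrative inputs only (not taken from the slides).
beta = type_ii_error(alpha=0.05, e1=0.5, sigma0=2.0, sigma1=2.0,
                     sigma_x=1.0, n=100)
print(round(beta, 3))
```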
Basic Insight
• Notice that all three problems have the same basic structure.
• That is, if you understand the solution of the one sample test, then you can derive the answer to the other problems.
Step 7: Phrase requirement on β
• For example, we seek to “choose n so that β=0.01.”
• That is, “choose n so that Pr1{Accept H0} = β = 0.01.”
Step 7: Phrase requirement on β
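• For example, we seek to choose n so that:
$$\beta = \Pr\nolimits_1\left\{\frac{\hat{\beta}_1 - E_1(\hat{\beta}_1)}{\sigma_1(\hat{\beta}_1)} \le \frac{0 + |z_\alpha|\,\frac{\sigma_0/\sigma_X}{\sqrt{n}} - E_1}{\frac{\sigma_1/\sigma_X}{\sqrt{n}}}\right\}.$$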
Step 7: Phrase requirement on β
• Notice the parallel phrasing:
$$\Pr\{Z \le -|z_\beta|\} = \beta.$$
Step 7: Phrase requirement on β
• That is, choose n so that (note that E0 = 0):
$$\frac{E_0 + |z_\alpha|\,\frac{\sigma_0/\sigma_X}{\sqrt{n}} - E_1}{\frac{\sigma_1/\sigma_X}{\sqrt{n}}} = -|z_\beta|.$$
Step 7: Phrase requirement on β
• That is, choose n so that (after algebraic clearing out):
$$(E_1 - E_0)\sqrt{n} = |z_\alpha|\,\frac{\sigma_0}{\sigma_X} + |z_\beta|\,\frac{\sigma_1}{\sigma_X}.$$
Step 8: State the conclusion
• The result for a left sided test has to be worked through but is similar. You must remember to keep all entries positive. This is reasonable if both α and β are constrained to be less than or equal to 0.5. The restriction is not a hardship in practice.
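Solving the Step 7 relation for n gives one way to write the conclusion. This is a sketch that follows from the preceding algebra, not a formula quoted from the slides. Equality gives a Type II error of exactly β, and any larger n gives a smaller one. Taking the absolute value of E1 − E0 keeps every entry positive, which is how the same expression would cover the left-sided case.

```latex
\sqrt{n} \;\ge\; \frac{|z_\alpha|\,\sigma_0/\sigma_X + |z_\beta|\,\sigma_1/\sigma_X}{|E_1 - E_0|}
\quad\Longleftrightarrow\quad
n \;\ge\; \left(\frac{|z_\alpha|\,\sigma_0 + |z_\beta|\,\sigma_1}{\sigma_X\,|E_1 - E_0|}\right)^{2}.
```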
Univariate Linear Regression
• Note that the σ0 factor is changed to σ0/σX.
• There is a similar adjustment for the alternative standard deviation.
Example Problem Group
• Two hundred values of an independent variable xi are chosen so that Σ(xi − x̄)² is equal to 400,000. For each setting of xi, the random variable Yi = β0 + β1xi + σZi is observed. Here β0 and β1 are fixed but unknown parameters, σ = 400, and the Zi are independent standard normal random variables.
Example Problem Group
• The null hypothesis to be tested is H0: β1=0, α=0.01, and the alternative is H1: β1<0. The random variable B1 is the OLS estimate of β1.
Example Question 1
• When H0 is true, what is the standard deviation of B1, the OLS estimate of the slope?
• Var(B1) = σ²/Σ(xi − x̄)² = 400²/400,000 = 0.4.
• sd(B1) = √0.4 = 0.632.
Example Question 2
• What is the probability of a Type II error in the test specified in the common section using B1, the OLS estimator of the slope, as test statistic when β1 = −4, α = 0.01, σ = 400, and Σ(xi − x̄)² is equal to 400,000?
Solution to Question 2
• The critical value is 0-2.326(0.632)=-1.47
• A Type II error occurs when B1>-1.47.
• Under alternative B1 is normal with expected value -4 and standard deviation (error) 0.632.
• Pr{B1>-1.47}=Pr{Z>(-1.47-(-4))/0.632} =Pr{Z>4.00}=.000032
• The answer is 0.000032.
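The arithmetic in this solution can be reproduced in a few lines. The sketch below uses only the numbers given in the problem and treats the left-sided test directly. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Standard deviation of B1: sigma^2 / sum of squares, then square root.
sd_b1 = sqrt(400.0 ** 2 / 400_000.0)

# Left-sided critical value at alpha = 0.01: 0 - |z_alpha| * sd(B1).
cv = 0.0 - abs(NormalDist().inv_cdf(0.01)) * sd_b1

# Under the alternative, B1 is normal with mean -4 and the same sd.
# A Type II error occurs when B1 falls above the critical value.
b1_under_h1 = NormalDist(mu=-4.0, sigma=sd_b1)
type_ii = 1.0 - b1_under_h1.cdf(cv)

print(round(sd_b1, 3), round(cv, 2), type_ii)
```

Small differences from the slide values in the last digits come from rounding the intermediate results there.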
Example Question 3
• How many observations n are necessary so that the probability of a Type II error in the test specified in the common section is 0.01 when β1 = −4, α = 0.01, σ = 400, and Σ(xi − x̄n)² is equal to 2,000n?
Outline of Solution to Problem 3
• For σ0 term, use (400²/2000)^0.5 = 8.94.
• Use same value for σ1 term.
• Use |z0.01|=2.326.
• Use |E1-E0|=|-4-0|=4.
• Square root of sample size is 10.39.
• Sample size is 109 or more.
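The outline can be checked with a short function. This is a sketch that assumes a Type II error target of 0.01, the value used in Step 7, and takes σX² = 2000 from the condition that the sum of squares equals 2,000n. The name `required_n` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_n(alpha, beta, e0, e1, sigma0, sigma1, sigma_x):
    """Smallest n meeting the Type II error target for a one-sided slope test."""
    std_normal = NormalDist()
    z_alpha = abs(std_normal.inv_cdf(alpha))
    z_beta = abs(std_normal.inv_cdf(beta))
    # sqrt(n) = (|z_alpha| sigma0 + |z_beta| sigma1) / (sigma_X |E1 - E0|)
    root_n = (z_alpha * sigma0 + z_beta * sigma1) / (sigma_x * abs(e1 - e0))
    return root_n, ceil(root_n ** 2)


root_n, n_needed = required_n(alpha=0.01, beta=0.01, e0=0.0, e1=-4.0,
                              sigma0=400.0, sigma1=400.0,
                              sigma_x=sqrt(2000.0))
print(round(root_n, 2), n_needed)
```

The square root of the sample size comes out at about 10.4, so rounding up gives the 109 observations stated in the outline.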
Chapter 21: Residual Analysis
• The assumptions in regression may be violated, so the model needs to be checked:
– Residuals are one way of checking the model:
Ri = Yi − (fitted value at xi)
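A minimal sketch of the residual calculation, using plain Python and made-up data (the x and y values are assumptions for illustration only). The helper names `ols_fit` and `residuals` are hypothetical.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Residual = observed Y minus fitted value at the same x."""
    b0, b1 = ols_fit(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = residuals(x, y)
print([round(ri, 3) for ri in r])
```

With an intercept in the model, the residuals always sum to zero, so it is their pattern rather than their average that carries information.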
Checking the Assumptions
– Check for normality (test of normality, histogram, q-q plots)
– Check whether the variance is the same for all values of the independent variable (plot residuals against predicted values)
– Check independence (plot residuals against sequence variable)
– Check for linearity (plot dependent variable against independent variable)
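As one illustration of the normality check, the points of a q-q plot can be computed directly. The sketch below assumes plotting positions (i − 0.5)/n, which is one common convention and may differ from what a given statistics package uses. The residual values are made up.

```python
from statistics import NormalDist


def qq_points(resid):
    """Pair each sorted residual with its expected standard normal quantile."""
    n = len(resid)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n: an assumed convention.
    expected = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(sorted(resid), expected))


# Hypothetical residuals.
points = qq_points([12.0, -30.5, 4.2, 55.1, -8.7, -20.3, 18.9])
for observed, expected in points:
    print(observed, round(expected, 3))
```

If the residuals are roughly normal, these points fall close to a straight line.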
Residual Plots
• Plot residuals against independent variable.
– Plot should be flat, indicating the same variance.
– There should be no fanning out pattern.
– Check for influential observations.
• Plot residuals against predicted values.
– For univariate regression this is the same as the above plot. There should be no pattern.
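A plot is the main tool here, but a rough numeric companion can help. The sketch below is an informal heuristic, not a formal test and not a method from these slides: it compares the spread of the residuals for the larger x values with the spread for the smaller ones. The function name `spread_ratio` and the data are hypothetical.

```python
from statistics import pstdev


def spread_ratio(x, resid):
    """Residual spread in the upper half of x divided by that in the lower half."""
    pairs = sorted(zip(x, resid))          # order cases by x
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]     # residuals at the smaller x values
    high = [r for _, r in pairs[-half:]]   # residuals at the larger x values
    return pstdev(high) / pstdev(low)


# Hypothetical residuals that fan out as x grows.
x = [1, 2, 3, 4, 5, 6, 7, 8]
resid = [0.5, -0.4, 0.6, -0.5, 2.0, -2.2, 2.4, -1.9]
ratio = spread_ratio(x, resid)
print(round(ratio, 2))
```

A ratio far from 1 hints at fanning and is a cue to look at the plot more closely.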
What to do if there is a problem?
• Look for transformations of the independent variable, the dependent variable, or both.
• With the computer this is easy: use the compute option from the menu bar.
Influential Points
• An easier way to look for points that have a large impact on the slope is to plot the change in slope against an arbitrary case sequence number.
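The change in slope can be computed by refitting with each case left out in turn. This is a sketch with made-up data. The helper names are hypothetical, and a statistics package may scale the change differently.

```python
def slope(x, y):
    """Least squares slope of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def slope_changes(x, y):
    """Full-data slope minus the slope with case i deleted, in case order."""
    full = slope(x, y)
    return [full - slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
            for i in range(len(x))]


# Hypothetical data; the last case sits far from the rest.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 20.0]
changes = slope_changes(x, y)
most_influential = max(range(len(x)), key=lambda i: abs(changes[i]))
print([round(c, 3) for c in changes], most_influential)
```

Plotting `changes` against the case index gives the sequence plot described above. Here the last case stands out.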
Example
• Data set in the web page
• aim: predict final exam score from midterm score
• dependent variable: final exam score
• independent variable: midterm score
• model, check assumptions, predict
[Figure: scatterplot of final examination score (vertical axis, 200 to 700) against score on first exam (horizontal axis, 0 to 300).]
Output
• Model: Y = β0 + β1X + ε
• R² = 0.508
• F statistic = 60.91, Significance = 0.0
• Estimate of β1 = 1.391117, t statistic = 7.805, Significance = 0.0
• Estimate of β0 = 238.95, t statistic = 8.329, Significance = 0.0
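The fitted line can be used for the prediction step. The sketch below assumes the output is read as intercept 238.95 and slope 1.391117. The midterm score of 200 is a made-up input, and `predict_final` is a hypothetical name. It also checks that the squared t statistic of the slope is close to the F statistic, as expected with one predictor.

```python
INTERCEPT = 238.95   # estimated intercept, as read from the output
SLOPE = 1.391117     # estimated slope, as read from the output


def predict_final(midterm_score):
    """Predicted final exam score from the fitted line."""
    return INTERCEPT + SLOPE * midterm_score


predicted = predict_final(200.0)   # hypothetical midterm score

# With one predictor, t^2 for the slope should match F up to rounding.
t_squared = 7.805 ** 2

print(round(predicted, 2), round(t_squared, 2))
```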
[Figure: plot of residuals (vertical axis, −200 to 200) against predicted value (horizontal axis, 300 to 600).]
[Figure: histogram of residuals (horizontal axis, −160 to 120; frequency axis, 0 to 14). Std. Dev = 66.68, Mean = 0.0, N = 61.]
[Figure: Normal Q-Q Plot of Residual. Horizontal axis: Observed Value (−200 to 200); vertical axis: Expected Normal Value (−3 to 3).]
Next Class
• Multiple Regression!
• Check web site for your data file