TRANSCRIPT
Chapter 11
Simple Linear Regression
2
Probabilistic Models
General form of Probabilistic Models
$y = \text{Deterministic Component} + \text{Random Error}$
where $E(y) = \text{Deterministic Component}$
3
Probabilistic Models
First Order (Straight-Line) Probabilistic Model
$y = \beta_0 + \beta_1 x + \varepsilon$
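To make the two components of the straight-line model concrete, here is a minimal simulation sketch. The parameter values `beta0`, `beta1`, and `sigma` are hypothetical, and drawing the random error from a normal distribution is an assumption made for illustration.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1, sigma = 1.0, 0.5, 0.3

def mean_response(x):
    """Deterministic component E(y) = beta0 + beta1 * x."""
    return beta0 + beta1 * x

random.seed(0)  # fixed seed so the run is repeatable
xs = [1, 2, 3, 4, 5]
# Each observed y is the line value plus a random error with mean 0.
ys = [mean_response(x) + random.gauss(0, sigma) for x in xs]

for x, y in zip(xs, ys):
    err = y - mean_response(x)
    print(f"x={x}  E(y)={mean_response(x):.2f}  y={y:.2f}  error={err:+.2f}")
```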
4
Probabilistic Models
5 steps of Simple Linear Regression
1. Hypothesize the deterministic component of the model
2. Use sample data to estimate the unknown model parameters
3. Specify the probability distribution of the random error ε and estimate the standard deviation of this distribution
4. Statistically evaluate the usefulness of the model
5. Once the model is judged useful, use it for prediction and estimation
5
Fitting the Model: The Least Squares Approach
Reaction Time versus Drug Percentage
Subject   Amount of Drug x (%)   Reaction Time y (seconds)
1         1                      1
2         2                      1
3         3                      2
4         4                      2
5         5                      4
6
Fitting the Model: The Least Squares Approach
Least Squares Line has:
• Sum of errors (SE) = 0
• Sum of squared errors (SSE) is the smallest of all straight-line models
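Formulas:
Least squares line: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
Slope: $\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}}$
y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
where
$SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}$
$SS_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \dfrac{(\sum x_i)^2}{n}$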
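The slope and intercept formulas can be checked against the reaction time data. The following is a minimal sketch that uses the shortcut forms of SS_xy and SS_xx. The variable names are illustrative, not from the slides.

```python
# Reaction time data from the drug example (x = drug %, y = seconds).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

sum_x, sum_y = sum(x), sum(y)

# Shortcut formulas for the sums of squares.
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n

b1 = ss_xy / ss_xx                 # slope estimate
b0 = sum_y / n - b1 * sum_x / n    # y-intercept estimate: y-bar - b1 * x-bar

print(f"SS_xy = {ss_xy}, SS_xx = {ss_xx}")
print(f"y-hat = {b0:.1f} + {b1:.1f} x")
```

For these five observations the sketch should give a slope of 0.7 and an intercept of -0.1.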
7
Fitting the Model: The Least Squares Approach
Preliminary Computations
          x_i         y_i         x_i^2         x_i y_i
          1           1           1             1
          2           1           4             2
          3           2           9             6
          4           2           16            8
          5           4           25            20
Totals    Σx_i = 15   Σy_i = 10   Σx_i^2 = 55   Σx_i y_i = 37
Comparing Observed and Predicted Values for the Least Squares Prediction Equation
x    y    ŷ = -.1 + .7x    (y - ŷ)    (y - ŷ)^2
1    1    .6               .4         .16
2    1    1.3              -.3        .09
3    2    2.0              0.0        .00
4    2    2.7              -.7        .49
5    4    3.4              .6         .36
Sum of Errors = 0    SSE = 1.10
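The table above can be reproduced in a few lines. This is a minimal sketch that takes the prediction equation ŷ = -.1 + .7x as given and recomputes each error and squared error. The function name `predict` is illustrative.

```python
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

def predict(xi):
    """Least squares prediction equation for the drug example."""
    return -0.1 + 0.7 * xi

# Error for each observation: observed y minus predicted y.
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]
sum_errors = sum(residuals)
sse = sum(e ** 2 for e in residuals)

for xi, yi, e in zip(x, y, residuals):
    print(f"{xi}  {yi}  {predict(xi):.1f}  {e:+.1f}  {e * e:.2f}")

# Adding 0.0 after rounding avoids printing a floating-point "-0.0".
print("Sum of errors =", round(sum_errors, 10) + 0.0)
print(f"SSE = {sse:.2f}")
```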
8
Model Assumptions
1. Mean of the probability distribution of ε is 0
2. Variance of the probability distribution of ε is constant for all values of x
3. Probability distribution of ε is normal
4. Values of ε are independent of each other
9
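An Estimator of $\sigma^2$
Estimator of $\sigma^2$ for a straight-line model
$s^2 = \dfrac{SSE}{\text{Degrees of freedom for error}} = \dfrac{SSE}{n - 2}$
where
$SSE = SS_{yy} - \hat{\beta}_1 SS_{xy}$
$SS_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \dfrac{(\sum y_i)^2}{n}$
$s = \sqrt{s^2}$ = Estimated Standard Error of the Regression Model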
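Applied to the drug example, the estimator works out as in the sketch below. It recomputes the sums of squares from the raw data so that it stands alone. The variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

b1 = ss_xy / ss_xx              # least squares slope
sse = ss_yy - b1 * ss_xy        # shortcut formula for SSE
s2 = sse / (n - 2)              # n - 2 degrees of freedom for error
s = math.sqrt(s2)               # estimated standard error of the model

print(f"SSE = {sse:.2f}, s^2 = {s2:.3f}, s = {s:.3f}")
```

The SSE obtained this way should match the 1.10 found by summing the squared errors directly.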
10
Assessing the Utility of the Model: Making Inferences about the Slope β1
Sampling Distribution of $\hat{\beta}_1$
$\sigma_{\hat{\beta}_1} = \dfrac{\sigma}{\sqrt{SS_{xx}}}$
11
Assessing the Utility of the Model: Making Inferences about the Slope β1
A Test of Model Utility: Simple Linear Regression
One-Tailed Test
H0: β1 = 0
Ha: β1 < 0 (or Ha: β1 > 0)
Rejection region: t < -tα (or t > tα when Ha: β1 > 0)
Two-Tailed Test
H0: β1 = 0
Ha: β1 ≠ 0
Rejection region: |t| > tα/2
Test statistic: $t = \dfrac{\hat{\beta}_1}{s_{\hat{\beta}_1}} = \dfrac{\hat{\beta}_1}{s / \sqrt{SS_{xx}}}$
where tα and tα/2 are based on (n-2) degrees of freedom
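As an illustration, the two-tailed test can be carried out on the drug example. This sketch assumes α = .05, which the slides do not specify. The critical value 3.182 is the standard t-table entry for α/2 = .025 with 3 degrees of freedom and is hard-coded rather than computed.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

b1 = ss_xy / ss_xx
s = math.sqrt((ss_yy - b1 * ss_xy) / (n - 2))

s_b1 = s / math.sqrt(ss_xx)     # estimated standard error of the slope
t_stat = b1 / s_b1              # test statistic for H0: beta1 = 0

# Assumed: alpha = .05, so t_{.025} with n - 2 = 3 df (t-table value).
t_crit = 3.182
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```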
12
Assessing the Utility of the Model: Making Inferences about the Slope β1
A 100(1-α)% Confidence Interval for β1
$\hat{\beta}_1 \pm t_{\alpha/2} \, s_{\hat{\beta}_1}$
where $s_{\hat{\beta}_1} = \dfrac{s}{\sqrt{SS_{xx}}}$
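A minimal sketch of this interval for the drug example follows. It assumes a 95% confidence level, which is not stated in the slides, and hard-codes the t-table value 3.182 for 3 degrees of freedom.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

b1 = ss_xy / ss_xx
s = math.sqrt((ss_yy - b1 * ss_xy) / (n - 2))
s_b1 = s / math.sqrt(ss_xx)

t_crit = 3.182                  # assumed 95% level: t_{.025}, 3 df
half_width = t_crit * s_b1
lower, upper = b1 - half_width, b1 + half_width

print(f"95% CI for the slope: ({lower:.2f}, {upper:.2f})")
```

Because the whole interval lies above 0, it agrees with a two-tailed test at the same assumed level that rejects H0: β1 = 0.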
13
The Coefficient of Correlation
A measure of the strength of the linear relationship between two variables x and y
$r = \dfrac{SS_{xy}}{\sqrt{SS_{xx} \, SS_{yy}}}$
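For the drug example, the coefficient of correlation can be computed as in this short sketch. The variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

# r has the same sign as SS_xy, and therefore the same sign as the slope.
r = ss_xy / math.sqrt(ss_xx * ss_yy)

print(f"r = {r:.3f}")
```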
14
The Coefficient of Determination
$r^2 = \dfrac{SS_{yy} - SSE}{SS_{yy}} = 1 - \dfrac{SSE}{SS_{yy}}$
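The sketch below evaluates the coefficient of determination for the drug example and checks that it equals the square of the correlation coefficient, as it must in simple linear regression. The variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

sse = ss_yy - (ss_xy / ss_xx) * ss_xy
r_squared = 1 - sse / ss_yy     # share of variation in y explained by x

r = ss_xy / math.sqrt(ss_xx * ss_yy)
print(f"r^2 = {r_squared:.3f}, (r)^2 = {r * r:.3f}")
```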
15
Using the Model for Estimation and Prediction
Sampling errors and confidence intervals will be larger for Predictions than for Estimates
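Standard error of $\hat{y}$:
$\sigma_{\hat{y}} = \sigma \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$
Standard error of the prediction:
$\sigma_{(y - \hat{y})} = \sigma \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$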
16
Using the Model for Estimation and Prediction
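100(1-α)% Confidence interval for Mean Value of y at x=xp
$\hat{y} \pm t_{\alpha/2} \, s \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$
100(1-α)% Confidence interval for an Individual New Value of y at x=xp
$\hat{y} \pm t_{\alpha/2} \, s \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$
where tα/2 is based on (n-2) degrees of freedom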
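To show how the two intervals differ in width, here is a minimal sketch for the drug example. The value x_p = 4 and the 95% confidence level are assumptions chosen for illustration. The t value 3.182 (3 degrees of freedom) is taken from a standard t table.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
s = math.sqrt((ss_yy - b1 * ss_xy) / (n - 2))

x_p = 4          # hypothetical x value, not taken from the slides
t_crit = 3.182   # assumed 95% level: t_{.025}, 3 df
y_hat = b0 + b1 * x_p

core = 1 / n + (x_p - x_bar) ** 2 / ss_xx
ci_half = t_crit * s * math.sqrt(core)        # mean value of y
pi_half = t_crit * s * math.sqrt(1 + core)    # individual new value of y

print(f"Mean of y at x={x_p}:  {y_hat:.1f} +/- {ci_half:.2f}")
print(f"New y at x={x_p}:      {y_hat:.1f} +/- {pi_half:.2f}")
```

The extra 1 under the square root is what makes the interval for an individual new value wider than the interval for the mean.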