
Post on 15-Jan-2016


Chapter 11

Simple Linear Regression

Probabilistic Models

General form of Probabilistic Models

$$y = \text{Deterministic Component} + \text{Random Error}$$

where

$$E(y) = \text{Deterministic Component}$$

First Order (Straight-Line) Probabilistic Model

$$y = \beta_0 + \beta_1 x + \varepsilon$$
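A minimal sketch of what the straight-line probabilistic model says: each observed y is the deterministic component plus a random error with mean 0. The parameter values, the normal error distribution, and the helper name `simulate_y` are assumptions chosen only for illustration; they are not taken from the slides.

```python
import random

def simulate_y(beta0, beta1, x, sigma, rng):
    """Draw one y from y = beta0 + beta1*x + epsilon, epsilon ~ Normal(0, sigma)."""
    deterministic = beta0 + beta1 * x   # E(y), the deterministic component
    epsilon = rng.gauss(0.0, sigma)     # random error with mean 0
    return deterministic + epsilon

rng = random.Random(0)                  # fixed seed so the run is repeatable
beta0, beta1, sigma = 1.0, 2.0, 0.5     # hypothetical values, illustration only

draws = [simulate_y(beta0, beta1, 3.0, sigma, rng) for _ in range(10000)]
mean_y = sum(draws) / len(draws)
print(round(mean_y, 2))                 # should be close to E(y) = 1 + 2*3 = 7
```

Averaging many simulated values at the same x recovers the deterministic component, which is what E(y) = Deterministic Component expresses.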

5 steps of Simple Linear Regression

1. Hypothesize the deterministic component

2. Use sample data to estimate unknown model parameters

3. Specify the probability distribution of the random error ε and estimate the standard deviation of that distribution

4. Statistically evaluate model usefulness

5. Once the model is judged useful, use it for prediction and estimation

Fitting the Model: The Least Squares Approach

Reaction Time versus Drug Percentage

Subject   Amount of Drug x (%)   Reaction Time y (seconds)
1         1                      1
2         2                      1
3         3                      2
4         4                      2
5         5                      4

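The least squares line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ has:
• Sum of errors (SE) = 0
• Sum of squared errors (SSE) smaller than that of any other straight-line model

Formulas:

Slope:

$$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}}$$

y-intercept:

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

where

$$SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}$$

$$SS_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}$$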
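As a rough illustration, the slope and intercept formulas can be applied to the five reaction-time observations using the computational (shortcut) forms of SS_xy and SS_xx. The variable names are illustrative choices, not part of the slides.

```python
# Reaction-time data from the table above.
x = [1, 2, 3, 4, 5]   # amount of drug (%)
y = [1, 1, 2, 2, 4]   # reaction time (seconds)
n = len(x)

# Shortcut forms: SS_xy = sum(x*y) - sum(x)*sum(y)/n, SS_xx = sum(x^2) - sum(x)^2/n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

beta1_hat = ss_xy / ss_xx                        # slope
beta0_hat = sum(y) / n - beta1_hat * sum(x) / n  # y-intercept

print(ss_xy, ss_xx)                              # 7.0 10.0
print(round(beta1_hat, 4), round(beta0_hat, 4))  # approximately 0.7 and -0.1
```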

Preliminary Computations

          x_i         y_i         x_i^2         x_i y_i
          1           1           1             1
          2           1           4             2
          3           2           9             6
          4           2           16            8
          5           4           25            20
Totals    Σx_i = 15   Σy_i = 10   Σx_i^2 = 55   Σx_i y_i = 37

Comparing Observed and Predicted Values for the Least Squares Prediction Equation

x    y    ŷ = -.1 + .7x    (y - ŷ)    (y - ŷ)^2
1    1    .6               .4         .16
2    1    1.3              -.3        .09
3    2    2.0              0.0        .00
4    2    2.7              -.7        .49
5    4    3.4              .6         .36

Sum of Errors = 0    SSE = 1.10
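The two properties of the least squares line (errors summing to 0, and an SSE of 1.10 for this data set) can be checked numerically. This is a sketch that refits the line from the raw data and then forms the errors; floating-point arithmetic means the sum of errors comes out as a value extremely close to 0 rather than exactly 0.

```python
x = [1, 2, 3, 4, 5]   # amount of drug (%)
y = [1, 1, 2, 2, 4]   # reaction time (seconds)
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least squares estimates from the definitional forms of SS_xy and SS_xx.
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar

y_hat = [beta0_hat + beta1_hat * xi for xi in x]   # predicted values
errors = [yi - yh for yi, yh in zip(y, y_hat)]     # y - y_hat
sum_errors = sum(errors)
sse = sum(e ** 2 for e in errors)

print([round(v, 1) for v in y_hat])   # predicted values, one per row of the table
print(abs(sum_errors) < 1e-9, round(sse, 2))
```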

Model Assumptions

1. Mean of the probability distribution of ε is 0

2. Variance of the probability distribution of ε is constant for all values of x

3. Probability distribution of ε is normal

4. Values of ε are independent of each other

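An Estimator of σ²

Estimator of σ² for a straight-line model:

$$s^2 = \frac{SSE}{\text{Degrees of freedom for error}} = \frac{SSE}{n - 2}$$

where

$$SSE = \sum (y_i - \hat{y}_i)^2 = SS_{yy} - \hat{\beta}_1 SS_{xy}$$

$$SS_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}$$

$$s = \sqrt{s^2} = \text{Estimated Standard Error of the Regression Model}$$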
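For the reaction-time data, the estimator can be worked through with the shortcut formula for SSE. The sketch below assumes the same five observations used in the least squares example; the variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]   # amount of drug (%)
y = [1, 1, 2, 2, 4]   # reaction time (seconds)
n = len(x)

ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

beta1_hat = ss_xy / ss_xx
sse = ss_yy - beta1_hat * ss_xy   # SSE = SS_yy - beta1_hat * SS_xy
s2 = sse / (n - 2)                # two degrees of freedom are used by beta0 and beta1
s = math.sqrt(s2)                 # estimated standard error of the regression model

print(round(sse, 2), round(s2, 3), round(s, 3))
```

The shortcut SSE agrees with the 1.10 obtained by summing the squared errors directly, which gives s² of about 0.367 and s of about 0.61.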

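Assessing the Utility of the Model: Making Inferences about the Slope β1

Sampling Distribution of $\hat{\beta}_1$

Under the four model assumptions, $\hat{\beta}_1$ is normally distributed with mean β1 and standard deviation

$$\sigma_{\hat{\beta}_1} = \frac{\sigma}{\sqrt{SS_{xx}}}$$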

A Test of Model Utility: Simple Linear Regression

One-Tailed Test

H0: β1 = 0

Ha: β1 < 0 (or Ha: β1 > 0)

Rejection region: t < -tα (or t > tα when Ha: β1 > 0)

Two-Tailed Test

H0: β1 = 0

Ha: β1 ≠ 0

Rejection region: |t| > tα/2

Test statistic:

$$t = \frac{\hat{\beta}_1}{s_{\hat{\beta}_1}} = \frac{\hat{\beta}_1}{s / \sqrt{SS_{xx}}}$$

where tα and tα/2 are based on (n-2) degrees of freedom
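A sketch of the two-tailed test applied to the reaction-time data. The significance level α = .05 is an assumption made for this example, and the critical value 3.182 is the standard t-table entry for t.025 with 3 degrees of freedom, hard-coded here rather than computed.

```python
import math

x = [1, 2, 3, 4, 5]   # amount of drug (%)
y = [1, 1, 2, 2, 4]   # reaction time (seconds)
n = len(x)

ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

beta1_hat = ss_xy / ss_xx
s = math.sqrt((ss_yy - beta1_hat * ss_xy) / (n - 2))

s_beta1 = s / math.sqrt(ss_xx)   # estimated standard error of the slope
t_stat = beta1_hat / s_beta1     # test statistic for H0: beta1 = 0

t_crit = 3.182                   # assumed: t-table value, alpha/2 = .025, df = n - 2 = 3
reject_h0 = abs(t_stat) > t_crit # two-tailed rejection region |t| > t_{alpha/2}

print(round(t_stat, 2), reject_h0)
```

With t of roughly 3.66 falling in the rejection region, the sample gives evidence at this assumed α that the slope differs from 0.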

A 100(1-α)% Confidence Interval for β1

$$\hat{\beta}_1 \pm t_{\alpha/2}\, s_{\hat{\beta}_1}$$

where

$$s_{\hat{\beta}_1} = \frac{s}{\sqrt{SS_{xx}}}$$
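The interval can be sketched for the reaction-time data at an assumed 95% confidence level. As before, 3.182 is the t-table value for t.025 with n - 2 = 3 degrees of freedom and is hard-coded as an assumption.

```python
import math

x = [1, 2, 3, 4, 5]   # amount of drug (%)
y = [1, 1, 2, 2, 4]   # reaction time (seconds)
n = len(x)

ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

beta1_hat = ss_xy / ss_xx
s = math.sqrt((ss_yy - beta1_hat * ss_xy) / (n - 2))
s_beta1 = s / math.sqrt(ss_xx)

t_crit = 3.182                    # assumed: t-table value for 95% confidence, 3 df
half_width = t_crit * s_beta1
ci_low, ci_high = beta1_hat - half_width, beta1_hat + half_width

print(round(ci_low, 2), round(ci_high, 2))
```

The interval runs from about 0.09 to about 1.31. It excludes 0, which is consistent with rejecting H0: β1 = 0 in a two-tailed test at α = .05.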

The Coefficient of Correlation

A measure of the strength of the linear relationship between two variables x and y

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}$$
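As an illustration, the coefficient of correlation for the reaction-time data can be computed from the three sums of squares. The variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]   # amount of drug (%)
y = [1, 1, 2, 2, 4]   # reaction time (seconds)
n = len(x)

ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

r = ss_xy / math.sqrt(ss_xx * ss_yy)   # always between -1 and 1
print(round(r, 3))
```

A value of about 0.90 indicates a fairly strong positive linear relationship in this small sample.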

The Coefficient of Determination

$$r^2 = \frac{SS_{yy} - SSE}{SS_{yy}} = 1 - \frac{SSE}{SS_{yy}}$$
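A short sketch for the reaction-time data, computing the coefficient of determination from SSE and SS_yy and checking that it matches the square of the correlation coefficient.

```python
import math

x = [1, 2, 3, 4, 5]   # amount of drug (%)
y = [1, 1, 2, 2, 4]   # reaction time (seconds)
n = len(x)

ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

sse = ss_yy - (ss_xy / ss_xx) * ss_xy    # SSE = SS_yy - beta1_hat * SS_xy
r_squared = 1 - sse / ss_yy              # proportion of variation in y explained by x
r = ss_xy / math.sqrt(ss_xx * ss_yy)

print(round(r_squared, 3), math.isclose(r_squared, r ** 2))
```

Here about 82% of the sample variation in reaction time is explained by the straight-line relationship with drug percentage.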

Using the Model for Estimation and Prediction

Sampling errors and confidence intervals will be larger for Predictions than for Estimates

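Standard error of $\hat{y}$ (the estimator of the mean value of y at x = xp):

$$\sigma_{\hat{y}} = \sigma \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$$

Standard error of the prediction:

$$\sigma_{(y - \hat{y})} = \sigma \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$$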

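100(1-α)% Confidence interval for Mean Value of y at x=xp

$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$$

100(1-α)% Prediction interval for an Individual New Value of y at x=xp

$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$$

where tα/2 is based on (n-2) degrees of freedom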
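Both intervals can be sketched for the reaction-time data. The choice xp = 4 and the 95% level are assumptions made for this example; 3.182 is the t-table value for t.025 with 3 degrees of freedom, hard-coded rather than computed.

```python
import math

x = [1, 2, 3, 4, 5]   # amount of drug (%)
y = [1, 1, 2, 2, 4]   # reaction time (seconds)
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar
sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

x_p = 4          # hypothetical value of x at which to estimate and predict
t_crit = 3.182   # assumed: t-table value for 95% confidence, df = n - 2 = 3

y_hat_p = beta0_hat + beta1_hat * x_p
core = 1 / n + (x_p - x_bar) ** 2 / ss_xx          # term shared by both intervals
half_mean = t_crit * s * math.sqrt(core)           # half-width, mean value of y
half_pred = t_crit * s * math.sqrt(1 + core)       # half-width, individual new y

print(round(y_hat_p, 2), round(half_mean, 2), round(half_pred, 2))
```

The extra 1 under the square root makes the interval for an individual new value roughly twice as wide as the interval for the mean here (about ±2.2 versus ±1.1 around a predicted 2.7), which is the sense in which predictions carry larger sampling error than estimates.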
