EPI 809/Spring 2008: Probability Distribution of Random Error


Upload: marylou-ellis

Post on 16-Dec-2015


TRANSCRIPT

Page 1: EPI 809/Spring 2008 1 Probability Distribution of Random Error

EPI 809/Spring 2008

Probability Distribution of Random Error

Page 2: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Regression Modeling Steps

1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation

Page 3: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Linear Regression Assumptions

Assumptions of errors $\varepsilon_1, \ldots, \varepsilon_n$
- Gauss-Markov condition

1. Independent errors
2. Mean of probability distribution of errors is 0
3. Errors have constant variance $\sigma^2$, for which an estimator is $S^2$
4. Probability distribution of error is normal
5. Potential violation of G-M condition

Page 4: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Error Probability Distribution

[Figure: error densities $f(\varepsilon)$ drawn around the regression line at $X_1$ and $X_2$; axes X, Y, and $f(\varepsilon)$.]

Page 5: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Random Error Variation

Page 6: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Random Error Variation

1. Variation of Actual Y from Predicted Y

Page 7: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Random Error Variation

1. Variation of Actual Y from Predicted Y
2. Measured by Standard Error of Regression Model
   - Sample Standard Deviation of $\hat{\varepsilon}$, $s$

Page 8: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Random Error Variation

1. Variation of Actual Y from Predicted Y
2. Measured by Standard Error of Regression Model
   - Sample Standard Deviation of $\hat{\varepsilon}$, $s$
3. Affects Several Factors
   - Parameter Significance
   - Prediction Accuracy

Page 9: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Evaluating the Model

Testing for Significance

Page 10: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Regression Modeling Steps

1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation

Page 11: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Test of Slope Coefficient

1. Shows If There Is a Linear Relationship Between X & Y
2. Involves Population Slope $\beta_1$
3. Hypotheses
   - $H_0$: $\beta_1 = 0$ (No Linear Relationship)
   - $H_a$: $\beta_1 \neq 0$ (Linear Relationship)
4. Theoretical basis of the test statistic is the sampling distribution of slope

Page 12: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Sampling Distribution of Sample Slopes

Page 13: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Sampling Distribution of Sample Slopes

[Figure: Y against X, showing the Population Line, the Sample 1 Line, and the Sample 2 Line.]

Page 14: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Sampling Distribution of Sample Slopes

[Figure: Y against X, showing the Population Line, the Sample 1 Line, and the Sample 2 Line.]

All Possible Sample Slopes
- Sample 1: 2.5
- Sample 2: 1.6
- Sample 3: 1.8
- Sample 4: 2.1
- ... (very large number of sample slopes)

Page 15: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Sampling Distribution of Sample Slopes

[Figure: Y against X, showing the Population Line, the Sample 1 Line, and the Sample 2 Line.]

All Possible Sample Slopes
- Sample 1: 2.5
- Sample 2: 1.6
- Sample 3: 1.8
- Sample 4: 2.1
- ... (large number of sample slopes)

Sampling Distribution

[Figure: sampling distribution of $\hat{\beta}_1$, centered at $\beta_1$ with standard error $S_{\hat{\beta}_1}$.]
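The idea of "all possible sample slopes" can be made concrete with a small simulation. The sketch below is only illustrative: the population line (intercept -0.1, slope 0.7) and the error standard deviation 0.6 are assumed values chosen for the demonstration, not quantities given on the slide. It draws many samples at fixed X values and looks at the mean and spread of the resulting slopes.

```python
import random
import statistics

# Hypothetical population line and error SD (assumptions for illustration only).
BETA0, BETA1, SIGMA = -0.1, 0.7, 0.6
X = [1, 2, 3, 4, 5]  # fixed design points


def sample_slope(rng):
    """Draw one sample of Y at the fixed X values and return its least-squares slope."""
    y = [BETA0 + BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in X]
    x_bar = sum(X) / len(X)
    y_bar = sum(y) / len(y)
    ss_xx = sum((xi - x_bar) ** 2 for xi in X)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, y))
    return ss_xy / ss_xx


rng = random.Random(0)  # seeded so the run is repeatable
slopes = [sample_slope(rng) for _ in range(20000)]

mean_slope = statistics.mean(slopes)  # should be close to BETA1
sd_slope = statistics.stdev(slopes)   # should be close to SIGMA / sqrt(SS_xx)
print(round(mean_slope, 3), round(sd_slope, 3))
```

Under these assumptions the simulated slopes should center near the population slope, with a spread near $\sigma / \sqrt{SS_{xx}}$, which is the quantity that $S_{\hat{\beta}_1}$ estimates.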

Page 16: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Slope Coefficient Test Statistic

$$t_{n-2} = \frac{\hat{\beta}_1}{S_{\hat{\beta}_1}} \quad \text{where} \quad S_{\hat{\beta}_1} = \frac{S}{\sqrt{\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}}$$

$$\text{with } S^2 = \frac{SSE}{n-2} \quad \text{and} \quad SSE = \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2 = \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right)^2$$

Page 17: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Test of Slope Coefficient Rejection Rule

Reject $H_0$ in favor of $H_a$ if $t$ falls in the colored area.

Reject $H_0$ for $H_a$ if P-value $= P(|T| > |t|) < \alpha$, where $T \sim t(n-2)$.

[Figure: density of $T \sim t(n-2)$ centered at 0, with "Reject $H_0$" regions of area $\alpha/2$ in each tail, beyond $-t_{1-\alpha/2,\,(n-2)}$ and $t_{1-\alpha/2,\,(n-2)}$.]

Page 18: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Test of Slope Coefficient Example

Reconsider the Obstetrics example with the following data:

Estriol (mg/24h)    B.w. (g/1000)
1                   1
2                   1
3                   2
4                   2
5                   4

Is the linear relationship between Estriol & Birthweight significant at the .05 level?

Page 19: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Solution Table For $\beta$'s

       Xi    Yi    Xi^2   Yi^2   XiYi
       1     1     1      1      1
       2     1     4      1      2
       3     2     9      4      6
       4     2     16     4      8
       5     4     25     16     20
Sum    15    10    55     26     37
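As a check on the table, the column sums can be turned into the least-squares estimates. This is a minimal sketch using the standard shortcut formulas $SS_{xy} / SS_{xx}$ for the slope and $\bar{Y} - \hat{\beta}_1 \bar{X}$ for the intercept; the variable names are illustrative.

```python
# Estriol (X) and birth weight (Y) from the example data.
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

# Column sums, matching the bottom row of the solution table.
sum_x = sum(x)                                   # 15
sum_y = sum(y)                                   # 10
sum_xx = sum(xi * xi for xi in x)                # 55
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # 37

# Corrected sums of squares and cross-products.
ss_xx = sum_xx - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

beta1_hat = ss_xy / ss_xx                        # slope estimate
beta0_hat = sum_y / n - beta1_hat * sum_x / n    # intercept estimate
print(round(beta0_hat, 4), round(beta1_hat, 4))
```

The results should agree with the intercept and slope reported later in the computer output (about -0.1 and 0.7).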

Page 20: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Solution Table for SSE

       Birth weight = y   Estriol = x   Predicted $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$   (Obs - pred)^2 = $(y - \hat{y})^2$
       1                  1             0.6                                                     0.16
       1                  2             1.3                                                     0.09
       2                  3             2                                                       0
       2                  4             2.7                                                     0.49
       4                  5             3.4                                                     0.36
Sum    10                 15            -                                                       SSE = 1.1
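The SSE column can be reproduced directly from the fitted line. The sketch below takes the estimates -0.1 and 0.7 as given (they are the values used in the table's Predicted column) and then computes $S = \sqrt{SSE/(n-2)}$.

```python
import math

x = [1, 2, 3, 4, 5]  # estriol
y = [1, 1, 2, 2, 4]  # birth weight
beta0_hat, beta1_hat = -0.1, 0.7  # estimates used in the table

# Predicted values and squared residuals, row by row.
y_hat = [beta0_hat + beta1_hat * xi for xi in x]
sq_resid = [(yi - yh) ** 2 for yi, yh in zip(y, y_hat)]

sse = sum(sq_resid)                 # sum of squared errors
s = math.sqrt(sse / (len(x) - 2))   # standard error of the regression
print(round(sse, 4), round(s, 5))
```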

Page 21: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Test of Slope Parameter Solution

$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \neq 0$
$\alpha = .05$
df = 5 - 2 = 3
Critical Value(s): -3.1824 and 3.1824

[Figure: t distribution centered at 0, with "Reject" regions of area .025 in each tail, beyond -3.1824 and 3.1824.]

Test Statistic:

Page 22: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Test Statistic Solution

$$t_{n-2} = \frac{\hat{\beta}_1}{S_{\hat{\beta}_1}} = \frac{0.70}{0.1915} = 3.656$$

$$\text{where } S_{\hat{\beta}_1} = \frac{S}{\sqrt{\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}} = \frac{0.60553}{\sqrt{55 - \frac{(15)^2}{5}}} = 0.1915$$

$$\text{with } S = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{1.1}{5-2}} = 0.60553$$

The sums 55 and 15 are taken from the solution table ("From Table").

Page 23: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Test of Slope Parameter

$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \neq 0$
$\alpha = .05$
df = 5 - 2 = 3
Critical Value(s): -3.1824 and 3.1824

[Figure: t distribution centered at 0, with "Reject" regions of area .025 in each tail, beyond -3.1824 and 3.1824.]

Test Statistic:

$$t = \frac{\hat{\beta}_1}{S_{\hat{\beta}_1}} = \frac{0.70}{0.1915} = 3.656$$

Decision: Reject at $\alpha = .05$

Conclusion: There is evidence of a linear relationship
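The whole test can be traced in a few lines. This is a sketch only: the critical value 3.1824 is copied from the slide (it is the tabled value for 3 degrees of freedom at a two-sided .05 level) rather than computed, and the variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]  # estriol
y = [1, 1, 2, 2, 4]  # birth weight
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates.
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Standard error of the regression and of the slope.
sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
se_beta1 = s / math.sqrt(ss_xx)

t_stat = beta1_hat / se_beta1
t_crit = 3.1824  # tabled critical value quoted on the slide (df = 3)
reject_h0 = abs(t_stat) > t_crit
print(round(se_beta1, 4), round(t_stat, 3), reject_h0)
```

The test statistic exceeds the critical value, which matches the slide's decision to reject $H_0$.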

Page 24: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Test of Slope Parameter Computer Output

Parameter Estimates

                   Parameter    Standard
Variable     DF    Estimate     Error       t Value    Pr > |t|
Intercept    1     -0.10000     0.63509     -0.16      0.8849
Estriol      1      0.70000     0.19149      3.66      0.0354

Annotations: Parameter Estimate is $\hat{\beta}_k$; Standard Error is $S_{\hat{\beta}_k}$; t Value is $t = \hat{\beta}_k / S_{\hat{\beta}_k}$; Pr > |t| is the P-Value.

Page 25: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Measures of Variation in Regression

1. Total Sum of Squares ($SS_{yy}$)
   - Measures Variation of Observed $Y_i$ Around the Mean $\bar{Y}$
2. Explained Variation (SSR)
   - Variation Due to Relationship Between X & Y
3. Unexplained Variation (SSE)
   - Variation Due to Other Factors

Page 26: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Variation Measures

[Figure: Y against X with the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$, the mean $\bar{Y}$, and an observed point $Y_i$ at $X_i$.]

- Total sum of squares: $(Y_i - \bar{Y})^2$
- Unexplained sum of squares: $(Y_i - \hat{Y}_i)^2$
- Explained sum of squares: $(\hat{Y}_i - \bar{Y})^2$

Page 27: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Determination

1. Proportion of Variation 'Explained' by Relationship Between X & Y

$$r^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^2 - \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2}{\sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^2}$$

$$0 \le r^2 \le 1$$
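For the obstetrics data used in this deck, the ratio of explained to total variation can be computed as follows. The sketch assumes the fitted values 0.6, 1.3, 2.0, 2.7, 3.4 from the SSE table.

```python
y = [1, 1, 2, 2, 4]                  # observed birth weights
y_hat = [0.6, 1.3, 2.0, 2.7, 3.4]    # fitted values from the SSE table
y_bar = sum(y) / len(y)

ss_yy = sum((yi - y_bar) ** 2 for yi in y)              # total variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation
explained = ss_yy - sse                                 # explained variation

r_squared = explained / ss_yy
print(round(ss_yy, 4), round(sse, 4), round(r_squared, 4))
```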

Page 28: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Determination Examples

[Figure: four scatterplots of Y against X, labeled $r^2 = 1$, $r^2 = 1$, $r^2 = .8$, and $r^2 = 0$.]

Page 29: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Determination Example

Reconsider the Obstetrics example. Interpret a coefficient of determination of 0.8167.

Answer: About 82% of the total variation of birthweight is explained by the mother's Estriol level.

Page 30: EPI 809/Spring 2008 1 Probability Distribution of Random Error

$r^2$ Computer Output

Root MSE          0.60553    R-Square    0.8167
Dependent Mean    2.00000    Adj R-Sq    0.7556
Coeff Var        30.27650

Annotations: Root MSE is $S$; R-Square is $r^2$; Adj R-Sq is $r^2$ adjusted for number of explanatory variables & sample size:

$$\text{Adj R-Sq} = 1 - \left(1 - \text{Rsquare}\right) \cdot \frac{N - 1}{N - k - 1}$$
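The adjustment formula is easy to check against the output. In this sketch the function name `adjusted_r_squared` is illustrative; N = 5 observations and k = 1 explanatory variable are taken from the obstetrics example.

```python
def adjusted_r_squared(r_squared, n_obs, k_predictors):
    """Adjust r^2 for the number of explanatory variables and the sample size."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k_predictors - 1)


adj = adjusted_r_squared(0.8167, n_obs=5, k_predictors=1)
print(round(adj, 4))
```

The value should agree with the Adj R-Sq entry in the output to about four decimal places.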

Page 31: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Using the Model for Prediction & Estimation

Page 32: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Regression Modeling Steps

1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation

Page 33: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Prediction With Regression Models

What Is Predicted?

- Population Mean Response $E(Y)$ for Given X
  - Point on Population Regression Line
- Individual Response ($Y_i$) for Given X

Page 34: EPI 809/Spring 2008 1 Probability Distribution of Random Error

What Is Predicted?

[Figure: Y against X with the population line $E(Y) = \beta_0 + \beta_1 X$ and the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X$; at $X_P$ the plot marks the mean Y, $E(Y)$, the individual Y, and the prediction $\hat{Y}$.]

Page 35: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Confidence Interval Estimate of Mean Y

$$\hat{Y} - t_{n-2,\,\alpha/2} \cdot S_{\hat{Y}} \le E(Y) \le \hat{Y} + t_{n-2,\,\alpha/2} \cdot S_{\hat{Y}}$$

$$\text{where } S_{\hat{Y}} = S \sqrt{\frac{1}{n} + \frac{\left(X_p - \bar{X}\right)^2}{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2}}$$

Page 36: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Factors Affecting Interval Width

1. Level of Confidence ($1 - \alpha$)
   - Width Increases as Confidence Increases
2. Data Dispersion ($s$)
   - Width Increases as Variation Increases
3. Sample Size
   - Width Decreases as Sample Size Increases
4. Distance of $X_p$ from Mean $\bar{X}$
   - Width Increases as Distance Increases

Page 37: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Why Distance from Mean?

[Figure: Sample 1 Line and Sample 2 Line plotted against X, with $\bar{X}$, $\bar{Y}$, $X_1$, and $X_2$ marked; label: "Greater dispersion than $X_1$".]

Page 38: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Confidence Interval Estimate Example

Reconsider the Obstetrics example with the following data:

Estriol (mg/24h)    B.w. (g/1000)
1                   1
2                   1
3                   2
4                   2
5                   4

Estimate the mean BW and a subject's BW response when the Estriol level is 4 at the .05 level.

Page 39: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Solution Table

       Xi    Yi    Xi^2   Yi^2   XiYi
       1     1     1      1      1
       2     1     4      1      2
       3     2     9      4      6
       4     2     16     4      8
       5     4     25     16     20
Sum    15    10    55     26     37

Page 40: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Confidence Interval Estimate Solution - Mean BW

$$\hat{Y} - t_{n-2,\,\alpha/2} \cdot S_{\hat{Y}} \le E(Y) \le \hat{Y} + t_{n-2,\,\alpha/2} \cdot S_{\hat{Y}}$$

$$\hat{Y} = -0.1 + 0.7 \cdot 4 = 2.7 \qquad \text{(X to be predicted: 4)}$$

$$S_{\hat{Y}} = 0.60553 \sqrt{\frac{1}{5} + \frac{(4 - 3)^2}{10}} = 0.3316$$

$$2.7 - 3.1824 \cdot 0.3316 \le E(Y) \le 2.7 + 3.1824 \cdot 0.3316$$

$$1.6445 \le E(Y) \le 3.7553$$
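The interval for the mean response at an estriol level of 4 can be recomputed from the raw data. This is a sketch under the slide's setup: the critical value 3.1824 is copied from the slide rather than computed, and the names are illustrative. Because the slide rounds $S_{\hat{Y}}$ to 0.3316 before multiplying, its endpoints may differ from the unrounded ones in the last digit.

```python
import math

x = [1, 2, 3, 4, 5]  # estriol
y = [1, 1, 2, 2, 4]  # birth weight
n = len(x)
x_p = 4              # estriol level at which the mean is estimated
t_crit = 3.1824      # tabled value quoted on the slide (df = 3, alpha = .05)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar

sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Point estimate and standard error of the estimated mean at x_p.
y_hat_p = beta0_hat + beta1_hat * x_p
se_mean = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)

ci_low = y_hat_p - t_crit * se_mean
ci_high = y_hat_p + t_crit * se_mean
print(round(y_hat_p, 4), round(se_mean, 4), round(ci_low, 4), round(ci_high, 4))
```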

Page 41: EPI 809/Spring 2008 1 Probability Distribution of Random Error

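Prediction Interval of Individual Response

$$\hat{Y} - t_{n-2,\,\alpha/2} \cdot S_{(Y - \hat{Y})} \le Y_P \le \hat{Y} + t_{n-2,\,\alpha/2} \cdot S_{(Y - \hat{Y})}$$

$$\text{where } S_{(Y - \hat{Y})} = S \sqrt{1 + \frac{1}{n} + \frac{\left(X_P - \bar{X}\right)^2}{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2}}$$

Note! The leading 1 under the square root is the extra term compared with the confidence interval for the mean.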
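To see the effect of the extra term, the prediction interval at an estriol level of 4 can be computed from summary values that appear elsewhere in this deck (S = 0.60553, n = 5, mean estriol 3, corrected sum of squares 10, fitted value 2.7, tabled t of 3.1824). The sketch below simply plugs those numbers in; the names are illustrative.

```python
import math

# Summary values taken from the worked example in this deck.
s = 0.60553        # Root MSE
n = 5
x_bar = 3.0
ss_xx = 10.0       # sum of (X_i - X_bar)^2
y_hat_p = 2.7      # fitted value at X_p = 4
x_p = 4
t_crit = 3.1824    # tabled value for df = 3, alpha = .05

# Standard errors: mean response versus individual response.
se_mean = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)
se_pred = s * math.sqrt(1 + 1 / n + (x_p - x_bar) ** 2 / ss_xx)

pi_low = y_hat_p - t_crit * se_pred
pi_high = y_hat_p + t_crit * se_pred
print(round(se_mean, 4), round(se_pred, 4), round(pi_low, 4), round(pi_high, 4))
```

The prediction standard error is always larger than the standard error of the mean, so the prediction interval is wider than the confidence interval at the same X.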

Page 42: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Why the Extra 'S'?

[Figure: Y against X with the population line $E(Y) = \beta_0 + \beta_1 X$ and the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$; at $X_P$ the plot marks the expected (mean) Y, the Y we're trying to predict, and the prediction $\hat{Y}$.]

Page 43: EPI 809/Spring 2008 1 Probability Distribution of Random Error

SAS codes for computing mean and prediction intervals

    Data BW; /*Reading data in SAS*/
    input estriol birthw;
    cards;
    1 1
    2 1
    3 2
    4 2
    5 4
    ;
    run;

    PROC REG data=BW; /*Fitting a linear regression model*/
    model birthw=estriol/CLI CLM alpha=.05;
    run;

Page 44: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Interval Estimate from SAS - Output

The REG Procedure
Dependent Variable: y
Output Statistics

       Dep Var    Predicted    Std Error
Obs    y          Value        Mean Predict    95% CL Mean           95% CL Predict        Residual
1      1.0000     0.6000       0.4690          -0.8927    2.0927     -1.8376    3.0376      0.4000
2      1.0000     1.3000       0.3317           0.2445    2.3555     -0.8972    3.4972     -0.3000
3      2.0000     2.0000       0.2708           1.1382    2.8618     -0.1110    4.1110      0
4      2.0000     2.7000       0.3317           1.6445    3.7555      0.5028    4.8972     -0.7000
5      4.0000     3.4000       0.4690           1.9073    4.8927      0.9624    5.8376      0.6000

Annotations on the slide: Predicted Y when X = 3; $S_{\hat{Y}}$; Confidence Interval; Prediction Interval.

Page 45: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Hyperbolic Interval Bands

[Figure: Y against X with the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and hyperbolic interval bands around it; $\bar{X}$ and $X_P$ are marked on the X axis.]

Page 46: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Correlation Models

Page 47: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Types of Probabilistic Models

Probabilistic Models
- Regression Models
- Correlation Models
- Other Models

Page 48: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Correlation vs. regression

- Both variables are treated the same in correlation; in regression there is a predictor and a response
- In regression the x variable is assumed non-random or measured without error
- Correlation is used in looking for relationships, regression for prediction

Page 49: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Correlation Models

1. Answer 'How Strong Is the Linear Relationship Between 2 Variables?'
2. Coefficient of Correlation Used
   - Population Correlation Coefficient Denoted $\rho$ (Rho)
   - Values Range from -1 to +1
   - Measures Degree of Association
3. Used Mainly for Understanding

Page 50: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Sample Coefficient of Correlation

1. Pearson Product Moment Coefficient of Correlation between x and y:

$$r = \frac{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2 \cdot \sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^2}} = \frac{SS_{xy}}{\sqrt{SS_{xx} \, SS_{yy}}}$$
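Applied to the obstetrics data from the regression example, the formula can be evaluated term by term. This is a minimal sketch with illustrative names; it also squares r to compare with the coefficient of determination reported earlier.

```python
import math

x = [1, 2, 3, 4, 5]  # estriol
y = [1, 1, 2, 2, 4]  # birth weight
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)

r = ss_xy / math.sqrt(ss_xx * ss_yy)  # Pearson correlation coefficient
print(round(r, 4), round(r ** 2, 4))
```

In simple linear regression the square of r equals the coefficient of determination, so the second printed value should match the R-Square of 0.8167.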

Page 51: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Correlation Values

[Scale: -1.0    -.5    0    +.5    +1.0]

Page 52: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Correlation Values

[Scale: -1.0    -.5    0    +.5    +1.0]

- 0: No Correlation

Page 53: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Correlation Values

[Scale: -1.0    -.5    0    +.5    +1.0]

- 0: No Correlation
- From 0 toward -1.0: Increasing degree of negative correlation

Page 54: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Correlation Values

[Scale: -1.0    -.5    0    +.5    +1.0]

- -1.0: Perfect Negative Correlation
- 0: No Correlation

Page 55: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Correlation Values

[Scale: -1.0    -.5    0    +.5    +1.0]

- -1.0: Perfect Negative Correlation
- 0: No Correlation
- From 0 toward +1.0: Increasing degree of positive correlation

Page 56: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Correlation Values

[Scale: -1.0    -.5    0    +.5    +1.0]

- -1.0: Perfect Negative Correlation
- 0: No Correlation
- +1.0: Perfect Positive Correlation

Page 57: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Coefficient of Correlation Examples

[Figure: four scatterplots of Y against X, labeled $r = 1$, $r = -1$, $r = .89$, and $r = 0$.]

Page 58: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Test of Coefficient of Correlation

1. Shows If There Is a Linear Relationship Between 2 Numerical Variables
2. Same Conclusion as Testing Population Slope $\beta_1$
3. Hypotheses
   - $H_0$: $\rho = 0$ (No Correlation)
   - $H_a$: $\rho \neq 0$ (Correlation)

Page 59: EPI 809/Spring 2008 1 Probability Distribution of Random Error

1 Sample t-Test on Correlation Coefficient

Hypotheses
- $H_0$: $\rho = 0$ (No Correlation)
- $H_a$: $\rho \neq 0$ (Correlation)

Test statistic under $H_0$:

$$t = \frac{r \, (n-2)^{1/2}}{(1 - r^2)^{1/2}} \sim t(n-2)$$

Reject $H_0$ if $|t| > t_{\alpha/2,\,n-2}$
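For the obstetrics example this statistic can be evaluated from the reported coefficient of determination. The sketch assumes $r^2 = 0.8167$ and n = 5 from the earlier output, takes r as the positive root because the fitted slope is positive, and copies the critical value 3.1824 from the slope test.

```python
import math

r_squared = 0.8167          # R-Square from the regression output
n = 5
r = math.sqrt(r_squared)    # positive root, since the fitted slope is positive

t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)
t_crit = 3.1824             # tabled value for df = 3, alpha = .05
reject_h0 = abs(t_stat) > t_crit
print(round(r, 4), round(t_stat, 3), reject_h0)
```

The statistic comes out essentially equal to the slope test's t value of 3.656, which illustrates why the two tests lead to the same conclusion.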

Page 60: EPI 809/Spring 2008 1 Probability Distribution of Random Error

1 Sample Z-Test on Correlation Coefficient

Hypotheses (Fisher)
- $H_0$: $\rho = \rho_0$
- $H_a$: $\rho \neq \rho_0$

Test statistic under $H_0$:

$$z = \frac{1}{2} \ln\!\left(\frac{1 + r}{1 - r}\right) \sim N\!\left(\mu_0, \sigma^2\right), \qquad \mu_0 = \frac{1}{2} \ln\!\left(\frac{1 + \rho_0}{1 - \rho_0}\right), \qquad \sigma^2 = \frac{1}{n - 3}$$

Reject $H_0$ if $|z| > z_{1-\alpha/2}$, where the comparison is made after standardizing, that is, using $(z - \mu_0)\sqrt{n - 3}$.
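A short numerical sketch of Fisher's transformation follows. The inputs are assumptions chosen for illustration: r = 0.9037 and n = 5 come from the obstetrics example, and the null value $\rho_0 = 0$ is picked only to have something to test. With n this small the normal approximation is rough, so the result should be read as a demonstration of the arithmetic rather than a recommended analysis.

```python
import math
from statistics import NormalDist

r = 0.9037      # sample correlation from the obstetrics example
n = 5
rho_0 = 0.0     # hypothesized correlation (illustrative choice)
alpha = 0.05

# Fisher transformation: 0.5 * ln((1 + r) / (1 - r)) is the inverse hyperbolic tangent.
z_r = math.atanh(r)
mu_0 = math.atanh(rho_0)

# Standardize using the approximate variance 1 / (n - 3).
z_stat = (z_r - mu_0) * math.sqrt(n - 3)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z_stat) > z_crit
print(round(z_r, 4), round(z_stat, 3), round(z_crit, 3), reject_h0)
```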

Page 61: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Conclusion

1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Ordinary Least Squares
4. Compute Regression Coefficients
5. Understand and check model assumptions
6. Predict Response Variable
7. Comments on SAS Output

Page 62: EPI 809/Spring 2008 1 Probability Distribution of Random Error

Conclusion …

8. Correlation Models
9. Test of coefficient of Correlation