Upload: albert-houston

Post on 30-Dec-2015

M22- Regression & Correlation 1 Department of ISM, University of Alabama, 1992-2003

Lesson Objectives

Know what the equation of a straight line is, in terms of slope and y-intercept.

Learn how to find the equation of the least squares regression line.

Know how to draw a regression line on a scatterplot.

Know how to use the regression equation to estimate the mean of Y for a given value of X.

Scatterplot

Best graphical tool for “seeing” the relationship between two quantitative variables.

Use to identify:

• Patterns (relationships)

• Unusual data (outliers)

[Figure: four scatterplot panels of Y against X, titled: Positive Linear Relationship; Negative Linear Relationship; Nonlinear Relationship, need to change the model; No Relationship (X is not useful).]

Regression Analysis mechanics

Equation of a straight line.

Days of algebra:

$Y = mx + b$

m = slope = “rate of change”

b = the “y” intercept.

Statistics form:

$\hat{Y} = a + bx$

b = slope

a = the “y” intercept.

$\hat{Y}$ = estimate of the mean of Y for some X value.

Equation of a straight line. How are the slope and y-intercept determined?

by “eyeball”.

by using equations by hand.

by hand calculator.

by computer: Minitab, Excel, etc.

Equation of a straight line.

$\hat{Y} = a + bx$

[Figure: a straight line plotted against the X-axis, starting at 0. The line crosses the Y-axis at $a$, the “y” intercept, and its slope is $b = \text{rise} / \text{run}$.]

Example 1:

Population: All ST 260 students

Measure: Y = Weight in pounds, X = Height in inches.

Is height a good estimator of mean weight?

Each value of X defines a subpopulation of “weight” values.

The goal is to estimate the true mean weight for each of the infinite number of subpopulations.

Example 1:

Sample of n = 5 students. Y = Weight in pounds, X = Height in inches.

Case   Ht   Wt
1      73   175
2      68   158
3      67   140
4      72   207
5      62   115

Step 1?

DTDP

Example 1

[Figure: scatterplot of WEIGHT (vertical axis, 100 to 220) against HEIGHT (horizontal axis, 60 to 76) showing the five (X, Y) data points: (73, 175), (68, 158), (67, 140), (72, 207), (62, 115).]

Where should the line go?

Equation of Least Squares Regression Line

$\hat{y} = a + bx$

Slope: $b = \dfrac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$ (page 615)

y-intercept: $a = \bar{y} - b\bar{x}$

These are not the preferred computational equations.

Basic intermediate calculations

(1) $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$

(2) $S_{xx} = \sum (x_i - \bar{x})^2$

(3) $S_{yy} = \sum (y_i - \bar{y})^2$

(2) and (3) are each the numerator part of a sample variance, $s^2$.

Look at your formula sheet

Alternate intermediate calculations

(1) $S_{xy} = \sum xy - \dfrac{(\sum x)(\sum y)}{n}$

(2) $S_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$

(3) $S_{yy} = \sum y^2 - \dfrac{(\sum y)^2}{n}$

(2) and (3) are each the numerator part of a sample variance, $s^2$.

Look at your formula sheet

Example 1

Case   x (Ht)   y (Wt)   xy (Ht*Wt)   x^2 (Ht^2)   y^2 (Wt^2)
1      73       175      12775        5329         30625
2      68       158      10744        4624         24964
3      67       140      .            .            .
4      72       207      .            .            .
5      62       115      .            .            .
Sum    342      795      54933        23470        131263

Example 1: Intermediate Summary Values

(1) $\sum xy - \dfrac{(\sum x)(\sum y)}{n} = 54933 - \dfrac{(342)(795)}{5}$

(2) $\sum x^2 - \dfrac{(\sum x)^2}{n} = 23470 - \dfrac{(342)^2}{5}$

(3) $\sum y^2 - \dfrac{(\sum y)^2}{n} = 131263 - \dfrac{(795)^2}{5}$

Example 1: Intermediate Summary Values

(1) $S_{xy} = 555.0$

(2) $S_{xx} = 77.2$

(3) $S_{yy} = 4858.0$

Once these values are calculated, the rest is easy!
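As a check on these three values, the sketch below computes each one twice from the Example 1 data: once with the basic (deviation) form and once with the alternate (shortcut) form. The function names `basic_sums` and `shortcut_sums` are illustrative choices, not part of the course material.

```python
# Example 1 data: heights (inches) and weights (pounds) of the n = 5 students.
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def basic_sums(x, y):
    """Return (Sxy, Sxx, Syy) computed from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy, sxx, syy


def shortcut_sums(x, y):
    """Return (Sxy, Sxx, Syy) computed from the raw sums (alternate form)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    sxx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    syy = sum(yi ** 2 for yi in y) - sum_y ** 2 / n
    return sxy, sxx, syy


# Both lines print [555.0, 77.2, 4858.0] after rounding to one decimal.
print([round(v, 1) for v in basic_sums(heights, weights)])
print([round(v, 1) for v in shortcut_sums(heights, weights)])
```

The two forms are algebraically identical; the shortcut form only needs the column totals, which is why it is convenient for hand calculation.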

Least Squares Regression Line

Prediction equation: $\hat{Y} = a + bX$, where

Estimated slope: $b = \dfrac{(1)}{(2)} = \dfrac{S_{xy}}{S_{xx}}$

Estimated Y-intercept: $a = \bar{y} - b\bar{x}$

Example 1: Slope, for Weight vs. Height

$b = \dfrac{(1)}{(2)} = \dfrac{555}{77.2} = 7.189$

Example 1: Intercept, for Weight vs. Height

$\bar{y} = \dfrac{795}{5} = 159$

$\bar{x} = \dfrac{342}{5} = 68.4$

$a = \bar{y} - b\bar{x} = 159 - (7.189)(68.4) = -332.73$

Example 1: Prediction equation

$\hat{Y} = a + bX$

$\hat{Y} = -332.73 + 7.189X$

$\widehat{Wt} = -332.73 + 7.189\,Ht$
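The slope and intercept arithmetic can be reproduced with a few lines of Python. This is a minimal sketch that takes the Example 1 summary values as given; the function name `least_squares_line` is an illustrative choice.

```python
# Summary values from Example 1.
S_XY = 555.0   # (1)
S_XX = 77.2    # (2)
N = 5          # sample size
SUM_X = 342    # total of the heights
SUM_Y = 795    # total of the weights


def least_squares_line(sxy, sxx, sum_x, sum_y, n):
    """Return (a, b) for the line y_hat = a + b * x."""
    b = sxy / sxx                      # slope = (1) / (2)
    a = sum_y / n - b * (sum_x / n)    # intercept = y_bar - b * x_bar
    return a, b


a, b = least_squares_line(S_XY, S_XX, SUM_X, SUM_Y, N)
print(f"slope b = {b:.3f}")       # slope b = 7.189
print(f"intercept a = {a:.2f}")   # intercept a = -332.74
```

The intercept prints as -332.74 because the code carries the unrounded slope. Using the rounded slope 7.189, as in the hand calculation, gives 159 - (7.189)(68.4) = -332.73.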

Example 1: Draw the line on the plot

[Figure: scatterplot of WEIGHT (100 to 220) against HEIGHT (60 to 76) with the line $\hat{Y} = -332.7 + 7.189X$.]

Example 1: Draw the line on the plot

At X = 60: $\hat{Y} = -332.7 + 7.189(60) = 98.64$

At X = 76: $\hat{Y} = -332.7 + 7.189(76) = 213.7$

[Figure: scatterplot of WEIGHT (100 to 220) against HEIGHT (60 to 76), with the two computed points marked at X = 60 and X = 76.]
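Two points are enough to draw a straight line, so evaluating the prediction equation at two X values near the edges of the plot gives everything needed. The sketch below uses the rounded coefficients shown on the plot; the function name `predict_mean_weight` is an assumption made for illustration.

```python
# Rounded coefficients as they appear on the plot.
A = -332.7   # intercept
B = 7.189    # slope


def predict_mean_weight(height_in):
    """Estimated mean weight (pounds) at a given height (inches)."""
    return A + B * height_in


# Evaluate the line at the left and right edges of the HEIGHT axis.
endpoints = [(x, predict_mean_weight(x)) for x in (60, 76)]
for x, y_hat in endpoints:
    print(f"X = {x}: Y-hat = {y_hat:.2f}")
# X = 60: Y-hat = 98.64
# X = 76: Y-hat = 213.66
```

Connecting (60, 98.64) and (76, 213.66) with a ruler draws the regression line; 213.66 is the same value that appears as 213.7 when rounded to one decimal.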

What a regression equation gives you:

The “line of means” for the Y population.

A prediction of the mean of the population of Y-values defined by a specific value of X.

Each value of X defines a subpopulation of Y-values; the value of the regression equation is the “least squares” estimate of the mean of that Y subpopulation.

Example 2: Estimate the weight of a student 5’ 5” tall.

$\hat{Y} = a + bX = -332.73 + 7.189X$

Example 2

$\hat{Y} = -332.7 + 7.189(65) =$ _____

[Figure: scatterplot of WEIGHT (100 to 220) against HEIGHT (60 to 76) with the regression line, read at HEIGHT = 65.]
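Example 2 has two steps: convert 5’ 5” to inches, then evaluate the prediction equation. A minimal sketch follows, with helper names (`to_inches`, `estimated_mean_weight`) chosen here for illustration and the rounded coefficients from the plot as defaults.

```python
def to_inches(feet, inches):
    """Convert a height given in feet and inches to total inches."""
    return 12 * feet + inches


def estimated_mean_weight(height_in, a=-332.7, b=7.189):
    """Evaluate y_hat = a + b * x with the rounded Example 1 coefficients."""
    return a + b * height_in


height = to_inches(5, 5)   # 5 ft 5 in
print(height, round(estimated_mean_weight(height), 1))   # 65 134.6
```

The result estimates the mean weight of all students who are 65 inches tall, not the exact weight of any one student.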

Calculate your own weight.

Why was your estimate not exact?

Regression: Know How To:

1. Calculate the least squares regression line.

2. Plot the data and draw the line through the data.

3. Predict Y for a given X.

4. Interpret the meaning of the regression line.

Correlation

Sample Correlation Coefficient, r

A numerical summary statistic that measures the strength of the linear association between two quantitative variables.

Notation:

• r = sample correlation.

• ρ = population correlation, “rho”.

r is an “estimator” of ρ.

Interpreting correlation:

$-1.0 \le r \le +1.0$

r > 0.0 Pattern runs upward from left to right; “positive” trend.

r < 0.0 Pattern runs downward from left to right; “negative” trend.

Upward & downward trends:

[Figure: two panels of Y against the X-axis, one with an upward pattern (r > 0.0) and one with a downward pattern (r < 0.0).]

Slope and correlation must have the same sign.

All data exactly on a straight line:

[Figure: two panels of Y against the X-axis, titled “Perfect positive relationship” and “Perfect negative relationship”.]

Perfect positive relationship: r = _____

Perfect negative relationship: r = _____

Which has stronger correlation?

[Figure: two scatterplot panels of Y against the X-axis.]

r = _____________ r = _____________

r close to -1 or +1 means _________________________ linear relation.

r close to 0 means _________________________ linear relation.

"Strength": How tightly the data follow a straight line.

Which has stronger correlation?

[Figure: two scatterplot panels of Y against the X-axis.]

r = ________________ r = ________________

Which has stronger correlation?

[Figure: two scatterplot panels of Y against the X-axis; one shows a strong parabolic pattern.]

Strong parabolic pattern! We can fix it.

r = ________________ r = ________________
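Because r measures only linear association, a strong curved pattern can still produce a correlation near zero. The sketch below illustrates this with hypothetical data (a perfect parabola, not the data behind the plots above); the `correlation` helper is written out here rather than taken from any library.

```python
import math


def correlation(x, y):
    """Sample correlation r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: Y is exactly X squared, with X symmetric about 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_parabola = correlation(xs, ys)
print(r_parabola)   # zero: no linear association despite a perfect pattern
```

Y is completely determined by X in this data, yet r is zero, which is why a scatterplot should always accompany the number.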

Computing Correlation

by hand using the formula

using a calculator (built-in)

using a computer: Excel, Minitab, . . . .

Formula for Sample Correlation (Page 627)

$r = \dfrac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n\sum x_i^2 - (\sum x_i)^2}\,\sqrt{n\sum y_i^2 - (\sum y_i)^2}}$

$r = \dfrac{(1)}{\sqrt{(2)(3)}} = \dfrac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$

Look at your formula sheet

Calculating Correlation

Example 1; Weight versus Height

$r = \dfrac{(1)}{\sqrt{(2)(3)}} =$ _____

Use the intermediate summary values from Example 1: (1) $S_{xy} = 555.0$, (2) $S_{xx} = 77.2$, (3) $S_{yy} = 4858.0$.

Look at your formula sheet
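The correlation for Example 1 follows directly from the three intermediate summary values. A short sketch, with constant names chosen here for illustration:

```python
import math

# Example 1 intermediate summary values.
S_XY = 555.0    # (1)
S_XX = 77.2     # (2)
S_YY = 4858.0   # (3)

r = S_XY / math.sqrt(S_XX * S_YY)   # (1) / sqrt((2) * (3))
print(f"r = {r:.3f}")               # r = 0.906
```

The positive sign matches the positive slope b = 7.189, and a value this close to +1 indicates that the five points follow a straight line fairly tightly.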

Example 6: Real estate data, previous section

[Figure: Scatterplot of Selling Price vs Square Footage for 50 Houses. Vertical axis SELPRICE (100000 to 200000), horizontal axis SQFT (1500 to 3000).]

Positive Linear Relationship

r =

Example 7: AL school data, previous section

[Figure: Scatterplot of 8th Grade SAT Percentile vs Free Lunch Participation for the 128 Public School Systems in Alabama in 1995. Vertical axis SATPT8 (10 to 90), horizontal axis FRLUNCH (0 to 100).]

Negative Linear Relationship

r =

Example 9: Rainfall data, previous section

[Figure: Scatterplot of Tuscaloosa Rainfall vs Moscow Rainfall for 60 Months. Vertical axis T_RAIN (1 to 6), horizontal axis M_RAIN (0 to 6).]

No Linear Relationship

r =

Comment 1:

Size of “r” does NOT reflect the steepness of the slope, “b”; but “r” and “b” must have the same sign.

$r = b\,\dfrac{s_x}{s_y}$ and $b = r\,\dfrac{s_y}{s_x}$
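The two identities in Comment 1 can be checked numerically on the Example 1 data. This is a sketch; the variable names are illustrative, and $s_x$ and $s_y$ are taken to be the sample standard deviations with divisor n - 1.

```python
import math

# Example 1 data: heights (inches) and weights (pounds).
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]
n = len(heights)

x_bar, y_bar = sum(heights) / n, sum(weights) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
sxx = sum((x - x_bar) ** 2 for x in heights)
syy = sum((y - y_bar) ** 2 for y in weights)

b = sxy / sxx                      # least squares slope
r = sxy / math.sqrt(sxx * syy)     # sample correlation
s_x = math.sqrt(sxx / (n - 1))     # sample standard deviation of X
s_y = math.sqrt(syy / (n - 1))     # sample standard deviation of Y

r_from_slope = b * s_x / s_y       # r = b * s_x / s_y
b_from_r = r * s_y / s_x           # b = r * s_y / s_x
print(round(r_from_slope, 3), round(b_from_r, 3))   # 0.906 7.189
```

Since $s_x$ and $s_y$ are always positive, the identities also show why r and b must share a sign, while the ratio $s_y / s_x$ lets the slope be steep or shallow for the same r.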

Comment 2:

Changing the units of Y and X does not affect the size of r.

Inches to centimeters

Pounds to kilograms

Celsius to Fahrenheit

X to Z (standardized)
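Comment 2 can be seen directly by converting the Example 1 data to metric units and recomputing r. The sketch below writes out its own `correlation` helper; the conversion factors 2.54 cm per inch and 0.45359237 kg per pound are the standard exact definitions.

```python
import math


def correlation(x, y):
    """Sample correlation r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


heights_in = [73, 68, 67, 72, 62]
weights_lb = [175, 158, 140, 207, 115]

heights_cm = [2.54 * h for h in heights_in]         # inches to centimeters
weights_kg = [0.45359237 * w for w in weights_lb]   # pounds to kilograms

r_original = correlation(heights_in, weights_lb)
r_metric = correlation(heights_cm, weights_kg)
print(round(r_original, 3), round(r_metric, 3))     # 0.906 0.906
```

The scale factors multiply the numerator and denominator of r by the same amount, so they cancel. The slope b, in contrast, does change with the units.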

Comment 3: High correlation does not always imply causation.

Causation: Changes in X actually do cause changes in Y.

Example: X = dryer temperature, Y = drying time for clothes

Consistency, responsiveness, mechanism

Comment 4: Common Response

Both X and Y change as some unobserved third variable changes.

Example: In basketball, there is a high correlation between points scored and personal fouls committed over a season. Third variable is ___?

Comment 5: Confounding

The effect of X on Y is "hopelessly" mixed up with the effects of other variables on Y.

Example: Is adult behavior most affected by environment or genetics?

The end