Post on 11-Jan-2016

Correlation and Regression

• Paired Data

• Is there a relationship?

• Do the numbers appear to increase or decrease together?

• Does one set increase as the other decreases?

• How consistent is the pattern?

• If so can we…

• Quantify it?

• Model it with an equation?

• Use the equation for prediction?

x Plastic (lb)   y Household size

0.27   2
1.41   3
2.19   3
2.83   6
2.19   4
1.81   2
0.85   1
3.05   5

• The Linear Correlation Coefficient measures strength and direction of the linear relationship between paired x and y values in a sample.

• ρ (rho) is the population’s linear correlation coefficient.

• r is the sample’s linear correlation coefficient

Linear Correlation Coefficient

-1 (negative)   0 (no correlation)   1 (positive)

Example/Homework

• Estimate r for the following relationships

1. Household size and amount of trash

2. Car weight and gas mileage

3. Car length and braking distance

4. Height and shoe size

5. Facebook friends and time spent online

6. Car cost and number of cup holders

7. Time watching television and SAT scores

8. Outside temperature and student absences

9. Number of pages for the term paper and its grade

10. Number of accidents and car insurance premiums

Calculation

• The value of r is not affected by the units of measurement or by which variable is assigned to x and which to y.

• Round to three decimal places

• The sample of paired data (x,y) is a random sample.

• The pairs of (x,y) data have a bivariate normal distribution.

1. For every x, the paired y values are normally distributed

2. For every y, the paired x values are normally distributed

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$$
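As a rough illustration of the computational formula, r = [nΣxy - (Σx)(Σy)] / [√(nΣx² - (Σx)²) · √(nΣy² - (Σy)²)], and of the claim that r ignores units and the choice of x and y, the sketch below computes r in plain Python for the household size and plastic data from these slides. The function name `pearson_r` and the pounds-to-kilograms conversion are illustrative choices, not part of the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample linear correlation coefficient via the computational formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Paired data from the slides: household size and plastic discarded (lb).
household = [2, 3, 3, 6, 2, 4, 1, 5]
plastic_lb = [0.27, 1.41, 2.19, 2.83, 1.81, 2.19, 0.85, 3.05]

r = pearson_r(household, plastic_lb)

# Swap the roles of x and y.
r_swapped = pearson_r(plastic_lb, household)

# Change units: pounds to kilograms (1 lb = 0.45359237 kg).
plastic_kg = [0.45359237 * p for p in plastic_lb]
r_kg = pearson_r(household, plastic_kg)

print(round(r, 3), round(r_swapped, 3), round(r_kg, 3))
```

All three printed values should agree, since rescaling a variable or swapping x and y leaves both the numerator-to-denominator ratio and its sign unchanged.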

Calculating r

X Y XY X² Y²

2 0.27

3 1.41

3 2.19

6 2.83

2 1.81

4 2.19

1 0.85

5 3.05
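The blank XY, X² and Y² columns can be filled in by hand or with a few lines of code. The following is a minimal sketch in plain Python that tabulates each column for the data above, totals them, and plugs the totals into the formula for r. The variable names and print layout are illustrative only.

```python
from math import sqrt

x = [2, 3, 3, 6, 2, 4, 1, 5]                           # X column
y = [0.27, 1.41, 2.19, 2.83, 1.81, 2.19, 0.85, 3.05]   # Y column

# One row per data pair: X, Y, XY, X^2, Y^2
rows = [(a, b, a * b, a * a, b * b) for a, b in zip(x, y)]

print("  X      Y      XY   X2       Y2")
for a, b, ab, a2, b2 in rows:
    print(f"{a:>3} {b:>6.2f} {ab:>7.2f} {a2:>4} {b2:>8.4f}")

# Column totals: sum of X, Y, XY, X^2, Y^2
totals = [sum(col) for col in zip(*rows)]
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = totals
print("Totals:", [round(t, 4) for t in totals])

# Plug the totals into the computational formula for r.
n = len(x)
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
)
print("r =", round(r, 3))
```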

Calculating r

• Excel

– The CORREL function

• Calculator

– Data into two lists

– STAT->TEST->E: LinRegTTest

– Enter two lists

– Highlight CALCULATE, select ENTER

• Find r (and t and p)

Budget Gross

18.5 81.8

72 75

0.25 12

55 68.75

10 138.3

70 19.8

17 72

8 107.9

Formal Hypothesis Test

• Test whether the linear correlation is significant

• Hypothesis

• H0: ρ = 0 (no significant linear correlation)

• H1: ρ ≠ 0 (significant linear correlation)

• Two-tailed test

• Still need a significance level

• Two methods for calculating the test statistic and critical value

Test Statistic and Critical Value

• Test statistic:

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

• Critical values:

– t-table

– Two-tailed alpha heading

– Degrees of freedom = n - 2
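The first method, with t = r / √((1 - r²)/(n - 2)), can be sketched in a few lines of Python. The inputs here are assumptions for illustration: r = 0.842 is the rounded correlation for the household size and plastic data, n = 8 is its sample size, and 2.447 is the two-tailed critical value for α = 0.05 with 6 degrees of freedom as read from a standard t-table.

```python
from math import sqrt

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / sqrt((1 - r ** 2) / (n - 2))

r, n = 0.842, 8          # assumed inputs: household/plastic data, rounded r
t = t_statistic(r, n)

# Two-tailed critical value for alpha = 0.05, df = n - 2 = 6 (from a t-table).
critical_t = 2.447

# Reject H0 when the test statistic falls beyond the critical values.
reject_h0 = abs(t) > critical_t
print("t =", round(t, 3), "| reject H0:", reject_h0)
```

Because the test is two-tailed, the comparison uses the absolute value of t, so a strong negative correlation is detected the same way as a strong positive one.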

Test Statistic and Critical Value

• Test statistic: r

• Critical value

• Use Table A-5

Table A-5: critical values of r

n     α = .05   α = .01
4     .950      .999
5     .878      .959
6     .811      .917
7     .754      .875
8     .707      .834
9     .666      .798
10    .632      .765
11    .602      .735
12    .576      .708
13    .553      .684
14    .532      .661
15    .514      .641
16    .497      .623
17    .482      .606
18    .468      .590
19    .456      .575
20    .444      .561
25    .396      .505
30    .361      .463
35    .335      .430
40    .312      .402
45    .294      .378
50    .279      .361
60    .254      .330
70    .236      .305
80    .220      .286
90    .207      .269
100   .196      .256
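For the second method, the table lookup can be expressed as a small Python sketch. The dictionary holds the critical values of r by sample size n, with the α = .05 value first and the α = .01 value second, as in Table A-5 on this slide. The names `TABLE_A5`, `critical_r` and `is_significant` are hypothetical, and the sketch only handles sample sizes that appear in the table (no interpolation).

```python
# Critical values of r: n -> (alpha = .05, alpha = .01)
TABLE_A5 = {
    4: (.950, .999), 5: (.878, .959), 6: (.811, .917), 7: (.754, .875),
    8: (.707, .834), 9: (.666, .798), 10: (.632, .765), 11: (.602, .735),
    12: (.576, .708), 13: (.553, .684), 14: (.532, .661), 15: (.514, .641),
    16: (.497, .623), 17: (.482, .606), 18: (.468, .590), 19: (.456, .575),
    20: (.444, .561), 25: (.396, .505), 30: (.361, .463), 35: (.335, .430),
    40: (.312, .402), 45: (.294, .378), 50: (.279, .361), 60: (.254, .330),
    70: (.236, .305), 80: (.220, .286), 90: (.207, .269), 100: (.196, .256),
}

def critical_r(n, alpha=0.05):
    """Look up the critical value of r for sample size n."""
    if n not in TABLE_A5:
        raise ValueError(f"n = {n} is not listed in the table")
    if alpha == 0.05:
        return TABLE_A5[n][0]
    if alpha == 0.01:
        return TABLE_A5[n][1]
    raise ValueError("the table only covers alpha = .05 and alpha = .01")

def is_significant(r, n, alpha=0.05):
    """Two-tailed decision: significant when |r| exceeds the critical value."""
    return abs(r) > critical_r(n, alpha)

print(critical_r(8), is_significant(0.842, 8), is_significant(0.5, 8))
```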

For Example

Is there a correlation between engine size and mileage? If so, is it significant?

r =

Size Mileage

2.2 23

2 23

3 19

2.3 23

4.6 17

2.5 20

4 17

2.4 22
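One way to work this example is sketched below in plain Python: compute r for the engine size and mileage pairs, then compare |r| with the Table A-5 critical value for n = 8. The choice of α = .05 (critical value .707) is an assumption, since the slide does not state a significance level.

```python
from math import sqrt

size = [2.2, 2, 3, 2.3, 4.6, 2.5, 4, 2.4]
mileage = [23, 23, 19, 23, 17, 20, 17, 22]

n = len(size)
sum_x, sum_y = sum(size), sum(mileage)
sum_xy = sum(a * b for a, b in zip(size, mileage))
sum_x2 = sum(a * a for a in size)
sum_y2 = sum(b * b for b in mileage)

# Computational formula for the sample correlation coefficient.
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
)

# Table A-5 critical value for n = 8 at alpha = .05 (assumed level).
critical_value = 0.707
significant = abs(r) > critical_value

print("r =", round(r, 3), "| significant at .05:", significant)
```

A negative r here means mileage tends to fall as engine size rises. The sign does not enter the significance decision, which depends only on |r|.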

Common Errors Involving Correlation

1. Causation: It is wrong to conclude that correlation implies causality.

1. Even if x and y are strongly correlated, we cannot assume that “x causes y”

1. y might cause x

2. Both might be caused by z

2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.

3. There may be some relationship between x and y even when there is no significant linear correlation.
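The third error is easy to demonstrate with made-up numbers. In the sketch below, which uses hypothetical data not taken from the slides, y is completely determined by x through y = x², yet the linear correlation coefficient comes out as zero because the relationship is not linear.

```python
from math import sqrt

# Hypothetical data: a perfect, but nonlinear, relationship.
x = list(range(-3, 4))       # -3, -2, ..., 3
y = [v * v for v in x]       # y = x squared

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Computational formula for the sample correlation coefficient.
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
)
print("r =", r)
```

A scatterplot of these points would show a clear parabola, which is why it helps to plot the data before relying on r alone.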

Homework

• For each of the following pairs of data find the linear correlation coefficient and determine if the correlation is significant.

Math   Critical Reading

720 690

720 590

690 500

680 490

550 470

480 560

664 654

750 710

650 680

560 610

Length   Braking Distance

194 131

183 136

194 129

191 127

198 146

196 146

200 155

188 139

197 133

200 131

191 131

Homework

• For each of the following pairs of data find the linear correlation coefficient and determine if the correlation is significant.

Depth (ft)   Velocity (ft/sec)

0.7 1.55

2.0 1.11

2.6 1.42

3.3 1.39

4.6 1.39

5.9 1.14

7.3 0.91

8.6 0.59

9.9 0.59

10.6 0.41

11.2 0.22

Altitude (km)   Temp (C)

0.0 15.0

0.5 11.8

1.0 8.5

1.5 5.3

2.0 2.0

2.5 -1.2

3.0 -4.5

3.5 -7.7

4.0 -11.0

4.5 -14.2

5.0 -17.5
