chap10 anova

47
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-1 Chapter 10 Analysis of Variance Statistics for Managers Using Microsoft ® Excel 4 th Edition

Upload: novyakurnianingputri

Post on 28-Dec-2015

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-1

Chapter 10

Analysis of Variance

Statistics for ManagersUsing Microsoft® Excel

4th Edition

Page 2: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-2

Chapter Goals

After completing this chapter, you should be able to:

Recognize situations in which to use analysis of variance Understand different analysis of variance designs Perform a single-factor hypothesis test and interpret results Conduct and interpret post-hoc multiple comparisons

procedures Analyze two-factor analysis of variance tests

Page 3: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-3

Chapter Overview

Analysis of Variance (ANOVA)

F-test

Tukey-Kramer

test

One-Way ANOVA

Two-Way ANOVA

InteractionEffects

Page 4: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-4

General ANOVA Setting

Investigator controls one or more independent variables Called factors (or treatment variables) Each factor contains two or more levels (or groups or

categories/classifications) Observe effects on the dependent variable

Response to levels of independent variable Experimental design: the plan used to collect

the data

Page 5: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-5

Completely Randomized Design

Experimental units (subjects) are assigned randomly to treatments Subjects are assumed homogeneous

Only one factor or independent variable With two or more treatment levels

Analyzed by one-factor analysis of variance (one-way ANOVA)

Page 6: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-6

One-Way Analysis of Variance

Evaluate the difference among the means of three or more groups

Examples: Accident rates for 1st, 2nd, and 3rd shift

Expected mileage for five brands of tires

Assumptions Populations are normally distributed Populations have equal variances Samples are randomly and independently drawn

Page 7: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-7

Hypotheses of One-Way ANOVA

All population means are equal i.e., no treatment effect (no variation in means among

groups)

At least one population mean is different i.e., there is a treatment effect Does not mean that all population means are different

(some pairs may be the same)

c3210 μμμμ:H

same the are means population the of all Not:H1

Page 8: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-8

One-Factor ANOVA

All Means are the same:The Null Hypothesis is True

(No Treatment Effect)

c3210 μμμμ:H

same the are μ all Not:H i1

321 μμμ

Page 9: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-9

One-Factor ANOVA

At least one mean is different:The Null Hypothesis is NOT true

(Treatment Effect is present)

c3210 μμμμ:H

same the are μ all Not:H i1

321 μμμ 321 μμμ

or

(continued)

Page 10: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-10

Partitioning the Variation

Total variation can be split into two parts:

SST = Total Sum of Squares (Total variation)

SSA = Sum of Squares Among Groups (Among-group variation)

SSW = Sum of Squares Within Groups (Within-group variation)

SST = SSA + SSW

Page 11: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-11

Partitioning the Variation

Total Variation = the aggregate dispersion of the individual data values across the various factor levels (SST)

Within-Group Variation = dispersion that exists among the data values within a particular factor level (SSW)

Among-Group Variation = dispersion between the factor sample means (SSA)

SST = SSA + SSW

(continued)

Page 12: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-12

Partition of Total Variation

Variation Due to Factor (SSA)

Variation Due to Random Sampling (SSW)

Total Variation (SST)

Commonly referred to as: Sum of Squares Within Sum of Squares Error Sum of Squares Unexplained Within Groups Variation

Commonly referred to as: Sum of Squares Between Sum of Squares Among Sum of Squares Explained Among Groups Variation

= +

Page 13: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-13

Total Sum of Squares

c

1j

n

1i

2ij

j

)XX(SSTWhere:

SST = Total sum of squares

c = number of groups (levels or treatments)

nj = number of observations in group j

Xij = ith observation from group j

X = grand mean (mean of all data values)

SST = SSA + SSW

Page 14: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-14

Total Variation

Group 1 Group 2 Group 3

Response, X

X

2cn

212

211 )XX(...)XX()XX(SST

c

(continued)

Page 15: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-15

Among-Group Variation

Where:

SSA = Sum of squares among groups

c = number of groups or populations

nj = sample size from group j

Xj = sample mean from group j

X = grand mean (mean of all data values)

2j

c

1jj )XX(nSSA

SST = SSA + SSW

Page 16: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-16

Among-Group Variation

Variation Due to Differences Among Groups

i j

2j

c

1jj )XX(nSSA

1

k

SSAMSA

Mean Square Among =

SSA/degrees of freedom

(continued)

Page 17: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-17

Among-Group Variation

Group 1 Group 2 Group 3

Response, X

X1X

2X3X

2222

211 )xx(n...)xx(n)xx(nSSA cc

(continued)

Page 18: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-18

Within-Group Variation

Where:

SSW = Sum of squares within groups

c = number of groups

nj = sample size from group j

Xj = sample mean from group j

Xij = ith observation in group j

2jij

n

1i

c

1j

)XX(SSWj

SST = SSA + SSW

Page 19: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-19

Within-Group Variation

Summing the variation within each group and then adding over all groups

i

cn

SSWMSW

Mean Square Within =

SSW/degrees of freedom

2jij

n

1i

c

1j

)XX(SSWj

(continued)

Page 20: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-20

Within-Group Variation

Group 1 Group 2 Group 3

Response, X

1X2X

3X

2ccn

2212

2111 )XX(...)XX()Xx(SSW

c

(continued)

Page 21: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-21

Obtaining the Mean Squares

cn

SSWMSW

1c

SSAMSA

1n

SSTMST

Page 22: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-22

One-Way ANOVA Table

Source of Variation

dfSS MS(Variance)

Among Groups

SSA MSA =

Within Groups

n - cSSW MSW =

Total n - 1SST =SSA+SSW

c - 1 MSA

MSW

F ratio

c = number of groupsn = sum of the sample sizes from all groupsdf = degrees of freedom

SSA

c - 1

SSW

n - c

F =

Page 23: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-23

One-Factor ANOVAF Test Statistic

Test statistic

MSA is mean squares among variances

MSW is mean squares within variances

Degrees of freedom df1 = c – 1 (c = number of groups)

df2 = n – c (n = sum of sample sizes from all populations)

MSW

MSAF

H0: μ1= μ2 = … = μc

H1: At least two population means are different

Page 24: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-24

Interpreting One-Factor ANOVA F Statistic

The F statistic is the ratio of the among estimate of variance and the within estimate of variance The ratio must always be positive df1 = c -1 will typically be small df2 = n - c will typically be large

Decision Rule: Reject H0 if F > FU,

otherwise do not reject H0

0

= .05

Reject H0Do not reject H0

FU

Page 25: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-25

One-Factor ANOVA F Test Example

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?

Club 1 Club 2 Club 3254 234 200263 218 222241 235 197237 227 206251 216 204

Page 26: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-26

••••

One-Factor ANOVA Example: Scatter Diagram

270

260

250

240

230

220

210

200

190

••

•••

•••••

Distance

1X

2X

3X

X

227.0 x

205.8 x 226.0x 249.2x 321

Club 1 Club 2 Club 3254 234 200263 218 222241 235 197237 227 206251 216 204

Club1 2 3

Page 27: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-27

One-Factor ANOVA Example Computations

Club 1 Club 2 Club 3254 234 200263 218 222241 235 197237 227 206251 216 204

X1 = 249.2

X2 = 226.0

X3 = 205.8

X = 227.0

n1 = 5

n2 = 5

n3 = 5

n = 15

c = 3SSA = 5 (249.2 – 227)2 + 5 (226 – 227)2 + 5 (205.8 – 227)2 = 4716.4

SSW = (254 – 249.2)2 + (263 – 249.2)2 +…+ (204 – 205.8)2 = 1119.6

MSA = 4716.4 / (3-1) = 2358.2

MSW = 1119.6 / (15-3) = 93.325.275

93.3

2358.2F

Page 28: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-28

F = 25.275

One-Factor ANOVA Example Solution

H0: μ1 = μ2 = μ3

H1: μi not all equal

= .05

df1= 2 df2 = 12

Test Statistic:

Decision:

Conclusion:

Reject H0 at = 0.05

There is evidence that at least one μi differs from the rest

0

= .05

FU = 3.89Reject H0Do not

reject H0

25.27593.3

2358.2

MSW

MSAF

Critical Value:

FU = 3.89

Page 29: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-29

SUMMARY

Groups Count Sum Average Variance

Club 1 5 1246 249.2 108.2

Club 2 5 1130 226 77.5

Club 3 5 1029 205.8 94.2

ANOVA

Source of Variation

SS df MS F P-value F crit

Between Groups

4716.4 2 2358.2 25.275 4.99E-05 3.89

Within Groups

1119.6 12 93.3

Total 5836.0 14        

ANOVA -- Single Factor:Excel Output

EXCEL: tools | data analysis | ANOVA: single factor

Page 30: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-30

The Tukey-Kramer Procedure

Tells which population means are significantly different e.g.: μ1 = μ2 μ3

Done after rejection of equal means in ANOVA Allows pair-wise comparisons

Compare absolute mean differences with critical range

xμ1 = μ 2

μ3

Page 31: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-31

Tukey-Kramer Critical Range

where:QU = Value from Studentized Range Distribution

with c and n - c degrees of freedom for the desired level of (see appendix E.9 table)

MSW = Mean Square Within ni and nj = Sample sizes from groups j and j’

j'jU n

1

n

1

2

MSWQRange Critical

Page 32: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-32

The Tukey-Kramer Procedure: Example

1. Compute absolute mean differences:Club 1 Club 2 Club 3

254 234 200263 218 222241 235 197237 227 206251 216 204 20.2205.8226.0xx

43.4205.8249.2xx

23.2226.0249.2xx

32

31

21

2. Find the QU value from the table in appendix E.9 with c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom for the desired level of ( = .05 used here):

3.77QU

Page 33: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-33

The Tukey-Kramer Procedure: Example

5. All of the absolute mean differences are greater than critical range. Therefore there is a significant difference between each pair of means at 5% level of significance.

16.2855

1

5

1

2

93.33.77

n

1

n

1

2

MSWQRange Critical

j'jU

3. Compute Critical Range:

20.2xx

43.4xx

23.2xx

32

31

21

4. Compare:

(continued)

Page 34: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-34

Tukey-Kramer in PHStat

Page 35: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-35

Two-Way ANOVA

Examines the effect of Two factors of interest on the dependent

variable e.g., Percent carbonation and line speed on soft drink

bottling process Interaction between the different levels of these

two factors e.g., Does the effect of one particular carbonation

level depend on which level the line speed is set?

Page 36: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-36

Two-Way ANOVA

Assumptions

Populations are normally distributed

Populations have equal variances

Independent random samples are drawn

(continued)

Page 37: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-37

Two-Way ANOVA Sources of Variation

Two Factors of interest: A and B

r = number of levels of factor A

c = number of levels of factor B

n’ = number of replications for each cell

n = total number of observations in all cells(n = rcn’)

Xijk = value of the kth observation of level i of factor A and level j of factor B

Page 38: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-38

Two-Way ANOVA Sources of Variation

SSTTotal Variation

SSAFactor A Variation

SSBFactor B Variation

SSABVariation due to interaction

between A and B

SSERandom variation (Error)

Degrees of Freedom:

r – 1

c – 1

(r – 1)(c – 1)

rc(n’ – 1)

n - 1

SST = SSA + SSB + SSAB + SSE

(continued)

Page 39: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-39

Two Factor ANOVA Equations

r

1i

c

1j

n

1k

2ijk )XX(SST

2r

1i

..i )XX(ncSSA

2c

1j

.j. )XX(nrSSB

Total Variation:

Factor A Variation:

Factor B Variation:

Page 40: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-40

Two Factor ANOVA Equations

2r

1i

c

1j

.j...i.ij )XXXX(nSSAB

r

1i

c

1j

n

1k

2.ijijk )XX(SSE

Interaction Variation:

Sum of Squares Error:

(continued)

Page 41: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-41

Two Factor ANOVA Equations

where:Mean Grand

nrc

X

X

r

1i

c

1j

n

1kijk

r) ..., 2, 1, (i A factor of level i of Meannc

X

X th

c

1j

n

1kijk

..i

c) ..., 2, 1, (j B factor of level j of Meannr

XX th

r

1i

n

1kijk

.j.

ij cell of Meann

XX

n

1k

ijk.ij

r = number of levels of factor A

c = number of levels of factor B

n’ = number of replications in each cell

(continued)

Page 42: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-42

Mean Square Calculations

1r

SSA Afactor square MeanMSA

1c

SSBB factor square MeanMSB

)1c)(1r(

SSABninteractio square MeanMSAB

)1'n(rc

SSEerror square MeanMSE

Page 43: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-43

Two-Way ANOVA:The F Test Statistic

F Test for Factor B Effect

F Test for Interaction Effect

H0: μ1.. = μ2.. = μ3.. = • • •

H1: Not all μi.. are equal

H0: the interaction of A and B is equal to zero

H1: interaction of A and B is not zero

F Test for Factor A Effect

H0: μ.1. = μ.2. = μ.3. = • • •

H1: Not all μ.j. are equal

Reject H0

if F > FUMSE

MSAF

MSE

MSBF

MSE

MSABF

Reject H0

if F > FU

Reject H0

if F > FU

Page 44: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-44

Two-Way ANOVASummary Table

Source ofVariation

Sum ofSquares

Degrees of Freedom

Mean Squares

FStatistic

Factor A SSA r – 1MSA

= SSA /(r – 1)MSAMSE

Factor B SSB c – 1MSB

= SSB /(c – 1)MSBMSE

AB(Interaction)

SSAB (r – 1)(c – 1)MSAB

= SSAB / (r – 1)(c – 1)MSABMSE

Error SSE rc(n’ – 1)MSE =

SSE/rc(n’ – 1)

Total SST n – 1

Page 45: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-45

Features of Two-Way ANOVA F Test

Degrees of freedom always add up n-1 = rc(n’-1) + (r-1) + (c-1) + (r-1)(c-1)

Total = error + factor A + factor B + interaction

The denominator of the F Test is always the same but the numerator is different

The sums of squares always add up SST = SSE + SSA + SSB + SSAB

Total = error + factor A + factor B + interaction

Page 46: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-46

Examples:Interaction vs. No Interaction

No interaction:

Factor B Level 1

Factor B Level 3

Factor B Level 2

Factor A Levels

Factor B Level 1

Factor B Level 3

Factor B Level 2

Factor A Levels

Mea

n R

espo

nse

Mea

n R

espo

nse

Interaction is present:

Page 47: Chap10 Anova

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-47

Chapter Summary

Described one-way analysis of variance The logic of ANOVA

ANOVA assumptions

F test for difference in c means

The Tukey-Kramer procedure for multiple comparisons

Described two-way analysis of variance Examined effects of multiple factors

Examined interaction between factors