Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 10-1 Chapter 10 Two-Sample Tests and One-Way ANOVA Business Statistics, A First Course 4 th Edition

Upload: nora-hensley

Post on 02-Jan-2016


Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 10-1

Chapter 10

Two-Sample Tests and One-Way ANOVA

Business Statistics, A First Course, 4th Edition

Learning Objectives

In this chapter, you learn hypothesis testing procedures to test:

The means of two independent populations

The means of two related populations

The proportions of two independent populations

The variances of two independent populations

The means of more than two populations

Chapter Overview

Two-Sample Tests: Population Means, Independent Samples; Means, Related Samples; Population Proportions; Population Variances

One-Way Analysis of Variance (ANOVA): F-test; Tukey-Kramer test

Two-Sample Tests

Examples:

Population Means, Independent Samples: Mean 1 vs. independent Mean 2

Means, Related Samples: Same population before vs. after treatment

Population Proportions: Proportion 1 vs. Proportion 2

Population Variances: Variance 1 vs. Variance 2

Difference Between Two Means

Population means, independent samples

Goal: Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2

The point estimate for the difference is $\bar{X}_1 - \bar{X}_2$

Three cases: σ1 and σ2 known; σ1 and σ2 unknown, assumed equal; σ1 and σ2 unknown, not assumed equal

Independent Samples

Population means, independent samples

Different data sources: unrelated and independent

The sample selected from one population has no effect on the sample selected from the other population

Use the difference between the two sample means

Use a Z test, a pooled-variance t test, or a separate-variance t test

Difference Between Two Means

Population means, independent samples

σ1 and σ2 known: use a Z test statistic

σ1 and σ2 unknown, assumed equal: use Sp to estimate the unknown σ; use a t test statistic with the pooled standard deviation

σ1 and σ2 unknown, not assumed equal: use S1 and S2 to estimate the unknown σ1 and σ2; use a separate-variance t test

σ1 and σ2 Known

Assumptions:

Samples are randomly and independently drawn

Population distributions are normal or both sample sizes are at least 30

Population standard deviations are known

σ1 and σ2 Known (continued)

When σ1 and σ2 are known and both populations are normal or both sample sizes are at least 30, the test statistic is a Z-value, and the standard error of $\bar{X}_1 - \bar{X}_2$ is

$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

σ1 and σ2 Known (continued)

The test statistic for μ1 – μ2 is:

$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

Hypothesis Tests for Two Population Means

Two Population Means, Independent Samples

Lower-tail test: H0: μ1 ≥ μ2, H1: μ1 < μ2, i.e., H0: μ1 – μ2 ≥ 0, H1: μ1 – μ2 < 0

Upper-tail test: H0: μ1 ≤ μ2, H1: μ1 > μ2, i.e., H0: μ1 – μ2 ≤ 0, H1: μ1 – μ2 > 0

Two-tail test: H0: μ1 = μ2, H1: μ1 ≠ μ2, i.e., H0: μ1 – μ2 = 0, H1: μ1 – μ2 ≠ 0

Hypothesis tests for μ1 – μ2

Two Population Means, Independent Samples

Lower-tail test: H0: μ1 – μ2 ≥ 0, H1: μ1 – μ2 < 0. Reject H0 if Z < -Zα

Upper-tail test: H0: μ1 – μ2 ≤ 0, H1: μ1 – μ2 > 0. Reject H0 if Z > Zα

Two-tail test: H0: μ1 – μ2 = 0, H1: μ1 – μ2 ≠ 0. Reject H0 if Z < -Zα/2 or Z > Zα/2

Confidence Interval, σ1 and σ2 Known

The confidence interval for μ1 – μ2 is:

$(\bar{X}_1 - \bar{X}_2) \pm Z \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
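The Z statistic and the Z-based confidence interval for μ1 – μ2 can be sketched in a few lines of Python. This is only an illustration: the function names and the input numbers are hypothetical (the slides give no worked example for the known-σ case), and the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """Z statistic for mu1 - mu2 when both population sigmas are known."""
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((mean1 - mean2) - hypothesized_diff) / std_error
    return z, std_error


def z_confidence_interval(mean1, mean2, std_error, confidence=0.95):
    """Confidence interval for mu1 - mu2 based on the standard normal."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean1 - mean2
    return diff - z_crit * std_error, diff + z_crit * std_error


# Hypothetical inputs (not from the slides), chosen for easy arithmetic.
z, se = two_sample_z(mean1=52.0, mean2=50.0, sigma1=4.0, sigma2=3.0, n1=64, n2=36)
low, high = z_confidence_interval(52.0, 50.0, se)
print(f"Z = {z:.3f}, SE = {se:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```

With these made-up inputs the interval excludes zero, which agrees with rejecting H0: μ1 – μ2 = 0 in a two-tail test at α = 0.05.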

σ1 and σ2 Unknown, Assumed Equal

Assumptions:

Samples are randomly and independently drawn

Populations are normally distributed or both sample sizes are at least 30

Population variances are unknown but assumed equal

σ1 and σ2 Unknown, Assumed Equal (continued)

Forming interval estimates:

The population variances are assumed equal, so use the two sample variances and pool them to estimate the common σ²

The test statistic is a t value with (n1 + n2 – 2) degrees of freedom

σ1 and σ2 Unknown, Assumed Equal (continued)

The pooled variance is

$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$

σ1 and σ2 Unknown, Assumed Equal (continued)

The test statistic for μ1 – μ2 is:

$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

Where t has (n1 + n2 – 2) d.f., and

$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$

Confidence Interval, σ1 and σ2 Unknown

The confidence interval for μ1 – μ2 is:

$(\bar{X}_1 - \bar{X}_2) \pm t_{n_1 + n_2 - 2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

Where

$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$

Pooled-Variance t Test: Example

You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming both populations are approximately normal with equal variances, is there a difference in average yield (α = 0.05)?

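Calculating the Test Statistic

$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$

The test statistic is:

$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$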

Solution

H0: μ1 - μ2 = 0, i.e., μ1 = μ2

H1: μ1 - μ2 ≠ 0, i.e., μ1 ≠ μ2

α = 0.05

df = 21 + 25 - 2 = 44

Critical Values: t = ± 2.0154 (.025 in each tail)

Test Statistic:

$t = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$

Decision: Reject H0 at α = 0.05, since t = 2.040 > 2.0154

Conclusion: There is evidence of a difference in means.
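The NYSE vs. NASDAQ calculation can be checked with a short Python sketch. The function name `pooled_t` is illustrative, and the critical value 2.0154 is copied from the slide rather than computed, since the standard library has no t distribution.

```python
from math import sqrt


def pooled_t(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic, its degrees of freedom, and Sp^2."""
    df = (n1 - 1) + (n2 - 1)
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - hypothesized_diff) / std_error
    return t, df, pooled_var


# Summary statistics from the example: NYSE is sample 1, NASDAQ is sample 2.
t_stat, df, sp2 = pooled_t(3.27, 1.30, 21, 2.53, 1.16, 25)

T_CRITICAL = 2.0154  # two-tail critical value for 44 d.f. at alpha = 0.05 (from the slide)
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"Sp^2 = {sp2:.4f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

The test statistic only just exceeds the critical value, so the conclusion is sensitive to rounding in the inputs.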

σ1 and σ2 Unknown, Not Assumed Equal

Assumptions:

Samples are randomly and independently drawn

Populations are normally distributed or both sample sizes are at least 30

Population variances are unknown and cannot be assumed to be equal

σ1 and σ2 Unknown, Not Assumed Equal (continued)

Forming the test statistic:

The population variances are not assumed equal, so include the two sample variances in the computation of the t-test statistic

The test statistic is a t value (statistical software is generally used to do the necessary computations)

σ1 and σ2 Unknown, Not Assumed Equal (continued)

The test statistic for μ1 – μ2 is:

$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
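A hedged Python sketch of the separate-variance statistic follows. The slides leave the degrees of freedom to software, so the Welch-Satterthwaite approximation used here is an assumption, not something the slides state. The inputs reuse the NYSE/NASDAQ summary statistics from the pooled-variance example purely for illustration.

```python
from math import sqrt


def separate_variance_t(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Separate-variance t statistic with approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = ((mean1 - mean2) - hypothesized_diff) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumed; the slides give no d.f. formula).
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


t_welch, df_welch = separate_variance_t(3.27, 1.30, 21, 2.53, 1.16, 25)
print(f"t = {t_welch:.3f}, approximate d.f. = {df_welch:.1f}")
```

Because the approximate d.f. is usually not an integer, software evaluates the t distribution at the fractional value, while table lookups typically round down.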

Related Populations

Tests Means of 2 Related Populations

Paired or matched samples

Repeated measures (before/after)

Use difference between paired values: Di = X1i - X2i

Eliminates Variation Among Subjects

Assumptions: Both Populations Are Normally Distributed; or, if not Normal, use large samples

Mean Difference, σD Known

The ith paired difference is Di, where Di = X1i - X2i

The point estimate for the population mean paired difference is $\bar{D}$:

$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$

Suppose the population standard deviation of the difference scores, σD, is known

n is the number of pairs in the paired sample

Mean Difference, σD Known (continued)

The test statistic for the mean difference is a Z value:

$Z = \frac{\bar{D} - \mu_D}{\sigma_D / \sqrt{n}}$

Where

μD = hypothesized mean difference

σD = population standard dev. of differences

n = the sample size (number of pairs)

Confidence Interval, σD Known

The confidence interval for μD is

$\bar{D} \pm Z \frac{\sigma_D}{\sqrt{n}}$

Where n = the sample size (number of pairs in the paired sample)

Mean Difference, σD Unknown

If σD is unknown, we can estimate the unknown population standard deviation with a sample standard deviation:

The sample standard deviation is

$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$

Mean Difference, σD Unknown (continued)

Use a paired t test; the test statistic for $\bar{D}$ is now a t statistic, with n - 1 d.f.:

$t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$

Where t has n - 1 d.f. and SD is:

$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$

Confidence Interval, σD Unknown

The confidence interval for μD is

$\bar{D} \pm t_{n-1} \frac{S_D}{\sqrt{n}}$

where

$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$

Hypothesis Testing for Mean Difference, σD Unknown

Paired Samples

Lower-tail test: H0: μD ≥ 0, H1: μD < 0. Reject H0 if t < -tα

Upper-tail test: H0: μD ≤ 0, H1: μD > 0. Reject H0 if t > tα

Two-tail test: H0: μD = 0, H1: μD ≠ 0. Reject H0 if t < -tα/2 or t > tα/2

Where t has n - 1 d.f.

Paired t Test Example

Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Number of Complaints:

Salesperson    Before (1)    After (2)    Difference Di = (2) - (1)
C.B.           6             4            -2
T.F.           20            6            -14
M.H.           3             2            -1
R.K.           0             0            0
M.O.           4             0            -4
Total                                     -21

$\bar{D} = \frac{\sum D_i}{n} = -4.2$

$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$

Paired t Test: Solution

Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0, H1: μD ≠ 0

α = .01, $\bar{D} = -4.2$

Critical Value = ± 4.604, d.f. = n - 1 = 4 (α/2 in each tail)

Test Statistic:

$t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$

Decision: Do not reject H0 (t stat is not in the reject region)

Conclusion: There is not a significant change in the number of complaints.
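The paired t computation for the complaints data can be reproduced with the standard library. This is a sketch: the variable names are illustrative, and the critical value 4.604 is taken from the slide rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Complaints per salesperson (C.B., T.F., M.H., R.K., M.O.) from the example.
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# The example defines each difference as After (2) minus Before (1).
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)  # sample standard deviation, divisor n - 1

t_stat = (d_bar - 0) / (s_d / sqrt(n))

T_CRITICAL = 4.604  # two-tail critical value for 4 d.f. at alpha = 0.01 (from the slide)
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"D-bar = {d_bar}, S_D = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

With only five pairs the critical value is large, which is why a mean drop of 4.2 complaints is not significant at the 0.01 level.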

Two Population Proportions

Goal: test a hypothesis or form a confidence interval for the difference between two population proportions, π1 – π2

Assumptions: n1 π1 ≥ 5, n1(1 - π1) ≥ 5, n2 π2 ≥ 5, n2(1 - π2) ≥ 5

The point estimate for the difference is $p_1 - p_2$

Two Population Proportions

Since we begin by assuming the null hypothesis is true, we assume π1 = π2 and pool the two sample estimates

The pooled estimate for the overall proportion is:

$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$

where X1 and X2 are the numbers from samples 1 and 2 with the characteristic of interest

Two Population Proportions (continued)

The test statistic for π1 – π2 is a Z statistic:

$Z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

where

$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$, $p_1 = \frac{X_1}{n_1}$, $p_2 = \frac{X_2}{n_2}$

Confidence Interval for Two Population Proportions

The confidence interval for π1 – π2 is:

$(p_1 - p_2) \pm Z \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$

Hypothesis Tests for Two Population Proportions

Lower-tail test: H0: π1 ≥ π2, H1: π1 < π2, i.e., H0: π1 – π2 ≥ 0, H1: π1 – π2 < 0

Upper-tail test: H0: π1 ≤ π2, H1: π1 > π2, i.e., H0: π1 – π2 ≤ 0, H1: π1 – π2 > 0

Two-tail test: H0: π1 = π2, H1: π1 ≠ π2, i.e., H0: π1 – π2 = 0, H1: π1 – π2 ≠ 0

Hypothesis Tests for Two Population Proportions (continued)

Lower-tail test: H0: π1 – π2 ≥ 0, H1: π1 – π2 < 0. Reject H0 if Z < -Zα

Upper-tail test: H0: π1 – π2 ≤ 0, H1: π1 – π2 > 0. Reject H0 if Z > Zα

Two-tail test: H0: π1 – π2 = 0, H1: π1 – π2 ≠ 0. Reject H0 if Z < -Zα/2 or Z > Zα/2

Example: Two population Proportions

Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?

In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes

Test at the .05 level of significance

Example: Two Population Proportions (continued)

The hypothesis test is:

H0: π1 – π2 = 0 (the two proportions are equal)

H1: π1 – π2 ≠ 0 (there is a significant difference between proportions)

The sample proportions are:

Men: p1 = 36/72 = .50

Women: p2 = 31/50 = .62

The pooled estimate for the overall proportion is:

$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 31}{72 + 50} = \frac{67}{122} = .549$

Example: Two Population Proportions (continued)

The test statistic for π1 – π2 is:

$z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(.50 - .62) - 0}{\sqrt{.549(1 - .549)\left(\frac{1}{72} + \frac{1}{50}\right)}} = -1.31$

Critical Values = ±1.96 for α = .05 (.025 in each tail)

Decision: Do not reject H0, since -1.96 < -1.31 < 1.96

Conclusion: There is not significant evidence of a difference in proportions who will vote yes between men and women.
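The Proposition A test can be verified with a few lines of Python. The function name `two_proportion_z` is illustrative, not from the slides. It pools the two samples under H0: π1 = π2 and returns the Z statistic together with the pooled proportion.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: pi1 - pi2 = 0, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error, p_pooled


# 36 of 72 men and 31 of 50 women said they would vote Yes.
z_stat, p_bar = two_proportion_z(36, 72, 31, 50)

Z_CRITICAL = 1.96  # two-tail critical value at alpha = 0.05
reject_h0 = abs(z_stat) > Z_CRITICAL
print(f"pooled p = {p_bar:.3f}, z = {z_stat:.2f}, reject H0: {reject_h0}")
```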

Hypothesis Tests for Variances

Tests for Two Population Variances: F test statistic

Two-tail test: H0: σ1² = σ2², H1: σ1² ≠ σ2²

Lower-tail test: H0: σ1² ≥ σ2², H1: σ1² < σ2²

Upper-tail test: H0: σ1² ≤ σ2², H1: σ1² > σ2²

Hypothesis Tests for Variances (continued)

The F test statistic is:

$F = \frac{S_1^2}{S_2^2}$

$S_1^2$ = Variance of Sample 1; n1 - 1 = numerator degrees of freedom

$S_2^2$ = Variance of Sample 2; n2 - 1 = denominator degrees of freedom

The F Distribution

The F critical value is found from the F table

There are two appropriate degrees of freedom: numerator and denominator

$F = \frac{S_1^2}{S_2^2}$ where df1 = n1 – 1; df2 = n2 – 1

In the F table, numerator degrees of freedom determine the column and denominator degrees of freedom determine the row

Finding the Rejection Region

Two-tail test: H0: σ1² = σ2², H1: σ1² ≠ σ2². The rejection region (α/2 in each tail) is $F = \frac{S_1^2}{S_2^2} > F_U$ or $F = \frac{S_1^2}{S_2^2} < F_L$

Lower-tail test: H0: σ1² ≥ σ2², H1: σ1² < σ2². Reject H0 if F < FL

Upper-tail test: H0: σ1² ≤ σ2², H1: σ1² > σ2². Reject H0 if F > FU

Finding the Rejection Region (continued)

H0: σ1² = σ2², H1: σ1² ≠ σ2² (α/2 in each tail)

To find the critical F values:

1. Find FU from the F table for n1 – 1 numerator and n2 – 1 denominator degrees of freedom

2. Find FL using the formula: $F_L = \frac{1}{F_{U*}}$

Where FU* is from the F table with n2 – 1 numerator and n1 – 1 denominator degrees of freedom (i.e., switch the d.f. from FU)

F Test: An Example

You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:

            NYSE    NASDAQ
Number      21      25
Mean        3.27    2.53
Std dev     1.30    1.16

Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.05 level?

F Test: Example Solution

Form the hypothesis test:

H0: σ1² – σ2² = 0 (there is no difference between variances)

H1: σ1² – σ2² ≠ 0 (there is a difference between variances)

Find the F critical values for α = 0.05:

FU: Numerator: n1 – 1 = 21 – 1 = 20 d.f.; Denominator: n2 – 1 = 25 – 1 = 24 d.f.; FU = F.025, 20, 24 = 2.33

FL: Numerator: n2 – 1 = 25 – 1 = 24 d.f.; Denominator: n1 – 1 = 21 – 1 = 20 d.f.; FL = 1/F.025, 24, 20 = 1/2.41 = 0.415

F Test: Example Solution (continued)

H0: σ1² = σ2², H1: σ1² ≠ σ2²

The test statistic is:

$F = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$

Rejection regions (α/2 = .025 in each tail): F < FL = 0.415 or F > FU = 2.33

F = 1.256 is not in the rejection region, so we do not reject H0

Conclusion: There is not sufficient evidence of a difference in variances at α = .05
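A small Python sketch of the two-tail F test for the NYSE/NASDAQ data. The critical values 2.33 and 2.41 are the F-table entries quoted in the example, not computed here, and the function name is illustrative.

```python
def f_statistic(s1, s2):
    """Ratio of sample variances, with sample 1 in the numerator."""
    return s1**2 / s2**2


# Sample standard deviations: NYSE (n1 = 21) and NASDAQ (n2 = 25).
f_stat = f_statistic(1.30, 1.16)

F_UPPER = 2.33      # F(.025; 20, 24) from the F table
F_LOWER = 1 / 2.41  # reciprocal of F(.025; 24, 20), i.e. d.f. switched

reject_h0 = f_stat < F_LOWER or f_stat > F_UPPER
print(f"F = {f_stat:.3f}, FL = {F_LOWER:.3f}, FU = {F_UPPER}, reject H0: {reject_h0}")
```

Failing to reject equal variances here is consistent with using the pooled-variance t test for these same data.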

Two-Sample Tests in EXCEL

For independent samples:

Independent sample Z test with variances known: Tools | data analysis | z-test: two sample for means

Pooled variance t test: Tools | data analysis | t-test: two sample assuming equal variances

Separate-variance t test: Tools | data analysis | t-test: two sample assuming unequal variances

For paired samples (t test): Tools | data analysis | t-test: paired two sample for means

For variances:

F test for two variances: Tools | data analysis | F-test: two sample for variances

One-Way Analysis of Variance

One-Way Analysis of Variance (ANOVA): F-test; Tukey-Kramer test

General ANOVA Setting

Investigator controls one or more independent variables

Called factors (or treatment variables)

Each factor contains two or more levels (or groups or categories/classifications)

Observe effects on the dependent variable

Response to levels of independent variable

Experimental design: the plan used to collect the data

One-Way Analysis of Variance

Evaluate the difference among the means of three or more groups

Examples: Accident rates for 1st, 2nd, and 3rd shift; expected mileage for five brands of tires

Assumptions:

Populations are normally distributed

Populations have equal variances

Samples are randomly and independently drawn

Hypotheses of One-Way ANOVA

$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_c$

All population means are equal, i.e., no treatment effect (no variation in means among groups)

H1: Not all of the population means are the same

At least one population mean is different, i.e., there is a treatment effect

Does not mean that all population means are different (some pairs may be the same)

One-Way ANOVA

$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_c$

H1: Not all μj are the same

All Means are the same: The Null Hypothesis is True (No Treatment Effect)

$\mu_1 = \mu_2 = \mu_3$

One-Way ANOVA (continued)

$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_c$

H1: Not all μj are the same

At least one mean is different: The Null Hypothesis is NOT true (Treatment Effect is present)

$\mu_1 = \mu_2 \neq \mu_3$ or $\mu_1 \neq \mu_2 \neq \mu_3$

Partitioning the Variation

Total variation can be split into two parts:

SST = Total Sum of Squares (Total variation)

SSA = Sum of Squares Among Groups (Among-group variation)

SSW = Sum of Squares Within Groups (Within-group variation)

SST = SSA + SSW

Partitioning the Variation

Total Variation = the aggregate dispersion of the individual data values across the various factor levels (SST)

Within-Group Variation = dispersion that exists among the data values within a particular factor level (SSW)

Among-Group Variation = dispersion between the factor sample means (SSA)

SST = SSA + SSW

(continued)

Partition of Total Variation

Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Sampling (SSW)

d.f.: SST has n – 1, SSA has c – 1, SSW has n – c

SSA is commonly referred to as: Sum of Squares Between, Sum of Squares Among, Sum of Squares Explained, Among Groups Variation

SSW is commonly referred to as: Sum of Squares Within, Sum of Squares Error, Sum of Squares Unexplained, Within-Group Variation

Total Sum of Squares

SST = SSA + SSW

$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$

Where:

SST = Total sum of squares

c = number of groups (levels or treatments)

nj = number of observations in group j

Xij = ith observation from group j

$\bar{\bar{X}}$ = grand mean (mean of all data values)

Total Variation (continued)

$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \dots + (X_{n_c c} - \bar{\bar{X}})^2$

[Figure: Response, X, plotted for Group 1, Group 2, and Group 3, with the grand mean $\bar{\bar{X}}$ marked]

Among-Group Variation

SST = SSA + SSW

$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$

Where:

SSA = Sum of squares among groups

c = number of groups

nj = sample size from group j

$\bar{X}_j$ = sample mean from group j

$\bar{\bar{X}}$ = grand mean (mean of all data values)

Among-Group Variation (continued)

$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$

Variation Due to Differences Among Groups

Mean Square Among = SSA/degrees of freedom:

$MSA = \frac{SSA}{c - 1}$

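Among-Group Variation (continued)

$SSA = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \dots + n_c(\bar{x}_c - \bar{\bar{x}})^2$

[Figure: Response, X, plotted for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]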

Within-Group Variation

SST = SSA + SSW

$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$

Where:

SSW = Sum of squares within groups

c = number of groups

nj = sample size from group j

$\bar{X}_j$ = sample mean from group j

Xij = ith observation in group j

Within-Group Variation (continued)

$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$

Summing the variation within each group and then adding over all groups

Mean Square Within = SSW/degrees of freedom:

$MSW = \frac{SSW}{n - c}$

Within-Group Variation (continued)

$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \dots + (X_{n_c c} - \bar{X}_c)^2$

[Figure: Response, X, plotted for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]

Obtaining the Mean Squares

$MSA = \frac{SSA}{c - 1}$

$MSW = \frac{SSW}{n - c}$

$MST = \frac{SST}{n - 1}$

One-Way ANOVA Table

Source of Variation    SS                 df       MS (Variance)          F ratio
Among Groups           SSA                c - 1    MSA = SSA / (c - 1)    F = MSA / MSW
Within Groups          SSW                n - c    MSW = SSW / (n - c)
Total                  SST = SSA + SSW    n - 1

c = number of groups

n = sum of the sample sizes from all groups

df = degrees of freedom

One-Way ANOVA F Test Statistic

H0: μ1 = μ2 = … = μc

H1: At least two population means are different

Test statistic:

$F = \frac{MSA}{MSW}$

MSA is mean squares among groups

MSW is mean squares within groups

Degrees of freedom:

df1 = c – 1 (c = number of groups)

df2 = n – c (n = sum of sample sizes from all populations)

Interpreting One-Way ANOVA F Statistic

The F statistic is the ratio of the among estimate of variance and the within estimate of variance

The ratio must always be positive

df1 = c - 1 will typically be small

df2 = n - c will typically be large

Decision Rule: Reject H0 if F > FU, otherwise do not reject H0 (upper-tail rejection region, e.g., α = .05)

One-Way ANOVA F Test Example

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in mean distance?

Club 1    Club 2    Club 3
254       234       200
263       218       222
241       235       197
237       227       206
251       216       204

One-Way ANOVA Example: Scatter Diagram

[Scatter diagram: Distance (190 to 270) plotted for Club 1, Club 2, and Club 3, with each group mean and the grand mean marked]

$\bar{x}_1 = 249.2$, $\bar{x}_2 = 226.0$, $\bar{x}_3 = 205.8$, $\bar{\bar{x}} = 227.0$

One-Way ANOVA Example Computations

$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$, $\bar{\bar{X}} = 227.0$

n1 = 5, n2 = 5, n3 = 5, n = 15, c = 3

SSA = 5 (249.2 – 227)² + 5 (226 – 227)² + 5 (205.8 – 227)² = 4716.4

SSW = (254 – 249.2)² + (263 – 249.2)² + … + (204 – 205.8)² = 1119.6

MSA = 4716.4 / (3 - 1) = 2358.2

MSW = 1119.6 / (15 - 3) = 93.3

$F = \frac{2358.2}{93.3} = 25.275$

One-Way ANOVA Example Solution

H0: μ1 = μ2 = μ3

H1: μj not all equal

α = 0.05

df1 = 2, df2 = 12

Critical Value: FU = 3.89

Test Statistic:

$F = \frac{MSA}{MSW} = \frac{2358.2}{93.3} = 25.275$

Decision: Reject H0 at α = 0.05, since F = 25.275 > FU = 3.89

Conclusion: There is evidence that at least one μj differs from the rest
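The golf club computations can be reproduced with a short Python sketch. The helper name `one_way_anova` and its dictionary keys are illustrative, not from the slides, and the critical value 3.89 is taken from the example rather than computed.

```python
from statistics import mean

# Distances for the three clubs in the example.
clubs = [
    [254, 263, 241, 237, 251],  # Club 1
    [234, 218, 235, 227, 216],  # Club 2
    [200, 222, 197, 206, 204],  # Club 3
]


def one_way_anova(samples):
    """Return SSA, SSW, SST, MSA, MSW and F for a list of groups."""
    all_values = [x for group in samples for x in group]
    grand_mean = mean(all_values)
    c, n = len(samples), len(all_values)
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    ssw = sum((x - mean(g)) ** 2 for g in samples for x in g)
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    msa, msw = ssa / (c - 1), ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": sst, "MSA": msa, "MSW": msw, "F": msa / msw}


result = one_way_anova(clubs)

F_CRITICAL = 3.89  # F(.05; 2, 12) as given in the example
reject_h0 = result["F"] > F_CRITICAL
print({k: round(v, 3) for k, v in result.items()}, "reject H0:", reject_h0)
```

Computing SST directly, rather than as SSA + SSW, gives an independent check that the partition of the total variation holds.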

One-Way ANOVA Excel Output

EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY

Groups    Count    Sum     Average    Variance
Club 1    5        1246    249.2      108.2
Club 2    5        1130    226        77.5
Club 3    5        1029    205.8      94.2

ANOVA

Source of Variation    SS        df    MS        F         P-value     F crit
Between Groups         4716.4    2     2358.2    25.275    4.99E-05    3.89
Within Groups          1119.6    12    93.3
Total                  5836.0    14

The Tukey-Kramer Procedure

Tells which population means are significantly different, e.g.: μ1 = μ2 ≠ μ3

Done after rejection of equal means in ANOVA

Allows pair-wise comparisons

Compare absolute mean differences with critical range

Tukey-Kramer Critical Range

$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$

where:

QU = Value from the Studentized Range Distribution with c and n - c degrees of freedom for the desired level of α (see appendix E.8 table)

MSW = Mean Square Within

nj and nj' = Sample sizes from groups j and j'

The Tukey-Kramer Procedure: Example

1. Compute absolute mean differences:

$|\bar{x}_1 - \bar{x}_2| = |249.2 - 226.0| = 23.2$

$|\bar{x}_1 - \bar{x}_3| = |249.2 - 205.8| = 43.4$

$|\bar{x}_2 - \bar{x}_3| = |226.0 - 205.8| = 20.2$

2. Find the QU value from the table in appendix E.8 with c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom for the desired level of α (α = 0.05 used here):

$Q_U = 3.77$

The Tukey-Kramer Procedure: Example (continued)

3. Compute Critical Range:

$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.77 \sqrt{\frac{93.3}{2}\left(\frac{1}{5} + \frac{1}{5}\right)} = 16.285$

4. Compare:

$|\bar{x}_1 - \bar{x}_2| = 23.2$

$|\bar{x}_1 - \bar{x}_3| = 43.4$

$|\bar{x}_2 - \bar{x}_3| = 20.2$

5. All of the absolute mean differences are greater than the critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance. Thus, with 95% confidence we can conclude that the mean distance for club 1 is greater than for clubs 2 and 3, and the mean distance for club 2 is greater than for club 3.
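The pairwise comparisons can be sketched in Python as follows. The names `critical_range` and `comparisons` are illustrative, and QU = 3.77 is the table value quoted in the example rather than something computed here.

```python
from itertools import combinations
from math import sqrt

# Group means and sizes from the golf club example.
means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
MSW = 93.3      # mean square within, from the ANOVA table
Q_UPPER = 3.77  # studentized range value for c = 3 and n - c = 12 d.f. at alpha = 0.05


def critical_range(q_upper, msw, n_j, n_k):
    """Tukey-Kramer critical range for one pair of groups."""
    return q_upper * sqrt((msw / 2) * (1 / n_j + 1 / n_k))


comparisons = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    cr = critical_range(Q_UPPER, MSW, sizes[a], sizes[b])
    comparisons[(a, b)] = (diff, cr, diff > cr)
    print(f"{a} vs {b}: |diff| = {diff:.1f}, critical range = {cr:.3f}, significant: {diff > cr}")
```

Because the critical range is computed per pair, the same loop also handles unequal group sizes, which is the case the Kramer modification addresses.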

Chapter Summary

Compared two independent samples

Performed Z test for the difference in two means

Performed pooled variance t test for the difference in two means

Performed separate-variance t test for difference in two means

Formed confidence intervals for the difference between two means

Compared two related samples (paired samples)

Performed paired sample Z and t tests for the mean difference

Formed confidence intervals for the mean difference

Chapter Summary (continued)

Compared two population proportions

Formed confidence intervals for the difference between two population proportions

Performed Z-test for two population proportions

Performed F tests for the difference between two population variances

Used the F table to find F critical values

Described one-way analysis of variance

The logic of ANOVA

ANOVA assumptions

F test for difference in c means

The Tukey-Kramer procedure for multiple comparisons