copyright © 2013, 2009, and 2007, pearson education, inc. chapter 14 comparing groups: analysis of...


Upload: regina-hensley

Post on 27-Dec-2015


TRANSCRIPT

Page 1: Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 14 Comparing Groups: Analysis of Variance Methods Section 14.2 Estimating Differences

Chapter 14 Comparing Groups: Analysis of Variance Methods

Section 14.2 Estimating Differences in Groups for a Single Factor


Confidence Intervals Comparing Pairs of Means

Follow Up to an ANOVA F-Test:

When an analysis of variance F-test has a small P-value, the test does not specify which means are different or how different they are.

We can estimate differences between population means with confidence intervals.


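SUMMARY: Confidence Interval Comparing Means

For two groups i and j, with sample means $\bar{y}_i$ and $\bar{y}_j$ having sample sizes $n_i$ and $n_j$, the 95% confidence interval for $\mu_i - \mu_j$ is:

$(\bar{y}_i - \bar{y}_j) \pm t_{.025}\, s \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$

Here $s$ is the square root of the within-groups variance estimate from the ANOVA. The t-score has $df = N - g$, the total sample size minus the number of groups.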
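As a minimal sketch of the arithmetic behind the interval $(\bar{y}_i - \bar{y}_j) \pm t_{.025}\, s \sqrt{1/n_i + 1/n_j}$, the function below takes summary statistics and a t-score supplied by the caller. The function name `pairwise_ci` and all numbers in the usage lines are hypothetical, chosen only for illustration; they are not the GSS values.

```python
import math


def pairwise_ci(mean_i, mean_j, n_i, n_j, s, t_crit):
    """Confidence interval for mu_i - mu_j using the pooled standard deviation s.

    t_crit is the t-score for df = N - g, looked up from software or a table.
    """
    diff = mean_i - mean_j
    margin = t_crit * s * math.sqrt(1.0 / n_i + 1.0 / n_j)
    return diff - margin, diff + margin


# Hypothetical summary statistics (assumed values, not from the slides).
lower, upper = pairwise_ci(mean_i=10.0, mean_j=8.0, n_i=50, n_j=50, s=5.0, t_crit=1.963)
print(f"({lower:.3f}, {upper:.3f})")
```

With these assumed inputs the margin of error is 1.963, so the interval lies entirely above 0.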


When this confidence interval is formed to compare a pair of means in follow-up analyses after the ANOVA F test, some software (such as MINITAB) refers to the method as the Fisher method.

When the confidence interval does not contain 0, we can infer that the population means are different. The interval shows just how different the means may be.


Example: Number of Good Friends and Happiness

A recent General Social Survey (GSS) study asked: “About how many good friends do you have?”

The study also asked each respondent to indicate whether they were ‘very happy,’ ‘pretty happy,’ or ‘not too happy.’


Let the response variable y = number of good friends

Let the categorical explanatory variable x = happiness level


Table 14.3 Summary of ANOVA for Comparing Mean Number of Good Friends for Three Happiness Categories. The analysis is based on GSS data.


Construct a 95% CI to compare the population mean number of good friends for the three pairs of happiness categories—very happy with pretty happy, very happy with not too happy, and pretty happy with not too happy.

95% CI formula: $(\bar{y}_i - \bar{y}_j) \pm t_{.025}\, s \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$


First, use the output to find $s$, the square root of the mean square error (MSE) in the ANOVA table.

With $df = 828$, software or a table gives the t-score $t_{.025} = 1.963$.
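When only the group sizes and group standard deviations are at hand, the same $s$ can be obtained by pooling, since MSE equals $\sum (n_i - 1) s_i^2 / (N - g)$. The sketch below assumes that setting; the name `pooled_sd` and the input values are hypothetical and are not the GSS figures.

```python
import math


def pooled_sd(sizes, sds):
    """Return the pooled within-groups standard deviation s and df = N - g."""
    total_n = sum(sizes)   # N, the total sample size
    g = len(sizes)         # number of groups
    df = total_n - g
    # Within-groups sum of squares, rebuilt from each group's variance.
    within_ss = sum((n - 1) * sd ** 2 for n, sd in zip(sizes, sds))
    return math.sqrt(within_ss / df), df


# Hypothetical group sizes and standard deviations (assumed values).
s, df = pooled_sd(sizes=[10, 10, 10], sds=[3.0, 4.0, 5.0])
print(f"s = {s:.3f}, df = {df}")
```

The returned `df` is the value used to look up the t-score for the interval.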


For comparing the very happy and pretty happy categories, the confidence interval estimates $\mu_1 - \mu_2$.

Since the CI contains only positive numbers, this suggests that, on average, people who are very happy have more good friends than people who are pretty happy.


The Effects of Violating Assumptions

The t confidence intervals have the same assumptions as the ANOVA F test:

1. normal population distributions,

2. with identical standard deviations, and

3. data obtained from randomization.

When the sample sizes are large and the ratio of the largest standard deviation to the smallest is less than 2, these procedures are robust to violations of these assumptions.

If the ratio of the largest standard deviation to the smallest exceeds 2, use the confidence interval formulas that use separate standard deviations for the groups.
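The rule of thumb about the ratio of standard deviations can be checked directly. The sketch below computes that ratio and, for the case where it exceeds 2, the standard error built from separate group standard deviations, $\sqrt{s_i^2/n_i + s_j^2/n_j}$. The function names and inputs are hypothetical, and the degrees of freedom for the separate-variance interval are left to software.

```python
import math


def sd_ratio(sds):
    """Ratio of the largest sample standard deviation to the smallest."""
    return max(sds) / min(sds)


def separate_sd_se(sd_i, n_i, sd_j, n_j):
    """Standard error of ybar_i - ybar_j without pooling the standard deviations."""
    return math.sqrt(sd_i ** 2 / n_i + sd_j ** 2 / n_j)


# Hypothetical standard deviations (assumed values, not from the slides).
sds = [3.0, 4.0, 5.0]
ratio = sd_ratio(sds)
if ratio < 2:
    print(f"ratio = {ratio:.2f}: the pooled-s interval is a reasonable choice")
else:
    print(f"ratio = {ratio:.2f}: prefer intervals with separate standard deviations")

print(separate_sd_se(sd_i=3.0, n_i=36, sd_j=4.0, n_j=64))
```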


Controlling Overall Confidence with Many Confidence Intervals

The confidence interval method just discussed is mainly used when g is small or when only a few comparisons are of main interest.

The confidence level of 0.95 applies to any particular confidence interval that we construct.
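To see why a 0.95 level for each interval does not carry over to the whole set, it helps to count the pairwise comparisons, $g(g-1)/2$, and apply the Bonferroni inequality, which guarantees only that the overall confidence is at least $1 - k(1 - 0.95)$ for $k$ intervals. This is a rough lower bound offered as an illustration; it is not the method these slides go on to use, and the function names are hypothetical.

```python
from math import comb


def num_pairs(g):
    """Number of pairwise comparisons among g groups."""
    return comb(g, 2)


def overall_confidence_lower_bound(g, level=0.95):
    """Bonferroni lower bound on the chance that all pairwise intervals cover."""
    k = num_pairs(g)
    return max(0.0, 1.0 - k * (1.0 - level))


for g in (3, 5, 10):
    bound = overall_confidence_lower_bound(g)
    print(f"g = {g}: {num_pairs(g)} intervals, overall confidence >= {bound:.2f}")
```

The bound weakens quickly as g grows, which is why separate 95% intervals are mainly used when g is small.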


How can we construct the intervals so that the 95% confidence extends to the entire set of intervals rather than to each single interval?

Methods that control the probability that all confidence intervals will contain the true differences in means are called multiple comparison methods.

For these methods, all intervals are designed to contain the true parameters simultaneously with an overall fixed probability.


The method that we will use is called the Tukey method.

It is designed to give overall confidence level very close to the desired value (such as 0.95).

This method is available in most software packages.


Example: Number of Good Friends

Table 14.4 Multiple Comparisons of Mean Good Friends for Three Happiness Categories. An asterisk * indicates a significant difference, with the confidence interval not containing 0.


ANOVA and Regression

ANOVA can be presented as a special case of multiple regression by using indicator variables to represent the factors. For example, with 3 groups we need 2 indicator variables to indicate group membership:

The first indicator variable is $x_1 = 1$ for observations from the first group, and $x_1 = 0$ otherwise.


The second indicator variable is $x_2 = 1$ for observations from the second group, and $x_2 = 0$ otherwise.

The indicator variables identify the group to which an observation belongs as follows:

Group 1: $x_1 = 1$ and $x_2 = 0$

Group 2: $x_1 = 0$ and $x_2 = 1$

Group 3: $x_1 = 0$ and $x_2 = 0$
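The coding of three groups into two indicator variables can be sketched as a small function, with group 3 as the baseline whose indicators are both 0. The function name `indicators` and the use of integer labels 1, 2, 3 are assumptions made for this illustration.

```python
def indicators(group):
    """Return (x1, x2) for a group label of 1, 2, or 3; group 3 is the baseline."""
    if group not in (1, 2, 3):
        raise ValueError("group must be 1, 2, or 3")
    x1 = 1 if group == 1 else 0
    x2 = 1 if group == 2 else 0
    return x1, x2


for g in (1, 2, 3):
    print(f"Group {g}: (x1, x2) = {indicators(g)}")
```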


The multiple regression equation for the mean of y is $\mu_y = \alpha + \beta_1 x_1 + \beta_2 x_2$.

Table 14.5 Interpretation of Coefficients of Indicator Variables in Regression Model. The indicator variables represent a categorical predictor with three categories specifying three groups.


Using Regression for the ANOVA Comparison of Means

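For three groups, the null hypothesis for the ANOVA F test is $H_0: \mu_1 = \mu_2 = \mu_3$.

If $H_0$ is true, then $\mu_1 - \mu_3 = 0$ and $\mu_2 - \mu_3 = 0$.

In the multiple regression model $\mu_y = \alpha + \beta_1 x_1 + \beta_2 x_2$, with $\beta_1 = \mu_1 - \mu_3$ and $\beta_2 = \mu_2 - \mu_3$.

Thus, the ANOVA hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$ is equivalent to $H_0: \beta_1 = \beta_2 = 0$ in the regression model.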
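With indicator coding, the least squares fit reproduces the group sample means, so the estimated coefficients can be read off as the baseline mean and the differences from it. The sketch below assumes three groups labeled 1, 2, 3 with group 3 as the baseline; the function names and the data are hypothetical.

```python
from statistics import mean


def indicator_coefficients(groups):
    """Estimates of (alpha, beta1, beta2) from observations keyed by group 1, 2, 3."""
    m1, m2, m3 = (mean(groups[k]) for k in (1, 2, 3))
    # alpha is the baseline (group 3) mean; each beta is a difference from it.
    return m3, m1 - m3, m2 - m3


def fitted_mean(alpha, beta1, beta2, x1, x2):
    """Estimated mean of y at the given indicator values."""
    return alpha + beta1 * x1 + beta2 * x2


# Hypothetical observations (assumed values, not GSS data).
data = {1: [4.0, 6.0, 8.0], 2: [1.0, 2.0, 3.0], 3: [5.0, 5.0, 5.0]}
alpha, beta1, beta2 = indicator_coefficients(data)
print(alpha, beta1, beta2)
print(fitted_mean(alpha, beta1, beta2, x1=1, x2=0))  # recovers the group 1 sample mean
```

If all three sample means were equal, both estimated betas would be 0, which mirrors the equivalence of the two null hypotheses.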