Copyright © 2011 Pearson Education, Inc. Analysis of Variance, Chapter 26
Post on 03-Jan-2016
26.1 Comparing Several Groups
Did agricultural yield go up this year because of more fertilizer or more rain? Or is it the result of temperature or type of seed used?
Use regression analysis with dummy variables to compare the averages of several groups.
This approach is also known as analysis of variance.
Which Wheat Variety Should a Farmer Plant?
Five varieties of wheat are being considered: Endurance, Hatcher, NuHills, RonL, and Ripper.
Each variety was grown in randomly chosen plots and yield was measured as bushels per acre.
Balanced experiment: experiment with an equal number of observations for each treatment.
Steps to Follow in the Analysis
Plot the data to find patterns.
Propose a regression model for the data.
Check conditions associated with the model.
Test hypotheses and draw a conclusion.
Comparing Groups in Plots – Boxplots of Yield
Comparing Groups in Plots – Summary Statistics
Relating the t-Test to Regression
Is there a significant difference between the average yield of Endurance and the others?
Since the variances among groups appear similar, use the two-sample t-test with pooled variances.
The t-statistic and p-value show that Endurance has a significantly higher mean yield per acre than the combination of other varieties.
The t-test can be formulated as a regression with a dummy variable D(Endurance) that is coded 1 if the plot is seeded with Endurance and 0 otherwise.
The slope b1 = 5.53 matches the estimate for the difference between means.
Testing the slope is equivalent to a pooled two-sample t-test of the difference between means (the t-statistic and p-value are the same).
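The equivalence between the slope test and the pooled t-test can be checked numerically. The sketch below uses made-up yields (not the wheat data) and two hypothetical helper functions, pooled_t and slope_t, written with the standard library only.

```python
import math
from statistics import mean

def pooled_t(group1, group0):
    """Pooled two-sample t-statistic for mean(group1) - mean(group0)."""
    n1, n0 = len(group1), len(group0)
    m1, m0 = mean(group1), mean(group0)
    ss1 = sum((y - m1) ** 2 for y in group1)
    ss0 = sum((y - m0) ** 2 for y in group0)
    s2 = (ss1 + ss0) / (n1 + n0 - 2)  # pooled variance estimate
    return (m1 - m0) / math.sqrt(s2 * (1 / n1 + 1 / n0))

def slope_t(x, y):
    """Least-squares slope of y on x and the t-statistic of that slope."""
    n = len(y)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    return b1, b1 / se_b1

# Made-up yields in bushels per acre (assumption, for illustration only)
endurance = [19.0, 21.5, 18.2, 20.1]
others = [12.3, 15.1, 13.8, 11.9, 14.4, 16.0]

y = endurance + others
d = [1] * len(endurance) + [0] * len(others)  # dummy variable D(Endurance)

b1, t_reg = slope_t(d, y)
t_pooled = pooled_t(endurance, others)
print(b1, t_reg, t_pooled)
```

The slope equals the difference between the two sample means, and the two t-statistics agree up to rounding error.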
Comparing Several Groups Using Regression
Define the following dummy variables:
D(Endurance) = 1 if plot grows Endurance, 0 otherwise.
D(Hatcher) = 1 if plot grows Hatcher, 0 otherwise.
D(NuHills) = 1 if plot grows NuHills, 0 otherwise.
D(Ripper) = 1 if plot grows Ripper, 0 otherwise.
J-1 dummy variables are needed to represent J categories.
The variety RonL is the baseline category (defined by all zeros for the dummy variables).
Analysis of variance (ANOVA): the comparison of two or more averages using a regression model in which all explanatory variables are dummy variables.
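The coding of J categories into J − 1 dummy variables can be sketched as follows. The function dummy_code is a hypothetical helper, not part of any library; it orders the dummy columns alphabetically and leaves the baseline as the all-zeros row.

```python
def dummy_code(labels, baseline):
    """Return (names, rows): one 0/1 column per non-baseline category."""
    levels = sorted(set(labels) - {baseline})
    rows = [[1 if lab == lev else 0 for lev in levels] for lab in labels]
    return levels, rows

# One plot per variety, just to show the coding
varieties = ["Endurance", "Hatcher", "NuHills", "RonL", "Ripper"]
names, rows = dummy_code(varieties, baseline="RonL")

for variety, row in zip(varieties, rows):
    print(variety, dict(zip(names, row)))
```

Five categories produce four dummy columns, and the RonL row contains only zeros.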
Interpreting the Estimates
The slope of each dummy variable compares the average response of its category to the average of the baseline category.
If D(Endurance) = 1, we find ŷ = b0 + b1 = 19.58 bushels per acre. Since b0 = 11.68 is the mean yield for RonL, the slope for D(Endurance), b1 = 7.90, is the difference between the two average yields.
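The arithmetic behind this interpretation is sketched below. It relies on the fact that a least-squares fit with a full set of dummy variables reproduces the group means; anova_coefficients is a hypothetical helper, and only the two means quoted in the text are used.

```python
def anova_coefficients(group_means, baseline):
    """Intercept = baseline mean; each slope = group mean minus baseline mean."""
    b0 = group_means[baseline]
    slopes = {g: m - b0 for g, m in group_means.items() if g != baseline}
    return b0, slopes

# Only the two averages given in the text (bushels per acre)
means = {"RonL": 11.68, "Endurance": 19.58}
b0, slopes = anova_coefficients(means, baseline="RonL")

print(b0)                             # 11.68
print(round(slopes["Endurance"], 2))  # 7.9
```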
ANOVA Regression Model
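The equation of the multiple regression model (MRM) for the wheat example can be written in terms of the population means, with µ5 the mean of the baseline variety RonL:
$$y = \mu_5 + (\mu_1 - \mu_5)\,D(\text{Endurance}) + (\mu_2 - \mu_5)\,D(\text{Hatcher}) + (\mu_3 - \mu_5)\,D(\text{NuHills}) + (\mu_4 - \mu_5)\,D(\text{Ripper}) + \varepsilon$$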
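To see why each slope is a difference of means, substitute the dummy values one variety at a time. The lines below are a sketch that assumes the numbering µ1 to µ4 for Endurance, Hatcher, NuHills, and Ripper, and µ5 for the baseline RonL.

```latex
% RonL plot: all four dummy variables are 0
E(y \mid \text{RonL}) = \mu_5

% Endurance plot: D(Endurance) = 1, the other dummy variables are 0
E(y \mid \text{Endurance}) = \mu_5 + (\mu_1 - \mu_5) = \mu_1

% In general, for variety j = 1, ..., 4
E(y \mid \text{variety } j) = \mu_5 + (\mu_j - \mu_5) = \mu_j
```

The intercept is therefore the baseline mean, and the slope of each dummy variable is that variety's mean minus the baseline mean.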
One-Way Analysis of Variance. This regression model compares the averages of the groups defined by J levels of a categorical variable. The observations in each group are a sample from the associated population.
Equation:
$$y_{ij} = \mu_j + \varepsilon_{ij}$$
Assumptions: Errors are independent, have equal variances, and are normally distributed.
26.2 Inference in ANOVA Regression Models
Checking Conditions
Linear association: automatic for ANOVA.
No obvious lurking variable: automatic if the data come from a randomized experiment (as in the wheat example).
Check the remaining conditions (independence, similar variances, and normality) with appropriate residual plots.
The similar variances condition is met if the IQRs of the groups are within a factor of 3 of one another (a rule of thumb for up to five groups).
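A rough version of this IQR check can be sketched with the standard library. The group names and values are made up, iqr and similar_variances are hypothetical helpers, and statistics.quantiles may compute quartiles slightly differently from a given statistics package.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range using the default quartile method of the statistics module."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

def similar_variances(groups, max_ratio=3.0):
    """Return (condition met?, largest IQR divided by smallest IQR).

    Assumes every group has a nonzero IQR.
    """
    iqrs = [iqr(vals) for vals in groups.values()]
    ratio = max(iqrs) / min(iqrs)
    return ratio <= max_ratio, ratio

# Made-up data for three groups (assumption, for illustration only)
groups = {
    "A": [10, 12, 14, 16, 18],
    "B": [11, 12, 13, 14, 15],
    "C": [9, 11, 14, 17, 19],
}
ok, ratio = similar_variances(groups)
print(ok, round(ratio, 2))
```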
Residuals appear nearly normal.
F-Test for the Difference among Means
Used to test the following null hypothesis:
H0: µ1 = µ2 = µ3 = µ4 = µ5
Typically summarized in an ANOVA table:
The p-value < 0.05; reject H0.
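The F-statistic in an ANOVA table is the ratio of the between-group mean square to the within-group mean square. The sketch below computes it for made-up data; one_way_f is a hypothetical helper, and the p-value is left out because the F distribution is not in the Python standard library.

```python
from statistics import mean

def one_way_f(groups):
    """F = (SS_between / (J - 1)) / (SS_within / (n - J)) for a list of groups."""
    all_y = [y for g in groups for y in g]
    n, J = len(all_y), len(groups)
    grand = mean(all_y)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    return (ss_between / (J - 1)) / (ss_within / (n - J))

# Made-up balanced data: three groups of three observations
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_f(groups)
print(f_stat)  # close to 13 for this toy data
```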
Understanding the F-Test
Consider the following hypothetical means:
Are these averages statistically significantly different? To answer this question, we need to know the variance within each group.
Both plots show groups with the same averages but different within-group variances. The differences among averages are not statistically significant in the right plot, where the within-group variation is larger.
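The effect of within-group variation can be illustrated numerically. This sketch, on made-up data, keeps the group averages fixed and stretches each observation's deviation from its group mean by a factor k; the F-statistic then shrinks by a factor of k squared. The helpers one_way_f and spread_out are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """Between-group mean square divided by within-group mean square."""
    all_y = [y for g in groups for y in g]
    n, J = len(all_y), len(groups)
    grand = mean(all_y)
    ms_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (J - 1)
    ms_within = sum((y - mean(g)) ** 2 for g in groups for y in g) / (n - J)
    return ms_between / ms_within

def spread_out(groups, k):
    """Multiply each deviation from the group mean by k; group means are unchanged."""
    return [[mean(g) + k * (y - mean(g)) for y in g] for g in groups]

tight = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up data
wide = spread_out(tight, k=3)              # same averages, three times the spread

f_tight, f_wide = one_way_f(tight), one_way_f(wide)
print(f_tight, f_wide)
```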
Confidence Intervals
Since the F-test shows that the mean yields among varieties of wheat are not the same, which variety is best?
26.3 Multiple Comparisons
Multiple comparisons: inferential procedure composed of numerous separate tests; also called multiplicity.
Compare pairwise differences for wheat varieties:
Tukey Confidence Intervals
These intervals hold the chance for a Type I error to 5% over the entire collection of pairwise comparisons.
They replace the t-percentile in the usual confidence interval with a larger multiple of the standard error (obtained from a special table).
Tukey Confidence Intervals - Wheat Example
The 95% Tukey confidence interval for the two best varieties of wheat (Endurance and Hatcher):
2.04 ± 2.875 × 2.11 = 2.04 ± 6.07 bushels/acre
This difference is not statistically significant since the Tukey interval includes 0.
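The interval arithmetic for the Endurance and Hatcher comparison can be reproduced directly from the three numbers in the text. The function tukey_interval is a hypothetical helper; the multiplier 2.875 is taken from the slide, not computed here.

```python
def tukey_interval(diff, multiplier, se):
    """Interval diff ± multiplier × se, returned as (low, high)."""
    margin = multiplier * se
    return diff - margin, diff + margin

# Values from the wheat example: difference, Tukey multiplier, standard error
low, high = tukey_interval(diff=2.04, multiplier=2.875, se=2.11)
significant = not (low <= 0 <= high)  # significant only if 0 lies outside the interval

print(round(low, 2), round(high, 2), significant)
```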
Note that the width of the 95% Tukey confidence interval is the same for any pairwise comparison because the experiment is balanced.
The difference in yield between any two varieties compared must be more than 6.07 bushels/acre in order to be statistically significant.
Bonferroni Confidence Intervals
These intervals adjust for multiple comparisons by changing the α level used in the standard interval to α/M for M intervals.
For the 10 pairwise comparisons among the five wheat varieties, Bonferroni confidence intervals reduce α = 0.05 to α/10 = 0.005 and replace t = 2.08 with t = 3.00.
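The count M = 10 and the adjusted level follow from the number of pairs among five groups. A minimal sketch, with bonferroni_alpha as a hypothetical helper; the t-percentile lookup is not included because it needs a t distribution table or a statistics library.

```python
from math import comb

def bonferroni_alpha(alpha, n_groups):
    """Number of pairwise comparisons M and the per-interval level alpha / M."""
    m = comb(n_groups, 2)
    return m, alpha / m

m, adjusted = bonferroni_alpha(alpha=0.05, n_groups=5)
print(m, adjusted)
```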
26.4 Groups of Different Size
With groups of different sizes, unbalanced data produce confidence intervals of different widths.
Compute the estimated standard error for a pairwise comparison using the following formula with the relevant sample sizes:
$$se(\bar{y}_1 - \bar{y}_2) = s_e \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
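The standard error of a difference between two group means is the residual standard deviation times the square root of 1/n1 + 1/n2. The sketch below uses a made-up residual standard deviation of 3.0 and made-up group sizes; se_diff is a hypothetical helper.

```python
import math

def se_diff(s_e, n1, n2):
    """Standard error of (ybar1 - ybar2) given the residual SD s_e."""
    return s_e * math.sqrt(1 / n1 + 1 / n2)

# Same total of 16 observations, split evenly and unevenly (made-up numbers)
balanced = se_diff(3.0, 8, 8)
unbalanced = se_diff(3.0, 4, 12)
print(balanced, unbalanced)
```

With the same total number of observations, the uneven split gives a larger standard error and therefore a wider interval.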
4M Example 26.1: JUDGING THE CREDIBILITY OF ADVERTISEMENTS
Motivation
Advertising executives want to compare four commercials for a retail item that make claims of varying strengths. Specifically, they want to know how over-the-top an ad can be before customers turn away in disbelief.
Method
The data consist of reactions from a sample of 80 customers who viewed commercials with claims in one of four categories: Tame, Plausible, Stretch, and Outrageous. Each customer was randomly assigned to a commercial. The response variable is Credibility, a score obtained from customers’ responses to items on a questionnaire they completed after viewing the ad.
Use regression with three dummy variables to capture the four types of claims made in the commercials. Check the conditions for ANOVA. Linearity is not an issue and there are no obvious lurking variables because randomization was used in designing the study.
Mechanics - Results
Mechanics – Check remaining conditions before proceeding with inference.
Similar variances condition is satisfied.
Nearly normal condition is satisfied.
The F-test has a p-value of 0.0251; reject H0. The mean credibility ratings of the four commercials are not all equal. In pairwise comparisons using Tukey intervals, the difference between two average credibility scores must be more than 3.25 to be statistically significant.
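Screening all pairs against a common threshold can be sketched as follows. The threshold 3.25 comes from the text; the four group averages are invented for illustration (the actual averages are not given here), and significant_pairs is a hypothetical helper. A single threshold for every pair assumes groups of equal size.

```python
from itertools import combinations

def significant_pairs(group_means, threshold):
    """Pairs of groups whose averages differ by more than the threshold."""
    return [
        (a, b)
        for a, b in combinations(group_means, 2)
        if abs(group_means[a] - group_means[b]) > threshold
    ]

# Invented average credibility scores (assumption, not the study's results)
means = {"Tame": 1.9, "Plausible": 0.6, "Stretch": -0.4, "Outrageous": -1.8}
flagged = significant_pairs(means, threshold=3.25)
print(flagged)
```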
Message
Based on the Tukey intervals, there is only one statistically significant pairwise difference (between commercials making tame claims and those making outrageous claims). Customers place less credibility in ads that make outrageous claims than in ads that make tame claims.
Best Practices
Use a randomized experiment to obtain data.
Check the assumptions of multiple regression when using ANOVA regression.
Use Tukey or Bonferroni confidence intervals to identify groups that are significantly different.
Recognize the cost of snooping in the data to choose hypotheses.
Pitfalls
Don’t compare the means of several groups using lots of t-tests.
Don’t forget confounding factors.
Never pretend you have only two groups.
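The first pitfall can be quantified with a rough calculation. The sketch below assumes the tests are independent, which pairwise t-tests on the same groups are not, so the result is only an approximation of the chance of at least one false positive; familywise_error is a hypothetical helper.

```python
from math import comb

def familywise_error(alpha, n_tests):
    """Chance of at least one Type I error in n_tests independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

n_tests = comb(5, 2)  # all pairwise comparisons among five groups
rate = familywise_error(0.05, n_tests)
print(n_tests, round(rate, 3))
```

Under this approximation, ten separate t-tests at the 5% level carry roughly a 40% chance of at least one false positive.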