1 Topic 12 – Further Topics in ANOVA Unequal Cell Sizes (Chapter 20)

Upload: stephon-masterson

Post on 14-Dec-2015

Topic 12 – Further Topics in ANOVA

Unequal Cell Sizes

(Chapter 20)

Overview

We’ll start with the Learning Activity.

More practice in interpreting ANOVA results; and a baby-step into 3-way ANOVA.

An illustration of the problems that an unbalanced design will cause.

We’ll then continue with a discussion of unbalanced designs (Chapter 20)

Collaborative Learning Activity

Take your time going through this. Ask questions as needed!

Question 1

Analyze the design elements.

Design Chart

Unequal Cell Sizes – but there is SOME balance achieved

Single Factor Analyses will be balanced.

Gender*Age = 6 observations per cell

Time*Age = 6 observations per cell

Gender*Time = Unbalanced

Each x is one observation.

            Gender:   Male                         Female
            Age:      Young   Middle   Elderly     Young   Middle   Elderly
Time
Weekday               xxxx    xxx      xxxxx       xx      xxx      x
Weekend               xx      xxx      x           xxxx    xxx      xxxxx

Question 2

Analyze Age*Time

Interaction Plot (ignoring gender)

Interpretations

No interaction is evident between age and time

The middle-aged group seems to get generally higher offers.

Offers during the week seem generally higher than on the weekend (this effect is not as big as the age effect).

Main Effects Plots

ANOVA

Source      DF   Sum of Squares   Mean Square   F Value   Pr > F
age          2      316.7222222   158.3611111    169.67   <.0001
time         1       53.7777778    53.7777778     57.62   <.0001
age*time     2        0.3888889     0.1944444      0.21   0.8131
Error       30       28.0000000     0.9333333
Total       35      398.8888889

• Type I vs Type III?

LSMeans

#3 (middle aged, weekday) is the highest

Using Tukey comparisons it is significantly higher than all others.

“Slicing” will show the same things that we guessed from the plots.

LSMeans (sliced)

Least Squares Means

age*time Effect Sliced by time for offer

time      DF   Sum of Squares   Mean Square   F Value   Pr > F
wkday      2       148.111111     74.055556     79.35   <.0001
wkend      2       169.000000     84.500000     90.54   <.0001

age*time Effect Sliced by age for offer

age       DF   Sum of Squares   Mean Square   F Value   Pr > F
Elderly    1        18.750000     18.750000     20.09   0.0001
Middle     1        14.083333     14.083333     15.09   0.0005
Young      1        21.333333     21.333333     22.86   <.0001

Slicing of LSMeans

Sums of Squares add to???

DF add to???

Effect of slicing is to look at differences for one of the two factors at a specific level of the other factor.

Interpretations???
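One way to check the "add to???" questions is to total the sliced rows and compare them with the ANOVA table. The sketch below only re-adds numbers already printed in the output above; the variable names are illustrative. It relies on the Age*Time layout being balanced (6 observations per cell), which is when the slices partition the sums of squares this cleanly.

```python
# Sums of squares copied from the age*time ANOVA table and the sliced LSMeans output.
anova_ss = {"age": 316.7222222, "time": 53.7777778, "age*time": 0.3888889}
anova_df = {"age": 2, "time": 1, "age*time": 2}

# Sliced by time: compares the three age groups within each time level.
sliced_by_time = {"wkday": (2, 148.111111), "wkend": (2, 169.000000)}
# Sliced by age: compares weekday vs weekend within each age group.
sliced_by_age = {
    "Elderly": (1, 18.750000),
    "Middle": (1, 14.083333),
    "Young": (1, 21.333333),
}


def totals(slices):
    """Return (total DF, total SS) summed over all slices."""
    df = sum(d for d, _ in slices.values())
    ss = sum(s for _, s in slices.values())
    return df, ss


df_by_time, ss_by_time = totals(sliced_by_time)
df_by_age, ss_by_age = totals(sliced_by_age)

# Slices by time pool the age main effect with the interaction.
age_plus_int_ss = anova_ss["age"] + anova_ss["age*time"]
age_plus_int_df = anova_df["age"] + anova_df["age*time"]
# Slices by age pool the time main effect with the interaction.
time_plus_int_ss = anova_ss["time"] + anova_ss["age*time"]
time_plus_int_df = anova_df["time"] + anova_df["age*time"]

print(f"sliced by time: DF {df_by_time}, SS {ss_by_time:.4f}")
print(f"age + age*time: DF {age_plus_int_df}, SS {age_plus_int_ss:.4f}")
print(f"sliced by age:  DF {df_by_age}, SS {ss_by_age:.4f}")
print(f"time + age*time: DF {time_plus_int_df}, SS {time_plus_int_ss:.4f}")
```

So slicing by one factor spreads "main effect of the other factor plus interaction" across the levels of the slicing factor.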

Question 3

Analyze Age*Gender

Interaction Plot (ignoring Time)

Interpretations

Small interaction is seen; might be described as follows:

There is still a clear main effect: Middle aged get higher offers in general

There seem to be no gender differences for middle aged or young.

For elderly, women may be getting lower offers than men.

LSMeans (sliced comparisons)

Least Squares Means

age*gender Effect Sliced by gender for offer

gender    DF   Sum of Squares   Mean Square   F Value   Pr > F
Female     2       184.333333     92.166667     38.58   <.0001
Male       2       137.444444     68.722222     28.77   <.0001

age*gender Effect Sliced by age for offer

age       DF   Sum of Squares   Mean Square   F Value   Pr > F
Elderly    1        10.083333     10.083333      4.22   0.0487
Middle     1         0.083333      0.083333      0.03   0.8531
Young      1         0.333333      0.333333      0.14   0.7114

ANOVA / LSMeans

Only age differences show up in the ANOVA.

“Sliced” LSMeans comparisons do pick up gender difference within elderly

Note: Type I error rate is uncontrolled. But on the other hand sample sizes are also fairly small.

Conclusions?

Question 4

Analyze Time*Gender

Interaction Plot (ignoring age)

Interpretations

Seems to be a clear interaction: For men, there is not much difference in the offer between weekday/weekend.

Women should go on the weekdays, where it seems they average about $400 more.

Interestingly, significance is not seen in the ANOVA table, but is seen in the ‘sliced’ LSMeans output.

Remember Type I Error is uncontrolled.

ANOVA Table

Source            DF   Sum of Squares (Type I)   Mean Square   F Value   Pr > F
gender             1                5.44444444    5.44444444      0.55   0.4656
time               1               48.34722222   48.34722222      4.84   0.0351
gender*time        1               25.68055556   25.68055556      2.57   0.1185
Error             32              319.4166667     9.9817708
Corrected Total   35              398.8888889

Source            DF   Type III SS   Mean Square   F Value   Pr > F
gender             1    0.01388889    0.01388889      0.00   0.9705
time               1   48.34722222   48.34722222      4.84   0.0351
gender*time        1   25.68055556   25.68055556      2.57   0.1185

• Why are Type I / Type III SS different here?
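A quick arithmetic check helps with this question. The sketch below only re-adds figures printed in the gender*time ANOVA output; the variable names are illustrative. The sequential (Type I) sums of squares always partition the model sum of squares, while the Type III values need not when the cells are unequal.

```python
# Values copied from the gender*time ANOVA output (response: offer).
type1_ss = {"gender": 5.44444444, "time": 48.34722222, "gender*time": 25.68055556}
type3_ss = {"gender": 0.01388889, "time": 48.34722222, "gender*time": 25.68055556}
error_ss = 319.4166667
total_ss = 398.8888889

# Model SS is whatever the error term does not account for.
model_ss = total_ss - error_ss
type1_total = sum(type1_ss.values())
type3_total = sum(type3_ss.values())

print(f"Model SS        = {model_ss:.4f}")
print(f"Sum of Type I   = {type1_total:.4f}")   # matches the model SS
print(f"Sum of Type III = {type3_total:.4f}")   # falls short of the model SS

# Only the gender row changes: entered first, its Type I SS ignores time,
# while its Type III SS is adjusted for time and for the interaction.
for term in type1_ss:
    print(f"{term:12s} Type I - Type III = {type1_ss[term] - type3_ss[term]:.4f}")
```

The gap between the two totals is the part of the model sum of squares that the unbalanced Gender*Time layout does not let us attribute to a single term.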

Sliced LSMeans

gender*time Effect Sliced by time for offer

time      DF   Sum of Squares   Mean Square   F Value   Pr > F
wkday      1        13.444444     13.444444      1.35   0.2544
wkend      1        12.250000     12.250000      1.23   0.2762

gender*time Effect Sliced by gender for offer

gender    DF   Sum of Squares   Mean Square   F Value   Pr > F
Female     1        72.250000     72.250000      7.24   0.0112
Male       1         1.777778      1.777778      0.18   0.6758

Conclusions

This is an intriguing example, because the ANOVA output would lead you to believe there is a small time effect, but no gender effect.

Looking at the interaction plot presents a completely different picture (and likely a more accurate one). Let’s reconsider that, showing the sample sizes.

Interaction Plot (ignoring age)

[Plot not reproduced. The four gender*time points are labeled with their sample sizes: n = 12, n = 12, n = 6, n = 6.]

Confounding

This picture illustrates how the effects of gender and time will be confounded.

Suppose that women do get lower offers than men in general. Then because the women received more weekend offers (and men more offers on weekdays), the average offer on the weekend will by default be lower than the weekday.

Simple example: Suppose men get $2 and women get $1. Then with the sample sizes, the weekday average will be 30/18 while the weekend average will be only 24/18.
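The arithmetic in the simple example can be reproduced directly. This is a minimal sketch: the cell counts are read off the design chart (summing over age), the $2 and $1 offers are the hypothetical values from the slide, and the helper name `time_average` is made up for illustration.

```python
from fractions import Fraction

# Gender*time cell sizes from the design chart, ignoring age.
n = {
    ("Male", "Weekday"): 12,
    ("Female", "Weekday"): 6,
    ("Male", "Weekend"): 6,
    ("Female", "Weekend"): 12,
}
# Hypothetical offers: a pure gender effect, with no time effect at all.
offer = {"Male": 2, "Female": 1}


def time_average(time):
    """Average offer at one time level, pooling over gender."""
    total = sum(offer[g] * size for (g, t), size in n.items() if t == time)
    count = sum(size for (_, t), size in n.items() if t == time)
    return Fraction(total, count)


weekday_avg = time_average("Weekday")   # (12*2 + 6*1) / 18 = 30/18
weekend_avg = time_average("Weekend")   # (6*2 + 12*1) / 18 = 24/18

print("weekday average:", weekday_avg, "=", float(weekday_avg))
print("weekend average:", weekend_avg, "=", float(weekend_avg))
```

Even though time has no effect in this made-up setting, the weekday average comes out higher purely because men are over-represented on weekdays.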

Questions 5 & 6

3-way ANOVA

Is Gender Important?

Modeling

Removing unimportant terms (starting at the interaction level) seems like a reasonable way to go.

Use Type III SS to do this since cell sizes are not the same.

The procedure leads to a model containing only Age and Time, suggesting that gender is unimportant. But we know this may not be accurate, since gender and time are confounded.

Confounding

What exactly does it mean to say that the time/gender effects are confounded?

The biggest thing that it means is that the analysis we just did is inappropriate since...

The time effect may have been seen because more women went on the weekend. It may well be a gender effect that is disguised as a time effect due to the unbalanced design.

Due to the lack of balance, we were forced to use Type III SS, which (due to collinearity / confounding) may not tell the whole story.

Importance of Gender? Probably!

Direct algorithmic analysis suggests both time and age are important, while gender is not. But due to confounding, that wasn’t really appropriate.

The plot for time*gender indicates what is probably the real story (due to small sample sizes it is hard to get significance).

With a balanced design – we would be much better off. The effects would not be confounded, and we could therefore see an accurate picture.

Importance of Gender? (2)

Differing sample sizes mean that estimates for women on weekdays, and men on weekends, will have larger standard errors.

This will reduce our power to detect differences, and the effects will “overlap” to some extent because of the unequal sample sizes.
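To put a rough number on the larger standard errors, the sketch below applies the usual formula for the standard error of a cell mean, sqrt(MSE / n), to the gender*time cells. It assumes the pooled error mean square from the gender*time ANOVA (the model that ignores age) and the cell sizes from the design chart; the names are illustrative.

```python
import math

# Error mean square from the gender*time ANOVA table (32 error DF).
mse = 9.9817708

# Gender*time cell sizes from the design chart, ignoring age.
cell_sizes = {
    ("Male", "Weekday"): 12,
    ("Female", "Weekday"): 6,
    ("Male", "Weekend"): 6,
    ("Female", "Weekend"): 12,
}

# Standard error of each cell mean: sqrt(MSE / n).
se = {cell: math.sqrt(mse / size) for cell, size in cell_sizes.items()}

for (gender, time), value in se.items():
    print(f"{gender:6s} {time:7s} n={cell_sizes[(gender, time)]:2d}  SE={value:.3f}")

# Halving the cell size inflates the standard error by a factor of sqrt(2).
ratio = se[("Female", "Weekday")] / se[("Male", "Weekday")]
print(f"SE ratio (n=6 vs n=12): {ratio:.3f}")
```

The two cells with only 6 observations have standard errors about 41% larger than the cells with 12.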

When we looked at the gender*time interaction, the plot suggested there was an important one. Further studies should be conducted to determine if this is the case.

Unbalanced Two-Way ANOVA

Unequal Cell Sizes

(Chapter 20 – skim only)

Differing Cell Sizes

Encountered for a variety of reasons, including:

Convenience – usually if we have an observational study, we have very little control over the cell sizes.

Cost Effectiveness – sometimes the cost of samples is different, and we may use larger sample sizes when the cost is less.

Accident – in experimental studies, you may start with a balanced design but lose that balance if some problem occurs.

Differing Cell Sizes (2)

What changes?

Loss of balance brings “intercorrelation” among the predictors.

Type I and III SS will be different; typically Type III SS should be used for testing, but as we have seen, even that is not perfect!

Standard errors for cell means and for multiple comparisons will be different (they depend on the cell size). For the same reason, confidence intervals will have different widths.

Example

Examine the effects of gender (A) and anxiety level (B) on a toxin level in the bloodstream.

Three categories of anxiety (Severe, Moderate, and Mild).

We categorize people on this basis after they are in the study (it is an observational factor).

For cost effectiveness, we wouldn’t want to throw away data just to keep a balanced design.

Data

          Severe   Moderate   Mild
Male      1.4      2.1        0.7
          2.4      1.7        1.1
          2.2
Female    2.4      2.5        0.5
                   1.8        0.9
                   2.0        1.3

Interaction Plot

Interpretation

Effect seems to be greater if anxiety is more severe.

This is an interaction of the “enhancement type”. The effect of anxiety level on toxin levels is greater for women than it is for men.

Remember, we aren’t saying anything about significance here – we’ll do that when we look at the ANOVA.

ANOVA Output

Source   DF   SS      MS      F Value   Pr > F
Model     5   4.474   0.895      5.51   0.0172
Error     8   1.300   0.163
Total    13   5.774

R-Square   Root MSE   toxin Mean
0.774864   0.403113   1.642857

Type I / III SS

Source    DF   Type I SS   MS        F Value   Pr > F
gender     1   0.00286     0.00286      0.02   0.8978
anxiety    2   4.39600     2.19800     13.53   0.0027
gen*anx    2   0.07543     0.03771      0.23   0.7980

Source    DF   Type III SS   MS       F Value   Pr > F
gender     1   0.1200        0.1200      0.74   0.4152
anxiety    2   4.1897        2.0949     12.89   0.0031
gen*anx    2   0.0754        0.0377      0.23   0.7980

Differences in Type I / III SS

The more unbalanced the design, the further apart these may be.

There are actually four types of SS:

I – Sequential

II – Added Last (Observation)

III – Added Last (Cell)

IV – Added Last (Empty Cells)

Type I SS

Sequential Sums of Squares; Most appropriate for equal cell sizes.

SS(A), SS(B|A), SS(A*B|A,B)

Each observation is weighted equally. So the net result for an unbalanced design is that some treatments will be considered with greater weight than others.

Type II SS

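Variable Added Last SS in the regression sense; generally only used for regression because, again, each observation is weighted equally.

In a factorial model, each effect is adjusted only for the terms that do not contain it: SS(A|B), SS(B|A), SS(A*B|A,B)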

Type III SS

Variable Added Last SS, appropriate for unequal cell sizes. Type III SS adjusts for the fact that cell sizes are different.

Each cell is weighted equally, with the result that treatments are weighted equally. This means that observations in “smaller” cells will carry more weight.

SS(A|B,A*B), SS(B|A,A*B), SS(A*B|A,B)
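The two weighting schemes can be made concrete with the gender effect in the toxin example. The sketch below is not SAS's algorithm; it computes the Type I SS for gender (entered first) from observation-weighted gender means, and the Type III SS from the contrast between equally weighted averages of cell means. The assignment of values to cells is a reconstruction of the Data slide and should be treated as an assumption, although it reproduces the reported grand mean (1.642857) and error SS (1.300).

```python
# Toxin data by (gender, anxiety) cell; cell assignment is a reconstruction (assumption).
data = {
    ("Male", "Severe"): [1.4, 2.4, 2.2],
    ("Male", "Moderate"): [2.1, 1.7],
    ("Male", "Mild"): [0.7, 1.1],
    ("Female", "Severe"): [2.4],
    ("Female", "Moderate"): [2.5, 1.8, 2.0],
    ("Female", "Mild"): [0.5, 0.9, 1.3],
}
genders = ["Male", "Female"]
levels = ["Severe", "Moderate", "Mild"]


def mean(values):
    return sum(values) / len(values)


# Type I, gender entered first: every observation carries equal weight.
all_obs = [y for ys in data.values() for y in ys]
grand_mean = mean(all_obs)
type1_gender = 0.0
for g in genders:
    obs = [y for (gg, _), ys in data.items() if gg == g for y in ys]
    type1_gender += len(obs) * (mean(obs) - grand_mean) ** 2

# Type III: every cell carries equal weight, so compare averages of cell means.
cell_mean = {cell: mean(ys) for cell, ys in data.items()}
ls_mean = {g: mean([cell_mean[(g, a)] for a in levels]) for g in genders}
diff = ls_mean["Male"] - ls_mean["Female"]
# Variance multiplier of the contrast: each cell mean has coefficient +/- 1/3.
var_factor = sum(1 / len(data[(g, a)]) for g in genders for a in levels) / len(levels) ** 2
type3_gender = diff ** 2 / var_factor

print(f"Type I   SS(gender) = {type1_gender:.5f}")   # table reports 0.00286
print(f"Type III SS(gender) = {type3_gender:.4f}")   # table reports 0.1200
```

The single severe-anxiety observation for women counts as one of 14 observations under Type I, but as a full cell (one sixth of the design) under Type III, which is why it carries more weight there.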

Type IV SS

Variable Added Last SS and similar to Type III SS but further allows for the possibility of empty cells.

It is only necessary to use these if there are empty cells (which hopefully there won’t be if you’ve designed the experiment well).

SS(A|B,A*B), SS(B|A,A*B), SS(A*B|A,B)

General Strategy

Remember that Type I SS and Type III SS examine different null hypotheses.

Type III SS are preferred when sample sizes are not equal, but can be somewhat misleading if sample sizes differ greatly.

Type IV SS are appropriate if there are empty cells.

If necessary, Type IV SS can be obtained by adding /ss4 to the MODEL statement.

Example (continued)

The interaction is unimportant, and there is no apparent large effect of gender.

Now look at comparing different levels of anxiety; should not ‘change’ models at this point, so just average over gender (LSMeans).

Source    DF   Type III SS   MS       F Value   Pr > F
gender     1   0.1200        0.1200      0.74   0.4152
anxiety    2   4.1897        2.0949     12.89   0.0031
gen*anx    2   0.0754        0.0377      0.23   0.7980

LSMeans

Must use LSMeans to adjust all means to the same “average level” of gender.

anxiety    toxin LSMEAN   LSMEAN Number
mild       0.90000000     1
moderate   2.00000000     2
severe     2.20000000     3
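The sketch below shows what "the same average level of gender" means in practice: each anxiety LSMean is the plain average of the male and female cell means, whereas the raw mean pools all observations and so leans toward whichever gender has more data in that column. The cell assignment is a reconstruction of the Data slide (an assumption), checked against the LSMeans reported above.

```python
# Toxin data by (gender, anxiety) cell; cell assignment is a reconstruction (assumption).
data = {
    ("Male", "severe"): [1.4, 2.4, 2.2],
    ("Male", "moderate"): [2.1, 1.7],
    ("Male", "mild"): [0.7, 1.1],
    ("Female", "severe"): [2.4],
    ("Female", "moderate"): [2.5, 1.8, 2.0],
    ("Female", "mild"): [0.5, 0.9, 1.3],
}
genders = ["Male", "Female"]
levels = ["mild", "moderate", "severe"]


def mean(values):
    return sum(values) / len(values)


cell_mean = {cell: mean(ys) for cell, ys in data.items()}

# LSMean: average the two gender cell means, giving each gender equal weight.
ls_mean = {a: mean([cell_mean[(g, a)] for g in genders]) for a in levels}

# Raw mean: pool every observation at that anxiety level.
raw_mean = {a: mean([y for g in genders for y in data[(g, a)]]) for a in levels}

for a in levels:
    print(f"{a:9s} LSMean = {ls_mean[a]:.2f}   raw mean = {raw_mean[a]:.2f}")
```

The two agree for the mild group, but for the severe group the raw mean (2.10) is pulled toward the three male observations, while the LSMean (2.20) treats the lone female observation as a full cell.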

Comparisons

Mild group has significantly lower toxin levels than the moderate and severe groups.

Least Squares Means for effect anxiety
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: toxin

i/j    1        2        3
1               0.0072   0.0059
2      0.0072            0.7845
3      0.0059   0.7845

Confidence Intervals

Could get CI’s for means and/or differences if you wanted them.

They will be of different widths – why?

It will be harder to detect differences for groups with fewer observations.
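As a hint toward the "why?", the standard interval formulas can be written out. This is a sketch in generic notation not used on the slides: $a$ is the number of gender levels, $b$ the number of anxiety levels, $n_{ij}$ the size of cell $(i,j)$, and $n_T$ the total sample size.

```latex
% Confidence interval for a single cell mean
\bar{Y}_{ij\cdot} \;\pm\; t\!\left(1-\tfrac{\alpha}{2};\; n_T - ab\right)\sqrt{\frac{MSE}{n_{ij}}}

% Confidence interval for the LSMean of anxiety level j (cell means averaged over gender)
\hat{\mu}_{\cdot j} \;\pm\; t\!\left(1-\tfrac{\alpha}{2};\; n_T - ab\right)
\sqrt{\frac{MSE}{a^{2}}\sum_{i=1}^{a}\frac{1}{n_{ij}}}
```

In the toxin example $MSE = 0.163$ on $n_T - ab = 8$ degrees of freedom, so the only thing that changes from one interval to the next is the cell size under the square root: smaller $n_{ij}$ gives a wider interval.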

Questions?

Upcoming in Topic 13...

Random Effects

(parts of chapters 17 & 19 that were previously skipped)