Lecture 29: RCBD & Unequal Cell Sizes - Purdue University

Lecture 29

RCBD & Unequal Cell Sizes

STAT 512

Spring 2011

Background Reading

KNNL: 21.1-21.6, Chapter 23

Topic Overview

• Randomized Complete Block Designs

(RCBD)

• ANOVA with unequal sample sizes

RCBD

• Randomized complete block designs are

useful whenever the experimental units are

non-homogeneous.

• Grouping experimental units (EUs) into “blocks” of

homogeneous units helps reduce the SSE

and increase the likelihood that we will be

able to see differences among treatments.

• A “block” consists of a complete replication

of the set of treatments. Blocks and

treatments are assumed not to interact.

RCBD Model

• Assuming no replication, same as two-way ANOVA with one observation per cell. No interaction between block and treatment.

$Y_{ijk} = \mu + \rho_i + \tau_j + \varepsilon_{ijk}$

where $\varepsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$ and $\sum_i \rho_i = \sum_j \tau_j = 0$

• We refer to $\rho_i$ as the block effects and $\tau_j$ as the treatment effects.

• We are really only interested in further analysis on the treatment effects.
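The sums of squares implied by this additive model can be sketched in a few lines of Python. This is a minimal illustration, not the course's SAS code: the function name `rcbd_anova` and the 4-block, 3-treatment data set `toy` are made up for the example, and the sketch assumes exactly one observation per block-treatment cell.

```python
def rcbd_anova(y):
    """ANOVA decomposition for an RCBD; y[i][j] is block i, treatment j."""
    b, t = len(y), len(y[0])
    grand = sum(map(sum, y)) / (b * t)
    block_means = [sum(row) / t for row in y]
    trt_means = [sum(y[i][j] for i in range(b)) / b for j in range(t)]

    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_error = ss_total - ss_block - ss_trt   # what the additive model leaves over

    df_error = (b - 1) * (t - 1)
    mse = ss_error / df_error
    return {
        "ss_block": ss_block,
        "ss_trt": ss_trt,
        "ss_error": ss_error,
        "df_error": df_error,
        "f_trt": (ss_trt / (t - 1)) / mse,
    }

# Hypothetical data: 4 blocks (rows) by 3 treatments (columns).
toy = [[10, 12, 14], [11, 14, 14], [15, 16, 20], [12, 14, 16]]
out = rcbd_anova(toy)
print(out)
```

Only the treatment F statistic is returned because, as noted above, the block effects are usually not of direct interest.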

RCBD Example

• Want to study the effects of three different

sealers on protecting concrete patios from

the weather.

• Ten unsealed patios are available spread

across Indianapolis.

• Separate each patio into three portions, and

apply the treatments (randomly) in such a

way that each patio receives each treatment

for 1/3 of the surface.

RCBD Example (2)

• Patio (location) is a blocking factor. Probably

the weather will be different in each location;

some patios may be better sheltered (trees,

etc.)

• If patio location is important, then failing to block on patio location would probably inflate the MSE, because the patio-to-patio variation is pooled into the error term.

• Blocking costs error DF (9 in this case), but if the blocking variable is unimportant, the MSE with and without blocking will usually be about the same.

RCBD Example (3)

Source   DF     SS    MS   F Value
patio     9    900   100   10.0
sealer    2    100    50    5.0
Error    18    180    10
Total    29   1180

• If the ANOVA results are as above, then blocking is clearly important. If we do not block here...

Source   DF     SS    MS   F Value
sealer    2    100    50   1.25
Error    27   1080    40

RCBD Example (4)

Source   DF    SS    MS     F Value
patio     9   108    12     1.2
sealer    2   100    50     5.0
Error    18   180    10
Total    29   388

• If the ANOVA results are as above, then blocking doesn’t appear to have been as important. In this case if we fail to block...

Source   DF    SS    MS     F Value
sealer    2   100    50     4.69
Error    27   288    10.7
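The "fail to block" tables in both examples come from pooling the block line into the error line. A minimal sketch of that arithmetic, assuming a balanced RCBD so that the treatment SS is unchanged when the block term is dropped (the function name `unblocked_f` is made up for illustration):

```python
def unblocked_f(ss_block, df_block, ss_trt, df_trt, ss_error, df_error):
    """Treatment F statistic when the block term is dropped from the model."""
    pooled_ss = ss_error + ss_block   # block variation is absorbed into error
    pooled_df = df_error + df_block
    mse = pooled_ss / pooled_df
    return (ss_trt / df_trt) / mse

# Blocking important: SS(patio) = 900
f_important = unblocked_f(900, 9, 100, 2, 180, 18)
# Blocking unimportant: SS(patio) = 108
f_unimportant = unblocked_f(108, 9, 100, 2, 180, 18)

print(round(f_important, 2), round(f_unimportant, 2))
```

With blocking, the sealer F statistic is 5.0 in both scenarios; without it, the statistic collapses only when the block SS is large.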

Big Picture

• Failing to block when you should block can

cost you the ability to see treatment effects

• Blocking when there is no need usually doesn’t cost much at all (though it can if the SSBlock is small enough relative to its df).

• Blocking effectively requires foresight. An

experimenter must guess what sources of

variation will exist in order to block on them.

Other Advantages of RCBD

• Reasonably simple analysis to perform

• Effective grouping makes results much more

precise.

• Can drop an entire block or treatment if

necessary, without complicating the

analysis.

• Can deliberately introduce extra variability

into the EUs to widen the range of validity

of the results without sacrificing precision.

Some Disadvantages of RCBD

• Missing observations are a complex problem

(since generally each treatment is

represented exactly once in each block)

• Loss of error degrees of freedom

• Additional assumptions are required for the

model (additivity, constant variance across

blocks)

Multiple Blocking Variables

• It is often the case that the EUs have multiple characteristics on which you could block.

• Example: Consider the effect of three

treatments for asthma. Might block on

both AGE and GENDER.

• Each treatment would be represented once at

each AGE*GENDER combination.

More on Blocking...

• Quite a bit more information in Chapter 21

o More than one replicate per block

o Factorial treatments

• These and related topics are discussed in STAT 514.

Unequal Sample Sizes

• Encountered for a variety of reasons including:

o Convenience – usually if we have an observational study, we have very little control over the cell sizes.

o Cost Effectiveness – sometimes the cost of samples is different, and we may use larger sample sizes when the cost is less.

o In experimental studies, you may start with a balanced design, but lose that balance if problems occur.

Unequal Sample Sizes (2)

• What changes?

o Loss of balance brings “intercorrelation” among the predictors (i.e., variables are no longer orthogonal)

o Type I and III SS will be different; typically Type III SS should be used for testing

o LSMeans should be used for testing

o Standard errors for cell means and for multiple comparisons will be different

o Confidence intervals will have different widths

Example

• Examine the effects of gender (A) and bone

development (B) on the rate of growth

induced by a synthetic growth hormone.

• Three categories of Bone Development

Depression (Severe, Moderate, and Mild)

• We categorize people on this basis after they

are in the study (it is an observational

factor); we wouldn’t want to throw away

data just to keep a balanced design.

• Page 954, growth.sas

Data / Sample Sizes

          Severe         Moderate       Mild
Male      1.4 2.4 2.2    2.1 1.7        0.7 1.1
Female    2.4            2.5 1.8 2.0    0.5 0.9 1.3

[Figure: interaction plot for the growth data]

Interpretation

• Same as any interaction plot

• Effect seems to be greater if disease is more

severe.

• Effect seems greater for women than men.

• Possibly an interaction. The effect of bone

development is enhanced (greater) for

women as compared to men.

• We aren’t saying anything about

significance here – we’ll do that when we

look at the ANOVA.

ANOVA Output

Source   DF   SS      MS      F Value   Pr > F
Model     5   4.474   0.895   5.51      0.0172
Error     8   1.300   0.163
Total    13   5.774

R-Square   Root MSE   growth Mean
0.774864   0.403113   1.642857

Type I / III SS

Source     DF   Type I SS   MS        F Value   Pr > F
gender      1   0.00286     0.00286    0.02     0.8978
bone        2   4.39600     2.19800   13.53     0.0027
gen*bone    2   0.07543     0.03771    0.23     0.7980

Source     DF   Type III SS   MS       F Value   Pr > F
gender      1   0.1200        0.1200    0.74     0.4152
bone        2   4.1897        2.0949   12.89     0.0031
gen*bone    2   0.0754        0.0377    0.23     0.7980
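Both sets of sums of squares can be reproduced from the raw growth data by comparing the error SS of nested regression models. The sketch below is an illustration in plain Python rather than the SAS program behind the slides. It assumes sum-to-zero (effect) coding, under which dropping a term from the full model gives the Type III SS, and it fits each model with a small hand-written normal-equations solver; the helper names (`columns`, `solve`, `sse`) are made up for the example.

```python
# Growth data from the slides: (gender, bone) -> growth rates
cells = {
    ("male", "severe"): [1.4, 2.4, 2.2],
    ("male", "moderate"): [2.1, 1.7],
    ("male", "mild"): [0.7, 1.1],
    ("female", "severe"): [2.4],
    ("female", "moderate"): [2.5, 1.8, 2.0],
    ("female", "mild"): [0.5, 0.9, 1.3],
}

# Sum-to-zero (effect) coding for each factor
G = {"male": 1.0, "female": -1.0}
B1 = {"severe": 1.0, "moderate": 0.0, "mild": -1.0}
B2 = {"severe": 0.0, "moderate": 1.0, "mild": -1.0}

def columns(g, b):
    """Effect-coded predictors for one observation, grouped by model term."""
    return {
        "intercept": [1.0],
        "gender": [G[g]],
        "bone": [B1[b], B2[b]],
        "gender*bone": [G[g] * B1[b], G[g] * B2[b]],
    }

def solve(a, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * z for x, z in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def sse(terms):
    """Residual sum of squares for the model containing the given terms."""
    X, y = [], []
    for (g, b), ys in cells.items():
        cols = columns(g, b)
        row = [v for t in terms for v in cols[t]]
        for obs in ys:
            X.append(row)
            y.append(obs)
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((v - sum(bi * xi for bi, xi in zip(beta, r))) ** 2
               for r, v in zip(X, y))

order = ["gender", "bone", "gender*bone"]
full = ["intercept"] + order

# Type I: each term is added to the terms listed before it.
type1 = {}
for i, t in enumerate(order):
    type1[t] = sse(["intercept"] + order[:i]) - sse(["intercept"] + order[:i + 1])

# Type III: each term is dropped from the full effect-coded model.
type3 = {t: sse([u for u in full if u != t]) - sse(full) for t in order}

print(type1)
print(type3)
```

The interaction line is the same in both tables because it is the last term added in either scheme; only the main-effect lines depend on the type.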

Type X SS

• There are actually four relevant types of

sums of squares.

o I – Sequential

o II – Added Last (Observation)

o III – Added Last (Cell)

o IV – Added Last (Empty Cells)

Type I SS

• Sequential Sums of Squares, appropriate for

equal cell sizes.

• SS(A), SS(B|A), SS(A*B|A,B)

• Each observation is weighted equally, with

the result that treatments are weighted in

proportion to their cell size (if unequal,

then not all treatments get the same weight

in the analysis)

Type II SS

• Variable Added Last SS among the terms that do not contain the effect being tested, appropriate for equal cell sizes.

• SS(A|B), SS(B|A), SS(A*B|A,B)

• Each observation is weighted equally

Type III SS

• Variable Added Last SS, appropriate for

unequal cell sizes.

• SS(A|B,A*B), SS(B|A,A*B), SS(A*B|A,B)

• Each cell/treatment is weighted equally, so observations are weighted differently. Type III SS adjusts for the unequal cell sizes by weighting the observations unequally.

Type IV SS

• Variable Added Last SS, necessary if there

are empty cells

• SS(A|B,A*B), SS(B|A,A*B), SS(A*B|A,B)

• Like Type III SS but additionally takes into

account the possibility of empty cells.

Data: Design Chart

          Severe   Moderate   Mild
Male      xxx      xx         xx
Female    x        xxx        xxx
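The cell counts in this chart are what break orthogonality. As a rough sketch (assuming sum-to-zero coding for both factors, which is a choice made here for illustration), the cross-products between the gender column and the two bone columns are zero in a balanced layout but not in this one:

```python
# Cell sizes read off the design chart: (gender, bone) -> n
counts = {
    ("male", "severe"): 3, ("male", "moderate"): 2, ("male", "mild"): 2,
    ("female", "severe"): 1, ("female", "moderate"): 3, ("female", "mild"): 3,
}

# Sum-to-zero (effect) coding, an assumption of this sketch
G = {"male": 1, "female": -1}
B1 = {"severe": 1, "moderate": 0, "mild": -1}
B2 = {"severe": 0, "moderate": 1, "mild": -1}

def cross_products(n):
    """Cross-products of the gender column with each bone column."""
    g_b1 = sum(k * G[g] * B1[b] for (g, b), k in n.items())
    g_b2 = sum(k * G[g] * B2[b] for (g, b), k in n.items())
    return g_b1, g_b2

unbalanced = cross_products(counts)
balanced = cross_products({cell: 2 for cell in counts})  # hypothetical n = 2 per cell

print(unbalanced, balanced)
```

A nonzero cross-product means the gender and bone predictors share information, so the SS attributed to each depends on what else is in the model.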

Example: Type I Hypotheses

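Main Effect Gender

$H_0: \tfrac{3}{7}\mu_{11} + \tfrac{2}{7}\mu_{12} + \tfrac{2}{7}\mu_{13} = \tfrac{1}{7}\mu_{21} + \tfrac{3}{7}\mu_{22} + \tfrac{3}{7}\mu_{23}$

Main Effect Bone

$H_0: \tfrac{3}{4}\mu_{11} + \tfrac{1}{4}\mu_{21} = \tfrac{2}{5}\mu_{12} + \tfrac{3}{5}\mu_{22} = \tfrac{2}{5}\mu_{13} + \tfrac{3}{5}\mu_{23}$

Observations weighted equally, treatments weighted by sample size.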
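The fractions in these hypotheses are just cell sizes divided by the corresponding marginal totals. A small sketch that derives them from the design chart (the function names are made up for illustration):

```python
from fractions import Fraction

genders = ["male", "female"]
bones = ["severe", "moderate", "mild"]

# Cell sizes from the design chart: (gender, bone) -> n
counts = {
    ("male", "severe"): 3, ("male", "moderate"): 2, ("male", "mild"): 2,
    ("female", "severe"): 1, ("female", "moderate"): 3, ("female", "mild"): 3,
}

def gender_weights(g):
    """Weights on the cell means of gender g, one per bone level."""
    total = sum(counts[g, b] for b in bones)
    return [Fraction(counts[g, b], total) for b in bones]

def bone_weights(b):
    """Weights on the cell means of bone level b, one per gender."""
    total = sum(counts[g, b] for g in genders)
    return [Fraction(counts[g, b], total) for g in genders]

for g in genders:
    print(g, [str(w) for w in gender_weights(g)])
for b in bones:
    print(b, [str(w) for w in bone_weights(b)])
```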

Example: Type III Hypotheses

Main Effect Gender

$H_0: \tfrac{1}{3}(\mu_{11} + \mu_{12} + \mu_{13}) = \tfrac{1}{3}(\mu_{21} + \mu_{22} + \mu_{23})$

Main Effect Bone

$H_0: \tfrac{1}{2}(\mu_{11} + \mu_{21}) = \tfrac{1}{2}(\mu_{12} + \mu_{22}) = \tfrac{1}{2}(\mu_{13} + \mu_{23})$

Treatments are weighted equally, observations not weighted equally.

General Strategy

• Remember that Type I SS and Type III SS

examine different null hypotheses.

• Type III SS are preferred when sample sizes

are not equal, but can be somewhat

misleading if sample sizes differ greatly.

• Type IV SS are appropriate if there are

empty cells.

• Can obtain Type II/IV SS if necessary by using / ss1 ss2 ss3 ss4 in the MODEL statement

Example: Type III SS

Source     DF   Type III SS   MS       F Value   Pr > F
gender      1   0.1200        0.1200    0.74     0.4152
bone        2   4.1897        2.0949   12.89     0.0031
gen*bone    2   0.0754        0.0377    0.23     0.7980

• The interaction and gender effects are not

significant.

• Now look at comparing different levels of

bone; should not ‘change’ models at this

point, so need to average over gender.

Multiple Comparisons

• Suppose we keep model as is, and examine

effect of bone.

• Output from MEANS statement (WRONG):

Level of         ------------growth-----------
bone        N    Mean     Std Dev
mild        5    0.900    0.31622777
moderate    5    2.020    0.31144823
severe      4    2.100    0.47609523

• These numbers are not adjusted for gender.

Multiple Comparisons (2)

• Output from LSMeans (means are correctly

adjusted for the level of gender – can think

of these as the means for the “average”

gender).

            growth        LSMEAN
bone        LSMEAN        Number
mild        0.90000000    1
moderate    2.00000000    2
severe      2.20000000    3

Multiple Comparisons

• Example – Severe Case

MEANS: $\frac{1.4 + 2.4 + 2.2 + 2.4}{4} = 2.1$

(Sum up all severe cases and divide by number of severe cases regardless of gender)

LSMEANS: $\frac{\frac{1.4 + 2.4 + 2.2}{3} + 2.4}{2} = 2.2$

(For severe cases, get averages for men and women and then take the average – accounts for gender)
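The same two calculations can be repeated for every bone level. The sketch below assumes the full two-way model with interaction, in which case the least squares mean for a bone level is the plain average of its two cell means; the function names are made up for illustration.

```python
genders = ["male", "female"]
bones = ["mild", "moderate", "severe"]

# Growth data from the slides: (gender, bone) -> growth rates
cells = {
    ("male", "severe"): [1.4, 2.4, 2.2],
    ("male", "moderate"): [2.1, 1.7],
    ("male", "mild"): [0.7, 1.1],
    ("female", "severe"): [2.4],
    ("female", "moderate"): [2.5, 1.8, 2.0],
    ("female", "mild"): [0.5, 0.9, 1.3],
}

def mean(values):
    return sum(values) / len(values)

def raw_mean(bone):
    """MEANS-style average: pool all observations, ignoring gender."""
    return mean([y for g in genders for y in cells[g, bone]])

def ls_mean(bone):
    """LSMEANS-style average: average the gender cell means equally."""
    return mean([mean(cells[g, bone]) for g in genders])

for b in bones:
    print(b, round(raw_mean(b), 3), round(ls_mean(b), 3))
```

The two averages agree for the mild group only because the male and female cell means happen to be equal there.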

Examining Differences (LSMEANS)

Least Squares Means for effect bone
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: growth

i/j    1         2         3
1                0.0072    0.0059
2      0.0072              0.7845
3      0.0059    0.7845

• Mild group is significantly different from

the moderate and severe groups (those

groups are aided more by the hormone)

Examining Differences (LSMEANS)

bone        LSMEAN   95% Confidence Limits
mild        0.900    0.475707   1.324293
moderate    2.000    1.575707   2.424293
severe      2.200    1.663307   2.736693

• Growth rate increased for each group, but increased by about 1 cm / month more in the moderate/severe groups than in the mild group

• Note that the widths of these CIs are different due to different sample sizes (severe is wider, since it has fewer observations)
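The different interval widths follow directly from the cell sizes. As a sketch, assuming each LS mean is the equally weighted average of its two cell means, its standard error is $\sqrt{MSE \sum_i 1/n_{ij}}\,/\,2$; the critical value 2.306 used below is the usual table value for $t_{0.975}$ with 8 error df and is hard-coded rather than computed.

```python
import math

genders = ["male", "female"]

# Cell sizes from the design chart: (gender, bone) -> n
counts = {
    ("male", "severe"): 3, ("male", "moderate"): 2, ("male", "mild"): 2,
    ("female", "severe"): 1, ("female", "moderate"): 3, ("female", "mild"): 3,
}

mse = 1.300 / 8    # Error SS / Error DF from the ANOVA output
t_crit = 2.306     # t(0.975) with 8 df, taken from a standard t table

def ls_mean_se(bone):
    """Standard error of an LS mean that averages the gender cell means equally."""
    a = len(genders)
    return math.sqrt(mse * sum(1 / counts[g, bone] for g in genders)) / a

def half_width(bone):
    """Half-width of the 95% confidence interval for the LS mean."""
    return t_crit * ls_mean_se(bone)

for b in ["mild", "moderate", "severe"]:
    print(b, round(half_width(b), 4))
```

The severe group has the widest interval because its female cell holds a single observation, which dominates the sum of reciprocal cell sizes.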

Upcoming in Lecture 30

• A few more examples of unequal sample

sizes.

• Analysis of Covariance