Topic 27: Two-Way ANOVA Strategies of Analysis
Source: bacraig/notes512/Topic_27.pdf

Upload: doanduong

Post on 24-Apr-2018


Outline

• Strategy for analysis of two-way studies
  – Interaction is not significant
  – Interaction is significant
• What if the levels of one or both factors are quantitative?
• What if cell size is n=1?

An analytical strategy

• Run the model with main effects and the two-way interaction
• Plot the data, the means (interaction plot), and look at the normal quantile plot
  – Allows you to check assumptions and get an impression of what may be significant
• If the assumptions are reasonable, check the significance test for the interaction

AB interaction not significant

• Can now consider main effect tests
• Possibly rerun the analysis without the interaction. Only consider this if the error degrees of freedom (dfe) are small
• For a main effect that is not significant
  – No evidence to conclude that the levels of this factor are associated with different means of the response
  – Possibly rerun without this factor, giving a one-way ANOVA. Only consider this if dfe is small

Watch out for Type II errors: dropping a term whose effect is real pools that effect into the error and inflates MSE

AB interaction not significant (continued)

• If both main effects are not significant
  – Basically assuming a one-population model
  – Model could be run as model Y=; (intercept only)
• For a significant main effect with more than two levels, use the means or lsmeans statement with the tukey multiple comparison procedure
• Contrasts and linear combinations can also be examined using the contrast and estimate statements

AB interaction significant but not important

• Plots and a careful examination of the cell means may indicate that the interaction is not very important even though it is statistically significant
• Use the marginal means for each significant main effect to describe the important results
• You may need to qualify these results using the interaction

AB interaction significant but not important (continued)

• Use the methods described for the situation where the interaction is not significant, but keep the interaction in the model
• Carefully interpret the marginal means as averages over the levels of the other factor

AB interaction is significant and important

• Options include
  – Treat as a one-way ANOVA with ab levels (one per treatment combination); use tukey to compare means; the contrast and estimate statements can also be useful
  – Report that the interaction is significant; plot the means and describe the pattern
  – Analyze the levels of A for each level of B (use a BY statement) or vice versa

Specification of contrast and estimate statements

• When using the contrast statement, always double check your results with the estimate statement
• The order of factors is determined by the order in the class statement, not the order in the model statement
• Contrasts should come a priori from research questions

KNNL Example

• KNNL p 833
• Y is the number of cases of bread sold
• A is the height of the shelf display, a=3 levels: bottom, middle, top
• B is the width of the shelf display, b=2 levels: regular, wide
• n=2 stores for each of the 3x2 treatment combinations

Contrast/Estimate Example

• With this approach, the contrast should correspond to a research question formulated before examining the data
• Suppose we a priori want to compare the average of the two cells with height = middle with the average of the other four cells

Contrast/Estimate

• First, formulate the question as a contrast using the cell means model
    (μ21 + μ22)/2 = (μ11 + μ12 + μ31 + μ32)/4
    -.25μ11 - .25μ12 + .5μ21 + .5μ22 - .25μ31 - .25μ32 = 0
• Then translate the contrast into the factor effects model using
    μij = μ + αi + βj + (αβ)ij

Contrast/Estimate

• -.25μ11 = -.25(μ + α1 + β1 + (αβ)11)
• -.25μ12 = -.25(μ + α1 + β2 + (αβ)12)
• +.5μ21 = +.5(μ + α2 + β1 + (αβ)21)
• +.5μ22 = +.5(μ + α2 + β2 + (αβ)22)
• -.25μ31 = -.25(μ + α3 + β1 + (αβ)31)
• -.25μ32 = -.25(μ + α3 + β2 + (αβ)32)
• Summing, the μ, β1 and β2 terms each have total coefficient 0 and drop out, leaving
  C = (-.5α1 + α2 - .5α3) + (-.25(αβ)11 - .25(αβ)12 + .5(αβ)21 + .5(αβ)22 - .25(αβ)31 - .25(αβ)32)
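The bookkeeping in this translation can be checked mechanically. The following is a minimal Python sketch (not part of the original SAS program; the variable names are illustrative) that starts from the six cell-means coefficients and collects the coefficient attached to μ, each αi, each βj and each (αβ)ij.

```python
from fractions import Fraction as F

# Cell-means contrast coefficients c[i][j]; rows are height levels 1..3,
# columns are width levels 1..2 (values copied from the slide).
c = [[F(-1, 4), F(-1, 4)],
     [F(1, 2), F(1, 2)],
     [F(-1, 4), F(-1, 4)]]

# Substituting mu_ij = mu + alpha_i + beta_j + (alpha*beta)_ij and collecting terms:
mu_coef = sum(sum(row) for row in c)                            # coefficient on mu
alpha_coef = [sum(row) for row in c]                            # coefficients on alpha_i
beta_coef = [sum(c[i][j] for i in range(3)) for j in range(2)]  # coefficients on beta_j
ab_coef = [c[i][j] for i in range(3) for j in range(2)]         # coefficients on (alpha*beta)_ij

print("mu:", float(mu_coef))
print("alpha:", [float(x) for x in alpha_coef])
print("beta:", [float(x) for x in beta_coef])
print("alpha*beta:", [float(x) for x in ab_coef])
```

The α and (αβ) coefficients are the ones that have to be supplied to the contrast and estimate statements; the μ and β coefficients are zero, so those terms need no entry.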

Proc glm

proc glm data=a1;
  class height width;
  model sales=height width height*width;

Contrast and estimate

  contrast 'middle two vs all others'
    height -.5 1 -.5
    height*width -.25 -.25 .5 .5 -.25 -.25;
  estimate 'middle two vs all others'
    height -.5 1 -.5
    height*width -.25 -.25 .5 .5 -.25 -.25;
  means height*width;
run;

Output

Contrast                   DF   Contrast SS   Mean Square   F Value   Pr > F
middle two vs all others    1   1536.000000   1536.000000    148.65   <.0001

Parameter                    Estimate   Standard Error   t Value   Pr > |t|
middle two vs all others   24.0000000       1.96850197     12.19     <.0001

Check with means

height   width   mean
1        1       45
1        2       43
2        1       65
2        2       69
3        1       40
3        2       44

(65+69)/2 – (45+43+40+44)/4 = 67 – 43 = 24
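The same check can be scripted. The sketch below, in Python rather than SAS, recomputes the estimate and the contrast sum of squares from the six cell means and n=2. The MSE is not printed on these slides, so the sketch backs it out of the reported standard error; that step is an assumption about how the standard error was formed (SE² = MSE · Σc²/n for a balanced design), and it makes the F check equivalent to F = t² for a 1 df contrast.

```python
# Cell means in the order (height, width) = (1,1), (1,2), (2,1), (2,2), (3,1), (3,2)
means = [45, 43, 65, 69, 40, 44]
coefs = [-0.25, -0.25, 0.5, 0.5, -0.25, -0.25]
n = 2  # stores per cell

sum_c2 = sum(c * c for c in coefs)

# Contrast estimate: L = sum of c_ij * cell mean
estimate = sum(c * m for c, m in zip(coefs, means))

# Contrast sum of squares for a balanced design: L^2 / (sum(c^2) / n)
contrast_ss = estimate ** 2 / (sum_c2 / n)

# Assumption: SE^2 = MSE * sum(c^2) / n, so MSE can be recovered from the reported SE
se_reported = 1.96850197
mse_implied = se_reported ** 2 * n / sum_c2

t_stat = estimate / se_reported
f_stat = contrast_ss / mse_implied

print(estimate, contrast_ss)
print(round(t_stat, 2), round(f_stat, 2))
```

The estimate and contrast SS should agree with the 24 and 1536 in the SAS output, and the t and F values with 12.19 and 148.65 up to rounding.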

One Quantitative factor and one categorical

• Plot the means vs the quantitative factor for each level of the categorical factor
• Consider linear and quadratic terms for the quantitative factor (regression)
• Consider different relationships for the different levels of the categorical factor
• Lack of fit analysis can be useful

Two Quantitative factors

• Plot the means
  – vs A for each level of B
  – vs B for each level of A
• Consider linear and quadratic terms
• Consider products to allow for interaction
• Lack of fit analysis can be useful
• Response surface methodology

One observation per cell

• For Yijk we use
  – i to denote the level of the factor A
  – j to denote the level of the factor B
  – k to denote the kth observation in cell (i,j)
• i = 1, . . . , a levels of factor A
• j = 1, . . . , b levels of factor B
• n = 1 observation in cell (i,j)

Factor effects model

• μij = μ + αi + βj
• μ is the overall mean
• αi is the main effect of A
• βj is the main effect of B
• Because we have only one observation per cell, we do not have enough information to estimate the interaction in the usual way; it is confounded with error

Constraints

• Σi αi = 0
• Σj βj = 0
• SAS GLM Constraints
  – αa = 0 (1 constraint)
  – βb = 0 (1 constraint)

ANOVA Table

Source   df           SS    MS    F
A        a-1          SSA   MSA   MSA/MSE
B        b-1          SSB   MSB   MSB/MSE
Error    (a-1)(b-1)   SSE   MSE
Total    ab-1         SST   MST

The error df, (a-1)(b-1), would normally be the df for interaction

Expected Mean Squares

• Based on a model which has no interaction
• E(MSE) = σ²
• E(MSA) = σ² + b(Σi αi²)/(a-1)
• E(MSB) = σ² + a(Σj βj²)/(b-1)
• Here, αi and βj are defined with the usual factor effects constraints

KNNL Example

• KNNL p 882
• Y is the premium for auto insurance
• A is the size of the city, a=3 levels: small, medium and large
• B is the region, b=2 levels: East, West
• n=1: the response is the premium charged by a particular company

The data

data a2;
  infile 'c:\...\CH20TA02.txt';
  input premium size region;
  if size=1 then sizea='1_small ';
  if size=2 then sizea='2_medium';
  if size=3 then sizea='3_large ';
proc print data=a2; run;

Output

Obs   premium   size   region   sizea
1     140       1      1        1_small
2     100       1      2        1_small
3     210       2      1        2_medium
4     180       2      2        2_medium
5     220       3      1        3_large
6     200       3      2        3_large

Proc glm

proc glm data=a2;
  class sizea region;
  model premium=sizea region/solution;
  means sizea region / tukey;
  output out=a3 p=muhat;
run;

Output

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3      10650.00000    3550.00000     71.00   0.0139
Error              2        100.00000      50.00000
Corrected Total    5      10750.00000

Output

Source   DF   Type III SS   Mean Square   F Value   Pr > F
sizea     2   9300.000000   4650.000000     93.00   0.0106
region    1   1350.000000   1350.000000     27.00   0.0351

Differences in both city size and region
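With only six observations, the sums of squares in these tables are easy to reproduce by hand. The following Python sketch (an illustration, not the SAS computation itself) applies the balanced-design formulas for the additive model to the six premiums in the printed data set.

```python
# Premiums from the printed data set: rows = city size (small, medium, large),
# columns = region (1, 2); one observation per cell.
y = [[140, 100],
     [210, 180],
     [220, 200]]
a, b = len(y), len(y[0])

grand = sum(sum(row) for row in y) / (a * b)
row_means = [sum(row) / b for row in y]                              # city size means
col_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]   # region means

ssa = b * sum((m - grand) ** 2 for m in row_means)
ssb = a * sum((m - grand) ** 2 for m in col_means)
sst = sum((y[i][j] - grand) ** 2 for i in range(a) for j in range(b))
sse = sst - ssa - ssb          # what is left over, on (a-1)(b-1) df

mse = sse / ((a - 1) * (b - 1))
f_a = (ssa / (a - 1)) / mse
f_b = (ssb / (b - 1)) / mse

print(ssa, ssb, sse, sst)
print(f_a, f_b)
```

The error line here is the variation that a model with interaction would have assigned to the interaction term, which is why it carries (a-1)(b-1) = 2 df.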

Output solution

Parameter         Estimate        Standard Error   t Value   Pr > |t|
Intercept         195.0000000 B       5.77350269     33.77     0.0009
sizea 1_small     -90.0000000 B       7.07106781    -12.73     0.0061
sizea 2_medium    -15.0000000 B       7.07106781     -2.12     0.1679
sizea 3_large       0.0000000 B                .         .          .
region 1           30.0000000 B       5.77350269      5.20     0.0351
region 2            0.0000000 B                .         .          .

Mathematical Check

region   sizea   muhat
1        1_s     135 = 195 - 90 + 30
2        1_s     105 = 195 - 90
1        2_m     210 = 195 - 15 + 30
2        2_m     180 = 195 - 15
1        3_l     225 = 195 + 30
2        3_l     195 = 195
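The estimates above use the SAS GLM constraints (last level of each factor set to 0), while the factor effects model is usually written with sum-to-zero constraints. As a rough Python illustration (the dictionary names are made up for this sketch), centering each set of SAS estimates gives the sum-to-zero version, and both parameterizations produce the same fitted values.

```python
# SAS-style estimates from the solution output (last level of each factor is 0).
intercept = 195.0
size_sas = {"1_small": -90.0, "2_medium": -15.0, "3_large": 0.0}
region_sas = {1: 30.0, 2: 0.0}

# Convert to sum-to-zero constraints by centering each set of effects
# and moving the averages into the overall mean.
size_bar = sum(size_sas.values()) / len(size_sas)
region_bar = sum(region_sas.values()) / len(region_sas)
mu = intercept + size_bar + region_bar
alpha = {k: v - size_bar for k, v in size_sas.items()}
beta = {k: v - region_bar for k, v in region_sas.items()}

fitted = {}
for s in size_sas:
    for r in region_sas:
        fit_sas = intercept + size_sas[s] + region_sas[r]
        fit_stz = mu + alpha[s] + beta[r]
        assert abs(fit_sas - fit_stz) < 1e-9   # same fitted value either way
        fitted[(r, s)] = fit_sas

print(mu, alpha, beta)
print(fitted)
```

Under the sum-to-zero constraints μ is the grand mean of the six premiums and the α and β values are deviations of the marginal means from it; the constraint choice changes the individual parameter estimates but not muhat.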

[Figure: Interaction Plot for premium. Horizontal axis: sizea (1_small, 2_medium, 3_large); vertical axis: premium (100 to 225); one line per region (1, 2).]

Shows model predicted values and observations. That’s why the “curves” are parallel

Multiple comparisons: Size

Means with the same letter are not significantly different.

Tukey Grouping   Mean      N   sizea
A                210.000   2   3_large
A
A                195.000   2   2_medium
B                120.000   2   1_small

Multiple comparisons: Region

The anova results told us that these were different. No need for a multiple comparison adjustment when there are only two levels

Means with the same letter are not significantly different.

Tukey Grouping   Mean      N   region
A                190.000   3   1
B                160.000   3   2

Plot the data

symbol1 v='E' i=join c=blue;
symbol2 v='W' i=join c=green;
title1 'Plot of the data';
proc gplot data=a3;
  plot premium*sizea=region;
run;

[Figure: plot of the data (premium vs sizea, by region)]

Possible interaction?

Tukey test for additivity

• Can’t look at the usual interaction, but can add one additional term (θ) to the model to look at a specific type of interaction
• μij = μ + αi + βj + θαiβj
• We use one degree of freedom to estimate θ
• There are other variations, e.g., θiβj

Step 1: Get muhat (grand mean)

proc glm data=a2;
  model premium=;
  output out=aall p=muhat;
proc print data=aall;
run;

Output

Obs   premium   size   region   muhat
1     140       1      1        175
2     100       1      2        175
3     210       2      1        175
4     180       2      2        175
5     220       3      1        175
6     200       3      2        175

Step 2: Get muhatA (factor A means)

proc glm data=a2;
  class size;
  model premium=size;
  output out=aA p=muhatA;
proc print data=aA;
run;

Output

Obs   premium   size   region   muhatA
1     140       1      1        120
2     100       1      2        120
3     210       2      1        195
4     180       2      2        195
5     220       3      1        210
6     200       3      2        210

Step 3: Find muhatB (factor B means)

proc glm data=a2;
  class region;
  model premium=region;
  output out=aB p=muhatB;
proc print data=aB;
run;

Output

Obs   premium   size   region   muhatB
1     140       1      1        190
2     100       1      2        160
3     210       2      1        190
4     180       2      2        160
5     220       3      1        190
6     200       3      2        160

Step 4: Combine data sets and compute

* one-to-one merge with no BY statement: relies on the three data sets having the same row order;
data a4; merge aall aA aB;
  alpha=muhatA-muhat;
  beta=muhatB-muhat;
  atimesb=alpha*beta;
proc print data=a4;
  var size region atimesb;
run;

Output

Obs   size   region   atimesb
1     1      1        -825
2     1      2         825
3     2      1         300
4     2      2        -300
5     3      1         525
6     3      2        -525
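Steps 1 through 4 amount to computing the grand mean, the two sets of marginal means, and the product of their deviations. A compact Python sketch of the same arithmetic (an illustration only; the SAS steps above are what produced the output) is:

```python
# Premium indexed by (size, region), copied from the printed data set.
premium = {(1, 1): 140, (1, 2): 100,
           (2, 1): 210, (2, 2): 180,
           (3, 1): 220, (3, 2): 200}
sizes, regions = [1, 2, 3], [1, 2]

muhat = sum(premium.values()) / len(premium)                                     # Step 1
muhat_a = {s: sum(premium[s, r] for r in regions) / len(regions) for s in sizes}  # Step 2
muhat_b = {r: sum(premium[s, r] for s in sizes) / len(sizes) for r in regions}    # Step 3

# Step 4: alpha = muhatA - muhat, beta = muhatB - muhat, atimesb = alpha * beta
atimesb = {(s, r): (muhat_a[s] - muhat) * (muhat_b[r] - muhat)
           for s in sizes for r in regions}

for key, value in atimesb.items():
    print(key, value)
```

The resulting column is the single extra regressor that carries θ in the Tukey model.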

Step 5: Run glm

proc glm data=a4;
  class size region;
  model premium=size region atimesb/solution;
  output out=a5 p=tukmuhat;
run;

Output

Source    DF   F Value   Pr > F
size       2    360.37   0.0372
region     1    104.62   0.0620
atimesb    1      6.75   0.2339

Not significant: not enough evidence of this type of interaction, but this is a really small study (1 df error)
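The F value for atimesb can also be obtained from the closed-form version of Tukey's one degree of freedom test. The sketch below is a Python illustration under the assumption that the standard formulas apply (θ̂ = ΣΣ αi βj Yij / (Σαi² Σβj²) and SS for non-additivity = (ΣΣ αi βj Yij)² / (Σαi² Σβj²), with αi and βj estimated by deviations of marginal means from the grand mean); it is not the SAS computation.

```python
y = [[140, 100], [210, 180], [220, 200]]   # rows = size, columns = region
a, b = 3, 2

grand = sum(map(sum, y)) / (a * b)
alpha = [sum(row) / b - grand for row in y]
beta = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]

num = sum(alpha[i] * beta[j] * y[i][j] for i in range(a) for j in range(b))
den = sum(x * x for x in alpha) * sum(x * x for x in beta)
theta_hat = num / den            # estimated coefficient on alpha_i * beta_j
ss_nonadd = num ** 2 / den       # 1 df sum of squares for non-additivity

# Error SS from the additive model, then remove the non-additivity piece
sse_additive = sum((y[i][j] - grand - alpha[i] - beta[j]) ** 2
                   for i in range(a) for j in range(b))
ss_rem = sse_additive - ss_nonadd
df_rem = (a - 1) * (b - 1) - 1   # one df left for error
f_stat = ss_nonadd / (ss_rem / df_rem)

print(round(theta_hat, 4), round(ss_nonadd, 2), round(ss_rem, 2), round(f_stat, 2))
```

With a single error degree of freedom the test has very little power, so a nonsignificant result here says little about whether the additive model is adequate.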

Output

Par       Est        SE      t        P
Int       195 B      2.9     66.49    0.0096
size1     -90 B      3.5    -25.05    0.0254
size2     -15 B      3.5     -4.18    0.1496
size3       0 B      .        .       .
region1    30 B      2.9     10.23    0.0620
region2     0 B      .        .       .
atimesb   -0.006     0.002   -2.60    0.2339

Does this look more like the means plot?

Last slide

• Read KNNL Chapters 19-20
• We used program topic27.sas to generate the output for today