Post on 15-May-2020

Section 4.1-4.2 Pairwise Comparison & Contrasts

Yibi Huang

Chapter 4 - 1


Analysis of Factor Level Means

- When the F-test is significant, there exist differences among the means. Now what?

- We want to determine which means are different, and to identify treatments whose effects are statistically indistinguishable.


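Notations for Normal and t-Critical Values

In the remainder of the course, we use $z_{\alpha/2}$ to denote the critical value of $N(0,1)$ such that $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$ for $Z \sim N(0,1)$.

[Figure: N(0,1) curve with central area $1-\alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$, and area $\alpha/2$ in each tail.]

We use $t_{\alpha/2,df}$ to denote the critical value such that $P(-t_{\alpha/2,df} < T < t_{\alpha/2,df}) = 1 - \alpha$, where $T$ has a t-distribution with $df$ degrees of freedom.

[Figure: $t_{df}$ curve with central area $1-\alpha$ between $-t_{\alpha/2,df}$ and $t_{\alpha/2,df}$, and area $\alpha/2$ in each tail.]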
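As a small illustration of this notation, the sketch below looks up both critical values in R and checks that the central area is $1-\alpha$. The choices $\alpha = 0.05$ and df = 18 are only illustrative, and the variable names are ours.

```r
alpha <- 0.05

# z_{alpha/2}: the point with upper-tail area alpha/2 under N(0, 1)
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)

# t_{alpha/2, df}: the same idea for a t-distribution (df = 18 chosen for illustration)
t_crit <- qt(alpha / 2, df = 18, lower.tail = FALSE)

# The area between -crit and +crit should be 1 - alpha in both cases
central_z <- pnorm(z_crit) - pnorm(-z_crit)
central_t <- pt(t_crit, df = 18) - pt(-t_crit, df = 18)

print(c(z_crit = z_crit, t_crit = t_crit))
print(c(central_z = central_z, central_t = central_t))
```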


Confidence Interval (CI) for One-Sample Mean (Review)

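Recall that for a one-sample problem, $X_1, X_2, \ldots, X_n$ are i.i.d. from a population with unknown mean $\mu$ and SD $\sigma$.

If $\sigma$ is known, a $(1-\alpha)100\%$ CI for $\mu$ is given by

$$\bar{X} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$$

If $\sigma$ is unknown and is estimated with the sample SD $s$, a $(1-\alpha)100\%$ CI for $\mu$ is given by

$$\bar{X} \pm t_{\alpha/2,\,n-1} \times \frac{s}{\sqrt{n}}$$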


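Confidence Interval for Treatment Means

For the one-way ANOVA model

$$y_{ij} = \mu_i + \varepsilon_{ij},$$

a naive $100(1-\alpha)\%$ CI for $\mu_i$ is

$$\bar{y}_{i\bullet} \pm t_{\alpha/2,\,n_i-1} \times \frac{s_i}{\sqrt{n_i}}, \quad \text{where } s_i = \text{sample SD of the } i\text{th group}$$

- It uses only the data in the $i$th group and ignores the rest.

- It is not optimal.

A better $100(1-\alpha)\%$ CI for $\mu_i$ is

$$\bar{y}_{i\bullet} \pm t_{\alpha/2,\,N-g} \frac{\sqrt{\mathrm{MSE}}}{\sqrt{n_i}}$$

- It uses observations in all groups to estimate the unknown $\sigma$.

- It has a higher df, $N - g$ rather than $n_i - 1$.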


Case Study: Grass/Weed Competition

Treatment                  1N      1Y      2N      3N      4N      4Y
Mean $\bar{y}_{i\bullet}$  95      82.25   81.5    68.25   50.5    52

MSE = 17.97

As $N = 24$ and $g = 6$, for $\alpha = 0.05$ the t-critical value is $t_{0.05/2,\,24-6} \approx 2.101$.

> qt(0.05/2, df = 24-6, lower.tail=F)

[1] 2.100922

The 95% CI for $\mu_{4Y}$ is

$$\bar{y}_{i\bullet} \pm t_{\alpha/2,\,N-g} \frac{\sqrt{\mathrm{MSE}}}{\sqrt{n_i}} = 52 \pm 2.101 \times \frac{\sqrt{17.97}}{\sqrt{4}} \approx 52 \pm 4.45$$

Interpretation: For plots that received 800 mg N/kg soil and 1 cm of irrigation per week, we estimate that 52.0% of living material is bluestem on average one year after the plot is seeded with quack grass, with a margin of error of 4.45% at 95% confidence.
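The interval for $\mu_{4Y}$ can be reproduced from the summary numbers alone. The following is a minimal sketch that plugs the slide's values (MSE = 17.97, $N = 24$, $g = 6$, $n_i = 4$, group mean 52) into the pooled-MSE formula; the variable names are ours, not part of the course code.

```r
mse  <- 17.97  # mean squared error from the one-way ANOVA
N    <- 24     # total number of observations
g    <- 6      # number of treatments
n_i  <- 4      # number of observations in group 4Y
ybar <- 52     # sample mean of group 4Y

# Critical value with the pooled degrees of freedom N - g
t_crit <- qt(0.05 / 2, df = N - g, lower.tail = FALSE)

# Margin of error and 95% CI based on sqrt(MSE) instead of the group SD
margin <- t_crit * sqrt(mse) / sqrt(n_i)
ci <- c(lower = ybar - margin, upper = ybar + margin)

print(round(margin, 2))
print(round(ci, 2))
```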


If Analyzing Like One-Sample Data...

Treatment                  1N      1Y      2N      3N      4N      4Y
Mean $\bar{y}_{i\bullet}$  95      82.25   81.5    68.25   50.5    52
SD $s_i$                   2.160   3.775   3.512   5.560   5.000   4.546

MSE = 17.97

If using only the data in Group 4Y, the critical value would be $t_{\alpha/2,\,n_{4Y}-1} = t_{0.05/2,\,4-1} \approx 3.182$.

> qt(0.05/2, df = 4-1, lower.tail=F)

[1] 3.182446

The naive 95% CI for $\mu_{4Y}$ would be

$$\bar{y}_{4Y\bullet} \pm t_{\alpha/2,\,n_{4Y}-1} \frac{s_{4Y}}{\sqrt{n_{4Y}}} \approx 52 \pm 3.182 \times \frac{4.546}{\sqrt{4}} \approx 52 \pm 7.23.$$

Observe that the margin of error 7.23% for the naive CI is bigger than the margin of error 4.45% for the CI on the previous page.
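To see where the gap between the two margins comes from, the sketch below computes both from the summary statistics on the slides. It is only an illustration with variable names of our choosing; it separates the effect of the critical value (3 df versus 18 df) from the effect of the SD estimate.

```r
n    <- 4       # observations in group 4Y
s_4Y <- 4.546   # sample SD of group 4Y
mse  <- 17.97   # pooled MSE from all 6 groups
N    <- 24
g    <- 6

# Naive interval: group SD with n - 1 = 3 degrees of freedom
t_naive      <- qt(0.025, df = n - 1, lower.tail = FALSE)
naive_margin <- t_naive * s_4Y / sqrt(n)

# Pooled interval: sqrt(MSE) with N - g = 18 degrees of freedom
t_pooled      <- qt(0.025, df = N - g, lower.tail = FALSE)
pooled_margin <- t_pooled * sqrt(mse) / sqrt(n)

print(round(c(t_naive = t_naive, t_pooled = t_pooled), 3))
print(round(c(naive = naive_margin, pooled = pooled_margin), 2))
```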


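Confidence Intervals for Pairwise Differences

The $100(1-\alpha)\%$ confidence interval (C.I.) for $\mu_i - \mu_k$ is

$$\bar{y}_{i\bullet} - \bar{y}_{k\bullet} \pm t_{\alpha/2,\,N-g} \sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_k}\right)}.$$

Note this is NOT the two-sample CI assuming equal SDs,

$$\bar{y}_{i\bullet} - \bar{y}_{k\bullet} \pm t_{\alpha/2,\,n_i+n_k-2} \sqrt{s_p^2\left(\frac{1}{n_i} + \frac{1}{n_k}\right)}, \quad \text{where } s_p^2 = \frac{(n_i-1)s_i^2 + (n_k-1)s_k^2}{n_i + n_k - 2},$$

NOR the two-sample CI not assuming equal SDs,

$$\bar{y}_{i\bullet} - \bar{y}_{k\bullet} \pm t_{\alpha/2,\,df} \sqrt{\frac{s_i^2}{n_i} + \frac{s_k^2}{n_k}}.$$

- MSE, calculated using the entire dataset, is a more accurate estimator of $\sigma^2$ than $s_p^2$ or $s_i^2$, $s_k^2$, which are calculated using only the data in the two groups compared.

- The critical value for the two-sample C.I. is larger:

$$t_{\alpha/2,\,n_i+n_k-2} > t_{\alpha/2,\,N-g}$$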
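The inequality between the two critical values is easy to check numerically. The sketch below assumes the Grass/Weed layout used elsewhere in these slides ($n_i = n_k = 4$, $N = 24$, $g = 6$); the variable names are ours.

```r
n_i <- 4
n_k <- 4
N   <- 24
g   <- 6

# Two-sample (equal-SD) critical value: df = n_i + n_k - 2 = 6
t_two_sample <- qt(0.025, df = n_i + n_k - 2, lower.tail = FALSE)

# ANOVA-based critical value: df = N - g = 18
t_anova <- qt(0.025, df = N - g, lower.tail = FALSE)

print(round(c(two_sample = t_two_sample, anova = t_anova), 3))
```

With fewer degrees of freedom the t-distribution has heavier tails, so the two-sample critical value is the larger one whenever $N - g > n_i + n_k - 2$.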


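Hypothesis Testing for Difference

For testing the hypothesis $H_0: \mu_i - \mu_k = 0$, the test statistic is

$$t_0 = \frac{\bar{y}_{i\bullet} - \bar{y}_{k\bullet}}{SE(\bar{y}_{i\bullet} - \bar{y}_{k\bullet})} = \frac{\bar{y}_{i\bullet} - \bar{y}_{k\bullet}}{\sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_k}\right)}}$$

which has a t-distribution with d.f. $= N - g$ under $H_0$.

The calculation of the p-value depends on $H_a$ as follows:

$H_a$      $\mu_i - \mu_k \neq 0$    $\mu_i - \mu_k < 0$    $\mu_i - \mu_k > 0$
p-value    $2P(T > |t_0|)$           $P(T < t_0)$           $P(T > t_0)$

[Figure: three bell curves with the corresponding areas shaded: both tails beyond $\pm|t|$, the lower tail below $t$, and the upper tail above $t$.]

The bell curve above is the t-curve with df $= N - g$.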


Case Study: Grass/Weed Competition

Treatment                  1N      1Y      2N      3N      4N      4Y
Mean $\bar{y}_{i\bullet}$  95      82.25   81.5    68.25   50.5    52
SD $s_i$                   2.160   3.775   3.512   5.560   5.000   4.546

MSE = 17.97

A 95% confidence interval for $\mu_{1N} - \mu_{1Y}$ is

$$\bar{y}_{1N\bullet} - \bar{y}_{1Y\bullet} \pm t_{0.025,18} \times \sqrt{\mathrm{MSE}\left(\frac{1}{n_{1N}} + \frac{1}{n_{1Y}}\right)} = 95 - 82.25 \pm 2.101 \times \sqrt{17.97\left(\frac{1}{4} + \frac{1}{4}\right)} \approx 12.75 \pm 6.30$$

in which $t_{0.025,18} \approx 2.101$ is found using the R command

> qt(0.05/2, df = 18, lower.tail=F)

[1] 2.100922

With irrigation, we see the percentage of bluestem is reduced by 12.75 percentage points on average, with a margin of error of 6.30 percentage points at the 95% level.
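The arithmetic behind this interval can be checked in a few lines. This is a sketch from the summary statistics only (means 95 and 82.25, MSE = 17.97, four plots per treatment, 18 error df), with variable names of our choosing; the margin of error computed this way comes out near 6.30.

```r
mse <- 17.97
n   <- 4      # plots per treatment
df  <- 18     # N - g = 24 - 6

diff_hat <- 95 - 82.25                    # estimated mu_1N - mu_1Y
se_diff  <- sqrt(mse * (1 / n + 1 / n))   # SE of the difference in means
margin   <- qt(0.025, df = df, lower.tail = FALSE) * se_diff
ci       <- diff_hat + c(-1, 1) * margin

print(round(c(se = se_diff, margin = margin), 3))
print(round(ci, 2))
```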


Case Study: Grass/Weed Competition

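If one wants to test whether treatment 1N and treatment 1Y have the same effect,

$$H_0: \mu_{1N} - \mu_{1Y} = 0 \quad \text{vs.} \quad H_a: \mu_{1N} - \mu_{1Y} \neq 0,$$

the test statistic is

$$t = \frac{\bar{y}_{1N\bullet} - \bar{y}_{1Y\bullet}}{\sqrt{\mathrm{MSE}\left(\frac{1}{n_{1N}} + \frac{1}{n_{1Y}}\right)}} = \frac{95 - 82.25}{\sqrt{17.97\left(\frac{1}{4} + \frac{1}{4}\right)}} \approx \frac{12.75}{2.9975} \approx 4.253$$

with df $= N - g = 24 - 6 = 18$. The two-sided P-value is found with the R command

> 2*pt(4.253, df = 18, lower.tail=F)

which gives approximately 0.0005.

As the P-value is below 0.05, the test agrees with the confidence interval: irrigation makes bluestem less competitive.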
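Carrying the unrounded t-statistic through to the P-value avoids copying a rounded number by hand. The sketch below does this from the summary statistics (means 95 and 82.25, MSE = 17.97, $n = 4$ per group, 18 df); the variable names are ours.

```r
mse <- 17.97
n   <- 4
df  <- 18

# t-statistic for H0: mu_1N - mu_1Y = 0, using the pooled MSE
t_stat <- (95 - 82.25) / sqrt(mse * (1 / n + 1 / n))

# Two-sided P-value: twice the upper-tail area beyond |t|
p_two_sided <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

print(round(t_stat, 3))
print(signif(p_two_sided, 2))
```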


Pairwise t-Tests in R

The R command pairwise.t.test can perform pairwise comparisons between all pairs of treatments, but it shows the P-values only.

> pairwise.t.test(grass$percent, grass$trt, p.adjust="none")

Pairwise comparisons using t tests with pooled SD

data: grass$percent and grass$trt

   1N      1Y      2N      3N      4N
1Y 0.00048 -       -       -       -
2N 0.00027 0.80527 -       -       -
3N 5.0e-08 0.00019 0.00033 -       -
4N 1.5e-11 3.7e-09 5.3e-09 1.3e-05 -
4Y 2.7e-11 7.8e-09 1.1e-08 3.8e-05 0.62287

P value adjustment method: none

Note that we must include p.adjust="none" in the command. Otherwise the P-values are adjusted for multiple comparisons (Holm's method by default) and are no longer the raw t-test P-values.


Underline Diagrams (p.88, Section 5.4.1)

A concise way to summarize pairwise comparisons.

   1N      1Y      2N      3N      4N
1Y 0.00048 -       -       -       -
2N 0.00027 0.80527 -       -       -
3N 5.0e-08 0.00019 0.00033 -       -
4N 1.5e-11 3.7e-09 5.3e-09 1.3e-05 -
4Y 2.7e-11 7.8e-09 1.1e-08 3.8e-05 0.62287

How to make an underline diagram?

1. Write out the treatment labels horizontally, sorted in increasing order of the treatment means.

2. (Write the group mean $\bar{y}_{i\bullet}$ under each corresponding treatment label.) (This step may be skipped.)

3. Draw a line segment under a group of treatments if no pair of treatments in that group is significantly different.

4N     4Y     3N     2N     1Y     1N
50.5   52     68.25  81.5   82.25  95
---------            ------------


Underline Diagrams

In an experiment with 5 treatments A, B, C, D and E, the underline diagram for all pairwise comparisons of the 5 treatments is as follows.

C     B     A     D     E
      -------------
                  -------

Answer the following questions:

- Order the means of the 5 groups from low to high. C < B < A < D < E

- Check all the pairs that are significantly different from each other.

     C    B    A    D
B    ✓
A    ✓
D    ✓
E    ✓    ✓    ✓


Least Significant Difference (LSD)

- It is a lot of work to compare all pairs of treatments. One needs to compute the SE, the t-statistic, and the P-value for each pair of treatments. When there are $g$ treatments, there are $\binom{g}{2} = g(g-1)/2$ pairs to compare.

- When all groups are of the same size $n$, an easier way to do pairwise comparisons of all treatments is to compute the least significant difference (LSD), which is the minimum amount by which two means must differ in order to be considered statistically different.


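Least Significant Difference (LSD)

- When all groups are of the same size $n$, the SEs of all pairwise comparisons are equal to

$$SE = \sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{1}{n}\right)}$$

- To be significant at level $\alpha$, the t-statistic for a pairwise comparison

$$t = \frac{\bar{y}_{j\bullet} - \bar{y}_{i\bullet}}{SE}$$

must be at least $t_{\alpha/2,\,N-g}$ in absolute value.

- So treatments $i$ and $j$ are significantly different at level $\alpha$ if and only if their difference in means $\bar{y}_{j\bullet} - \bar{y}_{i\bullet}$ is at least $t_{\alpha/2,\,N-g}\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{1}{n}\right)}$ in absolute value. So

$$\mathrm{LSD} = t_{\alpha/2,\,N-g} \sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{1}{n}\right)}.$$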


Least Significant Difference (LSD)

For the Grass/Weed experiment, the critical value for 5% significance is $t_{\alpha/2,\,N-g} = t_{0.025,\,24-6} \approx 2.101$, so the LSD at the 5% level is

$$\mathrm{LSD} = t_{\alpha/2,\,N-g} \sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{1}{n}\right)} = 2.101 \sqrt{17.97\left(\frac{1}{4} + \frac{1}{4}\right)} \approx 6.30$$

Two treatments are significantly different at the 5% level if and only if their means differ by 6.30 or more.

The only two pairs with no significant difference are (4N, 4Y) and (2N, 1Y), as they are the only pairs whose means differ by less than 6.30.

4N     4Y     3N     2N     1Y     1N
50.5   52     68.25  81.5   82.25  95
---------            ------------
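The LSD screening above can be sketched in R using only the six treatment means and the MSE from the slides. The helper names below are ours; `combn` from base R enumerates all $g(g-1)/2 = 15$ pairs, and the pairs whose means differ by less than the LSD are the ones that would share an underline.

```r
means <- c("1N" = 95, "1Y" = 82.25, "2N" = 81.5,
           "3N" = 68.25, "4N" = 50.5, "4Y" = 52)
mse <- 17.97
n   <- 4
N   <- 24
g   <- 6

# Least significant difference at the 5% level (equal group sizes)
lsd <- qt(0.025, df = N - g, lower.tail = FALSE) * sqrt(mse * (1 / n + 1 / n))

# All pairs of treatment labels, one pair per column
pairs <- combn(names(means), 2)

# Absolute difference in means for each pair, labelled like "1N-1Y"
abs_diff <- abs(means[pairs[1, ]] - means[pairs[2, ]])
names(abs_diff) <- paste(pairs[1, ], pairs[2, ], sep = "-")

# Pairs that are NOT significantly different at the 5% level
not_sig <- names(abs_diff)[abs_diff < lsd]

print(round(lsd, 2))
print(not_sig)
```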


Questions of Interest Other Than Pairwise Comparisons

For the Grass/Weed Competition experiment, we are interested in

- Irrigation effect: $\mu_{1N} - \mu_{1Y}$ or $\mu_{4N} - \mu_{4Y}$ or the combination

$$\frac{\mu_{1Y} + \mu_{4Y}}{2} - \frac{\mu_{1N} + \mu_{4N}}{2}$$

- Nitrogen effect: $\mu_{1N} - \mu_{2N}$, $\mu_{2N} - \mu_{3N}$, etc.

- Does the irrigation effect change with nitrogen levels?

$$\underbrace{(\mu_{1N} - \mu_{1Y})}_{\text{irrigation effect when nitrogen level = 1}} - \underbrace{(\mu_{4N} - \mu_{4Y})}_{\text{irrigation effect when nitrogen level = 4}}$$

- Is the nitrogen effect linear?

$$\underbrace{(\mu_{1N} - \mu_{2N})}_{\text{slope when nitrogen level changes from 1 to 2}} - \underbrace{(\mu_{2N} - \mu_{3N})}_{\text{slope when nitrogen level changes from 2 to 3}}$$


Contrasts

All the quantities above are contrasts. A contrast is a linear combination of group means $\mu_i$'s

$$C = \sum_{i=1}^{g} \omega_i \mu_i$$

where the $\omega_i$'s are known coefficients that add up to 0: $\sum_{i=1}^{g} \omega_i = 0$.

- Every pairwise comparison is a contrast! For $C = \mu_i - \mu_k$, $\omega_i = 1$, $\omega_k = -1$, all other $\omega$'s are 0, and $\sum_{i=1}^{g} \omega_i = 1 + (-1) + 0 + \cdots + 0 = 0$.

- A single treatment mean $C = \mu_i$ is NOT a contrast.

- Is $C = \dfrac{\mu_1 + \mu_2}{2} - \dfrac{\mu_3 + \mu_4 + \mu_5}{3}$ a contrast? Yes. $\omega_1 = \omega_2 = \frac{1}{2}$, $\omega_3 = \omega_4 = \omega_5 = -\frac{1}{3}$, which add up to 0.

- Is $C = \dfrac{\mu_1 + \mu_2}{2} - \dfrac{\mu_3 + \mu_4}{2} + \mu_5$ a contrast? No. $\omega_1 = \omega_2 = \frac{1}{2}$, $\omega_3 = \omega_4 = -\frac{1}{2}$, $\omega_5 = 1$, which add up to 1, not 0.
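The four examples above can be checked mechanically by summing the coefficient vectors. The helper `is_contrast` below is a hypothetical name introduced for illustration, and the small tolerance is an assumption made to absorb floating-point rounding in fractions such as 1/3.

```r
# Returns TRUE when the coefficients sum to zero (up to rounding error)
is_contrast <- function(w, tol = 1e-12) {
  abs(sum(w)) < tol
}

w_pairwise <- c(1, -1, 0, 0, 0)                  # mu_1 - mu_2
w_single   <- c(1, 0, 0, 0, 0)                   # a single mean mu_1
w_yes      <- c(1/2, 1/2, -1/3, -1/3, -1/3)      # (mu1+mu2)/2 - (mu3+mu4+mu5)/3
w_no       <- c(1/2, 1/2, -1/2, -1/2, 1)         # (mu1+mu2)/2 - (mu3+mu4)/2 + mu5

print(c(pairwise = is_contrast(w_pairwise),
        single   = is_contrast(w_single),
        yes      = is_contrast(w_yes),
        no       = is_contrast(w_no)))
```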


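Estimator and Confidence Interval for a Contrast

A natural estimator for a contrast $C = \sum_{i=1}^{g} \omega_i \mu_i$ is

$$\hat{C} = \sum_{i=1}^{g} \omega_i \bar{y}_{i\bullet}$$

As $\bar{y}_{1\bullet}, \bar{y}_{2\bullet}, \ldots,$ and $\bar{y}_{g\bullet}$ are independent of each other, we know

$$\mathrm{Var}\left(\sum_{i=1}^{g} \omega_i \bar{y}_{i\bullet}\right) = \sum_{i=1}^{g} \mathrm{Var}(\omega_i \bar{y}_{i\bullet}) = \sum_{i=1}^{g} \omega_i^2 \,\mathrm{Var}(\bar{y}_{i\bullet}) = \sum_{i=1}^{g} \omega_i^2 \frac{\sigma^2}{n_i}.$$

The estimator $\hat{C}$ has standard deviation and standard error

$$SD(\hat{C}) = \sqrt{\sigma^2 \sum_{i=1}^{g} \frac{\omega_i^2}{n_i}}, \qquad SE(\hat{C}) = \sqrt{\mathrm{MSE} \times \sum_{i=1}^{g} \frac{\omega_i^2}{n_i}}$$

A $(1-\alpha)100\%$ confidence interval for the contrast $C$ is

$$\hat{C} \pm t_{\alpha/2,\,N-g} \times SE(\hat{C})$$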
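These formulas translate directly into a short R function. The sketch below is one possible way to organize the computation; `contrast_ci` is a hypothetical helper name of ours, not a function from base R or from the course materials. It is tried on the pairwise contrast $\mu_{1N} - \mu_{1Y}$ from the Grass/Weed summary statistics to show that a pairwise comparison is just a special case.

```r
# Estimate, SE and (1 - alpha) CI for a contrast sum(w * mu),
# given group means ybar, group sizes n, the ANOVA MSE and its df
contrast_ci <- function(w, ybar, n, mse, df, alpha = 0.05) {
  stopifnot(abs(sum(w)) < 1e-12)          # coefficients must sum to 0
  est    <- sum(w * ybar)                 # C-hat
  se     <- sqrt(mse * sum(w^2 / n))      # SE(C-hat)
  t_crit <- qt(alpha / 2, df = df, lower.tail = FALSE)
  c(estimate = est, se = se,
    lower = est - t_crit * se, upper = est + t_crit * se)
}

# Grass/Weed summary statistics; treatment order 1N, 1Y, 2N, 3N, 4N, 4Y
ybar <- c(95, 82.25, 81.5, 68.25, 50.5, 52)
res  <- contrast_ci(w = c(1, -1, 0, 0, 0, 0), ybar = ybar,
                    n = rep(4, 6), mse = 17.97, df = 18)
print(round(res, 2))
```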


Hypothesis Testing for a Contrast

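To test whether a contrast $C$ is 0, $H_0: C = 0$, the test statistic is

$$t = \frac{\hat{C}}{SE(\hat{C})} = \frac{\sum_{i=1}^{g} \omega_i \bar{y}_{i\bullet}}{\sqrt{\mathrm{MSE} \times \sum_{i=1}^{g} \frac{\omega_i^2}{n_i}}} \sim t_{N-g}$$

The calculation of the p-value depends on $H_a$ as follows:

$H_a$      $C \neq 0$         $C < 0$        $C > 0$
p-value    $2P(T > |t|)$      $P(T < t)$     $P(T > t)$

[Figure: three bell curves with the corresponding areas shaded: both tails beyond $\pm|t|$, the lower tail below $t$, and the upper tail above $t$.]

The bell curve above is the t-curve with df $= N - g$.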


Does the Irrigation Effect Change with Nitrogen Levels?

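Treatment                  1N      1Y      2N      3N      4N      4Y
Mean $\bar{y}_{i\bullet}$  95      82.25   81.5    68.25   50.5    52

MSE = 17.97

The contrast we consider is

$$C = \underbrace{(\mu_{1N} - \mu_{1Y})}_{\text{irrigation effect when nitrogen level = 1}} - \underbrace{(\mu_{4N} - \mu_{4Y})}_{\text{irrigation effect when nitrogen level = 4}}$$

in which $(\omega_{1N}, \omega_{1Y}, \omega_{2N}, \omega_{3N}, \omega_{4N}, \omega_{4Y}) = (1, -1, 0, 0, -1, 1)$. The contrast is estimated by

$$\hat{C} = \bar{y}_{1N\bullet} - \bar{y}_{1Y\bullet} - (\bar{y}_{4N\bullet} - \bar{y}_{4Y\bullet}) = 95 - 82.25 - (50.5 - 52) = 14.25$$

with the standard error

$$SE(\hat{C}) = \sqrt{\mathrm{MSE} \sum_{i=1}^{g} \frac{\omega_i^2}{n_i}} = \sqrt{17.97\left(\frac{1^2}{4} + \frac{(-1)^2}{4} + \frac{(-1)^2}{4} + \frac{1^2}{4}\right)} \approx 4.24$$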


Does the Irrigation Effect Change with Nitrogen Levels?

To test whether the irrigation effect changes with nitrogen level, $H_0: C = 0$ vs. $H_a: C \neq 0$, the t-statistic is

$$t = \frac{\hat{C}}{SE(\hat{C})} = \frac{14.25}{4.24} \approx 3.36$$

with $24 - 6 = 18$ degrees of freedom. The two-sided p-value is

> 2*pt(3.36,df=18, lower.tail=F)

[1] 0.003486951

The small P-value indicates that the irrigation effects at nitrogen levels 1 and 4 are significantly different.


Does the Irrigation Effect Change with Nitrogen Levels?

The 95% confidence interval for $C$ is

$$\hat{C} \pm t_{0.025,\,N-g} \times SE(\hat{C}) \approx 14.25 \pm 2.101 \times 4.24 \approx (5.34, 23.16)$$

in which $t_{0.025,\,24-6} \approx 2.101$ is found in R via the command

> qt(0.025,df=18, lower.tail=F)

[1] 2.100922

This means that the irrigation effect (% of bluestem without irrigation − with irrigation) is on average 5.34 to 23.16 percentage points higher at nitrogen level 1 than at level 4, with 95% confidence.
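The whole analysis of this contrast (estimate, standard error, t-test, and confidence interval) can be reproduced from the summary statistics. The sketch below uses the coefficients $(1, -1, 0, 0, -1, 1)$ and carries unrounded intermediate values, so the last digits may differ slightly from the hand calculation; the variable names are ours.

```r
ybar <- c("1N" = 95, "1Y" = 82.25, "2N" = 81.5,
          "3N" = 68.25, "4N" = 50.5, "4Y" = 52)
w    <- c("1N" = 1, "1Y" = -1, "2N" = 0,
          "3N" = 0, "4N" = -1, "4Y" = 1)   # (mu_1N - mu_1Y) - (mu_4N - mu_4Y)
n    <- 4       # plots per treatment
mse  <- 17.97
df   <- 18      # N - g = 24 - 6

c_hat  <- sum(w * ybar)                    # estimated contrast
se     <- sqrt(mse * sum(w^2 / n))         # its standard error
t_stat <- c_hat / se
p_val  <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
ci     <- c_hat + c(-1, 1) * qt(0.025, df = df, lower.tail = FALSE) * se

print(round(c(estimate = c_hat, se = se, t = t_stat), 2))
print(signif(p_val, 2))
print(round(ci, 2))
```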


Is the Nitrogen Effect Linear?

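Treatment                  1N      1Y      2N      3N      4N      4Y
Mean $\bar{y}_{i\bullet}$  95      82.25   81.5    68.25   50.5    52

MSE = 17.97

The contrast we consider is

$$C = (\mu_{1N} - \mu_{2N}) - (\mu_{2N} - \mu_{3N}) = \mu_{1N} - 2\mu_{2N} + \mu_{3N}$$

with the coefficients $(\omega_{1N}, \omega_{2N}, \omega_{3N}) = (1, -2, 1)$. The contrast is estimated by

$$\hat{C} = \bar{y}_{1N\bullet} - 2\bar{y}_{2N\bullet} + \bar{y}_{3N\bullet} = 95 - 2 \times 81.5 + 68.25 = 0.25$$

with the standard error

$$SE(\hat{C}) = \sqrt{\mathrm{MSE} \sum_{i=1}^{g} \frac{\omega_i^2}{n_i}} = \sqrt{17.97\left(\frac{1^2}{4} + \frac{(-2)^2}{4} + \frac{1^2}{4}\right)} \approx 5.19$$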


Is the Nitrogen Effect Linear?

To test whether the nitrogen effect is linear, $H_0: C = 0$ vs. $H_a: C \neq 0$, the t-statistic is

$$t = \frac{\hat{C}}{SE(\hat{C})} = \frac{0.25}{5.19} \approx 0.048$$

with 24− 6 = 18 degrees of freedom. The two-sided p-value is

> 2*pt(0.048,df=18, lower.tail=F)

[1] 0.9622448

Conclusion: There is no strong evidence of a nonlinear nitrogen effect (at levels 1, 2, and 3).

Remark: One can also test linearity at levels 2, 3, and 4 using

$$C = (\mu_{2N} - \mu_{3N}) - (\mu_{3N} - \mu_{4N}) = \mu_{2N} - 2\mu_{3N} + \mu_{4N},$$

which is left as an exercise.
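The linearity test at levels 1, 2, and 3 can be reproduced from the three group means and the MSE. The following sketch uses the coefficients $(1, -2, 1)$; the variable names are ours, and the unrounded t-statistic is used for the P-value, so it may differ in the last digits from the value obtained with $t = 0.048$.

```r
ybar <- c("1N" = 95, "2N" = 81.5, "3N" = 68.25)
w    <- c("1N" = 1, "2N" = -2, "3N" = 1)   # mu_1N - 2*mu_2N + mu_3N
n    <- 4       # plots per treatment
mse  <- 17.97
df   <- 18      # N - g = 24 - 6 (MSE still pools all six groups)

c_hat  <- sum(w * ybar)                    # estimated contrast
se     <- sqrt(mse * sum(w^2 / n))         # its standard error
t_stat <- c_hat / se
p_val  <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

print(round(c(estimate = c_hat, se = se, t = t_stat), 3))
print(round(p_val, 3))
```

The same lines apply to the exercise at levels 2, 3, and 4 once `ybar` is replaced by the means of 2N, 3N, and 4N.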