comparing k > 2 groups - numeric responses

Comparing k > 2 Groups - Numeric Responses

• Extension of Methods used to Compare 2 Groups• Parallel Groups and Crossover Designs• Normal and non-normal data structures

DataDesign

Normal Non-normal

ParallelGroups(CRD)

F-Test1-WayANOVA

Kruskal-Wallis Test

Crossover(RBD)

F-Test2-WayANOVA

Friedman’sTest

Parallel Groups - Completely Randomized Design (CRD)

• Controlled Experiments - Subjects assigned at random to one of the k treatments to be compared

• Observational Studies - Subjects are sampled from k existing groups

• Statistical model Yij is a subject from group i:

ijiijiijY

where is the overall mean, i is the effect of treatment i , ij is a random error, and i is the population mean for group i

1-Way ANOVA for Normal Data (CRD)

• For each group obtain the mean, standard deviation, and sample size:

1

)( 2

i

jiij

ii

jij

i n

yys

n

yy

• Obtain the overall mean and sample size

n

y

nynyn

ynnn i j ijkkk

11

1

Analysis of Variance - Sums of Squares

• Total Variation

1)(1 1

2 ndfyyTotalSS Total

k

i

n

j iji

• Between Group Variation

k

i

n

j

k

i Tiiii kdfyynyySST

1 1 122 1)()(

• Within Group Variation

ETTotal

Ek

i iik

i

n

j iij

dfdfdfSSESSTTotalSS

kndfsnyySSE i

12

1 12 )1()(

Analysis of Variance Table and F-TestSource ofVariation Sum of Squares

Degrres ofFreedom Mean Square F

Treatments SST k-1 MST=SST/(k-1) F=MST/MSEError SSE n-k MSE=SSE/(n-k)Total Total SS n-1

• H0: No differences among Group Means (k=0)

• HA: Group means are not all equal (Not all i are 0)

)(:

)4.(:..

:..

,1,

obs

knkobs

obs

FFPvalP

ATableFFRRMSEMSTFST

Example - Relaxation Music in Patient-Controlled Sedation in Colonoscopy

• Three Conditions (Treatments): – Music and Self-sedation (i = 1)– Self-Sedation Only (i = 2)– Music alone (i = 3)

• Outcomes– Patient satisfaction score (all 3 conditions)– Amount of self-controlled dose (conditions 1 and 2)

Source: Lee, et al (2002)


• Summary Statistics and Sums of Squares Calculations:

Trt (i) ni Mean Std. Dev.1 55 7.8 2.12 55 6.8 2.33 55 7.4 2.3

Total 165 overall mean=7.33 ---

164162275.84046.80929.31162316546.809)3.2)(155()3.2)(155()1.2)(155(

21329.31)33.74.7(55)33.78.6(55)33.78.7(55222

222

Total

E

T

dfTotalSSdfSSE

dfSST


• Analysis of Variance and F-Test for Treatment effects

Source ofVariation Sum of Squares

Degrres ofFreedom Mean Square F

Treatments 31.29 2 15.65 3.13Error 809.46 162 5.00Total 840.75 164

•H0: No differences among Group Means (3=0)

• HA: Group means are not all equal (Not all i are 0)

05.0)13.3(:

)4.(055.3:..

13.300.565.15:..

162,2,05.

FPvalP

ATableFFRR

FST

kobs

obs

Post-hoc Comparisons of Treatments

• If differences in group means are determined from the F-test, researchers want to compare pairs of groups. Three popular methods include:– Dunnett’s Method - Compare active treatments with a

control group. Consists of k-1 comparisons, and utilizes a special table.

– Bonferroni’s Method - Adjusts individual comparison error rates so that all conclusions will be correct at desired confidence/significance level. Any number of comparisons can be made.

– Tukey’s Method - Specifically compares all k(k-1)/2 pairs of groups. Utilizes a special table.

Bonferroni’s Method (Most General)

• Wish to make C comparisons of pairs of groups with simultaneous confidence intervals or 2-sided tests

• Want the overall confidence level for all intervals to be “correct” to be 95% or the overall type I error rate for all tests to be 0.05

• For confidence intervals, construct (1-(0.05/C))100% CIs for the difference in each pair of group means (wider than 95% CIs)

• Conduct each test at =0.05/C significance level (rejection region cut-offs more extreme than when =0.05)

Bonferroni’s Method (Most General)

• Simultaneous CI’s for pairs of group means:

jikncji nn

MSEtyy 11,2/

• If entire interval is positive, conclude i > j

• If entire interval is negative, conclude i < j

• If interval contains 0, cannot conclude i j


• C=3 comparisons: 1 vs 2, 1 vs 3, 2 vs 3. Want all intervals to contain true difference with 95% confidence

• Will construct (1-(0.05/3))100% = 98.33% CIs for differences among pairs of group means

)42.0,62.1(02.1)4.78.6(:32)42.1,62.0(02.1)4.78.7(:31)02.2,02.0(02.1)8.68.7(:21

02.1551

55100.540.211

5500.540.2

162),3(2/05.

3210083.162),3(2/05.

vsvsvs

nnMSEt

nnnMSEzt

ji

Note all intervals contain 0, but first is very close to 0 at lower end

CRD with Non-Normal Data Kruskal-Wallis Test

• Extension of Wilcoxon Rank-Sum Test to k>2 Groups

• Procedure:– Rank the observations across groups from smallest (1)

to largest (n = n1+...+nk), adjusting for ties

– Compute the rank sums for each group: T1,...,Tk . Note that T1+...+Tk = n(n+1)/2

Kruskal-Wallis Test• H0: The k population distributions are identical (1=...=k)

• HA: Not all k distributions are identical (Not all i are equal)

)(:

:..

)1(3)1(

12:..

2

21,

1

2

HPvalP

HRR

nnT

nnHST

k

k

ii

i

Post-hoc comparisons of pairs of groups can be made by pairwise application of rank-sum test with Bonferroni adjustment

Example - Thalidomide for Weight Gain in HIV-1+ Patients with and without TB

• k=4 Groups, n1=n2=n3=n4=8 patients per group (n=32)

• Group 1: TB+ patients assigned Thalidomide

• Group 2: TB- patients assigned Thalidomide

• Group 3: TB+ patients assigned Placebo

• Group 4: TB- patients assigned Placebo

• Response - 21 day weight gains (kg) -- Negative values are weight losses

Source: Klausner, et al (1996)

Example - Thalidomide for Weight Gain in HIV-1+ Patients with and without TB

TB+/Thal TB-/Thal TB+/Plac TB-/Plac9.0 (32) 2.5 (23) 0.0 (9) -0.5 (7)6.0 (31) 3.5 (26.5) 1.0 (15.5) 0.0 (9)4.5 (30) 4.0 (28.5) -1.0 (6) 2.5 (23)

2.0 (20.5) 1.0 (15.5) -2.0 (4) 0.5 (12)2.5 (23) 0.5 (12) -3.0 (1.5) -1.5 (5)3.0 (25) 4.0 (28.5) -3.0 (1.5) 0.0 (9)

1.0 (15.5) 1.5 (18.5) 0.5 (12) 1.0 (15.5)1.5 (18.5) 2.0 (20.5) -2.5 (3) 3.5 (26.5)T1=195.5 T2=173.0 T3=52.5 T4=107.0

815.7:..

98.17)33(38

)0.107(8

)5.52(8

)0.173(8

)5.195()33(32

12:..

214,05.

2222

HRR

HST

Weight Gain Example - SPSS OutputF-Test and Post-Hoc Comparisons

ANOVA

WTGAIN

109.688 3 36.563 10.206 .000100.313 28 3.583210.000 31

Between GroupsWithin GroupsTotal

Sum ofSquares df Mean Square F Sig.

GROUP

TB-/PlaceboTB+/PlaceboTB-/ThalidomideTB+/Thalidomide

Mea

n of

WTG

AIN

4

3

2

1

0

-1

-2

Weight Gain Example - SPSS OutputF-Test and Post-Hoc Comparisons

Multiple Comparisons

Dependent Variable: WTGAIN

1.313 .9464 .518 -1.271 3.8964.938* .9464 .000 2.354 7.5213.000* .9464 .018 .416 5.584

-1.313 .9464 .518 -3.896 1.2713.625* .9464 .003 1.041 6.2091.688 .9464 .302 -.896 4.271

-4.938* .9464 .000 -7.521 -2.354-3.625* .9464 .003 -6.209 -1.041-1.938 .9464 .195 -4.521 .646-3.000* .9464 .018 -5.584 -.416-1.688 .9464 .302 -4.271 .8961.938 .9464 .195 -.646 4.5211.313 .9464 1.000 -1.374 3.9994.938* .9464 .000 2.251 7.6243.000* .9464 .022 .313 5.687

-1.313 .9464 1.000 -3.999 1.3743.625* .9464 .004 .938 6.3121.688 .9464 .512 -.999 4.374

-4.938* .9464 .000 -7.624 -2.251-3.625* .9464 .004 -6.312 -.938-1.938 .9464 .301 -4.624 .749-3.000* .9464 .022 -5.687 -.313-1.688 .9464 .512 -4.374 .9991.938 .9464 .301 -.749 4.624

(J) GROUPTB-/ThalidomideTB+/PlaceboTB-/PlaceboTB+/ThalidomideTB+/PlaceboTB-/PlaceboTB+/ThalidomideTB-/ThalidomideTB-/PlaceboTB+/ThalidomideTB-/ThalidomideTB+/PlaceboTB-/ThalidomideTB+/PlaceboTB-/PlaceboTB+/ThalidomideTB+/PlaceboTB-/PlaceboTB+/ThalidomideTB-/ThalidomideTB-/PlaceboTB+/ThalidomideTB-/ThalidomideTB+/Placebo

(I) GROUPTB+/Thalidomide

TB-/Thalidomide

TB+/Placebo

TB-/Placebo

TB+/Thalidomide

TB-/Thalidomide

TB+/Placebo

TB-/Placebo

Tukey HSD

Bonferroni

MeanDifference

(I-J) Std. Error Sig. Lower Bound Upper Bound95% Confidence Interval

The mean difference is significant at the .05 level.*.

Weight Gain Example - SPSS OutputKruskal-Wallis H-Test

Ranks

8 24.448 21.638 6.568 13.38

32

GROUPTB+/ThalidomideTB-/ThalidomideTB+/PlaceboTB-/PlaceboTotal

WTGAINN Mean Rank

Test Statisticsa,b

18.0703

.000

Chi-SquaredfAsymp. Sig.

WTGAIN

Kruskal Wallis Testa.

Grouping Variable: GROUPb.

Crossover Designs: Randomized Block Design (RBD)

• k > 2 Treatments (groups) to be compared• b individuals receive each treatment (preferably in

random order). Subjects are called Blocks.• Outcome when Treatment i is assigned to Subject j

is labeled Yij

• Effect of Trt i is labeled i

• Effect of Subject j is labeled j

• Random error term is labeled ij

Crossover Designs - RBD

• Model:

ijjiijjiijY

• Test for differences among treatment effects:

• H0: 1k 0 (1k )

• HA: Not all i = 0 (Not all i are equal)

RBD - ANOVA F-Test (Normal Data)• Data Structure: (k Treatments, b Subjects)

• Mean for Treatment i:

• Mean for Subject (Block) j:

• Overall Mean:

• Overall sample size: n = bk

• ANOVA:Treatment, Block, and Error Sums of Squares

.iyjy .

y

)1)(1(

1

1

1

1

2

.

1

2

.

1 1

2

kbdfSSBSSTTotalSSSSE

bdfyykSSB

kdfyybSST

bkdfyyTotalSS

E

Bb

j j

Tk

i i

k

i

b

j Totalij

RBD - ANOVA F-Test (Normal Data)• ANOVA Table:Source SS df MS F

Treatments SST k-1 MST = SST/(k-1) F = MST/MSEBlocks SSB b-1 MSB = SSB/(b-1)Error SSE (b-1)(k-1) MSE = SSE/[(b-1)(k-1)]Total TotalSS bk-1

•H0: 1k 0 (1k )

• HA: Not all i = 0 (Not all i are equal)

)(:

:..

:..

)1)(1(,1,

obs

kbkobs

obs

FFPvalP

FFRRMSEMSTFST

Example - Theophylline Interaction• Goal: Determine whether Cimetidine or Famotidine interact with Theophylline

• 3 Treatments: Theo/Cim, Theo/Fam, Theo/Placebo

• 14 Blocks: Each subject received each treatment

• Response: Theophylline clearance (liters/hour) Subject T/C T/F T/P BLK Mean

1 3.69 5.13 5.88 4.902 3.61 7.04 5.89 5.513 1.15 1.46 1.46 1.364 4.02 4.44 4.05 4.175 1.00 1.15 1.09 1.086 1.75 2.11 2.59 2.157 1.45 2.12 1.69 1.758 2.59 3.25 3.16 3.009 1.57 2.11 2.06 1.91

10 2.34 5.20 4.59 4.0411 1.31 1.98 2.08 1.7912 2.43 2.38 2.61 2.4713 2.33 3.53 3.42 3.0914 2.34 2.33 2.54 2.40

TRT Mean 2.26 3.16 3.08 2.83Source: Bachmann, et al (1995)

Example - Theophylline InteractionTests of Between-Subjects Effects

Dependent Variable: CLRNCE

78.817a 15 5.254 15.888 .000336.713 1 336.713 1018.119 .000

7.005 2 3.503 10.591 .00071.811 13 5.524 16.703 .0008.599 26 .331

424.129 4287.415 41

SourceCorrected ModelInterceptTRTSUBJECTErrorTotalCorrected Total

Type III Sumof Squares df Mean Square F Sig.

R Squared = .902 (Adjusted R Squared = .845)a.

• The test for differences in mean theophylline clearance is given in the third line of the table

•T.S.: Fobs=10.59

• R.R.: Fobs F.05,2,26 = 3.37 (From F-table)

• P-value: .000 (Sig. Level)

Example - Theophylline InteractionPost-hoc Comparisons

Multiple Comparisons

Dependent Variable: CLRNCE

-.9036* .21736 .001 -1.4437 -.3635-.8236* .21736 .002 -1.3637 -.2835.9036* .21736 .001 .3635 1.4437.0800 .21736 .928 -.4601 .6201.8236* .21736 .002 .2835 1.3637

-.0800 .21736 .928 -.6201 .4601-.9036* .21736 .001 -1.4598 -.3474-.8236* .21736 .002 -1.3798 -.2674.9036* .21736 .001 .3474 1.4598.0800 .21736 1.000 -.4762 .6362.8236* .21736 .002 .2674 1.3798

-.0800 .21736 1.000 -.6362 .4762

(J) TRTTheophylline/FamotidineTheophylline/PlaceboTheophylline/CimetidineTheophylline/PlaceboTheophylline/CimetidineTheophylline/FamotidineTheophylline/FamotidineTheophylline/PlaceboTheophylline/CimetidineTheophylline/PlaceboTheophylline/CimetidineTheophylline/Famotidine

(I) TRTTheophylline/Cimetidine

Theophylline/Famotidine

Theophylline/Placebo

Theophylline/Cimetidine

Theophylline/Famotidine


Tukey HSD

Bonferroni

MeanDifference

(I-J) Std. Error Sig. Lower Bound Upper Bound95% Confidence Interval

Based on observed means.The mean difference is significant at the .05 level.*.

514.31:

57.22:

)1)(1(,,05.

)1)(1(,2/

qb

MSEqyyTukey

tb

MSEtyyBonferroni

kbkji

kbcji

Example - Theophylline InteractionPlot of Data (Marginal means are raw data)

Estimated Marginal Means of CLRNCE

SUBJECT

14

13

12

11

10

9

8

7

6

5

4

3

2

1

Est

imat

ed M

argi

nal M

eans

7

6

5

4

3

2

1

0

TRT

Theophylline/Cimetid

ine

Theophylline/Famotid

ine


RBD -- Non-Normal DataFriedman’s Test

• When data are non-normal, test is based on ranks• Procedure to obtain test statistic:

– Rank the k treatments within each block (1=smallest, k=largest) adjusting for ties

– Compute rank sums for treatments (Ti) across blocks

– H0: The k populations are identical (1=...=k)

– HA: Differences exist among the k group means

)(:

:..

)1(3)1(

12:..

2

21,

12

r

kr

k

i ir

FPvalP

FRR

kbTkbk

FST

Example - tmax for 3 formulation/fasting states

• k=3 Treatments of Valproate: Capsule/Fasting (i=1), Capsule/nonfasting (i=2), Enteric-Coated/fasting (i=3)

• b=11 subjects

• Response - Time to maximum concentration (tmax)Subject C/F C/NF EC/F

1 3.5 (2) 4.5 (3) 2.5 (1)2 4.0 (2) 4.5 (3) 3.0 (1)3 3.5 (2) 4.5 (3) 3.0 (1)4 3.0 (1.5) 4.5 (3) 3.0 (1.5)5 3.5 (1.5) 5.0 (3) 3.5 (1.5)6 3.0 (1) 5.5 (3) 3.5 (2)7 4.0 (2.5) 4.0 (2.5) 2.5 (1)8 3.5 (2) 4.5 (3) 3.0 (1)9 3.5 (1.5) 5.0 (3) 3.5 (1.5)

10 3.0 (1) 4.5 (3) 3.5 (2)11 4.5 (2) 6.0 (3) 3.0 (1)

Rank sum T1=19.0 T2=32.5 T3=14.5Source: Carrigan, et al (1990)

Example - tmax for 3 formulation/fasting states

99.5:..

95.15)13)(11(35.145.320.19)13)(3(11

12:..

213,05.

222

r

r

FRR

FST

H0: The k populations are identical (1=...=k)HA: Differences exist among the k group means

comparing k > 2 groups - numeric responses

Documents