Multivariate Analysis of Variance (MANOVA)


Upload: fiona-cash

Post on 03-Jan-2016


Page 1: Multivariate Analysis of Variance (MANOVA)

Multivariate Analysis of Variance (MANOVA)

Page 2: Multivariate Analysis of Variance (MANOVA)

Outline

- Purpose and logic: page 3
- Hypothesis testing: page 6
- Computations: page 11
- F-ratios: page 25
- Assumptions and noncentrality: page 35

Page 3: Multivariate Analysis of Variance (MANOVA)

MANOVA

When?

When a research design contains two or more dependent variables, we could perform multiple univariate tests or one multivariate test.

Why?

- MANOVA does not have the problem of an inflated overall type I error rate.
- Univariate tests ignore the correlations among the variables.
- Multivariate tests are more powerful than multiple univariate tests.

Assumptions

- Multivariate normality
- Absence of outliers
- Homogeneity of variance-covariance matrices
- Linearity
- Absence of multicollinearity
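The inflation of the overall type I error rate can be illustrated numerically. This is a minimal sketch, assuming q independent univariate tests that are each run at the same level alpha; the independence assumption and the function name `familywise_error` are illustrative and do not come from the slides.

```python
def familywise_error(alpha: float, q: int) -> float:
    """Probability of at least one false rejection among q independent tests."""
    return 1.0 - (1.0 - alpha) ** q


# Two dependent variables tested separately at alpha = 0.05
overall = familywise_error(0.05, 2)
print(round(overall, 4))
```

With correlated dependent variables the exact rate differs, but the overall rate still exceeds the nominal alpha of a single test.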

Page 4: Multivariate Analysis of Variance (MANOVA)

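MANOVA

If the independent variables are discrete and the dependent variables are continuous, we will perform a MANOVA.

From MANOVA:

$$y_{ijhr} = \mu_h + \alpha_{ih} + \beta_{jh} + (\alpha\beta)_{ijh} + e_{ijhr}$$

where $\mu$ = grand mean, $\alpha$ = treatment effect 1, $\beta$ = treatment effect 2, $\alpha\beta$ = interaction, $e$ = error.

To GLM:

$$Y = XB + E$$

where

$$B = \begin{bmatrix}
\mu_1, & \alpha_{11}, \alpha_{21}, \ldots, \alpha_{(r-1)1}, & \beta_{11}, \beta_{21}, \ldots, \beta_{(c-1)1}, & (\alpha\beta)_{111}, (\alpha\beta)_{121}, \ldots, (\alpha\beta)_{(r-1)(c-1)1} \\
\mu_2, & \alpha_{12}, \alpha_{22}, \ldots, \alpha_{(r-1)2}, & \beta_{12}, \beta_{22}, \ldots, \beta_{(c-1)2}, & (\alpha\beta)_{112}, (\alpha\beta)_{122}, \ldots, (\alpha\beta)_{(r-1)(c-1)2} \\
\vdots & & & \\
\mu_q, & \alpha_{1q}, \alpha_{2q}, \ldots, \alpha_{(r-1)q}, & \beta_{1q}, \beta_{2q}, \ldots, \beta_{(c-1)q}, & (\alpha\beta)_{11q}, (\alpha\beta)_{12q}, \ldots, (\alpha\beta)_{(r-1)(c-1)q}
\end{bmatrix}^{T}$$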

Page 5: Multivariate Analysis of Variance (MANOVA)

MANOVA

Example (each entry is WeightLoss, Time)

            Drug A     Drug B     Drug C
Male        5, 6       7, 6       21, 15
            5, 4       7, 7       14, 11
            9, 9       9, 12      17, 12
            7, 6       6, 8       12, 10
Female      7, 10      10, 13     16, 12
            6, 6       8, 7       14, 9
            9, 7       7, 6       14, 8
            8, 10      6, 9       10, 5

The general idea behind MANOVA is the same as in the univariate case: we want the ratio of explained variability to unexplained variability (error).

$\alpha$ = treatment effect 1 (rows; r = 2)
$\beta$ = treatment effect 2 (columns; c = 3)

$n_i$ = 4 (observations per cell)
N = r*c*$n_i$ = 24
q = number of DVs = 2 (WeightLoss, Time)

Page 6: Multivariate Analysis of Variance (MANOVA)

Analysis of Variance (ANOVA)

            Drug A     Drug B     Drug C
Male        5, 6       7, 6       21, 15
            5, 4       7, 7       14, 11
            9, 9       9, 12      17, 12
            7, 6       6, 8       12, 10
Female      7, 10      10, 13     16, 12
            6, 6       8, 7       14, 9
            9, 7       7, 6       14, 8
            8, 10      6, 9       10, 5

Hypotheses

- Are the drug mean vectors equal?
- Are the sex mean vectors equal?
- Do some drugs interact with sex to produce inordinately high or low weight decrements?

Page 7: Multivariate Analysis of Variance (MANOVA)

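Analysis of Variance (ANOVA)

Using the GLM approach through a coding matrix (effect coding; the first index is the row factor, sex, and the second is the column factor, drug):

Cell (row, column)    x1    x2    x3    x4    x5
(1, 1)                 1     1     0     1     0
(1, 2)                 1     0     1     0     1
(1, 3)                 1    -1    -1    -1    -1
(2, 1)                -1     1     0    -1     0
(2, 2)                -1     0     1     0    -1
(2, 3)                -1    -1    -1     1     1

x1 codes sex, x2 and x3 code drug, and x4 and x5 code the interaction (x4 = x1*x2, x5 = x1*x3).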
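The coding matrix can be generated rather than typed by hand. This is a minimal sketch, assuming effect coding in which the last level of each factor is coded -1 and the interaction columns are products of the main-effect columns; the dictionary and variable names are illustrative.

```python
import numpy as np

# Effect codes for each factor level (last level coded -1 on every column)
sex_codes = {1: [1], 2: [-1]}                      # r = 2 levels -> 1 column
drug_codes = {1: [1, 0], 2: [0, 1], 3: [-1, -1]}   # c = 3 levels -> 2 columns

rows = []
for sex in (1, 2):
    for drug in (1, 2, 3):
        a = np.array(sex_codes[sex])
        b = np.array(drug_codes[drug])
        ab = np.outer(a, b).ravel()   # interaction codes = products of main-effect codes
        rows.append(np.concatenate([a, b, ab]))

coding = np.array(rows)   # one row per cell, in the order (1,1), (1,2), ..., (2,3)
print(coding)
```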

Page 8: Multivariate Analysis of Variance (MANOVA)

Analysis of Variance (ANOVA)

Then, for each subject, we associate its corresponding group coding.

M =

              X                            Y
Subject    x1    x2    x3    x4    x5     y1    y2
s1          1     1     0     1     0      5     6
s2          1     1     0     1     0      5     4
s3          1     1     0     1     0      9     9
s4          1     1     0     1     0      7     6
s5          1     0     1     0     1      7     6
s6          1     0     1     0     1      7     7
s7          1     0     1     0     1      9    12
s8          1     0     1     0     1      6     8
s9          1    -1    -1    -1    -1     21    15
s10         1    -1    -1    -1    -1     14    11
s11         1    -1    -1    -1    -1     17    12
s12         1    -1    -1    -1    -1     12    10
s13        -1     1     0    -1     0      7    10
s14        -1     1     0    -1     0      6     6
s15        -1     1     0    -1     0      9     7
s16        -1     1     0    -1     0      8    10
s17        -1     0     1     0    -1     10    13
s18        -1     0     1     0    -1      8     7
s19        -1     0     1     0    -1      7     6
s20        -1     0     1     0    -1      6     9
s21        -1    -1    -1     1     1     16    12
s22        -1    -1    -1     1     1     14     9
s23        -1    -1    -1     1     1     14     8
s24        -1    -1    -1     1     1     10     5

$$M = [x_1 : x_2 : \ldots : x_p : y_1 : y_2 : \ldots : y_q]$$

Page 9: Multivariate Analysis of Variance (MANOVA)

Analysis of Variance (ANOVA)

$$M = [x_1 : x_2 : \ldots : x_p : y_1 : y_2 : \ldots : y_q]$$

$$SSCP = M^{T}M - \frac{(\mathbf{1}^{T}M)^{T}(\mathbf{1}^{T}M)}{n} = \begin{bmatrix} S_{pp} & S_{pc} \\ S_{cp} & S_{cc} \end{bmatrix}$$
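The SSCP partition can be reproduced numerically from the example data. This is a minimal sketch in Python with NumPy (an assumption, since the slides name no software); the variable names are illustrative.

```python
import numpy as np

# Effect codes for the six cells: male A, B, C, then female A, B, C
codes = np.array([
    [ 1,  1,  0,  1,  0],
    [ 1,  0,  1,  0,  1],
    [ 1, -1, -1, -1, -1],
    [-1,  1,  0, -1,  0],
    [-1,  0,  1,  0, -1],
    [-1, -1, -1,  1,  1],
], dtype=float)
X = np.repeat(codes, 4, axis=0)   # four subjects per cell -> 24 x 5

# (WeightLoss, Time) for subjects s1..s24
Y = np.array([
    [5, 6], [5, 4], [9, 9], [7, 6],
    [7, 6], [7, 7], [9, 12], [6, 8],
    [21, 15], [14, 11], [17, 12], [12, 10],
    [7, 10], [6, 6], [9, 7], [8, 10],
    [10, 13], [8, 7], [7, 6], [6, 9],
    [16, 12], [14, 9], [14, 8], [10, 5],
], dtype=float)

M = np.hstack([X, Y])                  # M = [x1 : ... : xp : y1 : ... : yq]
n, p = M.shape[0], X.shape[1]
col_sums = np.ones((1, n)) @ M         # 1^T M, shape (1, p + q)
SSCP = M.T @ M - col_sums.T @ col_sums / n

# Partition into predictor (p) and criterion (c) blocks
S_pp, S_pc = SSCP[:p, :p], SSCP[:p, p:]
S_cp, S_cc = SSCP[p:, :p], SSCP[p:, p:]
print(S_cc)
```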

Page 10: Multivariate Analysis of Variance (MANOVA)

Canonical correlation matrix

R is obtained by:

$$R = S_{cp} S_{pp}^{-1} S_{pc} S_{cc}^{-1}$$

$$R = \begin{bmatrix} 0.93673 & -0.349632 \\ 0.2258 & 0.136781 \end{bmatrix}$$

Page 11: Multivariate Analysis of Variance (MANOVA)

Error Matrix (E)

In ANOVA, the error was defined as $e = (1 - R^2) S_{cc}$. This is a special case of the MANOVA error matrix E:

$$E = (I - R) S_{cc}$$

$$E = \begin{bmatrix} 94.5 & 76.5 \\ 76.5 & 114 \end{bmatrix}$$
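The canonical correlation matrix R and the error matrix E of the example can be checked as follows. This is a minimal sketch with NumPy; it centers the columns instead of subtracting the product of column sums, which gives the same SSCP blocks, and all variable names are illustrative.

```python
import numpy as np

codes = np.array([
    [ 1,  1,  0,  1,  0],
    [ 1,  0,  1,  0,  1],
    [ 1, -1, -1, -1, -1],
    [-1,  1,  0, -1,  0],
    [-1,  0,  1,  0, -1],
    [-1, -1, -1,  1,  1],
], dtype=float)
X = np.repeat(codes, 4, axis=0)   # 24 x 5 effect-coded predictors

Y = np.array([
    [5, 6], [5, 4], [9, 9], [7, 6],
    [7, 6], [7, 7], [9, 12], [6, 8],
    [21, 15], [14, 11], [17, 12], [12, 10],
    [7, 10], [6, 6], [9, 7], [8, 10],
    [10, 13], [8, 7], [7, 6], [6, 9],
    [16, 12], [14, 9], [14, 8], [10, 5],
], dtype=float)

# Centered cross-products are the blocks of the SSCP matrix
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
S_pp, S_pc = Xc.T @ Xc, Xc.T @ Yc
S_cp, S_cc = S_pc.T, Yc.T @ Yc

R = S_cp @ np.linalg.inv(S_pp) @ S_pc @ np.linalg.inv(S_cc)
E = (np.eye(2) - R) @ S_cc
print(np.round(R, 5))
print(np.round(E, 2))
```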

Page 12: Multivariate Analysis of Variance (MANOVA)

Hypothesis variation matrix

The total variation is the sum of the hypothesis variations and the error variation, i.e. $T = E + H_{\alpha} + H_{\beta} + H_{\alpha\beta}$. Each matrix $H_i$ is obtained by

$$H_i = Y^{T} M_i (M_i^{T} M_i)^{-1} M_i^{T} Y$$

where $i \in \{\alpha, \beta, \alpha\beta\}$ and $M_i$ holds the coding columns of effect i. The full model is omitted when performing hypothesis testing (we start by testing the interaction, then the main effects, etc.).

Page 13: Multivariate Analysis of Variance (MANOVA)

Hypothesis variation matrix: Interaction

$M_{\alpha\beta}$ consists of the two interaction columns (x4 and x5) of the coding matrix on page 8:

Subjects     x4    x5
s1-s4         1     0
s5-s8         0     1
s9-s12       -1    -1
s13-s16      -1     0
s17-s20       0    -1
s21-s24       1     1

Page 14: Multivariate Analysis of Variance (MANOVA)

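Hypothesis variation matrix: Interaction

$$H_{\alpha\beta} = Y^{T} M_{\alpha\beta} (M_{\alpha\beta}^{T} M_{\alpha\beta})^{-1} M_{\alpha\beta}^{T} Y$$

$$H_{\alpha\beta} = \begin{bmatrix} 14.33 & 21.33 \\ 21.33 & 32.33 \end{bmatrix}$$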
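The interaction hypothesis matrix can be verified directly from the formula. This is a minimal sketch with NumPy that uses only the two interaction columns of the coding matrix; the variable names are illustrative.

```python
import numpy as np

# Interaction codes per cell (male A, B, C; female A, B, C), four subjects each
inter_codes = np.array(
    [[1, 0], [0, 1], [-1, -1], [-1, 0], [0, -1], [1, 1]], dtype=float
)
M_ab = np.repeat(inter_codes, 4, axis=0)   # 24 x 2

# (WeightLoss, Time) for subjects s1..s24
Y = np.array([
    [5, 6], [5, 4], [9, 9], [7, 6],
    [7, 6], [7, 7], [9, 12], [6, 8],
    [21, 15], [14, 11], [17, 12], [12, 10],
    [7, 10], [6, 6], [9, 7], [8, 10],
    [10, 13], [8, 7], [7, 6], [6, 9],
    [16, 12], [14, 9], [14, 8], [10, 5],
], dtype=float)

# H = Y^T M (M^T M)^{-1} M^T Y
H_ab = Y.T @ M_ab @ np.linalg.inv(M_ab.T @ M_ab) @ M_ab.T @ Y
print(np.round(H_ab, 2))
```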

Page 15: Multivariate Analysis of Variance (MANOVA)

Here is the catch!

In the univariate case, the test statistic is based on the F distribution:

$$F = \frac{R^{2}\, df_2}{(1 - R^{2})\, df_1}$$

However, in MANOVA there is no unique statistic. Four statistics are commonly used: Hotelling-Lawley trace (HL), Pillai-Bartlett trace (PB), Wilks' likelihood ratio (W) and Roy's largest root (RLR).

Page 16: Multivariate Analysis of Variance (MANOVA)

Hotelling-Lawley trace (HL)

The HL statistic is defined as

$$HL_i = \mathrm{tr}(H_i E^{-1}) = \sum_{k=1}^{s} \lambda_k$$

where $s = \min(df_i, q)$, i represents the tested effect ($i \in \{\alpha, \beta, \alpha\beta\}$), $df_i$ is the degrees of freedom associated with the hypothesis under investigation ($\alpha$, $\beta$ or $\alpha\beta$) and $\lambda_k$ is the kth eigenvalue extracted from $H_i E^{-1}$.

Page 17: Multivariate Analysis of Variance (MANOVA)

Hotelling-Lawley trace (HL): Interaction

Form $H_{\alpha\beta} E^{-1}$, then extract its eigenvalues $\lambda_k$.

Page 18: Multivariate Analysis of Variance (MANOVA)

Hotelling-Lawley trace (HL): Interaction

df = (r-1)(c-1) = (2-1)(3-1) = 2
s = min(df, q) = min(2, 2) = 2

$$HL = \mathrm{tr}(H E^{-1}) = \sum_{k=1}^{s} \lambda_k$$

Trace:

$$HL = \mathrm{tr}(H E^{-1}) = 0.0004 + 0.2892 = 0.2896$$
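The trace and the eigenvalue sum can be compared numerically. This is a minimal sketch with NumPy that takes E from the slides and the interaction H entered as thirds (43/3, 64/3, 97/3), which is an assumption about the unrounded values behind 14.33, 21.33 and 32.33.

```python
import numpy as np

E = np.array([[94.5, 76.5], [76.5, 114.0]])
H = np.array([[43.0, 64.0], [64.0, 97.0]]) / 3.0   # assumed unrounded interaction H

HE_inv = H @ np.linalg.inv(E)
lam = np.linalg.eigvals(HE_inv).real   # eigenvalues of H E^{-1}

HL = np.trace(HE_inv)                  # sum of the diagonal entries
print(np.round(np.diag(HE_inv), 4))    # the two terms added on the slide
print(round(HL, 4), round(lam.sum(), 4))
```

The diagonal entries of $HE^{-1}$ are not the eigenvalues themselves, but both sets of numbers sum to the same trace.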

Page 19: Multivariate Analysis of Variance (MANOVA)

Pillai-Bartlett trace (PB)

The PB statistic is defined as

$$PB_i = \mathrm{tr}\left(H_i (E + H_i)^{-1}\right) = \sum_{k=1}^{s} \frac{\lambda_k}{1 + \lambda_k}$$

where $s = \min(df_i, q)$, i represents the tested effect ($i \in \{\alpha, \beta, \alpha\beta\}$), $df_i$ is the degrees of freedom associated with the hypothesis under investigation ($\alpha$, $\beta$ or $\alpha\beta$) and $\lambda_k$ is the kth eigenvalue extracted from $H_i E^{-1}$.

Page 20: Multivariate Analysis of Variance (MANOVA)

Pillai-Bartlett trace (PB): Interaction

df = (r-1)(c-1) = (2-1)(3-1) = 2
s = min(df, q) = min(2, 2) = 2

$$PB = \mathrm{tr}\left(H (E + H)^{-1}\right) = \sum_{k=1}^{s} \frac{\lambda_k}{1 + \lambda_k}$$

$$PB = \mathrm{tr}\left(H (E + H)^{-1}\right) = 0.0016 + 0.2253 = 0.2269$$
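Both forms of the PB statistic can be evaluated side by side. This is a minimal sketch with NumPy, again assuming the unrounded interaction H is (43/3, 64/3; 64/3, 97/3).

```python
import numpy as np

E = np.array([[94.5, 76.5], [76.5, 114.0]])
H = np.array([[43.0, 64.0], [64.0, 97.0]]) / 3.0   # assumed unrounded interaction H

# Trace form: tr(H (E + H)^{-1})
PB_trace = np.trace(H @ np.linalg.inv(E + H))

# Eigenvalue form: sum of lambda_k / (1 + lambda_k), lambda_k from H E^{-1}
lam = np.linalg.eigvals(H @ np.linalg.inv(E)).real
PB_eig = np.sum(lam / (1.0 + lam))

print(round(PB_trace, 4), round(PB_eig, 4))
```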

Page 21: Multivariate Analysis of Variance (MANOVA)

Wilks' likelihood ratio (W)

The W statistic is defined as

$$W_i = \left|E (E + H_i)^{-1}\right| = \frac{|E|}{|E + H_i|} = \prod_{k=1}^{s} \frac{1}{1 + \lambda_k}$$

where $s = \min(df_i, q)$, i represents the tested effect ($i \in \{\alpha, \beta, \alpha\beta\}$), $df_i$ is the degrees of freedom associated with the hypothesis under investigation ($\alpha$, $\beta$ or $\alpha\beta$), $\lambda_k$ is the kth eigenvalue extracted from $H_i E^{-1}$, and $|E|$ (as well as $|E + H_i|$) is the determinant.

Page 22: Multivariate Analysis of Variance (MANOVA)

Wilks' likelihood ratio (W): Interaction

df = (r-1)(c-1) = (2-1)(3-1) = 2
s = min(df, q) = min(2, 2) = 2

$$W = \left|E (E + H)^{-1}\right| = \frac{|E|}{|E + H|} = \prod_{k=1}^{s} \frac{1}{1 + \lambda_k}$$

$$W = \left|E (E + H)^{-1}\right| = 0.77436$$
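The determinant ratio and the eigenvalue product give the same W. This is a minimal sketch with NumPy, assuming the unrounded interaction H is (43/3, 64/3; 64/3, 97/3).

```python
import numpy as np

E = np.array([[94.5, 76.5], [76.5, 114.0]])
H = np.array([[43.0, 64.0], [64.0, 97.0]]) / 3.0   # assumed unrounded interaction H

# Determinant form: |E| / |E + H|
W_det = np.linalg.det(E) / np.linalg.det(E + H)

# Eigenvalue form: product of 1 / (1 + lambda_k), lambda_k from H E^{-1}
lam = np.linalg.eigvals(H @ np.linalg.inv(E)).real
W_eig = np.prod(1.0 / (1.0 + lam))

print(round(W_det, 5), round(W_eig, 5))
```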

Page 23: Multivariate Analysis of Variance (MANOVA)

Roy's largest root (RLR)

The RLR statistic is defined as

$$RLR_i = \frac{\max(\lambda_k)}{1 + \max(\lambda_k)}$$

where i represents the tested effect ($i \in \{\alpha, \beta, \alpha\beta\}$) and $\lambda_k$ is the kth eigenvalue extracted from $H_i E^{-1}$.

Page 24: Multivariate Analysis of Variance (MANOVA)

Roy's largest root (RLR): Interaction

$$RLR = \frac{\max(\lambda_k)}{1 + \max(\lambda_k)} = \frac{0.283723}{1 + 0.283723} = 0.221$$
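The largest eigenvalue used above can be recovered from E and H. This is a minimal sketch with NumPy, assuming the unrounded interaction H is (43/3, 64/3; 64/3, 97/3).

```python
import numpy as np

E = np.array([[94.5, 76.5], [76.5, 114.0]])
H = np.array([[43.0, 64.0], [64.0, 97.0]]) / 3.0   # assumed unrounded interaction H

lam = np.linalg.eigvals(H @ np.linalg.inv(E)).real
lam_max = lam.max()                 # largest eigenvalue of H E^{-1}
RLR = lam_max / (1.0 + lam_max)

print(round(lam_max, 6), round(RLR, 3))
```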

Page 25: Multivariate Analysis of Variance (MANOVA)

Multivariate F-ratio

- All the statistics are equivalent when s = 1.
- In general there is no exact formula for finding the associated p-value, except in rare situations.
- Nevertheless, a convenient and sufficient approximation exists for all but RLR.
- Since RLR is the least robust, attention will be focused on the first three statistics: HL, PB and W.
- The distributions of these three statistics are approximated using an F distribution, which has the advantage of being simple to understand:

$$F(m_i) = \frac{\eta^{2}_{m_i}\, df_2(m_i)}{(1 - \eta^{2}_{m_i})\, df_1}$$

Page 26: Multivariate Analysis of Variance (MANOVA)

Multivariate F-ratio

$$F(m_i) = \frac{\eta^{2}_{m_i}\, df_2(m_i)}{(1 - \eta^{2}_{m_i})\, df_1}$$

where

- $df_1$ represents the numerator degrees of freedom ($df_1 = q \cdot df_i$),
- $df_2(m_i)$ is the denominator degrees of freedom for each statistic m ($m \in \{HL_i, PB_i, W_i\}$),
- $\eta^{2}_{m}$ is the multivariate measure of association for each statistic m.

Page 27: Multivariate Analysis of Variance (MANOVA)

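Multivariate F-ratio (HL)

The multivariate measure of association for HL is given by

$$\eta^{2}_{HL_i} = \frac{HL_i}{HL_i + s}$$

The numerator df

$$df_1 = q \cdot df_i$$

The denominator df

$$df_{err} = n - k\,l$$

$$df_2(HL_i) = s\,(df_{err} - q - 1) + 2$$

Here k and l are read as the numbers of levels of the two factors, which matches $df_{err} = N - r \cdot c$ in the worked example.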

Page 28: Multivariate Analysis of Variance (MANOVA)

Multivariate F-ratio (HL): Interaction

The multivariate measure of association for HL is given by

$$\eta^{2}_{HL} = \frac{HL}{HL + s} = \frac{0.2896}{0.2896 + 2} = 0.1265$$

The numerator df

$$df_1 = q \cdot df = 2 \cdot 2 = 4$$

The denominator df

$$df_{err} = N - r \cdot c = 24 - 2 \cdot 3 = 18$$

$$df_2(HL) = s\,(df_{err} - q - 1) + 2 = 2(18 - 2 - 1) + 2 = 32$$

Page 29: Multivariate Analysis of Variance (MANOVA)

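Multivariate F-ratio (PB)

The multivariate measure of association for PB is given by

$$\eta^{2}_{PB_i} = \frac{PB_i}{s}$$

The numerator df

$$df_1 = q \cdot df_i$$

The denominator df

$$df_{err} = n - k\,l$$

$$df_2(PB_i) = s\,(df_{err} - q + s)$$

Here k and l are read as the numbers of levels of the two factors, which matches $df_{err} = N - r \cdot c$ in the worked example.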

Page 30: Multivariate Analysis of Variance (MANOVA)

Multivariate F-ratio (PB): Interaction

The multivariate measure of association for PB is given by

$$\eta^{2}_{PB} = \frac{PB}{s} = \frac{0.226949}{2} = 0.113475$$

The numerator df

$$df_1 = q \cdot df = 2 \cdot 2 = 4$$

The denominator df

$$df_{err} = N - r \cdot c = 24 - 2 \cdot 3 = 18$$

$$df_2(PB) = s\,(df_{err} - q + s) = 2(18 - 2 + 2) = 36$$

Page 31: Multivariate Analysis of Variance (MANOVA)

Multivariate F-ratio (W)

The multivariate measure of association for W is given by

$$\eta^{2}_{W_i} = 1 - W_i^{1/g}$$

The numerator df

$$df_1 = q \cdot df_i$$

The denominator df

$$o = df_{err} - \frac{q + 1 - df_i}{2}$$

$$df_2(W_i) = o\,g - \frac{df_1}{2} + 1$$

$$g = \left[\frac{q^{2}\, df_i^{2} - 4}{q^{2} + df_i^{2} - 5}\right]^{1/2}$$

Page 32: Multivariate Analysis of Variance (MANOVA)

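Multivariate F-ratio (W): Interaction

The multivariate measure of association for W is given by

$$\eta^{2}_{W} = 1 - W^{1/g} = 1 - 0.774362^{1/2} = 0.120021$$

The numerator df

$$df_1 = q \cdot df = 2 \cdot 2 = 4$$

The denominator df

$$o = df_{err} - \frac{q + 1 - df}{2} = 18 - \frac{2 + 1 - 2}{2} = 17.5$$

$$df_2(W) = o\,g - \frac{df_1}{2} + 1 = 17.5 \cdot 2 - \frac{4}{2} + 1 = 34$$

$$g = \left[\frac{q^{2}\, df^{2} - 4}{q^{2} + df^{2} - 5}\right]^{1/2} = \left[\frac{4 \cdot 4 - 4}{4 + 4 - 5}\right]^{1/2} = 2$$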

Page 33: Multivariate Analysis of Variance (MANOVA)

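Multivariate F-ratio

HL (interaction, $\alpha\beta$)

$$F(HL) = \frac{\eta^{2}_{HL}\, df_2(HL)}{(1 - \eta^{2}_{HL})\, df_1} = \frac{0.1265 \times 32}{(1 - 0.1265) \times 4} = 1.15877$$

PB (interaction, $\alpha\beta$)

$$F(PB) = \frac{\eta^{2}_{PB}\, df_2(PB)}{(1 - \eta^{2}_{PB})\, df_1} = \frac{0.113475 \times 36}{(1 - 0.113475) \times 4} = 1.15199$$

W (interaction, $\alpha\beta$)

$$F(W) = \frac{\eta^{2}_{W}\, df_2(W)}{(1 - \eta^{2}_{W})\, df_1} = \frac{0.120021 \times 34}{(1 - 0.120021) \times 4} = 1.15933$$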
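The three approximate F-ratios for the interaction can be reproduced in one pass. This is a minimal sketch with NumPy, assuming the unrounded interaction H is (43/3, 64/3; 64/3, 97/3) and taking $df_{err}$ = 18 from the worked example; the helper name `f_ratio` is illustrative.

```python
import numpy as np

E = np.array([[94.5, 76.5], [76.5, 114.0]])
H = np.array([[43.0, 64.0], [64.0, 97.0]]) / 3.0   # assumed unrounded interaction H

q, df_i, df_err = 2, 2, 18        # number of DVs, interaction df, error df
s = min(df_i, q)
df1 = q * df_i                    # numerator df, shared by all three statistics

lam = np.linalg.eigvals(H @ np.linalg.inv(E)).real
HL = lam.sum()
PB = np.sum(lam / (1.0 + lam))
W = np.prod(1.0 / (1.0 + lam))


def f_ratio(eta2, df2):
    """Approximate F from a measure of association and its denominator df."""
    return eta2 * df2 / ((1.0 - eta2) * df1)


eta2_HL, df2_HL = HL / (HL + s), s * (df_err - q - 1) + 2
eta2_PB, df2_PB = PB / s, s * (df_err - q + s)
g = np.sqrt((q**2 * df_i**2 - 4) / (q**2 + df_i**2 - 5))
o = df_err - (q + 1 - df_i) / 2
eta2_W, df2_W = 1.0 - W ** (1.0 / g), o * g - df1 / 2 + 1

for name, eta2, df2 in [("HL", eta2_HL, df2_HL), ("PB", eta2_PB, df2_PB), ("W", eta2_W, df2_W)]:
    print(name, round(eta2, 4), df1, df2, round(f_ratio(eta2, df2), 4))
```

A p-value for each statistic could then be read from an F distribution with ($df_1$, $df_2$) degrees of freedom, for instance with `scipy.stats.f.sf` if SciPy is available.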

Page 34: Multivariate Analysis of Variance (MANOVA)

MANOVA Summary

Page 35: Multivariate Analysis of Variance (MANOVA)

MANOVA

Unfortunately, there is no single test that is the most powerful if the MANOVA assumptions are not met.

If there is a violation of homogeneity of the covariance matrices or of multivariate normality, then the PB statistic is the most robust while RLR is the least robust statistic.

If the noncentrality is concentrated (when the population centroids are largely confined to a single dimension), RLR provides the most powerful test.

Page 36: Multivariate Analysis of Variance (MANOVA)

MANOVA

If, on the other hand, the noncentrality is diffuse (when the population centroids differ almost equally in all dimensions), then PB, HL or W will all give good power.

However, in most cases, power differences among the four statistics are quite small (<0.06), thus it does not matter which statistic is used.