
Lecture 5: Hypothesis tests for more than one sample

Mans Thulin

Department of Mathematics, Uppsala University


Multivariate Methods • 8/4 2011


Outline

- Paired comparisons

- Repeated measures

- Comparing mean vectors from two populations

- Comparing mean vectors from more than two populations
  - MANOVA


Repetition: Testing H0 : µ = µ0

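Let $\mathbf{X}_1, \ldots, \mathbf{X}_n$ be an i.i.d. sample from $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, with sample mean $\bar{\mathbf{X}}$ and sample covariance matrix $\mathbf{S}$. When testing the hypothesis $H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0$, we use Hotelling's $T^2$:

$$T^2 = n(\bar{\mathbf{X}} - \boldsymbol{\mu}_0)' \mathbf{S}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}_0).$$

Under $H_0$,

$$\frac{n-p}{(n-1)p}\, T^2 \sim F_{p,n-p}.$$

The $T^2$ test therefore rejects $H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0$ at level $\alpha$ if

$$T^2 > \frac{(n-1)p}{n-p}\, F_{p,n-p}(\alpha),$$

where $F_{p,n-p}(\alpha)$ is the upper $100\alpha$th percentile of the $F_{p,n-p}$ distribution.

Similarly, the p-value of the test is obtained as

$$p = P(T^2 > x),$$

computed under $H_0$, where $x$ is the observed value of the statistic.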
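As a rough illustration of the formulas above, the following sketch computes $T^2$ and its rescaled $F$ version with NumPy. The function name `hotelling_t2_one_sample` and the data are made up for this example and are not part of the lecture.

```python
import numpy as np

def hotelling_t2_one_sample(x, mu0):
    """Return (T2, F) for H0: mu = mu0, where x is an (n, p) data matrix."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    diff = x.mean(axis=0) - np.asarray(mu0, dtype=float)
    s = np.cov(x, rowvar=False)               # sample covariance S (divisor n - 1)
    t2 = n * diff @ np.linalg.solve(s, diff)  # n (xbar - mu0)' S^{-1} (xbar - mu0)
    f_stat = (n - p) / ((n - 1) * p) * t2     # ~ F(p, n - p) under H0
    return t2, f_stat

# Made-up data: n = 5 units, p = 2 variables.
x = np.array([[2.0, 1.0], [4.0, 3.0], [6.0, 2.0], [8.0, 5.0], [10.0, 4.0]])
t2, f_stat = hotelling_t2_one_sample(x, mu0=[5.0, 3.0])
print(t2, f_stat)
```

The rescaled value `f_stat` can then be compared with an $F_{p,n-p}$ quantile, taken from a table or, if SciPy is available, from `scipy.stats.f.ppf`.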


Paired comparisons

If we wish to study the effect of a treatment, it is often desirable to measure the response variables of interest on a single unit before and after the treatment is applied to that unit. This procedure eliminates unit-to-unit variation.

Examples: pH of lakes before and after chalk was added, health of patients before and after medication, people's view of Uppsala University before and after a nationwide advertising campaign...

Similarly, if we wish to compare two treatments, we can apply both treatments to the same (or identical) experimental units.

Such experimental designs are called paired comparisons, since the measurements are made in pairs.


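Paired comparisons: Hotelling's $T^2$

Let $\mathbf{X}_{j1}$ denote the response to treatment 1 and $\mathbf{X}_{j2}$ denote the response to treatment 2 for experimental unit $j$. If $\mathbf{X}_{j1}$ and $\mathbf{X}_{j2}$ are jointly multivariate normal, then

$$\mathbf{D}_j = \mathbf{X}_{j1} - \mathbf{X}_{j2} \sim N_p(\boldsymbol{\delta}, \boldsymbol{\Sigma}_d),$$

where $\boldsymbol{\delta}$ is the mean difference between the treatments.

If the treatments are applied independently to $n$ independent units, so that $\mathbf{D}_1, \ldots, \mathbf{D}_n$ are independent $N_p(\boldsymbol{\delta}, \boldsymbol{\Sigma}_d)$ random vectors, then

$$T^2 = n(\bar{\mathbf{D}} - \boldsymbol{\delta})' \mathbf{S}_d^{-1} (\bar{\mathbf{D}} - \boldsymbol{\delta}) \sim \frac{p(n-1)}{n-p}\, F_{p,n-p},$$

where $\bar{\mathbf{D}}$ and $\mathbf{S}_d$ are the sample mean and sample covariance matrix of the differences.

This is simply the result about Hotelling's $T^2$ from the last lecture. The problem of comparing the two paired samples $\mathbf{X}_{11}, \ldots, \mathbf{X}_{n1}$ and $\mathbf{X}_{12}, \ldots, \mathbf{X}_{n2}$ is reduced to the familiar one-sample problem by looking at the pairwise differences.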


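Paired comparisons: testing

The hypothesis $H_0: \boldsymbol{\delta} = \mathbf{0}$ is rejected in favour of the alternative $H_1: \boldsymbol{\delta} \neq \mathbf{0}$ at level $\alpha$ if

$$T^2 = n\,\bar{\mathbf{d}}' \mathbf{S}_d^{-1} \bar{\mathbf{d}} > \frac{p(n-1)}{n-p}\, F_{p,n-p}(\alpha),$$

where $\mathbf{d}_j' = (d_{j1}, d_{j2}, \ldots, d_{jp})$, $j = 1, \ldots, n$, are the observed differences for the $n$ units, with sample mean $\bar{\mathbf{d}}$ and sample covariance matrix $\mathbf{S}_d$.

A confidence region for $\boldsymbol{\delta}$ with confidence level $1 - \alpha$ consists of all $\boldsymbol{\delta}$ such that

$$(\bar{\mathbf{d}} - \boldsymbol{\delta})' \mathbf{S}_d^{-1} (\bar{\mathbf{d}} - \boldsymbol{\delta}) \leq \frac{p(n-1)}{n(n-p)}\, F_{p,n-p}(\alpha).$$

The simultaneous Bonferroni confidence intervals for the individual mean differences $\delta_i$, $i = 1, \ldots, p$, are given by

$$I_{\delta_i} = \left( \bar{d}_i \pm t_{n-1}\!\left(\frac{\alpha}{2p}\right) \sqrt{\frac{s_{d_i}^2}{n}} \right),$$

where $s_{d_i}^2$ is the $i$th diagonal element of $\mathbf{S}_d$.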


Paired comparisons: contrasts

If we use a little matrix algebra, it is not necessary to calculate all the differences. Instead, we can use contrast matrices.

See blackboard!

On the other hand, it may be advisable to calculate the differences $\mathbf{d}_1, \ldots, \mathbf{d}_n$ in order to assess their normality.

The notion of contrast matrices can also be used for repeated measures designs.
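The blackboard material is not reproduced in these notes. One standard construction, assumed here, stacks the two responses as $\mathbf{x}_j = (\mathbf{x}_{j1}', \mathbf{x}_{j2}')'$ and uses $\mathbf{C} = [\mathbf{I}_p, -\mathbf{I}_p]$, so that $\mathbf{C}\mathbf{x}_j = \mathbf{d}_j$. The sketch below, with simulated data, checks that both routes give the same $T^2$.

```python
import numpy as np

rng = np.random.default_rng(0)                 # simulated data, illustration only
n, p = 12, 2
x1 = rng.normal(size=(n, p))                   # responses to treatment 1
x2 = x1 + rng.normal(scale=0.5, size=(n, p))   # responses to treatment 2

# Route 1: form the differences d_j = x_j1 - x_j2 explicitly.
d = x1 - x2
dbar = d.mean(axis=0)
t2_diff = n * dbar @ np.linalg.solve(np.cov(d, rowvar=False), dbar)

# Route 2: stack x_j = (x_j1', x_j2')' and apply C = [I_p, -I_p].
x = np.hstack([x1, x2])                        # shape (n, 2p)
c = np.hstack([np.eye(p), -np.eye(p)])         # shape (p, 2p)
xbar = x.mean(axis=0)
s = np.cov(x, rowvar=False)
t2_contrast = n * (c @ xbar) @ np.linalg.solve(c @ s @ c.T, c @ xbar)

print(t2_diff, t2_contrast)
```

The two values agree up to rounding, because $\mathbf{C}\bar{\mathbf{x}} = \bar{\mathbf{d}}$ and $\mathbf{C}\mathbf{S}\mathbf{C}' = \mathbf{S}_d$.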


Repeated measures

A situation that is similar to what we just studied is when we wish to compare the effects of $q$ different treatments on a single response variable.

Let $\mathbf{X}_1, \ldots, \mathbf{X}_n$ be i.i.d. $N_q(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ observations, with

$$\mathbf{X}_j = (X_{j1}, X_{j2}, \ldots, X_{jq})',$$

where $X_{ji}$ is the response to the $i$th treatment on the $j$th experimental unit.

Typically, we wish to test the hypothesis that there is no difference between the treatment means. This is stated using contrast matrices.

See blackboard!


Repeated measures: testing

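When the treatment means are equal, $\mathbf{C}_1\boldsymbol{\mu} = \mathbf{C}_2\boldsymbol{\mu} = \mathbf{0}$. In fact, $\mathbf{C}\boldsymbol{\mu} = \mathbf{0}$ for any contrast matrix $\mathbf{C}$.

Given $\mathbf{C}$, we can compute the observed contrasts $\mathbf{C}\mathbf{x}_j$, with mean $\mathbf{C}\bar{\mathbf{x}}$ and sample covariance $\mathbf{C}\mathbf{S}\mathbf{C}'$. The hypothesis $\mathbf{C}\boldsymbol{\mu} = \mathbf{0}$ is tested using

$$T^2 = n(\mathbf{C}\bar{\mathbf{x}})' (\mathbf{C}\mathbf{S}\mathbf{C}')^{-1} (\mathbf{C}\bar{\mathbf{x}}) \sim \frac{(q-1)(n-1)}{n-q+1}\, F_{q-1,\,n-q+1}$$

under $H_0$.

The value of the statistic $T^2$ does not depend on the choice of contrast matrix $\mathbf{C}$.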
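The invariance claim can be checked numerically. The sketch below uses two common choices of contrast matrix for $q = 4$ treatments (successive differences and comparisons with the first treatment); these particular matrices, the helper name `contrast_t2` and the simulated data are assumptions made for the example, not the matrices from the blackboard.

```python
import numpy as np

def contrast_t2(x, c):
    """T2 = n (C xbar)' (C S C')^{-1} (C xbar) for an (n, q) data matrix x."""
    n = x.shape[0]
    cx = c @ x.mean(axis=0)
    csc = c @ np.cov(x, rowvar=False) @ c.T
    return n * cx @ np.linalg.solve(csc, cx)

rng = np.random.default_rng(1)                 # simulated data, illustration only
n, q = 15, 4
x = rng.normal(size=(n, q)) + np.array([0.0, 0.3, 0.1, 0.5])

# Successive differences: row i compares treatment i with treatment i + 1.
c_succ = np.array([[1, -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]], dtype=float)
# Baseline contrasts: each row compares treatment 1 with another treatment.
c_base = np.array([[1, -1, 0, 0], [1, 0, -1, 0], [1, 0, 0, -1]], dtype=float)

t2_succ = contrast_t2(x, c_succ)
t2_base = contrast_t2(x, c_base)
f_stat = (n - q + 1) / ((n - 1) * (q - 1)) * t2_succ  # ~ F(q-1, n-q+1) under H0
print(t2_succ, t2_base, f_stat)
```

Both contrast matrices have the same row space (all vectors whose elements sum to zero), which is why the two $T^2$ values coincide.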


Comparing mean vectors from two populations

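Often we wish to compare the mean vectors of two populations in situations where it isn't possible to use paired comparisons.

Assume that we have a $p$-variate sample $\mathbf{X}_{11}, \mathbf{X}_{12}, \ldots, \mathbf{X}_{1n_1}$ from a distribution with mean $\boldsymbol{\mu}_1$ and covariance $\boldsymbol{\Sigma}_1$ and a $p$-variate sample $\mathbf{X}_{21}, \mathbf{X}_{22}, \ldots, \mathbf{X}_{2n_2}$ from a distribution with mean $\boldsymbol{\mu}_2$ and covariance $\boldsymbol{\Sigma}_2$.

Furthermore, assume that the two samples are independent.

We wish to test the hypothesis that $\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2 = \boldsymbol{\delta}_0$.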


Two populations: Hotelling's $T^2$

In order to construct a test statistic for this hypothesis, we think about how Hotelling's $T^2$ is constructed in the one-sample case.

See blackboard!

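Result 6.2. If $\mathbf{X}_{11}, \mathbf{X}_{12}, \ldots, \mathbf{X}_{1n_1}$ are i.i.d. $N_p(\boldsymbol{\mu}_1, \boldsymbol{\Sigma})$ and $\mathbf{X}_{21}, \mathbf{X}_{22}, \ldots, \mathbf{X}_{2n_2}$ are i.i.d. $N_p(\boldsymbol{\mu}_2, \boldsymbol{\Sigma})$, then

$$T^2 = \left(\bar{\mathbf{X}}_1 - \bar{\mathbf{X}}_2 - (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)\right)' \left[\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\mathbf{S}_p\right]^{-1} \left(\bar{\mathbf{X}}_1 - \bar{\mathbf{X}}_2 - (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)\right) \sim \frac{(n_1+n_2-2)p}{n_1+n_2-p-1}\, F_{p,\,n_1+n_2-p-1},$$

where $\mathbf{S}_p = \frac{(n_1-1)\mathbf{S}_1 + (n_2-1)\mathbf{S}_2}{n_1+n_2-2}$ is the pooled sample covariance matrix.

The assumption that the covariance matrices are equal is quite strong! There are $p$ variances and $\frac{p(p-1)}{2}$ distinct covariances in the covariance matrix, and all of them are assumed to be equal for the two populations.

On the other hand, the "real" null hypothesis may be that the distributions, and not just the mean vectors, are equal for the two treatments.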
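A minimal sketch of the pooled two-sample statistic in Result 6.2, assuming the usual pooled covariance $\mathbf{S}_p = \frac{(n_1-1)\mathbf{S}_1 + (n_2-1)\mathbf{S}_2}{n_1+n_2-2}$. The function name `hotelling_t2_two_sample` and the data are invented for illustration.

```python
import numpy as np

def hotelling_t2_two_sample(x1, x2, delta0=None):
    """Return (T2, F) for H0: mu1 - mu2 = delta0, assuming equal covariances."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, p = x1.shape
    n2 = x2.shape[0]
    delta0 = np.zeros(p) if delta0 is None else np.asarray(delta0, dtype=float)
    s_pooled = ((n1 - 1) * np.cov(x1, rowvar=False)
                + (n2 - 1) * np.cov(x2, rowvar=False)) / (n1 + n2 - 2)
    diff = x1.mean(axis=0) - x2.mean(axis=0) - delta0
    t2 = diff @ np.linalg.solve((1 / n1 + 1 / n2) * s_pooled, diff)
    f_stat = (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p) * t2  # ~ F(p, n1+n2-p-1)
    return t2, f_stat

# Made-up data: the second sample is the first one shifted by (1, 2).
x1 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
x2 = x1 + np.array([1.0, 2.0])
t2, f_stat = hotelling_t2_two_sample(x1, x2)
print(t2, f_stat)
```

Here both samples have covariance $\frac{4}{3}\mathbf{I}$, so the statistic can be checked by hand: $T^2 = (1 + 4)/(2/3) = 7.5$.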


Two populations: Behrens-Fisher problem

The problem of making inferences about the means of two (univariate) normal populations without assuming that the variances are equal is called the Behrens-Fisher problem.

Different approaches to this problem have been proposed by Fisher, Behrens, Chapman, Dudewicz and Ahmed, among others. The most commonly used solution was given by Welch, who proposed a t-test using $s_d^2 = s_1^2/n_1 + s_2^2/n_2$. His statistic is approximately t-distributed, with a complicated expression for the degrees of freedom.
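The slides do not spell out the degrees of freedom. The sketch below uses the Welch-Satterthwaite approximation, which is the formula usually associated with Welch's test; treat it as an assumption about what is meant here. The function name `welch_t` and the data are made up.

```python
import math
from statistics import mean, variance

def welch_t(x1, x2):
    """Return Welch's t statistic and approximate degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    v1 = variance(x1) / n1            # s_1^2 / n_1
    v2 = variance(x2) / n2            # s_2^2 / n_2
    s2_d = v1 + v2                    # s_d^2 as defined above
    t = (mean(x1) - mean(x2)) / math.sqrt(s2_d)
    # Welch-Satterthwaite approximation (assumed, not given in the slides).
    nu = s2_d ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, nu

t, nu = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8])
print(t, nu)
```

The approximate degrees of freedom always lie between $\min(n_1, n_2) - 1$ and $n_1 + n_2 - 2$, and are generally not an integer.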

Further reading:

- Kim, S.-H., Cohen, A.S. (1998): On the Behrens-Fisher problem: a review, Journal of Educational and Behavioral Statistics, 23, pp. 356-377.


Two populations: Behrens-Fisher problem

When comparing mean vectors of two multivariate normal populations with unequal covariance matrices, the problem becomes even more complicated. Some possible solutions are:

- Use that

$$T^2 = \left(\bar{\mathbf{X}}_1 - \bar{\mathbf{X}}_2 - \boldsymbol{\delta}_0\right)' \left(\frac{1}{n_1}\mathbf{S}_1 + \frac{1}{n_2}\mathbf{S}_2\right)^{-1} \left(\bar{\mathbf{X}}_1 - \bar{\mathbf{X}}_2 - \boldsymbol{\delta}_0\right) \approx \chi^2_p$$

  under $H_0$ when $n_1 - p$ and $n_2 - p$ are large, even if the data are non-normal.

- Use that, for normal data, $T^2$ above is approximately distributed as

$$\frac{\nu p}{\nu - p + 1}\, F_{p,\,\nu-p+1},$$

  where $\nu$ is given by the complicated expression (6-29) in J&W.

- Use a different, more robust, test! (e.g. Tiku and Singh (1982))
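The statistic in the first option above is straightforward to compute. The following sketch (hypothetical function name `t2_unequal_cov`, made-up data) only evaluates $T^2$; the expression for $\nu$ in the second option is not reproduced here.

```python
import numpy as np

def t2_unequal_cov(x1, x2, delta0):
    """T2 with separate covariance estimates S1/n1 + S2/n2 (no pooling)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[0], x2.shape[0]
    v = np.cov(x1, rowvar=False) / n1 + np.cov(x2, rowvar=False) / n2
    diff = x1.mean(axis=0) - x2.mean(axis=0) - np.asarray(delta0, dtype=float)
    return diff @ np.linalg.solve(v, diff)

# Made-up data with clearly different spreads in the two samples.
x1 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
x2 = 2.0 * x1
t2 = t2_unequal_cov(x1, x2, delta0=[0.0, 0.0])
print(t2)
```

With samples this small the $\chi^2_p$ approximation is of course not trustworthy; the data are chosen only so that the arithmetic is easy to follow ($\mathbf{S}_1 = \frac{4}{3}\mathbf{I}$, $\mathbf{S}_2 = \frac{16}{3}\mathbf{I}$, giving $T^2 = 2/(5/3) = 1.2$).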


MANOVA: Multivariate ANalysis Of VAriance

Now let's assume that we have observations from $g$ populations:

Population 1: $\mathbf{X}_{11}, \mathbf{X}_{12}, \ldots, \mathbf{X}_{1n_1}$

Population 2: $\mathbf{X}_{21}, \mathbf{X}_{22}, \ldots, \mathbf{X}_{2n_2}$

...

Population $g$: $\mathbf{X}_{g1}, \mathbf{X}_{g2}, \ldots, \mathbf{X}_{gn_g}$

and that we wish to test the hypothesis that all populations have the same mean. If there are differences, we'd like to be able to say which means differ.


MANOVA: Assumptions

For MANOVA, we make the following assumptions:

- $\mathbf{X}_{\ell 1}, \mathbf{X}_{\ell 2}, \ldots, \mathbf{X}_{\ell n_\ell}$ are i.i.d. with mean $\boldsymbol{\mu}_\ell$, $\ell = 1, 2, \ldots, g$.

- The samples from different populations are independent.

- All populations have the same covariance matrix $\boldsymbol{\Sigma}$.

- The populations are multivariate normal.
  - If the sample sizes are large, MANOVA can be used as an approximate method due to the multivariate central limit theorem.


MANOVA: Model

Linear model:

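$$\mathbf{X}_{\ell j} = \boldsymbol{\mu} + \boldsymbol{\tau}_\ell + \mathbf{e}_{\ell j}, \quad j = 1, 2, \ldots, n_\ell \text{ and } \ell = 1, 2, \ldots, g,$$

where the $\mathbf{e}_{\ell j}$ are independent $N_p(\mathbf{0}, \boldsymbol{\Sigma})$ variables.

Here the parameter vector $\boldsymbol{\mu}$ is an overall mean and $\boldsymbol{\tau}_\ell$ represents the $\ell$th treatment effect, with

$$\sum_{\ell=1}^{g} n_\ell \boldsymbol{\tau}_\ell = \mathbf{0}.$$

We wish to test

$$H_0: \boldsymbol{\tau}_1 = \boldsymbol{\tau}_2 = \ldots = \boldsymbol{\tau}_g$$

against the hypothesis that at least two effects differ.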


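MANOVA: Sums of squares and cross products

In analogy with univariate ANOVA, the total sum of squares (and cross products) is partitioned into different sources of variation:

$$\sum_{\ell=1}^{g} \sum_{j=1}^{n_\ell} (\mathbf{x}_{\ell j} - \bar{\mathbf{x}})(\mathbf{x}_{\ell j} - \bar{\mathbf{x}})' = \sum_{\ell=1}^{g} n_\ell (\bar{\mathbf{x}}_\ell - \bar{\mathbf{x}})(\bar{\mathbf{x}}_\ell - \bar{\mathbf{x}})' + \sum_{\ell=1}^{g} \sum_{j=1}^{n_\ell} (\mathbf{x}_{\ell j} - \bar{\mathbf{x}}_\ell)(\mathbf{x}_{\ell j} - \bar{\mathbf{x}}_\ell)' = \mathbf{B} + \mathbf{W},$$

where $\bar{\mathbf{x}}_\ell$ is the sample mean of population $\ell$, $\bar{\mathbf{x}}$ is the overall sample mean, $\mathbf{B}$ is the treatment (Between) sum of squares and cross products and $\mathbf{W}$ is the residual (Within) sum of squares and cross products.

$\mathbf{B}$ and $\mathbf{W}$ are $p \times p$ matrices. The latter can be rewritten as

$$\mathbf{W} = (n_1 - 1)\mathbf{S}_1 + (n_2 - 1)\mathbf{S}_2 + \ldots + (n_g - 1)\mathbf{S}_g.$$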
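The decomposition can be verified numerically. The sketch below builds $\mathbf{B}$ and $\mathbf{W}$ for a small illustrative data set with $g = 3$ groups and $p = 2$ variables; the helper name `sscp_between_within` and the numbers are chosen for this example only.

```python
import numpy as np

def sscp_between_within(groups):
    """Return (B, W) for a list of (n_l, p) arrays, one per population."""
    grand_mean = np.vstack(groups).mean(axis=0)
    p = grand_mean.size
    b = np.zeros((p, p))
    w = np.zeros((p, p))
    for x in groups:
        d = x.mean(axis=0) - grand_mean
        b += len(x) * np.outer(d, d)      # n_l (xbar_l - xbar)(xbar_l - xbar)'
        r = x - x.mean(axis=0)
        w += r.T @ r                      # sum_j (x_lj - xbar_l)(x_lj - xbar_l)'
    return b, w

groups = [
    np.array([[9.0, 3.0], [6.0, 2.0], [9.0, 7.0]]),
    np.array([[0.0, 4.0], [2.0, 0.0]]),
    np.array([[3.0, 8.0], [1.0, 9.0], [2.0, 7.0]]),
]
b, w = sscp_between_within(groups)

# Total SSCP about the grand mean, and W rebuilt from the group covariances.
all_x = np.vstack(groups)
centered = all_x - all_x.mean(axis=0)
total = centered.T @ centered
w_from_cov = sum((len(x) - 1) * np.cov(x, rowvar=False) for x in groups)

print(np.allclose(total, b + w), np.allclose(w, w_from_cov))
```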


MANOVA: Test statistic

In univariate ANOVA, $H_0: \tau_1 = \tau_2 = \ldots = \tau_g$ is tested by studying a suitable rescaling of $SS_{Tr}/SS_{Res}$, rejecting $H_0$ for large values.

This is equivalent to rejecting for large values of $1 + SS_{Tr}/SS_{Res} = (SS_{Res} + SS_{Tr})/SS_{Res}$.

This, in turn, is equivalent to rejecting for small values of $SS_{Res}/(SS_{Tr} + SS_{Res})$.

We would like to construct a similar statistic for MANOVA, but ratios of matrices are not defined.

Wilks suggested using the statistic

$$\Lambda = \frac{\det \mathbf{W}}{\det(\mathbf{B} + \mathbf{W})},$$

known as Wilks' lambda. Small values of $\Lambda$ are evidence against $H_0$.


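MANOVA: Distribution of Wilks' $\Lambda$

What can be said about the distribution of

$$\Lambda = \frac{\det \mathbf{W}}{\det(\mathbf{B} + \mathbf{W})}?$$

Let $N = \sum_{\ell=1}^{g} n_\ell$. Then we have the following

Exact results:

$p = 1$, $g \geq 2$: $\quad \frac{N-g}{g-1} \cdot \frac{1-\Lambda}{\Lambda} \sim F_{g-1,\,N-g}$

$p = 2$, $g \geq 2$: $\quad \frac{N-g-1}{g-1} \cdot \frac{1-\sqrt{\Lambda}}{\sqrt{\Lambda}} \sim F_{2(g-1),\,2(N-g-1)}$

$p \geq 1$, $g = 2$: $\quad \frac{N-p-1}{p} \cdot \frac{1-\Lambda}{\Lambda} \sim F_{p,\,N-p-1}$

$p \geq 1$, $g = 3$: $\quad \frac{N-p-2}{p} \cdot \frac{1-\sqrt{\Lambda}}{\sqrt{\Lambda}} \sim F_{2p,\,2(N-p-2)}$

Approximate result (for $N$ large):

$$-\left(N - 1 - \frac{p+g}{2}\right) \ln \Lambda \sim \chi^2_{p(g-1)}$$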
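To make the table concrete, the sketch below computes $\Lambda$ for a small illustrative data set with $p = 2$ and $g = 3$, then evaluates the exact $F$ transformation for $p = 2$ and the large-sample statistic. The data and variable names are chosen for this example only.

```python
import numpy as np

groups = [
    np.array([[9.0, 3.0], [6.0, 2.0], [9.0, 7.0]]),
    np.array([[0.0, 4.0], [2.0, 0.0]]),
    np.array([[3.0, 8.0], [1.0, 9.0], [2.0, 7.0]]),
]
n_total = sum(len(x) for x in groups)   # N
g = len(groups)
p = groups[0].shape[1]

# W: within-group SSCP. B + W equals the total SSCP about the grand mean.
w = sum((x - x.mean(axis=0)).T @ (x - x.mean(axis=0)) for x in groups)
all_x = np.vstack(groups)
centered = all_x - all_x.mean(axis=0)
b_plus_w = centered.T @ centered

wilks = np.linalg.det(w) / np.linalg.det(b_plus_w)

# Exact result for p = 2, g >= 2: compare with F(2(g-1), 2(N-g-1)).
root = np.sqrt(wilks)
f_exact = (n_total - g - 1) / (g - 1) * (1 - root) / root

# Large-sample statistic: compare with chi-square, p(g-1) degrees of freedom.
# N = 8 is far too small for this to be reliable; shown for the formula only.
chi2_approx = -(n_total - 1 - (p + g) / 2) * np.log(wilks)

print(wilks, f_exact, chi2_approx)
```

For these data $\det \mathbf{W} = 239$ and $\det(\mathbf{B} + \mathbf{W}) = 6215$, so $\Lambda \approx 0.0385$.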


MANOVA: Other test statistics

Three other test statistics are also common for MANOVA:

Lawley–Hotelling trace: $\operatorname{tr}(\mathbf{B}\mathbf{W}^{-1})$

Pillai trace: $\operatorname{tr}(\mathbf{B}(\mathbf{B} + \mathbf{W})^{-1})$

Roy's largest root: maximum eigenvalue of $\mathbf{B}(\mathbf{B} + \mathbf{W})^{-1}$

For $g = 2$, all four statistics are equivalent to Hotelling's $T^2$. For large samples, all four are "nearly equivalent".
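All of these statistics can be written in terms of the eigenvalues $\lambda_i$ of $\mathbf{W}^{-1}\mathbf{B}$, which the following sketch illustrates. The matrices are illustrative numbers only, and Roy's largest root is computed here as the largest eigenvalue of $\mathbf{B}(\mathbf{B} + \mathbf{W})^{-1}$, which is one common convention; definitions differ between texts.

```python
import numpy as np

# Illustrative between and within SSCP matrices (p = 2).
b = np.array([[78.0, -12.0], [-12.0, 48.0]])
w = np.array([[10.0, 1.0], [1.0, 24.0]])

# Eigenvalues of W^{-1} B; they are real and non-negative in theory,
# so any spurious imaginary part from rounding is dropped.
lam = np.linalg.eigvals(np.linalg.solve(w, b)).real

wilks = np.prod(1.0 / (1.0 + lam))       # det W / det(B + W)
lawley_hotelling = lam.sum()             # tr(B W^{-1})
pillai = (lam / (1.0 + lam)).sum()       # tr(B (B + W)^{-1})
roy = (lam / (1.0 + lam)).max()          # largest eigenvalue of B (B + W)^{-1}

print(wilks, lawley_hotelling, pillai, roy)
```

With $g = 2$ there is only one non-zero eigenvalue, so each statistic is a monotone function of that single value, which is why the four tests then coincide.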


MANOVA: Confidence intervals

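Simultaneous confidence intervals for the mean differences are obtained using the Bonferroni approach.

Let $N = \sum_{\ell=1}^{g} n_\ell$. Then

$$\left( \bar{x}_{ki} - \bar{x}_{\ell i} \pm t_{N-g}\!\left(\frac{\alpha}{pg(g-1)}\right) \sqrt{\frac{w_{ii}}{N-g}\left(\frac{1}{n_k} + \frac{1}{n_\ell}\right)} \right),$$

where $w_{ii}$ is the $i$th diagonal element of $\mathbf{W}$, is a confidence interval for $\tau_{ki} - \tau_{\ell i}$. Taken over all components $i$ and all pairs $k < \ell$, the intervals have simultaneous confidence level at least $1 - \alpha$.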


Equality of covariance matrices

As previously mentioned, the assumption of equal covariance matrices is quite strong, as there are $\frac{p(p+1)}{2}$ distinct elements in the covariance matrix.

There are a few methods to investigate the assumption of equality:

- Visual investigation of matrices.
- Box's M test.
  - Discussed in J&W.
  - Good theoretical properties.
  - Not as good in practice. Some authors call this test super-sensitive and say that it isn't usable for $\alpha > 0.01$.
- Bartlett's test or Levene's test for equal variances, for marginals.


Summary

- Paired comparisons

- Repeated measures

- Comparing mean vectors from two populations

- Comparing mean vectors from more than two populations
  - MANOVA
  - Different statistics to choose from

- Equality of covariance matrices
