General Linear Model

With correlated error terms

$\Sigma = \sigma^2 V \neq \sigma^2 I$

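The General Linear Model, $\Sigma = \sigma^2 V \neq \sigma^2 I$

Let

\[
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},\quad
X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix},\quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix}
\]

\[
\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\]

where $\boldsymbol{\varepsilon}$ has a $N(\mathbf{0}, \sigma^2 V)$ distribution, $V$ known.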

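Let $\Gamma$ denote an $n \times n$ matrix such that $\Gamma\Gamma' = V$ ($V$ known).

Let

\[
\mathbf{u} = \Gamma^{-1}\mathbf{y} = \Gamma^{-1}X\boldsymbol{\beta} + \Gamma^{-1}\boldsymbol{\varepsilon} = W\boldsymbol{\beta} + \boldsymbol{\eta}
\]

where $W = \Gamma^{-1}X$ and $\boldsymbol{\eta} = \Gamma^{-1}\boldsymbol{\varepsilon}$,

and $\boldsymbol{\eta} = \Gamma^{-1}\boldsymbol{\varepsilon}$ has a $N(\mathbf{0}, \sigma^2 I)$ distribution.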

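Thus the maximum likelihood estimate of $\boldsymbol{\beta}$ is

\[
\hat{\boldsymbol{\beta}} = (W'W)^{-1}W'\mathbf{u}
= \left(X'(\Gamma')^{-1}\Gamma^{-1}X\right)^{-1}X'(\Gamma')^{-1}\Gamma^{-1}\mathbf{y}
= (X'V^{-1}X)^{-1}X'V^{-1}\mathbf{y}
\]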

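Now $\hat{\boldsymbol{\beta}} = (W'W)^{-1}W'\mathbf{u} = (X'V^{-1}X)^{-1}X'V^{-1}\mathbf{y}$ has a $N\!\left(\boldsymbol{\beta}, \sigma^2 (W'W)^{-1}\right)$ distribution,

and $W'W = X'V^{-1}X$.

Thus $\hat{\boldsymbol{\beta}} = (X'V^{-1}X)^{-1}X'V^{-1}\mathbf{y}$ has a $N\!\left(\boldsymbol{\beta}, \sigma^2 (X'V^{-1}X)^{-1}\right)$ distribution.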

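The unbiased estimator of $\sigma^2$ (the MLE rescaled by $n/(n-p)$) is:

\[
\begin{aligned}
s^2 &= \frac{1}{n-p}\,\mathbf{u}'\left[I - W(W'W)^{-1}W'\right]\mathbf{u} \\
&= \frac{1}{n-p}\,\mathbf{y}'(\Gamma')^{-1}\left[I - \Gamma^{-1}X\left(X'(\Gamma')^{-1}\Gamma^{-1}X\right)^{-1}X'(\Gamma')^{-1}\right]\Gamma^{-1}\mathbf{y} \\
&= \frac{1}{n-p}\,\mathbf{y}'\left[(\Gamma')^{-1}\Gamma^{-1} - (\Gamma')^{-1}\Gamma^{-1}X\left(X'(\Gamma')^{-1}\Gamma^{-1}X\right)^{-1}X'(\Gamma')^{-1}\Gamma^{-1}\right]\mathbf{y} \\
&= \frac{1}{n-p}\,\mathbf{y}'\left[V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}\right]\mathbf{y}
\end{aligned}
\]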

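Summary

The Model: $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ where $\boldsymbol{\varepsilon}$ has a $N(\mathbf{0}, \sigma^2 V)$ distribution, $V$ known.

MLE's:

1. $\hat{\boldsymbol{\beta}} = (X'V^{-1}X)^{-1}X'V^{-1}\mathbf{y}$ has a $N\!\left(\boldsymbol{\beta}, \sigma^2 (X'V^{-1}X)^{-1}\right)$ distribution.

2. $s^2 = \dfrac{1}{n-p}\,\mathbf{y}'\left[V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}\right]\mathbf{y}$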
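The two routes to the estimates (the direct formulas in $V^{-1}$, and ordinary least squares on the transformed data $\mathbf{u} = \Gamma^{-1}\mathbf{y}$, $W = \Gamma^{-1}X$) can be compared numerically. The following is a minimal sketch, assuming NumPy, a Cholesky factor as one possible choice of $\Gamma$, and made-up data; the function names `gls_direct` and `gls_whitened` are illustrative and not part of the slides.

```python
import numpy as np

def gls_direct(X, y, V):
    """beta_hat = (X'V^-1 X)^-1 X'V^-1 y and s^2, using the summary formulas."""
    n, p = X.shape
    Vinv = np.linalg.inv(V)
    A = X.T @ Vinv @ X                            # X' V^-1 X
    beta_hat = np.linalg.solve(A, X.T @ Vinv @ y)
    # y'[V^-1 - V^-1 X (X'V^-1 X)^-1 X'V^-1] y / (n - p)
    s2 = (y @ Vinv @ y - y @ Vinv @ X @ beta_hat) / (n - p)
    return beta_hat, s2, A

def gls_whitened(X, y, V):
    """Same estimates via u = Gamma^-1 y, W = Gamma^-1 X with Gamma Gamma' = V."""
    n, p = X.shape
    Gamma = np.linalg.cholesky(V)                 # lower triangular, Gamma @ Gamma.T == V
    W = np.linalg.solve(Gamma, X)
    u = np.linalg.solve(Gamma, y)
    beta_hat, *_ = np.linalg.lstsq(W, u, rcond=None)   # OLS on the transformed model
    resid = u - W @ beta_hat
    s2 = resid @ resid / (n - p)
    return beta_hat, s2

# Made-up data: an assumed "known" V with entries 0.6^|i-j|.
rng = np.random.default_rng(0)
n = 30
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
idx = np.arange(n)
V = 0.6 ** np.abs(idx[:, None] - idx[None, :])
y = X @ np.array([2.0, 0.5]) + np.linalg.cholesky(V) @ rng.normal(size=n)

beta_d, s2_d, A = gls_direct(X, y, V)
beta_w, s2_w = gls_whitened(X, y, V)
cov_beta = s2_d * np.linalg.inv(A)                # estimate of sigma^2 (X'V^-1 X)^-1
print(beta_d, beta_w)
print(s2_d, s2_w)
```

Any $\Gamma$ with $\Gamma\Gamma' = V$ would do; the Cholesky factor is used here only because it is convenient to compute.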

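Example

The Model: $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 V)$

Simple Linear Model where variance is proportional to $X^2$.

\[
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},\quad
X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix},\quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
\quad\text{and}\quad
V = \begin{bmatrix} x_1^2 & 0 & \cdots & 0 \\ 0 & x_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & x_n^2 \end{bmatrix}
\]

Thus

\[
V^{-1} = \begin{bmatrix} \frac{1}{x_1^2} & 0 & \cdots & 0 \\ 0 & \frac{1}{x_2^2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{x_n^2} \end{bmatrix}
\]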

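and

\[
X'V^{-1} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} \frac{1}{x_1^2} & 0 & \cdots & 0 \\ 0 & \frac{1}{x_2^2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{x_n^2} \end{bmatrix}
= \begin{bmatrix} \frac{1}{x_1^2} & \frac{1}{x_2^2} & \cdots & \frac{1}{x_n^2} \\ \frac{1}{x_1} & \frac{1}{x_2} & \cdots & \frac{1}{x_n} \end{bmatrix}
\]

Hence

\[
X'V^{-1}X = \begin{bmatrix} \frac{1}{x_1^2} & \cdots & \frac{1}{x_n^2} \\ \frac{1}{x_1} & \cdots & \frac{1}{x_n} \end{bmatrix}
\begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^n \frac{1}{x_i^2} & \sum_{i=1}^n \frac{1}{x_i} \\ \sum_{i=1}^n \frac{1}{x_i} & n \end{bmatrix}
\]

and

\[
X'V^{-1}\mathbf{y} = \begin{bmatrix} \frac{1}{x_1^2} & \cdots & \frac{1}{x_n^2} \\ \frac{1}{x_1} & \cdots & \frac{1}{x_n} \end{bmatrix}
\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^n \frac{y_i}{x_i^2} \\ \sum_{i=1}^n \frac{y_i}{x_i} \end{bmatrix}
\]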

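\[
\hat{\boldsymbol{\beta}} = \begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \end{bmatrix}
= (X'V^{-1}X)^{-1}X'V^{-1}\mathbf{y}
= \begin{bmatrix} \sum_{i=1}^n \frac{1}{x_i^2} & \sum_{i=1}^n \frac{1}{x_i} \\ \sum_{i=1}^n \frac{1}{x_i} & n \end{bmatrix}^{-1}
\begin{bmatrix} \sum_{i=1}^n \frac{y_i}{x_i^2} \\ \sum_{i=1}^n \frac{y_i}{x_i} \end{bmatrix}
\]

\[
= \frac{1}{n\sum_{i=1}^n \frac{1}{x_i^2} - \left(\sum_{i=1}^n \frac{1}{x_i}\right)^2}
\begin{bmatrix} n & -\sum_{i=1}^n \frac{1}{x_i} \\ -\sum_{i=1}^n \frac{1}{x_i} & \sum_{i=1}^n \frac{1}{x_i^2} \end{bmatrix}
\begin{bmatrix} \sum_{i=1}^n \frac{y_i}{x_i^2} \\ \sum_{i=1}^n \frac{y_i}{x_i} \end{bmatrix}
\]

\[
\hat\beta_0 = \frac{n\sum_{i=1}^n \frac{y_i}{x_i^2} - \sum_{i=1}^n \frac{1}{x_i}\sum_{i=1}^n \frac{y_i}{x_i}}{n\sum_{i=1}^n \frac{1}{x_i^2} - \left(\sum_{i=1}^n \frac{1}{x_i}\right)^2}
\]

\[
\hat\beta_1 = \frac{\sum_{i=1}^n \frac{1}{x_i^2}\sum_{i=1}^n \frac{y_i}{x_i} - \sum_{i=1}^n \frac{1}{x_i}\sum_{i=1}^n \frac{y_i}{x_i^2}}{n\sum_{i=1}^n \frac{1}{x_i^2} - \left(\sum_{i=1}^n \frac{1}{x_i}\right)^2}
\]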
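As a check on the algebra for the case $V = \operatorname{diag}(x_1^2, \ldots, x_n^2)$, the closed-form expressions for $\hat\beta_0$ and $\hat\beta_1$ can be compared with the matrix formula $(X'V^{-1}X)^{-1}X'V^{-1}\mathbf{y}$. This is a sketch only, assuming NumPy and made-up data; the helper name `wls_var_prop_x2` is hypothetical.

```python
import numpy as np

def wls_var_prop_x2(x, y):
    """Closed-form estimates when var(eps_i) = sigma^2 * x_i^2."""
    n = len(x)
    s1 = np.sum(1.0 / x)          # sum 1/x_i
    s2 = np.sum(1.0 / x**2)       # sum 1/x_i^2
    t1 = np.sum(y / x)            # sum y_i/x_i
    t2 = np.sum(y / x**2)         # sum y_i/x_i^2
    det = n * s2 - s1**2
    b0 = (n * t2 - s1 * t1) / det
    b1 = (s2 * t1 - s1 * t2) / det
    return b0, b1

# Made-up data with error standard deviation proportional to x.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=40)
y = 1.0 + 2.0 * x + 0.3 * x * rng.normal(size=40)

b0, b1 = wls_var_prop_x2(x, y)

# Matrix version of the same estimator.
X = np.column_stack([np.ones_like(x), x])
Vinv = np.diag(1.0 / x**2)
beta_matrix = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
print(b0, b1, beta_matrix)

# With noise-free data the true coefficients are recovered.
b0_exact, b1_exact = wls_var_prop_x2(x, 1.0 + 2.0 * x)
```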

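Testing and Confidence Intervals

The Model:

(1) $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ where $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 V)$

can be converted to the model

(2) $\mathbf{u} = W\boldsymbol{\beta} + \boldsymbol{\eta}$ where $\boldsymbol{\eta} \sim N(\mathbf{0}, \sigma^2 I)$

where $V = \Gamma\Gamma'$, $\mathbf{u} = \Gamma^{-1}\mathbf{y}$, $W = \Gamma^{-1}X$ and $\boldsymbol{\eta} = \Gamma^{-1}\boldsymbol{\varepsilon}$.

With model (2) the test statistic for testing $H_0: H\boldsymbol{\beta} = \mathbf{h}$ is

\[
F = \frac{\frac{1}{q}\,(H\hat{\boldsymbol{\beta}} - \mathbf{h})'\left[H(W'W)^{-1}H'\right]^{-1}(H\hat{\boldsymbol{\beta}} - \mathbf{h})}{\frac{1}{n-p}\,(\mathbf{u} - W\hat{\boldsymbol{\beta}})'(\mathbf{u} - W\hat{\boldsymbol{\beta}})}
\]

Now

\[
W'W = X'(\Gamma')^{-1}\Gamma^{-1}X = X'V^{-1}X
\]

and

\[
(\mathbf{u} - W\hat{\boldsymbol{\beta}})'(\mathbf{u} - W\hat{\boldsymbol{\beta}}) = \mathbf{y}'\left[V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}\right]\mathbf{y}
\]

Thus

\[
F = \frac{\frac{1}{q}\,(H\hat{\boldsymbol{\beta}} - \mathbf{h})'\left[H(X'V^{-1}X)^{-1}H'\right]^{-1}(H\hat{\boldsymbol{\beta}} - \mathbf{h})}{\frac{1}{n-p}\left[\mathbf{y}'V^{-1}\mathbf{y} - \mathbf{y}'V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}\mathbf{y}\right]}
\]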

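Simultaneous Confidence Intervals (using model (2))

\[
\mathbf{c}'H\hat{\boldsymbol{\beta}} \pm \sqrt{q\,F_\alpha(q, n-p)}\; s\,\sqrt{\mathbf{c}'H(W'W)^{-1}H'\mathbf{c}}
\]

or

\[
\mathbf{c}'H\hat{\boldsymbol{\beta}} \pm \sqrt{q\,F_\alpha(q, n-p)}\; s\,\sqrt{\mathbf{c}'H(X'V^{-1}X)^{-1}H'\mathbf{c}}
\]

form a set of $(1-\alpha)100\%$ simultaneous CI's for the parameters $\mathbf{c}'H\boldsymbol{\beta}$, for all $\mathbf{c}$.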
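The equality of the $F$ statistic computed from model (1) (in terms of $V^{-1}$) and from model (2) (ordinary least squares on $\mathbf{u}$ and $W$) can be verified numerically. The sketch below assumes NumPy, takes $q$ to be the number of rows of $H$, uses a Cholesky factor for $\Gamma$, and runs on made-up data; the function names are illustrative only. It computes the statistic but not a p-value or the critical value $F_\alpha(q, n-p)$.

```python
import numpy as np

def f_stat_model1(X, y, V, H, h):
    """F statistic for H0: H beta = h written with V^-1 (model 1)."""
    n, p = X.shape
    q = H.shape[0]
    Vinv = np.linalg.inv(V)
    A_inv = np.linalg.inv(X.T @ Vinv @ X)
    beta_hat = A_inv @ X.T @ Vinv @ y
    d = H @ beta_hat - h
    num = d @ np.linalg.solve(H @ A_inv @ H.T, d) / q
    den = (y @ Vinv @ y - y @ Vinv @ X @ beta_hat) / (n - p)
    return num / den

def f_stat_model2(W, u, H, h):
    """Usual F statistic for the transformed model u = W beta + eta (model 2)."""
    n, p = W.shape
    q = H.shape[0]
    WtW_inv = np.linalg.inv(W.T @ W)
    beta_hat = WtW_inv @ W.T @ u
    d = H @ beta_hat - h
    num = d @ np.linalg.solve(H @ WtW_inv @ H.T, d) / q
    resid = u - W @ beta_hat
    den = resid @ resid / (n - p)
    return num / den

# Made-up data: quadratic regression, testing that both slope terms are zero.
rng = np.random.default_rng(0)
n = 20
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x, x**2])
idx = np.arange(n)
V = 0.5 ** np.abs(idx[:, None] - idx[None, :])
Gamma = np.linalg.cholesky(V)
y = X @ np.array([1.0, 0.5, 0.0]) + Gamma @ rng.normal(size=n)

H = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
h = np.zeros(2)

F1 = f_stat_model1(X, y, V, H, h)
F2 = f_stat_model2(np.linalg.solve(Gamma, X), np.linalg.solve(Gamma, y), H, h)
print(F1, F2)
```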

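Example: Simple Linear Model with no intercept

\[
E[Y \mid X] = \beta X \quad\text{and}\quad \operatorname{Var}[Y \mid X] = \sigma^2 f(X), \quad\text{i.e.}\quad Y = \beta X + \varepsilon,\ \operatorname{Var}[\varepsilon \mid X] = \sigma^2 f(X)
\]

$f(X)$ is known. Possible choices include $f(x) = 1$, $f(x) = x$, $f(x) = x^2$ and $f(x) = \ln x$.

The model

\[
\mathbf{y} = \beta\mathbf{x} + \boldsymbol{\varepsilon}, \qquad \operatorname{var}(\boldsymbol{\varepsilon}) = \sigma^2 V
\]

\[
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},\quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\quad\text{and}\quad
V = \begin{bmatrix} f(x_1) & 0 & \cdots & 0 \\ 0 & f(x_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & f(x_n) \end{bmatrix}
\]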

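Thus $\hat\beta = (\mathbf{x}'V^{-1}\mathbf{x})^{-1}\mathbf{x}'V^{-1}\mathbf{y}$

\[
V^{-1} = \begin{bmatrix} \frac{1}{f(x_1)} & 0 & \cdots & 0 \\ 0 & \frac{1}{f(x_2)} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{f(x_n)} \end{bmatrix}
\]

\[
\mathbf{x}'V^{-1}\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} V^{-1} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \sum_{i=1}^n \frac{x_i^2}{f(x_i)}
\]

\[
\mathbf{x}'V^{-1}\mathbf{y} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} V^{-1} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \sum_{i=1}^n \frac{x_i y_i}{f(x_i)}
\]

and

\[
\hat\beta = \frac{\sum_{i=1}^n \frac{x_i y_i}{f(x_i)}}{\sum_{i=1}^n \frac{x_i^2}{f(x_i)}}
\]

Also

\[
\operatorname{var}(\hat\beta) = \sigma^2 (\mathbf{x}'V^{-1}\mathbf{x})^{-1} = \frac{\sigma^2}{\sum_{i=1}^n \frac{x_i^2}{f(x_i)}}
\]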

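Special cases

1. $\operatorname{var}(\varepsilon_i) = \sigma^2$, i.e. $f(x_i) = 1$

\[
\hat\beta = \frac{\sum_{i=1}^n \frac{x_i y_i}{f(x_i)}}{\sum_{i=1}^n \frac{x_i^2}{f(x_i)}} = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2},
\qquad
\operatorname{var}(\hat\beta) = \frac{\sigma^2}{\sum_{i=1}^n \frac{x_i^2}{f(x_i)}} = \frac{\sigma^2}{\sum_{i=1}^n x_i^2}
\]

2. $\operatorname{var}(\varepsilon_i) = \sigma^2 x_i$, i.e. $f(x_i) = x_i$

\[
\hat\beta = \frac{\sum_{i=1}^n \frac{x_i y_i}{f(x_i)}}{\sum_{i=1}^n \frac{x_i^2}{f(x_i)}} = \frac{\sum_{i=1}^n y_i}{\sum_{i=1}^n x_i} = \frac{\bar y}{\bar x},
\qquad
\operatorname{var}(\hat\beta) = \frac{\sigma^2}{\sum_{i=1}^n \frac{x_i^2}{f(x_i)}} = \frac{\sigma^2}{\sum_{i=1}^n x_i} = \frac{\sigma^2}{n\bar x}
\]

3. $\operatorname{var}(\varepsilon_i) = \sigma^2 x_i^2$, i.e. $f(x_i) = x_i^2$

\[
\hat\beta = \frac{\sum_{i=1}^n \frac{x_i y_i}{f(x_i)}}{\sum_{i=1}^n \frac{x_i^2}{f(x_i)}} = \frac{1}{n}\sum_{i=1}^n \frac{y_i}{x_i},
\qquad
\operatorname{var}(\hat\beta) = \frac{\sigma^2}{\sum_{i=1}^n \frac{x_i^2}{f(x_i)}} = \frac{\sigma^2}{n}
\]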
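The three special cases follow from one weighted formula, which makes them easy to check numerically. The following sketch assumes NumPy and made-up positive $x$ values; `beta_no_intercept` is an illustrative name, not something defined in the slides.

```python
import numpy as np

def beta_no_intercept(x, y, f):
    """beta_hat = sum(x*y/f(x)) / sum(x^2/f(x)) for y = beta*x + eps, var(eps) = sigma^2 f(x)."""
    w = 1.0 / f(x)
    return np.sum(w * x * y) / np.sum(w * x * x)

# Made-up data with positive x so that f(x) = x and f(x) = x^2 are valid variances.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=25)
y = 3.0 * x + rng.normal(size=25)

b_const = beta_no_intercept(x, y, lambda t: np.ones_like(t))  # f = 1
b_lin = beta_no_intercept(x, y, lambda t: t)                  # f = x
b_quad = beta_no_intercept(x, y, lambda t: t**2)              # f = x^2

print(b_const, np.sum(x * y) / np.sum(x**2))   # case 1: sum(xy)/sum(x^2)
print(b_lin, y.mean() / x.mean())              # case 2: ybar/xbar
print(b_quad, np.mean(y / x))                  # case 3: mean of y_i/x_i
```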

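General Linear Model

Case 2: $\Sigma$ unknown

The General Linear Model, $\Sigma$ unknown

Let

\[
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},\quad
X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix},\quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix}
\]

\[
\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\]

where $\boldsymbol{\varepsilon}$ has a $N(\mathbf{0}, \Sigma)$ distribution, $\Sigma$ unknown.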

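The General Linear Model, $\Sigma$ unknown

Consider $\hat{\boldsymbol{\beta}} = (X'X)^{-1}X'\mathbf{y}$

Call this the Ordinary Least Squares (OLS) estimator of $\boldsymbol{\beta}$.

Note:

\[
E[\hat{\boldsymbol{\beta}}] = (X'X)^{-1}X'E[\mathbf{y}] = (X'X)^{-1}X'X\boldsymbol{\beta} = \boldsymbol{\beta}
\]

Thus the Ordinary Least Squares (OLS) estimator of $\boldsymbol{\beta}$ is always unbiased.

\[
\operatorname{var}(\hat{\boldsymbol{\beta}}) = (X'X)^{-1}X'\operatorname{var}(\mathbf{y})X(X'X)^{-1} = (X'X)^{-1}X'\Sigma X(X'X)^{-1}
\]

\[
\operatorname{var}(\mathbf{c}'\hat{\boldsymbol{\beta}}) = \mathbf{c}'(X'X)^{-1}X'\Sigma X(X'X)^{-1}\mathbf{c}
\]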

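Let $\hat{\boldsymbol{\beta}}^* = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\mathbf{y}$

This is the Optimal (UMVU) estimator of $\boldsymbol{\beta}$.

Note:

\[
E[\hat{\boldsymbol{\beta}}^*] = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}E[\mathbf{y}] = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}X\boldsymbol{\beta} = \boldsymbol{\beta}
\]

This is also an unbiased estimator of $\boldsymbol{\beta}$.

\[
\begin{aligned}
\operatorname{var}(\hat{\boldsymbol{\beta}}^*) &= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\operatorname{var}(\mathbf{y})\,\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1} \\
&= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\Sigma\,\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1} \\
&= (X'\Sigma^{-1}X)^{-1}
\end{aligned}
\]

\[
\operatorname{var}(\mathbf{c}'\hat{\boldsymbol{\beta}}^*) = \mathbf{c}'(X'\Sigma^{-1}X)^{-1}\mathbf{c}
\]

The Optimal (UMVU) estimator of $\boldsymbol{\beta}$ requires knowledge of $\Sigma$ in order to calculate it.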
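The two variance formulas can be compared directly: $(X'X)^{-1}X'\Sigma X(X'X)^{-1}$ for OLS and $(X'\Sigma^{-1}X)^{-1}$ for the optimal estimator. The sketch below assumes NumPy and an illustrative $\Sigma$ with entries $0.8^{|i-j|}$ (not taken from the slides); it checks that the difference of the two matrices has no negative eigenvalues, so that $\operatorname{var}(\mathbf{c}'\hat{\boldsymbol{\beta}}) \ge \operatorname{var}(\mathbf{c}'\hat{\boldsymbol{\beta}}^*)$ for every $\mathbf{c}$.

```python
import numpy as np

n = 25
t = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), t])

# Illustrative covariance matrix (an assumption for this sketch).
idx = np.arange(n)
Sigma = 0.8 ** np.abs(idx[:, None] - idx[None, :])

XtX_inv = np.linalg.inv(X.T @ X)
var_ols = XtX_inv @ X.T @ Sigma @ X @ XtX_inv             # (X'X)^-1 X' Sigma X (X'X)^-1
var_opt = np.linalg.inv(X.T @ np.linalg.solve(Sigma, X))  # (X' Sigma^-1 X)^-1

diff = var_ols - var_opt
diff = (diff + diff.T) / 2.0                              # symmetrize against rounding
eigs = np.linalg.eigvalsh(diff)

c = np.array([0.0, 1.0])                                  # picks out the slope
var_c_ols = c @ var_ols @ c
var_c_opt = c @ var_opt @ c
print(eigs)
print(var_c_ols, var_c_opt)
```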

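Theorem: Equivalence of OLS estimator with UMVU estimator

Let $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ with $E[\boldsymbol{\varepsilon}] = \mathbf{0}$ and $\operatorname{var}(\boldsymbol{\varepsilon}) = \Sigma$.

If there exists a non-singular matrix $F$ such that $\Sigma X = XF$,

then $\hat{\boldsymbol{\beta}} = (X'X)^{-1}X'\mathbf{y}$ (the OLS estimator of $\boldsymbol{\beta}$) is equivalent to

$\hat{\boldsymbol{\beta}}^* = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\mathbf{y}$ (the UMVU estimator of $\boldsymbol{\beta}$), and

\[
\operatorname{var}(\mathbf{c}'\hat{\boldsymbol{\beta}}^*) = \mathbf{c}'(X'\Sigma^{-1}X)^{-1}\mathbf{c} = \mathbf{c}'(X'X)^{-1}X'\Sigma X(X'X)^{-1}\mathbf{c} = \operatorname{var}(\mathbf{c}'\hat{\boldsymbol{\beta}})
\]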

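Proof

\[
\Sigma X = XF \;\Rightarrow\; X = \Sigma^{-1}XF \;\Rightarrow\; XF^{-1} = \Sigma^{-1}X
\]

Since $\Sigma$ is symmetric, transposing gives $X'\Sigma^{-1} = (F')^{-1}X'$. Hence

\[
\begin{aligned}
(X'X)^{-1}X' &= (X'X)^{-1}F'(F')^{-1}X' \\
&= \left[(F')^{-1}X'X\right]^{-1}(F')^{-1}X' \\
&= \left[X'\Sigma^{-1}X\right]^{-1}X'\Sigma^{-1}
\end{aligned}
\]

thus

\[
\hat{\boldsymbol{\beta}} = (X'X)^{-1}X'\mathbf{y} = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\mathbf{y} = \hat{\boldsymbol{\beta}}^*
\]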

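Application: Consider the general linear model with intercept

\[
\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\]

where $\boldsymbol{\varepsilon}$ has a $N(\mathbf{0}, \Sigma)$ distribution,

\[
X = \begin{bmatrix} \mathbf{1} & X_1 \end{bmatrix},\quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \boldsymbol{\beta}_1 \end{bmatrix}
\quad\text{and}\quad
\Sigma = \begin{bmatrix} \sigma^2 & \rho\sigma^2 & \cdots & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 & \cdots & \rho\sigma^2 \\ \vdots & & \ddots & \vdots \\ \rho\sigma^2 & \rho\sigma^2 & \cdots & \sigma^2 \end{bmatrix}
= \sigma^2\left[(1-\rho)I + \rho\,\mathbf{1}\mathbf{1}'\right]
\]

In this case the error terms are equally correlated. Also in this case the OLS estimators are equivalent to the UMVU estimators.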

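Proof

Let

\[
F = \begin{bmatrix} \sigma^2\left[1 + (n-1)\rho\right] & \sigma^2\rho\,\mathbf{1}'X_1 \\ \mathbf{0} & \sigma^2(1-\rho)I \end{bmatrix}
\]

Then

\[
\begin{aligned}
XF &= \begin{bmatrix} \mathbf{1} & X_1 \end{bmatrix}
\begin{bmatrix} \sigma^2\left[1 + (n-1)\rho\right] & \sigma^2\rho\,\mathbf{1}'X_1 \\ \mathbf{0} & \sigma^2(1-\rho)I \end{bmatrix} \\
&= \begin{bmatrix} \sigma^2\left[1 + (n-1)\rho\right]\mathbf{1} & \sigma^2\rho\,\mathbf{1}\mathbf{1}'X_1 + \sigma^2(1-\rho)X_1 \end{bmatrix} \\
&= \begin{bmatrix} \sigma^2\left[(1-\rho)\mathbf{1} + \rho\,\mathbf{1}\mathbf{1}'\mathbf{1}\right] & \sigma^2\left[(1-\rho)I + \rho\,\mathbf{1}\mathbf{1}'\right]X_1 \end{bmatrix} \\
&= \sigma^2\left[(1-\rho)I + \rho\,\mathbf{1}\mathbf{1}'\right]\begin{bmatrix} \mathbf{1} & X_1 \end{bmatrix} = \Sigma X
\end{aligned}
\]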
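The condition $\Sigma X = XF$ and the resulting equality of the OLS and UMVU estimators can be illustrated numerically for equally correlated errors. This is a sketch under assumed values ($n = 12$, $\rho = 0.4$, $\sigma^2 = 2$, two made-up regressors in $X_1$), using NumPy.

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho, sigma2 = 12, 0.4, 2.0
one = np.ones((n, 1))
X1 = rng.normal(size=(n, 2))            # made-up regressors
X = np.hstack([one, X1])                # X = [1, X1], model with intercept

# Equicorrelated covariance: sigma^2 [(1 - rho) I + rho 1 1']
Sigma = sigma2 * ((1 - rho) * np.eye(n) + rho * one @ one.T)

# F from the proof: upper-left scalar, upper-right row vector, lower-right block.
F = sigma2 * np.block([
    [np.array([[1 + (n - 1) * rho]]), rho * one.T @ X1],
    [np.zeros((2, 1)), (1 - rho) * np.eye(2)],
])

y = rng.normal(size=n)                  # any response vector will do for the identity
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
Sinv_X = np.linalg.solve(Sigma, X)      # Sigma^-1 X
beta_umvu = np.linalg.solve(X.T @ Sinv_X, Sinv_X.T @ y)

print(np.allclose(Sigma @ X, X @ F))
print(beta_ols, beta_umvu)
```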

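Design Matrix, X, not of full rank

The General Linear Model

Let

\[
\underset{n \times 1}{\mathbf{y}} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},\quad
\underset{n \times p}{X} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix},\quad
\underset{p \times 1}{\boldsymbol{\beta}} = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix}
\]

\[
\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\]

where $\boldsymbol{\varepsilon}$ has a $N(\mathbf{0}, \sigma^2 I)$ distribution.

Suppose that the rank of $X$ is $r \le p \le n$.

If the rank of X is equal to p then the columns of X are linearly independent and there is a unique way of representing $\boldsymbol{\mu} = E[\mathbf{y}]$ in the form

\[
\boldsymbol{\mu} = X\boldsymbol{\beta} = \beta_1\mathbf{x}_1 + \beta_2\mathbf{x}_2 + \cdots + \beta_p\mathbf{x}_p
\]

If the rank of X is strictly less than p then there is no unique way of representing $\boldsymbol{\mu}$ in the form $\boldsymbol{\mu} = X\boldsymbol{\beta}$.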

Comment: Usually the situation where the rank of X, r < p, arises in the following instances.

1. The design of the study (the choice of the values of X1, X2, …, Xp) was not careful enough to ensure that X had full rank.

2. Observations were missing, causing the model to be altered. Elements of $\mathbf{y}$ are deleted along with the corresponding rows of X, reducing the number of linearly independent rows from p to r.

3. The model was defined in such a way that the representation $\mu_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip}$ does not determine $\beta_1, \beta_2, \ldots, \beta_p$ uniquely.

Two basic approaches:

1. Impose p – r linear restrictions on the parameters.
• This allows us to reduce the number of parameters to r.
• $\boldsymbol{\mu} = X\boldsymbol{\beta}$ will have a unique representation if the p – r restrictions are added.
• This technique is usually used with ANOVA, MANOVA, ANACOVA models.

2. Live with the singularity.
• Restrict our attention to linear combinations $\mathbf{c}'\boldsymbol{\beta}$ of the parameters that have unique estimators.

The two approaches are essentially the same (lead to the same conclusions).

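Recall: Linear Equations theory

Consider the system of linear equations

\[
\underset{m \times n}{A}\;\underset{n \times 1}{\mathbf{x}} = \underset{m \times 1}{\mathbf{b}}, \qquad A, \mathbf{b} \text{ known},\ \mathbf{x} \text{ unknown}
\]

The equations are consistent if $\mathbf{b} \in M(A)$, the linear space spanned by the columns of A.

Let $A^-$ = the g-inverse of $A$, i.e.

1. $AA^-A = A$,

2. $A^-AA^- = A^-$,

3. $(AA^-)' = AA^-$ and

4. $(A^-A)' = A^-A$.

Then the general solution to the system of linear equations $A\mathbf{x} = \mathbf{b}$ is

\[
\mathbf{x} = A^-\mathbf{b} + (I - A^-A)\mathbf{z} \quad\text{where } \mathbf{z} \text{ is arbitrary.}
\]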
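The general solution can be illustrated with a small singular system. The sketch below assumes NumPy and uses `np.linalg.pinv` (the Moore-Penrose inverse, which satisfies the four conditions listed) as the g-inverse; the matrix $A$ and the explicit cutoff `rcond=1e-10` are choices made for this illustration only.

```python
import numpy as np

# A singular 3x3 matrix (row 2 is twice row 1), so rank(A) = 2.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
b = A @ np.array([1.0, 1.0, 1.0])      # b lies in M(A), so the system is consistent

A_g = np.linalg.pinv(A, rcond=1e-10)   # Moore-Penrose inverse used as the g-inverse

# The four defining conditions.
cond1 = np.allclose(A @ A_g @ A, A)
cond2 = np.allclose(A_g @ A @ A_g, A_g)
cond3 = np.allclose((A @ A_g).T, A @ A_g)
cond4 = np.allclose((A_g @ A).T, A_g @ A)

# Different choices of z give different solutions of A x = b.
rng = np.random.default_rng(2)
solutions = []
for _ in range(3):
    z = rng.normal(size=3)
    x = A_g @ b + (np.eye(3) - A_g @ A) @ z
    solutions.append(x)

residuals = [np.linalg.norm(A @ x - b) for x in solutions]
print(cond1, cond2, cond3, cond4)
print(residuals)
```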

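Let $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ where $\boldsymbol{\varepsilon}$ has a $N(\mathbf{0}, \sigma^2 I)$ distribution.

Maximum Likelihood Estimation leads to the system of linear equations

\[
X'X\hat{\boldsymbol{\beta}} = X'\mathbf{y}
\]

• p equations in p unknowns
• called the Normal equations

If the rank of $X$ = p then $(X'X)^{-1}$ exists, and the solution is

\[
\hat{\boldsymbol{\beta}} = (X'X)^{-1}X'\mathbf{y}
\]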

Theorem: The Normal equations

\[
X'X\hat{\boldsymbol{\beta}} = X'\mathbf{y}
\]

are consistent.

Proof: It can be shown that $M(X'X) = M(X')$.

Since $X'\mathbf{y} \in M(X')$, then $X'\mathbf{y} \in M(X'X)$.

Theorem: The general solution to the Normal equations is

\[
\hat{\boldsymbol{\beta}} = (X'X)^-X'\mathbf{y} + \left[I - (X'X)^-(X'X)\right]\mathbf{z}
\]

where $\mathbf{z}$ is arbitrary.

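Theorem

\[
R(\hat{\boldsymbol{\beta}}) = (\mathbf{y} - X\hat{\boldsymbol{\beta}})'(\mathbf{y} - X\hat{\boldsymbol{\beta}})
\]

is the same for all solutions of the Normal equations.

Proof: the general solution to the Normal equations is

\[
\hat{\boldsymbol{\beta}} = (X'X)^-X'\mathbf{y} + \left[I - (X'X)^-(X'X)\right]\mathbf{z}
\]

where $\mathbf{z}$ is arbitrary. Then

\[
\mathbf{y} - X\hat{\boldsymbol{\beta}} = \mathbf{y} - X(X'X)^-X'\mathbf{y} - X\left[I - (X'X)^-(X'X)\right]\mathbf{z}
\]

Since $M(X'X) = M(X')$ there exists a p × n matrix $L$ such that $X' = X'XL$ or $X = L'X'X$,

hence

\[
X\left[I - (X'X)^-(X'X)\right] = L'X'X\left[I - (X'X)^-(X'X)\right] = L'\left[X'X - X'X(X'X)^-X'X\right] = 0
\]

thus

\[
\mathbf{y} - X\hat{\boldsymbol{\beta}} = \mathbf{y} - X(X'X)^-X'\mathbf{y} = \left[I - X(X'X)^-X'\right]\mathbf{y} = E\mathbf{y}
\quad\text{where } E = I - X(X'X)^-X'
\]

hence

\[
R(\hat{\boldsymbol{\beta}}) = (\mathbf{y} - X\hat{\boldsymbol{\beta}})'(\mathbf{y} - X\hat{\boldsymbol{\beta}}) = \mathbf{y}'E'E\mathbf{y} = \mathbf{y}'E\mathbf{y}
\]

since

\[
\begin{aligned}
E'E &= \left[I - X(X'X)^-X'\right]\left[I - X(X'X)^-X'\right] \\
&= I - X(X'X)^-X' - X(X'X)^-X' + X(X'X)^-X'X(X'X)^-X' \\
&= I - X(X'X)^-X' = E
\end{aligned}
\]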
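The invariance of the residual sum of squares can be seen on a design matrix that is deliberately rank deficient. The following is a sketch assuming NumPy, with `np.linalg.pinv` standing in for $(X'X)^-$ and made-up data; the helper name `solution_and_rss` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
x1 = rng.normal(size=n)
# Third column = first column + 2 * second column, so rank(X) = 2 < p = 3.
X = np.column_stack([np.ones(n), x1, 1.0 + 2.0 * x1])
y = rng.normal(size=n)

XtX = X.T @ X
G = np.linalg.pinv(XtX, rcond=1e-10)     # one choice of (X'X)^-
E = np.eye(n) - X @ G @ X.T              # E = I - X (X'X)^- X'

def solution_and_rss(z):
    """A solution of the normal equations for a given z, and its R(beta_hat)."""
    beta = G @ X.T @ y + (np.eye(3) - G @ XtX) @ z
    r = y - X @ beta
    return beta, r @ r

beta_a, rss_a = solution_and_rss(np.zeros(3))
beta_b, rss_b = solution_and_rss(10.0 * rng.normal(size=3))

print(beta_a, beta_b)                    # different solutions
print(rss_a, rss_b, y @ E @ y)           # same residual sum of squares
```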

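Definition: (Estimability) The linear function of the parameter vector, $\mathbf{c}'\boldsymbol{\beta}$, is called estimable if there exists a vector $\mathbf{a}$ such that

\[
E[\mathbf{a}'\mathbf{y}] = \mathbf{c}'\boldsymbol{\beta}
\]

Example: The simple linear model

\[
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad\text{for } i = 1, 2, \ldots, n
\]

Suppose $x_1 = x_2 = \cdots = x_n = x$ (some common value). Let $\mathbf{a} = (a_1, a_2, \ldots, a_n)'$ denote an arbitrary vector.

Then

\[
\begin{aligned}
E[\mathbf{a}'\mathbf{y}] &= a_1E[y_1] + a_2E[y_2] + \cdots + a_nE[y_n] \\
&= a_1(\beta_0 + \beta_1 x) + a_2(\beta_0 + \beta_1 x) + \cdots + a_n(\beta_0 + \beta_1 x) \\
&= \left(\sum_{i=1}^n a_i\right)(\beta_0 + \beta_1 x) = c(\beta_0 + \beta_1 x) = c\beta_0 + cx\beta_1
\end{aligned}
\]

Thus $c(\beta_0 + \beta_1 x) = c\beta_0 + cx\beta_1$ is the only estimable function of $\beta_0$, $\beta_1$.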

[Figure: the line $Y = \beta_0 + \beta_1 X$, with the common value $x$ and the height $\beta_0 + \beta_1 x$ marked.]

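Theorem: The following conditions are equivalent

1. $\mathbf{c}'\boldsymbol{\beta}$ is estimable.

2. For some solution, $\hat{\boldsymbol{\beta}}$, of the Normal equations, $\mathbf{c}'\hat{\boldsymbol{\beta}}$ is a linear (in $\mathbf{y}$) unbiased estimate of $\mathbf{c}'\boldsymbol{\beta}$.

3. $\mathbf{c}'\hat{\boldsymbol{\beta}}$ is unique for all solutions of the Normal equations.

4. $\mathbf{c}$ belongs to $M(X'X)$.

5. $\mathbf{c}$ belongs to $M(X')$.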

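Proof: Assume 1. $\mathbf{c}'\boldsymbol{\beta}$ is estimable.

Then there exists a vector $\mathbf{a}$ such that $E[\mathbf{a}'\mathbf{y}] = \mathbf{c}'\boldsymbol{\beta}$,

i.e. $E[\mathbf{a}'\mathbf{y}] = \mathbf{a}'E[\mathbf{y}] = \mathbf{a}'X\boldsymbol{\beta} = \mathbf{c}'\boldsymbol{\beta}$.

Thus $\mathbf{a}'X = \mathbf{c}'$ or $\mathbf{c} = X'\mathbf{a}$, or $\mathbf{c}$ belongs to $M(X')$.

Thus 1. implies 5. (as well as 4.)

Now assume 4., i.e. there exists $\mathbf{b}$ such that $\mathbf{c} = X'X\mathbf{b}$.

\[
\hat{\boldsymbol{\beta}} = (X'X)^-X'\mathbf{y} + \left[I - (X'X)^-(X'X)\right]\mathbf{z}
\]

\[
\begin{aligned}
\mathbf{c}'\hat{\boldsymbol{\beta}} &= \mathbf{c}'(X'X)^-X'\mathbf{y} + \mathbf{c}'\left[I - (X'X)^-(X'X)\right]\mathbf{z} \\
&= \mathbf{b}'X'X(X'X)^-X'\mathbf{y} + \mathbf{b}'X'X\left[I - (X'X)^-(X'X)\right]\mathbf{z} \\
&= \mathbf{b}'X'X(X'X)^-X'\mathbf{y} + \mathbf{b}'\left[X'X - X'X(X'X)^-X'X\right]\mathbf{z} \\
&= \mathbf{b}'X'X(X'X)^-X'\mathbf{y} \quad\text{(unique for all choices of } \mathbf{z}\text{)}
\end{aligned}
\]

Thus 4. implies 3.

Also $\mathbf{c}'\hat{\boldsymbol{\beta}} = \mathbf{b}'X'X(X'X)^-X'\mathbf{y} = \mathbf{a}'\mathbf{y}$ where $\mathbf{a} = X(X'X)^-X'X\mathbf{b}$, and

\[
E[\mathbf{a}'\mathbf{y}] = E\left[\mathbf{b}'X'X(X'X)^-X'\mathbf{y}\right] = \mathbf{b}'X'X(X'X)^-X'X\boldsymbol{\beta} = \mathbf{b}'X'X\boldsymbol{\beta} = \mathbf{c}'\boldsymbol{\beta}
\]

Thus 4. implies 2. and 1.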

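Example: One-way ANOVA (Analysis of Variance)

Suppose we have k normal populations $N(\mu_i, \sigma^2)$, $i = 1, \ldots, k$.

Let $y_{i1}, y_{i2}, \ldots, y_{in}$ denote a sample of n from $N(\mu_i, \sigma^2)$.

Let $\varepsilon_{ij} = y_{ij} - \mu_i$, then $\varepsilon_{i1}, \varepsilon_{i2}, \ldots, \varepsilon_{in}$ denotes a sample of n from the $N(0, \sigma^2)$ distribution. Writing $\mu_i = \mu + \alpha_i$,

\[
y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad i = 1, \ldots, k;\ j = 1, \ldots, n
\]

where $\varepsilon_{11}, \varepsilon_{12}, \ldots, \varepsilon_{kn}$ are kn independent observations from the $N(0, \sigma^2)$ distribution.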

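Matrix Notation

Let

\[
\mathbf{y}_i = \begin{bmatrix} y_{i1} \\ y_{i2} \\ \vdots \\ y_{in} \end{bmatrix},\quad
\boldsymbol{\varepsilon}_i = \begin{bmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{in} \end{bmatrix},\quad i = 1, \ldots, k,
\qquad
\underset{kn \times 1}{\mathbf{y}} = \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \\ \vdots \\ \mathbf{y}_k \end{bmatrix},\quad
\underset{kn \times 1}{\boldsymbol{\varepsilon}} = \begin{bmatrix} \boldsymbol{\varepsilon}_1 \\ \boldsymbol{\varepsilon}_2 \\ \vdots \\ \boldsymbol{\varepsilon}_k \end{bmatrix}
\]

Let

\[
\underset{kn \times (k+1)}{X} = \begin{bmatrix} \mathbf{1} & \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{1} & \mathbf{0} & \mathbf{1} & \cdots & \mathbf{0} \\ \vdots & \vdots & & \ddots & \vdots \\ \mathbf{1} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{1} \end{bmatrix}
\text{ with } \operatorname{rank}(X) = k,
\qquad
\underset{(k+1) \times 1}{\boldsymbol{\beta}} = \begin{bmatrix} \mu \\ \alpha_1 \\ \vdots \\ \alpha_k \end{bmatrix}
\]

where $\mathbf{1}$ and $\mathbf{0}$ denote $n \times 1$ vectors of ones and zeros.

Then the model is

\[
\mathbf{y} = \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \\ \vdots \\ \mathbf{y}_k \end{bmatrix}
= \begin{bmatrix} \mathbf{1} & \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{1} & \mathbf{0} & \mathbf{1} & \cdots & \mathbf{0} \\ \vdots & \vdots & & \ddots & \vdots \\ \mathbf{1} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{1} \end{bmatrix}
\begin{bmatrix} \mu \\ \alpha_1 \\ \vdots \\ \alpha_k \end{bmatrix}
+ \begin{bmatrix} \boldsymbol{\varepsilon}_1 \\ \boldsymbol{\varepsilon}_2 \\ \vdots \\ \boldsymbol{\varepsilon}_k \end{bmatrix}
= X\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\]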

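Note: (shown for k = 3, n = 2)

\[
X' = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}
\]

$M(X')$ = the linear space spanned by the k vectors

\[
\begin{bmatrix} 1 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix},\quad
\begin{bmatrix} 1 \\ 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix},\quad \ldots,\quad
\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
\]

Thus the estimable parameters are of the form:

\[
\mathbf{c}'\boldsymbol{\beta} = a_1(\mu + \alpha_1) + a_2(\mu + \alpha_2) + \cdots + a_k(\mu + \alpha_k),
\qquad
\mathbf{c} = a_1\begin{bmatrix} 1 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + \cdots + a_k\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
\]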

The common approach is to add the restriction $\sum_{i=1}^k \alpha_i = 0$.

This reduces the number of parameters to k, and converts the model to full rank.
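Whether a given $\mathbf{c}'\boldsymbol{\beta}$ is estimable in the one-way ANOVA model can be checked by testing whether $\mathbf{c}$ lies in $M(X')$. The sketch below assumes NumPy and small illustrative sizes (k = 3, n = 2); it uses the fact that $X^+X$, with $X^+$ the Moore-Penrose inverse, is the orthogonal projector onto $M(X')$. The function name `is_estimable` is hypothetical.

```python
import numpy as np

k, n = 3, 2
# Columns: overall mean mu, then indicators for groups 1..k (alpha_1..alpha_k).
X = np.hstack([np.ones((k * n, 1)), np.kron(np.eye(k), np.ones((n, 1)))])

# Orthogonal projector onto M(X'), the space spanned by the rows of X.
P = np.linalg.pinv(X, rcond=1e-10) @ X

def is_estimable(c, tol=1e-8):
    """c'beta is estimable exactly when c is unchanged by projection onto M(X')."""
    c = np.asarray(c, dtype=float)
    return bool(np.allclose(P @ c, c, atol=tol))

group_mean = is_estimable([1, 1, 0, 0])     # mu + alpha_1
contrast = is_estimable([0, 1, -1, 0])      # alpha_1 - alpha_2
alpha_alone = is_estimable([0, 1, 0, 0])    # alpha_1 by itself
mu_alone = is_estimable([1, 0, 0, 0])       # mu by itself

rank_X = np.linalg.matrix_rank(X)
# Appending the restriction sum(alpha_i) = 0 as an extra row gives full column rank.
X_restricted = np.vstack([X, [0.0, 1.0, 1.0, 1.0]])
rank_restricted = np.linalg.matrix_rank(X_restricted)

print(group_mean, contrast, alpha_alone, mu_alone)
print(rank_X, rank_restricted)
```

Group means and contrasts between groups pass the check, while $\mu$ or a single $\alpha_i$ on its own does not, which matches the form of the estimable parameters given above.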

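Properties of estimable functions:

1. If $\operatorname{rank}(X) = p$, all linear functions $\mathbf{c}'\boldsymbol{\beta}$ are estimable.

Proof: If rank(X) = p then $M(X') = E^p$ = p-dimensional Euclidean space (which contains all p-dimensional vectors).

2. $\mathbf{c}'\boldsymbol{\beta}$ is estimable if $\mathbf{c}'H = \mathbf{c}'$ where $H = (X'X)^-(X'X)$.

Proof:

\[
\hat{\boldsymbol{\beta}} = (X'X)^-X'\mathbf{y} + (I - H)\mathbf{z}
\]

\[
\mathbf{c}'\hat{\boldsymbol{\beta}} = \mathbf{c}'(X'X)^-X'\mathbf{y} + (\mathbf{c}' - \mathbf{c}'H)\mathbf{z} = \mathbf{c}'(X'X)^-X'\mathbf{y}
\]

(unique for all solutions of the normal equations)

Hence $\mathbf{c}'\boldsymbol{\beta}$ is estimable.

3. If $\mathbf{c}'\boldsymbol{\beta}$ and $\mathbf{a}'\boldsymbol{\beta}$ are estimable then

\[
\operatorname{var}(\mathbf{c}'\hat{\boldsymbol{\beta}}) = \sigma^2\,\mathbf{c}'(X'X)^-\mathbf{c}
\quad\text{and}\quad
\operatorname{cov}(\mathbf{c}'\hat{\boldsymbol{\beta}}, \mathbf{a}'\hat{\boldsymbol{\beta}}) = \sigma^2\,\mathbf{c}'(X'X)^-\mathbf{a}
\]

Proof: since $\mathbf{c}'\boldsymbol{\beta}$ and $\mathbf{a}'\boldsymbol{\beta}$ are estimable then

\[
\mathbf{c}'\hat{\boldsymbol{\beta}} = \mathbf{c}'(X'X)^-X'\mathbf{y}
\quad\text{and}\quad
\mathbf{a}'\hat{\boldsymbol{\beta}} = \mathbf{a}'(X'X)^-X'\mathbf{y}
\]

\[
\begin{aligned}
\operatorname{var}(\mathbf{c}'\hat{\boldsymbol{\beta}}) &= \operatorname{var}\left(\mathbf{c}'(X'X)^-X'\mathbf{y}\right) = \mathbf{c}'(X'X)^-X'\operatorname{var}(\mathbf{y})\,X(X'X)^-\mathbf{c} \\
&= \sigma^2\,\mathbf{c}'(X'X)^-X'\,I\,X(X'X)^-\mathbf{c} = \sigma^2\,\mathbf{c}'(X'X)^-X'X(X'X)^-\mathbf{c} \\
&= \sigma^2\,\mathbf{c}'(X'X)^-\mathbf{c}
\end{aligned}
\]

Also

\[
\begin{aligned}
\operatorname{cov}(\mathbf{c}'\hat{\boldsymbol{\beta}}, \mathbf{a}'\hat{\boldsymbol{\beta}}) &= \mathbf{c}'(X'X)^-X'\operatorname{var}(\mathbf{y})\,X(X'X)^-\mathbf{a} \\
&= \sigma^2\,\mathbf{c}'(X'X)^-X'X(X'X)^-\mathbf{a} \\
&= \sigma^2\,\mathbf{c}'(X'X)^-\mathbf{a}
\end{aligned}
\]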
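Property 3 can be checked on the one-way ANOVA design, where the answers are known from elementary results: a difference of two group means has variance $2\sigma^2/n$, and its covariance with the first group mean is $\sigma^2/n$. This is a sketch assuming NumPy, $\sigma^2 = 1$, k = 3 groups of size n = 4, and `np.linalg.pinv` as the choice of $(X'X)^-$.

```python
import numpy as np

k, n, sigma2 = 3, 4, 1.0
X = np.hstack([np.ones((k * n, 1)), np.kron(np.eye(k), np.ones((n, 1)))])
G = np.linalg.pinv(X.T @ X, rcond=1e-10)      # one choice of (X'X)^-

c = np.array([0.0, 1.0, -1.0, 0.0])           # alpha_1 - alpha_2 (estimable)
a = np.array([1.0, 1.0, 0.0, 0.0])            # mu + alpha_1 (estimable)

var_c = sigma2 * c @ G @ c                    # sigma^2 c'(X'X)^- c
cov_ca = sigma2 * c @ G @ a                   # sigma^2 c'(X'X)^- a

# Direct check: c'beta_hat = w_c'y with w_c = X (X'X)^- c, so var = sigma^2 w_c'w_c.
w_c = X @ G @ c
w_a = X @ G @ a
var_c_direct = sigma2 * w_c @ w_c
cov_ca_direct = sigma2 * w_c @ w_a

print(var_c, var_c_direct, 2 * sigma2 / n)
print(cov_ca, cov_ca_direct, sigma2 / n)
```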
