General Linear Model
With correlated error terms: $\Sigma = \sigma^2 V \neq \sigma^2 I$
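Let
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}$$
The model is $y = X\beta + \varepsilon$, where $\varepsilon$ has a $N(\mathbf{0}, \sigma^2 V)$ distribution and $V$ is known.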
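Let $\Gamma$ denote an $n \times n$ matrix such that $\Gamma\Gamma' = V$ ($V$ known).
Let $u = \Gamma^{-1}y = \Gamma^{-1}X\beta + \Gamma^{-1}\varepsilon = W\beta + \eta$,
where $W = \Gamma^{-1}X$ and $\eta = \Gamma^{-1}\varepsilon$,
and $\eta$ has a $N(\mathbf{0}, \sigma^2 I)$ distribution.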
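Thus the maximum likelihood estimate of $\beta$ is
$$\hat\beta = (W'W)^{-1}W'u = \left(X'(\Gamma')^{-1}\Gamma^{-1}X\right)^{-1}X'(\Gamma')^{-1}\Gamma^{-1}y = (X'V^{-1}X)^{-1}X'V^{-1}y,$$
which has a $N\left(\beta, \sigma^2 (W'W)^{-1}\right)$ distribution.
Now $\hat\beta = (W'W)^{-1}W'u = (X'V^{-1}X)^{-1}X'V^{-1}y$ and $W'W = X'V^{-1}X$.
Thus $\hat\beta = (X'V^{-1}X)^{-1}X'V^{-1}y$ has a $N\left(\beta, \sigma^2 (X'V^{-1}X)^{-1}\right)$ distribution.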
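The unbiased estimator of $\sigma^2$ (the MLE rescaled to divide by $n - p$) is:
$$\begin{aligned} s^2 &= \frac{1}{n-p}\, u'\left[I - W(W'W)^{-1}W'\right]u \\ &= \frac{1}{n-p}\, y'(\Gamma')^{-1}\left[I - \Gamma^{-1}X\left(X'(\Gamma')^{-1}\Gamma^{-1}X\right)^{-1}X'(\Gamma')^{-1}\right]\Gamma^{-1}y \\ &= \frac{1}{n-p}\, y'\left[(\Gamma')^{-1}\Gamma^{-1} - (\Gamma')^{-1}\Gamma^{-1}X\left(X'(\Gamma')^{-1}\Gamma^{-1}X\right)^{-1}X'(\Gamma')^{-1}\Gamma^{-1}\right]y \\ &= \frac{1}{n-p}\, y'\left[V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}\right]y \end{aligned}$$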
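Summary
The Model: $y = X\beta + \varepsilon$, where $\varepsilon$ has a $N(\mathbf{0}, \sigma^2 V)$ distribution and $V$ is known.
MLE's:
1. $\hat\beta = (X'V^{-1}X)^{-1}X'V^{-1}y$ has a $N\left(\beta, \sigma^2 (X'V^{-1}X)^{-1}\right)$ distribution.
2. $s^2 = \dfrac{1}{n-p}\, y'\left[V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}\right]y$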
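The two summary formulas can be checked numerically. The following is a minimal sketch, assuming NumPy and taking $\Gamma$ to be the Cholesky factor of $V$ (one valid choice with $\Gamma\Gamma' = V$). The function name `gls_fit` and the synthetic data are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def gls_fit(X, y, V):
    """Estimate beta and sigma^2 in y = X beta + eps, var(eps) = sigma^2 V (V known)."""
    n, p = X.shape
    gamma = np.linalg.cholesky(V)        # lower-triangular Gamma with Gamma Gamma' = V
    u = np.linalg.solve(gamma, y)        # u = Gamma^{-1} y
    W = np.linalg.solve(gamma, X)        # W = Gamma^{-1} X
    beta_hat = np.linalg.lstsq(W, u, rcond=None)[0]   # OLS on the transformed model
    resid = u - W @ beta_hat
    s2 = resid @ resid / (n - p)
    return beta_hat, s2

# Synthetic data (illustrative values only).
rng = np.random.default_rng(0)
n = 30
X = np.column_stack([np.ones(n), rng.uniform(1.0, 5.0, n)])
A = rng.normal(size=(n, n))
V = A @ A.T / n + np.eye(n)              # an arbitrary positive definite V
y = X @ np.array([1.0, 2.0]) + np.linalg.cholesky(V) @ rng.normal(size=n)

beta_hat, s2 = gls_fit(X, y, V)

# The same quantities from the closed-form expressions in the summary.
V_inv = np.linalg.inv(V)
beta_direct = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
s2_direct = (y @ V_inv @ y - y @ V_inv @ X @ beta_direct) / (n - X.shape[1])
```

Working with the transformed model avoids forming $V^{-1}$ explicitly, which tends to be the numerically safer route when $V$ is large.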
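Example
Simple Linear Model where the variance is proportional to $x^2$.
The Model: $y = X\beta + \varepsilon$, where $\varepsilon$ has a $N(\mathbf{0}, \sigma^2 V)$ distribution, with
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad \text{and} \quad V = \begin{pmatrix} x_1^2 & 0 & \cdots & 0 \\ 0 & x_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & x_n^2 \end{pmatrix}$$
Thus
$$V^{-1} = \begin{pmatrix} 1/x_1^2 & 0 & \cdots & 0 \\ 0 & 1/x_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1/x_n^2 \end{pmatrix}$$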
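and
$$X'V^{-1} = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{pmatrix} \begin{pmatrix} 1/x_1^2 & & 0 \\ & \ddots & \\ 0 & & 1/x_n^2 \end{pmatrix} = \begin{pmatrix} 1/x_1^2 & 1/x_2^2 & \cdots & 1/x_n^2 \\ 1/x_1 & 1/x_2 & \cdots & 1/x_n \end{pmatrix}$$
Hence
$$X'V^{-1}X = \begin{pmatrix} \sum_{i=1}^n \dfrac{1}{x_i^2} & \sum_{i=1}^n \dfrac{1}{x_i} \\ \sum_{i=1}^n \dfrac{1}{x_i} & n \end{pmatrix} \quad \text{and} \quad X'V^{-1}y = \begin{pmatrix} \sum_{i=1}^n \dfrac{y_i}{x_i^2} \\ \sum_{i=1}^n \dfrac{y_i}{x_i} \end{pmatrix}$$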
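$$\hat\beta = \begin{pmatrix} \hat\beta_0 \\ \hat\beta_1 \end{pmatrix} = (X'V^{-1}X)^{-1}X'V^{-1}y = \begin{pmatrix} \sum_{i=1}^n \dfrac{1}{x_i^2} & \sum_{i=1}^n \dfrac{1}{x_i} \\ \sum_{i=1}^n \dfrac{1}{x_i} & n \end{pmatrix}^{-1} \begin{pmatrix} \sum_{i=1}^n \dfrac{y_i}{x_i^2} \\ \sum_{i=1}^n \dfrac{y_i}{x_i} \end{pmatrix}$$
so that
$$\hat\beta_0 = \frac{n \sum_{i=1}^n \dfrac{y_i}{x_i^2} - \sum_{i=1}^n \dfrac{1}{x_i} \sum_{i=1}^n \dfrac{y_i}{x_i}}{n \sum_{i=1}^n \dfrac{1}{x_i^2} - \left(\sum_{i=1}^n \dfrac{1}{x_i}\right)^2}$$
$$\hat\beta_1 = \frac{\sum_{i=1}^n \dfrac{1}{x_i^2} \sum_{i=1}^n \dfrac{y_i}{x_i} - \sum_{i=1}^n \dfrac{1}{x_i} \sum_{i=1}^n \dfrac{y_i}{x_i^2}}{n \sum_{i=1}^n \dfrac{1}{x_i^2} - \left(\sum_{i=1}^n \dfrac{1}{x_i}\right)^2}$$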
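As a sanity check on the example above, the closed-form expressions for $\hat\beta_0$ and $\hat\beta_1$ can be compared with the matrix formula $(X'V^{-1}X)^{-1}X'V^{-1}y$ for $V = \mathrm{diag}(x_i^2)$. This is a sketch assuming NumPy; the helper name `wls_x2_closed_form` and the simulated data are hypothetical.

```python
import numpy as np

def wls_x2_closed_form(x, y):
    """Closed-form estimates when var(eps_i) is proportional to x_i^2."""
    n = len(x)
    s1, s2 = np.sum(1 / x), np.sum(1 / x**2)      # sums of 1/x_i and 1/x_i^2
    t1, t2 = np.sum(y / x), np.sum(y / x**2)      # sums of y_i/x_i and y_i/x_i^2
    det = n * s2 - s1**2
    beta0 = (n * t2 - s1 * t1) / det
    beta1 = (s2 * t1 - s1 * t2) / det
    return np.array([beta0, beta1])

# Simulated data with error standard deviation proportional to x (illustrative only).
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 10.0, size=25)
y = 3.0 + 0.5 * x + x * rng.normal(size=25)

X = np.column_stack([np.ones_like(x), x])
V_inv = np.diag(1 / x**2)
beta_matrix = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
beta_closed = wls_x2_closed_form(x, y)
```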
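Testing and Confidence Intervals
The Model:
(1) $y = X\beta + \varepsilon$, where $\varepsilon \sim N(\mathbf{0}, \sigma^2 V)$
can be converted to the model
(2) $u = W\beta + \eta$, where $\eta \sim N(\mathbf{0}, \sigma^2 I)$
where $V = \Gamma\Gamma'$, $u = \Gamma^{-1}y$, $W = \Gamma^{-1}X$ and $\eta = \Gamma^{-1}\varepsilon$.
With model (2), the test statistic for testing $H_0: H\beta = h$ (with $H$ a $q \times p$ matrix) is
$$F = \frac{\dfrac{1}{q}\,(H\hat\beta - h)'\left[H(W'W)^{-1}H'\right]^{-1}(H\hat\beta - h)}{\dfrac{1}{n-p}\,(u - W\hat\beta)'(u - W\hat\beta)}$$
Now
$$W'W = (\Gamma^{-1}X)'\Gamma^{-1}X = X'(\Gamma')^{-1}\Gamma^{-1}X = X'V^{-1}X$$
and
$$(u - W\hat\beta)'(u - W\hat\beta) = y'\left[V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}\right]y$$
Thus
$$F = \frac{\dfrac{1}{q}\,(H\hat\beta - h)'\left[H(X'V^{-1}X)^{-1}H'\right]^{-1}(H\hat\beta - h)}{\dfrac{1}{n-p}\left[y'V^{-1}y - y'V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}y\right]}$$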
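Simultaneous Confidence Intervals (using model (2))
The intervals
$$c'H\hat\beta \pm \sqrt{q\,F_\alpha(q, n-p)}\; s\,\sqrt{c'H(W'W)^{-1}H'c}$$
or, in terms of model (1),
$$c'H\hat\beta \pm \sqrt{q\,F_\alpha(q, n-p)}\; s\,\sqrt{c'H(X'V^{-1}X)^{-1}H'c}$$
form a set of $(1-\alpha)100\%$ simultaneous CI's for the parameters $c'H\beta$, for all $c$.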
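The equivalence of the two forms of the $F$ statistic (model (1) with $V^{-1}$, model (2) with the transformed data) can be illustrated numerically. The sketch below assumes NumPy; the function names, the hypothesis $H_0: \beta_1 = 0$ and the data are illustrative assumptions.

```python
import numpy as np

def f_stat_model1(X, y, V, H, h):
    """F statistic written with V^{-1}, as in model (1)."""
    n, p = X.shape
    q = H.shape[0]
    V_inv = np.linalg.inv(V)
    M = X.T @ V_inv @ X
    beta_hat = np.linalg.solve(M, X.T @ V_inv @ y)
    d = H @ beta_hat - h
    num = d @ np.linalg.solve(H @ np.linalg.solve(M, H.T), d) / q
    den = (y @ V_inv @ y - y @ V_inv @ X @ beta_hat) / (n - p)
    return num / den

def f_stat_model2(X, y, V, H, h):
    """F statistic computed from u = Gamma^{-1} y and W = Gamma^{-1} X, as in model (2)."""
    n, p = X.shape
    q = H.shape[0]
    gamma = np.linalg.cholesky(V)
    u, W = np.linalg.solve(gamma, y), np.linalg.solve(gamma, X)
    M = W.T @ W
    beta_hat = np.linalg.solve(M, W.T @ u)
    d = H @ beta_hat - h
    num = d @ np.linalg.solve(H @ np.linalg.solve(M, H.T), d) / q
    r = u - W @ beta_hat
    return num / (r @ r / (n - p))

# Illustrative data: variance proportional to x^2, testing H0: beta_1 = 0.
rng = np.random.default_rng(5)
x = rng.uniform(1.0, 6.0, size=20)
X = np.column_stack([np.ones_like(x), x])
V = np.diag(x**2)
y = 1.0 + 0.8 * x + x * rng.normal(size=20)
H, h = np.array([[0.0, 1.0]]), np.array([0.0])

F1 = f_stat_model1(X, y, V, H, h)
F2 = f_stat_model2(X, y, V, H, h)
```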
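Example: Simple Linear Model with no intercept
$E[Y \mid X] = \beta X$ and $\mathrm{Var}[Y \mid X] = \sigma^2 f(X)$, i.e. $\mathrm{Var}[Y \mid X] \propto f(X)$,
where $f(X)$ is known. Possible choices: for example $f(x) = x$, $f(x) = x^2$ or $f(x) = \ln x$.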
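The model
$$y = x\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \sigma^2 V$$
where
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \quad \text{and} \quad V = \begin{pmatrix} f(x_1) & 0 & \cdots & 0 \\ 0 & f(x_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & f(x_n) \end{pmatrix}$$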
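Thus $\hat\beta = (x'V^{-1}x)^{-1}x'V^{-1}y$, where
$$V^{-1} = \begin{pmatrix} 1/f(x_1) & 0 & \cdots & 0 \\ 0 & 1/f(x_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1/f(x_n) \end{pmatrix}$$
$$x'V^{-1}x = \sum_{i=1}^n \frac{x_i^2}{f(x_i)}, \qquad x'V^{-1}y = \sum_{i=1}^n \frac{x_i y_i}{f(x_i)}$$
and
$$\hat\beta = \frac{\sum_{i=1}^n x_i y_i / f(x_i)}{\sum_{i=1}^n x_i^2 / f(x_i)}$$
Also
$$\mathrm{var}(\hat\beta) = \sigma^2 (x'V^{-1}x)^{-1} = \frac{\sigma^2}{\sum_{i=1}^n x_i^2 / f(x_i)}$$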
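Special cases
1. $\mathrm{var}(\varepsilon_i) = \sigma^2$, i.e. $f(x_i) = 1$:
$$\hat\beta = \frac{\sum_{i=1}^n x_i y_i / f(x_i)}{\sum_{i=1}^n x_i^2 / f(x_i)} = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2}, \qquad \mathrm{var}(\hat\beta) = \frac{\sigma^2}{\sum_{i=1}^n x_i^2}$$
2. $\mathrm{var}(\varepsilon_i) = \sigma^2 x_i$, i.e. $f(x_i) = x_i$:
$$\hat\beta = \frac{\sum_{i=1}^n x_i y_i / f(x_i)}{\sum_{i=1}^n x_i^2 / f(x_i)} = \frac{\sum_{i=1}^n y_i}{\sum_{i=1}^n x_i} = \frac{\bar y}{\bar x}, \qquad \mathrm{var}(\hat\beta) = \frac{\sigma^2}{\sum_{i=1}^n x_i} = \frac{\sigma^2}{n \bar x}$$
3. $\mathrm{var}(\varepsilon_i) = \sigma^2 x_i^2$, i.e. $f(x_i) = x_i^2$:
$$\hat\beta = \frac{\sum_{i=1}^n x_i y_i / f(x_i)}{\sum_{i=1}^n x_i^2 / f(x_i)} = \frac{1}{n}\sum_{i=1}^n \frac{y_i}{x_i}, \qquad \mathrm{var}(\hat\beta) = \frac{\sigma^2}{n}$$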
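The three special cases follow from one weighted formula, which a short numerical sketch makes concrete. It assumes NumPy; the function name `no_intercept_wls` and the four data points are made up for illustration.

```python
import numpy as np

def no_intercept_wls(x, y, f):
    """Return (beta_hat, var(beta_hat) / sigma^2) for y_i = beta x_i + eps_i,
    with var(eps_i) = sigma^2 f(x_i)."""
    w = 1.0 / f(x)                       # diagonal of V^{-1}
    denom = np.sum(w * x**2)             # x' V^{-1} x
    beta_hat = np.sum(w * x * y) / denom # x' V^{-1} y / x' V^{-1} x
    return beta_hat, 1.0 / denom

# Illustrative data only.
x = np.array([1.0, 2.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 8.3, 9.7])
n = len(x)

b1, v1 = no_intercept_wls(x, y, lambda t: np.ones_like(t))  # case 1: f(x) = 1
b2, v2 = no_intercept_wls(x, y, lambda t: t)                # case 2: f(x) = x
b3, v3 = no_intercept_wls(x, y, lambda t: t**2)             # case 3: f(x) = x^2
```

In case 2 the estimate reduces to the ratio of means, and in case 3 to the mean of the ratios $y_i / x_i$, matching the formulas above.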
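General Linear Model
Case 2: $\Sigma$ unknown
Let
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}$$
The model is $y = X\beta + \varepsilon$, where $\varepsilon$ has a $N(\mathbf{0}, \Sigma)$ distribution and $\Sigma$ is unknown.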
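Consider $\hat\beta = (X'X)^{-1}X'y$.
Call this the Ordinary Least Squares (OLS) estimator of $\beta$.
Note:
$$E[\hat\beta] = (X'X)^{-1}X'E[y] = (X'X)^{-1}X'X\beta = \beta$$
Thus the Ordinary Least Squares (OLS) estimator of $\beta$ is always unbiased.
$$\mathrm{var}(\hat\beta) = (X'X)^{-1}X'\,\mathrm{var}(y)\,X(X'X)^{-1} = (X'X)^{-1}X'\Sigma X(X'X)^{-1}$$
$$\mathrm{var}(c'\hat\beta) = c'(X'X)^{-1}X'\Sigma X(X'X)^{-1}c$$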
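Let $\hat\beta^* = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y$.
This is the Optimal (UMVU) estimator of $\beta$.
Note:
$$E[\hat\beta^*] = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}E[y] = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}X\beta = \beta$$
This is also an unbiased estimator of $\beta$.
$$\begin{aligned} \mathrm{var}(\hat\beta^*) &= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\,\mathrm{var}(y)\,\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1} \\ &= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\Sigma\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1} \\ &= (X'\Sigma^{-1}X)^{-1} \end{aligned}$$
$$\mathrm{var}(c'\hat\beta^*) = c'(X'\Sigma^{-1}X)^{-1}c$$
The Optimal (UMVU) estimator of $\beta$ requires knowledge of $\Sigma$ in order to calculate it.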
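The two variance formulas can be compared numerically: for any positive definite $\Sigma$, the difference $\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta^*)$ should be positive semi-definite, which is what "optimal" means here. The sketch below assumes NumPy and uses a randomly generated $\Sigma$ purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 12, 3
X = rng.normal(size=(n, p))
A = rng.normal(size=(n, n))
Sigma = A @ A.T + np.eye(n)              # arbitrary positive definite covariance

# OLS covariance: (X'X)^{-1} X' Sigma X (X'X)^{-1}
XtX_inv = np.linalg.inv(X.T @ X)
var_ols = XtX_inv @ X.T @ Sigma @ X @ XtX_inv

# Optimal-estimator covariance: (X' Sigma^{-1} X)^{-1}
var_opt = np.linalg.inv(X.T @ np.linalg.solve(Sigma, X))

# Smallest eigenvalue of the (symmetrised) difference; it should not be negative.
diff = var_ols - var_opt
min_eig = np.linalg.eigvalsh((diff + diff.T) / 2).min()
```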
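Theorem: Equivalence of OLS estimator with UMVU estimator
Let $y = X\beta + \varepsilon$ with $E[\varepsilon] = \mathbf{0}$ and $\mathrm{var}(\varepsilon) = \Sigma$.
If there exists a non-singular matrix $F$ such that $\Sigma X = XF$,
then $\hat\beta = (X'X)^{-1}X'y$ (the OLS estimator of $\beta$) is equivalent to $\hat\beta^* = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y$ (the UMVU estimator of $\beta$), and
$$\mathrm{var}(c'\hat\beta^*) = c'(X'\Sigma^{-1}X)^{-1}c = \mathrm{var}(c'\hat\beta) = c'(X'X)^{-1}X'\Sigma X(X'X)^{-1}c$$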
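Proof
$$\Sigma X = XF \;\Rightarrow\; X = \Sigma^{-1}XF \;\Rightarrow\; XF^{-1} = \Sigma^{-1}X, \quad \text{so that} \quad (F')^{-1}X' = X'\Sigma^{-1}$$
$$\begin{aligned} (X'X)^{-1}X' &= (X'X)^{-1}F'(F')^{-1}X' \\ &= (X'X)^{-1}F'X'\Sigma^{-1} \\ &= \left[(F')^{-1}X'X\right]^{-1}X'\Sigma^{-1} \\ &= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1} \end{aligned}$$
thus $\hat\beta = (X'X)^{-1}X'y = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y = \hat\beta^*$.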
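Application: Consider the general linear model with intercept
$y = X\beta + \varepsilon$, where $\varepsilon$ has a $N(\mathbf{0}, \Sigma)$ distribution,
$$X = [\mathbf{1}, X_1], \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \sigma^2 & \rho\sigma^2 & \cdots & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 & \cdots & \rho\sigma^2 \\ \vdots & & \ddots & \vdots \\ \rho\sigma^2 & \rho\sigma^2 & \cdots & \sigma^2 \end{pmatrix} = \sigma^2\left[(1-\rho)I + \rho\mathbf{1}\mathbf{1}'\right]$$
In this case the error terms are equally correlated. Also in this case the OLS estimators are equivalent to the UMVU estimators.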
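Proof
Let
$$F = \sigma^2 \begin{pmatrix} 1 + (n-1)\rho & \rho\mathbf{1}'X_1 \\ \mathbf{0} & (1-\rho)I \end{pmatrix}$$
which is non-singular when $\rho \neq 1$ and $1 + (n-1)\rho \neq 0$. Then
$$XF = \sigma^2 [\mathbf{1}, X_1] \begin{pmatrix} 1 + (n-1)\rho & \rho\mathbf{1}'X_1 \\ \mathbf{0} & (1-\rho)I \end{pmatrix} = \sigma^2 \left[(1 + (n-1)\rho)\mathbf{1},\; \rho\mathbf{1}\mathbf{1}'X_1 + (1-\rho)X_1\right]$$
and, using $\mathbf{1}'\mathbf{1} = n$,
$$\Sigma X = \sigma^2\left[(1-\rho)I + \rho\mathbf{1}\mathbf{1}'\right][\mathbf{1}, X_1] = \sigma^2\left[(1-\rho)\mathbf{1} + n\rho\mathbf{1},\; (1-\rho)X_1 + \rho\mathbf{1}\mathbf{1}'X_1\right] = XF$$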
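The equally correlated case can be verified numerically: build $\Sigma$ and $F$ as in the proof, check $\Sigma X = XF$, and compare the OLS and UMVU estimates. This is a sketch assuming NumPy; the values of `n`, `rho`, `sigma2` and the random design are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho, sigma2 = 10, 0.4, 2.0
X1 = rng.normal(size=(n, 2))                       # non-intercept columns
X = np.column_stack([np.ones(n), X1])              # X = [1, X1]
Sigma = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
y = rng.normal(size=n)

# F as defined in the proof (a 3 x 3 matrix here).
ones = np.ones((n, 1))
F = sigma2 * np.block([
    [np.array([[1 + (n - 1) * rho]]), rho * ones.T @ X1],
    [np.zeros((2, 1)),                (1 - rho) * np.eye(2)],
])

# OLS estimate and the estimate that uses Sigma^{-1}.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
Sinv_X = np.linalg.solve(Sigma, X)                 # Sigma^{-1} X
beta_umvu = np.linalg.solve(Sinv_X.T @ X, Sinv_X.T @ y)
```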
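Design Matrix, X, not of full rank
The General Linear Model
Let
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}$$
The model is $y = X\beta + \varepsilon$, where $\varepsilon$ has a $N(\mathbf{0}, \sigma^2 I)$ distribution.
Suppose that the rank of $X$ is $r \le p \le n$.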
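If the rank of $X$ is equal to $p$ then the columns of $X$ are linearly independent and there is a unique way of representing $\mu = E[y]$ in the form
$$\mu = X\beta = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$$
where $x_1, \dots, x_p$ are the columns of $X$.
If the rank of $X$ is strictly less than $p$ then there is no unique way of representing $\mu$ in the form $\mu = X\beta$.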
Comment: Usually the situation where the rank of $X$, $r < p$, arises in the following instances.
1. The design of the study (the choice of the values of $X_1, X_2, \dots, X_p$) was not careful enough to ensure that $X$ had full rank.
2. Observations were missing, causing the model to be altered. Elements of $y$ are deleted along with the corresponding rows of $X$, reducing the number of linearly independent rows from $p$ to $r$.
3. The model was defined in such a way that the representation $\mu_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip}$ is not unique in $\beta_1, \beta_2, \dots, \beta_p$.
Two Basic approaches:
1. Impose $p - r$ linear restrictions on the parameters.
• This allows us to reduce the number of parameters to $r$.
• $\mu = X\beta$ will have a unique representation if the $p - r$ restrictions are added.
• This technique is usually used with ANOVA, MANOVA, ANACOVA models.
2. Live with the singularity.
• Restrict our attention to linear combinations of the parameters, $c'\beta$, that have unique estimators.
The two approaches are essentially the same (they lead to the same conclusions).
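Recall: Linear Equations theory
Consider the system of linear equations
$$\underset{m \times n}{A}\;\underset{n \times 1}{x} = \underset{m \times 1}{b}$$
where $A$, $b$ are known and $x$ is unknown.
The equations are consistent if $b \in M(A)$, the linear space spanned by the columns of $A$.
Let $A^-$ denote the g-inverse of $A$, i.e.
1. $AA^-A = A$,
2. $A^-AA^- = A^-$,
3. $(AA^-)' = AA^-$, and
4. $(A^-A)' = A^-A$.
Then the general solution to the system of linear equations $Ax = b$ is
$$x = A^-b + (A^-A - I)z$$
where $z$ is arbitrary.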
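Let $y = X\beta + \varepsilon$, where $\varepsilon$ has a $N(\mathbf{0}, \sigma^2 I)$ distribution.
Maximum Likelihood Estimation leads to the system of linear equations
$$X'X\hat\beta = X'y$$
• $p$ equations in $p$ unknowns
• called the Normal equations
If the rank of $X$ is $p$, then $(X'X)^{-1}$ exists, and the solution is
$$\hat\beta = (X'X)^{-1}X'y$$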
Theorem: The Normal equations
$$X'X\hat\beta = X'y$$
are consistent.
Proof: It can be shown that $M(X'X) = M(X')$. Since $X'y \in M(X')$, then $X'y \in M(X'X)$.
Theorem: The general solution to the Normal equations is
$$\hat\beta = (X'X)^-X'y + \left[(X'X)^-X'X - I\right]z$$
where $z$ is arbitrary.
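Theorem
$$R(\hat\beta) = (y - X\hat\beta)'(y - X\hat\beta)$$
is the same for all solutions of the Normal equations.
Proof: The general solution to the Normal equations is
$$\hat\beta = (X'X)^-X'y + \left[(X'X)^-X'X - I\right]z$$
where $z$ is arbitrary, so
$$y - X\hat\beta = y - X(X'X)^-X'y - X\left[(X'X)^-X'X - I\right]z$$
Since $M(X'X) = M(X')$ there exists a $p \times n$ matrix $L$ such that $X' = X'XL$, or $X = L'X'X$,
hence
$$X\left[(X'X)^-X'X - I\right] = L'X'X\left[(X'X)^-X'X - I\right] = L'\left[X'X(X'X)^-X'X - X'X\right] = 0$$
thus
$$y - X\hat\beta = y - X(X'X)^-X'y = \left[I - X(X'X)^-X'\right]y = Ey, \quad \text{where } E = I - X(X'X)^-X'$$
hence
$$R(\hat\beta) = (y - X\hat\beta)'(y - X\hat\beta) = y'E'Ey = y'Ey$$
since
$$\begin{aligned} E'E &= \left[I - X(X'X)^-X'\right]\left[I - X(X'X)^-X'\right] \\ &= I - X(X'X)^-X' - X(X'X)^-X' + X(X'X)^-X'X(X'X)^-X' \\ &= I - X(X'X)^-X' = E \end{aligned}$$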
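Both theorems can be illustrated on a small rank-deficient design: every choice of $z$ gives a solution of the Normal equations, the solutions differ, and the residual sum of squares $R(\hat\beta)$ does not. The sketch assumes NumPy and uses `np.linalg.pinv` (the Moore-Penrose inverse, which satisfies the four g-inverse conditions); the design and the particular `z` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(8, 2))
X = np.column_stack([B, B.sum(axis=1)])   # third column = col1 + col2, so rank 2 < p = 3
y = rng.normal(size=8)

XtX = X.T @ X
G = np.linalg.pinv(XtX, rcond=1e-10)      # a g-inverse of X'X

def solution(z):
    """General solution (X'X)^- X'y + [(X'X)^- X'X - I] z of the Normal equations."""
    return G @ X.T @ y + (G @ XtX - np.eye(3)) @ z

def rss(beta):
    """Residual sum of squares R(beta) = (y - X beta)'(y - X beta)."""
    r = y - X @ beta
    return r @ r

beta_a = solution(np.zeros(3))
beta_b = solution(np.array([5.0, -2.0, 1.0]))
```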
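Definition: (Estimability) The linear function of the parameter vector, $c'\beta$, is called estimable if there exists a vector $a$ such that
$$E[a'y] = c'\beta$$
Example: The simple linear model
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad \text{for } i = 1, 2, \dots, n$$
Suppose $x_1 = x_2 = \cdots = x_n = x$ (some common value). Let $a' = (a_1, a_2, \dots, a_n)$ denote an arbitrary vector. Then
$$\begin{aligned} E[a'y] &= a_1E[y_1] + a_2E[y_2] + \cdots + a_nE[y_n] \\ &= a_1(\beta_0 + \beta_1 x) + a_2(\beta_0 + \beta_1 x) + \cdots + a_n(\beta_0 + \beta_1 x) \\ &= \left(\sum_{i=1}^n a_i\right)(\beta_0 + \beta_1 x) = c(\beta_0 + \beta_1 x) = c\beta_0 + cx\beta_1 \end{aligned}$$
where $c = \sum_{i=1}^n a_i$.
Thus $c(\beta_0 + \beta_1 x) = c\beta_0 + cx\beta_1$ is the only estimable function of $\beta_0, \beta_1$.
[Figure: the line $Y = \beta_0 + \beta_1 X$, with all observations taken at the single value $x$.]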
Theorem: The following conditions are equivalent
1. $c'\beta$ is estimable.
2. For some solution, $\hat\beta$, of the Normal equations, $c'\hat\beta$ is a linear (in $y$) unbiased estimate of $c'\beta$.
3. $c'\hat\beta$ is unique for all solutions of the Normal equations.
4. $c$ belongs to $M(X'X)$.
5. $c$ belongs to $M(X')$.
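Proof: Assume 1. $c'\beta$ is estimable.
Then there exists a vector $a$ such that $E[a'y] = c'\beta$,
i.e. $E[a'y] = a'E[y] = a'X\beta = c'\beta$ for all $\beta$.
Thus $a'X = c'$ or $c = X'a$, i.e. $c$ belongs to $M(X')$.
Thus 1. implies 5. (as well as 4., since $M(X'X) = M(X')$).
Now assume 4., i.e. there exists $b$ such that $c = X'Xb$. The general solution of the Normal equations is
$$\hat\beta = (X'X)^-X'y + \left[(X'X)^-X'X - I\right]z$$
so
$$\begin{aligned} c'\hat\beta &= c'(X'X)^-X'y + c'\left[(X'X)^-X'X - I\right]z \\ &= b'X'X(X'X)^-X'y + b'X'X\left[(X'X)^-X'X - I\right]z \\ &= b'X'X(X'X)^-X'y + b'\left[X'X(X'X)^-X'X - X'X\right]z \\ &= b'X'X(X'X)^-X'y \end{aligned}$$
which is unique for all choices of $z$. Thus 4. implies 3.
Also $c'\hat\beta = b'X'X(X'X)^-X'y = a'y$, where $a = X(X'X)^-X'Xb$, and
$$E[a'y] = E\left[b'X'X(X'X)^-X'y\right] = b'X'X(X'X)^-X'X\beta = b'X'X\beta = c'\beta$$
Thus 4. implies 2. and 1.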
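Condition 5 gives a practical numerical test for estimability: $c'\beta$ is estimable exactly when $c$ lies in $M(X')$, the row space of $X$. The sketch below assumes NumPy; the helper name `is_estimable` and the tolerance values are hypothetical choices, and the design is the simple linear model with all $x_i$ equal to a common value.

```python
import numpy as np

def is_estimable(X, c, tol=1e-10):
    """Return True if c lies in M(X'), the row space of X (condition 5)."""
    # pinv(X) @ X is the orthogonal projector onto the row space of X.
    proj = np.linalg.pinv(X, rcond=tol) @ X
    return bool(np.allclose(proj @ c, c, atol=1e-8))

# Simple linear model with every x_i equal to the common value x0 (illustrative).
x0 = 3.0
X = np.column_stack([np.ones(5), np.full(5, x0)])

check_line = is_estimable(X, np.array([1.0, x0]))        # beta_0 + beta_1 * x0
check_scaled = is_estimable(X, np.array([2.0, 2 * x0]))  # c * (beta_0 + beta_1 * x0)
check_slope = is_estimable(X, np.array([0.0, 1.0]))      # beta_1 alone
check_intercept = is_estimable(X, np.array([1.0, 0.0]))  # beta_0 alone
```

Only multiples of $\beta_0 + \beta_1 x$ pass the check, in line with the example of a common $x$ value.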
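Example: One-way ANOVA (Analysis of Variance)
Suppose we have $k$ normal populations $N(\mu_i, \sigma^2)$, $i = 1, \dots, k$, with $\mu_i = \mu + \alpha_i$.
Let $y_{i1}, y_{i2}, \dots, y_{in}$ denote a sample of $n$ from the $N(\mu + \alpha_i, \sigma^2)$ distribution.
Let $\varepsilon_{ij} = y_{ij} - (\mu + \alpha_i)$; then $\varepsilon_{i1}, \varepsilon_{i2}, \dots, \varepsilon_{in}$ denotes a sample of $n$ from the $N(0, \sigma^2)$ distribution, and
$$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad i = 1, \dots, k;\; j = 1, \dots, n$$
where $\varepsilon_{11}, \varepsilon_{12}, \dots, \varepsilon_{kn}$ are $kn$ independent observations from the $N(0, \sigma^2)$ distribution.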
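Matrix Notation
Let
$$y_i = \begin{pmatrix} y_{i1} \\ y_{i2} \\ \vdots \\ y_{in} \end{pmatrix},\; i = 1, \dots, k, \qquad \underset{kn \times 1}{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix}$$
$$\varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{in} \end{pmatrix},\; i = 1, \dots, k, \qquad \underset{kn \times 1}{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_k \end{pmatrix}$$
Let
$$\underset{kn \times (k+1)}{X} = \begin{pmatrix} \mathbf{1} & \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{1} & \mathbf{0} & \mathbf{1} & \cdots & \mathbf{0} \\ \vdots & \vdots & & \ddots & \vdots \\ \mathbf{1} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{1} \end{pmatrix}, \quad \underset{(k+1) \times 1}{\beta} = \begin{pmatrix} \mu \\ \alpha_1 \\ \vdots \\ \alpha_k \end{pmatrix}, \quad \text{with rank } X = k$$
where $\mathbf{1}$ and $\mathbf{0}$ are $n \times 1$ vectors of ones and zeros.
Then the model is
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} = \begin{pmatrix} \mathbf{1} & \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{1} & \mathbf{0} & \mathbf{1} & \cdots & \mathbf{0} \\ \vdots & \vdots & & \ddots & \vdots \\ \mathbf{1} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{1} \end{pmatrix} \begin{pmatrix} \mu \\ \alpha_1 \\ \vdots \\ \alpha_k \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_k \end{pmatrix}, \quad \text{i.e. } y = X\beta + \varepsilon$$
Note:
$$X' = \begin{pmatrix} 1 \cdots 1 & 1 \cdots 1 & \cdots & 1 \cdots 1 \\ 1 \cdots 1 & 0 \cdots 0 & \cdots & 0 \cdots 0 \\ 0 \cdots 0 & 1 \cdots 1 & \cdots & 0 \cdots 0 \\ \vdots & & \ddots & \vdots \\ 0 \cdots 0 & 0 \cdots 0 & \cdots & 1 \cdots 1 \end{pmatrix}$$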
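$M(X')$ = the linear space spanned by the $k$ vectors
$$\begin{pmatrix} 1 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \dots, \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$
Thus the estimable parameters are of the form:
$$c'\beta = \left[\gamma_1 (1, 1, 0, \dots, 0) + \cdots + \gamma_k (1, 0, 0, \dots, 1)\right]\beta = \gamma_1(\mu + \alpha_1) + \cdots + \gamma_k(\mu + \alpha_k)$$
The common approach is to add the restriction
$$\sum_{i=1}^k \alpha_i = 0$$
This reduces the number of parameters to $k$, and converts the model to full rank.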
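The rank deficiency of the one-way ANOVA design, and the effect of the sum-to-zero restriction, can be seen on a small case. The sketch assumes NumPy; `k = 3`, `n = 2`, the helper `in_row_space` and the reparametrised matrix `Z` (obtained by substituting $\alpha_k = -\sum_{i<k}\alpha_i$) are illustrative choices.

```python
import numpy as np

k, n = 3, 2
indicators = np.kron(np.eye(k), np.ones((n, 1)))          # kn x k group indicators
X = np.column_stack([np.ones(k * n), indicators])         # kn x (k+1), columns mu, alpha_1..alpha_k
rank_X = np.linalg.matrix_rank(X)

def in_row_space(X, c):
    """c is in M(X') iff appending c as an extra row does not increase the rank."""
    return np.linalg.matrix_rank(np.vstack([X, c])) == np.linalg.matrix_rank(X)

group_mean = in_row_space(X, np.array([1.0, 1.0, 0.0, 0.0]))    # mu + alpha_1
contrast = in_row_space(X, np.array([0.0, 1.0, -1.0, 0.0]))     # alpha_1 - alpha_2
alpha_alone = in_row_space(X, np.array([0.0, 1.0, 0.0, 0.0]))   # alpha_1 by itself
mu_alone = in_row_space(X, np.array([1.0, 0.0, 0.0, 0.0]))      # mu by itself

# Sum-to-zero restriction: replace alpha_k by -(alpha_1 + ... + alpha_{k-1}).
Z = np.column_stack([np.ones(k * n), indicators[:, :k - 1] - indicators[:, [k - 1]]])
rank_Z = np.linalg.matrix_rank(Z)
```

Group means and contrasts between groups are estimable, while $\mu$ and a single $\alpha_i$ are not; after the restriction the design has $k$ columns and rank $k$.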
Properties of estimable functions:
1. If rank $X = p$, all linear functions $c'\beta$ are estimable.
Proof: If rank($X$) = $p$ then $M(X') = E_p$ = $p$-dimensional Euclidean space (which contains all $p$-dimensional vectors), so every $c$ belongs to $M(X')$.
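2. $c'\beta$ is estimable if $c'H = c'$, where $H = (X'X)^-X'X$.
Proof: The general solution of the Normal equations is
$$\hat\beta = (X'X)^-X'y + (H - I)z$$
so
$$c'\hat\beta = c'(X'X)^-X'y + (c'H - c')z = c'(X'X)^-X'y$$
(unique for all solutions of the Normal equations).
Hence $c'\beta$ is estimable.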
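3. If $c'\beta$ and $a'\beta$ are estimable then
$$\mathrm{var}(c'\hat\beta) = \sigma^2 c'(X'X)^-c \quad \text{and} \quad \mathrm{cov}(c'\hat\beta, a'\hat\beta) = \sigma^2 c'(X'X)^-a$$
Proof: Since $c'\beta$ and $a'\beta$ are estimable,
$$c'\hat\beta = c'(X'X)^-X'y \quad \text{and} \quad a'\hat\beta = a'(X'X)^-X'y$$
Then
$$\begin{aligned} \mathrm{var}(c'\hat\beta) &= \mathrm{var}\left(c'(X'X)^-X'y\right) = c'(X'X)^-X'\,\mathrm{var}(y)\,X(X'X)^-c \\ &= c'(X'X)^-X'(\sigma^2 I)X(X'X)^-c = \sigma^2 c'(X'X)^-X'X(X'X)^-c \\ &= \sigma^2 c'(X'X)^-c \end{aligned}$$
Also
$$\begin{aligned} \mathrm{cov}(c'\hat\beta, a'\hat\beta) &= c'(X'X)^-X'\,\mathrm{var}(y)\,X(X'X)^-a \\ &= \sigma^2 c'(X'X)^-X'X(X'X)^-a \\ &= \sigma^2 c'(X'X)^-a \end{aligned}$$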
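Property 3 can be checked on the one-way ANOVA design, where the answers are known from elementary arguments: $\mathrm{var}(\hat\alpha_1 - \hat\alpha_2) = 2\sigma^2/n$, and the covariance between the contrasts $\hat\alpha_1 - \hat\alpha_2$ and $\hat\alpha_1 - \hat\alpha_3$ is $\sigma^2/n$. The sketch assumes NumPy, uses `np.linalg.pinv` as the g-inverse, and takes `k = 3`, `n = 4` purely for illustration.

```python
import numpy as np

k, n = 3, 4
X = np.column_stack([np.ones(k * n), np.kron(np.eye(k), np.ones((n, 1)))])
G = np.linalg.pinv(X.T @ X, rcond=1e-10)   # (X'X)^-

c = np.array([0.0, 1.0, -1.0, 0.0])        # alpha_1 - alpha_2 (estimable)
a = np.array([0.0, 1.0, 0.0, -1.0])        # alpha_1 - alpha_3 (estimable)

var_over_sigma2 = c @ G @ c                # var(c' beta_hat) / sigma^2
cov_over_sigma2 = c @ G @ a                # cov(c' beta_hat, a' beta_hat) / sigma^2
```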