Introduction to Longitudinal Data Analysis
Fotis Siannis, University of Athens, Department of Mathematics
April 27, 2012
Bibliography
• Weiss R. (2005). Modeling Longitudinal Data. Springer.
• Diggle P.J., Heagerty P., Liang K.Y. and Zeger S. (2002). Analysis of Longitudinal Data. Oxford Statistical Science Series.
• Fitzmaurice G.M., Laird N.M. and Ware J.H. (2004). Applied Longitudinal Analysis. Wiley.
• Davis C. (2002). Statistical Methods for the Analysis of Repeated Measures. Springer.
• Crowder M.J. and Hand D.J. (1990). Analysis of Repeated Measures. Chapman & Hall.
Longitudinal Data Analysis 1
Introduction
• We are familiar with the assumptions behind linear regression models, in particular that observations are independent.
• The defining feature of longitudinal studies is that measurements of the same individual are collected repeatedly over time.
• As a result, observations on the same individual are expected to be associated.
• Hence, the assumption of independent observations cannot be justified.
• The availability of repeated measurements on the same subjects at several time points offers more information than cross-sectional studies.
• Longitudinal studies allow the study of change over time (within-subject change).
• The primary goals in studies of this kind are to:
  - characterize the change of the response over time
  - investigate the factors that influence it
• Responses can be either univariate or multivariate (here we focus on univariate responses).
[Figures: scatterplots of Reading Ability (60 to 95) against Age (10 to 18), shown over three slides.]
Example: CD4+ Cell Numbers (macs data; Diggle et al.)
The Human Immunodeficiency Virus (HIV) causes AIDS by attacking and reducing CD4+ cells, and hence reducing a person's ability to fight infection.
• An uninfected individual has around 1100 cells per millilitre of blood.
• CD4+ cells decrease in number with time from infection.
• The CD4+ cell number can be used to monitor disease progression.
We have 2376 values of CD4+ cell number from 369 infected individuals. We plot the CD4+ values against time since seroconversion (time since HIV becomes detectable). [Multicenter AIDS Cohort Study, MACS (Kaslow et al. 1987)]
[Figures: CD4+ cell numbers (0 to 3000) against years since seroconversion (-2 to 4), shown over two slides.]
Example: Treatment of Lead-Exposed Children (TLC) Trial (Fitzmaurice et al.)
• The TLC trial was a placebo-controlled, randomized study of succimer (a chelating agent) in children with blood lead levels of 20-44 micrograms/dL.
• The data consist of four repeated measurements of blood lead levels obtained at baseline (week 0), week 1, week 4, and week 6 on 100 children who were randomly assigned to chelation treatment with succimer or to placebo.

Group      Baseline     Week 1       Week 4       Week 6
Succimer   26.5 (5.0)   13.5 (7.7)   15.5 (7.8)   20.8 (9.2)
Placebo    26.3 (5.0)   24.7 (5.5)   24.1 (5.8)   23.6 (5.6)

Table 1: Mean blood lead levels (sd) from the TLC trial.
[Figures: blood lead level (0 to 50) against time (weeks 0 to 6) for the TLC data, shown over two slides, followed by a plot of mean blood lead level (10 to 30) against time (weeks 0 to 6) with separate curves for the Succimer and Placebo groups.]
Example: Small Mice Data (Weiss)
• Weights in milligrams of new-born male mice.
• All from mothers from a single strain.
• 14 mice measured every 3 days.
• Measurements from day 2 up to day 20.
• Balanced data set.
R console output (one row per mouse):

   Group id weight.2 weight.5 weight.8 weight.11 weight.14 weight.17 weight.20
1      3 22      190      388      621       823      1078      1132      1191
8      3 23      218      393      568       729       839       852      1004
15     3 24      141      260      472       662       760       885       878
22     3 25      211      394      549       700       783       870       925
29     3 26      209      419      645       850      1001      1026      1069
36     3 27      193      362      520       530       641       640       751
43     3 28      201      361      502       530       657       762       888
50     3 29      202      370      498       650       795       858       910
57     3 30      190      350      510       666       819       879       929
64     3 31      219      399      578       699       709       822       953
71     3 32      225      400      545       690       796       825       836
78     3 33      224      381      577       756       869       929       999
85     4 34      187      329      441       525       589       621       796
92     4 35      278      471      606       770       888      1001      1105
R console output (one row per measurement):

   Group id weight day
1      3 22    190   2
2      3 22    388   5
3      3 22    621   8
4      3 22    823  11
5      3 22   1078  14
6      3 22   1132  17
7      3 22   1191  20
8      3 23    218   2
9      3 23    393   5
10     3 23    568   8
11     3 23    729  11
12     3 23    839  14
13     3 23    852  17
14     3 23   1004  20
15     3 24    141   2
16     3 24    260   5
17     3 24    472   8
18     3 24    662  11
19     3 24    760  14
20     3 24    885  17
21     3 24    878  20
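The two console listings show the same mice data in two layouts: one row per mouse, and one row per measurement. The slides do not show how the conversion was done, so the following is only a minimal Python sketch of one way to do it. The function name `wide_to_long` and the dictionary layout are assumptions made for illustration; only the numbers come from the listings.

```python
# Wide layout: one row per mouse, one weight per measurement day.
# Values are the first three mice from the console listing.
DAYS = [2, 5, 8, 11, 14, 17, 20]
wide = [
    {"Group": 3, "id": 22, "weights": [190, 388, 621, 823, 1078, 1132, 1191]},
    {"Group": 3, "id": 23, "weights": [218, 393, 568, 729, 839, 852, 1004]},
    {"Group": 3, "id": 24, "weights": [141, 260, 472, 662, 760, 885, 878]},
]


def wide_to_long(rows, days):
    """Return one record per (mouse, day) pair (hypothetical helper)."""
    long_rows = []
    for row in rows:
        if len(row["weights"]) != len(days):
            raise ValueError("each row needs exactly one weight per day")
        for day, weight in zip(days, row["weights"]):
            long_rows.append(
                {"Group": row["Group"], "id": row["id"], "weight": weight, "day": day}
            )
    return long_rows


long_rows = wide_to_long(wide, DAYS)
for r in long_rows[:3]:
    print(r["Group"], r["id"], r["weight"], r["day"])
```

The long layout is usually the more convenient one for unbalanced data, since each subject can contribute a different number of rows.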
[Figures: mouse weight (200 to 1200) against day (5 to 20) for the small mice data, shown over two slides.]
Distinctive Feature
Longitudinal data are clustered. Clusters are created by the repeated measurements obtained from the same subject/individual at different times/occasions.
• This feature implies that observations of this kind are correlated, and 'common sense' says that they are positively correlated.
• This correlation is usually not of interest in itself.
• However, the correlation needs to be accounted for in the analysis, because it invalidates the 'common' assumption of independent observations.
• Observations from different subjects are NOT correlated.
• Clustered data can arise in many different ways. Family, school, hospital, and household data are clusters that produce correlated data.
Objectives
Longitudinal studies play an important role in the health sciences.
• Investigate heterogeneity among individuals (genetic, social, behavioral).
• Investigate changes in the response over time. This is not possible in cross-sectional studies, where within-subject and between-subject factors that influence the changes over time cannot be distinguished.
• Relate changes to covariates.
• Make predictions about how specific individuals change over time.
Terminology
• In a longitudinal data (LD) study, the units being studied are referred to as subjects or individuals.
• Individuals are measured at different times or occasions.
• The number of repeated observations and their timing can vary between studies and/or individuals.
  - A study where all individuals have the same number of observations, usually at the same occasions, is called balanced.
  - Otherwise the study is unbalanced (the 'norm' for LD studies).
• Missing data are very common, leading to incomplete data.
• Data can be collected prospectively (advisable) or retrospectively (often poor-quality data).
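The definition of a balanced study can be checked mechanically once the measurement times are recorded per subject. The sketch below is only an illustration: it uses the stricter reading of the definition (every subject measured at exactly the same occasions), and the subject ids and time points are hypothetical.

```python
def is_balanced(times_by_subject):
    """True if every subject was measured at the same set of occasions."""
    schedules = {tuple(sorted(times)) for times in times_by_subject.values()}
    return len(schedules) == 1


# Hypothetical measurement times (in weeks), keyed by subject id.
balanced_study = {"s1": [0, 1, 4, 6], "s2": [0, 1, 4, 6], "s3": [0, 1, 4, 6]}
# s2 missed a visit and s3 was measured off schedule.
unbalanced_study = {"s1": [0, 1, 4, 6], "s2": [0, 1, 6], "s3": [0, 2, 4, 7]}

print(is_balanced(balanced_study))
print(is_balanced(unbalanced_study))
```

A missed visit and an off-schedule visit both make the study unbalanced under this check, which matches the two typical causes described for health-related studies.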
Balanced Studies
• A clinical trial measuring the efficacy of an analgesic agent, taking repeated measures of a self-reported pain scale at baseline and at the end of six 15-min intervals.
• Usually arise when the length of time is short or when humans are not the main subject of investigation (e.g. rats).
Unbalanced Studies
• Arthritis patients scheduled to visit the clinic at 6-month intervals may miss a visit, or the timing of a visit is never exactly 6 months (6-12 months).
• Most health-related studies.
Notation
• Let $Y_{ij}$ denote the response of the $i$-th individual ($i = 1, \ldots, N$) at the $j$-th occasion ($j = 1, \ldots, n$). (This notation is sufficient if measurements are equally spaced.)
• Given that we have $n$ repeated measures for each individual, we can group them in an $n \times 1$ vector
$$Y_i = \begin{pmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{in} \end{pmatrix} \quad \text{or} \quad Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in})'.$$
• Interest lies in the mean response and how it changes with covariates (treatment group, age, sex, ...):
$$\mu_j = E(Y_{ij}).$$
If we allow the mean response to differ across individuals, then
$$\mu_{ij} = E(Y_{ij}).$$
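When the mean is common to all individuals, a natural estimate of the mean response at occasion $j$ is the average of the responses over individuals at that occasion. The Python sketch below illustrates this with the first three mice from the small mice listing; the helper name `occasion_means` is an assumption, and three subjects are used only to keep the example short.

```python
# Weights of the first three mice from the small mice listing, days 2 to 20.
DAYS = [2, 5, 8, 11, 14, 17, 20]
Y = [
    [190, 388, 621, 823, 1078, 1132, 1191],  # mouse id 22
    [218, 393, 568, 729, 839, 852, 1004],    # mouse id 23
    [141, 260, 472, 662, 760, 885, 878],     # mouse id 24
]


def occasion_means(responses):
    """Average the responses over individuals, separately for each occasion j."""
    n_subjects = len(responses)
    n_occasions = len(responses[0])
    return [
        sum(row[j] for row in responses) / n_subjects for j in range(n_occasions)
    ]


mu_hat = occasion_means(Y)
for day, m in zip(DAYS, mu_hat):
    print(f"day {day:2d}: mean weight {m:.1f}")
```

This calculation relies on the data being balanced: every row must have a value at every occasion, otherwise the column averages are not defined in this simple form.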
Data Structures
The general layout is
• $N$ subjects
• from which we get $n$ repeated measures
• at times $t_i$
• $Y_{ij}$: response of interest from subject $i$ at occasion $j$
• with covariates $x_{ij} = (x_{ij1}, x_{ij2}, \ldots, x_{ijp})$. Generally the number of covariates may vary across the repeated measurements
• Missing indicator
$$\delta_{ij} = \begin{cases} 1, & \text{if } Y_{ij} \text{ and } x_{ij} \text{ are observed;} \\ 0, & \text{if they are missing.} \end{cases}$$
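The missing indicator can be derived directly from the stored records. The following Python sketch assumes that unobserved values are stored as `None`; the records, field names, and the helper `missing_indicator` are hypothetical and serve only to illustrate the definition of $\delta_{ij}$.

```python
# Hypothetical records: None marks a value that was not observed.
records = [
    {"subject": 1, "occasion": 1, "y": 26.3, "x": (1, 0.0)},
    {"subject": 1, "occasion": 2, "y": None, "x": (1, 1.0)},   # response missing
    {"subject": 2, "occasion": 1, "y": 24.1, "x": (0, None)},  # covariate missing
    {"subject": 2, "occasion": 2, "y": 23.6, "x": (0, 1.0)},
]


def missing_indicator(record):
    """Return 1 if the response and every covariate are observed, else 0."""
    observed = record["y"] is not None and all(v is not None for v in record["x"])
    return 1 if observed else 0


# delta[(i, j)] corresponds to the indicator for subject i at occasion j.
delta = {(r["subject"], r["occasion"]): missing_indicator(r) for r in records}
print(delta)
```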
Layout for the one-sample case
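$$
\begin{array}{c|ccccc}
 & \multicolumn{5}{c}{\text{Occasion}} \\
\text{Subject} & 1 & \cdots & j & \cdots & n \\
\hline
1 & y_{11} & \cdots & y_{1j} & \cdots & y_{1n} \\
\vdots & \vdots & & \vdots & & \vdots \\
i & y_{i1} & \cdots & y_{ij} & \cdots & y_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
N & y_{N1} & \cdots & y_{Nj} & \cdots & y_{Nn}
\end{array}
$$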
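$$
\begin{array}{ccccccc}
\hline
\text{Subject} & \text{Time point} & \text{Missing indicator} & \text{Response} & \multicolumn{3}{c}{\text{Covariates}} \\
\hline
1 & 1 & \delta_{11} & y_{11} & x_{111} & \cdots & x_{11p} \\
 & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & j & \delta_{1j} & y_{1j} & x_{1j1} & \cdots & x_{1jp} \\
 & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & t_1 & \delta_{1t_1} & y_{1t_1} & x_{1t_11} & \cdots & x_{1t_1p} \\
\hline
i & 1 & \delta_{i1} & y_{i1} & x_{i11} & \cdots & x_{i1p} \\
 & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & j & \delta_{ij} & y_{ij} & x_{ij1} & \cdots & x_{ijp} \\
 & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & t_i & \delta_{it_i} & y_{it_i} & x_{it_i1} & \cdots & x_{it_ip} \\
\hline
n & 1 & \delta_{n1} & y_{n1} & x_{n11} & \cdots & x_{n1p} \\
 & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & j & \delta_{nj} & y_{nj} & x_{nj1} & \cdots & x_{njp} \\
 & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & t_n & \delta_{nt_n} & y_{nt_n} & x_{nt_n1} & \cdots & x_{nt_np} \\
\hline
\end{array}
$$
Table 2: General layout for repeated measurements. In this table $n$ denotes the number of subjects and $t_i$ the number of time points for subject $i$.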
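$$
\begin{array}{cc|ccccc}
 & & \multicolumn{5}{c}{\text{Time point}} \\
\text{Group} & \text{Subject} & 1 & \cdots & j & \cdots & t \\
\hline
1 & 1 & y_{111} & \cdots & y_{11j} & \cdots & y_{11t} \\
 & \vdots & \vdots & & \vdots & & \vdots \\
 & i & y_{1i1} & \cdots & y_{1ij} & \cdots & y_{1it} \\
 & \vdots & \vdots & & \vdots & & \vdots \\
 & n_1 & y_{1n_11} & \cdots & y_{1n_1j} & \cdots & y_{1n_1t} \\
\hline
h & 1 & y_{h11} & \cdots & y_{h1j} & \cdots & y_{h1t} \\
 & \vdots & \vdots & & \vdots & & \vdots \\
 & i & y_{hi1} & \cdots & y_{hij} & \cdots & y_{hit} \\
 & \vdots & \vdots & & \vdots & & \vdots \\
 & n_h & y_{hn_h1} & \cdots & y_{hn_hj} & \cdots & y_{hn_ht} \\
\hline
s & 1 & y_{s11} & \cdots & y_{s1j} & \cdots & y_{s1t} \\
 & \vdots & \vdots & & \vdots & & \vdots \\
 & i & y_{si1} & \cdots & y_{sij} & \cdots & y_{sit} \\
 & \vdots & \vdots & & \vdots & & \vdots \\
 & n_s & y_{sn_s1} & \cdots & y_{sn_sj} & \cdots & y_{sn_st}
\end{array}
$$
Table 3: Layout for the special case of multiple samples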
Dependence & Correlation
Consider a simple LD design that is balanced and complete, with $n$ measurements of the response variable at a common set of occasions on $N$ individuals.
• Expectation: $\mu_{ij} = E(Y_{ij})$.
• Variance: $\sigma_j^2 = E\{[Y_{ij} - E(Y_{ij})]^2\} = E\{(Y_{ij} - \mu_{ij})^2\}$.
• Covariance: $\sigma_{jk} = E\{(Y_{ij} - \mu_{ij})(Y_{ik} - \mu_{ik})\}$.
• Correlation: $\rho_{jk} = \dfrac{E\{(Y_{ij} - \mu_{ij})(Y_{ik} - \mu_{ik})\}}{\sigma_j \sigma_k}$.
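• We anticipate observations on the same individual to be positively correlated. Thus
$$
\mathrm{Cov}\begin{pmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{in} \end{pmatrix}
=
\begin{pmatrix}
\mathrm{Var}(Y_{i1}) & \mathrm{Cov}(Y_{i1}, Y_{i2}) & \cdots & \mathrm{Cov}(Y_{i1}, Y_{in}) \\
\mathrm{Cov}(Y_{i2}, Y_{i1}) & \mathrm{Var}(Y_{i2}) & \cdots & \mathrm{Cov}(Y_{i2}, Y_{in}) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(Y_{in}, Y_{i1}) & \mathrm{Cov}(Y_{in}, Y_{i2}) & \cdots & \mathrm{Var}(Y_{in})
\end{pmatrix}
=
\begin{pmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn}
\end{pmatrix}
$$
where:
• $\mathrm{Cov}(Y_{ij}, Y_{ik}) = \sigma_{jk} = \sigma_{kj} = \mathrm{Cov}(Y_{ik}, Y_{ij})$;
• $\sigma_{kk} = \mathrm{Cov}(Y_{ik}, Y_{ik}) = \mathrm{Var}(Y_{ik}) = \sigma_k^2$.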
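Hence, the covariance matrix takes the simple form
$$
\mathrm{Cov}(Y_i) =
\begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2
\end{pmatrix},
$$
and equally we can define the correlation matrix
$$
\mathrm{Corr}(Y_i) =
\begin{pmatrix}
1 & \rho_{12} & \cdots & \rho_{1n} \\
\rho_{21} & 1 & \cdots & \rho_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{n1} & \rho_{n2} & \cdots & 1
\end{pmatrix},
$$
where $\mathrm{Corr}(Y_{ij}, Y_{ik}) = \rho_{jk} = \rho_{kj} = \mathrm{Corr}(Y_{ik}, Y_{ij})$.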
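In practice the covariance matrix is unknown and has to be estimated from the $N$ response vectors. The slides do not state which estimator is used, so the sketch below assumes the usual sample covariance with divisor $N - 1$, and it uses a small hypothetical balanced data set. The function name `sample_covariance` is likewise an assumption.

```python
def sample_covariance(Y):
    """Sample covariance matrix (divisor N - 1) of N response vectors of length n."""
    N, n = len(Y), len(Y[0])
    means = [sum(row[j] for row in Y) / N for j in range(n)]
    return [
        [
            sum((row[j] - means[j]) * (row[k] - means[k]) for row in Y) / (N - 1)
            for k in range(n)
        ]
        for j in range(n)
    ]


# Hypothetical balanced and complete data: 4 individuals, 3 occasions.
Y = [
    [1.0, 2.0, 4.0],
    [2.0, 3.0, 5.0],
    [3.0, 5.0, 6.0],
    [4.0, 6.0, 9.0],
]

S = sample_covariance(Y)
for row in S:
    print([round(v, 3) for v in row])
```

The printed matrix is symmetric, with the occasion variances on the diagonal, mirroring the structure of $\mathrm{Cov}(Y_i)$.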
Example: TLC Trial (cont.)
Objective: Investigate whether treatment with succimer reduces blood lead levels over time, relative to any changes observed in the placebo group.
$$H_0: \mu_j(S) = \mu_j(P), \quad \text{for all } j = 1, \ldots, 4,$$
where $\mu_j(S)$ and $\mu_j(P)$ are the succimer and placebo mean responses at the $j$th occasion.
Alternatively,
$$H_0: \mu_j(S) - \mu_1(S) = \mu_j(P) - \mu_1(P), \quad \text{for all } j = 2, \ldots, 4,$$
which states that the changes in the mean response from baseline are equal in the two treatments.
Note: The second version of the null hypothesis concerns the changes in the means and allows for differences at baseline. Hence it is implied by the first null hypothesis, which makes the second less restrictive.
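The quantities compared in the second null hypothesis can be computed descriptively from the group means in Table 1. The Python sketch below does only that arithmetic; it is not a test of either hypothesis, and the variable names are chosen for illustration.

```python
# Mean blood lead levels from Table 1 (baseline, week 1, week 4, week 6).
means = {
    "Succimer": [26.5, 13.5, 15.5, 20.8],
    "Placebo": [26.3, 24.7, 24.1, 23.6],
}

# Change from baseline at occasions j = 2, 3, 4, rounded to one decimal.
changes = {
    group: [round(m - values[0], 1) for m in values[1:]]
    for group, values in means.items()
}

for group, deltas in changes.items():
    print(group, deltas)
```

Under the second null hypothesis the two rows of changes would agree up to sampling variability, which a formal analysis has to quantify using the within-subject correlation.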
Restrict attention to the placebo group and explore the interdependence of the four measures of blood lead level. First, explore the time plot:
[Figure: mean blood lead level (15 to 35) against time (weeks 0 to 6) for the placebo group.]
Secondly, we can explore the pairwise scatterplots for children in the placebo group:
[Figure: six pairwise scatterplots of blood lead levels (both axes 10 to 50) in the placebo group: Week 1 against Baseline, Week 4 against Baseline, Week 4 against Week 1, Week 6 against Baseline, Week 6 against Week 1, and Week 6 against Week 4.]
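The estimated covariances are
$$
\widehat{\mathrm{Cov}}(Y_i) =
\begin{pmatrix}
25.2 & 22.8 & 24.3 & 21.4 \\
22.8 & 29.8 & 27.0 & 23.4 \\
24.3 & 27.0 & 33.1 & 28.2 \\
21.4 & 23.4 & 28.2 & 31.8
\end{pmatrix},
$$
and correlations
$$
\widehat{\mathrm{Corr}}(Y_i) =
\begin{pmatrix}
1 & 0.83 & 0.84 & 0.76 \\
0.83 & 1 & 0.86 & 0.76 \\
0.84 & 0.86 & 1 & 0.87 \\
0.76 & 0.76 & 0.87 & 1
\end{pmatrix}.
$$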
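The correlation matrix follows from the covariance matrix through $\rho_{jk} = \sigma_{jk} / (\sigma_j \sigma_k)$. As a check on the reported figures, the following Python sketch applies that formula to the estimated covariances and rounds to two decimals; the function name is an assumption made for illustration.

```python
import math

# Estimated covariance matrix for the placebo group, as reported above.
cov = [
    [25.2, 22.8, 24.3, 21.4],
    [22.8, 29.8, 27.0, 23.4],
    [24.3, 27.0, 33.1, 28.2],
    [21.4, 23.4, 28.2, 31.8],
]


def correlation_from_covariance(cov):
    """Apply rho_jk = sigma_jk / (sigma_j * sigma_k) to every entry."""
    sd = [math.sqrt(cov[j][j]) for j in range(len(cov))]
    return [
        [cov[j][k] / (sd[j] * sd[k]) for k in range(len(cov))]
        for j in range(len(cov))
    ]


corr = [[round(v, 2) for v in row] for row in correlation_from_covariance(cov)]
for row in corr:
    print(row)
```

The rounded values reproduce the reported correlation matrix, with all correlations between 0.76 and 0.87.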
What if we ignore the correlation in the analysis?
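A natural estimate of the change in the mean response is
$$\hat{\delta} = \hat{\mu}_2 - \hat{\mu}_1,$$
where $\hat{\mu}_j = \frac{1}{N} \sum_{i=1}^{N} Y_{ij}$. For the treatment group we have $\hat{\mu}_2 - \hat{\mu}_1 = 13.5 - 26.5 = -13.0$.
To obtain an estimate of its standard error we first calculate the variance
$$
\mathrm{Var}(\hat{\delta}) = \mathrm{Var}\left\{ \frac{1}{N} \sum_{i=1}^{N} (Y_{i2} - Y_{i1}) \right\} = \frac{1}{N} (\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}),
$$
which in our case becomes
$$
\widehat{\mathrm{Var}}(\hat{\delta}) = \frac{1}{50} (25.2 + 58.9 - 2(15.5)) = 1.06.
$$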
If we had simply ignored the existing correlation, then
• we would implicitly assume that $\sigma_{12} = 0$, and hence
$$
\widehat{\mathrm{Var}}(\hat{\delta}) = \frac{1}{50} (25.2 + 58.9) = 1.68,
$$
which is approximately 1.6 times larger;
• this leads to confidence intervals that are too wide and to p-values for the test of $H_0: \delta = 0$ that are too large.
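The effect of ignoring the covariance can be reproduced with a few lines of arithmetic. The Python sketch below uses the variance and covariance estimates quoted for the treatment group ($N = 50$); the variable names are chosen for illustration, and the standard errors are simply the square roots of the two variances.

```python
import math

N = 50              # children in the treatment group
var_baseline = 25.2  # estimated variance at baseline
var_week1 = 58.9     # estimated variance at week 1
cov_01 = 15.5        # estimated covariance between baseline and week 1

# Variance of the estimated change, accounting for the covariance.
var_with = (var_baseline + var_week1 - 2 * cov_01) / N
# Variance obtained when the covariance is (wrongly) taken to be zero.
var_without = (var_baseline + var_week1) / N

ratio = var_without / var_with

print(f"with covariance:    Var = {var_with:.2f}, SE = {math.sqrt(var_with):.2f}")
print(f"without covariance: Var = {var_without:.2f}, SE = {math.sqrt(var_without):.2f}")
print(f"ratio of variances: {ratio:.1f}")
```

Because the covariance is positive, the term $-2\sigma_{12}$ reduces the variance of the estimated change, so dropping it overstates the uncertainty.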
In summary
• the correlation between the observations is a good thing: when it is positive, it makes the estimate of within-subject change more precise
• failure to take account of the correlation in the analysis could lead to misleading scientific inferences
Pros
• Investigate the pattern of change
• Subjects serve as their own controls, since the response variable is measured under both control (baseline) and experimental conditions
• Data collected from the same subjects are more reliable
• While we can address the same questions as in a cross-sectional study, in LD analysis we can also separate what are called cohort and age effects
Cons
• Complications in the analysis due to the correlation between observations
• The investigator does not always control the circumstances:
  - unbalanced designs
  - missing data (pattern!)
Start: Plots/Graphical Presentation
Initially we discuss how to analyze data that come from one population/group. We intend to explore
• how observations change over time
• what may influence possible changes over time.
Initially assume that we have a balanced study. This is an important and reasonable assumption for some kinds of analyses, which cannot adjust for some forms of irregularity.
For example, it is very important to have observations at regular time points, so that quantities like the mean response at a specific occasion can be calculated.
Assume we have the data from the TLC study, only from the placebo group.
1. No matter what we are planning to do in the analysis of LD data, the first step is always the creation of a scatterplot. For the balanced TLC data we have
[Figure: scatterplot of blood lead level (0 to 50) against time (weeks 0 to 6) for the TLC data.]
while for the unbalanced macs data we have
[Figure: scatterplot of CD4+ cell numbers (0 to 2500) against years since seroconversion (-2 to 4) for the macs data.]
2. After exploring the scatterplot, we plot the time plot (profile plot). For the TLC data there is little hope of getting something very useful out of it (as is usually the case),
[Figure: profile plot of blood lead level (0 to 50) against time (weeks 0 to 6) for the TLC data.]
The smallmice plot definitely provides some intuition about the nature of the data
[Figure: mouse weight (200 to 1200) against days (5 to 20) for the smallmice data.]
while for the unbalanced macs data we have a seriously messy situation
[Figure: time plot of CD4+ cell numbers against years since seroconversion for the macs data.]
One way of getting some useful information out of it is simply to plot a small random sample of time plots.
[Figure: time plots of blood lead level against time (weeks) for a small random sample of TLC subjects.]
while for the unbalanced macs data we have
[Figure: time plots of CD4+ cell numbers against years since seroconversion for a small random sample of macs subjects.]
However, there is always the danger that the chosen time plots are not representative of the population. A possible 'fix' to this problem is
• Choose a variable
• Observe the time plots for different levels of this variable (if this is a binary or factor variable)
• Observe the time plots for the different quantiles of this variable (if this is a continuous variable)
• This variable could be one of the explanatory variables
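The selection step above can be sketched in Python. This is only an illustration under stated assumptions: the covariate values, the subject labels and the helper name `sample_by_quartile` are hypothetical, and quartiles are just one possible choice of quantiles.

```python
import random
from statistics import quantiles

def sample_by_quartile(covariate, per_group=1, seed=0):
    """covariate: {subject: value}. Returns {quartile index: sampled subjects}."""
    cuts = quantiles(covariate.values(), n=4)   # three quartile cut points
    groups = {q: [] for q in range(4)}
    for subject, value in covariate.items():
        q = sum(value > c for c in cuts)        # quartile group 0..3
        groups[q].append(subject)
    rng = random.Random(seed)                   # fixed seed: reproducible draw
    return {q: sorted(rng.sample(members, min(per_group, len(members))))
            for q, members in groups.items()}

# Hypothetical ages for eight subjects.
ages = {"s1": 20, "s2": 25, "s3": 30, "s4": 35,
        "s5": 40, "s6": 45, "s7": 50, "s8": 55}
picked = sample_by_quartile(ages, per_group=1)
```

The time plots of the subjects in `picked` would then be drawn in one panel per quartile group.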
Consider the time plots for the macs data by age quantiles
[Figure: time plots for the macs data in four panels, (1) to (4), one per age quantile group.]
3. One way to explore changes in response over time is to create boxplots. This is possible only in balanced studies, where occasions are common for everybody. For the smallmice data:
[Figure: boxplots of weight at days 2, 5, 8, 11, 14, 17 and 20 for the smallmice data.]
Simple Analysis
The apparent complication arising from repeated measurements can be overcome by summarizing the longitudinal data.
1. Perhaps the simplest univariate summary of longitudinal (LD) data is the average of the responses from a single subject
$$\bar{Y}_i = \frac{\sum_{j=1}^{n_i} Y_{ij}}{n_i}.$$
The average $\bar{Y}_i$ is treated as a single response per subject. The analysis is then simplified, and linear regression and ANOVA techniques can easily be used.
Note: This approach is straightforward in balanced studies. A problem exists in unbalanced studies, where not all of the subjects have the same number of observations. So we could
• average all the available observations per subject and continue
• average all the available observations per subject and perform some weighted analysis
• ignore them or do something else???
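The first two options can be sketched as follows. The records are hypothetical, and weighting each subject mean by its number of observations $n_i$ is only one possible choice of weights, assumed here for illustration.

```python
from collections import defaultdict

# Hypothetical long-format records: (subject id, time, response).
records = [
    ("s1", 0, 10.0), ("s1", 1, 12.0), ("s1", 2, 14.0),
    ("s2", 0, 8.0), ("s2", 2, 10.0),      # subject s2 missed time 1
    ("s3", 1, 9.0),
]

def subject_means(rows):
    """Return {subject: (mean response, number of observations n_i)}."""
    grouped = defaultdict(list)
    for subject, _time, y in rows:
        grouped[subject].append(y)
    return {s: (sum(ys) / len(ys), len(ys)) for s, ys in grouped.items()}

means = subject_means(records)

# Unweighted summary: every subject mean counts equally.
unweighted = sum(m for m, _ in means.values()) / len(means)

# Weighted summary (assumed weights n_i): more observations, more weight.
total_n = sum(n for _, n in means.values())
weighted = sum(m * n for m, n in means.values()) / total_n
```

With unbalanced data the two summaries differ, which is exactly why the choice between them has to be made explicitly.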
2. Another way of analyzing longitudinal data is by summarizing each profile by a slope.
• We treat the set of observations for each subject as a 'separate' population
• We regress $Y_{ij}$ against $t_{ij}$, with $n_i$ data points for subject $i$
• Define $Y^*_{ij} = Y_{ij} - \bar{Y}_i$ and $t^*_{ij} = t_{ij} - \bar{t}_i$, where $\bar{t}_i$ is the mean of the observation times for subject $i$. Then the slope can be written
$$\hat{\beta}_i = \frac{\sum_j t^*_{ij} Y^*_{ij}}{\sum_j t^*_{ij} t^*_{ij}}.$$
The slopes, one per subject, are then treated as regular data and analyzed using standard techniques. For example, two-sample t-tests or ANOVA can be used to compare the slopes between two or more groups.
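A minimal sketch of the slope summary, using the centred-sums formula above. The profiles and group labels are hypothetical.

```python
def subject_slope(times, responses):
    """Least-squares slope for one subject: sum(t* y*) / sum(t* t*)."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(responses) / n
    num = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, responses))
    den = sum((t - t_bar) ** 2 for t in times)
    if den == 0:
        raise ValueError("need at least two distinct observation times")
    return num / den

# Hypothetical profiles {subject: (times, responses)} in two groups.
treated = {"a": ([0, 1, 2], [10.0, 8.0, 6.0]),
           "b": ([0, 2, 4], [9.0, 7.0, 5.0])}
control = {"c": ([0, 1, 2], [10.0, 10.5, 11.0]),
           "d": ([0, 1, 3], [8.0, 8.0, 8.0])}

# One slope per subject; these become the 'regular data'.
treated_slopes = [subject_slope(t, y) for t, y in treated.values()]
control_slopes = [subject_slope(t, y) for t, y in control.values()]
```

The two lists of slopes could then be passed to a two-sample t-test, for instance `scipy.stats.ttest_ind` if SciPy is available.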
3. Many LD studies are designed to be analyzed as a paired analysis. Hence, if we have data of the form before treatment and after treatment, then the paired t-test could be the appropriate method of analysis.
Problems with simple analyses
1. Efficiency loss: This occurs when we do NOT use all the data available to us.
• Omit subjects (NEVER do that)
• Omit observations
2. Bias: Can be introduced at many stages and in many different ways.
• by design
• by subjects who may drop out for reasons related to the study
• by the analyst through mis-analysis
3. Over-simplification: When we simplify the data, ignoring their richness.
Smoothing Techniques
In cases where the occasions of measurement are different, it is helpful to produce a "smoothed" plot of the mean response trend over time, as a summary measure.
• Many of these smoothing techniques estimate the mean response at any time by considering not only the observations at this particular occasion, but also the neighboring ones.
• That is, the estimated mean is based on observations taken before, at and after the time of interest.
• The mean, say, at time t is taken to be a weighted average of the observations in close proximity to, or in the neighborhood of, time t.
A: Moving/Running Average
One of the most well-known and simplest approaches is the moving or running average.
• For longitudinal data that are balanced and complete, the moving average at time $t$, say $S_t$, is given by
$$S_t = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=-k}^{k} w_j \, y_{i,t+j}, \qquad t = k+1, \ldots, n-k,$$
where
- $k$ is some positive integer (e.g. $k = 1$ or $k = 2$) and
- $\sum_{j=-k}^{k} w_j = 1$.
We refer to $2k + 1$ as the order of the moving average. This expression assumes that $N$ individuals are measured at the same set of occasions.
• With unbalanced and/or incomplete data, a similar expression can be derived.
• The order of the moving average determines a symmetric neighborhood of values used to estimate $S_t$.
• The higher the order of the moving average, the smoother the resulting estimate of the mean time trend.
• Hence, the lower the order of the moving average, the rougher the estimate.
• The $w_j$ are positive weights that add up to 1, usually equal. In the case where they are not equal, they are chosen to decrease symmetrically about some maximum value. That is, $w_j = w_{-j}$ and $w_0 > w_1 > \ldots > w_k$. As a result, observations closer to time $t$ have greater weight in the calculation of the mean than those further away.
• Based on this definition, the calculation of the moving average is problematic at the beginning and at the end of the time plot. A solution is to amend the summation to range from $j = \max(-k, 1-t)$ to $j = \min(k, n-t)$ and to divide by the corresponding sum of the included weights.
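The moving average with the truncated window at the ends can be sketched as below. The data are hypothetical, and the code uses 0-based occasion indices, so the limits $\max(-k, 1-t)$ and $\min(k, n-t)$ become `max(-k, -t)` and `min(k, n - 1 - t)`.

```python
def moving_average(profiles, weights):
    """Moving average S_t for balanced, complete data.

    profiles: list of N per-subject response lists, each of length n.
    weights:  list of 2k+1 positive weights (w_-k, ..., w_k) summing to 1.
    Near the ends the window is truncated and the included weights
    are renormalised.
    """
    k = (len(weights) - 1) // 2
    n = len(profiles[0])
    N = len(profiles)
    # Mean response at each occasion, averaged over the N subjects.
    occasion_means = [sum(p[t] for p in profiles) / N for t in range(n)]
    smoothed = []
    for t in range(n):                       # 0-based occasion index
        lo, hi = max(-k, -t), min(k, n - 1 - t)
        offsets = range(lo, hi + 1)
        included = [weights[j + k] for j in offsets]
        total = sum(w * occasion_means[t + j]
                    for w, j in zip(included, offsets))
        smoothed.append(total / sum(included))
    return smoothed

# Hypothetical balanced data: N = 2 subjects, n = 5 occasions.
profiles = [[1.0, 2.0, 6.0, 4.0, 5.0],
            [3.0, 4.0, 8.0, 6.0, 7.0]]
equal_weights = [1 / 3, 1 / 3, 1 / 3]        # order 2k + 1 = 3, i.e. k = 1
S = moving_average(profiles, equal_weights)
```

Averaging over subjects first and then over the window gives the same result as the double sum, because the weights do not depend on the subject.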
[Figure: SmallMice data, weight against day, with the mean per day and the moving average (k = 1) overlaid.]
[Figure: CD4+ cell numbers against years since seroconversion, with the bandwidth marked.]
Similarly, but more efficiently, we can use the kernel smoother
$$\hat{\mu}(t) = \frac{\sum_{i=1}^{m} w(t, t_i, h)\, y_i}{\sum_{i=1}^{m} w(t, t_i, h)},$$
where $w(t, t_i, h)$ is a weighting function that changes smoothly over time and gives more weight to observations close to time $t$.
A common weight function is the Gaussian (normal) kernel
$$K(u) = \exp(-0.5u^2).$$
Hence:
$$w(t, t_i, h) = K\{(t - t_i)/h\},$$
where $h$ is the bandwidth of the kernel.
In R, the bandwidth argument is documented as follows: the kernels are scaled so that their quartiles (viewed as probability densities) are at ±0.25*bandwidth.
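A minimal sketch of the Gaussian kernel smoother defined above. The observations are hypothetical. Note that `h` here is the standard deviation of the Gaussian weight, which is not the same scaling as the R bandwidth quoted above.

```python
import math

def gaussian_kernel(u):
    """K(u) = exp(-0.5 * u^2), the unnormalised Gaussian kernel."""
    return math.exp(-0.5 * u * u)

def kernel_smooth(t, times, values, h):
    """Kernel estimate of the mean response at time t with bandwidth h."""
    weights = [gaussian_kernel((t - ti) / h) for ti in times]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

# Hypothetical pooled observations (time, response) from all subjects.
times = [-1.0, 0.0, 1.0, 2.0, 3.0]
values = [900.0, 800.0, 700.0, 650.0, 600.0]

narrow = kernel_smooth(1.0, times, values, h=0.25)   # follows the local data
wide = kernel_smooth(1.0, times, values, h=50.0)     # close to the overall mean
```

A small `h` reproduces the observation nearest to `t`, while a very large `h` gives almost equal weights and so returns nearly the overall mean.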
[Figure: Kernel Smoother (Box). CD4+ cell numbers against years since seroconversion, with smoothed curves for bandwidth = 0.5 (default) and bandwidth = 4.]
[Figure: Kernel Smoother (Gaussian). CD4+ cell numbers against years since seroconversion, with smoothed curves for bandwidth = 0.5 (default) and bandwidth = 3.]
B: Lowess
One popular method is (robust) LOcally WEighted (polynomial) regreSSion, or lowess.
• The lowess estimate at t is understood by imagining there is a 'window' centered at t.
• The lowess estimate of the mean at t is determined by fitting a 'straight' line to the data inside the window and obtaining the predicted value at t from the fitted regression line (using the explanatory variable values for that data point).
• The polynomial is fit using weighted least squares, giving more weight to points near the point whose response is being estimated and less weight to points further away.
• The entire lowess curve is obtained by moving the window of fixed width from left to right and repeating the process at every time.
• The width of the window, called the bandwidth, determines the smoothness: the wider the window, the smoother the curve.
• The choice of bandwidth involves the classical trade-off between bias and precision. Excessive smoothing decreases the variance of the estimate at the risk of introducing bias. Insufficient smoothing is unlikely to introduce bias but will produce a variable estimate.
• Many of the details of this method, such as the degree of the polynomial model and the weights, are flexible.
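The window idea can be sketched as a single pass of local linear regression. This is an illustration under assumptions, not Cleveland's full algorithm: it uses a window of fixed half-width and tricube weights, and it omits the robustness iterations. The data are hypothetical.

```python
def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 inside the window, 0 outside."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def local_linear(t0, times, values, half_width):
    """Weighted least-squares line fitted in a window centred at t0,
    evaluated at t0 (no robustness iterations)."""
    w = [tricube((t - t0) / half_width) for t in times]
    sw = sum(w)
    if sw == 0:
        raise ValueError("no observations inside the window")
    t_bar = sum(wi * t for wi, t in zip(w, times)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, values)) / sw
    sxx = sum(wi * (t - t_bar) ** 2 for wi, t in zip(w, times))
    sxy = sum(wi * (t - t_bar) * (y - y_bar)
              for wi, t, y in zip(w, times, values))
    slope = sxy / sxx if sxx > 0 else 0.0
    return y_bar + slope * (t0 - t_bar)

def smooth_curve(times, values, half_width):
    """Slide the window over every observed time."""
    return [local_linear(t, times, values, half_width) for t in times]

# Hypothetical data lying exactly on the line y = 2 + 3t.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
values = [2.0 + 3.0 * t for t in times]
fitted = smooth_curve(times, values, half_width=2.5)
```

Because the local fit is a straight line, data that are exactly linear are reproduced at every time, including the two ends of the plot.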
References
• Cleveland, W. S. (1979) Robust locally weighted regression and smoothing scatterplots. J. Amer. Statist. Assoc. 74, 829-836.
• Cleveland, W. S. (1981) LOWESS: A program for smoothing scatterplots by robust locally weighted regression. The American Statistician, 35, 54.
[Figure: SmallMice data, weight against day, with the lowess curve and the mean per day overlaid.]
[Figure: lowess curve for the macs (placebo) data, CD4 against time.]
* (Robust Regression) *
Major problems in regression are the absence of
• normality (parametric)
• common variance
• independence of the errors
Other problems are
• overly influential data points
• outliers
• inadequate specification of the functional form of the model
• near-linear dependencies amongst the independent variables (collinearity)
• independent variables being subject to errors
Robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods.
• A simple way of obtaining parameter estimates in a regression model that are less sensitive to outliers than the least squares estimates is to use least absolute deviations. Even then, gross outliers can still have a considerable impact on the model.
• Another approach to robust estimation of regression models is to replace the normal distribution with a heavy-tailed distribution. A t-distribution with between 4 and 6 degrees of freedom has been reported to be a good choice in various practical situations.
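The contrast between least squares and least absolute deviations can be sketched in the simplest case, an intercept-only model, where the least squares fit is the mean and the least absolute deviations fit is the median. The data are hypothetical, and the example does not cover models with covariates, where the caveat about gross outliers still applies.

```python
from statistics import mean, median

# Hypothetical responses; the last value is a gross outlier.
clean = [4.8, 5.1, 5.0, 4.9, 5.2]
contaminated = clean + [50.0]

def sum_abs_dev(center, ys):
    """Least-absolute-deviations criterion for an intercept-only model."""
    return sum(abs(y - center) for y in ys)

ls_fit = mean(contaminated)      # minimises the sum of squared residuals
lad_fit = median(contaminated)   # minimises the sum of absolute residuals
```

The least squares fit is pulled well away from the bulk of the data by the single outlier, while the least absolute deviations fit stays close to it.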
Revision
i) Normal Distribution
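• Univariate $N(\mu, \sigma^2)$. The probability density function is
$$\phi(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}$$
• Multivariate $N_p(\mu, \Sigma)$. Let $x = (x_1, x_2, \ldots, x_p)'$ be a $p$-component random vector having a MVN distribution with mean $\mu = (\mu_1, \mu_2, \ldots, \mu_p)'$ and a $p \times p$ covariance matrix
$$\Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1p} \\ \vdots & \ddots & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pp} \end{pmatrix}.$$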
The pdf has the form
$$f(x_1, x_2, \ldots, x_p) = (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\left\{-0.5\,(x - \mu)' \Sigma^{-1} (x - \mu)\right\}.$$
ii) Maximum Likelihood Estimation
• Independent Observations (simple linear regression)
Suppose the data are collected from a series of cross-sectional studies. We have a sample of $N$ individuals at $n$ occasions, and the data are of the form $(Y_{ij}, X_{ij})$, for the $i$th individual at the $j$th occasion. The model has the form
$$Y_{ij} = X_{ij}'\beta + e_{ij},$$
where $e_{ij} \sim N(0, \sigma^2)$. Hence, with $\mu_{ij} = X_{ij}'\beta$,
$$f(y_{ij}) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(y_{ij} - \mu_{ij})^2}{2\sigma^2}\right\},$$
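and the likelihood function takes the form
$$L = \prod_{i=1}^{N} \prod_{j=1}^{n} f(y_{ij}).$$
The log-likelihood then becomes
$$l = \log \prod_{i=1}^{N} \prod_{j=1}^{n} f(y_{ij}) = -\frac{nN}{2} \log(2\pi\sigma^2) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{n} \frac{(y_{ij} - X_{ij}'\beta)^2}{\sigma^2},$$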
while the MLE of $\beta$ (also the OLS estimate) is
$$\hat{\beta} = \left\{\sum_{i=1}^{N} \sum_{j=1}^{n} X_{ij} X_{ij}'\right\}^{-1} \sum_{i=1}^{N} \sum_{j=1}^{n} X_{ij}\, y_{ij}.$$
Note: In this process we have ignored $\sigma^2$.
• Correlated Observations
In this case we have $n_i$ observations for the $i$th subject. Assume $\Sigma_i$ is known, and hence we do not need to estimate it (later we see how we can estimate it). It is assumed that $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})'$ has a $N_{n_i}(\mu_i, \Sigma_i)$ distribution. Hence, the log-likelihood can be written as
$$l = -\frac{K}{2} \log(2\pi) - \frac{1}{2} \sum_{i=1}^{N} \log|\Sigma_i| - \frac{1}{2} \sum_{i=1}^{N} (y_i - X_i\beta)' \Sigma_i^{-1} (y_i - X_i\beta),$$
where $K = \sum_{i=1}^{N} n_i$ is the total number of observations.
Then the estimator of $\beta$, known as the GLS estimator, can be expressed as
$$\hat{\beta} = \left\{\sum_{i=1}^{N} X_i' \Sigma_i^{-1} X_i\right\}^{-1} \sum_{i=1}^{N} X_i' \Sigma_i^{-1} y_i,$$
and has the properties:
- It is unbiased:
$$E(\hat{\beta}) = \beta.$$
- Asymptotically, it has a MVN distribution with mean $\beta$ and
$$\mathrm{Cov}(\hat{\beta}) = \left\{\sum_{i=1}^{N} X_i' \Sigma_i^{-1} X_i\right\}^{-1}.$$
Note: Similar asymptotic properties hold when we estimate $\Sigma_i$. However, with small sample sizes, the sampling distribution of $\hat{\beta}$ is adversely influenced by the number of covariance parameters that need to be estimated.
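The GLS formulas above can be sketched with NumPy, accumulating the two sums subject by subject. The data, the compound-symmetry covariance and the helper names are hypothetical, and $\Sigma_i$ is taken as known, as in the text.

```python
import numpy as np

def gls(X_list, y_list, Sigma_list):
    """GLS estimate and its covariance from per-subject blocks.

    X_list[i]:     (n_i, p) design matrix of subject i
    y_list[i]:     (n_i,) response vector of subject i
    Sigma_list[i]: (n_i, n_i) known covariance matrix of subject i
    """
    p = X_list[0].shape[1]
    A = np.zeros((p, p))    # accumulates sum_i X_i' Sigma_i^{-1} X_i
    b = np.zeros(p)         # accumulates sum_i X_i' Sigma_i^{-1} y_i
    for X, y, Sigma in zip(X_list, y_list, Sigma_list):
        SinvX = np.linalg.solve(Sigma, X)   # Sigma^{-1} X without inverting
        A += X.T @ SinvX
        b += SinvX.T @ y                    # valid because Sigma is symmetric
    beta_hat = np.linalg.solve(A, b)
    cov_beta = np.linalg.inv(A)
    return beta_hat, cov_beta

def compound_symmetry(n, sigma2=1.0, rho=0.5):
    """Hypothetical covariance: equal variances, equal correlations."""
    return sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))

# Two subjects with unequal numbers of occasions (n_1 = 3, n_2 = 2).
times = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 2.0])]
X_list = [np.column_stack([np.ones_like(t), t]) for t in times]
y_list = [1.0 + 2.0 * t for t in times]     # noiseless line, for checking
Sigma_list = [compound_symmetry(len(t)) for t in times]

beta_hat, cov_beta = gls(X_list, y_list, Sigma_list)
```

With noiseless responses the estimate recovers the intercept and slope used to generate them, whatever covariance is supplied. Setting every $\Sigma_i$ to the identity matrix reduces the estimator to OLS.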
Modelling the Mean: Profile Analysis
• Initially, we introduce no structure on the mean response over time.
• Additionally, we set no structure on the covariance among the repeated measures. This will be dealt with in detail later.
• In order to perform a Profile Analysis, we require balanced data, with the timing of the repeated measures common to all individuals in the study.
• Unbalanced designs due to missing data can be handled.
• This kind of analysis is appealing when there is a single categorical covariate (e.g. treatment group) and when a specific pattern for the differences in the response profiles cannot be specified.
[Figure: mean blood lead level against time (weeks) for the Succimer and Placebo groups.]
Hypotheses:
For simplicity, assume that we have a two-level categorical covariate (two-group design). Any generalization should be straightforward.
Hence, the following questions arise:
• Are the profiles of the groups parallel? In other words, is there a group × time interaction?
• Is there a time effect? (under the assumption that the mean response profiles are parallel)
• Is there a group effect? (under the assumption that the mean response profiles are parallel)
[Figure: three panels of mean response against time (occasions 1 to 4).]
Suppose we have the two-group design, where a new treatment (T) is compared to a standard one (C).

| Group | Occasion 1 | Occasion 2 | ... | Occasion n |
|---|---|---|---|---|
| Treatment | $\mu_1(T)$ | $\mu_2(T)$ | ... | $\mu_n(T)$ |
| Control | $\mu_1(C)$ | $\mu_2(C)$ | ... | $\mu_n(C)$ |
| Difference | $\Delta_1$ | $\Delta_2$ | ... | $\Delta_n$ |

$$\Delta_j = \mu_j(T) - \mu_j(C)$$
The null hypothesis is that there is no treatment × time interaction. This means that the difference in the means between the treatment groups is the same over time. Hence:
$$H_0 : \Delta_1 = \Delta_2 = \ldots = \Delta_n.$$
This provides a test with $(n - 1)$ degrees of freedom.
Longitudinal Data Analysis 79
In a General Linear Model formulation we have
$$E(Y_i \mid X_i) = \mu_i = X_i\beta,$$
where $X_i$ is an appropriate design matrix for the kind of interpretation we want for the $\beta$'s.
Example: If we have $n = 3$ measurements from two groups, then we require $2 \times 3 = 6$ parameters for the means. For group A we have
$$\mu_1(A) = \beta_1, \quad \mu_2(A) = \beta_2, \quad \mu_3(A) = \beta_3,$$
while for group B we have
$$\mu_1(B) = \beta_4, \quad \mu_2(B) = \beta_5, \quad \mu_3(B) = \beta_6.$$
Hence, the design matrix for group A has the form
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{pmatrix},$$
while for group B
$$X_i = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix},$$
where $\beta = (\beta_1, \beta_2, \ldots, \beta_6)'$ is a $6 \times 1$ vector of regression coefficients. Hence:
$$\mu(A) = \begin{pmatrix} \mu_1(A) \\ \mu_2(A) \\ \mu_3(A) \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} \quad \text{and} \quad \mu(B) = \begin{pmatrix} \mu_1(B) \\ \mu_2(B) \\ \mu_3(B) \end{pmatrix} = \begin{pmatrix} \beta_4 \\ \beta_5 \\ \beta_6 \end{pmatrix}.$$
With this parameterization we cannot test $H_0$ by simply setting one of the $\beta$'s equal to zero. The null hypothesis of no treatment × time interaction can be re-expressed as
$$H_0: (\beta_1 - \beta_4) = (\beta_2 - \beta_5) = (\beta_3 - \beta_6),$$
and written in matrix form as $H_0: L\beta = 0$, for
$$L = \begin{pmatrix} 1 & -1 & 0 & -1 & 1 & 0 \\ 1 & 0 & -1 & -1 & 0 & 1 \end{pmatrix}.$$
This expression leads to the following set of equations
$$\begin{cases} \beta_1 - \beta_2 - \beta_4 + \beta_5 = 0 \\ \beta_1 - \beta_3 - \beta_4 + \beta_6 = 0 \end{cases} \;\Rightarrow\; \begin{cases} \beta_1 - \beta_4 = \beta_2 - \beta_5 \\ \beta_1 - \beta_4 = \beta_3 - \beta_6 \end{cases}$$
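The contrast matrix can be checked numerically. The following is a minimal sketch in base R, assuming the cell-means parameterization with two groups and three occasions; the two mean vectors are invented purely for illustration and are not taken from any data set.

```r
# Cell-means parameterization for 2 groups x 3 occasions (6 parameters).
X_A <- cbind(diag(3), matrix(0, 3, 3))   # group A picks beta1..beta3
X_B <- cbind(matrix(0, 3, 3), diag(3))   # group B picks beta4..beta6

# Contrast matrix for "no group x time interaction" (H0: L beta = 0).
L <- rbind(c(1, -1,  0, -1, 1, 0),
           c(1,  0, -1, -1, 0, 1))

# Hypothetical mean vectors, chosen only for illustration.
beta_parallel    <- c(10, 12, 15, 7,  9, 12)  # B = A - 3 at every occasion
beta_nonparallel <- c(10, 12, 15, 7, 11, 12)  # difference changes at occasion 2

mu_A <- drop(X_A %*% beta_parallel)
mu_B <- drop(X_B %*% beta_parallel)

L_parallel    <- drop(L %*% beta_parallel)     # both contrasts equal 0
L_nonparallel <- drop(L %*% beta_nonparallel)  # first contrast is non-zero
```

With parallel profiles both rows of $L\beta$ vanish; changing a single cell mean makes the corresponding row non-zero, which is what the Wald test of $H_0: L\beta = 0$ picks up.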
A slightly different parameterization uses one group as a reference group (this is the default parameterization in many statistical packages). In this approach the design matrices have the form
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 \end{pmatrix}$$
for group A, while for group B
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 \end{pmatrix},$$
where in this case the reference group is the first one (group A).
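To see how the reference-group coefficients relate to the six cell means, one can stack the two design matrices and solve for $\beta$. This is only a sketch in base R; the cell means are made up for illustration.

```r
# Reference-group design matrices, 2 groups x 3 occasions.
# Columns: intercept, occasion 2, occasion 3, group B,
#          B x occasion 2, B x occasion 3.
X_A <- rbind(c(1, 0, 0, 0, 0, 0),
             c(1, 1, 0, 0, 0, 0),
             c(1, 0, 1, 0, 0, 0))
X_B <- rbind(c(1, 0, 0, 1, 0, 0),
             c(1, 1, 0, 1, 1, 0),
             c(1, 0, 1, 1, 0, 1))

# Hypothetical cell means (illustration only).
mu_A <- c(10, 12, 15)
mu_B <- c( 7, 11, 12)

# Solve the stacked 6 x 6 system X beta = mu for the reference-coded beta.
X    <- rbind(X_A, X_B)
beta <- solve(X, c(mu_A, mu_B))
```

Here `beta[5]` and `beta[6]` are the two interaction terms: they measure how much the change from occasion 1 in group B differs from the same change in group A.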
Hence:
$$\mu(A) = \begin{pmatrix} \mu_1(A) \\ \mu_2(A) \\ \mu_3(A) \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_1 + \beta_2 \\ \beta_1 + \beta_3 \end{pmatrix}$$
and
$$\mu(B) = \begin{pmatrix} \mu_1(B) \\ \mu_2(B) \\ \mu_3(B) \end{pmatrix} = \begin{pmatrix} \beta_1 + \beta_4 \\ (\beta_1 + \beta_4) + (\beta_2 + \beta_5) \\ (\beta_1 + \beta_4) + (\beta_3 + \beta_6) \end{pmatrix}.$$
As a result, the null hypothesis of no treatment × time interaction now takes the form
$$H_0: \beta_5 = \beta_6 = 0,$$
which is a simpler and more straightforward way of testing $H_0$.
Additionally, testing for the main effects (group and time) is straightforward. Under the assumption of no interaction effect, the hypothesis of no time effect can be assessed through
$$H_0': \beta_2 = \beta_3 = 0,$$
while the group effect can be assessed through
$$H_0'': \beta_4 = 0.$$
General Case
In a similar way we can write the parameterization for the case where we have $G$ groups to compare over $n$ occasions. In the 'first' parameterization (no reference group) we can introduce $G$ dummy (binary) variables, one indicator for each of the $G$ treatment groups:
$$Z_{ig} = \begin{cases} 1, & \text{if the } i\text{th subject belongs to group } g, \\ 0, & \text{otherwise.} \end{cases}$$
Hence:
$$\begin{pmatrix} \mu_i(1) \\ \mu_i(2) \\ \vdots \\ \mu_i(G-1) \\ \mu_i(G) \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{G-1} \\ \beta_G \end{pmatrix}.$$
If, however, we choose to introduce an intercept, say $\beta_1$, then we need $G - 1$ dummy variables. Hence, if we allow group $G$ to be our reference group, we get
$$\begin{pmatrix} \mu_i(1) \\ \mu_i(2) \\ \vdots \\ \mu_i(G-1) \\ \mu_i(G) \end{pmatrix} = \begin{pmatrix} \beta_1 + \beta_2 \\ \beta_1 + \beta_3 \\ \vdots \\ \beta_1 + \beta_G \\ \beta_1 \end{pmatrix}.$$
How It Is Done
• In Profile Analysis we basically have two covariates, both factors. One, say $Z_1$, represents the treatment group ($G \ge 2$) while the second one, say $Z_2$, is for the $n$ occasions at which we have measurements.
• Assume we have $G = 2$ treatment groups and $n = 3$ occasions. The 'usual' approach (most statistical software) is to introduce an intercept into the model by default. In this case we need $G - 1 = 1$ dummy variable for treatment and $n - 1 = 2$ for occasions.
  - For treatment we have (assuming the standard treatment is the reference)
$$Z_{i1} = \begin{cases} 1, & \text{if the } i\text{th subject is on the new treatment,} \\ 0, & \text{if the } i\text{th subject is on the standard treatment.} \end{cases}$$
  - For the occasions (assuming the first one is the reference) we have
$$Z_{i21} = \begin{cases} 1, & \text{if the observation is at the second occasion,} \\ 0, & \text{otherwise,} \end{cases}$$
and
$$Z_{i22} = \begin{cases} 1, & \text{if the observation is at the third occasion,} \\ 0, & \text{otherwise.} \end{cases}$$
• The model takes the form
  - with no treatment × time interaction
$$\mu_i = \beta_1 + \beta_2 Z_{i1} + \beta_3 Z_{i21} + \beta_4 Z_{i22}$$
  - with interaction
$$\mu_i = \beta_1 + \beta_2 Z_{i1} + \beta_3 Z_{i21} + \beta_4 Z_{i22} + \beta_5 Z_{i1} Z_{i21} + \beta_6 Z_{i1} Z_{i22}$$
• For example (model with interaction):
  - If the $i$th subject is on the new treatment and observed at the second occasion, we have
$$\mu_i = \beta_1 + \beta_2 + \beta_3 + \beta_5.$$
  - If the $i$th subject is on the standard treatment and observed at the third occasion, we have
$$\mu_i = \beta_1 + \beta_4.$$
  - If the $i$th subject is on the standard treatment and observed at the first occasion, we have
$$\mu_i = \beta_1.$$
• The design matrices based on our model are
  - for patients on the new treatment
$$X_i = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 \end{pmatrix};$$
  - for patients on the standard treatment
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 \end{pmatrix}.$$
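These design matrices are what R builds by default from two factors and their interaction. The sketch below uses base R's `model.matrix`; the factor names `treatment` and `occasion` and the level labels are assumptions made for this illustration.

```r
# One row per (treatment, occasion) cell; the first factor level is the reference.
cells <- data.frame(
  treatment = factor(rep(c("standard", "new"), each = 3),
                     levels = c("standard", "new")),
  occasion  = factor(rep(1:3, times = 2))
)

# Columns: intercept, treatment dummy, two occasion dummies, two products,
# in the same order as beta1..beta6 in the model with interaction.
X <- model.matrix(~ treatment * occasion, data = cells)

X_standard <- unname(X[cells$treatment == "standard", ])
X_new      <- unname(X[cells$treatment == "new", ])
```

Dropping the interaction (`~ treatment + occasion`) removes the last two columns and gives the model without treatment × time interaction.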
Missing Data
We have mentioned that this type of analysis requires balanced structures. However, missing data can easily be dealt with by constructing the appropriate design matrix.
For example, if a subject attends two of the arranged occasions and misses the third one, then we can simply remove the corresponding row from the design matrix.
Hence, if a patient from group A (in the first example, with no reference group) misses the third visit, then the design matrix becomes
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
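In code, dropping the row amounts to subsetting the design matrix and the response vector with the same attendance indicator. A small sketch in base R, with a hypothetical attendance record and made-up responses:

```r
# Cell-means design matrix for a group A subject with all three visits.
X_full <- cbind(diag(3), matrix(0, 3, 3))

# Hypothetical attendance record: TRUE = visit observed, FALSE = missed.
observed <- c(TRUE, TRUE, FALSE)

# Keep only the rows (and responses) for the visits that took place.
X_obs <- X_full[observed, , drop = FALSE]
y     <- c(26.1, 19.4, NA)   # made-up responses, NA marks the missed visit
y_obs <- y[observed]
```

The argument `drop = FALSE` keeps `X_obs` a matrix even if only one visit was attended.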
Tools & Concepts
Now we consider how to make inferences about $\beta$, specifically confidence intervals and hypothesis tests.
A. Statistical Inference:
To estimate $\beta$ we use ML, which gives $\hat\beta$ with estimated covariance matrix
$$\widehat{\mathrm{Cov}}(\hat\beta) = \left\{ \sum_{i=1}^{N} X_i' \hat\Sigma_i^{-1} X_i \right\}^{-1},$$
where $\hat\Sigma_i$, the ML estimate of $\Sigma_i$, is used.
1. Confidence Intervals: For every single component $\beta_k$ of $\beta$ we have
$$\hat\beta_k \pm 1.96 \sqrt{\widehat{\mathrm{Var}}(\hat\beta_k)}$$
for a 95% confidence interval.
Generally, if $L$ is a row vector of known weights (or a single row of a contrast matrix), then a 95% confidence interval for $L\beta$ is
$$L\hat\beta \pm 1.96 \sqrt{L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'}.$$
2. Wald Test: Whenever a relationship within or between data items can be expressed as a statistical model with parameters to be estimated from a sample, the Wald test can be used to test the true value of a parameter based on the sample estimate. Hence, for testing the hypothesis
$$H_0: \beta_k = 0 \quad \text{against} \quad H_A: \beta_k \neq 0,$$
we calculate the Wald statistic
$$Z = \frac{\hat\beta_k}{\sqrt{\widehat{\mathrm{Var}}(\hat\beta_k)}},$$
which can be compared with $N(0, 1)$.
In general, tests (like confidence intervals) can be constructed for linear combinations of the components of $\beta$. Assume that $L\beta$ represents a set of contrasts of interest. The hypotheses take the form
$$H_0: L\beta = 0 \quad \text{against} \quad H_A: L\beta \neq 0,$$
and the Wald statistic becomes
$$Z' = \frac{L\hat\beta}{\sqrt{L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'}}.$$
Now, if $L$ is a single row vector then $L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'$ is a scalar and hence we compare $Z'$ to the standard normal distribution.
Furthermore, since $Z' \sim N(0, 1)$ under $H_0$, $Z'^2$ has a $\chi^2$ distribution with 1 degree of freedom (df). As a result, an identical test of the above hypothesis uses the statistic
$$W^2 = (L\hat\beta)' \left\{ L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L' \right\}^{-1} (L\hat\beta),$$
and compares $W^2$ to $\chi^2_1$.
This formulation also generalizes to the case where $L$ has more than one row, allowing the simultaneous testing of a multivariate hypothesis. Hence, if $L$ has $r$ rows then a simultaneous test of
$$H_0: L\beta = 0 \quad \text{against} \quad H_A: L\beta \neq 0$$
is given by
$$W^2 = (L\hat\beta)' \left\{ L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L' \right\}^{-1} (L\hat\beta),$$
which follows a $\chi^2$ distribution with $r$ df under $H_0$.
This is often referred to as the multivariate Wald test.
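The Wald statistic is short enough to compute by hand. The sketch below is a base R helper (the name `wald_test` is made up here, not a library function). As input it uses the rounded fixed-effect estimates, standard errors and correlations printed for model `fm1` in the TLC example later in these notes, so the results are approximate.

```r
# Wald test of H0: L beta = 0, given an estimate and its covariance matrix.
wald_test <- function(L, beta_hat, cov_beta) {
  L   <- matrix(L, ncol = length(beta_hat))   # a plain vector becomes one row
  est <- L %*% beta_hat
  V   <- L %*% cov_beta %*% t(L)
  W2  <- drop(t(est) %*% solve(V, est))       # (L b)' {L Cov L'}^{-1} (L b)
  df  <- nrow(L)
  list(W2 = W2, df = df, p_value = pchisq(W2, df = df, lower.tail = FALSE))
}

# Rounded values read from the printed fm1 output (TLC example).
beta_hat <- c(23.6173, -7.3150, -6.6140, -4.2020, 5.5775)
se       <- c(0.8915, 0.6988, 0.6988, 0.6988, 1.1060)
R <- diag(5)                       # correlation matrix of the fixed effects
R[1, 2:4] <- R[2:4, 1] <- -0.392
R[1, 5]   <- R[5, 1]   <- -0.620
R[2:4, 2:4] <- 0.500
diag(R) <- 1
cov_beta <- diag(se) %*% R %*% diag(se)

# Single-row test of the group effect, and a 3-df test of the time effect.
group_test <- wald_test(c(0, 0, 0, 0, 1), beta_hat, cov_beta)
time_test  <- wald_test(rbind(c(0, 1, 0, 0, 0),
                              c(0, 0, 1, 0, 0),
                              c(0, 0, 0, 1, 0)), beta_hat, cov_beta)
```

For the single-row case `sqrt(group_test$W2)` reproduces the printed t value for the group coefficient, which illustrates that $W^2 = Z'^2$.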
3. Likelihood Ratio Test:
• The LRT can be used to compare two models when one model is a special case of (nested in) the other.
• The alternative or full model allows some parameters to vary, whereas the null or reduced model fixes those parameters at known values.
• The LRT statistic is 2 times the difference of the maximized log-likelihoods of the two models. The alternative or full (larger) model always has a log-likelihood at least as large as that of the null or reduced model, $l_{red} \le l_{full}$. Hence, the test statistic
$$G^2 = 2(l_{full} - l_{red})$$
measures how much larger $l_{full}$ is than $l_{red}$. The larger $G^2$ is, the stronger the evidence that the smaller (null) model is inadequate.
• We compare $G^2$ to a $\chi^2$ distribution with df equal to the difference between the numbers of parameters in the two models.
Note 1: Likelihood-based confidence intervals can be constructed with the use of the profile likelihood. More specifically, for a single component $\beta_k$ of $\beta$, the profile log-likelihood $l_p(\beta_k)$ is obtained by maximizing the log-likelihood over the remaining parameters while keeping $\beta_k$ fixed. Then a 95% CI consists of the values of $\beta_k$ that satisfy
$$2\{l_p(\hat\beta_k) - l_p(\beta_k)\} \le \text{critical value},$$
where the critical value is the 95th percentile of the $\chi^2_1$ distribution (3.84).
Note 2: The LRT can be used for covariance parameters. Due to problems with the sampling distribution of variance parameters, the Wald test is not recommended for them. Even with the LRT there are some problems in comparing nested models for covariance parameters.
B. Restricted (residual) Maximum Likelihood (REML) Estimation:
• Introduced by Patterson & Thompson (1971) as a way of estimating variance components in a GLM.
• In ML estimation the log-likelihood function has the form
$$l = -\frac{K}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{N}\log|\Sigma_i| - \frac{1}{2}\sum_{i=1}^{N}(y_i - X_i\beta)'\Sigma_i^{-1}(y_i - X_i\beta),$$
where $K = \sum_{i=1}^{N} n_i$ is the total number of observations.
• It is known that the ML estimate of $\Sigma_i$ is biased in small samples.
• To illustrate, consider the case where observations are independent (from cross-sectional studies) with constant variance $\sigma^2$. Estimates of both $\beta$ and $\sigma^2$ come from the maximization of the log-likelihood function
$$l = -\frac{K}{2}\log(2\pi\sigma^2) - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{n}\frac{(y_{ij} - X_{ij}'\beta)^2}{\sigma^2}.$$
• The MLE of $\sigma^2$ is
$$\hat\sigma^2 = \frac{\sum_{i=1}^{N}\sum_{j=1}^{n}(y_{ij} - X_{ij}'\hat\beta)^2}{K},$$
and we know that $\hat\sigma^2$ is a biased estimate of $\sigma^2$:
$$E(\hat\sigma^2) = \left(\frac{K - p}{K}\right)\sigma^2,$$
where $p$ is the dimension of $\beta$.
• An unbiased estimate of $\sigma^2$ is
$$\hat\sigma^2_{REML} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{n}(y_{ij} - X_{ij}'\hat\beta)^2}{K - p},$$
which is known as the REML estimate.
• In effect, the bias arises because the ML estimate does not take into account the fact that $\beta$ is also being estimated from the same data.
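The two estimators differ only in the divisor, which is easy to verify on a toy example. The sketch below uses base R with a small made-up data set of independent observations ($K = 8$, $p = 2$); the numbers have no meaning beyond the illustration.

```r
# Small made-up data set: K = 8 independent observations, p = 2 coefficients.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)
X <- cbind(1, x)

K <- nrow(X)
p <- ncol(X)

beta_hat <- solve(crossprod(X), crossprod(X, y))   # OLS, which is the MLE of beta here
rss      <- sum((y - X %*% beta_hat)^2)            # residual sum of squares

sigma2_ml   <- rss / K         # ML estimate, biased downwards
sigma2_reml <- rss / (K - p)   # REML (unbiased) estimate
```

The ratio `sigma2_ml / sigma2_reml` equals $(K - p)/K$, and `sigma2_reml` is the residual variance that `lm()` reports.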
As a result, Restricted (residual) Maximum Likelihood estimation was developed to address this particular problem.
• The main idea is to separate out the part of the data that is used for the estimation of the variance parameters.
• Hence, we need to eliminate $\beta$ from the likelihood, so that only $\Sigma_i$ is left to be estimated.
• One way of doing that is to transform the data to a set of linear combinations of observations whose distribution does not depend on $\beta$.
• In the case of a GLM with dependent errors, the REML estimator is defined as the MLE based on a linearly transformed set of data
$$Y^* = AY,$$
such that the distribution of $Y^*$ does not depend on $\beta$.
• For example, the residuals after estimating $\beta$ by OLS can be used to estimate $\Sigma_i$. Hence,
$$A = I - X(X'X)^{-1}X'.$$
• Then $Y^*$ has a singular multivariate Gaussian distribution with mean zero, whatever the value of $\beta$.
• The REML estimator of $\Sigma_i$ is less biased than the ML estimator. When $N$ is much larger than $p$ the difference becomes less important.
• The REML estimator is used for $\Sigma_i$, while $\beta$ is estimated by the GLS estimator
$$\hat\beta = \left\{ \sum_{i=1}^{N} X_i'\hat\Sigma_i^{-1}X_i \right\}^{-1} \sum_{i=1}^{N} X_i'\hat\Sigma_i^{-1}y_i,$$
obtained by plugging in the REML estimate of $\Sigma_i$.
• REML is the default in R (the lmer output in the TLC example reports "fit by REML") and in many statistical packages.
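Two of the steps above can be checked numerically: that $A$ wipes out the mean $X\beta$, and that the GLS estimator is a sum over subjects. The sketch below is base R on invented balanced data. The compound-symmetry matrix `Sigma` is simply assumed for the illustration and stands in for a REML estimate, which is not computed here.

```r
# Hypothetical balanced data: N = 3 subjects, n = 3 occasions, p = 2 coefficients.
time <- c(0, 1, 2)
X_i  <- cbind(1, time)                     # same design matrix for every subject
ys   <- list(c(10.2, 11.9, 14.1),
             c( 8.7, 11.2, 12.8),
             c(11.0, 12.5, 15.3))

# (1) Error contrasts: A = I - X (X'X)^{-1} X' on the stacked data.
X <- do.call(rbind, rep(list(X_i), length(ys)))
y <- unlist(ys)
A <- diag(nrow(X)) - X %*% solve(crossprod(X)) %*% t(X)

max_AX  <- max(abs(A %*% X))   # numerically zero: A removes any mean X beta
trace_A <- sum(diag(A))        # K - p = 9 - 2 = 7, so Y* = A Y is singular

# (2) GLS with a plugged-in covariance matrix (variance 4, covariance 2).
Sigma     <- 4 * (0.5 * diag(3) + 0.5)
Sigma_inv <- solve(Sigma)

XtWX <- matrix(0, 2, 2)
XtWy <- matrix(0, 2, 1)
for (y_i in ys) {
  XtWX <- XtWX + t(X_i) %*% Sigma_inv %*% X_i   # sum of X_i' Sigma^{-1} X_i
  XtWy <- XtWy + t(X_i) %*% Sigma_inv %*% y_i   # sum of X_i' Sigma^{-1} y_i
}
beta_gls <- solve(XtWX, XtWy)
```

Because $AX = 0$, the transformed data $AY$ carry no information about $\beta$. In this particular balanced, compound-symmetry setup the GLS estimate coincides with OLS; that is a property of the example, not of GLS in general.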
Model Selection
Model selection involves the choice of an appropriate model among a set of candidate models.
1. Nested Models: The likelihood ratio (LR) test is used for nested models, that is, when the reduced model is a special case of the full model. In this case the LR test can be seen as a model selection tool, since we can decide whether the additional complexity of the full model is worthwhile or the simpler model describes the data equally well.
2. Generally: Model selection techniques are useful for screening through many different covariance models. The goal is to choose the 'best' model for use in further analysis. The (log-)likelihood is once again the driving force behind any selection tool. More specifically:
• Criterion-based approaches compare log-likelihoods that have been penalized for the number of parameters in the model.
• The penalty increases with the number of parameters. This is because models with many parameters should fit better (higher log-likelihood) than models with fewer parameters.
• The penalty is used to level off this discrepancy.
• The most popular selection criteria are
  - The Akaike Information Criterion (AIC). For a given model $m$ the AIC is defined as
$$\mathrm{AIC}(m) = -2\,\mathrm{loglikelihood}(m) + 2q_m,$$
where $q_m$ is the number of parameters in the model.
  - The Bayes Information Criterion (BIC), defined as
$$\mathrm{BIC}(m) = -2\,\mathrm{loglikelihood}(m) + \log(N)\,q_m,$$
where $N$ is the number of observations (sample size).
• For covariance models the REML log-likelihood is used.
• Model selection proceeds similarly with both criteria.
  - We fit the models of interest to the data and rank them according to either their AIC or their BIC value.
  - The model with the smallest value is selected as best.
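Both criteria are one-line functions of the maximized log-likelihood. The sketch below is base R; the helper names `aic` and `bic` are made up here. The inputs are the rounded log-likelihoods and parameter counts from the `anova()` table of the TLC example that follows, with $N = 400$ observations, so the results agree with the printed AIC and BIC only up to rounding.

```r
# Information criteria from a maximized log-likelihood.
aic <- function(loglik, q)    -2 * loglik + 2 * q
bic <- function(loglik, q, N) -2 * loglik + log(N) * q

# Values read off the anova() table of the TLC example:
# fm2 has 5 parameters, fm1 has 6, and both use N = 400 observations.
ic <- data.frame(model  = c("fm2", "fm1"),
                 q      = c(5, 6),
                 loglik = c(-1296.2, -1284.7),
                 stringsAsFactors = FALSE)
ic$AIC <- aic(ic$loglik, ic$q)
ic$BIC <- bic(ic$loglik, ic$q, N = 400)

# Smallest value wins under either criterion.
best_by_aic <- ic$model[which.min(ic$AIC)]
best_by_bic <- ic$model[which.min(ic$BIC)]
```

Both criteria pick `fm1` here, in line with the likelihood ratio test reported in the example.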
Example: TLC Data
• Reshape the data
    # times = c(0, 1, 4, 6) labels the occasions in weeks, so that factor(time)
    # has the levels 0, 1, 4, 6 seen in the output; without it reshape() uses 1:4
    tlc.long <- reshape(tlc, idvar = "id", varying = c("lead0", "lead1", "lead4", "lead6"), v.names = "lead", times = c(0, 1, 4, 6), direction = "long")
• Model 1:
    library(lme4)  # provides lmer()
    fm1 <- lmer(lead ~ factor(time) + factor(group) + (1 | id), data = tlc.long)
• Model 2:
    fm2 <- lmer(lead ~ factor(time) + (1 | id), data = tlc.long)
    > fm1
    Linear mixed-effects model fit by REML
    Formula: lead ~ factor(time) + factor(group) + (1 | id)
       Data: tlc.long
      AIC  BIC logLik MLdeviance REMLdeviance
     2576 2600  -1282       2569         2564
    Random effects:
     Groups   Name        Variance Std.Dev.
     id       (Intercept) 24.475   4.9472
     Residual             24.417   4.9414
    number of obs: 400, groups: id, 100

    Fixed effects:
                   Estimate Std. Error t value
    (Intercept)     23.6173     0.8915  26.493
    factor(time)1   -7.3150     0.6988 -10.468
    factor(time)4   -6.6140     0.6988  -9.465
    factor(time)6   -4.2020     0.6988  -6.013
    factor(group)P   5.5775     1.1060   5.043

    Correlation of Fixed Effects:
                (Intr) fct()1 fct()4 fct()6
    factor(tm)1 -0.392
    factor(tm)4 -0.392  0.500
    factor(tm)6 -0.392  0.500  0.500
    factr(grp)P -0.620  0.000  0.000  0.000
    > fm2
    Linear mixed-effects model fit by REML
    Formula: lead ~ factor(time) + (1 | id)
       Data: tlc.long
      AIC  BIC logLik MLdeviance REMLdeviance
     2599 2619  -1294       2592         2589
    Random effects:
     Groups   Name        Variance Std.Dev.
     id       (Intercept) 32.022   5.6588
     Residual             24.417   4.9414
    number of obs: 400, groups: id, 100

    Fixed effects:
                  Estimate Std. Error t value
    (Intercept)    26.4060     0.7513   35.15
    factor(time)1  -7.3150     0.6988  -10.47
    factor(time)4  -6.6140     0.6988   -9.46
    factor(time)6  -4.2020     0.6988   -6.01

    Correlation of Fixed Effects:
                (Intr) fct()1 fct()4
    factor(tm)1 -0.465
    factor(tm)4 -0.465  0.500
    factor(tm)6 -0.465  0.500  0.500
    > anova(fm1, fm2)
    Data: tlc.long
    Models:
    fm2: lead ~ factor(time) + (1 | id)
    fm1: lead ~ factor(time) + factor(group) + (1 | id)
        Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
    fm2  5 2602.4 2622.4 -1296.2
    fm1  6 2581.4 2605.3 -1284.7 23.069      1  1.563e-06 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
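The Chisq column of this table is the likelihood ratio statistic $G^2 = 2(l_{full} - l_{red})$. A minimal base R sketch that recomputes it from the printed, rounded log-likelihoods (the helper name `lrt` is made up for this illustration):

```r
# Likelihood ratio test from two maximized log-likelihoods (nested models).
lrt <- function(l_full, l_red, df) {
  G2 <- 2 * (l_full - l_red)
  list(G2 = G2, df = df, p_value = pchisq(G2, df = df, lower.tail = FALSE))
}

# Rounded log-likelihoods from the anova() table; fm1 has one extra parameter.
res <- lrt(l_full = -1284.7, l_red = -1296.2, df = 6 - 5)

# p-value recomputed from the statistic as printed by anova().
p_printed <- pchisq(23.069, df = 1, lower.tail = FALSE)
```

The recomputed statistic is 23.0 rather than 23.069 only because the printed log-likelihoods are rounded to one decimal. The log-likelihoods in the table also differ from the REML values printed for `fm1` and `fm2`; they match the ML deviances shown there (for example $-2 \times -1284.7 \approx 2569$).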
Modelling the mean: Parametric Curves
• As the number of occasions and the number of irregular observations increase, Profile Analysis becomes less and less appealing.
• Furthermore, it is reasonable in many circumstances to expect that the mean response changes smoothly (monotonically) over time, at least for the duration of the study.
• Fitting parsimonious models for the mean response leads to statistical tests with greater power than Profile Analysis (narrower range of alternative hypotheses).
• This, however, is true only if the assumed structure for the mean is 'correct'.
Linear Trends over time
Assume the model
$$E(Y_{ij}) = \beta_1 + \beta_2\,\mathrm{Time}_{ij} + \beta_3\,\mathrm{Group}_i + \beta_4\,\mathrm{Time}_{ij} \times \mathrm{Group}_i,$$
• where
$$\mathrm{Group}_i = \begin{cases} 1, & \text{new treatment,} \\ 0, & \text{otherwise.} \end{cases}$$
• $\mathrm{Time}_{ij}$ has two indices to allow for mistimed observations.
• Hence:
  - for the control group we have: $E(Y_{ij}) = \beta_1 + \beta_2\,\mathrm{Time}_{ij}$.
  - for the experimental treatment group we have: $E(Y_{ij}) = (\beta_1 + \beta_3) + (\beta_2 + \beta_4)\,\mathrm{Time}_{ij}$.
[Figure: mean response against time (0 to 10) for the Control and Treatment groups under the linear trend model.]
Quadratic Trends over time
Assume the model
$$E(Y_{ij}) = \beta_1 + \beta_2\,\mathrm{Time}_{ij} + \beta_3\,\mathrm{Time}_{ij}^2 + \beta_4\,\mathrm{Group}_i + \beta_5\,\mathrm{Time}_{ij} \times \mathrm{Group}_i + \beta_6\,\mathrm{Time}_{ij}^2 \times \mathrm{Group}_i.$$
• Changes in the mean response are no longer constant. The rate of change now depends on time (earlier/later).
• Hence:
  - for the control group we have:
$$E(Y_{ij}) = \beta_1 + \beta_2\,\mathrm{Time}_{ij} + \beta_3\,\mathrm{Time}_{ij}^2.$$
  - for the experimental treatment group we have:
$$E(Y_{ij}) = (\beta_1 + \beta_4) + (\beta_2 + \beta_5)\,\mathrm{Time}_{ij} + (\beta_3 + \beta_6)\,\mathrm{Time}_{ij}^2.$$
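To make the remark about a non-constant rate of change concrete, one can differentiate the two group-specific curves with respect to time. This is only a worked restatement of the quadratic model, treating Time as continuous:

```latex
\frac{\partial E(Y_{ij})}{\partial\,\mathrm{Time}_{ij}} =
\begin{cases}
\beta_2 + 2\beta_3\,\mathrm{Time}_{ij}, & \text{control group,} \\
(\beta_2 + \beta_5) + 2(\beta_3 + \beta_6)\,\mathrm{Time}_{ij}, & \text{treatment group.}
\end{cases}
```

The two groups therefore have the same rate of change at every time point only if $\beta_5 = \beta_6 = 0$.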
[Figure: mean response against time (0 to 10) for the Control and Treatment groups under the quadratic trend model.]
• There is a natural hierarchy in higher order models. In the quadratic model
$$E(Y_{ij}) = \beta_1 + \beta_2\,\mathrm{Time}_{ij} + \beta_3\,\mathrm{Time}_{ij}^2,$$
we first test the quadratic term ($\beta_3 = 0$) before we move on to the linear term ($\beta_2 = 0$).
• It is very important to see how variables enter the model. Centering variables at their mean value offers a simple interpretation of the intercept. Additionally, collinearity problems are reduced. For example, consider $\mathrm{Time}_j$. If $\mathrm{Time}_j \in \{0, 1, 2, \ldots, 10\}$ then the correlation between $\mathrm{Time}_j$ and $\mathrm{Time}_j^2$ is 0.96. However, if we center Time by subtracting its mean value 5, then the correlation goes down to zero.
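The two correlations quoted for the centering example are quick to reproduce in base R:

```r
time <- 0:10

# Correlation between the linear and quadratic terms, before centering.
cor_raw <- cor(time, time^2)

# Centre at the mean (5) and recompute.
time_c      <- time - mean(time)
cor_centred <- cor(time_c, time_c^2)
```

`cor_raw` is about 0.96, while `cor_centred` is zero because the centred times are symmetric around zero.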
Time-Varying Covariates
• So far we have discussed cases where covariates remain unchanged over time.
• A common exception is the case where at the first visit all subjects are in the same state (say untreated) and any intervention is given from the second visit onwards (TLC data).
• Furthermore, in many trials patients tend to switch treatments for various reasons, usually side effects or even personal choice (when the treatment cannot be disclosed). As a result we need to allow for this change over the duration of the study (cross-over trials).
• In the case of a time-varying treatment indicator the treatment variable, say $G_i$, will not be constant over time. The vector
$$G_i = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \\ 1 \end{pmatrix}$$
indicates that patient $i$ was on placebo for the first two occasions and then switched to active treatment. Extensions to more than two treatment groups are possible with the inclusion of the right number of indicator (dummy) variables.
• In exactly the same way we model a continuous time-varying covariate, where at each occasion the current value of the covariate is included in the model.
![Page 119: Introduction to Longitudinal Data Analysis · 2012-04-27 · Analysis of Longitudinal Data. Oxford Statistical Science Series. Fitzmaurice G.M., Laird N.M. and Ware J.H.(2004) Applied](https://reader030.vdocument.in/reader030/viewer/2022041116/5f287ee51048e5323f4cace2/html5/thumbnails/119.jpg)
Other Approaches: Splines
There are cases where longitudinal trends in the mean response cannot be characterized by�rst and second degree polynomials in time. Additionally, there are cases where non-lineartrends cannot be well approximated by polynomials in time of any order. This can happenwhen the mean response can rapidly increase or decrease for some duration and then continuemore slowly. A class of models called Splines are then used to describe these complicatedcurves.
A. Step Function:
• The simplest spline model for the population mean is a sequence of at steps.• In this approach, each step approximates the mean response over a small interval of
time. The result is a step function that approximates the smooth curve of the meanresponse.• The step function parameterization is quite straight forward. Suppose that we have
observations in 9 time points (occasions) from t = 1 to t = 9 and we have three step
Longitudinal Data Analysis 118
![Page 120: Introduction to Longitudinal Data Analysis · 2012-04-27 · Analysis of Longitudinal Data. Oxford Statistical Science Series. Fitzmaurice G.M., Laird N.M. and Ware J.H.(2004) Applied](https://reader030.vdocument.in/reader030/viewer/2022041116/5f287ee51048e5323f4cace2/html5/thumbnails/120.jpg)
functions with steps at 2.5 and 5.5. Then
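$$
\begin{array}{c|ccc|ccc}
\text{time} & & V_1 & & & V_2 & \\
\hline
1 & 1 & 0 & 0 & 1 & 0 & 0 \\
2 & 1 & 0 & 0 & 1 & 0 & 0 \\
3 & 0 & 1 & 0 & 1 & 1 & 0 \\
4 & 0 & 1 & 0 & 1 & 1 & 0 \\
5 & 0 & 1 & 0 & 1 & 1 & 0 \\
6 & 0 & 0 & 1 & 1 & 1 & 1 \\
7 & 0 & 0 & 1 & 1 & 1 & 1 \\
8 & 0 & 0 & 1 & 1 & 1 & 1 \\
9 & 0 & 0 & 1 & 1 & 1 & 1
\end{array}
$$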
• In the $V_1$ parameterization the parameters represent the mean response in the intervals (1, 2.5), (2.5, 5.5) and (5.5, 9). In the $V_2$ parameterization, $\beta_1$ represents the mean response in the first interval (1, 2.5), $\beta_2$ the difference in the mean response between the second and first intervals, and $\beta_3$ the difference between the means in the third and second intervals.
Note A: This parameterization is similar to the one for the unstructured mean; the only difference is that a single parameter is the mean for multiple time points.
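The two step-function codings can be checked numerically. The following is a minimal base R sketch, assuming the nine occasions and the two knots of the example; the interval means in `means` are hypothetical and serve only to show that both codings describe the same fitted step function.

```r
# Nine occasions and two knots, as in the step function example.
time  <- 1:9
knots <- c(2.5, 5.5)

# Step (1, 2 or 3) in which each occasion falls.
interval <- findInterval(time, knots) + 1

# V1: one indicator column per interval (each coefficient is an interval mean).
V1 <- outer(interval, 1:3, FUN = "==") * 1

# V2: a column of ones plus one indicator per knot (coefficients are jumps).
V2 <- cbind(1, outer(time, knots, FUN = ">") * 1)

# Hypothetical interval means, for illustration only.
means  <- c(10, 7, 8)
beta_1 <- means                      # V1 coding: the three interval means
beta_2 <- c(means[1], diff(means))   # V2 coding: first mean, then successive differences

fit_1 <- drop(V1 %*% beta_1)
fit_2 <- drop(V2 %*% beta_2)
```

Both `fit_1` and `fit_2` hold the same nine fitted means, so the choice between the two codings only changes the interpretation of the coefficients.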
B. Bent Line (piece-wise linear):
• Another approach, slightly more complicated, is to assume a continuous function that is linear on intervals of time, with the slope allowed to change from one interval to the next.
• As a result, connected line segments approximate the continuous curve of the mean response.
• The bent line requires two parameters for the first interval and one additional parameter, for the change in slope, for every additional interval.
• Hence, a model with two break points at $t^*_1$ and $t^*_2$ can be written as
$$E(Y_{ij}) = \beta_1 + \beta_2\,\text{Time}_{ij} + \beta_3(\text{Time}_{ij} - t^*_1)_+ + \beta_4(\text{Time}_{ij} - t^*_2)_+,$$
where $(x)_+$ is equal to $x$ when $x > 0$ and zero otherwise.
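• The covariate matrix then takes the form
$$
\begin{array}{c|cccc}
\text{time} & 1 & \text{Time} & (\text{Time} - 2.5)_+ & (\text{Time} - 5.5)_+ \\
\hline
1 & 1 & 1 & 0 & 0 \\
2 & 1 & 2 & 0 & 0 \\
3 & 1 & 3 & 0.5 & 0 \\
4 & 1 & 4 & 1.5 & 0 \\
5 & 1 & 5 & 2.5 & 0 \\
6 & 1 & 6 & 3.5 & 0.5 \\
7 & 1 & 7 & 4.5 & 1.5 \\
8 & 1 & 8 & 5.5 & 2.5 \\
9 & 1 & 9 & 6.5 & 3.5
\end{array}
$$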
Parameter $\beta_1$ is the intercept, $\beta_2$ is the slope up to time 2.5, $\beta_2 + \beta_3$ is the slope between 2.5 and 5.5, and finally $\beta_2 + \beta_3 + \beta_4$ is the slope after 5.5.
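The bent-line covariates and the interpretation of the slopes can be sketched in base R as follows. The helper `pos` and the coefficient values in `beta` are hypothetical and chosen only for illustration; the knots are those of the example.

```r
time  <- 1:9
knots <- c(2.5, 5.5)

# Truncated line (x)+ : x when x > 0, zero otherwise.
pos <- function(x) pmax(x, 0)

X <- cbind(intercept = 1,
           time      = time,
           knot1     = pos(time - knots[1]),
           knot2     = pos(time - knots[2]))

# Hypothetical coefficients, for illustration only.
beta <- c(2, 1.5, -1, -0.4)
mu   <- drop(X %*% beta)

# Slope on each of the three segments: beta2, beta2 + beta3, beta2 + beta3 + beta4.
slopes <- cumsum(beta[2:4])
```

Differences of `mu` between consecutive occasions that lie on the same segment reproduce the entries of `slopes`, which is one way to confirm the interpretation of the coefficients.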
Note B: The break points at t = 2.5 and t = 5.5 are formally called knots. For the step function, the number of parameters required is 1 plus the number of knots. For the bent line model, the number of parameters required is 2 plus the number of knots.
Note C: The bent line model often requires fewer knots than the step function. As a result, in practice, the bent line model usually needs fewer parameters in total than the step function.
C. Higher Order Polynomial Splines:
• Spline models can become even more complicated by using piece-wise quadratic or cubic models.
• Two quantities characterize spline models:
  - the order of the piece-wise polynomial on each interval
  - the number of knots
• If a spline is of kth order, then for each knot there is a covariate that allows the coefficient of the kth-order term $t_{ij}^k$ to change. For example, the step function has a jump at each knot, and the bent line model has a change in slope at each knot. The cubic spline model has an intercept, a slope, a quadratic and a cubic term. At each knot, say $t_{0k}$, the cubic spline model has a covariate of the form $(t_{ij} - t_{0k})^3_+$. This allows the coefficient of time cubed to change at each knot.
Note D: In general, the number of parameters is equal to the degree of the polynomial plus the number of knots plus 1.
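The parameter count in Note D can be illustrated with a small base R sketch. `trunc_power` and `spline_basis` are hypothetical helper names (they are not functions from any package); they build the polynomial columns plus one truncated term per knot, treating degree 0 as the step function.

```r
# Truncated term for one knot: an indicator for degree 0, (t - k)+^degree otherwise.
trunc_power <- function(t, k, degree) {
  if (degree == 0) as.numeric(t > k) else pmax(t - k, 0)^degree
}

# Design matrix: 1, t, ..., t^degree, then one truncated column per knot.
spline_basis <- function(time, degree, knots) {
  poly_part <- outer(time, 0:degree, FUN = "^")
  knot_part <- outer(time, knots, FUN = trunc_power, degree = degree)
  cbind(poly_part, knot_part)
}

time  <- 1:9
knots <- c(2.5, 5.5)

step_X  <- spline_basis(time, degree = 0, knots)   # 0 + 2 + 1 = 3 columns
bent_X  <- spline_basis(time, degree = 1, knots)   # 1 + 2 + 1 = 4 columns
cubic_X <- spline_basis(time, degree = 3, knots)   # 3 + 2 + 1 = 6 columns
```

With degree 0 and degree 1 the function reproduces the $V_2$ step coding and the bent-line matrix, so the cubic case is the same construction with higher powers.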
Modelling the Covariance
• Although the covariance between observations is not of primary interest, accounting for the covariance among repeated measures usually increases the precision with which the regression parameters are estimated.
• Furthermore, when we have missing data, the 'correct' specification of the covariance structure is often a requirement for valid estimates of the regression parameters.
• There are two aspects that require modelling: the mean and the covariance structure. Although they appear to be independent, they are interdependent, because the covariance between any pair of residuals $\{Y_{ij} - \mu_{ij}(\beta)\}$ and $\{Y_{ik} - \mu_{ik}(\beta)\}$ depends on the model for the mean. As a result, a model for the covariance should be chosen on the basis of some model for the mean.
A. Unstructured
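$$
\text{Cov}(Y_i) =
\begin{pmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn}
\end{pmatrix}
$$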
• The above structure is reasonable when the number of occasions is relatively small and all individuals are measured at the same set of occasions.
• Formal requirements:
  - symmetric
  - positive definite
• Advantage: No structure is imposed on the covariance matrix.
• Drawback 1: Many parameters to estimate. We have to estimate $n(n+1)/2$ parameters, a number that grows rapidly with $n$. As a result, the estimation process can be unstable.
• Drawback 2: Problematic when we have mistimed observations.
B. Compound Symmetry
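$$
\text{Cov}(Y_i) = \sigma^2
\begin{pmatrix}
1 & \rho & \cdots & \rho \\
\rho & 1 & \cdots & \rho \\
\vdots & \vdots & \ddots & \vdots \\
\rho & \rho & \cdots & 1
\end{pmatrix}
$$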
• Variance is assumed constant, $\sigma^2$, and $\text{Corr}(Y_{ij}, Y_{ik}) = \rho$.
• Advantage: Only two parameters to estimate.
• Drawback 1: Makes the strong assumption that the correlation between any pair of observations is the same, regardless of the time interval between measurements. This is rather unappealing for most longitudinal data, since correlation is expected to decay with time.
• Drawback 2: The assumption of constant variance can also be unrealistic. We have seen that variance increases with time.
C. Toeplitz
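$$
\text{Cov}(Y_i) = \sigma^2
\begin{pmatrix}
1 & \rho_1 & \rho_2 & \cdots & \rho_{n-1} \\
\rho_1 & 1 & \rho_1 & \cdots & \rho_{n-2} \\
\rho_2 & \rho_1 & 1 & \cdots & \rho_{n-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho_{n-1} & \rho_{n-2} & \rho_{n-3} & \cdots & 1
\end{pmatrix}
$$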
• Assumes that any pair of responses equally separated in time have the same correlation.
• Variance is constant, $\sigma^2$, and $\text{Corr}(Y_{ij}, Y_{i,j+k}) = \rho_k$.
• Appropriate only when measurements are made at (approximately) equal intervals of time.
• There are $n$ parameters to be estimated (the variance and $n - 1$ correlations).
D. Autoregressive
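$$
\text{Cov}(Y_i) = \sigma^2
\begin{pmatrix}
1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\
\rho & 1 & \rho & \cdots & \rho^{n-2} \\
\rho^2 & \rho & 1 & \cdots & \rho^{n-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \cdots & 1
\end{pmatrix}
$$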
• A special case of the Toeplitz covariance structure.
• Variance is constant, $\sigma^2$, and $\text{Corr}(Y_{ij}, Y_{i,j+k}) = \rho^k$.
• Advantage: Only two parameters to estimate.
E. Banded
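$$
\text{Cov}(Y_i) = \sigma^2
\begin{pmatrix}
1 & \rho_1 & 0 & \cdots & 0 \\
\rho_1 & 1 & \rho_1 & \cdots & 0 \\
0 & \rho_1 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix}
$$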
• Makes the assumption that the correlation is zero beyond some point.
• The above is a banded Toeplitz covariance pattern with a band size of 2.
• Variance is constant, $\sigma^2$, and $\text{Corr}(Y_{ij}, Y_{i,j+k}) = 0$ for $k \geq 2$.
• Disadvantage: Makes a very strong assumption about how quickly the correlation decays.
F. Exponential
• When measurement occasions are not equally spaced, we can generalize the autoregressive pattern by assuming
$$\text{Corr}(Y_{ij}, Y_{ik}) = \rho^{|t_{ij} - t_{ik}|},$$
for $\rho > 0$.
• Thus, the correlation decreases exponentially with the time separation between measurements.
• Called exponential because
$$\text{Corr}(Y_{ij}, Y_{ik}) = \rho^{|t_{ij} - t_{ik}|} = \exp\{-\theta |t_{ij} - t_{ik}|\},$$
where $\theta = -\log(\rho)$.
• The structure is invariant under linear transformations of the time scale.
Example: TLC Data
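glsTLC:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j + \beta_3(\text{group}_i \times \text{time}_j)$$
with the covariance matrix having the compound symmetry form
$$
\text{Cov}(Y_i) = \sigma^2
\begin{pmatrix}
1 & \rho & \rho & \rho \\
\rho & 1 & \rho & \rho \\
\rho & \rho & 1 & \rho \\
\rho & \rho & \rho & 1
\end{pmatrix}
$$
> gls(lead ~ factor(group)*factor(time), data=tlc.long, correlation=corCompSymm(form = ~ 1 | id))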
> glsTLC=gls(lead~factor(group)*factor(time),data=tlc.long,correlation=corCompSymm(form=~1|id))
> summary(glsTLC)
Generalized least squares fit by REML
  Model: lead ~ factor(group) * factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2480.621 2520.334 -1230.311

Correlation Structure: Compound symmetry
 Formula: ~1 | id
 Parameter estimate(s):
      Rho
0.5954401

Coefficients:
                               Value Std.Error    t-value p-value
(Intercept)                   26.540 0.9370175  28.323911  0.0000
factor(group)P                -0.268 1.3251428  -0.202242  0.8398
factor(time)1                -13.018 0.8428574 -15.445080  0.0000
factor(time)4                -11.026 0.8428574 -13.081691  0.0000
factor(time)6                 -5.778 0.8428574  -6.855252  0.0000
factor(group)P:factor(time)1  11.406 1.1919804   9.568950  0.0000
factor(group)P:factor(time)4   8.824 1.1919804   7.402807  0.0000
factor(group)P:factor(time)6   3.152 1.1919804   2.644339  0.0085

 Correlation:
                             (Intr) fct()P fct()1 fct()4 fct()6 f()P:()1 f()P:()4
factor(group)P               -0.707
factor(time)1                -0.450  0.318
factor(time)4                -0.450  0.318  0.500
factor(time)6                -0.450  0.318  0.500  0.500
factor(group)P:factor(time)1  0.318 -0.450 -0.707 -0.354 -0.354
factor(group)P:factor(time)4  0.318 -0.450 -0.354 -0.707 -0.354  0.500
factor(group)P:factor(time)6  0.318 -0.450 -0.354 -0.354 -0.707  0.500    0.500

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.5147478 -0.6973588 -0.1498706  0.5542799  6.5106944

Residual standard error: 6.625714
Degrees of freedom: 400 total; 392 residual
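glsTLC2:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j + \beta_3(\text{group}_i \times \text{time}_j)$$
with the covariance matrix having the compound symmetry form, but with a different variance at each occasion
$$
\text{Cov}(Y_i) = \sigma^2
\begin{pmatrix}
s_1^2 & s_1 s_2 \rho & s_1 s_3 \rho & s_1 s_4 \rho \\
s_2 s_1 \rho & s_2^2 & s_2 s_3 \rho & s_2 s_4 \rho \\
s_3 s_1 \rho & s_3 s_2 \rho & s_3^2 & s_3 s_4 \rho \\
s_4 s_1 \rho & s_4 s_2 \rho & s_4 s_3 \rho & s_4^2
\end{pmatrix}
$$
where $s_j$ is the standard deviation at occasion $j$ relative to the first occasion, so that $s_1 = 1$.
> gls(lead ~ factor(group)*factor(time), data=tlc.long, correlation=corCompSymm(form = ~ 1 | id),
      weights=varIdent(form = ~ 1 | time))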
> # Compound Symmetry with different variances at different occasions
> glsTLC2=gls(lead~factor(group)*factor(time),data=tlc.long,correlation=corCompSymm(form=~1|id),weight=varIdent(form=~1|time))
> summary(glsTLC2)
Generalized least squares fit by REML
  Model: lead ~ factor(group) * factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2459.960 2511.587 -1216.980

Correlation Structure: Compound symmetry
 Formula: ~1 | id
 Parameter estimate(s):
      Rho
0.6102797
Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | time
 Parameter estimates:
       0        1        4        6
1.000000 1.279651 1.323192 1.519196

Coefficients:
                               Value Std.Error   t-value p-value
(Intercept)                   26.540 0.7238068  36.66724  0.0000
factor(group)P                -0.268 1.0236174  -0.26182  0.7936
factor(time)1                -13.018 0.7506743 -17.34174  0.0000
factor(time)4                -11.026 0.7713904 -14.29367  0.0000
factor(time)6                 -5.778 0.8726864  -6.62094  0.0000
factor(group)P:factor(time)1  11.406 1.0616138  10.74402  0.0000
factor(group)P:factor(time)4   8.824 1.0909108   8.08865  0.0000
factor(group)P:factor(time)6   3.152 1.2341649   2.55395  0.0110

 Correlation:
                             (Intr) fct()P fct()1 fct()4 fct()6 f()P:()1 f()P:()4
factor(group)P               -0.707
factor(time)1                -0.211  0.149
factor(time)4                -0.181  0.128  0.402
factor(time)6                -0.060  0.043  0.383  0.383
factor(group)P:factor(time)1  0.149 -0.211 -0.707 -0.285 -0.270
factor(group)P:factor(time)4  0.128 -0.181 -0.285 -0.707 -0.271  0.402
factor(group)P:factor(time)6  0.043 -0.060 -0.270 -0.271 -0.707  0.383    0.383

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.1429187 -0.6927684 -0.1528875  0.5263104  5.5480270

Residual standard error: 5.118087
Degrees of freedom: 400 total; 392 residual
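glsTLC3:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j + \beta_3(\text{group}_i \times \text{time}_j)$$
with the covariance matrix having the general symmetric form with a constant variance
$$
\text{Cov}(Y_i) =
\begin{pmatrix}
\sigma^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{21} & \sigma^2 & \sigma_{23} & \sigma_{24} \\
\sigma_{31} & \sigma_{32} & \sigma^2 & \sigma_{34} \\
\sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma^2
\end{pmatrix}
$$
> gls(lead ~ factor(group)*factor(time), data=tlc.long, correlation=corSymm(form = ~ 1 | id))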
> summary(glsTLC3)
Generalized least squares fit by REML
  Model: lead ~ factor(group) * factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2471.632 2531.201 -1220.816

Correlation Structure: General
 Formula: ~1 | id
 Parameter estimate(s):
 Correlation:
  1     2     3
2 0.596
3 0.582 0.769
4 0.536 0.552 0.551

Coefficients:
                               Value Std.Error    t-value p-value
(Intercept)                   26.540 0.9374730  28.310148  0.0000
factor(group)P                -0.268 1.3257871  -0.202144  0.8399
factor(time)1                -13.018 0.8425878 -15.450023  0.0000
factor(time)4                -11.026 0.8576242 -12.856447  0.0000
factor(time)6                 -5.778 0.9034129  -6.395747  0.0000
factor(group)P:factor(time)1  11.406 1.1915990   9.572012  0.0000
factor(group)P:factor(time)4   8.824 1.2128637   7.275343  0.0000
factor(group)P:factor(time)6   3.152 1.2776188   2.467090  0.0140

 Correlation:
                             (Intr) fct()P fct()1 fct()4 fct()6 f()P:()1 f()P:()4
factor(group)P               -0.707
factor(time)1                -0.449  0.318
factor(time)4                -0.457  0.323  0.719
factor(time)6                -0.482  0.341  0.485  0.492
factor(group)P:factor(time)1  0.318 -0.449 -0.707 -0.508 -0.343
factor(group)P:factor(time)4  0.323 -0.457 -0.508 -0.707 -0.348  0.719
factor(group)P:factor(time)6  0.341 -0.482 -0.343 -0.348 -0.707  0.485    0.492

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.5135258 -0.6970199 -0.1497978  0.5540105  6.5075307

Residual standard error: 6.628935
Degrees of freedom: 400 total; 392 residual
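glsTLC4:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j + \beta_3(\text{group}_i \times \text{time}_j)$$
with the covariance matrix having the unstructured form (general correlation and a different variance at each occasion)
$$
\text{Cov}(Y_i) =
\begin{pmatrix}
\sigma^2_1 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{21} & \sigma^2_2 & \sigma_{23} & \sigma_{24} \\
\sigma_{31} & \sigma_{32} & \sigma^2_3 & \sigma_{34} \\
\sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma^2_4
\end{pmatrix}
$$
> gls(lead ~ factor(group)*factor(time), data=tlc.long, correlation=corSymm(form = ~ 1 | id),
      weights=varIdent(form = ~ 1 | time))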
> summary(glsTLC4)
Generalized least squares fit by REML
  Model: lead ~ factor(group) * factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2452.076 2523.559 -1208.038

Correlation Structure: General
 Formula: ~1 | id
 Parameter estimate(s):
 Correlation:
  1     2     3
2 0.571
3 0.570 0.775
4 0.577 0.582 0.581
Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | time
 Parameter estimates:
       0        1        4        6
1.000000 1.325887 1.370453 1.524826

Coefficients:
                               Value Std.Error   t-value p-value
(Intercept)                   26.540 0.7102888  37.36508  0.0000
factor(group)P                -0.268 1.0045001  -0.26680  0.7898
factor(time)1                -13.018 0.7919194 -16.43854  0.0000
factor(time)4                -11.026 0.8149168 -13.53022  0.0000
factor(time)6                 -5.778 0.8885252  -6.50291  0.0000
factor(group)P:factor(time)1  11.406 1.1199432  10.18445  0.0000
factor(group)P:factor(time)4   8.824 1.1524663   7.65662  0.0000
factor(group)P:factor(time)6   3.152 1.2565644   2.50843  0.0125

 Correlation:
                             (Intr) fct()P fct()1 fct()4 fct()6 f()P:()1 f()P:()4
factor(group)P               -0.707
factor(time)1                -0.218  0.154
factor(time)4                -0.191  0.135  0.680
factor(time)6                -0.096  0.068  0.386  0.385
factor(group)P:factor(time)1  0.154 -0.218 -0.707 -0.481 -0.273
factor(group)P:factor(time)4  0.135 -0.191 -0.481 -0.707 -0.272  0.680
factor(group)P:factor(time)6  0.068 -0.096 -0.273 -0.272 -0.707  0.386    0.385

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.1756391 -0.6849959 -0.1515546  0.5294172  5.6327402

Residual standard error: 5.0225
Degrees of freedom: 400 total; 392 residual
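glsTLC5:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j$$
with the covariance matrix having the unstructured form (general correlation and a different variance at each occasion)
$$
\text{Cov}(Y_i) =
\begin{pmatrix}
\sigma^2_1 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{21} & \sigma^2_2 & \sigma_{23} & \sigma_{24} \\
\sigma_{31} & \sigma_{32} & \sigma^2_3 & \sigma_{34} \\
\sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma^2_4
\end{pmatrix}
$$
> gls(lead ~ factor(group)+factor(time), data=tlc.long, correlation=corSymm(form = ~ 1 | id),
      weights=varIdent(form = ~ 1 | time))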
> summary(glsTLC5)
Generalized least squares fit by REML
  Model: lead ~ factor(group) + factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2525.171 2584.854 -1247.585

Correlation Structure: General
 Formula: ~1 | id
 Parameter estimate(s):
 Correlation:
  1     2     3
2 0.334
3 0.407 0.822
4 0.551 0.512 0.550
Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | time
 Parameter estimates:
       0        1        4        6
1.000000 1.567427 1.478107 1.484946

Coefficients:
                   Value Std.Error  t-value p-value
(Intercept)    25.399921 0.7104368 35.75254  0.0000
factor(group)P  2.012157 0.9786873  2.05598  0.0404
factor(time)1  -7.315000 0.7993320 -9.15139  0.0000
factor(time)4  -6.614000 0.7247826 -9.12549  0.0000
factor(time)6  -4.202000 0.6448471 -6.51627  0.0000

 Correlation:
               (Intr) fct()P fct()1 fct()4
factor(group)P -0.689
factor(time)1  -0.222  0.000
factor(time)4  -0.205  0.000  0.814
factor(time)6  -0.105  0.000  0.437  0.446

Standardized residuals:
        Min          Q1         Med          Q3         Max
-2.23560001 -0.64349384 -0.04593555  0.61603808  5.58341364

Residual standard error: 5.150371
Degrees of freedom: 400 total; 395 residual
> anova(glsTLC,glsTLC2)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC      1 10 2480.621 2520.334 -1230.311
glsTLC2     2 13 2459.960 2511.587 -1216.980 1 vs 2 26.66058  <.0001
>
> anova(glsTLC,glsTLC3)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC      1 10 2480.621 2520.334 -1230.311
glsTLC3     2 15 2471.632 2531.200 -1220.816 1 vs 2 18.98944  0.0019
>
> anova(glsTLC,glsTLC4)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC      1 10 2480.621 2520.334 -1230.311
glsTLC4     2 18 2452.076 2523.559 -1208.038 1 vs 2 44.54507  <.0001
>
> anova(glsTLC2,glsTLC3)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC2     1 13 2459.960 2511.587 -1216.980
glsTLC3     2 15 2471.632 2531.200 -1220.816 1 vs 2 7.671143  0.0216
>
> anova(glsTLC2,glsTLC4)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC2     1 13 2459.960 2511.587 -1216.980
glsTLC4     2 18 2452.076 2523.559 -1208.038 1 vs 2 17.88450  0.0031
>
> anova(glsTLC3,glsTLC4)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC3     1 15 2471.632 2531.200 -1220.816
glsTLC4     2 18 2452.076 2523.559 -1208.038 1 vs 2 25.55564  <.0001
>
> anova(glsTLC4,glsTLC5)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC4     1 18 2452.076 2523.559 -1208.038
glsTLC5     2 15 2525.171 2584.854 -1247.585 1 vs 2 79.09486  <.0001
Warning message:
In anova.lme(object = glsTLC4, glsTLC5) :
  Fitted objects with different fixed effects. REML comparisons are not meaningful.
>
> anova(update(glsTLC4,method='ML'),update(glsTLC5, method='ML'))
                               Model df      AIC      BIC    logLik   Test  L.Ratio p-value
update(glsTLC4, method = "ML")     1 18 2461.368 2533.214 -1212.684
update(glsTLC5, method = "ML")     2 15 2529.555 2589.427 -1249.778 1 vs 2 74.18778  <.0001
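The likelihood ratio statistics in this output can be reproduced by hand from the reported log-likelihoods. The sketch below does this in base R for the first comparison (glsTLC against glsTLC2), which involves the same fixed effects, so the REML comparison is meaningful. The input numbers are copied from the output; small differences in the last digits come from rounding of the printed log-likelihoods.

```r
# REML log-likelihoods and parameter counts, as printed by anova().
logLik_cs  <- -1230.311; df_cs  <- 10   # glsTLC:  compound symmetry
logLik_hcs <- -1216.980; df_hcs <- 13   # glsTLC2: compound symmetry, occasion-specific variances

# Likelihood ratio statistic and its degrees of freedom.
lr_stat <- 2 * (logLik_hcs - logLik_cs)
lr_df   <- df_hcs - df_cs

# p-value from the chi-squared reference distribution.
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# AIC = -2 logLik + 2 * (number of parameters).
aic_cs  <- -2 * logLik_cs  + 2 * df_cs
aic_hcs <- -2 * logLik_hcs + 2 * df_hcs
```

The statistic is about 26.66 on 3 degrees of freedom, in agreement with the first table, and the AIC values match the printed ones up to rounding.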
Random Effects
• We need to understand (at least qualitatively) the likely sources of random variation.
• One possible source is random effects: when units are sampled at random from a population, various aspects of their behaviour may show stochastic variation between units.
• We introduce the linear random effects model, where
  - the response is assumed to be a linear function of explanatory variables, with regression coefficients that vary from one individual to the next
  - this variability reflects natural heterogeneity due to unmeasured factors
Example: Children's birth weight and growth rate.
• A random effects model is a reasonable description if the set of coefficients from a population of children can be thought of as a sample from a distribution.
• Given the actual coefficients for a child, the linear random effects model assumes that repeated observations for that child are independent.
• Correlation arises because we cannot observe the underlying growth curve, that is, the regression coefficients; we only have imperfect measurements of weight on each infant.
• So the model takes the form
$$E(Y_{ij} \mid U_i) = (\beta_0 + U_i) + \beta_1\,\text{time}_{ij}$$
• Typically, a parametric model, such as a Gaussian with mean 0 and unknown variance $\sigma^2$, is used for $U_i$.
Linear Mixed Models
• The usual linear model
$$y = X\beta + e,$$
where
  - $y = (y_1, \ldots, y_n)'$ is an $n \times 1$ vector of independent observations
  - $\beta$ is a $p \times 1$ vector of unknown parameters
  - $X$ is an $n \times p$ design (model) matrix
  - $e = (e_1, \ldots, e_n)'$ is an $n \times 1$ vector of independent errors
• The linear mixed model (general)
$$Y_i = X_i\beta + Z_i b_i + e_i,$$
where
  - $Y_i$, $\beta$ and $e_i$ are as before, with
    * $E(e_i) = 0_n$
    * $\text{Var}(e_i) = W$
  - $Z_i$ is a given $n \times q$ matrix (the columns of $Z_i$ are a subset of the columns of $X_i$)
  - $b_i$ is an unobservable $q \times 1$ random vector, following (theoretically) any multivariate distribution with
    * $E(b_i) = 0_q$
    * $\text{Var}(b_i) = B$
    In practice, $b_i$ is assumed to follow a multivariate normal distribution.
  - In addition, the vectors $b_i$ and $e_i$ are assumed uncorrelated.
  - $E(Y_i) = X_i\beta$
  - $\text{Var}(Y_i) = \text{Var}(X_i\beta + Z_i b_i + e_i) = Z_i B Z_i' + W$.
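The marginal covariance $Z_i B Z_i' + W$ can be computed directly for a small case. The following base R sketch assumes a random intercept and a random slope on time (so $q = 2$) and independent errors with a common variance; the occasions and all numerical values are hypothetical.

```r
times  <- 0:3                      # four occasions (hypothetical)
Z      <- cbind(1, times)          # random intercept and random slope
B      <- matrix(c(4.0, 0.5,
                   0.5, 0.25), 2, 2)   # Var(b_i), hypothetical values
sigma2 <- 1                        # error variance (hypothetical)
W      <- sigma2 * diag(length(times))

# Marginal covariance of Y_i implied by the mixed model.
V <- Z %*% B %*% t(Z) + W

# Implied marginal correlation matrix.
R <- cov2cor(V)
```

In this sketch the variances on the diagonal of `V` grow with time and the correlations in `R` depend on the pair of occasions, so a random slope yields a richer pattern than compound symmetry while `B` and `sigma2` still contain only four parameters.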
Random Intercept Model
Consider the model
$$
\begin{aligned}
Y_{ij} &= X_{ij}'\beta + b_i + e_{ij} \\
       &= (\beta_1 + b_i) + X_{ij2}\beta_2 + \cdots + X_{ijp}\beta_p + e_{ij}
\end{aligned}
$$
• Each subject's profile appears flat across occasions (or parallel to the population profile).
• Observations $Y_{ij}$ vary around a different value for each subject. These values are the intercepts of the lines around which each subject's responses vary, where $b_i$ represents the deviation of subject $i$'s intercept from the population intercept ($\beta_1$).
• The set of intercepts is a sample from the population of intercepts.
• This implies that there is between-subject variability (equivalent to within-subject correlation).
[Figure: Response (axis from -2 to 2) plotted against Time (occasions 1 to 5).]
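• Furthermore, the variance of $Y_{ij}$ takes the form
$$
\begin{aligned}
\text{Var}(Y_{ij}) &= \text{Var}(X_{ij}'\beta + b_i + e_{ij}) \\
                   &= \text{Var}(b_i) + \text{Var}(e_{ij}) \\
                   &= \sigma_b^2 + \sigma^2,
\end{aligned}
$$
and the covariance between any pair of observations on the same subject is
$$
\begin{aligned}
\text{Cov}(Y_{ij}, Y_{ik}) &= \text{Cov}(X_{ij}'\beta + b_i + e_{ij},\; X_{ik}'\beta + b_i + e_{ik}) \\
                           &= \text{Cov}(b_i, b_i) \\
                           &= \sigma_b^2.
\end{aligned}
$$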
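The covariance matrix then becomes
$$
\text{Cov}(Y_i) =
\begin{pmatrix}
\sigma_b^2 + \sigma^2 & \sigma_b^2 & \sigma_b^2 & \cdots & \sigma_b^2 \\
\sigma_b^2 & \sigma_b^2 + \sigma^2 & \sigma_b^2 & \cdots & \sigma_b^2 \\
\sigma_b^2 & \sigma_b^2 & \sigma_b^2 + \sigma^2 & \cdots & \sigma_b^2 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sigma_b^2 & \sigma_b^2 & \sigma_b^2 & \cdots & \sigma_b^2 + \sigma^2
\end{pmatrix},
$$
and the correlation between two observations becomes
$$\rho = \text{Corr}(Y_{ij}, Y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}.$$
• The presence of the random effect induces correlation among repeated measurements. This is also known as intra-class correlation.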
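A small base R check of this result: with a single random intercept, the implied covariance matrix has the compound symmetry form and its common correlation equals the intra-class correlation. The variance values are hypothetical and chosen only for illustration.

```r
n        <- 4    # number of occasions
sigma2_b <- 3    # between-subject variance (hypothetical)
sigma2   <- 1    # within-subject variance (hypothetical)

# Random intercept: Z is a single column of ones.
Z <- matrix(1, n, 1)
V <- sigma2_b * Z %*% t(Z) + sigma2 * diag(n)

# Intra-class correlation from the formula, and from the covariance matrix.
icc <- sigma2_b / (sigma2_b + sigma2)
R   <- cov2cor(V)
```

Every off-diagonal entry of `R` equals `icc`, which is 0.75 for these values: the larger the between-subject variance relative to the within-subject variance, the stronger the correlation among a subject's repeated measurements.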
Note: In statistics, the intraclass correlation is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other. While it is viewed as a type of correlation, unlike most other correlation measures it operates on data structured as groups, rather than data structured as paired observations.
• The model
$$E(Y_{ij} \mid b_i) = X_{ij}'\beta + b_i$$
is referred to as the conditional or subject-specific mean model.
• The model
$$E(Y_{ij}) = X_{ij}'\beta$$
is referred to as the marginal or population-averaged mean model.
[Figure: Response (axis from 0 to 10) plotted against Time (occasions 1 to 5).]
Example: Orthodont Data [included in the nlme package]
• A set of measurements of the distance from the pituitary gland to the pterygomaxillary fissure, taken every 2 years.
• Measurements were taken from 8 to 14 years of age.
• We have 27 children: 16 males and 11 females.
• Data were collected from x-rays.
[Figure: distance from pituitary to pterygomaxillary fissure (mm) plotted against age (yr), one panel per subject (M01 to M16 and F01 to F11).]
Longitudinal Data Analysis 153
![Page 155: Introduction to Longitudinal Data Analysis · 2012-04-27 · Analysis of Longitudinal Data. Oxford Statistical Science Series. Fitzmaurice G.M., Laird N.M. and Ware J.H.(2004) Applied](https://reader030.vdocument.in/reader030/viewer/2022041116/5f287ee51048e5323f4cace2/html5/thumbnails/155.jpg)
> levels(Orthodont$Sex)
[1] "Male"   "Female"
> OrthoFem=Orthodont[Orthodont$Sex=="Female",]
> lmF=lmList(distance ~ age, data=OrthoFem)
> coef(lmF)
    (Intercept)   age
F10       13.55 0.450
F09       18.10 0.275
F06       17.00 0.375
F01       17.25 0.375
F05       19.60 0.275
F07       16.95 0.550
F02       14.20 0.800
F08       21.45 0.175
F03       14.40 0.850
F04       19.65 0.475
F11       18.95 0.675
> intervals(lmF)
, , (Intercept)

       lower  est.    upper
F10 10.07138 13.55 17.02862
F09 14.62138 18.10 21.57862
F06 13.52138 17.00 20.47862
F01 13.77138 17.25 20.72862
F05 16.12138 19.60 23.07862
F07 13.47138 16.95 20.42862
F02 10.72138 14.20 17.67862
F08 17.97138 21.45 24.92862
F03 10.92138 14.40 17.87862
F04 16.17138 19.65 23.12862
F11 15.47138 18.95 22.42862

, , age

          lower  est.     upper
F10  0.14009962 0.450 0.7599004
F09 -0.03490038 0.275 0.5849004
F06  0.06509962 0.375 0.6849004
F01  0.06509962 0.375 0.6849004
F05 -0.03490038 0.275 0.5849004
F07  0.24009962 0.550 0.8599004
F02  0.49009962 0.800 1.1099004
F08 -0.13490038 0.175 0.4849004
F03  0.54009962 0.850 1.1599004
F04  0.16509962 0.475 0.7849004
F11  0.36509962 0.675 0.9849004
[Figure: confidence intervals by subject (F10 to F11, in the order listed above); panels (Intercept), axis 10 to 25, and age, axis 0.0 to 1.0.]
> lmF2=update(lmF,distance~I(age-11))
> intervals(lmF2)
, , (Intercept)

       lower   est.    upper
F10 17.80704 18.500 19.19296
F09 20.43204 21.125 21.81796
F06 20.43204 21.125 21.81796
F01 20.68204 21.375 22.06796
F05 21.93204 22.625 23.31796
F07 22.30704 23.000 23.69296
F02 22.30704 23.000 23.69296
F08 22.68204 23.375 24.06796
F03 23.05704 23.750 24.44296
F04 24.18204 24.875 25.56796
F11 25.68204 26.375 27.06796

, , I(age - 11)

          lower  est.     upper
F10  0.14009962 0.450 0.7599004
F09 -0.03490038 0.275 0.5849004
F06  0.06509962 0.375 0.6849004
F01  0.06509962 0.375 0.6849004
F05 -0.03490038 0.275 0.5849004
F07  0.24009962 0.550 0.8599004
F02  0.49009962 0.800 1.1099004
F08 -0.13490038 0.175 0.4849004
F03  0.54009962 0.850 1.1599004
F04  0.16509962 0.475 0.7849004
F11  0.36509962 0.675 0.9849004
[Figure: confidence intervals by subject (F10 to F11, in the order listed above) after centering age at 11; panels (Intercept), axis 18 to 26, and I(age - 11), axis 0.0 to 1.0.]
> lmeF=lme(distance~age,data=OrthoFem,random=~1) # Using REML
> summary(lmeF)
Linear mixed-effects model fit by REML
 Data: OrthoFem
       AIC     BIC    logLik
  149.2183 156.169 -70.60916

Random effects:
 Formula: ~1 | Subject
        (Intercept)  Residual
StdDev:     2.06847 0.7800331

Fixed effects: distance ~ age
                Value Std.Error DF   t-value p-value
(Intercept) 17.372727 0.8587419 32 20.230440       0
age          0.479545 0.0525898 32  9.118598       0
 Correlation:
    (Intr)
age -0.674

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-2.2736479 -0.7090164  0.1728237  0.4122128  1.6325181

Number of Observations: 44
Number of Groups: 11
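The two standard deviations reported for the random-intercept fit are enough to compute the intraclass correlation described in the earlier note. The sketch below assumes the usual random-intercept formula, ICC = Var(b_i) / (Var(b_i) + sigma^2); the names `sigma_b2`, `sigma2` and `icc` are introduced here for illustration.

```r
library(nlme)

# Same female subset and random-intercept model as in the console output above.
OrthoFem <- Orthodont[Orthodont$Sex == "Female", ]
lmeF <- lme(distance ~ age, data = OrthoFem, random = ~ 1 | Subject)

sigma_b2 <- as.numeric(getVarCov(lmeF)[1, 1])  # between-subject variance
sigma2   <- lmeF$sigma^2                       # within-subject (residual) variance

# Share of the total variance that is due to differences between subjects.
icc <- sigma_b2 / (sigma_b2 + sigma2)
print(icc)
```

With the printed estimates (2.06847 and 0.7800331) this gives roughly 0.88, so most of the variability in these data is between girls rather than within a girl over time.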
> lmeF0=lme(distance~I(age-11),data=OrthoFem,random=~1)
> summary(lmeF0)
Linear mixed-effects model fit by REML
 Data: OrthoFem
       AIC     BIC    logLik
  149.2183 156.169 -70.60916

Random effects:
 Formula: ~1 | Subject
        (Intercept)  Residual
StdDev:     2.06847 0.7800331

Fixed effects: distance ~ I(age - 11)
                Value Std.Error DF t-value p-value
(Intercept) 22.647727 0.6346568 32 35.6850       0
I(age - 11)  0.479545 0.0525898 32  9.1186       0
 Correlation:
            (Intr)
I(age - 11) 0

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-2.2736479 -0.7090164  0.1728237  0.4122128  1.6325181

Number of Observations: 44
Number of Groups: 11
Random Intercept and Slope Model
Consider the model
$$Y_{ij} = (\beta_1 + b_{1i}) + (\beta_2 + b_{2i})\,t_{ij} + e_{ij}.$$
• Subjects vary with respect to (i) their baseline level (when $t_{i1} = 0$) and (ii) the rate of change of the response over time.
• In this particular case we have $q = p = 2$ and
$$X_i = Z_i = \begin{pmatrix} 1 & t_{i1} \\ 1 & t_{i2} \\ \vdots & \vdots \\ 1 & t_{in_i} \end{pmatrix}.$$
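• Additionally, consider the variance
$$\begin{aligned}
\mathrm{Var}(Y_{ij}) &= \mathrm{Var}(X'_{ij}\beta + Z'_{ij}b_i + e_{ij}) \\
&= \mathrm{Var}(Z'_{ij}b_i + e_{ij}) \\
&= \mathrm{Var}(b_{1i} + b_{2i}t_{ij} + e_{ij}) \\
&= \mathrm{Var}(b_{1i}) + 2t_{ij}\,\mathrm{Cov}(b_{1i}, b_{2i}) + t_{ij}^2\,\mathrm{Var}(b_{2i}) + \mathrm{Var}(e_{ij}),
\end{aligned}$$
and the covariance among the repeated observations of the same subject ($j \neq k$) becomes
$$\mathrm{Cov}(Y_{ij}, Y_{ik}) = \mathrm{Var}(b_{1i}) + (t_{ij} + t_{ik})\,\mathrm{Cov}(b_{1i}, b_{2i}) + t_{ij}t_{ik}\,\mathrm{Var}(b_{2i}).$$
• Hence, the covariance matrix can be expressed as a function of time.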
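The variance and covariance formulas above can be evaluated for any set of measurement times. The sketch below uses base R only; the values of `B`, `sigma2` and `times` are hypothetical numbers chosen for illustration, not estimates from any data set, and `marginal_cov` is a helper name introduced here. It assumes independent errors with constant variance.

```r
# Marginal covariance of one subject under a random intercept and slope model:
# Z B Z' + sigma2 * I, with Z = [1, t].
marginal_cov <- function(times, B, sigma2) {
  Z <- cbind(1, times)
  Z %*% B %*% t(Z) + sigma2 * diag(length(times))
}

# Hypothetical values: Var(b1) = 4, Cov(b1, b2) = 0.3, Var(b2) = 0.25.
B      <- matrix(c(4.0, 0.3,
                   0.3, 0.25), nrow = 2, byrow = TRUE)
sigma2 <- 1.5
times  <- c(0, 1, 2, 3)

V <- marginal_cov(times, B, sigma2)
print(V)

# Element-wise check against the formulas in the text (t = 1 and t = 2).
cov_12 <- B[1, 1] + (1 + 2) * B[1, 2] + 1 * 2 * B[2, 2]
var_2  <- B[1, 1] + 2 * 2 * B[1, 2] + 2^2 * B[2, 2] + sigma2
```

Changing `times` changes the entries of `V` but not the number of parameters behind it, which is what allows each subject to have its own measurement schedule.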
Covariance Structure
In the linear mixed model
$$Y_i = X_i\beta + Z_i b_i + e_i,$$
the matrix $W_i = \mathrm{Cov}(e_i)$ introduces the covariance between the repeated observations when focusing on the conditional mean response profile of a specific individual. In other words, it is the covariance of the $i$th individual's deviations from the response profile
$$E(Y_i \mid b_i) = X_i\beta + Z_i b_i.$$
• The usual assumption is $W_i = \sigma^2 I_{n_i}$. This is referred to as the conditional independence assumption.
• The conditional covariance becomes
$$\mathrm{Cov}(Y_i \mid b_i) = \mathrm{Cov}(e_i) = W_i.$$
• The marginal covariance then takes the form
$$\mathrm{Cov}(Y_i) = Z_i B Z'_i + W_i.$$
• $\mathrm{Cov}(Y_i)$ allows for between-subject ($B$) and within-subject ($W_i$) sources of variation.
• Because $\mathrm{Cov}(Y_i)$ is a function of the times of measurement (when time is included in $Z_i$), in principle each subject may have its own measurement times.
• Random-effects models for the covariance are compared with the likelihood ratio test (based on REML). A test of two nested models, one with $q$ and the other with $q + 1$ correlated random effects, leads to a chi-square test on $q + 1$ df (1 for the variance and $q$ for the covariances). However, caution is needed when the null hypothesis is on the boundary of the parameter space.
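The comparison described in the last bullet can be sketched for the Orthodont data with q = 1 (random intercept) against q + 1 = 2 (random intercept and slope). Both models are fitted by REML with the same fixed effects. The boundary adjustment shown here, an equal mixture of chi-square distributions on q and q + 1 df, is a commonly used approximation and is an assumption added here, not something taken from these notes; the object names are illustrative.

```r
library(nlme)

# q = 1: random intercept only; q + 1 = 2: random intercept and slope.
m1 <- lme(distance ~ I(age - 11), data = Orthodont, random = ~ 1 | Subject)
m2 <- lme(distance ~ I(age - 11), data = Orthodont,
          random = ~ I(age - 11) | Subject)

# REML likelihood ratio statistic.
lrt <- as.numeric(2 * (logLik(m2) - logLik(m1)))

# Naive reference distribution: chi-square on q + 1 = 2 df.
p_naive <- pchisq(lrt, df = 2, lower.tail = FALSE)

# Boundary-adjusted p-value: 50:50 mixture of chi-square on 1 and 2 df.
p_mixture <- 0.5 * pchisq(lrt, df = 1, lower.tail = FALSE) +
             0.5 * pchisq(lrt, df = 2, lower.tail = FALSE)

print(c(lrt = lrt, p_naive = p_naive, p_mixture = p_mixture))
```

The naive p-value is conservative because the null hypothesis places the slope variance at zero, the edge of its parameter space; the mixture p-value is always the smaller of the two.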
Some Characteristics
• Balanced data are not required.
• The covariances are functions of time. As a result, if time is included in $Z_i$, each patient can have their own sequence of measurement times. This property makes these models suitable for the analysis of real-life longitudinal data.
• The number of covariance parameters that need to be estimated remains unchanged regardless of the number of measurements.
• The random-effects covariance structure allows the variances and covariances to change (increase or decrease) as a function of the measurement times, without imposing restrictive structures as the covariance pattern models do.
Prediction
• In the analysis of longitudinal data the interest in the fixed effects $\beta$ is obvious. The interpretation of the parameters is clear and associated with the mean response over time and changes in covariates.
• In many cases, however, subject-specific trajectories are of interest.
• Under the linear mixed-effects model, patient-specific response trajectories can be predicted/estimated.
• This is possible by obtaining predictions of the subject-specific effects $b_i$ (random effects), or of
$$X_i\beta + Z_i b_i.$$
• Generally, the problem of predicting a random variable, and hence the patient-specific response trajectory, is that of predicting its conditional mean given the available data.
• Two pieces of information contribute to the estimation/prediction of $b_i$.
  - The first is the statement that $b_i \sim N(0, B)$ (the prior of $b_i$).
  - The second is the likelihood of the data $Y_i$, which states that $Y_i \mid b_i \sim N(X_i\beta + Z_i b_i,\ W_i)$.
• We combine the information by multiplying the two densities (to form the joint density) and, after some algebra, we get
$$E(b_i \mid Y_i) = B Z'_i \Sigma_i^{-1}(Y_i - X_i\beta),$$
where $\Sigma_i = \mathrm{Cov}(Y_i) = Z_i B Z'_i + W_i$. This is known as the BLUP (best linear unbiased predictor).
• The predictor of $b_i$ depends on $B$. Hence, when this is replaced by its REML estimator, we have
$$\hat{b}_i = \hat{B} Z'_i \hat{\Sigma}_i^{-1}(Y_i - X_i\hat{\beta}),$$
also known as the empirical BLUP (or empirical Bayes estimate).
• Given $\hat{b}_i$ we obtain
$$\hat{Y}_i = X_i\hat{\beta} + Z_i\hat{b}_i.$$
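The empirical BLUP formula can be evaluated by hand and compared with the values returned by nlme. The sketch below assumes a random-intercept model for the Orthodont data, so that Z_i is a column of ones and W_i = sigma^2 I; the helper `eblup_one` and the other object names are introduced here for illustration.

```r
library(nlme)

# Random-intercept model (REML is the lme default).
fit <- lme(distance ~ I(age - 11), data = Orthodont, random = ~ 1 | Subject)

B      <- as.numeric(getVarCov(fit)[1, 1])  # estimated Var(b_i)
sigma2 <- fit$sigma^2                       # estimated residual variance
beta   <- fixef(fit)                        # estimated fixed effects

# Plain data frame copy, so that split() does not rely on groupedData methods.
ortho <- data.frame(Subject  = as.character(Orthodont$Subject),
                    age      = Orthodont$age,
                    distance = Orthodont$distance)

# Empirical BLUP for one subject: B Z' Sigma^{-1} (Y - X beta).
eblup_one <- function(d) {
  X     <- cbind(1, d$age - 11)
  Z     <- matrix(1, nrow = nrow(d), ncol = 1)
  Sigma <- B * Z %*% t(Z) + sigma2 * diag(nrow(d))
  drop(B * t(Z) %*% solve(Sigma, d$distance - X %*% beta))
}
manual <- sapply(split(ortho, ortho$Subject), eblup_one)

# Values reported by nlme, matched to the same subject order.
re   <- ranef(fit)
auto <- setNames(re[["(Intercept)"]], rownames(re))
max_diff <- max(abs(manual - auto[names(manual)]))
print(max_diff)
```

The two sets of predictions should agree up to numerical rounding, which confirms that `ranef()` reports the empirical BLUPs of this model.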
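• As a result we have
$$\begin{aligned}
\hat{Y}_i &= X_i\hat{\beta} + Z_i\hat{b}_i \\
&= X_i\hat{\beta} + Z_i\hat{B}Z'_i\hat{\Sigma}_i^{-1}(Y_i - X_i\hat{\beta}) \\
&= (I_{n_i} - Z_i\hat{B}Z'_i\hat{\Sigma}_i^{-1})X_i\hat{\beta} + Z_i\hat{B}Z'_i\hat{\Sigma}_i^{-1}Y_i \\
&= (\hat{W}_i\hat{\Sigma}_i^{-1})X_i\hat{\beta} + (I_{n_i} - \hat{W}_i\hat{\Sigma}_i^{-1})Y_i,
\end{aligned}$$
where the last step uses
$$\hat{\Sigma}_i\hat{\Sigma}_i^{-1} = I_{n_i} = (Z_i\hat{B}Z'_i + \hat{W}_i)\hat{\Sigma}_i^{-1} = Z_i\hat{B}Z'_i\hat{\Sigma}_i^{-1} + \hat{W}_i\hat{\Sigma}_i^{-1}.$$
This expression shows that $\hat{Y}_i$ is a weighted mean of $X_i\hat{\beta}$, the population-averaged mean response profile, and $Y_i$, the $i$th patient's observed response profile.
• As a result, the predicted response profile is pulled (shrinks) towards the population-averaged mean response profile.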
• The amount of shrinkage depends on $W_i$ and $\Sigma_i$.
• If $W_i$ is "large", then the within-subject variability is greater than the between-subject variability, and hence more weight is given to the population-averaged mean response profile $X_i\beta$.
• The opposite holds when $W_i$ is "small".
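A worked special case makes the weights concrete. The sketch below assumes a random-intercept model with $W_i = \sigma^2 I_{n_i}$ and writes $\sigma_b^2$ for the random-intercept variance (a symbol introduced here); $\bar{Y}_i$ is the subject's mean response and $\bar{X}'_i\hat{\beta}$ the mean of the subject's population-averaged fitted values. The numbers are the REML standard deviations reported by summary(lmeOrth1) in the Orthodont example (2.114724 for the intercept, 1.431592 for the residual), with $n_i = 4$ measurements per child.

```latex
\hat{b}_i
  = \frac{n_i\,\sigma_b^2}{n_i\,\sigma_b^2 + \sigma^2}
    \left(\bar{Y}_i - \bar{X}'_i\hat{\beta}\right),
\qquad
\frac{n_i\,\sigma_b^2}{n_i\,\sigma_b^2 + \sigma^2}
  = \frac{4 \times 2.114724^2}{4 \times 2.114724^2 + 1.431592^2}
  \approx \frac{17.888}{19.938}
  \approx 0.897 .
```

Each child's predicted intercept deviation is therefore about 90% of the raw mean residual. A larger $\sigma^2$ or a smaller $n_i$ lowers this factor and strengthens the pull towards the population-averaged profile.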
Example: Orthodont (cont.)
> lmeOrth1=lme(distance ~ I(age-11),data=Orthodont,random=~1)
> lmeOrth1ml=update(lmeOrth1,method='ML')
> lmeOrth2=lme(distance ~ I(age-11),data=Orthodont) # no random argument: random intercept and slope, see summary(lmeOrth2)
> lmeOrth2ml=update(lmeOrth2,method='ML')
> lmeOrth3=update(lmeOrth2,fixed=distance ~ Sex*I(age-11))
> summary(lmeOrth1)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC      BIC    logLik
  455.0025 465.6563 -223.5013

Random effects:
 Formula: ~1 | Subject
        (Intercept) Residual
StdDev:    2.114724 1.431592

Fixed effects: distance ~ I(age - 11)
                Value Std.Error DF  t-value p-value
(Intercept) 24.023148 0.4296605 80 55.91193       0
I(age - 11)  0.660185 0.0616059 80 10.71626       0
 Correlation:
            (Intr)
I(age - 11) 0

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-3.66453932 -0.53507984 -0.01289591  0.48742859  3.72178465

Number of Observations: 108
Number of Groups: 27
> OrthRE1ml=random.effects(lmeOrth1ml)
> OrthRE1ml
    (Intercept)
M16  -0.9152788
M05  -0.9152788
M02  -0.5798146
M11  -0.3561719
M07  -0.2443505
M08  -0.1325291
M03   0.2029351
M12   0.2029351
M13   0.2029351
M14   0.7620421
M09   0.9856849
M15   1.6566133
M06   2.1038989
M04   2.3275416
M01   3.3339342
M10   4.8994337
F10  -4.9408491
F09  -2.5925998
F06  -2.5925998
F01  -2.3689570
F05  -1.2507430
F07  -0.9152788
F02  -0.9152788
F08  -0.5798146
F03  -0.2443505
F04   0.7620421
F11   2.1038989
> coef(lmeOrth1) # subject specific coefficients (random intercept only)
    (Intercept) I(age - 11)
M16    23.10517   0.6601852
M05    23.10517   0.6601852
M02    23.44163   0.6601852
M11    23.66593   0.6601852
M07    23.77808   0.6601852
M08    23.89023   0.6601852
M03    24.22668   0.6601852
M12    24.22668   0.6601852
M13    24.22668   0.6601852
M14    24.78744   0.6601852
M09    25.01174   0.6601852
M15    25.68464   0.6601852
M06    26.13325   0.6601852
M04    26.35755   0.6601852
M01    27.36691   0.6601852
M10    28.93702   0.6601852
F10    19.06774   0.6601852
F09    21.42291   0.6601852
F06    21.42291   0.6601852
F01    21.64721   0.6601852
F05    22.76872   0.6601852
F07    23.10517   0.6601852
F02    23.10517   0.6601852
F08    23.44163   0.6601852
F03    23.77808   0.6601852
F04    24.78744   0.6601852
F11    26.13325   0.6601852
> summary(lmeOrth1ml)
Linear mixed-effects model fit by maximum likelihood
 Data: Orthodont
       AIC      BIC    logLik
  451.3895 462.1181 -221.6948

Random effects:
 Formula: ~1 | Subject
        (Intercept) Residual
StdDev:    2.072142 1.422728

Fixed effects: distance ~ I(age - 11)
                Value Std.Error DF  t-value p-value
(Intercept) 24.023148 0.4255878 80 56.44699       0
I(age - 11)  0.660185 0.0617993 80 10.68272       0
 Correlation:
            (Intr)
I(age - 11) 0

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-3.68695130 -0.53862941 -0.01232442  0.49100161  3.74701483

Number of Observations: 108
Number of Groups: 27
> coef(lmeOrth1ml) # subject specific coefficients (random intercept only)
    (Intercept) I(age - 11)
M16    23.10787   0.6601852
M05    23.10787   0.6601852
M02    23.44333   0.6601852
M11    23.66698   0.6601852
M07    23.77880   0.6601852
M08    23.89062   0.6601852
M03    24.22608   0.6601852
M12    24.22608   0.6601852
M13    24.22608   0.6601852
M14    24.78519   0.6601852
M09    25.00883   0.6601852
M15    25.67976   0.6601852
M06    26.12705   0.6601852
M04    26.35069   0.6601852
M01    27.35708   0.6601852
M10    28.92258   0.6601852
F10    19.08230   0.6601852
F09    21.43055   0.6601852
F06    21.43055   0.6601852
F01    21.65419   0.6601852
F05    22.77241   0.6601852
F07    23.10787   0.6601852
F02    23.10787   0.6601852
F08    23.44333   0.6601852
F03    23.77880   0.6601852
F04    24.78519   0.6601852
F11    26.12705   0.6601852
>plot(compareFits(coef(lmeOrth1),coef(lmeOrth1ml)))
[Figure: coef(lmeOrth1) and coef(lmeOrth1ml) compared by subject; panels (Intercept) and I(age - 11).]
>plot(augPred(lmeOrth1),aspect="xy",grid=T)
[Figure: distance from pituitary to pterygomaxillary fissure (mm) against age (yr) with the predictions from lmeOrth1, one panel per subject.]
> summary(lmeOrth2)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC      BIC    logLik
  454.6367 470.6173 -221.3183

Random effects:
 Formula: ~I(age - 11) | Subject
 Structure: General positive-definite
            StdDev    Corr
(Intercept) 2.1343327 (Intr)
I(age - 11) 0.2264275 0.503
Residual    1.3100394

Fixed effects: distance ~ I(age - 11)
                Value Std.Error DF  t-value p-value
(Intercept) 24.023148 0.4296608 80 55.91189       0
I(age - 11)  0.660185 0.0712532 80  9.26534       0
 Correlation:
            (Intr)
I(age - 11) 0.294

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max
-3.223106405 -0.493761198  0.007316808  0.472151143  3.916034231

Number of Observations: 108
Number of Groups: 27
>plot(compareFits(ranef(lmeOrth2),ranef(lmeOrth2ml)),mark=c(0,0))
[Figure: ranef(lmeOrth2) and ranef(lmeOrth2ml) compared by subject; panels (Intercept) and I(age - 11).]
> summary(lmeOrth3)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC     BIC    logLik
  458.9891 498.655 -214.4945

Random effects:
 Formula: ~Sex + I(age - 11) + Sex:I(age - 11) | Subject
 Structure: General positive-definite
                      StdDev    Corr
(Intercept)           1.7178454 (Intr) SexFml I(-11)
SexFemale             1.6956351 -0.307
I(age - 11)           0.2937695 -0.009 -0.146
SexFemale:I(age - 11) 0.3160597  0.168  0.290 -0.964
Residual              1.2551778

Fixed effects: distance ~ Sex + I(age - 11) + Sex:I(age - 11)
                          Value Std.Error DF  t-value p-value
(Intercept)           24.968750 0.4572240 79 54.60945  0.0000
SexFemale             -2.321023 0.7823126 25 -2.96687  0.0065
I(age - 11)            0.784375 0.1015733 79  7.72226  0.0000
SexFemale:I(age - 11) -0.304830 0.1346293 79 -2.26421  0.0263
 Correlation:
                      (Intr) SexFml I(-11)
SexFemale             -0.584
I(age - 11)           -0.006  0.004
SexFemale:I(age - 11)  0.005  0.144 -0.754

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.96534486 -0.38609670  0.03647795  0.43142668  3.99155835

Number of Observations: 108
Number of Groups: 27
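The random-effects formula in this output contains Sex and its interaction with age because the model was created by updating a fit that had no explicit `random` argument, and in that case lme takes the random-effects terms from the right-hand side of the fixed formula. If only a random intercept and slope are intended, one option is to state the random part explicitly. The sketch below is an illustration of that choice, not the model used in these notes; `fit_explicit` and `re_names` are names introduced here.

```r
library(nlme)

# Fixed effects with a Sex-by-age interaction, but random effects restricted
# to an intercept and a slope for centered age within each subject.
fit_explicit <- lme(distance ~ Sex * I(age - 11), data = Orthodont,
                    random = ~ I(age - 11) | Subject)

# Names of the random effects carried by each subject.
re_names <- colnames(ranef(fit_explicit))
print(re_names)
```

With this specification each subject has two random effects instead of four, and Sex, which does not vary within a child, no longer appears in the random part.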
> OrthRE3=random.effects(lmeOrth3)
> OrthRE3
    (Intercept)   SexFemale  I(age - 11) SexFemale:I(age - 11)
M16 -1.73612668  0.63199885 -0.121203414          0.0748642681
M05 -1.73713471  0.49796730  0.035630448         -0.0876368146
M02 -1.40604191  0.43103963 -0.003830025         -0.0370896958
M11 -1.18396932  0.56512991 -0.239248823          0.2132764937
M07 -1.07528511  0.31943477  0.008987456         -0.0407096045
M08 -0.96357680  0.47583428 -0.213277852          0.1928075453
M03 -0.63399603  0.20785928 -0.017487532         -0.0003969599
M12 -0.63483606  0.09616632  0.113207353         -0.1358145288
M13 -0.63802816 -0.32826691  0.609847916         -0.6504012907
M14 -0.08183867  0.14099033 -0.135532941          0.1380152657
M09  0.13720981 -0.12701403  0.099549847         -0.0991217929
M15  0.79838740 -0.39490093  0.177462762         -0.1605286380
M06  1.24102052 -0.32776769 -0.058124041          0.0964521169
M04  1.46326110 -0.17133882 -0.319681817          0.3739018202
M01  2.45317943 -0.81889368  0.084716304         -0.0161270990
M10  3.99777519 -1.19823860 -0.021015641          0.1385089141
F10 -1.91258504 -1.84210386  0.071770763         -0.2293495874
F09 -0.72087067 -0.69430276  0.027050737         -0.0864435068
F06 -0.71120815 -0.68499782  0.026688309         -0.0852850411
F01 -0.59610113 -0.57413261  0.022368854         -0.0714818606
F05 -0.03022851 -0.02911148  0.001134008         -0.0036244236
F07  0.16900395  0.16277491 -0.006341852          0.0202661280
F02  0.19316023  0.18603726 -0.007247922          0.0231622923
F08  0.30543005  0.29417922 -0.011461928          0.0366266523
F03  0.54331257  0.52328536 -0.020387500          0.0651510668
F04  1.02505976  0.98728531 -0.038465941          0.1229211327
F11  1.73502694  1.67108646 -0.065107527          0.2080571474
>plot(augPred(lmeOrth3),aspect="xy",grid=T)
[Figure: distance from pituitary to pterygomaxillary fissure (mm) against age (yr) with the predictions from lmeOrth3, one panel per subject.]
> newOrth=data.frame(Subject=rep(c("M11","F03"),c(3,3)),Sex=rep(c("Male","Female"),c(3,3)),age=rep(16:18,2))
> newOrth
  Subject    Sex age
1     M11   Male  16
2     M11   Male  17
3     M11   Male  18
4     F03 Female  16
5     F03 Female  17
6     F03 Female  18
> predict(lmeOrth3,newdata=newOrth,level=0:1)
  Subject predict.fixed predict.Subject
1     M11      28.89063        26.51041
2     M11      29.67500        27.05554
3     M11      30.45938        27.60066
4     F03      25.04545        26.33587
5     F03      25.52500        26.86018
6     F03      26.00455        27.38449
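The first row of this output can be reproduced by hand from the fixed effects in summary(lmeOrth3) and the random effects listed in OrthRE3. The arithmetic below is a worked check for subject M11 at age 16, so that $\text{age} - 11 = 5$; because M11 is male, the SexFemale terms are multiplied by zero and drop out.

```latex
\begin{aligned}
\texttt{predict.fixed}
  &= 24.968750 + 0.784375 \times 5 = 28.890625, \\
\texttt{predict.Subject}
  &= 28.890625 + (-1.18396932) + (-0.239248823) \times 5 \\
  &= 28.890625 - 1.183969 - 1.196244 \approx 26.51041 .
\end{aligned}
```

The level-0 prediction uses the fixed effects only, while the level-1 prediction adds the subject's predicted random intercept and slope, here pulling M11 below the average male trajectory.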
>lmListOrth=lmList(distance ~ I(age-11), data=Orthodont)
>compFOrth=compareFits(coef(lmListOrth),coef(lmeOrth2))
[Figure: coef(lmListOrth) and coef(lmeOrth2) compared by subject; panels (Intercept) and I(age - 11).]
>plot(comparePred(lmListOrth,lmeOrth2,length.out=2),layout=c(9,3))
[Figure: distance from pituitary to pterygomaxillary fissure (mm) against age (yr) with the predictions from lmListOrth and lmeOrth2, one panel per subject.]
Examining a Fitted Model
There are two basic assumptions that need to be assessed:
1. the within-group errors are independent and identically normally distributed with mean zero and variance $\sigma^2$ (since $W_i = \sigma^2 I$), and they are independent of the random effects;
2. the random effects are normally distributed with mean zero and covariance matrix $B$ (not depending on the group) and are independent for different groups.
Assessing assumptions on the within-group error
• The primary quantities used to assess the adequacy of the first assumption are the within-group residuals, defined as the difference between the observed and the within-group fitted value.
• The plot method of the lme class is the primary tool for obtaining diagnostics for the first assumption.
Example: Orthodont (cont.)
• Initially we consider the box plot of the residuals, by group.
• We add a vertical line at zero so we can assess whether the residuals
- are centered at zero
- have constant variance across groups
- are independent of the group level
> plot(lmeOrth2, Subject ~ resid(.), abline = 0)
[Figure: box plots of the within-group residuals (mm) for each Subject, with a vertical reference line at zero]
> plot(lmeOrth2, resid(., type = 'p') ~ fitted(.) | Sex, id = 0.05, adj = -0.3)
[Figure: standardized residuals against fitted values (mm), in separate panels for Male and Female; labeled points: M09 (twice) and M13]
> lmeOrth5 = lme(distance ~ I(age - 11), data = Orthodont, weights = varIdent(form = ~1 | Sex))
> summary(lmeOrth5)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC      BIC    logLik
  435.6466 454.2907 -210.8233

Random effects:
 Formula: ~I(age - 11) | Subject
 Structure: General positive-definite
            StdDev    Corr
(Intercept) 2.1590091 (Intr)
I(age - 11) 0.1980627 0.617
Residual    1.6452598

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | Sex
 Parameter estimates:
     Male    Female
1.0000000 0.4040981
Fixed effects: distance ~ I(age - 11)
               Value Std.Error DF  t-value p-value
(Intercept) 23.97377 0.4341697 80 55.21752       0
I(age - 11)  0.60686 0.0594260 80 10.21203       0
 Correlation:
            (Intr)
I(age - 11) 0.391

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-3.02779067 -0.48052007  0.04214476  0.51813201  3.18632228

Number of Observations: 108
Number of Groups: 27
> anova(lmeOrth2, lmeOrth5)
         Model df      AIC      BIC    logLik   Test  L.Ratio p-value
lmeOrth2     1  6 454.6367 470.6173 -221.3183
lmeOrth5     2  7 435.6466 454.2907 -210.8233 1 vs 2 20.99004  <.0001
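The likelihood ratio test in the `anova` table can be reproduced by hand. The sketch below only uses the log-likelihoods and parameter counts printed in that table; the variable names are illustrative.

```r
# Log-likelihoods and parameter counts copied from the anova() table
logLik_orth2 <- -221.3183; df_orth2 <- 6
logLik_orth5 <- -210.8233; df_orth5 <- 7

lr_stat <- 2 * (logLik_orth5 - logLik_orth2)   # likelihood ratio statistic
lr_df   <- df_orth5 - df_orth2                 # one extra variance parameter (Female/Male ratio)
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(L.Ratio = lr_stat, df = lr_df, p.value = p_value))
```

The statistic agrees with the `L.Ratio` column up to the rounding of the printed log-likelihoods, and the small p-value favours the model with sex-specific residual standard deviations.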
> plot(lmeOrth5, resid(., type = 'p') ~ fitted(.) | Sex, id = 0.05, adj = -0.3)
[Figure: standardized residuals of lmeOrth5 against fitted values (mm), in separate panels for Male and Female; labeled points: M09 (twice) and M13]
> plot(lmeOrth5, distance ~ fitted(.), id = 0.05, adj = -0.3)
[Figure: observed distance from pituitary to pterygomaxillary fissure (mm) against fitted values (mm); labeled points: M09 (twice) and M13]
> qqnorm(lmeOrth5, ~ resid(.) | Sex)
[Figure: normal Q-Q plots of the residuals (mm) against quantiles of the standard normal, in separate panels for Male and Female]
Assessing assumptions on the random effects
• The ranef method is used to obtain the estimated BLUPs of the random effects for lme objects.
• Two types of diagnostic plots will be used to assess the second assumption:
- qqnorm: normal plot
- pairs: scatter plot
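Before turning to the plots, it may help to look at what `ranef` returns. This is a minimal sketch that again assumes the form of the `lmeOrth2` call, which is not shown in this section.

```r
library(nlme)

# Assumed form of the earlier fit (random intercept and slope per Subject)
lmeOrth2 <- lme(distance ~ I(age - 11), data = Orthodont)

re <- ranef(lmeOrth2)    # one row per Subject, one column per random effect
print(dim(re))
print(colnames(re))

# Base-graphics counterpart of the normal plot, computed for one random effect
qq_int <- qqnorm(re[["(Intercept)"]], plot.it = FALSE)
print(range(qq_int$y))   # range of the estimated random intercepts
```

The `qqnorm` and `pairs` methods for lme objects draw the same quantities with lattice graphics, one panel per random effect or per group.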
> qqnorm(lmeOrth2, ~ ranef(.), id = 0.10, cex = 0.7)
[Figure: normal Q-Q plots of the estimated random effects of lmeOrth2, in separate panels for (Intercept) and I(age - 11); labeled points: M10, F10 and M13]
> pairs(lmeOrth2, ~ ranef(.) | Sex, id = ~ Subject == 'M13', adj = -0.3)
[Figure: scatter plot of the estimated random effects, I(age - 11) against (Intercept), in separate panels for Male and Female; subject M13 is labeled]
> qqnorm(lmeOrth5, ~ ranef(.), id = 0.10, cex = 0.7)
[Figure: normal Q-Q plots of the estimated random effects of lmeOrth5, in separate panels for (Intercept) and I(age - 11); labeled points: M10 and F10]
Revision: Generalized Linear Models
• So far we have discussed methods for analyzing continuous data.
• When the response is discrete (e.g. binary, count), linear models are no longer appropriate.
• Instead, we use Generalized Linear Models (GLM).
• Extensions of GLMs will be considered for the analysis of longitudinal data.
Features:
1. We have a response variable $Y_i$ for the $i$th subject, $i = 1, \ldots, N$, with an associated $p \times 1$ vector of covariates
$$X_i = (X_{i1}, \ldots, X_{ip})'.$$
2. Distributional assumption:
In linear models, the distribution of the response variable is assumed normal. The GLM extends this by assuming that the distribution of the response variable belongs to the exponential family of distributions
$$f(y_i; \theta_i, \phi) = \exp\left[\{y_i\theta_i - a(\theta_i)\}/\phi + b(y_i, \phi)\right].$$
The specific functions $a(\cdot)$ and $b(\cdot)$ distinguish one member of the family from another. The parameter $\theta_i$ is called the location parameter and $\phi$ is the dispersion parameter.
It can be shown that
$$\mathrm{Var}(Y_i) = \phi\, v(\mu_i),$$
where $v(\mu_i)$ is the variance function, a known function of the mean $\mu_i$, and $\phi > 0$. Members of this family are the Normal, Bernoulli and Poisson distributions.
3. Systematic component:
In a GLM, the mean is a function of the linear predictor $\eta_i$,
$$\eta_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_p X_{ip},$$
where usually $X_{i1} = 1$.
Note: In this context, linear means that $\eta_i$ is linear in the regression parameters $\beta$ but not necessarily in the covariates.
4. Link function:
The final step is to relate the mean $\mu_i$ to the linear predictor $\eta_i$. This is done by introducing the link function $g(\cdot)$,
$$g(\mu_i) = \eta_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_p X_{ip}.$$
The link function is a known function, e.g. $\log(\mu_i)$, chosen so that the transformed mean changes linearly with changes in the covariates.

| Distribution | $v(\mu)$ | Link function |
|---|---|---|
| Normal | $1$ | Identity: $\mu = \eta$ |
| Bernoulli | $\mu(1 - \mu)$ | Logit: $\log\left(\frac{\mu}{1 - \mu}\right) = \eta$ |
| Poisson | $\mu$ | Log: $\log(\mu) = \eta$ |
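The three rows of the table correspond to family objects that ship with base R. The sketch below evaluates the default link and the variance function of each family at one illustrative mean; the value 0.25 is arbitrary.

```r
fam <- list(normal = gaussian(), bernoulli = binomial(), poisson = poisson())

mu <- 0.25  # an illustrative mean that is valid for all three families
for (name in names(fam)) {
  f <- fam[[name]]
  cat(sprintf("%-9s link = %-8s eta = %8.4f  v(mu) = %.4f\n",
              name, f$link, f$linkfun(mu), f$variance(mu)))
}
```

Each family object also carries the inverse link (`linkinv`), which maps the linear predictor back to the mean.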
Logistic Regression: Binary outcomes
• Response $Y_i$ is a binary outcome with $P(Y_i = 1) = \mu_i$
• The mean is related to the covariates through
$$\mathrm{logit}(\mu_i) = \log\left(\frac{\mu_i}{1 - \mu_i}\right) = \beta_1 + \beta_2 X_i$$
• Responses are Bernoulli variables with
$$\mathrm{Var}(Y_i) = \mu_i(1 - \mu_i)$$
• It can also be expressed as
$$\mu_i = \frac{\exp(\beta_1 + \beta_2 X_i)}{1 + \exp(\beta_1 + \beta_2 X_i)}$$
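The two ways of writing the logistic model are inverses of each other, which can be checked numerically. In this sketch the coefficients and covariate values are hypothetical.

```r
beta1 <- -0.5   # hypothetical intercept
beta2 <-  1.2   # hypothetical slope
x     <- c(-2, 0, 1, 3)

eta <- beta1 + beta2 * x            # linear predictor on the logit scale
mu  <- exp(eta) / (1 + exp(eta))    # inverse logit, as in the last formula

# plogis() is R's built-in inverse logit and qlogis() is the logit itself
stopifnot(isTRUE(all.equal(mu, plogis(eta))), isTRUE(all.equal(qlogis(mu), eta)))

print(data.frame(x = x, eta = eta, mu = mu, variance = mu * (1 - mu)))
```

The last column shows the Bernoulli variance, which is largest when the mean is near 0.5.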
Log-Linear Model for Count data
• Response $Y_i$ is a count, assumed to follow a Poisson distribution
$$P(Y_i = y_i) = \frac{e^{-\mu_i}\mu_i^{y_i}}{y_i!}.$$
• The mean is related to the covariates through
$$\log(\mu_i) = \beta_1 + \beta_2 X_i.$$
• If the rate of occurrence is of interest, we get
$$\log(\mu_i / T_i) = \beta_1 + \beta_2 X_i \;\Rightarrow\; \log(\mu_i) = \log(T_i) + \beta_1 + \beta_2 X_i,$$
where $T_i$ is the relevant time period. The term $\log(T_i)$ is known as an offset and enters the model with a fixed coefficient equal to 1.
• Responses are Poisson variables with
$$\mathrm{Var}(Y_i) = v(\mu_i) = \mu_i.$$
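A small simulation may make the role of the offset concrete. Everything in this sketch is hypothetical: the exposure times `T_obs`, the true coefficients (-1 and 0.5) and the sample size are chosen only for illustration.

```r
set.seed(1)
n     <- 500
x     <- rnorm(n)
T_obs <- sample(1:10, n, replace = TRUE)   # hypothetical exposure times T_i
mu    <- T_obs * exp(-1 + 0.5 * x)         # rate model: log(mu / T) = -1 + 0.5 x
y     <- rpois(n, mu)

# offset() adds log(T_i) to the linear predictor with its coefficient fixed at 1
fit <- glm(y ~ x + offset(log(T_obs)), family = poisson)
print(coef(fit))
```

Only the intercept and the slope are estimated; no coefficient is reported for the offset term.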
Classes of model for dependent non-normal data
In the current setting with longitudinal data, two classes of models are widely used:
1. marginal or population average (PA) models
2. subject-specific (SS) models
i. Marginal or Population Average Models (PA)
• Consider the logistic model
$$\mathrm{logit}(E[Y_{ij}]) = X_{ij}'\beta_1, \qquad E[Y_{ij}] = \frac{\exp(X_{ij}'\beta_1)}{1 + \exp(X_{ij}'\beta_1)}$$
• Also called population-average model
• Models the mean at each time
• Changes represent changes at the average level, not within-subject change
• Does not induce any within-subject dependence
ii. Subject-Specific Models (SS)
• Consider the logistic model
$$\mathrm{logit}(E[Y_{ij} \mid b_i]) = X_{ij}'\beta_2 + b_i, \qquad E[Y_{ij} \mid b_i] = \frac{\exp(X_{ij}'\beta_2 + b_i)}{1 + \exp(X_{ij}'\beta_2 + b_i)}$$
• $b_i$ is the effect associated with subject $i$
• repeated measurements are assumed independent conditional on $b_i$
• Taking the expectation with respect to $b_i$ induces correlation among repeated measures and defines the marginal expectation
$$E[Y_{ij}] = E_S\{E[Y_{ij} \mid b_i]\} = E_S\left\{\frac{\exp(X_{ij}'\beta_2 + b_i)}{1 + \exp(X_{ij}'\beta_2 + b_i)}\right\}$$
To summarize:
• with dependent data, when we move away from the normal linear model, we no longer have a unified modeling framework
• different approaches are defined for different distributions
• as a result, parameters represent different things in different models
• extra care is needed about the scale
Comparison between PA and SS models
• This is a matter of scale
- e.g. in logistic regression for the SS model, the linear predictor
$$\mathrm{logit}(E[Y_{ij} \mid b_i]) = X_{ij}'\beta_2 + b_i$$
operates on the logit scale
- but marginalizing involves averaging on the probability scale
$$\begin{aligned}
E[Y_{ij}] &= E_S\{E[Y_{ij} \mid b_i]\} \\
&= E_S\left\{\frac{\exp(X_{ij}'\beta_2 + b_i)}{1 + \exp(X_{ij}'\beta_2 + b_i)}\right\} \\
&\neq \frac{\exp(X_{ij}'\beta_2 + E[b_i])}{1 + \exp(X_{ij}'\beta_2 + E[b_i])} \\
&= \frac{\exp(X_{ij}'\beta_2)}{1 + \exp(X_{ij}'\beta_2)}
\end{aligned}$$
• The final expression is the probability for a subject with zero subject effect and is not the same thing as the average probability over the subjects
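The inequality can be seen numerically. The sketch below assumes, purely for illustration, a normally distributed subject effect with mean zero; the value of the linear predictor and the standard deviation are hypothetical.

```r
expit <- function(eta) 1 / (1 + exp(-eta))

eta   <- 1   # hypothetical value of X'beta_2 for one covariate pattern
sigma <- 2   # hypothetical standard deviation of the subject effect b_i

# Probability for a subject with b_i = 0 (the final expression in the display)
p_zero_effect <- expit(eta)

# Average probability over subjects: integrate expit(eta + b) over b ~ N(0, sigma^2)
p_marginal <- integrate(function(b) expit(eta + b) * dnorm(b, sd = sigma),
                        lower = -Inf, upper = Inf)$value

print(c(zero_effect = p_zero_effect, marginal = p_marginal))
```

Under these assumptions the averaged probability is pulled towards 0.5 relative to the probability for a subject with zero effect, which is why SS and PA coefficients are not directly comparable.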
Marginal Models: Generalized Estimating Equations (GEE)
• marginal models are primarily used to provide inferences about the population means
• GEEs provide an extension to the GLMs to longitudinal data
• no distributional assumption for the response variable is required
• only the specification of a regression model for the mean is required
• the response variable can be continuous, binary or count
• furthermore, being a regression model, it easily handles unbalanced data
Notation:
• the notation is similar to what we have already introduced
• the response variable
$$Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})'$$
doesn't have to be continuous any more
• $n_i$ is the number of observations for subject $i$
• associated with each $Y_{ij}$ is a vector of covariates
$$X_{ij} = (X_{ij1}, X_{ij2}, \ldots, X_{ijp})',$$
where $i = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, n_i$.
• Two types of covariates are included among $X_{ij}$:
1. between-subject covariates, which do not change over time (gender, treatment, etc.)
2. within-subject covariates, which change over time (time since baseline, current status, etc.)
Note: Since marginal models are primarily concerned with population means, marginal models for longitudinal data model the mean response and the within-subject association between the repeated responses separately.
• the former is of interest
• the latter is treated as a nuisance
A marginal model has the following three-part specification:
1. The mean structure is
$$g(\mu_{ij}) = \eta_{ij} = X_{ij}'\beta,$$
where the conditional mean $\mu_{ij} = E[Y_{ij} \mid X_{ij}]$ depends on the linear predictor $\eta_{ij}$ through the link function $g(\cdot)$.
2. The variance is assumed to have the form
$$\mathrm{Var}(Y_{ij}) = \phi\, v(\mu_{ij}),$$
where $v(\mu_{ij})$ is a known function of the mean and $\phi$ is a scale parameter that may be known or may need to be estimated. The scale parameter could be different for different occasions (balanced data) or could depend on time.
3. The within-subject association among the repeated responses, given $X_{ij}$, is a function of a separate set of parameters, say $\alpha$, that could also depend on the means. This could be the pairwise correlations or log-odds ratios, depending on the type of data.
Furthermore:
1. in marginal models, the mean response and the within-subject association are modeled separately
2. the avoidance of distributional assumptions for $Y_{ij}$ leads to a method of estimation known as Generalized Estimating Equations (GEE)
e.g. Marginal Model for Continuous Response
• The mean of $Y_{ij}$ is related to the covariates through the identity link
$$\mu_{ij} = \eta_{ij} = X_{ij}'\beta.$$
• The variance has the form
$$\mathrm{Var}(Y_{ij}) = \phi\, v(\mu_{ij}) = \phi,$$
where $v(\mu_{ij}) = 1$ and $\phi$ is to be estimated.
• The within-subject association among repeated measures can be modeled using any of the ways of modeling the covariance structure already discussed (autoregressive, unstructured, etc.). For example, we can assume a first-order autoregressive correlation structure
$$\mathrm{Corr}(Y_{ij}, Y_{ik}) = \alpha^{|k - j|},$$
where $0 \le \alpha \le 1$.
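The first-order autoregressive pattern is easy to build explicitly. In this sketch the helper `ar1_corr`, the number of occasions and the value of the correlation parameter (called `alpha` in the code) are illustrative choices.

```r
# Correlation matrix with entries alpha^|k - j| for n equally spaced occasions
ar1_corr <- function(n, alpha) {
  lag <- abs(outer(seq_len(n), seq_len(n), "-"))   # |k - j| for every pair of occasions
  alpha^lag
}

R <- ar1_corr(4, alpha = 0.6)   # four occasions, hypothetical alpha
print(round(R, 3))
```

The correlation decays geometrically with the lag, so a single parameter describes the whole matrix.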
• The linear model discussed earlier can be seen as a special case of the marginal model
• Marginal models provide a broad class of models for continuous data, largely based on the choice of the link function
e.g. Marginal Model for Binary Response
• Responses are considered Bernoulli variables
• The mean of $Y_{ij}$ is related to the covariates through the logit link
$$\log\left(\frac{\mu_{ij}}{1 - \mu_{ij}}\right) = \eta_{ij} = X_{ij}'\beta.$$
• The variance has the form
$$\mathrm{Var}(Y_{ij}) = \mu_{ij}(1 - \mu_{ij}),$$
where $\phi = 1$.
• The within-subject association among repeated measures can be modeled using an unstructured pairwise log-odds ratio pattern (or any other available pattern)
$$\log \mathrm{OR}(Y_{ij}, Y_{ik}) = \alpha_{jk},$$
where
$$\mathrm{OR}(Y_{ij}, Y_{ik}) = \frac{P(Y_{ij} = 1, Y_{ik} = 1)\,P(Y_{ij} = 0, Y_{ik} = 0)}{P(Y_{ij} = 1, Y_{ik} = 0)\,P(Y_{ij} = 0, Y_{ik} = 1)}.$$
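As a worked example of the odds ratio formula, the sketch below uses a hypothetical table of joint probabilities for one pair of occasions.

```r
# Hypothetical joint probabilities; rows: Y_ij = 0 or 1, columns: Y_ik = 0 or 1
p <- matrix(c(0.40, 0.10,
              0.15, 0.35), nrow = 2, byrow = TRUE,
            dimnames = list(Yij = c("0", "1"), Yik = c("0", "1")))
stopifnot(isTRUE(all.equal(sum(p), 1)))

# Concordant cells in the numerator, discordant cells in the denominator
odds_ratio <- (p["1", "1"] * p["0", "0"]) / (p["1", "0"] * p["0", "1"])
log_or     <- log(odds_ratio)
print(c(OR = odds_ratio, logOR = log_or))
```

An odds ratio above 1 (a positive log-odds ratio) indicates that the two responses of a subject tend to agree.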
e.g. Marginal Model for Counts
• Responses are considered to follow a Poisson distribution
• The mean of $Y_{ij}$ is related to the covariates through the log link function
$$\log(\mu_{ij}) = \eta_{ij} = X_{ij}'\beta.$$
• The variance has the form
$$\mathrm{Var}(Y_{ij}) = \phi\,\mu_{ij},$$
where $\phi$ does not depend on time and has to be estimated.
• The within-subject association among repeated measures can be modeled using an unstructured pairwise correlation pattern (or any other available pattern)
$$\mathrm{Corr}(Y_{ij}, Y_{ik}) = \alpha_{jk}.$$
Here a balanced design has been assumed.
• In the model specification, the Poisson variance is multiplied by a parameter $\phi$. Hence, the variance is inflated when $\phi > 1$. It is very common for count data to have variability greater than the variance predicted by the Poisson distribution; this is called overdispersion.
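Overdispersion can be illustrated with simulated counts. The sketch below draws independent (not longitudinal) negative binomial counts, which have variance larger than their mean, and estimates the scale parameter by the usual moment estimator; all settings are hypothetical.

```r
set.seed(2)
n  <- 400
x  <- rnorm(n)
mu <- exp(0.5 + 0.3 * x)
y  <- rnbinom(n, mu = mu, size = 2)   # variance mu + mu^2 / 2, larger than mu

fit_pois <- glm(y ~ x, family = poisson)

# Moment estimate of phi: Pearson chi-square divided by the residual degrees of freedom
phi_hat <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# The quasi-Poisson family reports essentially the same quantity as its dispersion
phi_quasi <- summary(glm(y ~ x, family = quasipoisson))$dispersion
print(c(phi_hat = phi_hat, phi_quasi = phi_quasi))
```

An estimate clearly above 1 signals that standard errors based on the plain Poisson variance would be too small.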
Estimation
• The GEE approach is based on estimating equations
• The idea is to extend the usual likelihood equations for GLMs by incorporating the covariance matrix of the responses
• Assume the following marginal model
1. $g(\mu_{ij}) = \eta_{ij} = X_{ij}'\beta$.
2. $\mathrm{Var}(Y_{ij}) = \phi\, v(\mu_{ij})$, where $v(\mu_{ij})$ is a known function of the mean and $\phi$ can be different for each occasion (balanced data) or depend on time.
3. The pairwise within-subject association is assumed to be a function of the means $\mu_{ij}$ and a set of association parameters $\alpha$, such that
$$V_i = A_i^{1/2}\,\mathrm{Corr}(Y_i)\,A_i^{1/2},$$
where $A_i$ is a diagonal matrix with elements $\mathrm{Var}(Y_{ij}) = \phi\, v(\mu_{ij})$ along the diagonal (so that $A_i^{1/2}$ holds the standard deviations) and $\mathrm{Corr}(Y_i)$ is a correlation matrix, a function of $\alpha$. We call $V_i$ a working covariance matrix, to distinguish it from the true underlying covariance matrix.
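The working covariance matrix can be assembled directly from its two ingredients. This sketch uses a Poisson-type variance function and an exchangeable working correlation; the means, the scale parameter and the correlation are hypothetical values.

```r
mu    <- c(1.5, 2.0, 2.6, 3.1)   # hypothetical means for one subject, four occasions
phi   <- 1.8                     # hypothetical scale parameter
alpha <- 0.4                     # hypothetical exchangeable correlation

A <- diag(phi * mu)                     # variances phi * v(mu) with v(mu) = mu
R <- matrix(alpha, 4, 4); diag(R) <- 1  # working correlation matrix
V <- sqrt(A) %*% R %*% sqrt(A)          # V_i = A^{1/2} R A^{1/2} (A is diagonal)

print(round(V, 3))
```

The diagonal of `V` reproduces the variances, and rescaling `V` to a correlation matrix returns `R`.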
• The GLS estimator of $\beta$ is
$$\hat\beta = \left\{\sum_{i=1}^{N} X_i'\Sigma_i^{-1}X_i\right\}^{-1} \sum_{i=1}^{N} X_i'\Sigma_i^{-1}y_i,$$
obtained by solving
$$\sum_{i=1}^{N} X_i'\Sigma_i^{-1}(y_i - \mu_i) = 0, \qquad \mu_i = X_i\beta,$$
as part of the minimization of
$$\sum_{i=1}^{N} (y_i - X_i\beta)'\Sigma_i^{-1}(y_i - X_i\beta).$$
• The GEE estimator of $\beta$ is obtained from the minimization of
$$\sum_{i=1}^{N} (y_i - \mu_i(\beta))' V_i^{-1} (y_i - \mu_i(\beta))$$
with respect to $\beta$, where $V_i$ is assumed known (ignoring its dependence on $\beta$) and $\mu_i$ is the vector of $\mu_{ij} = g^{-1}(X_{ij}'\beta)$. This results in the generalized estimating equations
$$\sum_{i=1}^{N} D_i' V_i^{-1} (y_i - \mu_i) = 0,$$
where $V_i$ is the working covariance matrix and $D_i = \partial\mu_i / \partial\beta$.
Iterative estimation procedure
The GEE have no closed-form solution.
Step 1: Given current (initial) estimates of $\alpha$ and $\phi$, $V_i$ is estimated and an estimate of $\beta$ is obtained from
$$\sum_{i=1}^{N} D_i' V_i^{-1} (y_i - \mu_i) = 0.$$
Step 2: Given the current estimate of $\beta$, estimates of $\alpha$ and $\phi$ can be obtained from the standardized residuals
$$e_{ij} = \frac{Y_{ij} - \hat\mu_{ij}}{\sqrt{v(\hat\mu_{ij})}}.$$
Notes:
1. We iterate between the above steps until convergence.
2. Initial values for $\beta$ can be obtained by fitting a GLM assuming independent observations.
3. The algorithm is simple.
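The two steps can be sketched in a few lines of base R. This is a minimal illustration, not the algorithm of any particular package: it assumes a binary response with logit link, an exchangeable working correlation, $\phi$ fixed at 1, and one common moment estimator for the correlation. The function `gee_logit_exch`, its argument `alpha_fixed` and the simulated data are all hypothetical.

```r
# Minimal GEE sketch: binary response, logit link, exchangeable working correlation
gee_logit_exch <- function(y, X, id, alpha_fixed = NULL, max_iter = 100, tol = 1e-7) {
  beta <- coef(glm.fit(X, y, family = binomial()))   # initial values from an independence GLM
  rows <- split(seq_along(y), id)                    # row indices of each subject
  p    <- ncol(X)
  for (iter in seq_len(max_iter)) {
    mu <- as.vector(plogis(X %*% beta))
    v  <- mu * (1 - mu)
    # Step 2: moment estimate of alpha from the standardized residuals
    e <- (y - mu) / sqrt(v)
    num <- 0; den <- 0
    for (r in rows) {
      num <- num + (sum(e[r])^2 - sum(e[r]^2)) / 2   # sum of e_j * e_k over pairs j < k
      den <- den + length(r) * (length(r) - 1) / 2   # number of pairs
    }
    alpha <- if (is.null(alpha_fixed)) num / (den - p) else alpha_fixed
    # Step 1: one scoring update of beta given the working covariance V_i
    B <- matrix(0, p, p); U <- numeric(p)
    for (r in rows) {
      R_i <- matrix(alpha, length(r), length(r)); diag(R_i) <- 1
      s   <- sqrt(v[r])
      V_i <- outer(s, s) * R_i                       # A^{1/2} R A^{1/2}
      D_i <- v[r] * X[r, , drop = FALSE]             # d mu / d beta for the logit link
      VinvD <- solve(V_i, D_i)
      B <- B + crossprod(D_i, VinvD)                 # D' V^{-1} D
      U <- U + crossprod(VinvD, y[r] - mu[r])        # D' V^{-1} (y - mu), V symmetric
    }
    step <- as.vector(solve(B, U))
    beta <- beta + step
    if (max(abs(step)) < tol) break
  }
  list(coefficients = beta, alpha = alpha, iterations = iter)
}

# Simulated clustered binary data (hypothetical settings)
set.seed(3)
N <- 200; n_i <- 4
id <- rep(seq_len(N), each = n_i)
x  <- rnorm(N * n_i)
b  <- rep(rnorm(N, sd = 1), each = n_i)   # subject effect induces within-subject correlation
y  <- rbinom(N * n_i, 1, plogis(-0.5 + 0.8 * x + b))
X  <- cbind("(Intercept)" = 1, x = x)

fit <- gee_logit_exch(y, X, id)
print(fit)
```

Setting `alpha_fixed = 0` makes the working covariance diagonal, in which case the estimating equations reduce to the ordinary logistic regression score equations.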
Properties:
1. The estimate $\hat\beta$ is a consistent estimate of $\beta$ (large sample property). This is true irrespective of the choice of $V_i$. Hence, all we need is that the model for the mean is correctly specified.
2. In large samples, $\hat\beta$ is approximately multivariate normal with mean $\beta$ and
$$\mathrm{Cov}(\hat\beta) = B^{-1} M B^{-1},$$
where
$$B = \sum_{i=1}^{N} D_i' V_i^{-1} D_i, \qquad M = \sum_{i=1}^{N} D_i' V_i^{-1}\,\mathrm{Cov}(Y_i)\,V_i^{-1} D_i.$$
These matrices can be estimated by substituting $\alpha$, $\beta$ and $\phi$ by their estimates and replacing $\mathrm{Cov}(Y_i) = \Sigma_i$ by
$$(Y_i - \hat\mu_i)(Y_i - \hat\mu_i)'.$$
3. Hence
$$\widehat{\mathrm{Cov}}(\hat\beta) = \left(\sum_{i=1}^{N} D_i' V_i^{-1} D_i\right)^{-1} \left\{\sum_{i=1}^{N} D_i' V_i^{-1} (Y_i - \hat\mu_i)(Y_i - \hat\mu_i)' V_i^{-1} D_i\right\} \left(\sum_{i=1}^{N} D_i' V_i^{-1} D_i\right)^{-1}.$$
This is the so-called sandwich estimator.
4. Finally, if the covariance is modeled correctly, $V_i = \Sigma_i$ and
$$\mathrm{Cov}(\hat\beta) = B^{-1}.$$
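For the independence working correlation with a logit link, both covariance estimates can be computed from an ordinary `glm` fit. The sketch below assumes that simplified case only, with simulated data and hypothetical settings; with a different working correlation, $D_i$ and $V_i$ would have to be formed per subject.

```r
set.seed(4)
N <- 150; n_i <- 4
id <- rep(seq_len(N), each = n_i)
x  <- rnorm(N * n_i)
b  <- rep(rnorm(N), each = n_i)           # subject effect induces correlation
y  <- rbinom(N * n_i, 1, plogis(-0.3 + 0.7 * x + b))

fit <- glm(y ~ x, family = binomial, control = glm.control(epsilon = 1e-12))
X   <- model.matrix(fit)
mu  <- as.vector(fitted(fit))
v   <- mu * (1 - mu)

# Independence working covariance: V_i = diag(v), D_i = diag(v) X_i, so D_i' V_i^{-1} = X_i'
B <- crossprod(X, v * X)                      # sum over subjects of D_i' V_i^{-1} D_i
score_i <- rowsum(X * (y - mu), group = id)   # row i holds D_i' V_i^{-1} (y_i - mu_i)
M <- crossprod(score_i)                       # sum over subjects of the outer products

cov_model  <- solve(B)                        # model-based covariance, B^{-1}
cov_robust <- cov_model %*% M %*% cov_model   # sandwich, B^{-1} M B^{-1}

print(rbind(model_se = sqrt(diag(cov_model)), robust_se = sqrt(diag(cov_robust))))
```

The model-based matrix coincides with `vcov(fit)`, while the sandwich version accounts for the within-subject correlation that the independence fit ignores.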
Pros:
1. The GEE estimator $\hat\beta$ is, in many cases, almost as precise as the MLE.
2. The GEE estimator is a consistent estimate of $\beta$ even when the within-subject associations are misspecified.
3. In this case, valid* estimates of the standard errors can be obtained from the sandwich estimator.
* Reliance on the sandwich estimator is not appealing when the number of subjects is not very big compared to the number of repeated observations, or when the design is unbalanced. In these cases, it is preferable to obtain the model-based covariance
$$\mathrm{Cov}(\hat\beta) = B^{-1},$$
which provides valid estimates when the working covariance matrix is a good approximation of the true covariance $\Sigma_i$.
Example: Respiratory Data (Binary)
• In each of two centers patients were randomized to active treatment or placebo
• During treatment, the respiratory status (poor or good) was determined at each of four monthly visits
• There were 111 patients (54 vs 57)
• The question of interest is to assess whether the treatment is effective and to estimate its effect
resp_glm = glm(status ~ centre + treatment + sex + baseline + age, data = resp, family = "binomial")
> summary(resp_glm)

Call:
glm(formula = status ~ centre + treatment + sex + baseline + age,
    family = "binomial", data = resp)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.3146  -0.8551   0.4336   0.8953   1.9246

Coefficients:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)        -0.900171   0.337653  -2.666  0.00768 **
centre2             0.671601   0.239567   2.803  0.00506 **
treatmenttreatment  1.299216   0.236841   5.486 4.12e-08 ***
sexmale             0.119244   0.294671   0.405  0.68572
baselinegood        1.882029   0.241290   7.800 6.20e-15 ***
age                -0.018166   0.008864  -2.049  0.04043 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 608.93  on 443  degrees of freedom
Residual deviance: 483.22  on 438  degrees of freedom
AIC: 495.22

Number of Fisher Scoring iterations: 4
resp_gee1 = gee(nstat ~ centre + treatment + sex + baseline + age,
                data = resp, family = "binomial", id = subject,
                corstr = "independence", scale.fix = TRUE, scale.value = 1)
> summary(resp_gee1)

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logit
 Variance to Mean Relation: Binomial
 Correlation Structure:     Independent

Call:
gee(formula = nstat ~ centre + treatment + sex + baseline + age,
    id = subject, data = resp, family = "binomial", corstr = "independence",
    scale.fix = TRUE, scale.value = 1)

Summary of Residuals:
        Min          1Q      Median          3Q         Max
-0.93134415 -0.30623174  0.08973552  0.33018952  0.84307712

Coefficients:
                      Estimate  Naive S.E.   Naive z Robust S.E.   Robust z
(Intercept)        -0.90017133 0.337653052 -2.665965  0.46032700 -1.9555041
centre2             0.67160098 0.239566599  2.803400  0.35681913  1.8821889
treatmenttreatment  1.29921589 0.236841017  5.485603  0.35077797  3.7038127
sexmale             0.11924365 0.294671045  0.404667  0.44320235  0.2690501
baselinegood        1.88202860 0.241290221  7.799854  0.35005152  5.3764332
age                -0.01816588 0.008864403 -2.049306  0.01300426 -1.3969169

Estimated Scale Parameter:  1
Number of Iterations:  1

Working Correlation
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1
resp_gee2 = gee(nstat ~ centre + treatment + sex + baseline + age,
                data = resp, family = "binomial", id = subject,
                corstr = "exchangeable", scale.fix = TRUE, scale.value = 1)
> summary(resp_gee2)

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logit
 Variance to Mean Relation: Binomial
 Correlation Structure:     Exchangeable

Call:
gee(formula = nstat ~ centre + treatment + sex + baseline + age,
    id = subject, data = resp, family = "binomial", corstr = "exchangeable",
    scale.fix = TRUE, scale.value = 1)

Summary of Residuals:
        Min          1Q      Median          3Q         Max
-0.93134415 -0.30623174  0.08973552  0.33018952  0.84307712

Coefficients:
                      Estimate Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept)        -0.90017133 0.47846344 -1.8813796  0.46032700 -1.9555041
centre2             0.67160098 0.33947230  1.9783676  0.35681913  1.8821889
treatmenttreatment  1.29921589 0.33561008  3.8712064  0.35077797  3.7038127
sexmale             0.11924365 0.41755678  0.2855747  0.44320235  0.2690501
baselinegood        1.88202860 0.34191472  5.5043802  0.35005152  5.3764332
age                -0.01816588 0.01256110 -1.4462014  0.01300426 -1.3969169

Estimated Scale Parameter:  1
Number of Iterations:  1

Working Correlation
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.3359883 0.3359883 0.3359883
[2,] 0.3359883 1.0000000 0.3359883 0.3359883
[3,] 0.3359883 0.3359883 1.0000000 0.3359883
[4,] 0.3359883 0.3359883 0.3359883 1.0000000
> # Confidence interval for the estimated treatment effect [log-OR scale]
> se <- summary(resp_gee2)$coefficients["treatmenttreatment", "Robust S.E."]
> coef(resp_gee2)["treatmenttreatment"] + c(-1, 1) * se * qnorm(0.975)
[1] 0.6117037 1.9867281
>
> # Confidence interval for the estimated treatment effect [OR scale]
> exp(coef(resp_gee2)["treatmenttreatment"] + c(-1, 1) * se * qnorm(0.975))
[1] 1.843570 7.291637
Example: Epilepsy Data (Counts)
• 59 patients with epilepsy were randomized to receive either "Progabide" or "Placebo".
• The number of seizures observed in each of four 2-week periods was recorded, along with the baseline seizure count for the 8 weeks prior to randomization.
• The question of interest is whether taking the anti-epileptic drug reduces the number of seizures compared to placebo.
> data("epilepsy", package = "HSAUR")
> itp <- interaction(epilepsy$treatment, epilepsy$period)
> tapply(epilepsy$seizure.rate, itp, mean)
  placebo.1 Progabide.1   placebo.2 Progabide.2   placebo.3 Progabide.3   placebo.4 Progabide.4
   9.357143    8.580645    8.285714    8.419355    8.785714    8.129032    7.964286    6.709677
> tapply(epilepsy$seizure.rate, itp, var)
  placebo.1 Progabide.1   placebo.2 Progabide.2   placebo.3 Progabide.3   placebo.4 Progabide.4
  102.75661   332.71828    66.65608   140.65161   215.28571   193.04946    58.18386   126.87957
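The group means and variances of the seizure counts can be compared directly. The sketch below copies the printed values into vectors (the names `grp`, `m`, `v` and `ratio` are introduced here for illustration) and computes the variance-to-mean ratio for each treatment-by-period group. Under a Poisson model this ratio would be close to 1.

```r
# Group means and variances copied from the tapply() output for the epilepsy data
grp <- c("placebo.1", "Progabide.1", "placebo.2", "Progabide.2",
         "placebo.3", "Progabide.3", "placebo.4", "Progabide.4")
m <- c(9.357143, 8.580645, 8.285714, 8.419355,
       8.785714, 8.129032, 7.964286, 6.709677)
v <- c(102.75661, 332.71828, 66.65608, 140.65161,
       215.28571, 193.04946, 58.18386, 126.87957)

# Poisson assumes Var(Y) = E(Y), i.e. a ratio of about 1 in every group
ratio <- setNames(v / m, grp)
print(round(ratio, 1))
```

Every ratio is well above 1, which points to overdispersion relative to the Poisson assumption.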
[Figure: number of seizures (axis 0 to 100) by period (1 to 4), in separate panels for Placebo and Progabide.]
[Figure: log number of seizures (axis 0 to 4) by period (1 to 4), in separate panels for Placebo and Progabide.]
fm <- seizure.rate ~ base + age + treatment + offset(per)
epilepsy_glm <- glm(fm, data = epilepsy, family = "poisson")
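The `offset(per)` term deserves a remark. The definition of `per` is not shown on this slide; the sketch below assumes it holds the log length of each observation period, $\log t_{ij}$ (which would be $\log 2$ for two-week periods). Under that assumption the Poisson model can be written in two equivalent ways:

```latex
\log E(Y_{ij}) = \log t_{ij} + \beta_0 + \beta_1\,\mathrm{base}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{treatment}_i
\quad\Longleftrightarrow\quad
\log\frac{E(Y_{ij})}{t_{ij}} = \beta_0 + \beta_1\,\mathrm{base}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{treatment}_i .
```

The coefficients then describe the seizure rate per unit of time, and $\exp(\beta_3)$ is the rate ratio for Progabide versus placebo.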
> summary(epilepsy_glm)

Call:
glm(formula = fm, family = "poisson", data = epilepsy)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-4.4360  -1.4034  -0.5029   0.4842  12.3223

Coefficients:
                     Estimate Std. Error z value Pr(>|z|)
(Intercept)        -0.1306156  0.1356191  -0.963  0.33549
base                0.0226517  0.0005093  44.476  < 2e-16 ***
age                 0.0227401  0.0040240   5.651 1.59e-08 ***
treatmentProgabide -0.1527009  0.0478051  -3.194  0.00140 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2521.75  on 235  degrees of freedom
Residual deviance:  958.46  on 232  degrees of freedom
AIC: 1732.5

Number of Fisher Scoring iterations: 5
epilepsy_gee1 <- gee(fm, data = epilepsy, family = "poisson", id = subject,
                     corstr = "independence", scale.fix = TRUE, scale.value = 1)
> summary(epilepsy_gee1)

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logarithm
 Variance to Mean Relation: Poisson
 Correlation Structure:     Independent

Call:
gee(formula = fm, id = subject, data = epilepsy, family = "poisson",
    corstr = "independence", scale.fix = TRUE, scale.value = 1)

Summary of Residuals:
       Min         1Q     Median         3Q        Max
-4.9195387  0.1808059  1.7073405  4.8850644 69.9658560

Coefficients:
                      Estimate   Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept)        -0.13061561 0.1356191185 -0.9631062 0.365148155 -0.3577058
base                0.02265174 0.0005093011 44.4761250 0.001235664 18.3316325
age                 0.02274013 0.0040239970  5.6511312 0.011580405  1.9636736
treatmentProgabide -0.15270095 0.0478051054 -3.1942393 0.171108915 -0.8924196

Estimated Scale Parameter:  1
Number of Iterations:  1

Working Correlation
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1
epilepsy_gee2 <- gee(fm, data = epilepsy, family = "poisson", id = subject,
                     corstr = "exchangeable", scale.fix = TRUE, scale.value = 1)
> summary(epilepsy_gee2)

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logarithm
 Variance to Mean Relation: Poisson
 Correlation Structure:     Exchangeable

Call:
gee(formula = fm, id = subject, data = epilepsy, family = "poisson",
    corstr = "exchangeable", scale.fix = TRUE, scale.value = 1)

Summary of Residuals:
       Min         1Q     Median         3Q        Max
-4.9195387  0.1808059  1.7073405  4.8850644 69.9658560

Coefficients:
                      Estimate   Naive S.E.   Naive z Robust S.E.   Robust z
(Intercept)        -0.13061561 0.2004416507 -0.651639 0.365148155 -0.3577058
base                0.02265174 0.0007527342 30.092612 0.001235664 18.3316325
age                 0.02274013 0.0059473665  3.823564 0.011580405  1.9636736
treatmentProgabide -0.15270095 0.0706547450 -2.161227 0.171108915 -0.8924196

Estimated Scale Parameter:  1
Number of Iterations:  1

Working Correlation
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.3948033 0.3948033 0.3948033
[2,] 0.3948033 1.0000000 0.3948033 0.3948033
[3,] 0.3948033 0.3948033 1.0000000 0.3948033
[4,] 0.3948033 0.3948033 0.3948033 1.0000000
epilepsy_gee3 <- gee(fm, data = epilepsy, family = "poisson", id = subject,
                     corstr = "exchangeable", scale.fix = FALSE, scale.value = 1)
> summary(epilepsy_gee3)

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logarithm
 Variance to Mean Relation: Poisson
 Correlation Structure:     Exchangeable

Call:
gee(formula = fm, id = subject, data = epilepsy, family = "poisson",
    corstr = "exchangeable", scale.fix = FALSE, scale.value = 1)

Summary of Residuals:
       Min         1Q     Median         3Q        Max
-4.9195387  0.1808059  1.7073405  4.8850644 69.9658560

Coefficients:
                      Estimate  Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept)        -0.13061561 0.452199543 -0.2888451 0.365148155 -0.3577058
base                0.02265174 0.001698180 13.3388301 0.001235664 18.3316325
age                 0.02274013 0.013417353  1.6948302 0.011580405  1.9636736
treatmentProgabide -0.15270095 0.159398225 -0.9579840 0.171108915 -0.8924196

Estimated Scale Parameter:  5.089608
Number of Iterations:  1

Working Correlation
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.3948033 0.3948033 0.3948033
[2,] 0.3948033 1.0000000 0.3948033 0.3948033
[3,] 0.3948033 0.3948033 1.0000000 0.3948033
[4,] 0.3948033 0.3948033 0.3948033 1.0000000
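The effect of estimating the scale parameter can be checked from the printed summaries alone. The sketch below copies the naive standard errors of `epilepsy_gee2` (scale fixed at 1) and `epilepsy_gee3` (scale estimated) into vectors; the variable names are introduced here for illustration. If the only difference between the two fits is the scale, the second set should equal the first multiplied by the square root of the estimated scale parameter.

```r
# Naive standard errors copied from summary(epilepsy_gee2), scale fixed at 1
naive_se_fixed <- c("(Intercept)" = 0.2004416507, base = 0.0007527342,
                    age = 0.0059473665, treatmentProgabide = 0.0706547450)
# Naive standard errors copied from summary(epilepsy_gee3), scale estimated
naive_se_free <- c("(Intercept)" = 0.452199543, base = 0.001698180,
                   age = 0.013417353, treatmentProgabide = 0.159398225)
# Estimated scale parameter reported for epilepsy_gee3
phi_hat <- 5.089608

# Rescale the fixed-scale standard errors by sqrt(phi_hat) and compare
rescaled <- naive_se_fixed * sqrt(phi_hat)
print(cbind(rescaled, reported = naive_se_free))
```

The two columns agree up to rounding, while the robust standard errors are identical in both fits. Allowing for overdispersion brings the naive standard errors much closer to the robust ones.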
Generalized Linear Mixed Effects Models
• GLMs can be extended, with the inclusion of random parameters, to allow variation between subjects.
• Random effects follow a multivariate normal distribution.
• Conditional on the random effects, responses are independent and follow a distribution that belongs to the exponential family.
Model Specification:
• The distribution of $Y_{ij}$, conditional on the random effects $b_i$, belongs to the exponential family of distributions.
• Its variance is
$$\mathrm{Var}(Y_{ij} \mid b_i) = \phi\, v(E[Y_{ij} \mid b_i]).$$
• Given $b_i$, the $Y_{ij}$ are independent of one another.
• In matrix notation, the linear predictor can be written
$$\eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i,$$
and for some known link function $g(\cdot)$
$$g(E[Y_{ij} \mid b_i]) = \eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i.$$
• Random effects, in theory, can follow any multivariate distribution. In practice, they are assumed to follow a multivariate normal distribution with mean zero and covariance matrix $G$.
GLMM for Continuous Response:
• Responses $Y_{ij}$ are independent, conditional on $b_i$, and normally distributed.
• Variance has the form
$$\mathrm{Var}(Y_{ij} \mid b_i) = \sigma^2,$$
where $\phi = \sigma^2$ and $v(\mu) = 1$.
• The linear predictor is
$$\eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i,$$
where $X_{ij}' = Z_{ij}' = (1, t_{ij})$ (illustration). Then
$$E(Y_{ij} \mid b_i) = \eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i = (\beta_1 + b_{1i}) + (\beta_2 + b_{2i})\,t_{ij}.$$
• Here the link is the identity function, although other link functions are available.
• The random effects follow a bivariate normal distribution with $2 \times 2$ covariance matrix $G$.
GLMM for Binary Response:
• Responses $Y_{ij}$ are independent Bernoulli variables, conditional on $b_i$.
• Variance has the form
$$\mathrm{Var}(Y_{ij} \mid b_i) = E(Y_{ij} \mid b_i)\,\bigl(1 - E(Y_{ij} \mid b_i)\bigr).$$
This means that $\phi = 1$.
• The linear predictor is given by
$$\eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i = X_{ij}'\beta + b_i,$$
where $Z_{ij}' = 1$ for all $i, j$ (illustration). Then
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i)}{P(Y_{ij} = 0 \mid b_i)}\right] = \eta_{ij} = X_{ij}'\beta + b_i.$$
• $b_i \sim N(0, \sigma^2)$.
• This is a random intercept model, equivalent to the compound symmetry model.
GLMM for Counts:
• Responses $Y_{ij}$ are independent, conditional on $b_i$, and follow a Poisson distribution.
• Variance has the form
$$\mathrm{Var}(Y_{ij} \mid b_i) = E(Y_{ij} \mid b_i).$$
This means that $\phi = 1$.
• The linear predictor is given by
$$\eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i,$$
where $Z_{ij}' = (1, t_{ij})$ for all $i, j$ (illustration). Then
$$\log E(Y_{ij} \mid b_i) = \eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i.$$
• The random effects follow a bivariate normal distribution with zero mean and a $2 \times 2$ covariance matrix.
Parameter Interpretation
• Parameters in the linear predictor are now interpreted in terms of conditional probabilities, given the subject (random) effects.
• The regression parameters $\beta$ in a GLMM have a different interpretation than in marginal models.
• In a GLMM, $\beta$ has a subject-specific interpretation.
• Specifically, $\beta$ represents the impact of the covariates on changes in an individual's transformed mean response.
• Consider the example with the logistic regression model
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i)}{P(Y_{ij} = 0 \mid b_i)}\right] = X_{ij}'\beta + b_i,$$
where $b_i \sim N(0, g_{11})$. Furthermore, suppose covariate $X_{ijk}$ takes some value $x$, leading to the log-odds
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x, \ldots, X_{ijp})}{P(Y_{ij} = 0 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x, \ldots, X_{ijp})}\right] = \beta_1 X_{ij1} + \beta_2 X_{ij2} + \cdots + \beta_k x + \cdots + \beta_p X_{ijp} + b_i.$$
Additionally, if $X_{ijk} = x + 1$, then the log-odds takes the form
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x + 1, \ldots, X_{ijp})}{P(Y_{ij} = 0 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x + 1, \ldots, X_{ijp})}\right] = \beta_1 X_{ij1} + \beta_2 X_{ij2} + \cdots + \beta_k (x + 1) + \cdots + \beta_p X_{ijp} + b_i,$$
and hence $\beta_k$ measures the change in the log-odds resulting from a unit change in covariate $X_{ijk}$ while the remaining covariates are held fixed. In terms of interpretation:
- If the covariate $X_{ijk}$ varies within an individual (subject-specific, time-varying), then
$$\log\left[\frac{P(Y_{ij'} = 1 \mid b_i, X_{ij'1}, X_{ij'2}, \ldots, X_{ij'k} = x + 1, \ldots, X_{ij'p})}{P(Y_{ij'} = 0 \mid b_i, X_{ij'1}, X_{ij'2}, \ldots, X_{ij'k} = x + 1, \ldots, X_{ij'p})}\right] - \log\left[\frac{P(Y_{ij} = 1 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x, \ldots, X_{ijp})}{P(Y_{ij} = 0 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x, \ldots, X_{ijp})}\right] = \beta_k,$$
where the interpretation is quite straightforward, since all other covariates as well as the random effects are the same and hence cancel. Hence,
$$\log\left[\frac{P(Y_{ij'} = 1 \mid b_i, \ldots)\,/\,P(Y_{ij'} = 0 \mid b_i, \ldots)}{P(Y_{ij} = 1 \mid b_i, \ldots)\,/\,P(Y_{ij} = 0 \mid b_i, \ldots)}\right] = \log \mathrm{OR} = \beta_k \;\Rightarrow\; \mathrm{OR} = \exp(\beta_k)$$
is the within-subject OR.
- If the covariate $X_{ijk}$ is time invariant (between-subject), like treatment group, interpretation becomes complicated. Comparing subject $i$ with $X_{ijk} = 1$ to a different subject $i'$ with $X_{i'jk} = 0$ gives
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = 1, \ldots, X_{ijp})}{P(Y_{ij} = 0 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = 1, \ldots, X_{ijp})}\right] - \log\left[\frac{P(Y_{i'j} = 1 \mid b_{i'}, X_{i'j1}, X_{i'j2}, \ldots, X_{i'jk} = 0, \ldots, X_{i'jp})}{P(Y_{i'j} = 0 \mid b_{i'}, X_{i'j1}, X_{i'j2}, \ldots, X_{i'jk} = 0, \ldots, X_{i'jp})}\right] = \beta_k + (b_i - b_{i'}),$$
and as a result the change in log-odds is confounded by $b_i - b_{i'}$. It is misleading to give this change a subject-specific interpretation. It is seen as a model-based extrapolation (no data available) and could be sensitive to various assumptions concerning the random effects.
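One consequence of the subject-specific interpretation is that $\exp(\beta_k)$ for a between-subject covariate is not the population-averaged odds ratio. The sketch below illustrates this numerically for a random-intercept logistic model. It uses the treatment coefficient and random-intercept standard deviation reported for `resp_lmer1` later in these slides; the baseline linear predictor `eta0` and the helper `marginal_prob` are hypothetical and chosen only for illustration.

```r
# Values taken from the resp_lmer1 fit reported in these slides
beta_trt <- 2.16087   # conditional (subject-specific) log odds ratio for treatment
sigma_b  <- 1.9596    # standard deviation of the random intercept
eta0     <- 0         # hypothetical linear predictor for an untreated subject

# Population-averaged probability: average plogis(eta + b) over b ~ N(0, sigma^2)
marginal_prob <- function(eta, sigma) {
  integrate(function(b) plogis(eta + b) * dnorm(b, sd = sigma),
            lower = -Inf, upper = Inf)$value
}

p0 <- marginal_prob(eta0, sigma_b)             # placebo, averaged over subjects
p1 <- marginal_prob(eta0 + beta_trt, sigma_b)  # treated, averaged over subjects

# Log odds ratio on the population-averaged scale
marginal_log_or <- qlogis(p1) - qlogis(p0)
print(c(conditional = beta_trt, marginal = marginal_log_or))
```

The marginal log odds ratio is smaller in magnitude than the conditional one, which is why GLMM coefficients and GEE coefficients for the same covariate are not directly comparable.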
Estimation and Inference
• The distribution of the random effects as well as the distribution of the responses are known.
• As a result, the joint distribution of random effects and responses is fully specified:
$$f(Y_i, b_i) = f(Y_i \mid b_i)\, f(b_i),$$
where
$$f(Y_i \mid b_i) = f(Y_{i1} \mid b_i)\, f(Y_{i2} \mid b_i) \cdots f(Y_{in_i} \mid b_i)$$
under the conditional independence assumption.
• Then, the likelihood function takes the form
$$L(\beta, \phi, G) = \prod_{i=1}^{N} \int f(Y_i \mid b_i)\, f(b_i)\, db_i,$$
where the random effects are integrated out of the likelihood, giving a marginal likelihood averaged over the $b_i$.
• In general, the likelihood cannot be written in closed form.
• As a result, numerical integration techniques are required.
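To make the integral concrete, the sketch below computes one subject's contribution to the marginal likelihood of a random-intercept logistic model using base R's `integrate()`. The function name `subject_lik` and the values of `eta` and `sigma` are hypothetical, chosen for illustration; fitting software uses more specialised approximations such as Laplace or adaptive Gaussian quadrature.

```r
# One subject's marginal likelihood contribution:
# integral over b of prod_j f(y_ij | b) * f(b), with b ~ N(0, sigma^2)
subject_lik <- function(y, eta, sigma) {
  integrand <- function(b) {
    sapply(b, function(bi) {
      p <- plogis(eta + bi)                              # P(Y_ij = 1 | b_i)
      prod(p^y * (1 - p)^(1 - y)) * dnorm(bi, sd = sigma)
    })
  }
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

eta   <- c(-0.5, 0, 0.5, 1)  # hypothetical fixed-effects part X'beta at four visits
sigma <- 2                   # hypothetical random-intercept standard deviation

print(subject_lik(c(1, 1, 0, 1), eta, sigma))

# Sanity check: the 2^4 possible response patterns should have total probability 1
patterns <- as.matrix(expand.grid(rep(list(0:1), 4)))
total <- sum(apply(patterns, 1, subject_lik, eta = eta, sigma = sigma))
print(total)
```

The full likelihood is the product of such terms over subjects. With more than one random effect per subject the integral becomes multidimensional, which is where the cost of numerical integration grows.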
Prediction of $b_i$
• Given the MLEs of $\beta$, $\phi$ and $G$, $b_i$ can be predicted as
$$\hat b_i = E(b_i \mid Y_i; \hat\beta, \hat\phi, \hat G).$$
• This is the empirical Bayes predictor, or BLUP, used before.
• Numerical integration techniques are also required.
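The posterior mean can be computed as a ratio of two integrals. The sketch below does this for a random-intercept logistic model with base R's `integrate()`. The function `posterior_mean_b` and the values of `eta` and `sigma` are hypothetical stand-ins for the fitted $X_{ij}'\hat\beta$ and the estimated random-intercept standard deviation.

```r
# Empirical Bayes prediction E(b_i | Y_i) for a random-intercept logistic model:
# ratio of integral of b * f(y | b) f(b) to integral of f(y | b) f(b)
posterior_mean_b <- function(y, eta, sigma) {
  joint <- function(b) {
    sapply(b, function(bi) {
      p <- plogis(eta + bi)                              # P(Y_ij = 1 | b_i)
      prod(p^y * (1 - p)^(1 - y)) * dnorm(bi, sd = sigma)
    })
  }
  num <- integrate(function(b) b * joint(b), lower = -Inf, upper = Inf)$value
  den <- integrate(joint, lower = -Inf, upper = Inf)$value
  num / den
}

eta   <- rep(0, 4)  # hypothetical fixed-effects part at four visits
sigma <- 2          # hypothetical random-intercept standard deviation

b_good <- posterior_mean_b(c(1, 1, 1, 1), eta, sigma)  # response 1 at every visit
b_poor <- posterior_mean_b(c(0, 0, 0, 0), eta, sigma)  # response 0 at every visit
print(c(b_good = b_good, b_poor = b_poor))
```

A subject who always responds with 1 receives a positive predicted intercept and one who always responds with 0 a negative one, equal in size here because `eta` is zero at every visit.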
The lmer function (R: lme4 package)
lmer(lme4) R Documentation
Fit (Generalized) Linear Mixed-Effects Models
Description
Fit a linear or generalized linear mixed-effects model with nested or crossed grouping factors for the random effects.
Usage
lmer(formula, data, family, method, control, start, subset, weights, na.action, offset, contrasts, model, ...)
lmer2(formula, data, family, method, control, start, subset, weights, na.action, offset, contrasts, model, ...)
Arguments
formula: a two-sided linear formula object describing the fixed-effects part of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. The vertical bar character "|" separates an expression for a model matrix and a grouping factor.
data: an optional data frame containing the variables named in formula. By default the variables are taken from the environment from which lmer is called.
family: a GLM family, see glm. If family is missing then a linear mixed model is fit; otherwise a generalized linear mixed model is fit.
method: a character string. For a linear mixed model the default is "REML" indicating that the model should be fit by maximizing the restricted log-likelihood. The alternative is "ML" indicating that the log-likelihood should be maximized. (This method is sometimes called "full" maximum likelihood.) For a generalized linear mixed model the criterion is always the log-likelihood but this criterion does not have a closed form expression and must be approximated. The default approximation is "PQL" or penalized quasi-likelihood. Alternatives are "Laplace" or "AGQ" indicating the Laplacian and adaptive Gaussian quadrature approximations respectively. The "PQL" method is fastest but least accurate. The "Laplace" method is intermediate in speed and accuracy. The "AGQ" method is the most accurate but can be considerably slower than the others.
control: a list of control parameters. See below for details.
start: a list of relative precision matrices for the random effects. This has the same form as the slot "Omega" in a fitted model. Only the upper triangle of these symmetric matrices should be stored.
subset, weights, na.action, offset, contrasts: further model specification arguments as in lm; see there for details.
model: logical indicating if the model component should be returned (in slot frame).
...: potentially further arguments for methods. Currently none are used.
Example: Respiratory Data
resp_lmer1 = lmer(status ~ centre + treatment + sex + baseline + age +
                  (1 | subject), data = resp, family = "binomial")
> summary(resp_lmer1)
Generalized linear mixed model fit using Laplace
Formula: status ~ centre + treatment + sex + baseline + age + (1 | subject)
   Data: resp
 Family: binomial(logit link)
 AIC   BIC logLik deviance
 443 471.7 -214.5      429
Random effects:
 Groups  Name        Variance Std.Dev.
 subject (Intercept) 3.8402   1.9596
number of obs: 444, groups: subject, 111

Estimated scale (compare to  1 )  0.7770601

Fixed effects:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)        -1.64382    0.75668  -2.172   0.0298 *
centre2             1.04635    0.53075   1.971   0.0487 *
treatmenttreatment  2.16087    0.51652   4.183 2.87e-05 ***
sexmale             0.20740    0.65969   0.314   0.7532
baselinegood        3.07037    0.52499   5.848 4.96e-09 ***
age                -0.02549    0.01994  -1.278   0.2012
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) centr2 trtmnt sexmal bslngd
centre2     -0.054
trtmnttrtmn -0.407  0.018
sexmale     -0.008 -0.151  0.222
baselinegod -0.347 -0.236  0.206  0.101
age         -0.753 -0.226 -0.015 -0.255  0.069
resp_lmer2 = lmer(status ~ centre + treatment + sex + baseline + age +
                  (age | subject), data = resp, family = "binomial")
> summary(resp_lmer2)
Generalized linear mixed model fit using Laplace
Formula: status ~ centre + treatment + sex + baseline + age + (age | subject)
   Data: resp
 Family: binomial(logit link)
   AIC   BIC logLik deviance
 445.8 482.7 -213.9    427.8
Random effects:
 Groups  Name        Variance Std.Dev. Corr
 subject (Intercept) 1.964799 1.401713
         age         0.001584 0.039799 0.003
number of obs: 444, groups: subject, 111

Estimated scale (compare to  1 )  0.7859826

Fixed effects:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)        -1.29487    0.72534  -1.785   0.0742 .
centre2             0.99755    0.50953   1.958   0.0503 .
treatmenttreatment  2.01372    0.50179   4.013 5.99e-05 ***
sexmale             0.24017    0.68883   0.349   0.7273
baselinegood        2.97704    0.51023   5.835 5.39e-09 ***
age                -0.03354    0.02107  -1.592   0.1114
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) centr2 trtmnt sexmal bslngd
centre2     -0.084
trtmnttrtmn -0.396  0.013
sexmale      0.053 -0.130  0.215
baselinegod -0.337 -0.226  0.217  0.076
age         -0.753 -0.173 -0.038 -0.316  0.042
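The calls in these slides use the 2008 interface, in which `lmer()` accepts a `family` argument. In current lme4 releases generalized linear mixed models are fitted with `glmer()`. The following is a minimal sketch on simulated data: the data frame `d`, its columns and all parameter values are made up for illustration, and nothing here reproduces the respiratory results. It assumes the lme4 package is installed.

```r
library(lme4)

set.seed(1)
n_subj <- 100   # simulated number of subjects
n_vis  <- 4     # simulated number of visits per subject

id <- rep(seq_len(n_subj), each = n_vis)
d <- data.frame(
  subject   = factor(id),
  treatment = rep(rbinom(n_subj, 1, 0.5), each = n_vis)
)

# Simulated random intercepts and binary responses on the logit scale
b <- rnorm(n_subj, sd = 1.5)
d$status <- rbinom(nrow(d), 1, plogis(-0.5 + 1 * d$treatment + b[id]))

# Random-intercept logistic model; nAGQ = 1 selects the Laplace approximation
fit <- glmer(status ~ treatment + (1 | subject), data = d,
             family = binomial, nAGQ = 1)
print(fixef(fit))
print(VarCorr(fit))
```

For a model with a single scalar random effect, increasing `nAGQ` switches to adaptive Gauss-Hermite quadrature with that many quadrature points.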