advances in longitudinal methods in the social and ...€¦ · nonlinear mixed- effects mixture...
TRANSCRIPT
Advances in Longitudinal Methods Conference J. Harring 1
Advances in Longitudinal Methods in the Social and Behavioral Sciences
Finite Mixtures of Nonlinear Mixed-Effects Models
Jeff Harring Department of Measurement, Statistics and Evaluation The Center for Integrated Latent Variable Research University of Maryland, College Park [email protected]
Advances in Longitudinal Methods Conference J. Harring 2
Overview
� Mixture modeling
– univariate and multivariate applications
� Characteristics of repeated measures learning data
� Nonlinear mixed-effects models – model description and analysis
� Nonlinear mixed-effects mixture (NLMM) model – model description
Advances in Longitudinal Methods Conference J. Harring 3
Overview
� Learning data example revisited
– analytic decision points
� Issues, challenges & considerations
Advances in Longitudinal Methods Conference J. Harring 4
Finite Mixture Models
� Karl Pearson (1894)
� Primary purposes… – model the density of complex distributions – model population heterogeneity
� Modeling heterogeneity � mixture of distributions from the same parametric family
� Inferential goals
Advances in Longitudinal Methods Conference J. Harring 5
Applications of Finite Mixture Models
� Univariate mixtures
Advances in Longitudinal Methods Conference J. Harring 6
Applications of Finite Mixture Models
� Regression mixtures
� Regression mixture modeling involves estimating separate regression coefficients and error for each latent class
0 1i k k i iky x e� �� � �
Advances in Longitudinal Methods Conference J. Harring 7
Applications of Finite Mixture Models
� Multivariate normal mixtures
– MVN mixtures � cluster analysis
Advances in Longitudinal Methods Conference J. Harring 8
Applications of Finite Mixture Models
� Multivariate normal mixtures
1
( | , ) ( | )K
i k k i kk
f f��
� �y � � y �
1 22 11( | ) (2 ) exp ( ) ( )2
pk i k k i k k i kf � �� ��� � � �y � � y � � y �
Advances in Longitudinal Methods Conference J. Harring 9
Applications of Finite Mixture Models
� Multivariate normal mixtures
1
( | , ) ( | )K
i k k i kk
f f��
� �y � � y �
1 22 11( | ) (2 ) exp ( ) ( )2
pk i k k i k k i kf � �� ��� � � �y � � y � � y �
Advances in Longitudinal Methods Conference J. Harring 10
Applications of Finite Mixture Models
� Multivariate normal mixtures
1
( | , ) ( | )K
i k k i kk
f f��
� �y � � y �
1 22 11( | ) (2 ) exp ( ) ( )2
pk i k k i k k i kf � �� ��� � � �y � � y � � y �
Advances in Longitudinal Methods Conference J. Harring 11
Applications of Finite Mixture Models
� Multivariate normal mixtures
1
( | , ) ( | )K
i k k i kk
f f��
� �y � � y �
1 22 11( | ) (2 ) exp ( ) ( )2
pk i k k i k k i kf � �� ��� � � �y � � y � � y �
Advances in Longitudinal Methods Conference J. Harring 12
Applications of Finite Mixture Models
� Growth mixture modeling
Advances in Longitudinal Methods Conference J. Harring 13
Applications of Finite Mixture Models
� Growth mixture modeling
y �� e
�
i i i
ki iki
ki ik
��
� ��� �
� �
� � � �� �� � �� � � �� �� � � �� �
1 22 11( | ) (2 ) exp ( ) ( )2
pk i k k i k k i kf � �� ��� � � �y � � y � � y �
� �� � � � k k k k k
�� � �
Advances in Longitudinal Methods Conference J. Harring 14
Applications of Finite Mixture Models
Advances in Longitudinal Methods Conference J. Harring 15
Applications of Finite Mixture Models
� Handle nonlinear data with a nonlinear function
� Modeling potential population heterogeneity
Advances in Longitudinal Methods Conference J. Harring 16
Applications of Finite Mixture Models
Intrinsically Nonlinear Functions Nonlinear Mixed- Effects Model
Modeling Population Heterogeneity Finite Mixture Model
Advances in Longitudinal Methods Conference J. Harring 17
Applications of Finite Mixture Models
Intrinsically Nonlinear Functions Nonlinear Mixed- Effects Model
+
Modeling Population Heterogeneity Finite Mixture Model �
Nonlinear Mixed- Effects Mixture
Model
Advances in Longitudinal Methods Conference J. Harring 18
Nonlinear Learning Data – Speech Errors
� Burchinal & Appelbaum (1991)
– Repeated measures data are recorded speech errors on a standard passage of text from an instrument of language proficiency
– A sample of 43 young
children ranging in ages from 2 to 8 years
Advances in Longitudinal Methods Conference J. Harring 19
Nonlinear Learning Data – Speech Errors
� Burchinal & Appelbaum (1991)
– A maximum number of 6 repeated measures were taken
– Ages at time of testings
differed for each child – incomplete cases were
observed
– independent rating of overall speech intelligibility used as a covariate
Advances in Longitudinal Methods Conference J. Harring 20
Model for Nonlinear Learning Data
� The nonlinear mixed-effects (NLME) model is well- suited to handle both intrinsically nonlinear functions as well as key data and design characteristics – continuous response – compellingly strong individual difference in trajectories – substantial variability in time-response across subjects – distinct measurement occasions for each subject – interesting nonlinear change
1( , , , )ij i pi ij ijy f x e� �� ��
1, , ; 1, ,ij n i m� �� �
Advances in Longitudinal Methods Conference J. Harring 21
NLME Model for Burchinal & Appelbaum Data
� Following Davidian & Giltinan (2003) : two-stage hierarchy
� A subject-specific decelerating, decreasing exponential
function is proposed � Stage 1: Individual-Level Model
1 2exp ( 3)i iij ij ijy x e� �� � �
– 1i� : the average number of speech errors at age 3 – 2i� : rate parameter that governs functional decline
Advances in Longitudinal Methods Conference J. Harring 22
NLME Model for Burchinal & Appelbaum Data
� Stage 2: Population-Level Model
1 1
2
1
2 2
i
i
ii
i
bb
��
��
�� � � �� �� � � ��� � � �
� unconditional
1 1
2 2
1 1
22
i ii
i i
i
i
z bz b
� ����
�� �� � � �
� �� � � �� �� � � �� conditional
Advances in Longitudinal Methods Conference J. Harring 23
NLME Model for Burchinal & Appelbaum Data
� Distributional assumptions
( , )i Nb 0 � ( , ( ))e 0 � i iN� cov( , )i i
� �e b 0 cov( , )e e 0i i�
� � cov( , )b b 0i i�
� �
1
2 1 2
var( )cov( )
cov( , ) var( )i
ii i i
bb b b
� �� � � �
� �b
Advances in Longitudinal Methods Conference J. Harring 24
NLME Model for Burchinal & Appelbaum Data
� Distributional assumptions
( , )i Nb 0 � ( , ( ))e 0 � i iN� cov( , )i i
� �e b 0 cov( , )e e 0i i�
� � cov( , )b b 0i i�
� � 2
ii n��� I
Advances in Longitudinal Methods Conference J. Harring 25
Inference NLME Model : Maximum Likelihood
� The marginal distribution of y i
( ) ( , ( )) () |y y b y bb bbi i i i iii i ph p d p d� �� �
� Let � �, , ( )vech �� � ��� �
� Parameter estimation is carried out by maximizing the log-likelihood AGHQ (Pinheiro & Bates),GHQ (Davidian & Gallant), Linearization (Lindstrom & Bates), GTS (Davidian & Giltinan),Bayesian…
1
( ) ln ( | )
ln ( )
� � y
ym
ii
l L
h�
�
� �
Advances in Longitudinal Methods Conference J. Harring 26
Analysis
� The unconditional model was fitted using SAS PROC
NLMIXED (Gaussian Quadrature – 30 points)
1
2
ˆ .ˆˆ .��
�
�
� � � �� �� � � ��� �� �
18 48�
0 98
ˆ�� �
� � �� �0.1
77.025 0.05
2� �� 9.73
2ln 1227.0 1249.6 L BIC� � �
Advances in Longitudinal Methods Conference J. Harring 27
Analysis
� The conditional model was fitted with mean-centered
covariate – SAS PROC NLMIXED (GH - 30 points)
1
2
ˆˆ
ˆ��
�
�
� � � �� �� � � ��� �� �
18.22�
0.97 1
2
ˆˆ
ˆ��
��� �� �� � � �� �� � � � �
2.76�
0.06
ˆ�� �
� � �� ��0.2
76.849 0.04
2� �� 9.73
2ln 1217.2 1246.3 L BIC� � �
Advances in Longitudinal Methods Conference J. Harring 28
Potentially Clustered Data
� In subject-specific models, like the NLME model, regression
parameters are allowed to vary across individuals resulting in differing within-subject profiles
� ( )iE �b 0 & ( )e 0iE � ��assumption all subjects were
sampled from a single populations with common parameters
� Finite mixture models relax the single population distributional assumption of the random effects and the conditional distribution for the data to allow for parameter differences across unobserved populations
Verbeke & Lesaffre, 1996, Verbeke & Molenberghs, 2000 Muthén & Shedden, 1999; Muthén, 2001, 2003, 2004
Hall & Wang, 2005
Advances in Longitudinal Methods Conference J. Harring 29
NLMM Model
� The exponential NLME model can be extended to a NLMM
model for K latent classes where in class k (k = 1, 2,…, K )
1 2exp ( 3)y ei i i ij ix� �� � � ( , )e 0 �i kN�
� � � z bi k k i i� � � ( , )b 0 i kN�
1 2k k� �and � k� is the probability of belonging to class k 1k
k�� ��
� 0 1k�� �
Advances in Longitudinal Methods Conference J. Harring 30
NLMM Model – Getting Started
� Fit a series of models suppressing random effects : k � 0
& � �k � (LCGM – Nagin (1999))
� Method of deriving starting values for the mean structure of
the NLMM model
� Use NLME output to suggest starting values for and �
Advances in Longitudinal Methods Conference J. Harring 31
NLMM Model – How Many Latent Classes?
� Several statistics have been proposed and recommended in
practice: AIC, BIC, SBIC, CLC, LMR-LRT (Lo, Mendell & Rubin, 2001), BLRT, Multivariate skewness and kurtosis indices
� Compute and plot BIC or SBIC values against number of
classes
Advances in Longitudinal Methods Conference J. Harring 32
NLMM Model – How Many Latent Classes?
……………… No Within-Class Variation LCGM
____________ Within-Class
Variation NLMM
Advances in Longitudinal Methods Conference J. Harring 33
NLMM Model – How Many Latent Classes?
� Decision based on…
– theoretical defensibility – model fit – parsimony – class separation and class
incidence
� Expertise of the substantive researcher
� Do the classes represent substantively meaningful groups?
Advances in Longitudinal Methods Conference J. Harring 34
NLMM Model – Analysis of Learning Data
� Parameter estimates for the two-class model
NLME Estimates
�
Class Means
ˆk�
Class-Specific Covariate Estimates
ˆ k�
.
.18 22
0 97
�
�
� �� ��� �
. .ˆ
. 1
13 240 33
0 66�
��
�
� ��� ��� �
.
.ˆ. 2
19 450 67
1 07�
�
�
� ��� ��� �
11 12. .ˆ ˆ 00 04 10� � �� ��
1
2
ˆ .ˆ
ˆ��
�� �� �� � � �� �� � ���
*
0.082 76
�
21 22. .ˆ ˆ2 72 0 25� �� �� � � �
Advances in Longitudinal Methods Conference J. Harring 35
NLMM Model – Analysis of Learning Data
� Parameter estimates for the two-class model NLME Estimate
Covariance Matrix
Class-Specific Covariance
Matrix ˆ ˆk �
NLME Estimate Residual Variance
2�
Class-Specific Residual Variance
2 2ˆ ˆk �� �
�� �� �� �0.15 0.05
77.02
.. .
�
� �
� �� �� �
53 652 35 0 62
9.73*
2.73*
Advances in Longitudinal Methods Conference J. Harring 36
NLMM Model – Assessing Model Fit?
� The quality of the mixture based on the precision of the
classification – classification is based on estimated posterior probabilities
� For 2K � , average posterior probabilities can be computed.
A K K� matrix should have high diagonal and low off-diagonal values indicating good classification quality – Average Latent Class Probabilities for Most Likely Latent
Class Membership (Row) by Latent Class (Column) 1 2 1 0.854 0.146 2 0.204 0.796
Advances in Longitudinal Methods Conference J. Harring 37
NLMM Model – Assessing Model Fit?
� Entropy – a summary measure of the classification based
on individuals’ estimated posterior probabilities can be computed with values close to 1 indicating near perfect classification
1 1
( ln )ˆ ˆ1
ln
m K
ik iki k
KE m K
� �� �
�� ���
2 0.73KE � �
Advances in Longitudinal Methods Conference J. Harring 38
Graphical Summaries
� Plot predicted random coefficients on a contour plot or
surface plot
Advances in Longitudinal Methods Conference J. Harring 39
Graphical Summaries
� Plot predicted random coefficients on a contour plot or
surface plot
Good Separation Modest Separation
Advances in Longitudinal Methods Conference J. Harring 40
Graphical Summaries
Two Latent Classes: Mean Trajectories
Advances in Longitudinal Methods Conference J. Harring 41
Graphical Summaries
Fitted Curves for 4 Individuals Fitted Curves for 4 Individuals in Class 1 in Class 2
Advances in Longitudinal Methods Conference J. Harring 42
Practical Issues…
� Estimation
� Computing standard errors
Advances in Longitudinal Methods Conference J. Harring 43
Log-likelihood
� Let 1 1( , , )K� � �
� �� �
� Let ( , ( ), ( ))k k kvech vech� ��� � �
� Let ( , )� � ��� � � all model parameters, then the log-likelihood
� �� �
11
1 1
( ) ln ( | )
ln ( )
ln ( )
� � y
y
y
m K
k k iki
m K
k k ii k
l L
h
h
�
�
��
� �
�
�
�
�
� �
where ( ) ( | ) ( )y y b b bk i k i i k i ih p p d� �
Advances in Longitudinal Methods Conference J. Harring 44
Estimation
� If the nonlinear regression coefficients are fixed across
individuals : Mplus (however – does not handle unique measurement occasions)
21 exp ( 3)ij ij ijiy x e��� � �
� Directly maximize the log-likelihood using gradient methods
like N-R or Quasi-Newton
� Use EM (Expectation – Maximization) algorithm treating class membership as missing data
Advances in Longitudinal Methods Conference J. Harring 45
Standard Errors
� Direct maximization uses the diagonal elements of the
Hessian matrix at convergence for a model-based estimate of the standard errors.
� EM algorithm – no standard errors are computed as a by-product of the algorithm – At convergence, use a direct maximization step to
produce SE
Advances in Longitudinal Methods Conference J. Harring 46
More Practical Issues and Future Considerations…
� Evidence of local extrema
� Covariates predicting individual coefficients & class
membership
� Does the method of handling the intractable integration influence the number of latent classes?
� Normality of the random effects distribution: non-parametric alternatives?
Advances in Longitudinal Methods Conference J. Harring 47
Advances in Longitudinal Methods in the Social and Behavioral Sciences
Finite Mixtures of Nonlinear Mixed-Effects Models
Jeff Harring Department of Measurement, Statistics and Evaluation The Center for Integrated Latent Variable Research University of Maryland, College Park [email protected]