Causal effects in mediation analysis with limited-dependent variables
By: Marten Schultzberg
Department of Statistics
Uppsala University
Supervisor: Fan Yang-Wallentin
2016
Contents
1 Introduction
1.1 Mediation analysis in general
1.2 Direct and indirect effects
1.3 Counterfactual-based causal effects in mediation analysis
1.4 Limited-dependent variable
1.5 Research questions
2 Methodology
2.1 Mediation analysis
2.1.1 The simple mediation model and its motivation
2.1.2 Adding relations to the simple mediation model
2.1.3 Estimation
2.2 Two-group regression
2.3 Limited-dependent variable analysis
2.3.1 The Two-part model
2.4 Mediation analysis with limited-dependent variable(s)
2.5 The counterfactual framework
2.5.1 Effect notation and calculations for mediation
2.5.2 Assumptions for causal effect of mediation models
3 Mediation, two-part M
3.1 Model formulation
3.2 Estimation
3.3 Derivation of effects
3.3.1 Conditional expected value of Y
3.3.2 Causal effects
4 Mediation, two-part M and two-part Y
4.1 Model formulation
4.2 Estimation
4.3 Derivation of causal effects
4.3.1 Conditional expected values
4.3.2 Causal effects
5 Monte Carlo simulations
5.1 Synthetic models and true data generating processes
5.1.1 Weak and Moderately strong effects model
5.1.2 Censoring
5.1.3 Model 1 - Two-part M
5.1.4 Model 1 - Two-part M, Weak
5.1.5 Model 1 - Two-part M, Moderately strong
5.1.6 Model 2 - Two-part M, Two-part Y
5.1.7 Model 2 - Two-part M, Two-part Y, Weak
5.1.8 Model 2 - Two-part M, Two-part Y, Moderately strong
5.2 Estimation
5.3 Results
5.3.1 Outcome variables
5.3.2 Two-part M
5.3.3 Two-part M, Two-part Y
5.4 Sensitivity analysis
6 Discussion
7 Conclusion
A Appendix - Derivation - Two-part M
B Appendix - Derivation - Two-part M, two-part Y
C Appendix - Mplus syntax
Abstract
Mediation analysis is used to separate the direct and indirect effects of an exposure variable
on an outcome variable. In this thesis, a mediation model is extended to account for a censored
mediator and a censored outcome variable. The two-part framework is used to account for the
censoring. The counterfactual-based causal effects of this model are derived. A Monte
Carlo study is performed to evaluate the behaviour of the causal effects that account for
censoring, and to compare them with methods that estimate the causal effects
without accounting for censoring. The results of the Monte Carlo study show that the
effects accounting for censoring have substantially smaller bias when censoring is present.
The proposed effects also seem to come at a low cost, with unbiased estimates for sample
sizes as small as 100 for the two-part mediator model. In the case of a limited mediator
and outcome, sample sizes larger than 300 are required for reliable improvements. A small
sensitivity analysis stresses the need for further development of the two-part models.
Keywords: counterfactuals, two-part model, potential outcome
1 Introduction
This introduction gives a brief overview of, and motivation for, the study, followed by
the research questions.
1.1 Mediation analysis in general
Mediation analysis is used to quantify the effects that an exposure variable has on an
outcome variable, mediated by some intermediate variable. For example, a gene that
causes cancer may also cause increased cigarette usage, which in turn causes cancer. The
effect of the gene on cancer is then mediated by cigarette usage. The intermediate variable
(cigarette usage) is often called the mediator variable, or the mediator. The hypothesised
relationships of a simple mediation model are that an exposure variable (X) causes some
change in a mediator variable (M) that in turn causes a change in an outcome variable
(Y) (Hayes, 2013). Mediation analysis has become widely used in the social sciences and
biomedical studies, especially since the influential paper by Baron and Kenny (1986). The
claimed possibility of opening up the "black box", answering questions such as "Through
what mechanism does X affect Y?" or "How does a change in X affect Y?", is probably
one explanation for its widespread use. For a thorough overview of traditional mediation
analysis see Hayes (2013). In recent years the causal claims of these models and their
limitations have been investigated in detail. The potential outcome and counterfactual
frameworks have been developed and applied, contributing general definitions of causal
effects and inference in mediation analysis. The causal mediation literature has also
focused on acknowledging and assessing the strong assumptions on which the causal
interpretations of these effects rely (Imai et al., 2010; Pearl, 2001; Robins and Greenland,
1992; VanderWeele, 2015).
1.2 Direct and indirect effects
Separating direct and indirect effects in mediation is essentially a tool for
making complex relations comprehensible. If a variable X affects both M and Y, but
M also affects Y, then how should the effect of X on Y be separated from the effect of X on Y
through M? The corresponding question in the cancer example would be "How is
the direct effect of the gene on cancer separated from the effect of the gene through
cigarette usage on cancer?" As will be demonstrated, several research questions can be
answered once the set of causal effects is defined. The traditional way of calculating the
indirect effect is called the product method and is credited to Baron and Kenny (1986);
the method is often even referred to as the Baron and Kenny method. Baron and
Kenny (1986) has been one of the most influential papers in the mediation field, making
the product method commonly applied. The product method is adequate for linear
mediation models with continuous mediators and outcomes. In some research areas this is
the most commonly applied mediation model (Rucker et al., 2011). However, as Robins
and Greenland (1992) and Pearl (2001) pointed out, the product method is unable to
account for non-continuous mediators and outcomes, or for mediation models with
moderation and other non-linear functional forms. General effect definitions, building on
the potential outcome framework (Rubin, 1974), were suggested by Robins and Greenland
(1992) and Pearl (2001).
1.3 Counterfactual-based causal effects in mediation analysis
Causal effects based on counterfactuals offer a general definition of causal effects. The
definition does not assume any functional form or model and can be applied to a wide
range of mediation models of varying complexity (Pearl, 2001). More recently, causal
effects for many special cases of mediation models have been derived from these definitions
(Muthen et al., 2016; Vanderweele, 2012; VanderWeele and Vansteelandt, 2010; Wang
and Albert, 2012).
1.4 Limited-dependent variable
In many research situations limited-dependent variables are encountered. Figure 1 shows
an example of a sample from a limited variable, censored from below. This is a characteristic
histogram for a censored variable, with many observations at one point and no observations
below that point, as if the range of the observations were limited by something. The
importance of accounting for censoring has been pointed out by e.g. Tobin (1958), Cragg
(1971), Jones (1989) and Brown et al. (2005). If the censoring of a limited-dependent variable
is not handled, model estimates will be biased. Thus, to estimate effects without bias in a
mediation model where some dependent variable is limited, special methods are required.
Limited-dependent outcome variables in mediation were recently handled in Muthen et al.
(2016). However, the mediator in a mediation analysis is also a dependent variable, namely
in the regression on the exposure.

Figure 1: A sample of 1000 observations from a limited normal variable with mean and variance equal to 1, censored at the point 0.22.

If the bias found in regression analysis with limited-dependent variables transfers to
mediation analysis, this might have severe consequences for the effect estimates and the
conclusions drawn from mediation analysis. It is therefore of interest to investigate the
impact of accounting for limited-dependent variables as mediators and outcome variables.
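
The kind of sample shown in Figure 1, and the bias that unhandled censoring induces in an ordinary regression, can be illustrated with a small simulation. The Python sketch below is illustrative only: the regression coefficients (intercept 1, slope 0.5), the sample size and the seed are assumptions chosen for the example and are not taken from this study.

```python
import numpy as np

rng = np.random.default_rng(2016)   # arbitrary seed, for reproducibility only
n = 100_000
limit = 0.22                        # censoring point used in Figure 1

# A normal variable with mean 1 and variance 1, censored from below.
latent = rng.normal(loc=1.0, scale=1.0, size=n)
censored = np.maximum(latent, limit)
share_at_limit = np.mean(censored == limit)

# Hypothetical regression: the latent outcome depends linearly on an exposure x.
x = rng.normal(size=n)
y_latent = 1.0 + 0.5 * x + rng.normal(size=n)
y_censored = np.maximum(y_latent, limit)


def ols_slope(x, y):
    """Slope from a simple linear regression of y on x (with intercept)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


slope_latent = ols_slope(x, y_latent)
slope_censored = ols_slope(x, y_censored)

print(f"share of observations at the limit: {share_at_limit:.3f}")
print(f"OLS slope, uncensored outcome:      {slope_latent:.3f}")
print(f"OLS slope, censored outcome:        {slope_censored:.3f}")
```

In this sketch the slope estimated from the censored outcome is pulled towards zero relative to the slope estimated from the uncensored outcome, which is the type of bias the special methods discussed in this thesis are meant to avoid.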
1.5 Research questions
The aim of this study is to answer the following questions:
1. (a) How can the mediation model be formulated to account for a limited mediator
and/or outcome variable?
(b) What are the additional assumption(s) for the two-part mediation models
compared to the simple mediation model?
2. How are the counterfactual based causal effects for the two-part mediation models
derived?
3. Does acknowledging and accounting for limited mediators and/or outcome variable
improve the accuracy of the causal effect estimates?
4. What are the sampling behaviours of the causal effects for the two-part mediation
models?
The remaining parts of the thesis will have the following structure. Section 2 will give
a detailed overview of the methods and motivations for the formulation of the limited-
dependent variable models. Section 3 contains the model formulation and causal effects
derivations for the two-part M model. Section 4 contains the model formulation and
causal effects derivations for the two-part M, two-part Y model. In Section 5 Monte
Carlo simulations are performed to evaluate the small sample properties for these models.
Sections 6 and 7 contain the discussion and conclusions of the study.
2 Methodology
In this section all the parts necessary to construct the two-part mediation models are
introduced and motivated in detail.
2.1 Mediation analysis
In this section the development and properties of mediation analysis are presented. The
simple mediation model is presented and extended to become more suitable for this study.
2.1.1 The simple mediation model and its motivation
The simple mediation model is illustrated in Figure 2. The exposure X affects the outcome
Y both directly and indirectly, mediated by the mediator M. Rather than focusing on the
size of the total effect of an exposure on an outcome, mediation analysis directs special
attention to the "How" part. That is, how or by what means does the exposure affect
the outcome? Through what intermediate steps does the exposure affect the outcome?
An easy way to motivate the need for answers to this kind of question is through the
perspective of policy makers. In many situations an exposure cannot be regulated by
policies, whereas some mediators might be. Drawing on the example from VanderWeele
(2015), originally analysed in Vanderweele (2012), the risk of lung cancer is investigated.
A genetic variant of a chromosome (X) is believed to affect the risk of lung cancer (Y).
Moreover, evidence has shown that this genetic variant affects smoking behaviour, making
carriers of the genetic variant smoke more. It is known that smoking cigarettes increases
the risk of lung cancer. It is possible that the genetic variant of the chromosome
causes cancer only through its effect on cigarette usage. In that case, policy makers
can try to reduce cigarette usage through laws and taxes in order to decrease the number
of lung cancer patients. It is also possible that the indirect effect of the gene through
cigarette usage on the risk of lung cancer is small, and that the gene directly causes cancer. In
the latter scenario, it might be difficult for policy makers to take effective action to
decrease the number of patients diagnosed with lung cancer. This over-simplified example
illustrates the importance of understanding the role that the mechanisms themselves play
in effective policy making. If one can quantify and compare the importance of single
mediators, resources can be directed more effectively. This way of approaching questions
transfers to a wide range of situations in various research fields. Equation 1 displays
the model formulation of the simple mediation model.

Figure 2: The simple mediation model. X is the exposure variable, M the mediator and Y the outcome.

\[
\begin{aligned}
M_i &= \gamma_0 + \gamma_1 X_i + \varepsilon_{Mi} \\
Y_i &= \beta_0 + \beta_1 M_i + \beta_2 X_i + \varepsilon_{Yi}
\end{aligned}
\tag{1}
\]

where $X_i$ is the exposure variable for individual i, and $M_i$ and $Y_i$ are the mediator and
outcome of individual i, both assumed continuous. $\varepsilon_{Mi}$ and $\varepsilon_{Yi}$ are
the error terms, most commonly assumed to be iid normally distributed with mean zero and
uncorrelated with X and M. The model consists of two linear models: one in which the mediator
is modelled by the exposure, and one in which the outcome is modelled by the mediator
and the exposure.
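
To make the two-equation structure concrete, the following Python sketch simulates data from a model of the form in Equation 1 and estimates both regressions by least squares, together with the traditional product-method indirect effect (the product of the X-to-M slope and the M-to-Y slope). All parameter values, the sample size and the helper name `ols` are assumptions made for illustration; they are not values used in this study.

```python
import numpy as np

rng = np.random.default_rng(1)       # arbitrary seed
n = 5_000
gamma0, gamma1 = 0.5, 0.8            # hypothetical mediator equation
beta0, beta1, beta2 = 1.0, 0.6, 0.3  # hypothetical outcome equation

x = rng.normal(size=n)
m = gamma0 + gamma1 * x + rng.normal(size=n)
y = beta0 + beta1 * m + beta2 * x + rng.normal(size=n)


def ols(response, *regressors):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(response)), *regressors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


g_hat = ols(m, x)          # estimates of gamma0, gamma1
b_hat = ols(y, m, x)       # estimates of beta0, beta1, beta2
total_hat = ols(y, x)[1]   # slope of Y on X alone

indirect_hat = g_hat[1] * b_hat[1]   # product method: gamma1 * beta1
direct_hat = b_hat[2]                # beta2

print(f"direct: {direct_hat:.3f}, indirect: {indirect_hat:.3f}, "
      f"sum: {direct_hat + indirect_hat:.3f}, total: {total_hat:.3f}")
```

For this linear model without interaction, the least-squares total effect equals the sum of the direct effect and the product-method indirect effect; this is the special case in which the product method is adequate.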
2.1.2 Adding relations to the simple mediation model
In most situations the simple mediation model is too parsimonious to capture the relevant
mechanisms. For example, the assumption of no unmeasured confounders between M
and Y (see Section 2.5.2 for details) cannot be guaranteed to be fulfilled, but by adding
relevant covariates the violation might be substantially reduced. Moreover, interaction
between M and X is common; VanderWeele (2015) even suggests that it might generally
be better to keep interaction terms in the analysis even when non-significant interaction
estimates are found. In Figure 3, a path diagram for a mediation model with a covariate
affecting M and Y is displayed. Additionally, interaction between M and X is visualized
by a path from X to the path between M and Y. In the social sciences interaction is more
often referred to as moderation. The interaction term poses no problems in estimation;
however, the traditional product-method-based direct and indirect effects are no longer
applicable (Pearl, 2001). The model including the interaction and covariate can be written
as in Equation 2.

Figure 3: Mediation model with covariate and interaction between M and X. C is the covariate, X the exposure variable, M the mediator and Y the outcome.

\[
\begin{aligned}
M_i &= \gamma_0 + \gamma_1 X_i + \gamma_2 C_i + \varepsilon_{Mi} \\
Y_i &= \beta_0 + \beta_1 M_i + \beta_2 X_i + \beta_3 M_i X_i + \beta_4 C_i + \varepsilon_{Yi}
\end{aligned}
\tag{2}
\]

The model formulation in Equation 2 is similar to that of the simple mediation model in
Equation 1, with the covariate C and the interaction term MX added. $X_i$ and $C_i$ are the
exposure and covariate for individual i. $M_i$ and $Y_i$ are the mediator and outcome
for individual i. Again, the error terms $\varepsilon_{Mi}$ and $\varepsilon_{Yi}$ are usually
assumed to be iid normally distributed with mean zero, uncorrelated with X, C and M.
2.1.3 Estimation
The simple mediation model (Equation 1) and the extended mediation model (Equation
2) are estimated with maximum likelihood (ML) or ordinary least squares
(OLS). If the error terms are independent and normally distributed, OLS estimation
of the two regression models one by one gives the same result as ML estimation
of the whole system simultaneously. The likelihood function of these models is given in
Equation 3. The right-hand expression in Equation 3 implies that if the two regressions have
no parameters in common, as is typically the case, the terms can be maximized separately.
The log-likelihood can be expressed as
\[
\log L = \sum_{i=1}^{n} \log[y_i, m_i \mid x_i, c_i]
       = \sum_{i=1}^{n} \log[y_i \mid m_i, x_i, c_i] + \sum_{i=1}^{n} \log[m_i \mid x_i, c_i]
\tag{3}
\]
where log[...] is the log of the conditional density function.
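
The factorization in Equation 3 can be checked numerically. The sketch below, written in Python with NumPy, simulates data from a model of the form in Equation 2 under normal errors, fits the two regressions separately by OLS, and evaluates the joint log-likelihood as the sum of the two terms. The parameter values, the seed and the function names are assumptions for illustration only, and the error variances are held fixed at their estimates when the coefficients are perturbed.

```python
import numpy as np


def normal_loglik(residuals, sigma2):
    """Sum of normal log-densities for mean-zero residuals with variance sigma2."""
    n = residuals.size
    return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * np.sum(residuals**2) / sigma2


def fit_ols(design, response):
    """OLS coefficients and the ML variance estimate RSS / n."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid = response - design @ coef
    return coef, np.mean(resid**2)


rng = np.random.default_rng(3)   # arbitrary seed
n = 2_000
x = rng.normal(size=n)
c = rng.normal(size=n)
m = 0.5 + 0.8 * x + 0.4 * c + rng.normal(size=n)                           # hypothetical values
y = 1.0 + 0.6 * m + 0.3 * x + 0.2 * m * x + 0.5 * c + rng.normal(size=n)   # hypothetical values

ones = np.ones(n)
design_m = np.column_stack([ones, x, c])
design_y = np.column_stack([ones, m, x, m * x, c])

gamma_hat, s2_m = fit_ols(design_m, m)
beta_hat, s2_y = fit_ols(design_y, y)


def joint_loglik(gamma, beta):
    """Equation 3: log f(y | m, x, c) plus log f(m | x, c), variances held fixed."""
    return (normal_loglik(y - design_y @ beta, s2_y)
            + normal_loglik(m - design_m @ gamma, s2_m))


loglik_at_ols = joint_loglik(gamma_hat, beta_hat)
loglik_perturbed = joint_loglik(gamma_hat + 0.05, beta_hat - 0.05)
print(loglik_at_ols, loglik_perturbed)
```

Because the two terms share no parameters, moving either coefficient vector away from its separate OLS solution can only lower the joint log-likelihood, which is the sense in which equation-by-equation OLS and simultaneous ML coincide here.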
2.2 Two-group regression
Two-group regression is a special case of multi-group regression, fitting two different
regressions to two subgroups within a sample. This can be compared to a single model with
a dummy variable estimating the mean difference between two subgroups in a sample.
The main difference is that two-group regression allows for different covariates in the two
regressions. For the variables that are common to the two regressions, the coefficients
can be constrained to be the same, or allowed to differ, between the groups. The
mean difference between the subgroups that a dummy coefficient would estimate in a
single-regression model is, in the two-group setting, estimated by the intercept difference.
If the two models include the same covariates and all coefficients are constrained to be
equal between the models, the intercept difference will be exactly the same as the dummy
variable coefficient in a single model. If some different covariates are included and/or
common covariates are not constrained, the intercept difference will not be the same as
the dummy coefficient. Additionally, two-group regression makes it possible to use
different transformations of the same variable in the two groups.
The technical difference between two separate regressions and a two-group regression is
that common parameters constrained to be equal in the two regressions can be estimated
using all available information from both subsets of the sample. However, if no parameter
is set to be common, two separate regressions and the two-group regression will give
exactly the same estimates. Hence, two-group regression is motivated when two
subgroups of a sample are believed to have substantially different relationships between
the covariates and the outcome for some covariates but equal relationships for others. In the
case of the limited mediator M (see details in Section 3), it might be believed that the
relation between the exposure and the outcome is similar for the M=0 and the M>0 groups.
However, in the M>0 group M might be related to Y, which is not possible in the M=0
group where M is fixed. Two-group regression makes it possible to fit one linear regression
of Y on C and X for the M=0 part, and another linear regression for the M>0 part in which
the logarithm of M is added to the independent variables C and X. The
slope of Y on X can be constrained to be the same in both regressions, allowing
the estimate of this parameter to be based on the full data set.

The reasons mentioned above indicate that a two-group regression of the outcome in
the two-part mediator models gives a flexible model. The possibility to constrain slopes of
common variables is preserved, while different covariates are still allowed in the regressions. If
all the slopes are constrained to be equal, the regression collapses back into a single linear
regression. Similarly, if no parameter is constrained, it will simply be two
separate regressions. Two-group regression of Y makes general applications
of the derived causal effects possible.
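
The two properties described above can be sketched in Python. The example fits a two-group regression as one stacked least-squares problem with group-specific intercepts, shared slopes for X and C, and log(M) entering only in the M>0 group; it then checks that, with identical covariates and all slopes constrained equal, the intercept difference coincides with a dummy coefficient. The data-generating values are hypothetical, and stacking the groups in a single least-squares fit additionally assumes a common error variance in both groups, which is an assumption of this sketch rather than a statement about the estimation used in this study.

```python
import numpy as np

rng = np.random.default_rng(4)   # arbitrary seed
n = 4_000
x = rng.normal(size=n)
c = rng.normal(size=n)

# Hypothetical limited mediator: zero for about 30% of the sample, log-normal otherwise.
positive = rng.random(n) > 0.3
m = np.where(positive, np.exp(rng.normal(size=n)), 0.0)
log_m = np.zeros(n)
log_m[positive] = np.log(m[positive])

# Outcome: group-specific intercepts, log(M) effect only where M > 0, shared X and C slopes.
y = np.where(positive, 1.5 + 0.7 * log_m, 1.0) + 0.4 * x + 0.3 * c + rng.normal(size=n)


def lstsq_coef(design, response):
    coef, *_ = np.linalg.lstsq(design.astype(float), response, rcond=None)
    return coef


# Two-group regression: columns are [M=0 intercept, M>0 intercept, X, C, log(M) in M>0 group].
two_group = np.column_stack([~positive, positive, x, c, positive * log_m])
a0, a1, b_x, b_c, b_logm = lstsq_coef(two_group, y)

# Same covariates and all slopes constrained equal: compare with a dummy-variable regression.
same_covariates = np.column_stack([~positive, positive, x, c])
with_dummy = np.column_stack([np.ones(n), positive, x, c])
intercept_difference = np.diff(lstsq_coef(same_covariates, y)[:2])[0]
dummy_coefficient = lstsq_coef(with_dummy, y)[1]

print(f"shared X slope: {b_x:.3f}, log(M) slope in M>0 group: {b_logm:.3f}")
print(f"intercept difference: {intercept_difference:.6f}, dummy: {dummy_coefficient:.6f}")
```

The shared slope on X is estimated from all observations, while the log(M) slope uses only the M>0 group, which mirrors the motivation given above.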
2.3 Limited-dependent variable analysis
Limited-dependent variables go by many names depending on the context,
e.g. two-part, hurdle, corner solution outcome or censored variables. A limited variable
is a variable that for some reason is censored from above and/or below, having a point
mass at the limit (the case of truncated variables is beyond the scope of this study), see
Figure 1. Sometimes such variables are said to suffer from ceiling or
floor effects. This is probably because in histograms of such variables it
looks as if the observations "hit" the ceiling and/or floor, with many observations at one
value and no values above/below it. To describe the principle of how to handle
limited-dependent variables it is useful to consider only one type of censoring, even though all
results can be used for censoring both from above and from below. In the current study only
censoring from below at zero is considered, to simplify examples and derivations
without loss of generality. There are different methods of handling limited-dependent
variables. Most methods have in common that the variable is split into two parts: one
binary part handling the large number of zeros and one continuous part for the non-zero
part of the variable. Usually this is modelled by one binary regression and one standard
linear regression, using the same covariates in both regressions.
One of the first ways to handle limited-dependent variables was proposed by Tobin
(1958). His method, today known as the Tobit model, is widely applied. One limitation of
the Tobit model is that it only allows equal signs for the corresponding parameters in the
two regressions (Wooldridge, 2002). If the binary part has a substantially different data
generating process than the positive part, it is in some cases also reasonable that the effects
of certain independent variables have different signs on the two parts of the dependent
variable. Cragg (1971) suggested two extensions which address this limitation of the Tobit
model: the truncated normal hurdle and the log-normal hurdle. In these models the
regressions of the binary and the continuous part of the limited-dependent variable are
estimated independently. Thus, the coefficients of the independent variables are allowed
to have different signs and sizes on the two different parts of the dependent variable.
Throughout this study these models will be referred to as two-part models. For a
thorough overview and comparison between different ways of handling limited-dependent
variables and how they differ from sample selection problems, see Wooldridge (2002) and
Greene (2012).
2.3.1 The Two-part model
Two-part modelling splits the limited-dependent variable into two parts: one binary
zero/non-zero part and one positive part. The intuition is that a first mechanism decides
whether the variable will take a positive value or not, and, if it does, a second
mechanism decides which positive value it will take. For example, "Will an individual
smoke or not?" and, if yes, "How much will the individual smoke?". Zero in this setting is
viewed as a category and not as a continuous numeric value. That is, a person who smokes
zero cigarettes a day is simply a non-smoker. The zero indicates that the person belongs
to the group of non-smokers, rather than an amount. This might seem an unimportant
distinction, but understanding why it matters is crucial for the motivation of the
two-part model. The two-part model is based on the idea that there might be a more
substantial difference between a non-smoker and a smoker than between a "one cigarette
a day" smoker and a "two cigarettes a day" smoker. Even though the difference in the number
of smoked cigarettes between the "zero cigarettes a day" smoker and the "one cigarette a
day" smoker is the same as that between the "one cigarette a day" smoker and the "two
cigarettes a day" smoker, the two who actually smoke might have more characteristics in
common. It is likely that different mechanisms explain whether you choose to smoke or not, and
how much you choose to smoke. This reasoning implies that there are situations where
the dependent variable has a point mass at zero but two-part analysis is not suitable. If
the group in the point mass is not viewed as a group of observations with substantially
different characteristics from the other observations, then the variable is not suitable for
two-part analysis.
In practice the zero/non-zero part is estimated with a binary regression and
the continuous part with standard linear regression. The probit and logit models are
natural candidates for the binary part. Given the small difference in estimation results
between the two (Gill, 2000), probit is chosen to make the derivations in Appendix A
and B simpler. Even though, in theory, the two-part model collapses back to the standard
regression for small amounts of censoring, there has to be a certain amount of censoring
for the estimation procedure to work well. The binary regression will behave badly if too
small a share of the observations belongs to one group. Hence, the estimated coefficients,
and therefore the causal effects, from a two-part model will never coincide exactly with the
classical estimates. The probit estimation will break down more severely the closer the
share of censored observations gets to zero. This estimation limitation is discussed in detail
in Section 5.1.2. For the continuous part of the two-part variable a distributional assumption
has to be made. This is crucial for the derivations of the effects, in which the density function
of the continuous part of the two-part variable plays an important role. The
most common assumption is that the continuous part of the two-part dependent variable
is normally or log-normally distributed. The experience of the author is that this is often a
somewhat strong assumption that is not likely to be fulfilled. The sensitivity to this assumption
has, to the best knowledge of the author, not been investigated in detail. Sensitivity
is discussed further in Section 5.4.
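
As an illustration of the estimation steps just described, the following Python sketch fits a two-part model to simulated data: a probit regression for the zero/non-zero part and a linear regression for the logarithm of the positive part. It is a minimal sketch under stated assumptions and not the estimation procedure used in this study: the probit is fitted with a hand-written Fisher scoring loop, the positive part is assumed log-normal, the conditional mean is combined as P(M>0 | x) times exp(linear predictor + sigma^2 / 2), and all parameter values and function names are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + _erf(np.asarray(z) / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return np.exp(-0.5 * np.asarray(z) ** 2) / math.sqrt(2.0 * math.pi)


def fit_probit(design, y, iterations=25):
    """Probit maximum likelihood by Fisher scoring (illustrative helper)."""
    coef = np.zeros(design.shape[1])
    for _ in range(iterations):
        eta = design @ coef
        p = np.clip(norm_cdf(eta), 1e-10, 1.0 - 1e-10)
        d = norm_pdf(eta)
        weight = d / (p * (1.0 - p))
        score = design.T @ ((y - p) * weight)
        information = (design * (d * weight)[:, None]).T @ design
        coef = coef + np.linalg.solve(information, score)
    return coef


rng = np.random.default_rng(5)   # arbitrary seed
n = 20_000
x = rng.normal(size=n)
design = np.column_stack([np.ones(n), x])

# Hypothetical data-generating process for a two-part variable M.
positive = rng.random(n) < norm_cdf(0.3 + 0.8 * x)       # binary part
log_m = 0.2 + 0.5 * x + rng.normal(scale=0.6, size=n)    # continuous part, log scale
m = np.where(positive, np.exp(log_m), 0.0)

# Part 1: probit regression of the zero/non-zero indicator on x.
probit_coef = fit_probit(design, positive.astype(float))

# Part 2: linear regression of log(M) on x, using only the observations with M > 0.
log_m_pos = np.log(m[positive])
linear_coef, *_ = np.linalg.lstsq(design[positive], log_m_pos, rcond=None)
resid = log_m_pos - design[positive] @ linear_coef
sigma2 = resid @ resid / (positive.sum() - design.shape[1])

# Conditional mean implied by the two parts under the log-normal assumption.
expected_m = norm_cdf(design @ probit_coef) * np.exp(design @ linear_coef + sigma2 / 2.0)

print("probit coefficients:", np.round(probit_coef, 3))
print("log-linear coefficients:", np.round(linear_coef, 3))
print(f"mean of fitted E[M|x]: {expected_m.mean():.3f}, sample mean of M: {m.mean():.3f}")
```

Because the two parts have separate coefficient vectors, the effect of x on the probability of a positive value and on the size of a positive value may differ in both sign and magnitude.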
2.4 Mediation analysis with limited-dependent variable(s)
Mediation analysis is special regarding independent/dependent relations. As can be
seen in Figure 2, even the simple mediation model implies two dependent variables: M
is dependent on X, and Y is dependent on both X and M. This means that important
considerations normally investigated for the dependent variable should be investigated
for (at least) two variables in a mediation setting. The focus of this study is to establish
the importance of accounting for limited-dependent variables in mediation analysis. In
order to cover all cases of limited-dependent variables in a simple mediation setting,
three cases need to be considered. In Case 1 the outcome Y is limited, in Case 2 the
mediator M is limited and in Case 3 both Y and M are limited. Case 1 is the most
obvious, since the outcome Y is what would be viewed as the (only) dependent variable
in most regression settings. Many ways of handling Case 1 in regression analysis have been
suggested in the literature (Cragg, 1971; Duan et al., 1983). Limited-dependent variables
in mediation analysis were recently handled in Muthen et al. (2016), where causal effects
for the two-part approach to mediation with a limited outcome are derived. A related
approach is given in Wang and Albert (2012), where causal effects in mediation with
limited counts are handled. The second and third cases have, to the best knowledge of the
author, not been investigated. They are covered in detail in Sections
3 and 4. First the counterfactual framework, used to define the causal effects of these
models, is presented.
2.5 The counterfactual framework
In order to understand the counterfactual framework it is helpful to use a simple example
in which the exposure variable is a dichotomous treatment variable. An individual can
be given the treatment, or not given the treatment, at an arbitrary time point t. The
outcome, say health on a continuous scale, is measured after the exposure at time point
t+1. The effect one wishes to measure is the difference between the individual's health at time
point t+1 if given the treatment and the individual's health at time point t+1 if not
given the treatment. This is of course impossible to retrieve, since one person can at
time t only receive one treatment, and can thus at time t+1 only have either received the
treatment or not. This is the effect of interest since it is the true treatment effect,
i.e. true in the sense that the healing effect of time would not distort the measure of the
treatment effect. Unfortunately, this cannot be resolved by giving both treatments one after
the other to one individual: there would be carry-over effects, and the time points would
no longer be the same. Rubin (1974) started with a similar setup and suggested that, since
only one outcome can be observed for each individual, the unobserved outcome could be
called the potential outcome. That is, the health that an individual would potentially
have had at time t+1 if given the other treatment. Rubin suggested that focus should be
shifted from the individual effect, which is logically impossible to retrieve, to the
effects at group level. The expected value for an individual conditioned on being given
the treatment or not can then be calculated. The difference between the expected value of
health given treatment and the expected value of health given no treatment can then be
used as an estimate of the desired treatment effect. This was soon adopted and refined
by a large number of researchers in different fields (Imbens and Angrist, 1994; Pearl,
1995, 2001; Robins and Greenland, 1992; Spirtes et al., 1993); see Wooldridge (2002) and
VanderWeele (2015) for recent overviews. Attempting to generalize the causal effect
definitions in the mediation field, Robins and Greenland (1992) and Pearl (2001) suggested
counterfactual-based effect definitions as a complement to the traditionally used product
method (Baron and Kenny, 1986).
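
The shift from individual effects to group-level effects can be illustrated with a simulation in which, unlike in real data, both potential outcomes are known for every individual. The Python sketch below is illustrative only: the health scale, the distribution of the individual effects and the randomised assignment with probability 0.5 are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(6)   # arbitrary seed
n = 50_000

# In the simulation both potential outcomes exist for every individual...
health_untreated = rng.normal(loc=50.0, scale=10.0, size=n)
individual_effect = rng.normal(loc=5.0, scale=2.0, size=n)   # hypothetical effects
health_treated = health_untreated + individual_effect

# ...but only one of them is observed, depending on the (randomised) treatment.
treated = rng.random(n) < 0.5
observed = np.where(treated, health_treated, health_untreated)

# The individual effects are not recoverable from `observed`; their average is
# estimated by the difference between the two group means.
true_average_effect = individual_effect.mean()
estimated_average_effect = observed[treated].mean() - observed[~treated].mean()

print(f"true average effect:      {true_average_effect:.2f}")
print(f"estimated average effect: {estimated_average_effect:.2f}")
```

The estimate relies on the treatment being assigned independently of the potential outcomes; without such an assumption the difference in group means would not in general recover the average effect.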
2.5.1 Effect notation and calculations for mediation
To present the effect definitions from Robins and Greenland (1992) and Pearl (2001), some
convenient notation is first defined. Consider the mediation model in Figure 2, and for
simplicity let X be dichotomous. Let $Y_0$ be the outcome of an individual who was exposed
to X=0 and $Y_1$ the outcome of an individual exposed to X=1. Additionally, let $Y_{1m}$ be
the outcome of an individual who was exposed to X=1 and for whom M was set to the value m.
Now let M(0) be M conditioned on X=0. This means that $Y_{1M(0)}$, or in short $Y_{1,0}$, is
the outcome of an individual with X=1, but with M set to whatever it would have been conditioned
on X=0. This can be generalized to non-dichotomous X, where the last expression would
be $Y_{x_1 M(x_0)}$ for arbitrarily chosen points $x_1$ and $x_0$. The Controlled Direct Effect
(CDE), for when X changes from $x_0$ to $x_1$, is defined for an individual as
$Y_{x_1 m} - Y_{x_0 m}$. However, since two different values can never be observed for one
individual, the average effect is considered for all effects presented below. The average
CDE is defined as
\[
CDE(m) = E[Y_{x_1 m} - Y_{x_0 m}] \tag{4}
\]
where $Y_{x_1 m}$ is Y conditioned on X=$x_1$ and M=m, and $Y_{x_0 m}$ is Y conditioned on
X=$x_0$ and M=m. The CDE can be interpreted as the effect X has on Y when it changes from
$x_0$ to $x_1$ while M is fixed at m. Returning to the lung cancer example in Section 2.1.1 from
VanderWeele (2015), CDE(10) = $E[Y_{1,10} - Y_{0,10}]$ corresponds to the effect of "moving"
from not having to having the genetic variant of the chromosome (gene), if smoking 10
cigarettes per day. Even though this kind of effect is often interesting, the fixed m=10
does not correspond to a natural situation. A more natural situation is obtained
if, instead of fixing M, it is allowed to take the value it would have taken conditioned on the
$x_0$ considered. The Pure Natural Direct Effect (PNDE) is defined as
\[
PNDE = E[Y_{x_1 M(x_0)} - Y_{x_0 M(x_0)}] \tag{5}
\]
and can be interpreted as the effect X has on Y when it changes from x0 to x1 while M
takes the value it would take on average for X = x0. Applying this to the lung cancer
example gives the effect of "moving" from not having to having the genetic variant,
given smoking the number of cigarettes the average individual does in the absence of the
gene. This effect is natural in the sense that M takes a value it would naturally take on
average for one of the values of X considered. A corresponding Total Natural Indirect
Effect (TNIE) can also be defined, with the average TNIE being
$$\mathrm{TNIE}(m) = E[Y_{x_1 M(x_1)} - Y_{x_1 M(x_0)}] \tag{6}$$
which can be interpreted as the effect X has through M on Y when X changes from x0 to
x1. In the lung cancer example this would correspond to the effect that "moving" from not
having to having the gene has on the risk of lung cancer only by affecting the number
of cigarettes smoked per day. In addition to the Pure Natural Direct Effect and the Total
Natural Indirect Effect there are also their counterparts: the Total Natural Direct Effect
(TNDE) and the Pure Natural Indirect Effect (PNIE). The
difference lies in which value of X that Y or M is conditioned on.
$$\mathrm{TNDE}(m) = E[Y_{x_1 M(x_1)} - Y_{x_0 M(x_1)}] \tag{7}$$
$$\mathrm{PNIE}(m) = E[Y_{x_0 M(x_1)} - Y_{x_0 M(x_0)}] \tag{8}$$
In linear mediation models there is no difference between the Pure Natural and the Total
Natural effects; however, for non-linear models the difference can be substantial. The
counterfactual-based causal effects of course also cover the usual treatment effect, that
is, the difference in the outcome if given the treatment or not, or in our continuous
exposure case, the difference in the outcome if exposed to x0 or x1. This is the total
effect of the exposure on the outcome, the sum of all indirect and direct effects of X on
Y. The Total Effect is defined as
$$\mathrm{TE}(m) = E[Y_{x_1 M(x_1)} - Y_{x_0 M(x_0)}] \tag{9}$$
One important aspect of these counterfactual-based effects is that they always
fulfil the relation TE = TNIE+PNDE. This property follows directly from the definitions and
is an important reason why these effects do not rely on any specific functional form.
The product method effects only fulfil this relation for linear models.
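The decomposition can be made explicit by adding and subtracting the cross-world term inside the expectation in Equation 9. The first identity below is the one stated in the text. The second, TE = TNDE + PNIE, is not stated above; it is added here as the companion identity obtained by the same step using Equations 7 and 8.

```latex
\begin{aligned}
\mathrm{TE} &= E[Y_{x_1 M(x_1)} - Y_{x_0 M(x_0)}] \\
            &= E[Y_{x_1 M(x_1)} - Y_{x_1 M(x_0)}] + E[Y_{x_1 M(x_0)} - Y_{x_0 M(x_0)}]
             = \mathrm{TNIE} + \mathrm{PNDE}, \\
\mathrm{TE} &= E[Y_{x_1 M(x_1)} - Y_{x_0 M(x_1)}] + E[Y_{x_0 M(x_1)} - Y_{x_0 M(x_0)}]
             = \mathrm{TNDE} + \mathrm{PNIE}.
\end{aligned}
```

No functional form for Y or M is used in either line, which is why the relation holds for non-linear models as well.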
2.5.2 Assumptions for causal effect of mediation models
The review of the assumptions is based on that of VanderWeele (2015), which offers a
thorough overview. There are four assumptions for establishing causal interpretations of
all the effects.
• Assumption 1 - No unmeasured confounding of the exposure-outcome relationship
• Assumption 2 - No unmeasured confounding of the mediator-outcome relationship
• Assumption 3 - No unmeasured confounding of the exposure-mediator relationship
• Assumption 4 - No mediator-outcome confounder that is dependent on the exposure
The first two assumptions imply that the covariates included in the model have to be
sufficient to control for the confounding relations between the exposure and the outcome,
and between the mediator and the outcome. Assumption 1 can be fulfilled by randomization
in the assignment of exposure; however, randomizing the exposure does not guarantee assumption 2.
Assumptions 1 and 2 are necessary and sufficient for controlled causal effects. Assumptions
1 and 2 are also necessary for the natural effects; however, two additional assumptions
are needed to ensure the causal interpretations of the natural effects. The third assumption
means that the variables influencing the level of both the exposure and the mediator
must be controlled. The fourth and final assumption is often viewed as a strong assumption
since it means that all confounders of the mediator and the outcome must be independent
of the exposure. It is important to recognize that randomization does not make
all assumptions in mediation fulfilled. This was emphasised by Judd and Kenny (1981)
and James and Brett (1984), but not in Baron and Kenny (1986), and is therefore a notion
less widely spread. This also implies that data collection and caution about controlling
for confounders are particularly important for causal mediation models to be reliable. If
all four assumptions are fulfilled, the effects defined above are said to have causal
interpretations. However, the causality relies on some additional implicit assumptions of
temporal ordering. The temporal ordering may be implied by the word "causal", but is
worth pointing out as mediation analysis is often performed on cross-sectional data. Even
though causal interpretations can in some cases be made with cross-sectional data, this
relies heavily on assumptions. With that said, Hayes (2013, p. 89) makes a statement
regarding assumptions, interpretations and analysis in general, and even though it does not
cover counterfactuals, it is still worth quoting: "Sometimes theory and solid arguments
is the only foundation upon which a causal claim can be built given limitations of our data.
But I see no problem conducting the kind of analysis I describe in the following chapters
even when causal claims rest on shaky grounds. It is our brains that interpret and place
meaning on the mathematical procedures used, not the procedures themselves." The
importance of assumptions is, according to Hayes, not necessarily to always fulfil them
but to understand and acknowledge the limitations they impose on the interpretations.
Assumptions 1-4 are not testable, and of course an analyst can never know if all
relevant confounding relations are captured. In order to make these effects useful without
too much doubt, sensitivity analysis is suggested by many (Imai et al., 2010; Pearl, 2001;
VanderWeele, 2015). The idea is to explicitly display "how much" the result of a mediation
analysis relies on the assumptions. This is assessed e.g. by showing how large the effect
of an unmeasured confounder on the exposure and the outcome must be to fully explain
the effect of the exposure on the outcome. This can be done by trying different effect
sizes of the unmeasured confounder on the exposure and the mediator. Although these
values are arbitrary in some sense, they give a good indication of how strong the estimated
effects are relative to the assumption. If it only takes a weak confounder to account for the
indirect effects, then the validity of the assumptions is crucial for reliable interpretations.
On the other hand, if it takes a huge, implausible effect of the confounder on the
exposure and the mediator to account for the indirect effect, then the interpretations
may be more reliable. Sensitivity analysis procedures of this kind are available for all
four assumptions (VanderWeele, 2015). In his book from 2015, VanderWeele strongly
argues that sensitivity analysis should always be presented together with mediation
analysis and causal effect interpretations. It would arguably create a good standard for
reporting mediation results as well as making mediation analysis less prone to accusations
of relying on unreasonable assumptions. For a recent intuitive, less technical review of
mediation with causal effects and assumptions see also Keele (2015).
3 Mediation, two-part M
3.1 Model formulation
If the mediator M is limited, the regression of M on the exposure X is affected. If this
had been a single regression outside the mediation model, all the limitations discussed
in Section 2.3 would apply. The question is whether the gains found from accounting for
censoring in regression transfer to the case of a mediation model with a limited mediator.
In Figure 4 a mediation model similar to that in Figure 3 is expanded to a two-part M
model, to account for a limited mediator. The mediator M is separated into a binary
zero/non-zero part measured with a dummy (M*), and one non-zero continuous part (M).
That is, if M* = 1 the observation is not censored and has a continuous value;
if M*=0 the observation is censored and has no continuous value. The two-part model
of M on X and C is constructed from one probit model of M*, modelling the probability
of being at the censoring point zero or not (recall that a floor effect at zero is assumed
without loss of generality). M* is assumed to be generated from a dichotomized normal
distribution. Additionally, one linear model accounts for the continuous part of the
variable. Both the binary and the continuous variable for M are then brought into the linear
regression model of Y. Moreover, a two-group regression model of Y is used, one for the
censored group and one for the uncensored group. In the group where M=0, M is not
a covariate since M is fixed. Since distributional assumptions have to be made (usually
normal) for the continuous part of M, the logarithm transformation is often suggested to
better meet this assumption. Many times the two-part variable has a long right tail and
resembles a normal curve better after taking logarithms, i.e. the continuous
part of M is assumed lognormally distributed. The lognormal case will be
the focus of this study. The main reason for this is to make comparisons with earlier
two-part models easier. However, the derivations of the causal effects also apply for normally
distributed M (see details in Appendix A). The implied model formulation is displayed
in Equation 10.
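$$
\begin{aligned}
Y_i \mid M_i > 0 &= \beta_0^{(1)} + \beta_1 \log(M_i) + \beta_2^{(1)} X_i + \beta_3 \log(M_i) X_i + \beta_4^{(1)} C_i + \varepsilon_{yi} && \text{(10a)} \\
Y_i \mid M_i = 0 &= \beta_0^{(2)} + \beta_2^{(2)} X_i + \beta_4^{(2)} C_i + \varepsilon_{yi} && \text{(10b)} \\
\log(M_i \mid M_i > 0) &= \gamma_0 + \gamma_1 X_i + \gamma_2 C_i + \varepsilon_{mi} && \text{(10c)} \\
\operatorname{probit}(\Pr(M_i > 0)) &= \kappa_0 + \kappa_1 X_i + \kappa_2 C_i && \text{(10d)}
\end{aligned}
$$
where, by assumption, $\varepsilon_{yi} \sim N(0, \sigma_y^2)$ and $\varepsilon_{mi} \sim N(0, \sigma_m^2)$. Equations 10a and 10b are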
the two-group regression of Y. The two-group regression is motivated in detail in Section
2.2. One benefit of using the two-group model is that it allows the M>0 and M=0 groups to have
different intercepts and different slopes for X and C. The interaction between X and the
positive part of M is quantified by β3. The interaction between X and the zero/non-zero part
of the mediator, M*, is somewhat less obvious. If $\beta_2^{(1)}$ and $\beta_2^{(2)}$ are allowed to differ
in the two-group regression, their difference is a measure of the interaction effect of the
binary M*. In the effect derivations $\beta_2^{(1)}$ and $\beta_2^{(2)}$ will be unconstrained until the final
simplifications, so that the full interaction model with interactions for M and M* can be
obtained. However, the focus of the final effect calculations is the case where X is only allowed to
interact with the effect of M on Y through β3, thus $\beta_2^{(1)}$ is set equal to $\beta_2^{(2)}$. The difference between
$\beta_0^{(1)}$ and $\beta_0^{(2)}$ will capture the mean difference in Y for the two parts of M. This is discussed in detail in
Section 2.2. Equations 10c and 10d correspond to the two-part regression of M on X and
C. Both the exposure and the covariate are allowed to have different effects on M and
M*. The probit model will create a non-linear mediation model. The functional form
is of importance since the main objective in mediation analysis is often to estimate the
direct and indirect effects of the model. As was shown by Robins and Greenland (1992)
and Pearl (2001), the classical product method (Baron and Kenny, 1986) for calculating
effects from the mediation analysis does not apply to non-linear models. Instead the
counterfactual framework will be used to correctly define these effects.
Figure 4 shows the path diagram implied by Equation 10. The exposure variable X
affects the mediator, where the mediator is two-part and therefore divided into the binary
zero/non-zero part (M*) and the non-zero continuous part (M). X is allowed to moderate
the effect of M on Y. The covariate C affects M, M* and Y.
[Figure 4: path diagram with nodes X, C, M*, M, Y|M>0 and Y|M=0, error terms εM and εY, and the block labels "Two-part" (mediator) and "Two-group" (outcome).]
Figure 4: Path diagram for a mediation model with a two-part mediator and two-group outcome, with interaction between the exposure and the mediator. M* is the observed binary variable coding for an observation being censored or not censored.
3.2 Estimation
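Maximum likelihood (ML) estimation will be used for all models. The two-part mediator
model above has a more complicated likelihood function than an ordinary regression, since
the mediator M is a combination of a binary variable and a continuous variable.
For a sample i = 1, ..., N, ordered so that the first n observations are those with M_i > 0,
$$
\begin{aligned}
L(M_i \mid x_i, c_i) &= \prod_{i=1}^{n} \Pr(M_i > 0 \mid x_i, c_i)\, f(M_i \mid M_i > 0, x_i, c_i) \times \prod_{i=n+1}^{N} \big(1 - \Pr(M_i > 0 \mid x_i, c_i)\big) && \text{(11a)} \\
L(Y_i, M_i \mid x_i, c_i) &= \prod_{i=1}^{n} \Pr(M_i > 0 \mid x_i, c_i)\, f(M_i \mid M_i > 0, x_i, c_i)\, f(Y_i \mid M_i > 0, x_i, c_i) \\
&\quad \times \prod_{i=n+1}^{N} \big(1 - \Pr(M_i > 0 \mid x_i, c_i)\big)\, f(Y_i \mid M_i = 0, x_i, c_i) && \text{(11b)}
\end{aligned}
$$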
In Duan et al. (1983) the likelihood of a two-part dependent variable is shown to be
Equation 11a, which implies the full likelihood of Equation 11b. The expressions f(Mi|...) and
f(Yi|...) are the conditional densities of M and Y. Note that in the two-group modelling
of Y the conditional density of Y is not restricted to be the same in the first and the
second product in Equation 11b.
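As an illustration of Equation 11a, the following is a minimal Python sketch of the log-likelihood of a two-part mediator with a probit zero/non-zero part and a lognormal continuous part, as in Equations 10c and 10d. It is not the Mplus implementation used in this study. The function name, the argument layout and the choice to evaluate the density on the log(M) scale are assumptions made for the example; the log scale differs from the density of M only by a term that does not involve the parameters.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the probit part


def two_part_loglik(m, x, c, kappa, gamma, sigma_m):
    """Log of Equation 11a, assuming a lognormal continuous part.

    m, x, c : equally long sequences (mediator, exposure, covariate)
    kappa   : (k0, k1, k2), probit coefficients for Pr(M > 0)
    gamma   : (g0, g1, g2), coefficients for the mean of log(M) given M > 0
    sigma_m : residual standard deviation of log(M) given M > 0
    """
    loglik = 0.0
    for mi, xi, ci in zip(m, x, c):
        p_pos = STD_NORMAL.cdf(kappa[0] + kappa[1] * xi + kappa[2] * ci)
        if mi > 0:
            mu = gamma[0] + gamma[1] * xi + gamma[2] * ci
            # density of log(M_i) given M_i > 0 (assumption: log scale)
            dens = NormalDist(mu, sigma_m).pdf(math.log(mi))
            loglik += math.log(p_pos) + math.log(dens)
        else:
            # censored observation: only the probability of a zero contributes
            loglik += math.log(1.0 - p_pos)
    return loglik
```

The order of the observations does not matter here, since each observation contributes either to the first or to the second product of Equation 11a.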
3.3 Derivation of effects
In this study the effects of X and C on Y are assumed equal for both groups of M. Thus
$\beta_2^{(1)} = \beta_2^{(2)}$ and $\beta_4^{(1)} = \beta_4^{(2)}$, indicated by a dropped superscript. Results to form the effects
without these restrictions can be found in Appendix A.
3.3.1 Conditional expected value of Y
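One of the conditional expectations used to define the causal effects is shown in Equation
12. For a detailed explanation see Section 2.5.
$$
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] &= \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)})\, \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad + (\beta_2 x_1 + \beta_4 c)\big(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) + \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)(\beta_1 + \beta_3 x_1)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c) \\
&= \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)})\, \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) + (\beta_2 x_1 + \beta_4 c) \\
&\quad + \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)(\beta_1 + \beta_3 x_1)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)
\end{aligned}
\tag{12}
$$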
3.3.2 Causal effects
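The complete derivation is displayed in Appendix A. The simplified effects are given in Equations 13-17.
The Total Natural Indirect Effect
$$
\begin{aligned}
\mathrm{TNIE} &= E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_1, \log(M(x_0))) \mid C = c] \\
&= (\beta_0^{(1)} - \beta_0^{(2)})\big(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + (\beta_1 + \beta_3 x_1)\big(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)\big)
\end{aligned}
\tag{13}
$$
The Pure Natural Direct Effect
$$
\begin{aligned}
\mathrm{PNDE} &= E[Y(x_1, \log(M(x_0))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
&= \beta_2 (x_1 - x_0) + \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)\, \beta_3 (x_1 - x_0)
\end{aligned}
\tag{14}
$$
The Pure Natural Indirect Effect
$$
\begin{aligned}
\mathrm{PNIE} &= E[Y(x_0, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
&= (\beta_0^{(1)} - \beta_0^{(2)})\big(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + (\beta_1 + \beta_3 x_0)\big(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)\big)
\end{aligned}
\tag{15}
$$
The Total Natural Direct Effect
$$
\begin{aligned}
\mathrm{TNDE} &= E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_1))) \mid C = c] \\
&= \beta_2 (x_1 - x_0) + \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)\, \beta_3 (x_1 - x_0)
\end{aligned}
\tag{16}
$$
The Total Effect
$$
\begin{aligned}
\mathrm{TE} &= E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
&= (\beta_0^{(1)} - \beta_0^{(2)})\big(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) + \beta_2 (x_1 - x_0) \\
&\quad + \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)(\beta_1 + \beta_3 x_1)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c) \\
&\quad - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)(\beta_1 + \beta_3 x_0)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)
\end{aligned}
\tag{17}
$$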
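The effects of the two-part M model can all be obtained from the conditional expectation in Equation 12 evaluated at the four combinations of x0 and x1. The sketch below does this in Python and prints the five effects. The function names, the dictionary keys and the parameter values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, the inverse of the probit link


def expected_y(x_direct, x_med, c, p):
    """E[Y(x_direct, log M(x_med)) | C = c] in the form of Equation 12."""
    prob_pos = PHI(p["k0"] + p["k1"] * x_med + p["k2"] * c)  # Pr(M > 0)
    mean_log_m = p["g0"] + p["g1"] * x_med + p["g2"] * c     # E[log M | M > 0]
    return (p["b0_2"]
            + (p["b0_1"] - p["b0_2"]) * prob_pos
            + p["b2"] * x_direct + p["b4"] * c
            + prob_pos * (p["b1"] + p["b3"] * x_direct) * mean_log_m)


def natural_effects(x0, x1, c, p):
    """Natural effects as differences of counterfactual expectations."""
    e11 = expected_y(x1, x1, c, p)
    e10 = expected_y(x1, x0, c, p)
    e01 = expected_y(x0, x1, c, p)
    e00 = expected_y(x0, x0, c, p)
    return {"TNIE": e11 - e10, "PNDE": e10 - e00,
            "PNIE": e01 - e00, "TNDE": e11 - e01, "TE": e11 - e00}


# Hypothetical parameter values, for illustration only
params = {"b0_1": 2.5, "b0_2": 2.0, "b1": 0.25, "b2": 0.25, "b3": 0.05, "b4": 0.0,
          "g0": 4.0, "g1": 0.5, "g2": 0.0, "k0": -1.0, "k1": 0.4, "k2": 0.0}
effects = natural_effects(x0=4.0, x1=5.0, c=0.0, p=params)
for name, value in effects.items():
    print(f"{name}: {value:.4f}")
```

With a non-zero interaction (b3), PNDE and TNDE differ, as do PNIE and TNIE, while TE = TNIE + PNDE holds by construction.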
4 Mediation, two-part M and two-part Y
In this section the mediation model where both the mediator and the outcome are limited
is considered.
4.1 Model formulation
If the mediator M and the outcome Y are limited, both the regression of M on X and the
regression of Y on M and X are affected. Again the two-part model will be used to account
for the censoring in both M and Y. The two-group regression setup for Y implies that two
more regressions will be added due to the combination with the two-part model. Both
the continuous part of M and the continuous part of Y rely on distributional assumptions.
In this study both dependent variables are assumed to follow the lognormal distribution.
The model formulation of a two-part M, two-part Y mediation model is displayed in
Equation 18.
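$$
\begin{aligned}
\log(Y_i \mid M_i > 0) &= \beta_0^{(1)} + \beta_1 \log(M_i) + \beta_2^{(1)} X_i + \beta_3 \log(M_i) X_i + \beta_4^{(1)} C_i + \varepsilon_{yi} && \text{(18a)} \\
\operatorname{probit}(\Pr(Y_i \mid M_i > 0 \; > 0)) &= \theta_0^{(1)} + \theta_1 \log(M_i) + \theta_2^{(1)} X_i + \theta_3 \log(M_i) X_i + \theta_4^{(1)} C_i && \text{(18b)} \\
\log(Y_i \mid M_i = 0) &= \beta_0^{(2)} + \beta_2^{(2)} X_i + \beta_4^{(2)} C_i + \varepsilon_{yi} && \text{(18c)} \\
\operatorname{probit}(\Pr(Y_i \mid M_i = 0 \; > 0)) &= \theta_0^{(2)} + \theta_2^{(2)} X_i + \theta_4^{(2)} C_i && \text{(18d)} \\
\log(M_i \mid M_i > 0) &= \gamma_0 + \gamma_1 X_i + \gamma_2 C_i + \varepsilon_{mi} && \text{(18e)} \\
\operatorname{probit}(\Pr(M_i > 0)) &= \kappa_0 + \kappa_1 X_i + \kappa_2 C_i && \text{(18f)}
\end{aligned}
$$
where, by assumption, $\varepsilon_{yi} \sim N(0, \sigma_y^2)$ and $\varepsilon_{mi} \sim N(0, \sigma_m^2)$. This is a non-linear model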
and the counterfactual framework will be used to define the effects. The path diagram of
Equation 18 is shown in Figure 5. The mediator M and the outcome Y are both separated
into one zero/non-zero part (M* and Y*), and one corresponding non-zero continuous
part (M and Y). If M*=1 the observation is not censored and has a
corresponding continuous value M. If M*=0 the observation is censored and has no
continuous value. The same goes for Y* and Y. M* and Y* are assumed to be generated
from dichotomized normal distributions. The exposure X affects M, M*, Y and Y*. M
affects only the two-part Y belonging to the group M>0. Moreover, X is allowed to
moderate the effect of M on Y. Additionally, a covariate measured at the same time point
as X is allowed to affect both parts of the mediator and the outcome. The full model
implied by Figure 5 is kept throughout the derivations; however, in the last step some
parameters will be restricted to limit the scope of the Monte Carlo simulation study in
Section 5. Expressions for calculating unrestricted effects are available in Appendix B.
This model allows for moderation between the zero/non-zero part of the mediator, M*,
and the effect of X on Y, since the slopes of the two-group analysis of Y are not restricted.
That is, the difference between the slopes of Y on X in the two groups is a measure
of the interaction effect. The detailed motivation of this model formulation is discussed
in Section 2. As in the two-part M model in Equation 10, the zero/non-zero parts are
modelled with probit regression, and all the linear parts are modelled with standard linear
regressions.
4.2 Estimation
The likelihood function in Equation 11 is extended in Equation 19 to account for two-part
modelling of both M and Y. The four combinations of zero/non-zero M and Y are
represented by one product each in Equation 19. The expressions f(Mi|...) and f(Yi|...)
are the conditional densities of M and Y.
[Figure 5: path diagram with nodes X, C, M*, M, Y*|M>0, Y|M>0, Y|M=0 and Y*|M=0, error terms εM and εY, and the block labels "Two-part" (mediator), "Two-part, M > 0", "Two-part, M = 0" and "Two-group" (outcome).]
Figure 5: Path diagram for a mediation model with a two-part mediator M and two-part outcome Y, combined with a two-group model of Y. M* and Y* are the binary observed variables coding for an observation being censored or not censored.
Let n be the size of a random sample such that n = ng1 + ng2 + ng3 + ng4 and i = 1, ..., n
$$
\begin{aligned}
L(Y_i, M_i \mid x_i, c_i) &= \prod_{i \in g_1} \Pr(M_i > 0 \mid x_i, c_i)\, \Pr(Y_i > 0 \mid M_i > 0, x_i, c_i)\, f(M_i \mid M_i > 0, x_i, c_i)\, f(Y_i \mid Y_i > 0, M_i > 0, x_i, c_i) \\
&\quad \times \prod_{i \in g_2} \big(1 - \Pr(M_i > 0 \mid x_i, c_i)\big)\, \Pr(Y_i > 0 \mid M_i = 0, x_i, c_i)\, f(Y_i \mid Y_i > 0, M_i = 0, x_i, c_i) \\
&\quad \times \prod_{i \in g_3} \Pr(M_i > 0 \mid x_i, c_i)\, \big(1 - \Pr(Y_i > 0 \mid M_i > 0, x_i, c_i)\big)\, f(M_i \mid M_i > 0, x_i, c_i) \\
&\quad \times \prod_{i \in g_4} \big(1 - \Pr(M_i > 0 \mid x_i, c_i)\big)\, \big(1 - \Pr(Y_i > 0 \mid M_i = 0, x_i, c_i)\big)
\end{aligned}
\tag{19}
$$
4.3 Derivation of causal effects
In this study the effects of X and C on Y are assumed equal for both groups of M. Thus
$\beta_2^{(1)} = \beta_2^{(2)}$, $\beta_4^{(1)} = \beta_4^{(2)}$, $\theta_2^{(1)} = \theta_2^{(2)}$ and $\theta_4^{(1)} = \theta_4^{(2)}$, indicated by a dropped superscript.
Results to form the unrestricted effects can be found in the detailed derivation in Appendix
B.
4.3.1 Conditional expected values
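One of the conditional expectations used to define the causal effects is shown in Equation
20. For a detailed explanation of these conditional expected values see Section 2.5.
$$
\begin{aligned}
E[Y(x_0, \log(M(x_0)))] &= \phi \exp(\beta_2 x_0 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + \exp\Big(\beta_0^{(1)} + \frac{\mu^2}{2\sigma_M^2}(b^2 - 1)\Big)\, \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\, b \mu}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}} \Bigg] \Bigg)
\end{aligned}
\tag{20}
$$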
4.3.2 Causal effects
The complete derivation is displayed in Appendix B. The simplified effects given the restrictions mentioned above are given in Equations 21-25. Note that b and µ in Equation 20 are substituted (see details in Appendix B). Some further simplifications are possible, however without gaining simplicity and with loss of intuition.
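The Total Natural Indirect Effect
$$
\begin{aligned}
\mathrm{TNIE} &= E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_1, \log(M(x_0))) \mid C = c] \\
&= \phi \exp(\beta_2 x_1 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\Big(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}} \Bigg] \Bigg) \\
&\quad - \phi \exp(\beta_2 x_1 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\Big(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}} \Bigg] \Bigg)
\end{aligned}
\tag{21}
$$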
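The Pure Natural Direct Effect
$$
\begin{aligned}
\mathrm{PNDE} &= E[Y(x_1, \log(M(x_0))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
&= \phi \exp(\beta_2 x_1 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\Big(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}} \Bigg] \Bigg) \\
&\quad - \phi \exp(\beta_2 x_0 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\Big(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}} \Bigg] \Bigg)
\end{aligned}
\tag{22}
$$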
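The Pure Natural Indirect Effect
$$
\begin{aligned}
\mathrm{PNIE} &= E[Y(x_0, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
&= \phi \exp(\beta_2 x_0 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\Big(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}} \Bigg] \Bigg) \\
&\quad - \phi \exp(\beta_2 x_0 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\Big(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}} \Bigg] \Bigg)
\end{aligned}
\tag{23}
$$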
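The Total Natural Direct Effect
$$
\begin{aligned}
\mathrm{TNDE} &= E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_1))) \mid C = c] \\
&= \phi \exp(\beta_2 x_1 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\Big(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}} \Bigg] \Bigg) \\
&\quad - \phi \exp(\beta_2 x_0 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\Big(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}} \Bigg] \Bigg)
\end{aligned}
\tag{24}
$$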
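The Total Effect
$$
\begin{aligned}
\mathrm{TE} &= E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
&= \phi \exp(\beta_2 x_1 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\Big(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}} \Bigg] \Bigg) \\
&\quad - \phi \exp(\beta_2 x_0 + \beta_4 c) \Bigg( \exp\big(\beta_0^{(2)}\big)\, \Phi\big(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\big) \big(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\big) \\
&\quad + \exp\Bigg[\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2} \Bigg( \bigg(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\bigg)^2 - 1 \Bigg)\Bigg] \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\Bigg[ \frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\Big(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\Big)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}} \Bigg] \Bigg)
\end{aligned}
\tag{25}
$$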
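Equations 21-25 are all differences of one building block: the conditional expectation of Equation 20 evaluated with one exposure value in the direct paths and another in the mediator paths. The Python sketch below writes that building block once and forms the effects from it. The function name, the dictionary keys and all parameter values are hypothetical. The constant φ is defined in Appendix B and is simply passed in as `phi` here, with a placeholder value.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def expected_y_two_part(a, b, c, p):
    """E[Y(a, log M(b)) | C = c]: a enters the direct paths, b the mediator paths."""
    mu = p["g0"] + p["g1"] * b + p["g2"] * c        # mean of log(M) given M > 0
    var_m = p["sigma_m"] ** 2
    slope = p["b1"] + p["b3"] * a                   # effect of log(M) on log(Y)
    probit_slope = p["t1"] + p["t3"] * a            # effect of log(M) on the probit of Y > 0
    shift = 1.0 + slope * var_m / mu                # written as b in Equation 20; mu must be non-zero
    prob_m_pos = PHI(p["k0"] + p["k1"] * b + p["k2"] * c)
    direct = p["t2"] * a + p["t4"] * c
    zero_part = math.exp(p["b0_2"]) * PHI(p["t0_2"] + direct) * (1.0 - prob_m_pos)
    pos_part = (math.exp(p["b0_1"] + mu ** 2 / (2.0 * var_m) * (shift ** 2 - 1.0))
                * prob_m_pos
                * PHI((p["t0_1"] + direct + probit_slope * shift * mu)
                      / math.sqrt(probit_slope ** 2 * var_m + 1.0)))
    return p["phi"] * math.exp(p["b2"] * a + p["b4"] * c) * (zero_part + pos_part)


# Hypothetical parameter values, for illustration only
params = {"phi": 1.0, "sigma_m": 1.0,
          "b0_1": 0.5, "b0_2": 0.0, "b1": 0.25, "b2": 0.25, "b3": 0.05, "b4": 0.0,
          "t0_1": 0.2, "t0_2": 0.1, "t1": 0.3, "t2": 0.3, "t3": 0.02, "t4": 0.0,
          "g0": 1.0, "g1": 0.5, "g2": 0.0, "k0": -0.5, "k1": 0.4, "k2": 0.0}
x0, x1, c = 1.0, 2.0, 0.0
e11 = expected_y_two_part(x1, x1, c, params)
e10 = expected_y_two_part(x1, x0, c, params)
e01 = expected_y_two_part(x0, x1, c, params)
e00 = expected_y_two_part(x0, x0, c, params)
effects = {"TNIE": e11 - e10, "PNDE": e10 - e00,
           "PNIE": e01 - e00, "TNDE": e11 - e01, "TE": e11 - e00}
```

Writing the expectation once makes it easy to see which exposure value enters which path in each of the five effects.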
5 Monte Carlo simulations
In this section a Monte Carlo study is performed to investigate the finite-sample performance
of the two-part causal effect estimates. The study covers different sample sizes and
percentages of limited observations. All simulation results are from simulations with 1000
iterations. The sample sizes covered are 100, 150, 200, 250, 300, 400, 500 and 1000. The
simulations are performed on two synthetic models created from criteria on the effect
size, R-square and the log-odds of the models.
5.1 Synthetic models and true data generating processes
The models are completely synthetic and are chosen arbitrarily, subject only to having certain
R-square, log-odds and standardized effect sizes. All models have some constrained
parameters, e.g. the interaction is set to zero. These constraints are imposed only to limit
the scope of the Monte Carlo simulation. The effects for the full models can be extracted
from the derivations in Appendix A and B.
5.1.1 Weak and Moderately strong effects model
To make the simulation study resemble empirical situations, the characteristics of the
models used as data generating processes are discussed and specified in detail below.
There are several important aspects to consider: the size of the (true) effects, the amount
of explained variance, the mean difference (in Y) between the zero/non-zero groups of
the two-part variables, and the amount of censoring. Censoring amounts are covered in
Section 5.1.2. The effect size is relevant to consider in order to establish for how strong effects
the difference between two-part effects and classical effects is relevant. To formalize the
effect sizes, the framework suggested by Cohen (1992) is used. A weak effect is around
0.3 standard deviations (sd) change in Y for a one sd increase in X. A moderately
strong effect is around 0.5 sd change in Y for a one sd increase in X. Note that the
effect definitions are used to create the 25% censoring case. When the censoring is changed
to 10% and 50% respectively, the effect sizes change since the effects are functions of the
censoring amount. To achieve the same effect size for different censoring amounts several
parameters would have to be changed. However, if several parameters are altered it
no longer gives an isolated comparison between the censoring amounts. To avoid confusion
only the censoring amount within each effect size model is varied. Table 1 shows the true
effect sizes within the weak and moderately strong effects models with the variations of
censoring amounts. As can be seen, the weak model's effect size is close to accurate for all
censoring amounts; the moderately strong effects model does however get a substantial
increase in true effect size (0.73 instead of 0.5) of TNIE for 50% censoring.

Table 1: True TNIE effect sizes for different censoring amounts.
Effect   Model         Censor amount   Size (sd of Y)
TNIE     Weak          10%             0.23
TNIE     Weak          25%             0.29
TNIE     Weak          50%             0.33
TNIE     Mod. Strong   10%             0.33
TNIE     Mod. Strong   25%             0.51
TNIE     Mod. Strong   50%             0.73

The amount of explained variance, formalized by R2, is not covered in Cohen (1992).
The weak models are set to have R2 between 30% and 50% and the moderately strong
models between 40% and 65%. The size of the difference between the zero/non-zero groups is
formalized by log-odds, i.e. the log-odds from the logit model of the binary variable(s)
regressed on the covariates. Note that the logit model is not used in estimation, but only
as a tool to evaluate and quantify the difference between the two groups. The weak
models have log-odds for all independent variables around 2. The independent variables
of the moderately strong models have log-odds around 4. The values of R2 and
log-odds are chosen to mimic empirical data sets as closely as possible. Since the effect
size, R2 and log-odds are related, there is some variation between the R2 of the models.
5.1.2 Censoring
Three levels of censoring are studied: 10%, 25% and 50%. Two examples of generated
random samples of 10 000 observations of log(M) are illustrated in Figure 6. As can be
seen, the positive part of log(M) is normally distributed in both cases without censoring
to any practical degree, which is desired to fulfil the distributional assumption. It is
worth pointing out that in most empirical examples the point mass and the continuous
part are not separated as clearly as in these histograms. However, to evaluate the behaviour
of the two-part effects under fulfilled distributional assumptions this is necessary. This
strong distributional assumption will probably make the difference between the classical
estimates and the two-part estimates larger than in most empirical situations. The
sensitivity to the crucial distributional assumptions is discussed further in Section 5.4. The
only difference between the two histograms in Figure 6 is the amount of censoring: 10%
and 25%. In the model where both M and Y are generated as two-part, samples from
log(Y) would look similar but with different mean and scale.

[Figure 6: two histogram panels, (a) 10% censoring and (b) 25% censoring.]
Figure 6: Histograms of 10 000 observations of log(M), generated with 10% and 25% of the observations censored at zero.

Note that theoretically the
two-part effects reduce to the classical effects if the amount of censoring gets close
enough to zero. However, in practice the probit estimation breaks down if there are few
observations in one group of the dependent variable. The number of observations in each
group depends on the sample size. Censoring of less than 10% would mean fewer than
ten observations on average in the censored group with n=100. Ten observations in each
category is suggested as a useful approximate "smallest number rule" for binary regression,
discussed in Agresti (2007). However, when n=100 is used in the weak model setting
of this study, the estimation of the slope of X in the probit regression of M* becomes
extremely biased. In Table 2 the slope estimates for the weak model's probit regression of
M* are displayed. As can be seen, the extreme bias for n=100 is reduced substantially when
the sample size is increased to n=150. Keeping the 10% censoring as one of the cases
will give a broad spectrum of situations, from the lower limit of what the ML estimation
procedure of the probit regression can handle, to large-sample situations with many
observations in each group.
Table 2: Slope estimates from probit regressions of M* on X for different sample sizes and amounts of censoring. The weak model is used in these Monte Carlo simulations.
M* ON   n     % of obs. in group Z=0   Population value   Average estimate   Empirical s.d.
X       100   10                       0.400              6.4461             190.0507
X       150   10                       0.400              0.4186             0.1753
5.1.3 Model 1 - Two-part M
The restrictions of parameters in our study (see Section 4.3) make it possible to use one
model of Y as opposed to the two-group model in the derivations. A dummy variable is
used to estimate the intercept shift. No interaction effects are included in the simulations
to limit the scope. To obtain a normally distributed Mi|M > 0, an auxiliary variable Z1*
is created as a function of X. The allocation to the M=0 group is based on a threshold τ
on Z1*. That is, for all observations where Z1* < τ, Mi|M > 0 is set to missing and
the dummy variable M* is set to zero. If Z1* ≥ τ, M* is set to one and Mi|M*i=1 is
set to M. In practice the dummy M* is created in the DEFINE command in Mplus, see
Appendix C for details. The probability of being in the M*=1 group given a value of X
can be obtained from the estimates of a probit regression. For simplicity, no additional
covariate is brought into the synthetic models. Note that the missing values on M and Y
are missing at random (MAR) by construction.
5.1.4 Model 1 - Two-part M, Weak
$$
\begin{aligned}
X &\sim N(5, 1) \\
Y_i &= 2 + 0.5 M^* + 0.25 \log(M) + 0.25 X + \eta_{yi} \\
M_i &= \exp(4 + 0.5 X + \eta_{mi}) \\
Z^*_{1i} &= 0.4 X_i + \eta_{z^* i}
\end{aligned}
\tag{26}
$$
where $\eta_{yi} \sim N(0, 1)$, $\eta_{mi} \sim N(0, 1)$ and $\eta_{z^* i} \sim N(0, 1)$. The coefficient of M* is the
difference between the intercepts in the M*=1 and M*=0 groups, thus $\beta_0^{(2)}$ is identified
since $\beta_{0\mathrm{diff}} = \beta_0^{(2)} - \beta_0^{(1)} \rightarrow \beta_0^{(2)} = \beta_{0\mathrm{diff}} + \beta_0^{(1)}$. $Z^*_{1i}$ is the continuous normal variable
that is dichotomized into M* according to the censoring amount.
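The data generating process in Equation 26 can be sketched outside Mplus as follows. This is an illustration in Python, not the setup used in this study (which is described in Appendix C). Two choices are assumptions of the sketch: the threshold τ is taken as the empirical quantile of Z1* that censors the requested share of the sample, and the log(M) term is taken to contribute to Y only for uncensored observations. The function name and the returned dictionary keys are hypothetical.

```python
import math
import random


def simulate_model1_weak(n, censoring, seed=0):
    """Generate one sample following Equation 26 (two-part M, weak effects)."""
    rng = random.Random(seed)
    x = [rng.gauss(5.0, 1.0) for _ in range(n)]
    z1 = [0.4 * xi + rng.gauss(0.0, 1.0) for xi in x]
    # Assumption: tau is the sample quantile giving the requested censoring share
    n_censored = int(round(censoring * n))
    tau = sorted(z1)[n_censored]  # observations with Z1* < tau are censored
    rows = []
    for xi, zi in zip(x, z1):
        m_star = 1 if zi >= tau else 0
        log_m = 4.0 + 0.5 * xi + rng.gauss(0.0, 1.0)
        m = math.exp(log_m) if m_star else 0.0  # censored at zero
        # Assumption: log(M) enters Y only when the observation is uncensored
        y = (2.0 + 0.5 * m_star + 0.25 * log_m * m_star
             + 0.25 * xi + rng.gauss(0.0, 1.0))
        rows.append({"x": xi, "m_star": m_star, "m": m, "y": y})
    return rows


sample = simulate_model1_weak(n=1000, censoring=0.25, seed=1)
share_censored = sum(1 for r in sample if r["m_star"] == 0) / len(sample)
```

Changing only the `censoring` argument mirrors the design choice described earlier, where the censoring amount is varied while the other parameters are held fixed.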
5.1.5 Model 1 - Two-part M, Moderately strong
$$
\begin{aligned}
X &\sim N(5, 1) \\
Y_i &= 2 + 0.5 M^* + 0.3 \log(M) + 0.3 X + \eta_{yi} \\
M_i &= \exp(4 + 0.7 X + \eta_{mi}) \\
Z^*_{1i} &= 0.8 X_i + \eta_{z^* i}
\end{aligned}
\tag{27}
$$
Again, $\eta_{yi} \sim N(0, 1)$, $\eta_{mi} \sim N(0, 1)$ and $\eta_{z^* i} \sim N(0, 1)$. The coefficient of M* is the
difference between the intercepts in the M*=1 and M*=0 groups, thus $\beta_0^{(2)}$ is identified,
just as for the weak effects two-part M model. $Z^*_{1i}$ is the continuous normal variable that
is dichotomized into M* according to the censoring amount.
5.1.6 Model 2 - Two-part M, Two-part Y
Based on the reasoning for the two-part M model (Section 5.1.3), the restrictions allow
us to generate data from a one-group model of Y. The intercept shift is captured with
the dummy variable M*. Since Y is also two-part, an additional normal variable Z2* is
generated to be dichotomized into Y* according to the censoring amount. More details
about how the variables are generated with the help of Mplus can be found in Appendix C.
5.1.7 Model 2 - Two-part M, Two-part Y, Weak
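\begin{equation}
\begin{aligned}
X_i &\sim N(5, 1) \\
Y_i &= \exp(2 + 0.55 M^{*}_i + 0.25 \log(M_i) + 0.25 X_i + \eta_{yi}) \\
Z^{*}_{2i} &= 0.3 M^{*}_i + 0.35 \log(M_i) + 0.32 X_i + \eta_{z^{*}_{2}i} \\
M_i &= \exp(4 + 0.5 X_i + \eta_{mi}) \\
Z^{*}_{1i} &= 0.4 X_i + \eta_{z^{*}i}
\end{aligned}
\tag{28}
\end{equation}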
5.1.8 Model 2 - Two-part M, Two-part Y, Moderately strong
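\begin{equation}
\begin{aligned}
X_i &\sim N(5, 1) \\
Y_i &= \exp(2 + 0.55 M^{*}_i + 0.3 \log(M_i) + 0.2 X_i + \eta_{yi}) \\
Z^{*}_{2i} &= 0.28 M^{*}_i + 0.84 \log(M_i) + 0.57 X_i + \eta_{z^{*}_{2}i} \\
M_i &= \exp(4 + 0.7 X_i + \eta_{mi}) \\
Z^{*}_{1i} &= 0.8 X_i + \eta_{z^{*}i}
\end{aligned}
\tag{29}
\end{equation}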
5.2 Estimation
The simulations and estimations of the models considered in this Monte Carlo study are
performed with ML estimation in Mplus version 7.4 (Muthén and Muthén). Note that there
are observations missing at random (MAR) since the Monte Carlo models are run with
one model for Y with a dummy for group membership. The continuous part of log(M) is
only observed when M∗ = 1. The missingness does not affect the estimation, except that
numerical procedures that can handle it have to be chosen in Mplus. For details about
the simulation study setup in Mplus see Appendix C.
5.3 Results
To limit the scope of this study the Monte Carlo results of the two-part mediator will be
given special attention. The two-part mediator effect estimates will be compared to the
corresponding estimates without accounting for the censoring, referred to as the classical
estimates. The classical estimates coincide with the product method estimates in
the two-part M setting, since the restricted models considered in this simulation study
are linear if the two-part structure of M is disregarded. The Monte Carlo results of
the two-part M, two-part Y estimates will be reduced, showing only the small sample
behaviour without comparison with the classical effect estimates.
5.3.1 Outcome variables
The definitions of the outcome variables presented in the Monte Carlo results (Section
5.3.2) are shown in Equation 30. The bias is negative if the average estimate
is below the true effect. The ratio of the empirical standard deviation to the
average standard error (SE) estimate is larger than 1 if the average SE is smaller than
the empirical standard deviation.
\begin{align}
\text{Bias} &= \text{Average estimate} - \text{True effect} \tag{30a} \\
\text{Relative bias (\%)} &= 100 \times \frac{\text{Average estimate} - \text{True effect}}{\text{True effect}} \tag{30b} \\
\text{95\% CI coverage} &= \frac{\text{Number of 95\% CIs covering true effect}}{k} \tag{30c} \\
\text{Percentage significant coefficients} &= \frac{\text{Number of 95\% CIs not covering zero}}{k} \tag{30d} \\
\text{Relative standard error} &= \frac{\text{Empirical standard deviation}}{\text{Average standard error estimate}} \tag{30e}
\end{align}
where k is the number of iterations. Usually a bias larger than 10% of the true
effect is regarded as a substantial bias. The percentage of significant effect estimates is
related to the power of the estimator. However, power interpretations are only
meaningful if the bias is small enough. If the estimator is close to unbiased, a significance
rate above 0.8 is often regarded as strong power.
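The outcome variables in Equation 30 can be computed from the k replications as in the sketch below. It assumes symmetric normal-theory intervals (estimate ± 1.96 SE) and an empirical standard deviation with the k − 1 denominator; neither detail is stated in the text, and the function and key names are hypothetical.

```python
from statistics import mean, stdev

Z_95 = 1.96  # normal critical value for a two-sided 95% interval (assumption)


def monte_carlo_summary(estimates, std_errors, true_effect):
    """Summarise k replications with the quantities defined in Equation 30."""
    k = len(estimates)
    avg_est = mean(estimates)
    lower = [e - Z_95 * s for e, s in zip(estimates, std_errors)]
    upper = [e + Z_95 * s for e, s in zip(estimates, std_errors)]
    covers_true = sum(lo <= true_effect <= hi for lo, hi in zip(lower, upper))
    excludes_zero = sum(lo > 0 or hi < 0 for lo, hi in zip(lower, upper))
    return {
        "bias": avg_est - true_effect,                                     # (30a)
        "relative_bias_pct": 100 * (avg_est - true_effect) / true_effect,  # (30b)
        "coverage": covers_true / k,                                       # (30c)
        "share_significant": excludes_zero / k,                            # (30d)
        "relative_se": stdev(estimates) / mean(std_errors),                # (30e)
    }


# Toy input with four replications, for illustration only.
print(monte_carlo_summary([0.9, 1.1, 1.0, 1.2], [0.1, 0.1, 0.1, 0.1], 1.0))
```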
5.3.2 Two-part M
Table 3 to Table 8 show the detailed results from the Monte Carlo simulation for the
weak effects model. The outcome variables given in the tables are defined in Equation
30.
Figure 7 shows the biases of the average effect estimates for the classical and the
two-part M models. Both the indirect effect (TNIE) bias and the direct effect (PNDE)
bias for the weak effects model are displayed. The two-part model's average estimates
of TNIE and PNDE are close to unbiased for all sample sizes, and the bias converges to zero
with increasing sample size. The classical estimates consistently underestimate
the TNIE and overestimate the PNDE for all sample sizes, with larger bias for larger
censoring amounts. The overestimation of the PNDE is smaller than the
underestimation of the TNIE for all three censoring amounts. The pattern is similar for
the moderately strong effects model, however with accordingly larger biases. Note that
the variance of Y is not 1. The size of the bias for TNIE estimated with the classical
approach, in terms of standard deviations (sd) of Y, ranges from around -0.1 for 10%
censoring to around -0.17 for 50% censoring in the weak effects model. In the moderately
strong effects model the corresponding standardized biases range from around -0.14 for 10%
censoring to around -0.55 for 50% censoring. In percentage terms the bias of the classical
TNIE estimates ranges from -40% to -59% for the weak effects model and from -41% to -74%
for the moderately strong effects model. These large biases for all classical estimates
are likely due to the distributional assumptions discussed in Section 5.1.2, which are
investigated further in Section 5.4. In short, the clear separation of the point mass and
the continuous part of the two-part variable(s) makes the classical approach especially biased.
Figure 8 gives special attention to the bias of the two-part estimates of TNIE. The
bias, as shown above, is small for all censoring amounts and converges to zero as the sample
size increases. There is a drastic change in the pattern of the bias between 10% and 25%
censoring on the one hand and 50% censoring on the other. The pattern of the bias, first
decreasing with increased censoring amount and then increasing, is explained by the lower
graphs of Figure 8. The bias of TNIE seems to be close to a linear function of the censoring
amount, with zero bias at around 35% censoring. This unexpected pattern is due to the
behaviour displayed in the probit slope bias graph in Figure 8. Since the sample size is
small (n=100), the probit regression performs poorly when there are few observations in one
of the groups. When the sample size of the smaller group increases the bias becomes smaller.
However, when the zero-group becomes too large (large censoring amount) the linear regression
has a smaller sample to fit and becomes less accurate. For good estimation behaviour in
small samples the censoring amount must therefore be large enough to give a good
probit regression fit, but small enough to give a good linear regression fit; it is a
trade-off between the two. The censoring amount becomes less important when
the sample size increases, since both regressions get a good fit even for small censoring
amounts.
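The trade-off can be made concrete by counting the expected group sizes. The sketch below is plain arithmetic on the sample size and the censoring amount; the function name is hypothetical. The probit part is limited by the smaller of the two groups, while the linear part only uses the positive (M* = 1) observations.

```python
def expected_group_sizes(n, censoring):
    """Expected numbers of zero (M* = 0) and positive (M* = 1) observations."""
    n_zero = n * censoring
    return {"zero": n_zero, "positive": n - n_zero}


for censoring in (0.10, 0.25, 0.50):
    sizes = expected_group_sizes(100, censoring)
    smaller_group = min(sizes.values())  # what limits the probit regression
    print(censoring, sizes, "smaller group:", smaller_group)
```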
Figure 7: Bias of the TNIE and PNDE estimates for the weak effects model, with censoring amounts 10%, 25% and 50%.
Figure 9 shows the 95% confidence interval coverage. The percentage of confidence
intervals that cover the true effect is stable around 95% for the two-part estimates and
decreases with sample size for the classical estimates. The coverage for the classical
estimates is in line with the bias plot in Figure 7. The classical model's point estimates
are equally biased for all sample sizes, while the standard error, and therefore the width of
the confidence interval, decreases with increasing sample size. The coverage rate for the
confidence intervals of the classical estimates of PNDE goes to zero more slowly than for the
TNIE, in line with the smaller bias of PNDE.
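Why a constant bias combined with a shrinking standard error drives the coverage towards zero can be seen from a normal approximation. The sketch below assumes that the estimator is normally distributed around the true effect plus a fixed bias and that the SE is estimated correctly; the numerical values of the bias and the SE are hypothetical and not taken from the simulation study.

```python
from math import sqrt
from statistics import NormalDist


def approx_coverage(bias, se, z=1.96):
    """Coverage of estimate +/- z*se when estimate ~ N(true + bias, se**2)."""
    phi = NormalDist().cdf
    # The interval covers the true effect when |estimate - true| <= z * se.
    return phi(z - bias / se) - phi(-z - bias / se)


# Hypothetical values: a fixed bias and an SE that shrinks like 1/sqrt(n).
bias = -0.05
for n in (100, 400, 1000):
    se = 0.4 / sqrt(n)
    print(n, round(approx_coverage(bias, se), 3))
```

With zero bias the function returns the nominal 95%; with a fixed non-zero bias the coverage falls as the SE decreases, which matches the pattern described for the classical estimates.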
Figure 8: Bias in percentage of true effect for two-part estimates of TNIE for censored amounts of 10%, 25% and 50%. Bias of TNIE for sample size 100 as a function of censoring amount. Bias of the slope of the probit regression as a function of censoring amount.
Figure 10 shows the ratio between the empirical standard deviation and the average
standard errors. There seems to be a small underestimation of the standard error for the
PNDE, for both the classical and the two-part estimates. The SE of the two-part TNIE
estimates is overestimated for small censoring amounts, whereas the SE of the classical
estimates is underestimated. For larger censoring amounts the ratio is close to one for
both the classical and the two-part estimates. Overall the average SE seems to estimate
the sample variation well.
Figure 9: Percentage of 95% confidence intervals that covered the true effects value for TNIE and PNDE, with censoring amounts 10%, 25% and 50%.
Figure 11 shows the rate of classical and two-part TNIE estimates significantly different
from zero. The classical estimates have a higher or equally high significance rate compared
with the two-part estimates for all settings included in this study. That might seem
contradictory since the TNIE is shown to be underestimated in Figure 7. However, in Figure 12
it can be seen that the average standard error estimates of the two-part estimates are about
2.5 times as large as those of the classical estimates, which is in line with the empirical
counterpart displayed in the same plot. The results from the Monte Carlo simulations of the
moderately strong effects model are very similar to those of the weak effects model for all
outcome variables, however with correspondingly larger differences between the performance
of the classical and the two-part estimates. Because of the similar behaviour, the results
of the moderately strong effects model are not displayed.
Figure 10: The empirical standard deviation divided by the average SE estimate for TNIE and PNDE, with censoring amounts 10%, 25% and 50%.
Table 3: Monte Carlo results for two-part estimated TNIE with 10% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100        0.172     0.165    -0.007        0.055     0.056      0.934     0.916        0.977
  150        0.172     0.168    -0.004        0.045     0.046      0.935     0.995        0.974
  200        0.172     0.169    -0.003        0.039     0.040      0.944     1.000        0.980
  250        0.172     0.171    -0.001        0.035     0.036      0.946     1.000        0.997
  300        0.172     0.171    -0.001        0.033     0.032      0.929     1.000        1.025
  400        0.172     0.171    -0.001        0.028     0.028      0.944     1.000        0.982
  500        0.172     0.171    -0.001        0.025     0.025      0.948     1.000        1.004
 1000        0.172     0.171    -0.001        0.018     0.018      0.936     1.000        1.023
Figure 11: Percent of estimates significantly different from 0 with α = 0.05, for the TNIE estimate of the weak effects model.
Figure 12: Ratios of empirical standard deviation and average SE for two-part estimates and classical estimates. The results come from the weak effects model with 25% censoring.
Table 4: Monte Carlo results for two-part estimated TNIE with 25% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100        0.226     0.222    -0.004        0.085     0.086      0.938     0.809        0.987
  150        0.226     0.223    -0.003        0.069     0.070      0.933     0.955        0.993
  200        0.226     0.224    -0.002        0.060     0.060      0.936     0.991        1.002
  250        0.226     0.226    -0.000        0.053     0.054      0.947     0.997        0.987
  300        0.226     0.226    -0.000        0.049     0.049      0.942     1.000        1.008
  400        0.226     0.224    -0.002        0.042     0.042      0.945     1.000        0.995
  500        0.226     0.224    -0.002        0.038     0.038      0.948     1.000        1.011
 1000        0.226     0.225    -0.001        0.027     0.026      0.947     1.000        1.011
Table 5: Monte Carlo results for two-part estimated TNIE with 50% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100        0.257     0.262     0.005        0.128     0.127      0.935     0.556        1.009
  150        0.257     0.260     0.003        0.099     0.101      0.949     0.803        0.977
  200        0.257     0.262     0.005        0.086     0.087      0.952     0.918        0.990
  250        0.257     0.260     0.003        0.076     0.077      0.948     0.971        0.982
  300        0.257     0.260     0.002        0.071     0.070      0.944     0.991        1.011
  400        0.257     0.257    -0.000        0.059     0.060      0.960     0.998        0.988
  500        0.257     0.257    -0.000        0.054     0.054      0.948     1.000        1.011
 1000        0.257     0.255    -0.002        0.037     0.038      0.958     1.000        0.976
Table 6: Monte Carlo results for two-part estimated PNDE with 10% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100        0.250     0.251     0.001        0.066     0.063      0.950     0.967        1.043
  150        0.250     0.251     0.001        0.053     0.052      0.948     0.992        1.021
  200        0.250     0.251     0.001        0.047     0.045      0.945     0.999        1.040
  250        0.250     0.251     0.001        0.041     0.040      0.944     1.000        1.027
  300        0.250     0.250     0.000        0.038     0.037      0.932     1.000        1.041
  400        0.250     0.250     0.000        0.032     0.032      0.947     1.000        1.016
  500        0.250     0.250     0.000        0.028     0.028      0.945     1.000        1.004
 1000        0.250     0.250     0.000        0.021     0.020      0.946     1.000        1.030
Table 7: Monte Carlo results for two-part estimated PNDE with 25% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100        0.250     0.251     0.001        0.070     0.066      0.942     0.957        1.053
  150        0.250     0.251     0.001        0.056     0.054      0.941     0.995        1.031
  200        0.250     0.251     0.001        0.049     0.047      0.936     0.998        1.045
  250        0.250     0.250     0.000        0.044     0.042      0.942     1.000        1.048
  300        0.250     0.250    -0.000        0.040     0.038      0.935     1.000        1.050
  400        0.250     0.250    -0.000        0.034     0.033      0.950     1.000        1.012
  500        0.250     0.250    -0.000        0.030     0.030      0.948     1.000        1.017
 1000        0.250     0.250     0.000        0.021     0.021      0.945     1.000        1.014
Table 8: Monte Carlo results for two-part estimated PNDE with 50% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100        0.250     0.249    -0.001        0.075     0.073      0.944     0.913        1.034
  150        0.250     0.249    -0.001        0.060     0.059      0.943     0.978        1.007
  200        0.250     0.249    -0.001        0.052     0.051      0.947     0.994        1.021
  250        0.250     0.249    -0.000        0.047     0.046      0.954     0.998        1.022
  300        0.250     0.249    -0.001        0.043     0.042      0.943     1.000        1.029
  400        0.250     0.250    -0.000        0.036     0.036      0.947     1.000        1.006
  500        0.250     0.250     0.000        0.033     0.032      0.946     1.000        1.031
 1000        0.250     0.250     0.000        0.023     0.023      0.939     1.000        1.022
5.3.3 Two-part M, Two-part Y
Table 9 to Table 14 show the detailed Monte Carlo results of the two-part M, two-part
Y estimated TNIE and PNDE for different sample sizes and censoring amounts. Note
that no classical effects are presented. The focus is on the small sample behaviour of
the two-part M, two-part Y estimated causal effects. Additionally, the reader may notice
that the numerical size of the effects is substantially larger than those of the two-part M
results. This is because the outcome variable Y for these models is assumed lognormal,
as compared to normal in the two-part M case. The standard deviation of Y is around
240 for the two-part M, two-part Y models.
Figure 13 shows the relative bias. For small sample sizes the bias is large, especially for
the TNIE. For 10% censoring and sample size 100 the bias of the TNIE is almost 20% of the
true effect (Table 9). There is a steep decrease in the bias for all three censoring amounts
between sample sizes 100 and 250. There is an unexpected slight bump at sample size 300; an
increased number of iterations (4000 instead of 1000) does not change this pattern. For
sample sizes larger than or equal to 500 the bias is less than 5% for all censoring amounts.
Note that the bias is positive, implying an overestimation of the effects.
Figure 13: Bias of the TNIE and the PNDE estimates for the weak effects model for censoring amounts 10%, 25% and 50%. The y-axis is standard deviations of Y.
Figure 14 shows the percentage of 95% confidence intervals that covered the true effect
value. The coverage rate for TNIE is similar for all three censoring amounts, increasing
from around 0.8 for sample size 100 to around 0.9 for sample size 1000. The PNDE
follows a similar pattern, with around 0.85 for sample size 100 and reaching the desired
coverage rate 0.95 for sample size 1000. The low coverage of TNIE implies that the SE
estimates, on which the confidence intervals are based, are underestimated, which is in
line with Figure 15.
Figure 14: Percentage of 95% confidence intervals that covered the true effects value for TNIE and PNDE, with censoring amounts 10%, 25% and 50%.
Figure 15 shows that for small sample sizes the average SE estimates underestimate
the variation of the estimator. The average SE estimates seem to be stable for sample
sizes larger than or equal to 400. The three censoring amounts do not have the same
clear ordering as in the bias of the two-part M model in Figure 13.
Figure 15: The empirical standard deviation divided by the average SE estimate for TNIE and PNDE, with censoring amounts 10%, 25% and 50%.
Figure 16 shows the percent of significant effect estimates (α = 0.05). For TNIE
there is a steep increase in the rate of significant effects between sample sizes 100 and 300.
The significance rate of the 10% and 25% censoring estimates differs substantially from that
of the 50% censoring estimates. The estimates for the 10% and 25% censored data sets are
all significant for sample size 500 and above. The estimates for the 50% censored data
sets have a similar significance rate for small samples but a slower increase
with increased sample size. The PNDE shows similar patterns, however with an even
steeper increase. The PNDE estimates for the 10% and 25% censored data sets are all
significant for sample sizes 400 and larger. The PNDE estimates for the 50% censored data
sets are significant from sample size 500. The estimates on 50% censored data have a stronger
separation from 10% and 25% for PNDE than for TNIE.
Figure 16: Percent of estimates significantly different from 0 with α = 0.05, for the TNIE estimate of the weak effects model.
Table 9: Monte Carlo results for two-part, two-part estimated TNIE with 10% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100       36.204    43.358     7.154       56.346    43.581      0.784     0.018        1.293
  150       36.204    39.380     3.176       31.164    29.505      0.815     0.082        1.056
  200       36.204    38.471     2.267       27.061    24.339      0.845     0.241        1.112
  250       36.204    38.031     1.827       23.062    21.110      0.856     0.485        1.092
  300       36.204    37.952     1.748       20.206    19.031      0.871     0.721        1.062
  400       36.204    37.484     1.280       16.691    16.007      0.885     0.966        1.043
  500       36.204    36.910     0.706       14.503    13.924      0.897     0.999        1.042
 1000       36.204    36.520     0.316        9.482     9.560      0.920     1.000        0.992
In summary, the Monte Carlo results were the following. The two-part M causal effect
estimates are close to unbiased even for small sample sizes, whereas the classical estimates
consistently underestimate the indirect effect and overestimate the direct effect
for all sample sizes. The two-part M, two-part Y causal effect estimates are severely biased
for small sample sizes but become close to unbiased for sample sizes above 300.
Table 10: Monte Carlo results for two-part, two-part estimated TNIE with 25% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100       43.750    50.597     6.847       57.245    50.698      0.798     0.066        1.129
  150       43.750    47.728     3.978       40.852    35.679      0.832     0.191        1.145
  200       43.750    46.050     2.300       30.455    28.474      0.860     0.367        1.070
  250       43.750    45.378     1.628       24.426    24.372      0.870     0.551        1.002
  300       43.750    45.573     1.823       23.192    22.127      0.882     0.754        1.048
  400       43.750    45.006     1.256       18.842    18.548      0.886     0.954        1.016
  500       43.750    44.208     0.458       16.016    16.011      0.882     0.999        1.000
 1000       43.750    43.735    -0.015       10.693    10.937      0.913     1.000        0.978
Table 11: Monte Carlo results for two-part, two-part estimated TNIE with 50% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100       38.247    48.946    10.699       87.850    66.308      0.806     0.056        1.325
  150       38.247    44.714     6.467       49.950    41.745      0.819     0.147        1.197
  200       38.247    42.159     3.912       37.742    31.445      0.828     0.267        1.200
  250       38.247    40.431     2.184       26.503    25.242      0.855     0.399        1.050
  300       38.247    40.640     2.393       25.524    22.921      0.850     0.552        1.114
  400       38.247    40.022     1.775       19.564    18.838      0.874     0.785        1.039
  500       38.247    39.321     1.074       16.741    16.167      0.878     0.929        1.036
 1000       38.247    38.481     0.234       10.984    10.774      0.909     1.000        1.019
Table 12: Monte Carlo results for two-part, two-part estimated PNDE with 10% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100       46.296    51.230     4.934       41.351    36.387      0.862     0.118        1.136
  150       46.296    49.039     2.743       28.378    26.763      0.884     0.468        1.060
  200       46.296    47.966     1.670       23.242    22.283      0.896     0.781        1.043
  250       46.296    47.678     1.382       20.285    19.543      0.895     0.937        1.038
  300       46.296    47.948     1.652       18.919    17.814      0.906     0.988        1.062
  400       46.296    47.569     1.273       15.284    15.150      0.921     1.000        1.009
  500       46.296    46.904     0.608       13.517    13.287      0.922     1.000        1.017
 1000       46.296    46.693     0.397        9.122     9.234      0.940     1.000        0.988
Table 13: Monte Carlo results for two-part, two-part estimated PNDE with 25% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100       40.406    45.535     5.129       35.354    33.680      0.870     0.128        1.050
  150       40.406    43.949     3.543       27.824    25.283      0.889     0.383        1.101
  200       40.406    42.691     2.285       21.066    20.794      0.909     0.702        1.013
  250       40.406    42.201     1.795       18.187    18.074      0.919     0.880        1.006
  300       40.406    42.222     1.816       16.915    16.486      0.914     0.963        1.026
  400       40.406    41.948     1.542       13.473    14.046      0.932     1.000        0.959
  500       40.406    41.077     0.671       11.956    12.230      0.931     1.000        0.978
 1000       40.406    40.657     0.251        8.067     8.457      0.953     1.000        0.954
Table 14: Monte Carlo results for two-part, two-part estimated PNDE with 50% censoring.
        Population   Average             Empir. st.        SE     95% CI    % sig.   Empir. sd/
    n       effect  estimate      Bias    deviation   Average   coverage    coeff.   SE average
  100       23.485    30.842     7.357       35.596    32.339      0.866     0.021        1.101
  150       23.485    28.044     4.559       23.509    21.683      0.892     0.070        1.084
  200       23.485    26.609     3.124       19.123    17.006      0.893     0.215        1.124
  250       23.485    25.730     2.245       14.470    14.079      0.906     0.408        1.028
  300       23.485    25.486     2.001       13.081    12.712      0.905     0.643        1.029
  400       23.485    25.211     1.726       10.857    10.668      0.927     0.896        1.018
  500       23.485    24.537     1.052        9.684     9.205      0.909     0.985        1.052
 1000       23.485    23.991     0.506        6.372     6.249      0.940     1.000        1.020
5.4 Sensitivity analysis
To get some insight into the sensitivity to the distributional assumptions made
for the continuous part of the two-part variables, a small Monte Carlo simulation is
conducted. The design is simple: the intercept of M in the two-part M setting (Model 1)
is gradually decreased. Since substantial parts of the normal distribution are then censored,
the dichotomization of Z∗1 is adjusted to obtain the same censoring amount. Figure 17 shows
random samples from the different M-variables that are fitted to the two-part models.
Note that the continuous part of M becomes truncated normal when the point mass starts
to connect with the tail of the normal density, and thus the true effects are unknown:
it is not possible to calculate the true bias without deriving the causal effects for the
two-part model under a truncated normal distributional assumption for the continuous part
of the variable. As a guiding measure the two-part estimate under the same censoring amount,
with no truncation of the continuous part, is used as the "true" effect.
Figure 17: Random samples of 10000 observations from M generated with different intercepts. The censored amount is adjusted to 25% for all intercepts.
Figure 18 shows the bias, presented as percentage of the true effect. The bias of the
two-part estimates in this graph is informative with regard to the robustness of the
two-part estimator. The only reason the estimated effects differ from the "true" effects
is that the distributional assumption is violated gradually from right to
left. Each point in Figure 18 corresponds to a sample of M, sampled from a population
as shown in Figure 17. When the intercept of M is small enough for the point mass to
connect with the density (in theory it always does, but a practical view is taken here)
the difference in the two-part estimates becomes larger. The two-part estimator does
not seem robust: as soon as the normal distribution is markedly truncated the estimates
differ substantially. In practice, large deviations from normal or lognormal continuous
parts of two-part variables are not uncommon. The interpretation of the classical
results in Figure 18 is somewhat problematic when the true value is unknown. When the
normal distribution assumption for the continuous part of the two-part variables is gradually
violated, the difference between the unviolated two-part estimates and the classical
estimates changes direction. The difference between the classical and the unviolated two-part
estimates is larger than that of the violated two-part estimates for all intercepts lower
than 0.5. It is, however, difficult to interpret this in an informative way. These results
should rather be viewed as an indication that the robustness of the two-part estimator can
be questioned. It is also a clear indication of the need for two-part models to be
implemented with other continuous distributions than the normal.
Figure 18: Bias of the TNIE for different intercepts of M. The sample size is 200, the censoring amount 25%. *Note that the true values are unknown and the bias is rather the difference from the two-part effect with unviolated distributional assumptions.
6 Discussion
As expected, there are gains to be made from accounting for censoring when the censoring
occurs in the mediator variable. For the two-part M setting, the Monte Carlo results
show that the classical approach consistently underestimated the TNIE, with larger
underestimation for larger censoring amounts. Given that the purpose of most mediation
analyses is to establish and measure indirect effects compared to the direct effect, the
bias direction makes the problem particularly severe. The bias is substantial even for small
effects and censoring amounts, indicating the importance of accounting for censoring also in
situations where the censoring is moderate.
It is clear that even though the TNIE is underestimated by the classical estimates, the
classical TNIE estimate is significant for smaller sample sizes. Since the estimation of the
classical effects is more parsimonious than the two-part estimation, the average SE estimates
are a lot smaller. The cost of complexity is confirmed by the two-part M, two-part Y
estimates. For example, the bias is substantial for the small sample sizes as compared to
the bias of the two-part M estimates, which is a lot smaller and not substantial for any
sample size considered. The 95% coverage is lower, with a slower increase, than for
two-part M. Thus, not surprisingly, larger sample sizes are needed to benefit from the
two-part M, two-part Y estimates. The high significance rate of the classical estimates should
also be contrasted with the rate of confidence intervals covering the true effect. When
the latter is also considered, the rate of confidence intervals including the true effect and
not zero is drastically decreased for the classical effects. This pattern is explained by the
severe bias of the classical estimates.
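The joint rate mentioned above, the share of confidence intervals that include the true effect and exclude zero, can be computed per replication as sketched below. The interval form (estimate ± 1.96 SE) is an assumption, and the function name and the toy numbers are hypothetical.

```python
def share_cover_true_and_exclude_zero(estimates, std_errors, true_effect, z=1.96):
    """Share of intervals that contain the true effect but not zero."""
    hits = 0
    for est, se in zip(estimates, std_errors):
        lo, hi = est - z * se, est + z * se
        covers_true = lo <= true_effect <= hi
        excludes_zero = lo > 0 or hi < 0
        if covers_true and excludes_zero:
            hits += 1
    return hits / len(estimates)


# Toy input: one estimate is significant but biased, one is significant and
# covers the true effect, one is not significant.
print(share_cover_true_and_exclude_zero([0.10, 0.20, 0.05], [0.02, 0.02, 0.04], 0.2))
```

A strongly biased estimator with small standard errors can have a high significance rate and still score low on this joint rate, which is the contrast drawn for the classical estimates.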
It might seem that even though the two-part approach gives a less underestimated TNIE,
the classical estimates are better able to establish a significant
effect for small sample sizes. The two-part estimation is more complex, involving the
non-linear probit model, and the SE estimates are considerably higher. However, the
biases of the classical estimates are so substantial that the classical estimates should not
be trusted at all under moderate or heavier censoring. The behaviour of the classical
approach under censoring with zero indirect effect is not investigated, but it seems
reasonable to expect problems and inaccurate inference. Further analysis of the
behaviour of the bias when there is censoring but no indirect effect is needed to draw
conclusions about this case.
The coverage rate of confidence intervals for the classical estimates of PNDE is
decreasing slower than for TNIE. This is reasonable since in these runs only the mediator
is two-part and therefore the largest distortion in estimates could be expected for the
indirect effect. The two-part M, two-part Y Monte Carlo results also confirm this with
smaller biases, higher coverage rate and higher significance rate for PNDE than for TNIE
in all settings.
The large biases of the classical estimates of the causal effects in the Monte Carlo
simulations are expected. The fact that the point mass of the censored dependent variable
is so clearly separated from the continuous part makes the classical approach ill-suited
for estimation of the model and the effects. The Monte Carlo setup might be criticized as
being in favour of the two-part estimates from the beginning. Such criticism is legitimate;
however, the distributional assumptions made for the censored variables in this study
are among the most common ones made in practice. There are therefore at least two
important questions to answer. The first, which this study set out to answer, is: Under
a correctly assumed distribution of the continuous part of the censored variable, how does
the two-part approach perform? The second question, only touched upon in this study,
is directed at the two-part framework in general rather than mediation analysis: How
sensitive is the two-part approach to misspecification of the distributional assumption(s)?
If the censored variable does not look like a spike beside a proper normal distribution but
rather like a spike and a truncated normal, the continuous part is never exactly normal
or log-normal. The results of this study, showing gains from two-part estimated effects,
indicate that it is important to establish the answer to the second question. The
small sensitivity analysis indicates that misspecification of the distributional assumption
influences the estimation substantially. In order to be able to claim the benefits of the
two-part causal effects, the sensitivity to the assumption must be investigated in detail,
and distributions other than normal and lognormal might need to be considered to
better fit the data in practice.
7 Conclusion
In this study the counterfactual framework has been used to derive the causal effects of
a flexible mediation model accounting for censored mediator and outcome. Even though
the assumptions are strict, the simplicity of the definition allows us to define the effects
even when the functional forms become somewhat complex. Building on the limited-
dependent variable approach suggested by Cragg (1971), the two-part model is used to
account for censoring in form of floor effects at zero. The first part of this study motivated
and explained the model formulation, for which the causal effects were derived in detail.
A Monte Carlo simulation was performed to investigate the small sample properties of
these effects.
Referring to the research questions regarding model formulation and assumptions, the
two-part framework, together with two-group regression, was used to account for censoring.
This adds assumption(s) about the distribution of the continuous part
of the two-part variable. The two-part framework also relies on the implicit assumption
of a data generating process where the point mass differs from the continuous part with
respect to other variables in the model.
Detailed derivations of the causal effects of the two-part mediator, and the two-part
mediator, two-part outcome variable are presented in Appendix A and B.
The research questions concerning the accuracy and small sample behaviour of the
derived causal effects were answered with Monte Carlo simulations. The simulations
showed that there are large improvements to be made by correctly accounting for limited
mediator and outcome. The simulations also showed that the flexibility of these models
is not too costly in terms of estimation performance and power. The two-part M model
performed well from sample sizes as small as 100. For the two-part M, two-part Y
model, samples larger than 300 are suggested for good behaviour of the estimates. It is,
however, also discussed and pointed out throughout the study that the results of the
simulation study are obtained under perfect distributional assumptions. The small Monte Carlo
study of the sensitivity to the normal assumption of the two-part M model indicates the
need for other distributions. A natural step would be to implement the causal effects under
a truncated normal assumption for the continuous part of the two-part variables. The
two-part effects under a truncated normal would also serve as a tool for evaluating the
robustness of the normal assumption accurately. It is clear that the two-part model can give
much improvement in causal effect estimation in limited-dependent variable mediation analysis.
However, the two-part model itself may still benefit from some improvements.
Acknowledgements
The author would like to thank Shaobo Jin for his patient mathematical advice and
suggestions, and Tihomir Asparouhov for his feedback on the derivations.
References
Agresti, A. (2007). An Introduction to Categorical Data Analysis. Wiley Series in Prob-
ability and Statistics. Wiley.
Aitchison, J. and Brown, J. A. C. (1966). The Lognormal Distribution: With Special Ref-
erence to Its Uses in Economics. Cambridge. University. Dept. of Applied Economics.
Monographs, 5. University Press.
Baron, R. M. and Kenny, D. (1986). The Moderator-Mediator Variable Distinction in
Social Psychological Research: Conceptual, Strategic, and Statistical Considerations.
Journal of Personality and Social Psychology, 51(6):1173–1182.
Brown, E., Catalano, R., Fleming, C., Haggerty, K., and Abbott, R. (2005). Adolescent
substance use outcomes in the Raising Healthy Children Project: A two-part latent
growth curve analysis. Journal of Consulting and Clinical Psychology, 73:699.
Cohen, J. (1992). A Power Primer. Psychological Bulletin, 112(July):155–9.
Cragg, J. (1971). Some Statistical Models for Limited Dependent Variables with Appli-
cation to the Demand for Durable Goods. Econometrica, 39(5):829–844.
Duan, N., Manning, W. G., Morris, C. N., and Newhouse, J. P. (1983). A Comparison of
Alternative Models for the Demand for Medical Care. Journal of Business & Economic
Statistics, 1(2):115–126.
Gill, J. (2000). Generalized Linear Models: A Unified Approach. Quantitative Applica-
tions in the Social Sciences. SAGE Publications.
Greene, W. H. (2012). Econometric Analysis. The Pearson series in economics. Pearson.
Hayes, A. F. (2013). Introduction to Mediation, Moderation, and Conditional Process
Analysis: A Regression-Based Approach. Methodology in the Social Sciences Series.
Guilford Press.
Imai, K., Keele, L., and Tingley, D. (2010). A general approach to causal mediation
analysis. Psychological Methods, 15(4):309–334.
Imbens, G. W. and Angrist, J. D. (1994). Identification and Estimation of Local
Average Treatment Effects. Econometrica: Journal of the Econometric Society,
62(2):467–475.
James, L. R. and Brett, J. M. (1984). Mediators, moderators, and tests for mediation.
Journal of Applied Psychology, 69(2):307–321.
Jones, A. M. (1989). A double-hurdle model of cigarette consumption. 4(August 1988):23–
39.
Judd, M. C. and Kenny, D. (1981). Estimating Mediation in Treatment Evaluations.
Evaluation Review, 5(5):602–619.
Keele, L. (2015). Causal Mediation Analysis: Warning! Assumptions Ahead. American
Journal of Evaluation, pages 1–14.
Muthen, B. (1979). Probit Model With Latent Variables. 74(368):807–811.
Muthen, B., Muthen, L., and Asparouhov, T. (2016). Regression and Mediation Analysis
Using Mplus. Muthen & Muthen, Los Angeles.
Muthen, L. and Muthen, B. Mplus User’s Guide. Seventh Edition. Muthen & Muthen,
Los Angeles.
Pearl, J. (1995). Causal diagrams for empirical research. Biometrika, 82(4):669–688.
Pearl, J. (2001). Direct versus Total Effects. Proceedings of the Seventeenth Conference
on Uncertainty in Artificial Intelligence, (1992):411–420.
Robins, J. M. and Greenland, S. (1992). Identifiability and exchangeability for direct and
indirect effects. Epidemiology, pages 143–155.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonran-
domized studies. Journal of Educational Psychology, 66(5):688–701.
Rucker, D. D., Preacher, K. J., Tormala, Z. L., and Petty, R. E. (2011). Mediation
Analysis in Social Psychology: Current Practices and New Recommendations. Social
and Personality Psychology Compass, 5(6):359–371.
Spirtes, P., Glymour, C., and Scheines, R. (1993). Causation, Prediction, and Search.
Technometrics, 45(3):272–273.
Tobin, J. (1958). Estimation of Relationships for Limited Dependent Variables.
Econometrica, 26(1):24–36.
VanderWeele, T. (2015). Explanation in Causal Inference: Methods for Mediation and
Interaction. Oxford University Press, Incorporated.
Vanderweele, T. J. (2012). Mediation analysis with multiple versions of the mediator.
Epidemiology (Cambridge, Mass.), 23(3):454–63.
VanderWeele, T. J. and Vansteelandt, S. (2010). Odds Ratios for Mediation Analysis for
a Dichotomous Outcome. American Journal of Epidemiology, 172(12):1339–1348.
Wang, W. and Albert, J. M. (2012). Estimation of mediation effects for zero-inflated
regression models. Statistics in Medicine, 31(26):3118–3132.
Wooldridge, J. M. (2002). Econometric Analysis of Cross Section and Panel Data.
MIT Press.
A Appendix - Derivation - Two-part M
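Models

\begin{equation}
\begin{aligned}
Y_i &= \beta_0^{(1)} + \beta_1 \log(M_i) + \beta_2^{(1)} X_i + \beta_3 \log(M_i) X_i + \beta_4^{(1)} C_i + \varepsilon_{y_i}, && M_i > 0 \\
Y_i &= \beta_0^{(2)} + \beta_2^{(2)} X_i + \beta_4^{(2)} C_i + \varepsilon_{y_i}, && M_i = 0 \\
\log(M_i \mid M_i > 0) &= \gamma_0 + \gamma_1 X_i + \gamma_2 C_i + \varepsilon_{m_i} \\
\operatorname{probit}(\Pr(M_i > 0)) &= \kappa_0 + \kappa_1 X_i + \kappa_2 C_i
\end{aligned}
\tag{31}
\end{equation}

where, by assumption, $\varepsilon_{y_i} \sim N(0, \sigma_y^2)$ and $\varepsilon_{m_i} \sim N(0, \sigma_m^2)$. This model is a combination of a two-part model for M and a two-group model for Y.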
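The expected values needed to define the causal effects are of the kind in Equation 32. Since the expected value is a sum, the terms can be calculated separately.

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] ={}& E[Y \mid X = x_1, M = 0, C = c] \times P(M = 0 \mid X = x_0, C = c) \\
&+ \int_{-\infty}^{\infty} E[Y \mid X = x_1, M > 0, C = c] \times P(M > 0 \mid X = x_0, C = c) \\
&\qquad \times f(\log(M) \mid M > 0, X = x_0, C = c) \,\partial \log(M)
\end{aligned}
\tag{32}
\end{equation}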
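First term
Since the conditioning is on the fixed value M = 0, the first term has the simple form

\begin{equation}
E[Y \mid X = x_1, M = 0, C = c] \times P(M = 0 \mid X = x_0, C = c)
= (\beta_0^{(2)} + \beta_2^{(2)} x_1 + \beta_4^{(2)} c) \times (1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c))
\tag{33}
\end{equation}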
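Second term
In the second term the randomness of M must be accounted for, since the conditioning is not on a fixed value of M.

\begin{equation}
\begin{aligned}
&\int_{-\infty}^{\infty} E[Y \mid X = x_1, M > 0, C = c] \times P(M > 0 \mid X = x_0, C = c) \times \underbrace{f(\log(M) \mid M > 0, X = x_0, C = c)}_{\text{Normal density by assumption}} \,\partial \log(M) \\
&= \int_{-\infty}^{\infty} \left(\beta_0^{(1)} + \beta_1 \log(M) + \beta_2^{(1)} x_1 + \beta_3 \log(M) x_1 + \beta_4^{(1)} c\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\qquad \times f(\log(M); \gamma_0 + \gamma_1 x_0 + \gamma_2 c, \sigma_m^2) \,\partial \log(M) \\
&= \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \underbrace{\int_{-\infty}^{\infty} \left(\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c + \log(M)(\beta_1 + \beta_3 x_1)\right) \times f(\log(M); \gamma_0 + \gamma_1 x_0 + \gamma_2 c, \sigma_m^2) \,\partial \log(M)}_{\text{Part 1}}
\end{aligned}
\tag{34}
\end{equation}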
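Looking only at the integral, using the sum rule of integration, and for simplicity letting $\mu_m = \gamma_0 + \gamma_1 x_0 + \gamma_2 c$,

\begin{equation}
\begin{aligned}
\text{Part 1} &= \int_{-\infty}^{\infty} (\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c) \times f(\log(M); \mu_m, \sigma_m^2) \,\partial \log(M)
+ \int_{-\infty}^{\infty} \log(M)(\beta_1 + \beta_3 x_1) \times f(\log(M); \mu_m, \sigma_m^2) \,\partial \log(M) \\
&= (\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c) \times \underbrace{\int_{-\infty}^{\infty} f(\log(M); \mu_m, \sigma_m^2) \,\partial \log(M)}_{=1}
+ (\beta_1 + \beta_3 x_1) \times \underbrace{\int_{-\infty}^{\infty} \log(M) \times f(\log(M); \mu_m, \sigma_m^2) \,\partial \log(M)}_{=E[\log(M)]=\mu_m} \\
&= (\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c) + (\beta_1 + \beta_3 x_1)\mu_m
\end{aligned}
\tag{35}
\end{equation}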
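Combining this with the factor outside the integral gives

\begin{equation}
\begin{aligned}
&\int_{-\infty}^{\infty} E[Y \mid X = x_1, M > 0, C = c] \times P(M > 0 \mid X = x_0, C = c) \times f(\log(M) \mid M > 0, X = x_0, C = c) \,\partial \log(M) \\
&= \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \left((\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c) + (\beta_1 + \beta_3 x_1)\mu_m\right)
\end{aligned}
\tag{36}
\end{equation}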
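Full expression
Adding the first and the second term, the full expression is given by

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] ={}& (\beta_0^{(2)} + \beta_2^{(2)} x_1 + \beta_4^{(2)} c) \times (1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)) \\
&+ \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \left((\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c) + (\beta_1 + \beta_3 x_1)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)\right) \\
={}& (\beta_0^{(2)} + \beta_2^{(2)} x_1 + \beta_4^{(2)} c) - (\beta_0^{(2)} + \beta_2^{(2)} x_1 + \beta_4^{(2)} c) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&+ \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c) \\
&+ \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c)
\end{aligned}
\tag{37}
\end{equation}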
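At this point $\beta_2^{(1)} = \beta_2^{(2)}$ and $\beta_4^{(1)} = \beta_4^{(2)}$ are assumed, indicated by the dropped superscripts below. If this is not desired, the above expression can be used to define the effects.

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] ={}& \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&+ (\beta_2 x_1 + \beta_4 c) \times (1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) + \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)) \\
&+ \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c) \\
={}& \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) + (\beta_2 x_1 + \beta_4 c) \\
&+ \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c)
\end{aligned}
\tag{38}
\end{equation}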
From the expression it can be seen that the parts that depend on M are multiplied by the probability that M is larger than 0. This also means
that as Pr(M>0) approaches one, the effects approach the usual effects without a two-part M.
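The closed form in Equation 38 can be checked against a direct simulation of the model in Equation 31. The sketch below is only an illustration: the parameter values, the dictionary keys and the function names are hypothetical and are not taken from the study. It draws the mediator under X = x0, generates Y under X = x1, and compares the simulated mean with Equation 38.

```python
import math
import random


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical parameter values, chosen only for illustration.
PARAMS = {
    "b0_1": 2.5, "b0_2": 2.0,            # intercepts of Y for M > 0 and for M = 0
    "b1": 0.25, "b2": 0.25, "b3": 0.1, "b4": 0.3,
    "g0": 4.0, "g1": 0.5, "g2": 0.2,     # model for log(M | M > 0)
    "k0": -1.27, "k1": 0.4, "k2": 0.1,   # probit model for Pr(M > 0)
    "sd_y": 0.5, "sd_m": 1.0,            # residual standard deviations
}


def expected_y(x_y, x_m, c, p):
    """Equation 38: E[Y(x_y, log M(x_m))] with common slopes beta_2 and beta_4."""
    pr_pos = Phi(p["k0"] + p["k1"] * x_m + p["k2"] * c)
    mu_m = p["g0"] + p["g1"] * x_m + p["g2"] * c
    return (p["b0_2"] + (p["b0_1"] - p["b0_2"]) * pr_pos
            + p["b2"] * x_y + p["b4"] * c
            + pr_pos * (p["b1"] + p["b3"] * x_y) * mu_m)


def simulated_y(x_y, x_m, c, p, n=200_000, seed=1):
    """Monte Carlo mean of Y when M is drawn under x_m and Y is generated under x_y."""
    rng = random.Random(seed)
    mu_m = p["g0"] + p["g1"] * x_m + p["g2"] * c
    lin_k = p["k0"] + p["k1"] * x_m + p["k2"] * c
    total = 0.0
    for _ in range(n):
        eps_y = rng.gauss(0.0, p["sd_y"])
        if rng.gauss(0.0, 1.0) < lin_k:      # M > 0 with probability Phi(lin_k)
            log_m = rng.gauss(mu_m, p["sd_m"])
            total += (p["b0_1"] + p["b1"] * log_m + p["b2"] * x_y
                      + p["b3"] * log_m * x_y + p["b4"] * c + eps_y)
        else:                                # M = 0: second regime of Equation 31
            total += p["b0_2"] + p["b2"] * x_y + p["b4"] * c + eps_y
    return total / n


if __name__ == "__main__":
    print(expected_y(6.0, 5.0, 1.0, PARAMS), simulated_y(6.0, 5.0, 1.0, PARAMS))
```

With 200,000 draws the two numbers should agree up to Monte Carlo error, which is small relative to the effect sizes for these illustrative values.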
Effects
For convenience, some compact notation is used for defining the counterfactual-based effects. E[Y(x1, log(M(x0)))] is to be understood as the
expected value of Y given that Y is conditioned on x1 and log(M) is conditioned on x0.
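The Total Natural Indirect Effect

\begin{equation}
\begin{aligned}
TNIE ={}& E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_1, \log(M(x_0))) \mid C = c] \\
={}& \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) + (\beta_2 x_1 + \beta_4 c) \\
&+ \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) \\
&- \beta_0^{(2)} - (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) - (\beta_2 x_1 + \beta_4 c) \\
&- \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c) \\
={}& (\beta_0^{(1)} - \beta_0^{(2)}) \left(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ (\beta_1 + \beta_3 x_1) \left(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c)\right)
\end{aligned}
\tag{39}
\end{equation}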
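The Pure Natural Direct Effect

\begin{equation}
\begin{aligned}
PNDE ={}& E[Y(x_1, \log(M(x_0))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
={}& \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) + (\beta_2 x_1 + \beta_4 c) \\
&+ \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c) \\
&- \beta_0^{(2)} - (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) - (\beta_2 x_0 + \beta_4 c) \\
&- \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_0) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c) \\
={}& \beta_2 \times (x_1 - x_0) + \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c) \times \beta_3 \times (x_1 - x_0)
\end{aligned}
\tag{40}
\end{equation}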
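The Pure Natural Indirect Effect

\begin{equation}
\begin{aligned}
PNIE ={}& E[Y(x_0, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
={}& \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) + (\beta_2 x_0 + \beta_4 c) \\
&+ \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\beta_1 + \beta_3 x_0) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) \\
&- \beta_0^{(2)} - (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) - (\beta_2 x_0 + \beta_4 c) \\
&- \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_0) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c) \\
={}& (\beta_0^{(1)} - \beta_0^{(2)}) \times \left(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ (\beta_1 + \beta_3 x_0) \times \left(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c)\right)
\end{aligned}
\tag{41}
\end{equation}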
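The Total Natural Direct Effect

\begin{equation}
\begin{aligned}
TNDE ={}& E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_1))) \mid C = c] \\
={}& \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) + (\beta_2 x_1 + \beta_4 c) \\
&+ \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) \\
&- \beta_0^{(2)} - (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) - (\beta_2 x_0 + \beta_4 c) \\
&- \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\beta_1 + \beta_3 x_0) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) \\
={}& \beta_2 \times (x_1 - x_0) + \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) \times \beta_3 \times (x_1 - x_0)
\end{aligned}
\tag{42}
\end{equation}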
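The Total effect

\begin{equation}
\begin{aligned}
TE ={}& E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
={}& \beta_0^{(2)} + (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) + (\beta_2 x_1 + \beta_4 c) \\
&+ \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) \\
&- \beta_0^{(2)} - (\beta_0^{(1)} - \beta_0^{(2)}) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) - (\beta_2 x_0 + \beta_4 c) \\
&- \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_0) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c) \\
={}& (\beta_0^{(1)} - \beta_0^{(2)}) \times \left(\Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) + \beta_2 \times (x_1 - x_0) \\
&+ \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times (\beta_1 + \beta_3 x_1) \times (\gamma_0 + \gamma_1 x_1 + \gamma_2 c) \\
&- \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times (\beta_1 + \beta_3 x_0) \times (\gamma_0 + \gamma_1 x_0 + \gamma_2 c)
\end{aligned}
\tag{43}
\end{equation}

For which it holds that

\begin{equation}
TE = TNIE + PNDE
\tag{44}
\end{equation}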
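As a hedged illustration, the final lines of Equations 39 to 43 can be coded directly and the decomposition in Equation 44 checked numerically. The parameter values and names below are hypothetical and serve only to show the bookkeeping; the same decomposition also holds with the pair PNIE and TNDE.

```python
import math


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical parameter values, chosen only for illustration.
PARAMS = {
    "b0_1": 2.5, "b0_2": 2.0, "b1": 0.25, "b2": 0.25, "b3": 0.1,
    "g0": 4.0, "g1": 0.5, "g2": 0.2,
    "k0": -1.27, "k1": 0.4, "k2": 0.1,
}


def two_part_m_effects(x0, x1, c, p):
    """Closed-form effects of the two-part M model (last lines of Eq. 39-43)."""
    phi0 = Phi(p["k0"] + p["k1"] * x0 + p["k2"] * c)   # Pr(M > 0 | X = x0)
    phi1 = Phi(p["k0"] + p["k1"] * x1 + p["k2"] * c)   # Pr(M > 0 | X = x1)
    mu0 = p["g0"] + p["g1"] * x0 + p["g2"] * c         # E[log M | M > 0, X = x0]
    mu1 = p["g0"] + p["g1"] * x1 + p["g2"] * c         # E[log M | M > 0, X = x1]
    d_int = p["b0_1"] - p["b0_2"]                      # intercept difference
    dx = x1 - x0
    shift = phi1 * mu1 - phi0 * mu0
    return {
        "TNIE": d_int * (phi1 - phi0) + (p["b1"] + p["b3"] * x1) * shift,  # Eq. 39
        "PNDE": p["b2"] * dx + phi0 * mu0 * p["b3"] * dx,                  # Eq. 40
        "PNIE": d_int * (phi1 - phi0) + (p["b1"] + p["b3"] * x0) * shift,  # Eq. 41
        "TNDE": p["b2"] * dx + phi1 * mu1 * p["b3"] * dx,                  # Eq. 42
        "TE": (d_int * (phi1 - phi0) + p["b2"] * dx                        # Eq. 43
               + phi1 * (p["b1"] + p["b3"] * x1) * mu1
               - phi0 * (p["b1"] + p["b3"] * x0) * mu0),
    }


if __name__ == "__main__":
    for name, value in two_part_m_effects(5.0, 6.0, 0.0, PARAMS).items():
        print(name, round(value, 6))
```

Without the interaction (beta_3 = 0), both direct effects reduce to beta_2 (x1 - x0), which is a quick sanity check on any implementation.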
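B Appendix - Derivation - Two-part M, two-part Y
Models

\begin{equation}
\begin{aligned}
\log(Y_i \mid Y_i > 0) &= \beta_0^{(1)} + \beta_1 \log(M_i) + \beta_2^{(1)} X_i + \beta_3 \log(M_i) X_i + \beta_4^{(1)} C_i + \varepsilon_{y_i}, && M_i > 0 \\
\operatorname{probit}(\Pr(Y_i > 0)) &= \theta_0^{(1)} + \theta_1 \log(M_i) + \theta_2^{(1)} X_i + \theta_3 \log(M_i) X_i + \theta_4^{(1)} C_i, && M_i > 0 \\
\log(Y_i \mid Y_i > 0) &= \beta_0^{(2)} + \beta_2^{(2)} X_i + \beta_4^{(2)} C_i + \varepsilon_{y_i}, && M_i = 0 \\
\operatorname{probit}(\Pr(Y_i > 0)) &= \theta_0^{(2)} + \theta_2^{(2)} X_i + \theta_4^{(2)} C_i, && M_i = 0 \\
\log(M_i \mid M_i > 0) &= \gamma_0 + \gamma_1 X_i + \gamma_2 C_i + \varepsilon_{m_i} \\
\operatorname{probit}(\Pr(M_i > 0)) &= \kappa_0 + \kappa_1 X_i + \kappa_2 C_i
\end{aligned}
\tag{45}
\end{equation}

where, by assumption, $\varepsilon_{y_i} \sim N(0, \sigma_y^2)$ and $\varepsilon_{m_i} \sim N(0, \sigma_m^2)$. This is a combination of a two-part model for M and a two-part model for Y, together with
a two-group model for Y.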
The expected values needed to define the causal effects are of the kind displayed in Equation 46. The terms where Y = 0 are not included in the
expression since they are zero regardless of the corresponding probability. Since the expected value is a sum, the terms can be calculated
separately.

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] ={}& P(M = 0 \mid X = x_0, C = c) \times P(Y > 0 \mid X = x_1, M = 0, C = c) \times E[Y \mid Y > 0, X = x_1, M = 0, C = c] \\
&+ \int_{-\infty}^{\infty} P(M > 0 \mid X = x_0, C = c) \times P(Y > 0 \mid X = x_1, M = m, C = c) \\
&\qquad \times f(\log(M) \mid M > 0, X = x_0, C = c) \times E[Y \mid Y > 0, X = x_1, M = m, C = c] \,\partial \log(M)
\end{aligned}
\tag{46}
\end{equation}
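First term

\begin{equation}
\begin{aligned}
I &= E[Y \mid Y > 0, X = x_1, M = 0, C = c] \times P(M = 0 \mid X = x_0, C = c) \times P(Y > 0 \mid X = x_1, M = 0, C = c) \\
&= \phi \times \exp\left(\beta_0^{(2)} + \beta_2^{(2)} x_1 + \beta_4^{(2)} c\right) \times \Phi\left(\theta_0^{(2)} + \theta_2^{(2)} x_1 + \theta_4^{(2)} c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right)
\end{aligned}
\tag{47}
\end{equation}

where $\phi = \exp\left(\sigma_{\varepsilon_y}^2 / 2\right)$, from the definition of the expected value of a lognormal distribution (Aitchison and Brown, 1966).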
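The lognormal factor exp(sigma^2/2) used for the expected value of Y given Y > 0 can be confirmed numerically. The sketch below integrates exp(e) against a normal density with a simple trapezoid rule; the function names and the variance value 0.3 are illustrative assumptions.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def trapezoid(f, lo, hi, n=20_000):
    """Composite trapezoid rule for f on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def lognormal_factor_numeric(var_y):
    """Numerical value of E[exp(eps)] for eps ~ N(0, var_y)."""
    sd = math.sqrt(var_y)
    return trapezoid(lambda e: math.exp(e) * normal_pdf(e, 0.0, var_y),
                     -12.0 * sd, 12.0 * sd)


def lognormal_factor_closed(var_y):
    """Closed form exp(var_y / 2) from the lognormal mean."""
    return math.exp(var_y / 2.0)


if __name__ == "__main__":
    print(lognormal_factor_numeric(0.3), lognormal_factor_closed(0.3))
```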
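Second term

\begin{equation}
\begin{aligned}
II ={}& \int_{-\infty}^{\infty} E[Y \mid Y > 0, X = x_1, M = m, C = c] \times P(Y > 0 \mid X = x_1, M = m, C = c) \\
&\qquad \times P(M > 0 \mid X = x_0, C = c) \times \underbrace{f(\log(M) \mid M > 0, X = x_0, C = c)}_{\text{Normal density by assumption}} \,\partial \log(M) \\
={}& \int_{-\infty}^{\infty} \phi \times \exp\left(\beta_0^{(1)} + \beta_1 \log(m) + \beta_2^{(1)} x_1 + \beta_3 \log(m) x_1 + \beta_4^{(1)} c\right) \\
&\qquad \times \Phi\left(\theta_0^{(1)} + \theta_1 \log(m) + \theta_2^{(1)} x_1 + \theta_3 \log(m) x_1 + \theta_4^{(1)} c\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\qquad \times f(\log(M); \underbrace{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}_{=\mu_M}, \sigma_M^2) \,\partial \log(M) \\
={}& \phi \times \exp\left(\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\times \int_{-\infty}^{\infty} \underbrace{\exp\left(\beta_1 \log(m) + \beta_3 \log(m) x_1\right) \times f(\log(M); \mu_M, \sigma_M^2)}_{\text{Part 1}} \\
&\qquad \times \Phi\left(\theta_0^{(1)} + \theta_1 \log(m) + \theta_2^{(1)} x_1 + \theta_3 \log(m) x_1 + \theta_4^{(1)} c\right) \,\partial \log(M)
\end{aligned}
\tag{48}
\end{equation}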
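Again $\phi = \exp\left(\sigma_{\varepsilon_y}^2 / 2\right)$ (Aitchison and Brown, 1966). Note that $\mu_M$ and $\sigma_M^2$ are short for $\mu_{\log(M)}$ and $\sigma_{\log(M)}^2$, respectively. A closer look at Part 1, expanding the density function of log(M), gives

\begin{equation}
\begin{aligned}
\text{Part 1} ={}& \exp\left(\log(m)(\beta_1 + \beta_3 x_1)\right) \times (2\pi\sigma_M^2)^{-\frac{1}{2}} \exp\left(-\frac{1}{2\sigma_M^2}(\log(m) - \mu_M)^2\right) \\
={}& (2\pi\sigma_M^2)^{-\frac{1}{2}} \times \exp\left(-\frac{1}{2\sigma_M^2}(\log(m) - \mu_M)^2 + \log(m)(\beta_1 + \beta_3 x_1)\right) \\
={}& (2\pi\sigma_M^2)^{-\frac{1}{2}} \times \exp\left(-\frac{\log(m)^2}{2\sigma_M^2} + \frac{2\log(m)\mu_M}{2\sigma_M^2} - \frac{\mu_M^2}{2\sigma_M^2} + \log(m)(\beta_1 + \beta_3 x_1)\frac{2\sigma_M^2\mu_M}{2\sigma_M^2\mu_M}\right) \\
={}& (2\pi\sigma_M^2)^{-\frac{1}{2}} \times \exp\left(-\frac{\log(m)^2}{2\sigma_M^2} + \frac{2\log(m)\mu_M}{2\sigma_M^2}\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\mu_M}\right) - \frac{\mu_M^2}{2\sigma_M^2}\right) \\
={}& (2\pi\sigma_M^2)^{-\frac{1}{2}} \times \exp\Biggl(-\frac{\log(m)^2}{2\sigma_M^2} + \frac{2\log(m)\mu_M}{2\sigma_M^2}\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\mu_M}\right) - \frac{\mu_M^2}{2\sigma_M^2} \\
&\qquad - \frac{\mu_M^2}{2\sigma_M^2}\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\mu_M}\right)^2 + \frac{\mu_M^2}{2\sigma_M^2}\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\mu_M}\right)^2\Biggr)
\end{aligned}
\tag{49}
\end{equation}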
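Setting $b = \left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\mu_M}\right)$ gives

\begin{equation}
\text{Part 1} = \exp\left(\frac{\mu_M^2}{2\sigma_M^2}\left(b^2 - 1\right)\right) \times \underbrace{(2\pi\sigma_M^2)^{-\frac{1}{2}} \times \exp\left(-\frac{1}{2\sigma_M^2}(\log(m) - b\mu_M)^2\right)}_{\text{Normal density}}
\tag{50}
\end{equation}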
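The completing-the-square step behind Equation 50 can be checked pointwise. The sketch below, with hypothetical values for the slope a = beta_1 + beta_3 x_1, the mean and the variance, evaluates Part 1 before and after the rewrite. It also assumes that the mean of log(M) is non-zero, which the definition of b requires.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def part1_direct(log_m, a, mu, var):
    """exp(a * log m) times the N(mu, var) density, as in the first line of Eq. 49."""
    return math.exp(a * log_m) * normal_pdf(log_m, mu, var)


def part1_completed(log_m, a, mu, var):
    """Equation 50: a constant times the N(b * mu, var) density (requires mu != 0)."""
    b = 1.0 + a * var / mu
    constant = math.exp(mu ** 2 / (2.0 * var) * (b ** 2 - 1.0))
    return constant * normal_pdf(log_m, b * mu, var)


if __name__ == "__main__":
    a, mu, var = 0.85, 6.5, 1.2   # illustrative values only
    for log_m in (3.0, 6.5, 9.0):
        print(part1_direct(log_m, a, mu, var), part1_completed(log_m, a, mu, var))
```

The constant in front of the density equals exp(a mu + a^2 sigma^2 / 2), which is a form that does not involve division by the mean.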
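Part 1 is inserted in II

\begin{equation}
\begin{aligned}
II ={}& \phi \times \exp\left(\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \exp\left(\frac{\mu_M^2}{2\sigma_M^2}\left(b^2 - 1\right)\right) \\
&\times \underbrace{\int_{-\infty}^{\infty} \Phi\left(\theta_0^{(1)} + \theta_1 \log(m) + \theta_2^{(1)} x_1 + \theta_3 \log(m) x_1 + \theta_4^{(1)} c\right) \times f(\log(M); b\mu_M, \sigma_M^2) \,\partial \log(M)}_{\text{Part 2}}
\end{aligned}
\tag{51}
\end{equation}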
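Part 2 is expanded

\begin{equation}
\begin{aligned}
\text{Part 2} ={}& \int_{-\infty}^{\infty} \Phi\left(\theta_0^{(1)} + \theta_1 \log(m) + \theta_2^{(1)} x_1 + \theta_3 \log(m) x_1 + \theta_4^{(1)} c\right) \times f(\log(M); b\mu_M, \sigma_M^2) \,\partial \log(M) \\
={}& \int_{-\infty}^{\infty} \int_{-\infty}^{\theta_0^{(1)} + \theta_1 \log(m) + \theta_2^{(1)} x_1 + \theta_3 \log(m) x_1 + \theta_4^{(1)} c} f(Z; 0, 1) \,\partial Z \times f(\log(M); b\mu_M, \sigma_M^2) \,\partial \log(M) \\
={}& \int_{-\infty}^{\infty} \int_{-\infty}^{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c + \log(m)(\theta_1 + \theta_3 x_1)} f(Z; 0, 1) \,\partial Z \times f(\log(M); b\mu_M, \sigma_M^2) \,\partial \log(M)
\end{aligned}
\tag{52}
\end{equation}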
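By a change of variables in the inner integral

\begin{equation}
\text{Part 2} = \int_{-\infty}^{\infty} \int_{-\infty}^{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c} f\left(Z \mid \log(M); -\log(m)(\theta_1 + \theta_3 x_1), 1\right) \partial Z \times f\left(\log(M); b\mu_M, \sigma_M^2\right) \partial \log(M)
\tag{53}
\end{equation}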
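Since the limits of integration are no longer functions of M or Z, the order of integration may be changed

\begin{equation}
\text{Part 2} = \int_{-\infty}^{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c} \int_{-\infty}^{\infty} f\left(Z \mid \log(M); -\log(m)(\theta_1 + \theta_3 x_1), 1\right) \times f\left(\log(M); b\mu_M, \sigma_M^2\right) \partial \log(M) \,\partial Z
\tag{54}
\end{equation}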
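By the appendix in Muthen (1979), the inner integral can be simplified further

\begin{equation}
\text{Part 2} = \int_{-\infty}^{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c} f\Bigl(Z; \underbrace{-(\theta_1 + \theta_3 x_1) b\mu_M}_{\text{Mean}}, \underbrace{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}_{\text{Variance}}\Bigr) \partial Z
\tag{55}
\end{equation}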
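Standardizing Z, again by a change of variables, gives

\begin{equation}
\text{Part 2} = \int_{-\infty}^{\frac{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}} f(Z; 0, 1) \,\partial Z
= \Phi\left(\frac{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right)
\tag{56}
\end{equation}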
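The result of Equations 52 to 56 is the standard identity for the expected value of a probit function of a normal variable. A minimal numerical check, using a plain trapezoid rule and hypothetical values for the intercept t, the slope d, and the mean and variance of log(M), could look like this:

```python
import math


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def expected_probit_numeric(t, d, mean, var, n=20_000):
    """Trapezoid approximation of the integral of Phi(t + d*l) * N(l; mean, var)."""
    sd = math.sqrt(var)
    lo, hi = mean - 10.0 * sd, mean + 10.0 * sd
    h = (hi - lo) / n

    def integrand(l):
        return Phi(t + d * l) * normal_pdf(l, mean, var)

    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, n):
        total += integrand(lo + i * h)
    return total * h


def expected_probit_closed(t, d, mean, var):
    """Closed form of Equation 56, with mean playing the role of b * mu_M."""
    return Phi((t + d * mean) / math.sqrt(d * d * var + 1.0))


if __name__ == "__main__":
    print(expected_probit_numeric(-3.3, 0.35, 7.0, 1.0),
          expected_probit_closed(-3.3, 0.35, 7.0, 1.0))
```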
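Part 2 is inserted into II

\begin{equation}
\begin{aligned}
II ={}& \phi \times \exp\left(\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \exp\left(\frac{\mu_M^2}{2\sigma_M^2}\left(b^2 - 1\right)\right) \\
&\times \Phi\left(\frac{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \\
={}& \phi \times \exp\left(\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c + \frac{\mu_M^2}{2\sigma_M^2}\left(b^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\times \Phi\left(\frac{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right)
\end{aligned}
\tag{57}
\end{equation}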
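Adding I and II together yields

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] ={}& \phi \times \exp\left(\beta_0^{(2)} + \beta_2^{(2)} x_1 + \beta_4^{(2)} c\right) \times \Phi\left(\theta_0^{(2)} + \theta_2^{(2)} x_1 + \theta_4^{(2)} c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \phi \times \exp\left(\beta_0^{(1)} + \beta_2^{(1)} x_1 + \beta_4^{(1)} c + \frac{\mu_M^2}{2\sigma_M^2}\left(b^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\qquad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right)
\end{aligned}
\tag{58}
\end{equation}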
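Common parameters
Set $\beta_2^{(1)} = \beta_2^{(2)}$ and $\beta_4^{(1)} = \beta_4^{(2)}$. That is, the slopes of Y on X and on the covariate C are assumed to be the same for both groups of M. If that is not
desired, the expression above can be used to calculate the effects. This restriction gives the simplified expression

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] ={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \Phi\left(\theta_0^{(2)} + \theta_2^{(2)} x_1 + \theta_4^{(2)} c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2^{(1)} x_1 + \theta_4^{(1)} c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{59}
\end{equation}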
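Similar restrictions are made in this derivation to the coefficients of the probit model, that is, $\theta_2^{(1)} = \theta_2^{(2)}$ and $\theta_4^{(1)} = \theta_4^{(2)}$. Again, if these restrictions
are not desired, the above expression can be used to calculate the effects. These restrictions give

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] ={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{60}
\end{equation}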
Although not as easy to see as for the two-part M derivation, this again reduces to the usual counterfactual-based effects when Pr(M>0) and
Pr(Y>0) get close to 1.
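To make this limit more concrete, the short derivation below writes $a = \beta_1 + \beta_3 x_1$ (a shorthand introduced here only for illustration) and shows what remains of the expected value once both probabilities approach one. The limiting expression is the mean of a log-linear outcome with a normally distributed log-mediator, that is, the case without point masses.

```latex
\begin{aligned}
b &= 1 + \frac{a\,\sigma_M^2}{\mu_M}
\quad\Longrightarrow\quad
\frac{\mu_M^2}{2\sigma_M^2}\left(b^2 - 1\right)
  = \frac{\mu_M^2}{2\sigma_M^2}\left(\frac{2a\,\sigma_M^2}{\mu_M} + \frac{a^2\sigma_M^4}{\mu_M^2}\right)
  = a\,\mu_M + \frac{a^2\sigma_M^2}{2}, \\[4pt]
&\Pr(M > 0) \to 1 \ \text{and}\ \Pr(Y > 0) \to 1: \\
E[Y(x_1, \log(M(x_0)))] &\to
  \phi \times \exp\left(\beta_0^{(1)} + \beta_2 x_1 + \beta_4 c + a\,\mu_M + \frac{a^2\sigma_M^2}{2}\right),
  \qquad \mu_M = \gamma_0 + \gamma_1 x_0 + \gamma_2 c .
\end{aligned}
```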
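The conditional expectations
Four conditional expected values are needed to define all causal effects. The one derived above was chosen because Y and M are conditioned on
different x-values, which makes the generalization easier at the next step.

\begin{equation}
\begin{aligned}
E[Y(x_0, \log(M(x_0)))] ={}& \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{61}
\end{equation}

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_1)))] ={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{62}
\end{equation}

\begin{equation}
\begin{aligned}
E[Y(x_0, \log(M(x_1)))] ={}& \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{63}
\end{equation}

\begin{equation}
\begin{aligned}
E[Y(x_1, \log(M(x_0)))] ={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{64}
\end{equation}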
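Causal effects
For convenience, some notation is set up for defining the counterfactual-based effects. E[Y(x1, log(M(x0)))] is to be understood as the
expected value of Y given that Y is conditioned on x1 and log(M) is conditioned on x0.
At the end of each expression b and µ are substituted with their full expressions; note that both are functions of X and thus differ from case
to case.

\begin{equation}
\begin{aligned}
(b \mid x_1, M(x_0)) &= 1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c} \\
(b \mid x_1, M(x_1)) &= 1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c} \\
(b \mid x_0, M(x_0)) &= 1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c} \\
(b \mid x_0, M(x_1)) &= 1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}
\end{aligned}
\tag{65}
\end{equation}

\begin{equation}
\begin{aligned}
(\mu_M \mid M(x_1)) &= \gamma_0 + \gamma_1 x_1 + \gamma_2 c \\
(\mu_M \mid M(x_0)) &= \gamma_0 + \gamma_1 x_0 + \gamma_2 c
\end{aligned}
\tag{66}
\end{equation}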
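The four conditional expectations and the case-specific values of b and µ can be sketched in code as follows. This is an illustration under assumed parameter values: the dictionary keys and function names are hypothetical, and the common-slope restrictions on the beta and theta coefficients are imposed. The function evaluates the expected value for any pair of x-values and then forms the five effects as differences.

```python
import math


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical parameter values, chosen only for illustration.
PARAMS = {
    "b0_1": 2.0, "b0_2": 1.5, "b1": 0.25, "b2": 0.25, "b3": 0.05, "b4": 0.1,
    "t0_1": -3.0, "t0_2": -2.5, "t1": 0.35, "t2": 0.32, "t3": 0.02, "t4": 0.1,
    "g0": 4.0, "g1": 0.5, "g2": 0.2,
    "k0": -1.27, "k1": 0.4, "k2": 0.1,
    "s2_y": 0.3, "s2_m": 1.0,   # residual variances of log(Y) and log(M)
}


def expected_y(x_y, x_m, c, p):
    """E[Y(x_y, log M(x_m))] with common slopes, b and mu as in Eq. 65 and 66."""
    mu = p["g0"] + p["g1"] * x_m + p["g2"] * c          # requires mu != 0
    a = p["b1"] + p["b3"] * x_y
    d = p["t1"] + p["t3"] * x_y
    b = 1.0 + a * p["s2_m"] / mu
    phi = math.exp(p["s2_y"] / 2.0)                     # lognormal factor
    pr_m = Phi(p["k0"] + p["k1"] * x_m + p["k2"] * c)   # Pr(M > 0)
    zero_part = (math.exp(p["b0_2"])
                 * Phi(p["t0_2"] + p["t2"] * x_y + p["t4"] * c) * (1.0 - pr_m))
    pos_part = (math.exp(p["b0_1"] + mu ** 2 / (2.0 * p["s2_m"]) * (b ** 2 - 1.0))
                * pr_m
                * Phi((p["t0_1"] + p["t2"] * x_y + p["t4"] * c + d * b * mu)
                      / math.sqrt(d ** 2 * p["s2_m"] + 1.0)))
    return phi * math.exp(p["b2"] * x_y + p["b4"] * c) * (zero_part + pos_part)


def effects(x0, x1, c, p):
    """The five counterfactual effects as differences of expected values."""
    e11, e10 = expected_y(x1, x1, c, p), expected_y(x1, x0, c, p)
    e01, e00 = expected_y(x0, x1, c, p), expected_y(x0, x0, c, p)
    return {"TNIE": e11 - e10, "PNDE": e10 - e00, "PNIE": e01 - e00,
            "TNDE": e11 - e01, "TE": e11 - e00}


if __name__ == "__main__":
    for name, value in effects(5.0, 6.0, 0.0, PARAMS).items():
        print(name, round(value, 4))
```

Because every effect is a difference of the same four expected values, the decomposition of the total effect into TNIE and PNDE holds by construction in such an implementation.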
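The Total Natural Indirect Effect

\begin{equation}
\begin{aligned}
TNIE ={}& E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_1, \log(M(x_0))) \mid C = c] \\
={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{67}
\end{equation}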
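The Pure Natural Direct Effect

\begin{equation}
\begin{aligned}
PNDE ={}& E[Y(x_1, \log(M(x_0))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{68}
\end{equation}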
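The Pure Natural Indirect Effect

\begin{equation}
\begin{aligned}
PNIE ={}& E[Y(x_0, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
={}& \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
={}& \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{69}
\end{equation}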
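The Total Natural Direct Effect

\begin{equation}
\begin{aligned}
TNDE ={}& E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_1))) \mid C = c] \\
={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{70}
\end{equation}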
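The Total effect

\begin{equation}
\begin{aligned}
TE ={}& E[Y(x_1, \log(M(x_1))) \mid C = c] - E[Y(x_0, \log(M(x_0))) \mid C = c] \\
={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{\mu_M^2}{2\sigma_M^2}(b^2 - 1)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0) b\mu_M}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
={}& \phi \times \exp(\beta_2 x_1 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_1 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_1 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_1 + \theta_4 c + (\theta_1 + \theta_3 x_1)\left(1 + \frac{(\beta_1 + \beta_3 x_1)\sigma_M^2}{\gamma_0 + \gamma_1 x_1 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_1 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_1)^2 \sigma_M^2 + 1}}\right) \Biggr] \\
&- \phi \times \exp(\beta_2 x_0 + \beta_4 c) \times \Biggl[ \exp\left(\beta_0^{(2)}\right) \times \Phi\left(\theta_0^{(2)} + \theta_2 x_0 + \theta_4 c\right) \times \left(1 - \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c)\right) \\
&+ \exp\left(\beta_0^{(1)} + \frac{(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)^2}{2\sigma_M^2}\left(\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)^2 - 1\right)\right) \times \Phi(\kappa_0 + \kappa_1 x_0 + \kappa_2 c) \\
&\quad \times \Phi\left(\frac{\theta_0^{(1)} + \theta_2 x_0 + \theta_4 c + (\theta_1 + \theta_3 x_0)\left(1 + \frac{(\beta_1 + \beta_3 x_0)\sigma_M^2}{\gamma_0 + \gamma_1 x_0 + \gamma_2 c}\right)(\gamma_0 + \gamma_1 x_0 + \gamma_2 c)}{\sqrt{(\theta_1 + \theta_3 x_0)^2 \sigma_M^2 + 1}}\right) \Biggr]
\end{aligned}
\tag{71}
\end{equation}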
C Appendix - Mplus syntax
In the subsections below, some examples of input syntax for Mplus are displayed. The
examples are arbitrarily chosen from the 192 runs that were performed for this study.
The code for the runs not displayed here has the same structure, only with different
parameter values for the true models.
Internal Monte Carlo simulation Mplus syntax
Input syntax for the internal Monte Carlo simulation of the weak effects model. All
generated variables are independent normal variables.

MONTECARLO:
    NAMES ARE X Mres1 Zres Yres;
    NREPS = 1000;
    NOBS = 100;
    REPSAVE = ALL;
    SAVE = twopartm.100*.dat;
    SEED = 11;

MODEL POPULATION:
    [X@5]; X@1;
    [Mres1@0]; Mres1@1;
    [Zres@0]; Zres@1;
    [Yres@0]; Yres@0.3;

MODEL:
    [X@5]; X@1;
    [Mres1@0]; Mres1@1;
    [Zres@0]; Zres@1;
    [Yres@0]; Yres@0.3;
External Monte Carlo simulation, two-part M two-part Y, syntax
Input syntax for the external Monte Carlo simulation of the weak effects model. The
true model is two-part M, two-part Y. The effects are estimated with a two-part M,
two-part Y model. The censoring amount is 25%.

DATA:
    FILE = twopartMandY.100list.dat;
    TYPE = MONTECARLO;

VARIABLE:
    NAMES = X Mres Zres1 Zres2 Yres;
    USEVAR = X Zm Zy logYpos logMpos;
    CATEGORICAL = Zm Zy;

DEFINE:
    M = exp(4 + 0.5*X + Mres);
    Zstar1 = 0.4*X + Zres1;
    IF (Zstar1 <= 1.273552) THEN Zm_true = 0;
    IF (Zstar1 > 1.273552) THEN Zm_true = 1;
    logM = log(M);
    Zstar2 = 0.3*Zm_true + 0.35*logM + 0.32*X + Zres2;
    Y = exp(2 + 0.55*Zm_true + 0.25*logM + 0.25*X + Yres);
    logY = log(Y);
    IF (Zstar2 <= 3.313404) THEN logY = 0;
    IF (Zstar1 <= 1.273552) THEN logM = 0;
    IF (logM <= 0) THEN Zm = 0;
    IF (logM > 0) THEN Zm = 1;
    IF (logM > 0) THEN logMpos = logM;
    IF (logY <= 0) THEN Zy = 0;
    IF (logY > 0) THEN Zy = 1;
    IF (logY > 0) THEN logYpos = logY;

ANALYSIS:
    ESTIMATOR = ML;
    LINK = PROBIT;
    INTEGRATION = MONTECARLO;
    PROCESSORS = 2;

MODEL:
    logMpos ON X*0.5 (gamma1);
    [logMpos*4] (gamma0);
    logMpos (sigM);

    Zm ON X*0.4 (kappa1);
    [Zm$1*1.273552] (kappa0);

    logYpos ON Zm*0.55 (b0diff)
               logMpos*0.25 (bet1)
               X*0.25 (bet2);
    [logYpos*2] (bet01);
    logYpos*1 (sigY);

    Zy ON Zm*0.3 (thet0diff)
          logMpos*0.35 (thet1)
          X*0.32 (thet2);
    [Zy$1*3.313404] (thet01);

MODEL CONSTRAINT:
    NEW(bet02 bet3 thet02 thet3 b11 b00 b01 b10 mu0 mu1 x0 x1
        e00 e11 e01 e10 fi bet30 bet31 Pk0 Pk1 sq0 sq1 Pt020 Pt021
        TNIE*43.75021
        PNDE*40.40601
        PNIE*33.08004
        TNID*51.07618
        TE*84.15622
        CONTROL*84.15622);
    x1 = 6;
    x0 = 5;
    bet3 = 0;
    thet3 = 0;
    bet02 = b0diff + bet01;
    thet02 = thet0diff + (-thet01);
    fi = EXP(sigY/2);
    mu0 = (gamma0 + gamma1*x0);
    mu1 = (gamma0 + gamma1*x1);
    ! Mplus estimates thresholds, so the probit intercept is minus the threshold
    Pk0 = PHI(-kappa0 + kappa1*x0);
    Pk1 = PHI(-kappa0 + kappa1*x1);
    Pt020 = PHI(thet02 + thet2*x0);
    Pt021 = PHI(thet02 + thet2*x1);
    sq0 = SQRT((thet1 + thet3*x0)^2*sigM + 1);
    sq1 = SQRT((thet1 + thet3*x1)^2*sigM + 1);
    bet30 = (bet1 + bet3*x0);
    bet31 = (bet1 + bet3*x1);
    b00 = (1 + bet30*sigM/mu0);
    b11 = (1 + bet31*sigM/mu1);
    b10 = (1 + bet31*sigM/mu0);
    b01 = (1 + bet30*sigM/mu1);

    ! Expected values; mu^2/(2*sigM) follows the exponent in Equation 58
    e00 = fi*EXP(bet2*x0)*(
          EXP(bet02)*Pt020*(1-Pk0)
          + EXP(bet01 + (mu0^2/(2*sigM))*(b00^2-1))*Pk0
          *PHI((-thet01 + thet2*x0 + (thet1 + thet3*x0)*b00*mu0)/sq0));

    e11 = fi*EXP(bet2*x1)*(
          EXP(bet02)*Pt021*(1-Pk1)
          + EXP(bet01 + (mu1^2/(2*sigM))*(b11^2-1))*Pk1
          *PHI((-thet01 + thet2*x1 + (thet1 + thet3*x1)*b11*mu1)/sq1));

    e10 = fi*EXP(bet2*x1)*(
          EXP(bet02)*Pt021*(1-Pk0)
          + EXP(bet01 + (mu0^2/(2*sigM))*(b10^2-1))*Pk0
          *PHI((-thet01 + thet2*x1 + (thet1 + thet3*x1)*b10*mu0)/sq1));

    e01 = fi*EXP(bet2*x0)*(
          EXP(bet02)*Pt020*(1-Pk1)
          + EXP(bet01 + (mu1^2/(2*sigM))*(b01^2-1))*Pk1
          *PHI((-thet01 + thet2*x0 + (thet1 + thet3*x0)*b01*mu1)/sq0));

    TNIE = e11 - e10;
    PNDE = e10 - e00;
    PNIE = e01 - e00;
    TNID = e11 - e01;
    TE = e11 - e00;
    CONTROL = TNIE + PNDE;
External Monte Carlo simulation, classical, syntax
Input syntax for the external Monte Carlo simulation of the weak effects model. The true
model is two-part M. The effects are estimated with a classical model, without accounting
for the censored M. The censoring amount is 25%.

DATA:
    FILE = twopartm.100list.dat;
    TYPE = MONTECARLO;

VARIABLE:
    NAMES = X Mres1 Zres Yres;
    USEVAR = X Y M;

DEFINE:
    M = 4 + 0.5*X + Mres1;
    Zstar = 0.4*X + Zres;
    IF (Zstar <= 1.273552) THEN Ztrue = 0;
    IF (Zstar > 1.273552) THEN Ztrue = 1;
    Y = 2 + 0.5*Ztrue + 0.25*M + 0.25*X + Yres;
    IF (Zstar <= 1.273552) THEN M = 0;

ANALYSIS:
    ESTIMATOR = ML;

MODEL:
    M ON X*0.5 (gamma1);
    [M*4] (gamma0);
    M*1;

    Y ON M*0.25 (bet1)
         X*0.25 (bet2);
    [Y*2] (bet01);
    Y*1;

MODEL CONSTRAINT:
    NEW(x1 x0 e00 e01 e10 e11 bet02 bet3
        TNIE*0.2255199
        PNDE*0.25
        PNIE*0.2255199
        TNID*0.25
        TE*0.4755199
        CONTROL*0.4755199);
    bet02 = 0;
    x1 = 6;
    x0 = 5;
    bet3 = 0;
    e00 = bet01 + bet1*(gamma0 + gamma1*x0) + bet2*x0;
    e11 = bet01 + bet1*(gamma0 + gamma1*x1) + bet2*x1;
    e01 = bet01 + bet1*(gamma0 + gamma1*x1) + bet2*x0;
    e10 = bet01 + bet1*(gamma0 + gamma1*x0) + bet2*x1;
    TNIE = e11 - e10;
    PNDE = e10 - e00;
    PNIE = e01 - e00;
    TNID = e11 - e01;
    TE = e11 - e00;
    CONTROL = TNIE + PNDE;
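The threshold 1.273552 in the DEFINE command can be related to the stated censoring amount. Under the population model of the internal simulation, X has mean 5 and variance 1 and Zres has variance 1, so Zstar = 0.4*X + Zres is normal with mean 2 and variance 1.16. The short check below, whose function names are illustrative, computes the implied share of censored observations.

```python
import math


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censoring_share(threshold, slope_x, mean_x, var_x, var_res):
    """Pr(Zstar <= threshold) for Zstar = slope_x * X + residual, all normal."""
    mean_z = slope_x * mean_x
    sd_z = math.sqrt(slope_x ** 2 * var_x + var_res)
    return Phi((threshold - mean_z) / sd_z)


if __name__ == "__main__":
    # Values taken from the MODEL POPULATION and DEFINE commands above.
    print(censoring_share(1.273552, 0.4, 5.0, 1.0, 1.0))
```

The result is very close to 0.25, in line with the 25% censoring stated for these runs.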
External Monte Carlo simulation, two-part M, syntax
Input syntax for the Monte Carlo simulation for the weak effects model. The true model
is two-part M. The effects are estimated with a two-part M model. The censoring amount
is 25%.

DATA:
    FILE = twopartm.100list.dat;
    TYPE = MONTECARLO;

VARIABLE:
    NAMES = X Mres1 Zres Yres;
    USEVAR = X Y Z Mpos;
    CATEGORICAL = Z;

DEFINE:
    M = 4 + 0.5*X + Mres1;
    Zstar = 0.4*X + Zres;
    IF (Zstar <= 1.273552) THEN Ztrue = 0;
    IF (Zstar > 1.273552) THEN Ztrue = 1;
    Y = 2 + 0.5*Ztrue + 0.25*M + 0.25*X + Yres;
    IF (Zstar <= 1.273552) THEN M = 0;
    IF (M <= 0) THEN Z = 0;
    IF (M > 0) THEN Z = 1;
    IF (M > 0) THEN Mpos = M;

ANALYSIS:
    ESTIMATOR = ML;
    LINK = PROBIT;

MODEL:
    Mpos ON X*0.5 (gamma1);
    [Mpos*4] (gamma0);
    Mpos*1;

    Z ON X*0.4 (kappa1);
    [Z$1*1.273552] (kappa0);

    Y ON Z*0.5 (b0diff)
         Mpos*0.25 (bet1)
         X*0.25 (bet2);
    [Y*2] (bet01);
    Y*0.3;
    Mpos;

MODEL CONSTRAINT:
    NEW(x1 x0 e00 e01 e10 e11 bet02 Pk0 Pk1 bet3
        TNIE*0.2255199
        PNDE*0.25
        PNIE*0.2255199
        TNID*0.25
        TE*0.4755199
        CONTROL*0.4755199);
    bet02 = b0diff + bet01;
    x1 = 6;
    x0 = 5;
    ! Mplus estimates thresholds, so the probit intercept is minus the threshold
    Pk0 = PHI(-kappa0 + kappa1*x0);
    Pk1 = PHI(-kappa0 + kappa1*x1);
    bet3 = 0;

    e00 = bet02 + (bet01 - bet02)*Pk0 + bet2*x0 +
          Pk0*(bet1 + bet3*x0)*(gamma0 + gamma1*x0);
    e11 = bet02 + (bet01 - bet02)*Pk1 + bet2*x1 +
          Pk1*(bet1 + bet3*x1)*(gamma0 + gamma1*x1);
    e01 = bet02 + (bet01 - bet02)*Pk1 + bet2*x0 +
          Pk1*(bet1 + bet3*x0)*(gamma0 + gamma1*x1);
    e10 = bet02 + (bet01 - bet02)*Pk0 + bet2*x1 +
          Pk0*(bet1 + bet3*x1)*(gamma0 + gamma1*x0);

    TNIE = e11 - e10;
    PNDE = e10 - e00;
    PNIE = e01 - e00;
    TNID = e11 - e01;
    TE = e11 - e00;
    CONTROL = TNIE + PNDE;