estimands, missing data, and sensitivity analysis …estimands, missing data, and sensitivity...

Estimands, Missing Data, and Sensitivity Analysis

Geert Molenberghs

[email protected] & [email protected]

Interuniversity Institute for Biostatistics and statistical Bioinformatics (I-BioStat)

Universiteit Hasselt & KU Leuven, Belgium

www.ibiostat.be

Interuniversity Institute for Biostatisticsand statistical Bioinformatics

Kyoto, Japan December 14, 2018

Survey Sampling 101

Survey population: The collection of units (individuals) about which the researcherwants to make quantitative statements.

Sample frame: The set of units (individuals) that has non-zero probability of beingselected.

Sample: The subset of units that have been selected.

Probability sampling: The family of probabilistic (stochastic) methods by which asubset of the units from the sample frame is selected.

Design properties: The entire collection of methodological aspects that leads to theselection of a sample.

Estimands, Missing Data, and Sensitivity Analysis 1

Sample size: The number of units in the sample.

Analysis and inference: The collection of statistical techniques by which populationestimands are estimated.

Examples: estimation of means, averages, totals, linear regression, ANOVA, logisticregression, loglinear models.

Estimand: The true population quantity (e.g., the average body mass index of the USpopulation).

Estimator: A (stochastic) function of the sample data, with the aim to “come close” tothe estimand.

Estimate: A particular realization of the estimator, for the particular sample taken (e.g.,22.37).


Your M.o.t.R. Clinical Trial

• Setting:

Potential outcomes (T0j, T1j)

Individual treatment effect ∆Tj = T1j − T0j

Expected treatment effect β = E(T1j − T0j)

• No missing data =⇒ 50% of missing data

• Fair to say: Estimand is β = E(T1 − T0) in population

• Randomization: Treatment effect estimable from observed data:

• Estimator: T1 − T0


• Information coming from:

. data

. design

. assumptions××××××

• Would be different in an epidemiological study


Surrogate Endpoints Evaluation:Potential Outcomes

Alonso, Van der Elst, Molenberghs (Statistical Modeling 2016)

• Setting:

Potential outcomes (T0j, T1j)

Individual causal effect ∆Tj = T1j − T0j

Expected causal effect β = E(T1j − T0j)

Surrogate Sj


• Predictive causal association:

ρψ = corr(∆Tj, Sj)

• (Un)identifiability:

ρT0T1not identifiable


. data

. design

. assumptions −→ sensitivity


⇒ Sensitivity analysis for age-related macular degeneration trial:

PCA

ρψ

Perc

enta

ge

0.00 0.01 0.02 0.03 0.04 0.05 0.06

0.0

0.2

0.4

0.6

0.8


Surrogate Endpoints Evaluation:Full Causal Paradigm

Alonso et al. (Biometrics 2015)

• Setting:

Treatment potential outcomes (T0j, T1j)

Treatment individual causal effect ∆Tj = T1j − T0j

Surrogate potential outcomes (S0j, S1j)

Surrogate individual causal effect ∆Sj = S1j − S0j

• Individual causal association (ICA):

ρ∆ = corr(∆Tj,∆Sj)


• Joint distribution unidentifiable

• Capture assumptions in causal diagrams → reduced forms of ρ∆


. data

. design

. assumptions −→ sensitivity

• Meta-analytic version in multiple trials


−10 −5 0 5 10

−20

−15

−10

−5

05

10

15

(a) Trial−level surrogacy

Treatment effect on the surrogate endpoint

Treatm

ent effect on the tru

e e

ndpoin

t

−30 −20 −10 0 10 20 30

−30

−20

−10

010

20

30

(b) Individual−level surrogacy

Results for the surrogate endpoint

Results for

the tru

e e

ndpoin

t

(c) MICA

ρM

Percen

tage

−0.2 0.0 0.2 0.4 0.6 0.8 1.0

0.00

0.05

0.10

0.15


Terms of Enrichment

Enriched data

Coarse data Augmented data

Incomplete data Randomized studies

Censored data Random effects

Joint models Latent classes

Grouped data Latent variables

Non-compliance Mixtures


Increasing Complexity

• Standard clinical trial: design compensates for what is unobserved

• Surrogacy: augmentation: sensitivity is design-based

• Incomplete data /non-compliance: coarsening:

sensitivity is (non-)observation-based

. (Subjective) choices unavoidable

. Interference of intercurrent events

. Scenarios needed about f (ymi |yoi ,xi, θ)

. Such scenarios should preserve estimand

. Easy and elegant with MI


The Slovenian Plebiscite

Rubin, Stern, and Vehovar (1995)

• Slovenian Public Opinion (SPO) Survey

• Four weeks prior to decisive plebiscite

• Three questions:

1. Are you in favor of Slovenian independence?

2. Are you in favor of Slovenia’s secession from Yugoslavia?

3. Will you attend the plebiscite?

• Political decision: ABSENCE≡NO

• Primary Estimand: θ: Proportion in favor of independence


• Slovenian Public Opinion Survey Data:

Independence

Secession Attendance Yes No ∗Yes Yes 1191 8 21

No 8 0 4

∗ 107 3 9

No Yes 158 68 29

No 7 14 3

∗ 18 43 31

∗ Yes 90 2 109

No 1 2 25

∗ 19 8 96


Slovenian Plebiscite ←→ Slovenian PublicOpinion Survey

θ =0.885

Estimator θ

Pessimistic bound 0.694

Optimistic bound 0.904

Complete cases 0.928 ?Available cases 0.929 ?MAR (2 questions) 0.892

MAR (3 questions) 0.883

MNAR 0.782


Rubin, 1976

• Ignorability: Rubin (Biometrika, 1976): 42 years ago!

• Little and Rubin (1976, 2002)

• Why did it take so long?


A Vicious Triangle

Industry

↗↙ ↘↖←−Academe Regulatory−→

• Academe: The R2 principle

• Regulatory: Rigid procedures ←→ scientific developments

• Industry: We cannot / do not want to apply new methods


Terminology & Confusion

• The Ministry of Disinformation:

←− All directions

Other directions −→

•MCAR, MAR, MNAR: “What do the terms mean?”

•MAR, random dropout, informative missingness, ignorable, censoring,. . .

• Dropout from the study, dropout from treatment, lost to follow up,. . .

• “Under MAR patients dropping out and patients not dropping out aresimilar.”


A Virtuous Triangle

Industry

↗↙ ↘↖←−Academe Regulatory−→

• FDA/Industry Workshops

• DIA/EMA Meetings

• The NAS Experience


A Virtuous Quadrangle

←−Industry Patients−→

l l

←−Academe Regulatory−→


The NAS Experience: A WholesomeProduct

• FDA −→ NAS −→ the working group

• Composition

• Encompassing:

. terminology/taxonomy/concepts

. prevention

. treatment


Taxonomy

•Missingness pattern: complete — monotone — non-monotone

• Dropout pattern: complete — dropout — intermittent

•Model framework: SEM — PMM — SPM

•Missingness mechanism: MCAR — MAR — MNAR

• Ignorability: ignorable — non-ignorable

• Inference paradigm: frequentist — likelihood — Bayes


The NAS Panel

Name Specialty Affiliation

Rod Little biostat U Michigan

Ralph D’Agostino biostat Boston U

Kay Dickerson epi Johns Hopkins

Scott Emerson biostat U Washington

John Farrar epi U Penn

Constantine Frangakis biostat Johns Hopkins

Joseph Hogan biostat Brown U

Geert Molenberghs biostat U Hasselt & K.U.Leuven

Susan Murphy stat U Michigan

James Neaton biostat U Minnesota

Andrea Rotnitzky stat Buenos Aires & Harvard

Dan Scharfstein biostat Johns Hopkins

Joseph Shih biostat New Jersey SPH

Jay Siegel biostast J&J

Hal Stern stat UC at Irvine


Every MNAR Model Has Got an MARCounterpart

Molenberghs, Beunckens, Sotto, and Kenward (JRSSB 2008)

Creemers, Hens, Aerts, Molenberghs, Verbeke, and Kenward (2008)

• Fit an MNAR model to a set of incomplete data

• Change the conditional distribution of the unobserved outcomes, given the observedones, to comply with MAR

• Resulting new model has exactly the same fit as the original MNAR model

• The missing data mechanism has changed

• This implies that definitively testing for MAR versus MNAR is not possible


Slovenian Public Opinion Survey: AnMNAR Model Family

Baker, Rosenberger, and DerSimonian (1992)

• Counts: Yr1r2jk

• Questions: j, k = 1, 2

• Non-response: r1, r2 = 0, 1

E(Y11jk) = mjk

E(Y10jk) = mjkβjk

E(Y01jk) = mjkαjk

E(Y00jk) = mjkαjkβjkγjk

. αjk: non-response on independence question

. βjk: non-response on attendance question

. γjk: interaction between both non-response indicators


SPO Survey: Interval of Ignorance &Counterpart

Model Structure d.f. loglik θ C.I. θMAR

BRD1 (α, β) 6 -2495.29 0.892 [0.878;0.906] 0.8920

BRD2 (α, βj) 7 -2467.43 0.884 [0.869;0.900] 0.8915

BRD3 (αk, β) 7 -2463.10 0.881 [0.866;0.897] 0.8915

BRD4 (α, βk) 7 -2467.43 0.765 [0.674;0.856] 0.8915

BRD5 (αj, β) 7 -2463.10 0.844 [0.806;0.882] 0.8915

BRD6 (αj, βj) 8 -2431.06 0.819 [0.788;0.849] 0.8919

BRD7 (αk, βk) 8 -2431.06 0.764 [0.697;0.832] 0.8919

BRD8 (αj, βk) 8 -2431.06 0.741 [0.657;0.826] 0.8919

BRD9 (αk, βj) 8 -2431.06 0.867 [0.851;0.884] 0.8919

Model 10 (αk, βjk) 9 -2431.06 [0.762;0.893] [0.744;0.907] 0.8919

Model 11 (αjk, βj) 9 -2431.06 [0.766;0.883] [0.715;0.920] 0.8919

Model 12 (αjk, βjk) 10 -2431.06 [0.694;0.904] 0.8919


Slovenian Public Opinion Survey:Incomplete Data

Observed ≡BRD7 ≡ BRD7(MAR) ≡BRD9 ≡ BRD9(MAR):

1439 78

16 16

159

32144 54 136

BRD1 ≡ BRD1(MAR):1381.6 101.7

24.2 41.4

182.9

8.1179.7 18.3 136.0

BRD2 ≡ BRD2(MAR):1402.2 108.9

15.6 22.3

159.0

32.0181.2 16.8 136.0


Slovenian Public Opinion Survey:Complete-data Prediction

BRD1 ≡ BRD1(MAR):1381.6 101.7

24.2 41.4

170.4 12.5

3.0 5.1

176.6 13.0

3.1 5.3

121.3 9.0

2.1 3.6

BRD2:1402.2 108.9

15.6 22.3

147.5 11.5

13.2 18.8

179.2 13.9

2.0 2.9

105.0 8.2

9.4 13.4

BRD2(MAR):1402.2 108.9

15.6 22.3

147.7 11.3

13.3 18.7

177.9 12.5

3.3 4.3

121.2 9.3

2.3 3.2

BRD7:1439 78

16 16

3.2 155.8

0.0 32.0

142.4 44.8

1.6 9.2

0.4 112.5

0.0 23.1

BRD9:1439 78

16 16

150.8 8.2

16.0 16.0

142.4 44.8

1.6 9.2

66.8 21.0

7.1 41.1

BRD7(MAR) ≡ BRD9(MAR):1439 78

16 18

148.1 10.9

11.8 20.2

141.5 38.4

2.5 15.6

121.3 9.0

2.1 3.6


Slovenian Public Opinion Survey:Collapsed (Marginalized) Predictions

BRD1 ≡ BRD1(MAR):1849.9 136.2

32.4 55.4=⇒ θ = 89.2%

BRD2:1833.9 142.5

40.2 57.5=⇒ θ = 88.4%

BRD2(MAR):1849.0 142.0

34.5 48.5=⇒ θ = 89.2%

BRD7:1585.0 391.1

17.6 80.3=⇒ θ = 76.4%

BRD9:1799.7 152.0

40.7 82.3=⇒ θ = 86.7%

BRD7(MAR) ≡ BRD9(MAR):1849.9 136.3

30.4 57.4=⇒ θ = 89.2%


Frameworks and Their Methods: Start

f (yi, ri|Xi,θ,ψ)

Selection Models: f (yi|Xi, θ) f (ri|Xi,yoi ,y

mi ,ψ)

MCAR/simple −→ MAR −→ MNAR

direct likelihood!

direct Bayesian!

multiple imputation (MI)!

IPW ⊃ W-GEE!

d.l. + IPW = double robustness!


Frameworks and Their Methods: Next

f (yi, ri|Xi,θ,ψ)

Selection Models: f (yi|Xi, θ) f (ri|Xi,yoi ,y

mi ,ψ)

MCAR/simple −→ MAR −→ MNAR

joint model!?

sensitivity analysis!

PMM

MI (MGK, J&J)

local influence

interval ignorance

IPW based


Multiple Imputation

• Valid under MAR

• An alternative to direct likelihood and WGEE

• Three steps:

1. The missing values are filled in M times =⇒ M complete data sets

2. The M complete data sets are analyzed by using standard procedures

3. The results from the M analyses are combined into a single inference

• Rubin (1987), Rubin and Schenker (1986), Little and Rubin (1987), Verbeke and Molenberghs (2000),

Molenberghs and Verbeke (2005), Molenberghs and Kenward (2007), Ratitch and O’Kelly (2011),

Carpenter and Kenward (2013), Mallinckrodt (2013), van Buuren (2014), . . .


• Multiple imputation (M = 5 imputations):

Observeddata

.................................................................................................................................................................................................................................................................................................................................................

........................................................................................

........................................................................................

..........................................

........................................

....................................................................................................................................................................................... ........................................

.......................................................................................................................................................................................................................... ........................................

......................................................................................................................................................................................................................................................................................................... ........................................

Imputed 1

Imputed 2

Imputed 3

Imputed 4

Imputed 5

....................................................................................................................................................................................... ........................................

....................................................................................................................................................................................... ........................................

....................................................................................................................................................................................... ........................................

....................................................................................................................................................................................... ........................................

....................................................................................................................................................................................... ........................................

Results 1

Results 2

Results 3

Results 4

Results 5

......................................................................................................................................................................................................................................................................................................... ........................................

.......................................................................................................................................................................................................................... ........................................

....................................................................................................................................................................................... ........................................

.......................................................................................

........................................................................................

...........................................

........................................

.................................................................................................................................................................................................................................................................................................................................................

Finalresults

...........................

..........................

...........................

...........................

..................................................

....................................

....................................................................................

.....................................................................................................................................................................

.........................................................................................................................................................................................................................................................

Imputation CombinationAnalysis


Use of MI in Practice

• Many analyses of the same incomplete set of data

• A combination of missing outcomes and missing covariates

• As an alternative to WGEE: MI can be combined with classical GEE

• MI in SAS:

Imputation Task: PROC MI

↓Analysis Task: PROC “MYFAVORITE”

↓Inference Task: PROC MIANALYZE


Generalized Estimating Equations

Liang and Zeger (1986)

S(β) =N∑

i=1[Di]

T [Vi(α)]−1 (yi − µi) = 0

• V i(.) is not the true variance of Y i but only a plausible guess

• the score equations are solved in a standard way

• Asymptotic distribution:√N(β − β) ∼ N (0, I−1

0 I1I−10 )

I0 =N∑

i=1DTi [Vi(α)]−1Di I1 =

N∑

i=1DTi [Vi(α)]−1Var(Y i)[Vi(α)]−1Di


Weighted GEE

πi =ni∏

`=2(1 − pi`)

π′i =di−1∏

`=2(1− pi`)

· pidi

pi` = P (Di = `|Di ≥ `, Yi ` ,Xi ` )

Ri = 1 if subject i is complete

Ri = 0 if subject i is incomplete

S(β) =N∑

i=1

Ri

πi

∂µi∂β′

V −1i (yi − µi) = 0

S(β) =N∑

i=1

1

π′i

∂µoi∂β′

(V oi )−1(yoi − µoi ) = 0


The Analgesic Trial

• single-arm trial with 530 patients recruited (491 selected for analysis)

• analgesic treatment for pain caused by chronic nonmalignant disease

• treatment was to be administered for 12 months

• we will focus on Global Satisfaction Assessment (GSA)

• GSA scale goes from 1=very good to 5=very bad

• GSA was rated by each subject 4 times during the trial, at months 3, 6, 9, and 12.


• Research questions:

. Evolution over time

. Relation with baseline covariates: age, sex, duration of the pain, type of pain,disease progression, Pain Control Assessment (PCA), . . .

. Investigation of dropout

• Frequencies:

GSA Month 3 Month 6 Month 9 Month 12

1 55 14.3% 38 12.6% 40 17.6% 30 13.5%

2 112 29.1% 84 27.8% 67 29.5% 66 29.6%

3 151 39.2% 115 38.1% 76 33.5% 97 43.5%

4 52 13.5% 51 16.9% 33 14.5% 27 12.1%

5 15 3.9% 14 4.6% 11 4.9% 3 1.4%

Tot 385 302 227 223


• Missingness:

Measurement occasion

Month 3 Month 6 Month 9 Month 12 Number %

Completers

O O O O 163 41.2

Dropouts

O O O M 51 12.91

O O M M 51 12.91

O M M M 63 15.95

Non-monotone missingness

O O M O 30 7.59

O M O O 7 1.77

O M O M 2 0.51

O M M O 18 4.56

M O O O 2 0.51

M O O M 1 0.25

M O M O 1 0.25

M O M M 3 0.76


Analysis of the Analgesic Trial

• A logistic regression for the dropout indicator:

logit[P (Di = j|Di ≥ j, ·)] = ψ0 + ψ11I(GSAi,j−1 = 1) + ψ12I(GSAi,j−1 = 2)

+ψ13I(GSAi,j−1 = 3) + ψ14I(GSAi,j−1 = 4)

+ψ2PCA0i + ψ3PFi + ψ4GDi

with

. GSAi,j−1 the 5-point outcome at the previous time

. I(·) is an indicator function

. PCA0i is pain control assessment at baseline

. PFi is physical functioning at baseline

. GDi is genetic disorder at baseline are used


Effect Par. Estimate (s.e.)

Intercept ψ0 -1.80 (0.49)

Previous GSA= 1 ψ11 -1.02 (0.41)

Previous GSA= 2 ψ12 -1.04 (0.38)

Previous GSA= 3 ψ13 -1.34 (0.37)

Previous GSA= 4 ψ14 -0.26 (0.38)

Basel. PCA ψ2 0.25 (0.10)

Phys. func. ψ3 0.009 (0.004)

Genetic disfunc. ψ4 0.59 (0.24)

• There is some evidence for MAR: P (Di = j|Di ≥ j) depends on previous GSA.

• Furthermore: baseline PCA, physical functioning and genetic/congenital disorder.


• GEE and WGEE:

logit[P (Yij = 1|tj,PCA0i)] = β1 + β2tj + β3t2j + β4PCA0i

Effect Par. GEE WGEE

Intercept β1 2.95 (0.47) 2.17 (0.69)

Time β2 -0.84 (0.33) -0.44 (0.44)

Time2 β3 0.18 (0.07) 0.12 (0.09)

Basel. PCA β4 -0.24 (0.10) -0.16 (0.13)

• A hint of potentially important differences between both


Make Monotone −→ Weighted GEE

• Multiple imputation to create monotone missingness:

proc mi data=m.gsa4 seed=459864 simple nimpute=10

round=0.1 out=m.gsaimput;

title ’Monotone multiple imputation’;

mcmc impute = monotone;

var pca0 physfct gsa1 gsa2 gsa3 gsa4;

run;

• Standard GEE:

proc gee data=m.gsaw plots=histogram;

title ’Standard GEE for GSA data’;

class patid timecls;

model gsabin = time|time pca0 / dist=bin;

repeated subject=patid / within=timecls corr=un;

run;


• Weighted GEE, where weights are created at observation level:

ods graphics on;

proc gee data=m.gsaimput03 plots=histogram;

title ’Weighted GEE for GSA Data Based on Multiple

Imputation to Monotonize - OBSLEVEL’;

by _imputation_;

class patid timecls;

model gsabin = time|time pca0 / dist=bin covb;

repeated subject=patid / within=timecls corr=un ecovb;

missmodel prevgsa pca0 physfct / type=obslevel;

ods output GEEEmpPEst=gmparms parminfo=gmpinfo

modelinfo=modelinfo GEERCov=gmcovb;

run;

proc mianalyze parms=gmparms parminfo=gmpinfo covb=gmcovb;

title ’Multiple Imputation Analysis After Weighted GEE for GSA Data’;

modeleffects intercept time time*time pca0;

run;


• To use weights at subject rather than observation level:

missmodel prevgsa pca0 physfct / type=sublevel;

Effect Par. Est.(s.e.) p-value Est.(s.e.) p-value

Standard GEE

Without MI After MI

Intercept β0 2.90(0.46) 2.87(0.45)

Time β1 -0.81(0.32) 0.0124 -0.83(0.32) 0.0087

Time2 β2 0.17(0.07) 0.0083 0.18(0.06) 0.0058

PCA0 β3 -0.23(0.10) 0.0178 -0.21(0.10) 0.0253

Weighted GEE (after MI)

Observation level Subject level

Intercept β0 2.74(0.46) 2.62(0.60)

Time β1 -0.76(0.33) 0.0231 -0.71(0.40) 0.0747

Time2 β2 0.17(0.07) 0.0155 0.16(0.08) 0.0444

PCA0 β3 -0.19(0.10) 0.0384 -0.21(0.12) 0.0853


Age-related Macular Degeneration

I T I S A

P L E A S

U R E T O

B E W I T

H Y O U A

T F O R T

C O L L I

N S F O R

G R A Y B

I L L IC SA


Age-related Macular Degeneration Trial

• Pharmacological Therapy for Macular Degeneration Study Group (1997)

• An occular pressure disease which makes patients progressively lose vision

• 240 patients enrolled in a multi-center trial (190 completers)

• Treatment: Interferon-α (6 million units) versus placebo

• Visits: baseline and follow-up at 4, 12, 24, and 52 weeks

• Continuous outcome: visual acuity: # letters correctly read on a vision chart

• Binary outcome: visual acuity versus baseline ≥ 0 or ≤ 0


• Missingness:

Measurement occasion

4 wks 12 wks 24 wks 52 wks Number %

Completers

O O O O 188 78.33

Dropouts

O O O M 24 10.00

O O M M 8 3.33

O M M M 6 2.50

M M M M 6 2.50

Non-monotone missingness

O O M O 4 1.67

O M M O 1 0.42

M O O O 2 0.83

M O M M 1 0.42


Weighted Generalized EstimatingEquations

• Model for the weights:

logit[P (Di = j|Di ≥ j)] = ψ0 + ψ1yi,j−1 + ψ2Ti + ψ31L1i + ψ32L2i + ψ34L3i

+ψ41I(tj = 2) + ψ42I(tj = 3)

with

. yi,j−1 the binary outcome at the previous time ti,j−1 = tj−1 (since time is commonto all subjects)

. Ti = 1 for interferon-α and Ti = 0 for placebo

. Lki = 1 if the patient’s eye lesion is of level k = 1, . . . , 4 (since one dummyvariable is redundant, only three are used)

. I(·) is an indicator function


• Results for the weights model:

Effect Parameter Estimate (s.e.)

Intercept ψ0 0.14 (0.49)

Previous outcome ψ1 0.04 (0.38)

Treatment ψ2 -0.86 (0.37)

Lesion level 1 ψ31 -1.85 (0.49)

Lesion level 2 ψ32 -1.91 (0.52)

Lesion level 3 ψ33 -2.80 (0.72)

Time 2 ψ41 -1.75 (0.49)

Time 3 ψ42 -1.38 (0.44)

• GEE:logit[P (Yij = 1|Ti, tj)] = βj1 + βj2Ti


Effect Par. CC LOCF Observed data

Unweighted WGEE

Int.4 β11 -1.01(0.24;0.24) -0.87(0.20;0.21) -0.87(0.21;0.21) -0.98(0.10;0.44)

Int.12 β21 -0.89(0.24;0.24) -0.97(0.21;0.21) -1.01(0.21;0.21) -1.78(0.15;0.38)

Int.24 β31 -1.13(0.25;0.25) -1.05(0.21;0.21) -1.07(0.22;0.22) -1.11(0.15;0.33)

Int.52 β41 -1.64(0.29;0.29) -1.51(0.24;0.24) -1.71(0.29;0.29) -1.72(0.25;0.39)

Tr.4 β12 0.40(0.32;0.32) 0.22(0.28;0.28) 0.22(0.28;0.28) 0.80(0.15;0.67)

Tr.12 β22 0.49(0.31;0.31) 0.55(0.28;0.28) 0.61(0.29;0.29) 1.87(0.19;0.61)

Tr.24 β32 0.48(0.33;0.33) 0.42(0.29;0.29) 0.44(0.30;0.30) 0.73(0.20;0.52)

Tr.52 β42 0.40(0.38;0.38) 0.34(0.32;0.32) 0.44(0.37;0.37) 0.74(0.31;0.52)

Corr. ρ 0.39 0.44 0.39 0.33


Multiple Imputation

•M = 10 imputations

• GEE:logit[P (Yij = 1|Ti, tj)] = βj1 + βj2Ti

• GLMM:

logit[P (Yij = 1|Ti, tj, bi)] = βj1 + bi + βj2Ti, bi ∼ N (0, τ 2)

• Ti = 0 for placebo and Ti = 1 for interferon-α

• tj (j = 1, . . . , 4) refers to the four follow-up measurements

• Imputation based on the continuous outcome


Effect Par. GEE GLMM

Int.4 β11 -0.84(0.20) -1.46(0.36)

Int.12 β21 -1.02(0.22) -1.75(0.38)

Int.24 β31 -1.07(0.23) -1.83(0.38)

Int.52 β41 -1.61(0.27) -2.69(0.45)

Trt.4 β12 0.21(0.28) 0.32(0.48)

Trt.12 β22 0.60(0.29) 0.99(0.49)

Trt.24 β32 0.43(0.30) 0.67(0.51)

Trt.52 β42 0.37(0.35) 0.52(0.56)

R.I. s.d. τ 2.20(0.26)

R.I. var. τ 2 4.85(1.13)


Sensitivity Analysis

• We apply a shift to the treatment group:

proc mi data=m.armd13 seed=486048 simple out=m.armd13as1

nimpute=10 round=0.1;

title ’Shift multiple imputation’;

class treat;

var lesion diff4 diff12 diff24 diff52;

fcs reg;

mnar adjust (diff12 / shift=10 adjustobs=(treat=’2’));



by treat;

run;


• Expanded results (all after MI):

GLMM

Effect Par. GEE MAR shift

Int.4 β11 -0.84(0.20) -1.46(0.36) -1.42(0.36)

Int.12 β21 -1.02(0.22) -1.75(0.38) -1.67(0.38)

Int.24 β31 -1.07(0.23) -1.83(0.38) -1.83(0.39)

Int.52 β41 -1.61(0.27) -2.69(0.45) -2.77(0.44)

Trt.4 β12 0.21(0.28) 0.32(0.48) 0.24(0.48)

Trt.12 β22 0.60(0.29) 0.99(0.49) 0.91(0.50)

Trt.24 β32 0.43(0.30) 0.67(0.51) 0.65(0.51)

Trt.52 β42 0.37(0.35) 0.52(0.56) 0.60(0.55)

R.I. s.d. τ 2.20(0.26) 2.20(0.25)

R.I. var. τ 2 4.85(1.13) 4.83(1.08)


• Shift can be applied to one or a few groups only

• Alternative to shift: adding extra randomness

• Alternative to shift: multiplier

• Use pattern-mixture model identifying restrictions:

. NCMV

. CCMV


Overview

MCAR/simple CC biased

LOCF inefficient

not simpler than MAR methods

MAR direct likelihood easy to conduct

direct Bayes Gaussian & non-Gaussian

weighted GEE

MI

MNAR variety of methods strong, untestable assumptions

most useful in sensitivity analysis


“Estimands, Missing Data and Sensitivity Analysis” by Dr. Molenberghs

標本調査理論との関係

代替エンドポイント評価の事例

情報はどこから生成されるか：データ・デザイン・仮定

仮定は検証可能でないため，感度解析の実施が求められる．

エンリッチメント

モデルが含む見地>その見地についてデータが与えることができる情報

エンリチッチメントのクラス

a. Coarsening: 例えば，データは観測されるが，その一部は観測されなかった－欠測値

b. Augmentation: 簡便な解釈のために，モデルが観測されない変数(例えば，ランダム効果)で補足される

1

“Estimands, Missing Data and Sensitivity Analysis” by Dr. Molenberghs

欠測データの一般的な枠組み

National Academy of Science Report ｢患者｣‐アカデミア－規制－企業

MAR (Missing at Random)モデルとNot Missing at Random (NMAR) どのようなMARモデルにも，データに同じあてはめを提供できるカウン

ターパートとしてのNMARモデルが存在する

データにのみ依存し，MARが成りたっているかどうかを示すことはできない

標準的な解析と感度解析の適用

無視可能な尤度 (Ignorable Likelihood) Bayes流接近法

多重補完

Inverse Probability Weighting (IPW) (重みつき一般化方程式)

2

estimands, missing data, and sensitivity analysis …estimands, missing data, and sensitivity...

Documents