TRANSCRIPT
Statistical Methods for Analysis with Missing Data
Lecture 11: examples of Gibbs sampler, data augmentation for proper multiple imputation, MICE
Mauricio Sadinle
Department of Biostatistics
Previous Lectures
- Gibbs sampling
- Data augmentation to handle missing data in Bayesian inference
- Multiple imputation as a Monte Carlo approximation of a proper Bayesian procedure
- Uncongeniality generally leads to invalidity of inferences based on Rubin's combining rules
- MICE: a practical implementation of multiple imputation that builds on Gibbs sampling ideas
Today’s Lecture
R implementations of
- Gibbs sampler
- Data augmentation for proper multiple imputation
- MICE
Outline
Example of Gibbs Sampler
Data Augmentation for Proper Multiple Imputation
MICE
Summary
Bhattacharyya’s Distribution
Consider real-valued random variables X and Y having a joint distribution with density¹
\[
p_{X,Y}(x, y) = \exp\left\{ [1,\, x,\, x^2]
\begin{pmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ m_{20} & m_{21} & m_{22} \end{pmatrix}
\begin{pmatrix} 1 \\ y \\ y^2 \end{pmatrix} \right\},
\]
where either
(a) m_{22} = m_{21} = m_{12} = 0; m_{20}, m_{02} < 0; m_{11}^2 < 4 m_{20} m_{02}; or
(b) m_{22} < 0, 4 m_{22} m_{02} > m_{12}^2, 4 m_{22} m_{20} > m_{21}^2.
m_{00} is determined by the other m_{jk}'s so that p_{X,Y} integrates to 1.

¹Distribution credited to Anil Kumar Bhattacharyya, who was a professor at the Indian Statistical Institute. See, e.g., https://projecteuclid.org/download/pdf_1/euclid.ss/1009213728
Bhattacharyya’s Distribution
From p_{X,Y}(x, y) it is easy to see that
\[
p_{X \mid Y}(x \mid y) \propto \frac{1}{\sigma_X(y)} \exp\left\{ -\frac{[x - \mu_X(y)]^2}{2\sigma_X^2(y)} \right\},
\]
where
\[
\mu_X(y) = -\frac{m_{10} + m_{11} y + m_{12} y^2}{2(m_{20} + m_{21} y + m_{22} y^2)},
\qquad
\sigma_X^2(y) = -\frac{1}{2(m_{20} + m_{21} y + m_{22} y^2)}.
\]
Bhattacharyya’s Distribution
And analogously, it is easy to see that
\[
p_{Y \mid X}(y \mid x) \propto \frac{1}{\sigma_Y(x)} \exp\left\{ -\frac{[y - \mu_Y(x)]^2}{2\sigma_Y^2(x)} \right\},
\]
where
\[
\mu_Y(x) = -\frac{m_{01} + m_{11} x + m_{21} x^2}{2(m_{02} + m_{12} x + m_{22} x^2)},
\qquad
\sigma_Y^2(x) = -\frac{1}{2(m_{02} + m_{12} x + m_{22} x^2)}.
\]
Bhattacharyya’s Distribution
- In fact, Bhattacharyya's distribution characterizes all bivariate distributions with normal conditionals²
- A Gibbs sampler to draw from p_{X,Y} is easy to implement:
  - Choose a starting point (x^{(0)}, y^{(0)})
  - At iteration t, draw
\[
X^{(t)} \sim \mathrm{Normal}\big[\, \mu_X(y^{(t-1)}),\; \sigma_X^2(y^{(t-1)}) \,\big], \qquad
Y^{(t)} \sim \mathrm{Normal}\big[\, \mu_Y(x^{(t)}),\; \sigma_Y^2(x^{(t)}) \,\big]
\]

²Arnold, Castillo and Sarabia (Statistical Science, 2001): https://projecteuclid.org/download/pdf_1/euclid.ss/1009213728
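The lecture's own implementation is in Lecture11code.R (part 1), but the two-step update above is short enough to sketch in Python. The m_{jk} values below are illustrative and satisfy condition (a), so the target is in fact a bivariate normal with correlation 0.5:

```python
import numpy as np

# Illustrative coefficients satisfying condition (a):
# m22 = m21 = m12 = 0; m20, m02 < 0; m11^2 < 4*m20*m02
m10, m01, m11 = 0.0, 0.0, 0.5
m20, m02 = -0.5, -0.5
m12, m21, m22 = 0.0, 0.0, 0.0

def mu_x(y):      # conditional mean of X given Y = y
    return -(m10 + m11 * y + m12 * y**2) / (2 * (m20 + m21 * y + m22 * y**2))

def sigma2_x(y):  # conditional variance of X given Y = y
    return -1.0 / (2 * (m20 + m21 * y + m22 * y**2))

def mu_y(x):      # conditional mean of Y given X = x
    return -(m01 + m11 * x + m21 * x**2) / (2 * (m02 + m12 * x + m22 * x**2))

def sigma2_y(x):  # conditional variance of Y given X = x
    return -1.0 / (2 * (m02 + m12 * x + m22 * x**2))

def gibbs(n_iter, x0=0.0, y0=0.0, seed=11):
    rng = np.random.default_rng(seed)
    draws = np.empty((n_iter, 2))
    x, y = x0, y0
    for t in range(n_iter):
        x = rng.normal(mu_x(y), np.sqrt(sigma2_x(y)))  # X^(t) | y^(t-1)
        y = rng.normal(mu_y(x), np.sqrt(sigma2_y(x)))  # Y^(t) | x^(t)
        draws[t] = x, y
    return draws

draws = gibbs(20000)[1000:]  # discard burn-in
```

With these coefficients the invariant distribution has zero means, variances 4/3, and correlation 0.5, which the sample moments of `draws` should approximate.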
R Time!
Open file Lecture11code.R, part 1
Outline
Example of Gibbs Sampler
Data Augmentation for Proper Multiple Imputation
MICE
Summary
Example: Multivariate Normal
- Distribution of the data
\[
Z = \{Z_i\}_{i=1}^{n} \mid \mu, \Lambda \overset{\text{i.i.d.}}{\sim} \mathrm{Normal}(\mu, \Lambda^{-1}),
\]
  where Z_i ∈ R^K, μ is the vector of means, Λ^{-1} is the covariance matrix, and Λ is the inverse covariance matrix (the precision matrix)
- The conjugate prior is constructed in two steps:
\[
\mu \mid \Lambda \sim \mathrm{Normal}(\mu_0, (\kappa_0 \Lambda)^{-1}), \qquad
\Lambda \sim \mathrm{Wishart}(\nu_0, W_0)
\]
  The joint distribution of (μ, Λ) is called Normal-Wishart. The parameterization is such that E(Λ) = ν_0 W_0
Example: Multivariate Normal
The posterior is also Normal-Wishart:
\[
\mu \mid \Lambda, z \sim \mathrm{Normal}(\mu', (\kappa'\Lambda)^{-1}), \qquad
\Lambda \mid z \sim \mathrm{Wishart}(\nu', W'),
\]
where
\[
\begin{aligned}
\mu' &= (\kappa_0 \mu_0 + n\bar{z})/\kappa', \qquad
\kappa' = \kappa_0 + n, \qquad
\nu' = \nu_0 + n, \\
W' &= \left\{ W_0^{-1} + n\left[\hat{\Sigma} + \frac{\kappa_0}{\kappa'}(\bar{z} - \mu_0)(\bar{z} - \mu_0)^{\mathsf{T}}\right] \right\}^{-1}, \\
\bar{z} &= \frac{1}{n}\sum_{i=1}^{n} z_i, \qquad
\hat{\Sigma} = \frac{1}{n}\sum_{i=1}^{n} (z_i - \bar{z})(z_i - \bar{z})^{\mathsf{T}}.
\end{aligned}
\]
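A minimal Python sketch of this conjugate update (function and variable names are illustrative; the lecture's own code is in Lecture11code.R, part 2):

```python
import numpy as np

def normal_wishart_posterior(z, mu0, kappa0, nu0, W0):
    """Return the Normal-Wishart posterior hyperparameters (mu', kappa', nu', W')
    given an n x K data matrix z and prior hyperparameters."""
    n, K = z.shape
    zbar = z.mean(axis=0)                          # sample mean z-bar
    Sigma_hat = (z - zbar).T @ (z - zbar) / n      # MLE covariance Sigma-hat
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * zbar) / kappa_n
    d = (zbar - mu0).reshape(-1, 1)
    W_n = np.linalg.inv(np.linalg.inv(W0)
                        + n * (Sigma_hat + (kappa0 / kappa_n) * (d @ d.T)))
    return mu_n, kappa_n, nu_n, W_n
```

A posterior draw then takes Λ ~ Wishart(ν', W') followed by μ | Λ ~ Normal(μ', (κ'Λ)^{-1}).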
Example: Multivariate Normal
HW3: write down and implement a data augmentation algorithm under ignorability and multivariate normality
R Time!
Open file Lecture11code.R, part 2
Outline
Example of Gibbs Sampler
Data Augmentation for Proper Multiple Imputation
MICE
Summary
Multivariate Imputation by Chained Equations
Multivariate Imputation by Chained Equations (MICE)³ is an ad hoc multiple imputation procedure that builds on Gibbs sampling ideas
- If each of Y_1, ..., Y_K is subject to missingness, we can posit K different regression models
\[
p_1(y_1 \mid y_{-1}, \theta_1), \quad
p_2(y_2 \mid y_{-2}, \theta_2), \quad \dots, \quad
p_K(y_K \mid y_{-K}, \theta_K)
\]
- θ_k: parameters of the kth conditional distribution
- y_{-k} = (y_1, ..., y_{k-1}, y_{k+1}, ..., y_K)
- Key idea: use these models to sequentially impute, one variable at a time. Repeat this over a number of iterations

³https://www.jstatsoft.org/article/view/v045i03/v45i03.pdf
Multivariate Imputation by Chained Equations
The MICE algorithm:
- Initialize the algorithm by randomly imputing the missing values of each variable/column with observed values of that variable/column. Denote this initial completed data as y_1^{(0)}, ..., y_K^{(0)}
- Run a pseudo Gibbs/data augmentation sampler, whose tth iteration is:
\[
\begin{aligned}
\theta_1^{(t)} &\sim p_1\big(\theta_1 \mid y_{1(r_1)}, y_2^{(t-1)}, \dots, y_K^{(t-1)}\big)
  \propto p_1(\theta_1) \prod_{i\,:\, r_{i1}=1} p_1\big(y_{i1} \mid y_{i2}^{(t-1)}, \dots, y_{iK}^{(t-1)}, \theta_1\big) \\
y_{i1}^{(t)} &\sim p_1\big(y_1 \mid y_{i2}^{(t-1)}, \dots, y_{iK}^{(t-1)}, \theta_1^{(t)}\big), \quad \text{for all missing } y_{i1} \\
&\;\;\vdots \\
\theta_K^{(t)} &\sim p_K\big(\theta_K \mid y_{K(r_K)}, y_1^{(t)}, \dots, y_{K-1}^{(t)}\big)
  \propto p_K(\theta_K) \prod_{i\,:\, r_{iK}=1} p_K\big(y_{iK} \mid y_{i1}^{(t)}, \dots, y_{i,K-1}^{(t)}, \theta_K\big) \\
y_{iK}^{(t)} &\sim p_K\big(y_K \mid y_{i1}^{(t)}, \dots, y_{i,K-1}^{(t)}, \theta_K^{(t)}\big), \quad \text{for all missing } y_{iK}
\end{aligned}
\]
- Iterate a number of times
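To make the pseudo-sampler concrete, here is a deliberately stripped-down Python caricature of it. This is only a sketch under strong assumptions: every variable is numeric, each conditional model is a linear regression, and an OLS fit plus residual noise stands in for the draws of (θ_k, y_k) above; the real `mice` R package draws θ_k properly and supports many conditional models. All names are illustrative.

```python
import numpy as np

def mice_sketch(y, n_iter=10, seed=0):
    """Chained-equations imputation for a numeric matrix y containing NaNs."""
    rng = np.random.default_rng(seed)
    y = np.array(y, dtype=float)
    miss = np.isnan(y)
    n, K = y.shape
    # Initialization: fill each column's missing entries with randomly
    # sampled observed values from the same column
    for k in range(K):
        obs = y[~miss[:, k], k]
        y[miss[:, k], k] = rng.choice(obs, size=miss[:, k].sum())
    # Pseudo-Gibbs iterations: cycle through the columns, refitting and
    # re-imputing one variable at a time
    for _ in range(n_iter):
        for k in range(K):
            if not miss[:, k].any():
                continue
            X = np.column_stack([np.ones(n), np.delete(y, k, axis=1)])
            r = ~miss[:, k]                      # response indicator for column k
            beta, *_ = np.linalg.lstsq(X[r], y[r, k], rcond=None)
            sigma = (y[r, k] - X[r] @ beta).std()
            y[miss[:, k], k] = (X[miss[:, k]] @ beta
                                + rng.normal(0.0, sigma, miss[:, k].sum()))
    return y
```

On data with a strong linear relationship between columns, the imputed values for a missing column land close to the fitted regression line.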
R Time!
Open file Lecture11code.R, part 3
Outline
Example of Gibbs Sampler
Data Augmentation for Proper Multiple Imputation
MICE
Summary
Summary
Main take-aways from today's lecture:
- Example of Gibbs sampler
- Data augmentation and proper multiple imputation using the 'norm' package
- Multivariate Imputation by Chained Equations with the 'mice' package

Next lecture:
- Inverse Probability Weighting