TRANSCRIPT
Statistical Methods for Analysis with Missing Data
Lecture 11: examples of Gibbs sampler, data augmentation for proper multiple imputation, MICE
Mauricio Sadinle
Department of Biostatistics
Previous Lectures
- Gibbs sampling
- Data augmentation to handle missing data in Bayesian inference
- Multiple imputation as a Monte Carlo approximation of a proper Bayesian procedure
- Uncongeniality generally leads to invalidity of inferences based on Rubin's combining rules
- MICE: a practical implementation of multiple imputation that builds on Gibbs sampling ideas
Today’s Lecture
R implementations of
- Gibbs sampler
- Data augmentation for proper multiple imputation
- MICE
Outline
Example of Gibbs Sampler
Data Augmentation for Proper Multiple Imputation
MICE
Summary
Bhattacharyya’s Distribution
Consider real-valued random variables X and Y having a joint distribution with density¹
\[
p_{X,Y}(x, y) = \exp\left\{ [1,\, x,\, x^2]
\begin{pmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ m_{20} & m_{21} & m_{22} \end{pmatrix}
\begin{pmatrix} 1 \\ y \\ y^2 \end{pmatrix} \right\},
\]
where either
(a) m_{22} = m_{21} = m_{12} = 0; m_{20}, m_{02} < 0; m_{11}^2 < 4 m_{20} m_{02}; or
(b) m_{22} < 0, 4 m_{22} m_{02} > m_{12}^2, 4 m_{22} m_{20} > m_{21}^2.
m_{00} is determined by the other m_{jk}'s so that p_{X,Y} integrates to 1.

¹Distribution credited to Anil Kumar Bhattacharyya, who was a professor at the Indian Statistical Institute. See, e.g., https://projecteuclid.org/download/pdf_1/euclid.ss/1009213728
Bhattacharyya’s Distribution
From p_{X,Y}(x, y) it is easy to see that
\[
p_{X \mid Y}(x \mid y) \propto \frac{1}{\sigma_X(y)} \exp\left\{ -\frac{[x - \mu_X(y)]^2}{2\sigma_X^2(y)} \right\},
\]
where
\[
\mu_X(y) = -\frac{m_{10} + m_{11} y + m_{12} y^2}{2(m_{20} + m_{21} y + m_{22} y^2)},
\qquad
\sigma_X^2(y) = -\frac{1}{2(m_{20} + m_{21} y + m_{22} y^2)}.
\]
Bhattacharyya’s Distribution
And analogously, it is easy to see that
\[
p_{Y \mid X}(y \mid x) \propto \frac{1}{\sigma_Y(x)} \exp\left\{ -\frac{[y - \mu_Y(x)]^2}{2\sigma_Y^2(x)} \right\},
\]
where
\[
\mu_Y(x) = -\frac{m_{01} + m_{11} x + m_{21} x^2}{2(m_{02} + m_{12} x + m_{22} x^2)},
\qquad
\sigma_Y^2(x) = -\frac{1}{2(m_{02} + m_{12} x + m_{22} x^2)}.
\]
Bhattacharyya’s Distribution
- In fact, Bhattacharyya's distribution characterizes all bivariate distributions with normal conditionals²
- A Gibbs sampler to draw from p_{X,Y} is easy to implement:
  - Choose a starting point (x^{(0)}, y^{(0)})
  - At iteration t, draw
\[
X^{(t)} \sim \mathrm{Normal}\big[\, \mu_X(y^{(t-1)}),\; \sigma_X^2(y^{(t-1)}) \,\big], \qquad
Y^{(t)} \sim \mathrm{Normal}\big[\, \mu_Y(x^{(t)}),\; \sigma_Y^2(x^{(t)}) \,\big]
\]

²Arnold, Castillo and Sarabia (Statistical Science, 2001): https://projecteuclid.org/download/pdf_1/euclid.ss/1009213728
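The lecture's own implementation is in Lecture11code.R (part 1), but the two-step update above is short enough to sketch in Python. The m_{jk} values below are illustrative and satisfy condition (a), so the target is in fact a bivariate normal with correlation 0.5:

```python
import numpy as np

# Illustrative coefficients satisfying condition (a):
# m22 = m21 = m12 = 0; m20, m02 < 0; m11^2 < 4*m20*m02
m10, m01, m11 = 0.0, 0.0, 0.5
m20, m02 = -0.5, -0.5
m12, m21, m22 = 0.0, 0.0, 0.0

def mu_x(y):      # conditional mean of X given Y = y
    return -(m10 + m11 * y + m12 * y**2) / (2 * (m20 + m21 * y + m22 * y**2))

def sigma2_x(y):  # conditional variance of X given Y = y
    return -1.0 / (2 * (m20 + m21 * y + m22 * y**2))

def mu_y(x):      # conditional mean of Y given X = x
    return -(m01 + m11 * x + m21 * x**2) / (2 * (m02 + m12 * x + m22 * x**2))

def sigma2_y(x):  # conditional variance of Y given X = x
    return -1.0 / (2 * (m02 + m12 * x + m22 * x**2))

def gibbs(n_iter, x0=0.0, y0=0.0, seed=11):
    rng = np.random.default_rng(seed)
    draws = np.empty((n_iter, 2))
    x, y = x0, y0
    for t in range(n_iter):
        x = rng.normal(mu_x(y), np.sqrt(sigma2_x(y)))  # X^(t) | y^(t-1)
        y = rng.normal(mu_y(x), np.sqrt(sigma2_y(x)))  # Y^(t) | x^(t)
        draws[t] = x, y
    return draws

draws = gibbs(20000)[1000:]  # discard burn-in
```

With these coefficients the invariant distribution has zero means, variances 4/3, and correlation 0.5, which the sample moments of `draws` should approximate.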
R Time!
Open file Lecture11code.R, part 1
Outline
Example of Gibbs Sampler
Data Augmentation for Proper Multiple Imputation
MICE
Summary
Example: Multivariate Normal
- Distribution of the data
\[
Z = \{Z_i\}_{i=1}^{n} \mid \mu, \Lambda \overset{\text{i.i.d.}}{\sim} \mathrm{Normal}(\mu, \Lambda^{-1}),
\]
  where Z_i ∈ R^K, μ is the vector of means, Λ^{-1} is the covariance matrix, and Λ is the inverse covariance matrix (the precision matrix)
- The conjugate prior is constructed in two steps:
\[
\mu \mid \Lambda \sim \mathrm{Normal}(\mu_0, (\kappa_0 \Lambda)^{-1}), \qquad
\Lambda \sim \mathrm{Wishart}(\nu_0, W_0)
\]
  The joint distribution of (μ, Λ) is called Normal-Wishart. The parameterization is such that E(Λ) = ν_0 W_0
Example: Multivariate Normal
The posterior is also Normal-Wishart:
\[
\mu \mid \Lambda, z \sim \mathrm{Normal}(\mu', (\kappa'\Lambda)^{-1}), \qquad
\Lambda \mid z \sim \mathrm{Wishart}(\nu', W'),
\]
where
\[
\begin{aligned}
\mu' &= (\kappa_0 \mu_0 + n\bar{z})/\kappa', \qquad
\kappa' = \kappa_0 + n, \qquad
\nu' = \nu_0 + n, \\
W' &= \left\{ W_0^{-1} + n\left[\hat{\Sigma} + \frac{\kappa_0}{\kappa'}(\bar{z} - \mu_0)(\bar{z} - \mu_0)^{\mathsf{T}}\right] \right\}^{-1}, \\
\bar{z} &= \frac{1}{n}\sum_{i=1}^{n} z_i, \qquad
\hat{\Sigma} = \frac{1}{n}\sum_{i=1}^{n} (z_i - \bar{z})(z_i - \bar{z})^{\mathsf{T}}.
\end{aligned}
\]
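A minimal Python sketch of this conjugate update (function and variable names are illustrative; the lecture's own code is in Lecture11code.R, part 2):

```python
import numpy as np

def normal_wishart_posterior(z, mu0, kappa0, nu0, W0):
    """Return the Normal-Wishart posterior hyperparameters (mu', kappa', nu', W')
    given an n x K data matrix z and prior hyperparameters."""
    n, K = z.shape
    zbar = z.mean(axis=0)                          # sample mean z-bar
    Sigma_hat = (z - zbar).T @ (z - zbar) / n      # MLE covariance Sigma-hat
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * zbar) / kappa_n
    d = (zbar - mu0).reshape(-1, 1)
    W_n = np.linalg.inv(np.linalg.inv(W0)
                        + n * (Sigma_hat + (kappa0 / kappa_n) * (d @ d.T)))
    return mu_n, kappa_n, nu_n, W_n
```

A posterior draw then takes Λ ~ Wishart(ν', W') followed by μ | Λ ~ Normal(μ', (κ'Λ)^{-1}).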
Example: Multivariate Normal
HW3: write down and implement a data augmentation algorithm under ignorability and multivariate normality
R Time!
Open file Lecture11code.R, part 2
Outline
Example of Gibbs Sampler
Data Augmentation for Proper Multiple Imputation
MICE
Summary
Multivariate Imputation by Chained Equations
Multivariate Imputation by Chained Equations (MICE)³ is an ad hoc multiple imputation procedure that builds on Gibbs sampling ideas
- If each of Y_1, ..., Y_K is subject to missingness, we can posit K different regression models
\[
p_1(y_1 \mid y_{-1}, \theta_1), \quad
p_2(y_2 \mid y_{-2}, \theta_2), \quad \dots, \quad
p_K(y_K \mid y_{-K}, \theta_K)
\]
- θ_k: parameters of the kth conditional distribution
- y_{-k} = (y_1, ..., y_{k-1}, y_{k+1}, ..., y_K)
- Key idea: use these models to sequentially impute, one variable at a time. Repeat this over a number of iterations

³https://www.jstatsoft.org/article/view/v045i03/v45i03.pdf
Multivariate Imputation by Chained Equations
The MICE algorithm:
- Initialize the algorithm by randomly imputing the missing values of each variable/column with observed values of that variable/column. Denote this initial completed data as y_1^{(0)}, ..., y_K^{(0)}
- Run a pseudo Gibbs/data augmentation sampler, whose tth iteration is:
\[
\begin{aligned}
\theta_1^{(t)} &\sim p_1\big(\theta_1 \mid y_{1(r_1)}, y_2^{(t-1)}, \dots, y_K^{(t-1)}\big)
  \propto p_1(\theta_1) \prod_{i\,:\, r_{i1}=1} p_1\big(y_{i1} \mid y_{i2}^{(t-1)}, \dots, y_{iK}^{(t-1)}, \theta_1\big) \\
y_{i1}^{(t)} &\sim p_1\big(y_1 \mid y_{i2}^{(t-1)}, \dots, y_{iK}^{(t-1)}, \theta_1^{(t)}\big), \quad \text{for all missing } y_{i1} \\
&\;\;\vdots \\
\theta_K^{(t)} &\sim p_K\big(\theta_K \mid y_{K(r_K)}, y_1^{(t)}, \dots, y_{K-1}^{(t)}\big)
  \propto p_K(\theta_K) \prod_{i\,:\, r_{iK}=1} p_K\big(y_{iK} \mid y_{i1}^{(t)}, \dots, y_{i,K-1}^{(t)}, \theta_K\big) \\
y_{iK}^{(t)} &\sim p_K\big(y_K \mid y_{i1}^{(t)}, \dots, y_{i,K-1}^{(t)}, \theta_K^{(t)}\big), \quad \text{for all missing } y_{iK}
\end{aligned}
\]
- Iterate a number of times
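To make the pseudo-sampler concrete, here is a deliberately stripped-down Python caricature of it. This is only a sketch under strong assumptions: every variable is numeric, each conditional model is a linear regression, and an OLS fit plus residual noise stands in for the draws of (θ_k, y_k) above; the real `mice` R package draws θ_k properly and supports many conditional models. All names are illustrative.

```python
import numpy as np

def mice_sketch(y, n_iter=10, seed=0):
    """Chained-equations imputation for a numeric matrix y containing NaNs."""
    rng = np.random.default_rng(seed)
    y = np.array(y, dtype=float)
    miss = np.isnan(y)
    n, K = y.shape
    # Initialization: fill each column's missing entries with randomly
    # sampled observed values from the same column
    for k in range(K):
        obs = y[~miss[:, k], k]
        y[miss[:, k], k] = rng.choice(obs, size=miss[:, k].sum())
    # Pseudo-Gibbs iterations: cycle through the columns, refitting and
    # re-imputing one variable at a time
    for _ in range(n_iter):
        for k in range(K):
            if not miss[:, k].any():
                continue
            X = np.column_stack([np.ones(n), np.delete(y, k, axis=1)])
            r = ~miss[:, k]                      # response indicator for column k
            beta, *_ = np.linalg.lstsq(X[r], y[r, k], rcond=None)
            sigma = (y[r, k] - X[r] @ beta).std()
            y[miss[:, k], k] = (X[miss[:, k]] @ beta
                                + rng.normal(0.0, sigma, miss[:, k].sum()))
    return y
```

On data with a strong linear relationship between columns, the imputed values for a missing column land close to the fitted regression line.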
R Time!
Open file Lecture11code.R, part 3
Outline
Example of Gibbs Sampler
Data Augmentation for Proper Multiple Imputation
MICE
Summary
Summary
Main take-aways from today's lecture:
- Example of Gibbs sampler
- Data augmentation and proper multiple imputation using the 'norm' package
- Multivariate Imputation by Chained Equations with the 'mice' package

Next lecture:
- Inverse Probability Weighting