
Page 1: MCMC Diagnostics - Simon Fraser University

MCMC Diagnostics: Multiple Chain Methods

© Dave Campbell 2009, Friday, June 12, 2009

Page 2: MCMC Diagnostics - Simon Fraser University

hist(beta,100)

The distribution of β, the probability of getting cancer without being vaccinated.
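The same draws can also be summarized numerically. A minimal sketch, assuming the vector beta holds the post-burn-in draws used for the histogram (prctile requires the Statistics Toolbox):

hist(beta, 100)              % histogram of the draws, 100 bins, as above
mean(beta)                   % posterior mean of beta
prctile(beta, [2.5 97.5])    % central 95% posterior interval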


Page 3: MCMC Diagnostics - Simon Fraser University

Correction:

In the raftery Matlab function, r is the interval width (precision).

But in this Matlab software r is relative, not an absolute width:

r = .005 means a ½% precision, not an absolute interval width as I suggested in the last class.
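For example, assuming "relative" here means relative to the quantile being estimated: with β near 36/5766 ≈ 0.006, r = .005 would correspond to a target half-width of roughly 0.005 × 0.006 ≈ 3 × 10⁻⁵, rather than an absolute half-width of 0.005.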


Page 4: MCMC Diagnostics - Simon Fraser University

Gelman-Rubin

Multiple chain method


Page 5: MCMC Diagnostics - Simon Fraser University

[Figure: ten trace plots of x[k, ] against Index (iterations 0 to 20,000), one panel per chain.]

Multiple Chain Convergence Diagnostics: Gelman-Rubin method

If we start our Markov chain from m different places, do they all converge to the same place?


Page 6: MCMC Diagnostics - Simon Fraser University

Today we have a series of independent (and separate) MCMC runs j = 1, 2, …, m, where for us m = 10; this is a typical choice when using the Gelman-Rubin diagnostic.

The sampled values are:

$\beta_j^{(i)}, \quad i = 1, 2, \dots, n$


Page 7: MCMC Diagnostics - Simon Fraser University

RandStream.setDefaultStream(RandStream('mt19937ar','seed',sum(clock)))  % seed the generator
niter = 10000;
y = 36;  N = 5766;
stepvar = .004;          % half-width of the uniform random-walk proposal
m = 10;                  % number of chains
betas = zeros(niter, m);
betas(1,:) = (1:10)/11;  % over-dispersed starting values
for j = 1:m
    iter = 1;
    % log of (likelihood x prior) at the current value
    log_alpha_bot = (y*log(betas(iter,j)) + (N-y)*log(1-betas(iter,j)) + ...
        log(2-2*betas(iter,j)));
    for iter = 2:niter
        X = unifrnd(betas(iter-1,j)-stepvar, betas(iter-1,j)+stepvar);  % propose
        log_alpha_top = y*log(X) + (N-y)*log(1-X) + log(2-2*X);
        if (rand < exp(log_alpha_top - log_alpha_bot))   % Metropolis accept/reject
            betas(iter,j) = X;
            log_alpha_bot = log_alpha_top;
        else
            betas(iter,j) = betas(iter-1,j);
        end
    end
end
plot(betas)

Note: I'm using the log of the acceptance ratio. It's numerically more stable. You should always do this too.
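To see why the log scale matters, here is a minimal sketch with made-up numbers (a larger data set than our y = 36, N = 5766, chosen so the raw likelihoods underflow):

y = 360; N = 57660;                  % made-up values, larger than the lecture's data
b1 = 0.006; b2 = 0.0062;             % two candidate values of beta
L1 = y*log(b1) + (N-y)*log(1-b1);    % log-likelihoods are about -2190, far below log(realmin)
L2 = y*log(b2) + (N-y)*log(1-b2);
exp(L2)/exp(L1)                      % NaN: both likelihoods underflow to 0
exp(L2 - L1)                         % fine: the same ratio, computed on the log scale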


Page 8: MCMC Diagnostics - Simon Fraser University

(Same Metropolis code as on the previous slide.)

NOTE: I'm filling down columns since that is the way Matlab indexes a matrix; it's faster to do it this way.
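A small, made-up illustration of the column-major point (not from the slides; the exact timings depend on the machine and Matlab version):

n = 10000; m = 10;
A = zeros(n, m); B = zeros(n, m);
tic; for j = 1:m, for i = 1:n, A(i,j) = i + j; end, end; toc   % down columns: contiguous memory access
tic; for i = 1:n, for j = 1:m, B(i,j) = i + j; end, end; toc   % across rows: strided memory access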


Page 9: MCMC Diagnostics - Simon Fraser University

plot(βj)

Page 10: MCMC Diagnostics - Simon Fraser University

Multiple Chain Convergence Diagnostics: the Gelman-Rubin method

Run MCMC m times. Discard a bunch for burn-in. With what is left, compute:

Average within-chain variance:

$$W = \frac{1}{m} \sum_{j=1}^{m} \left[ \frac{1}{n-1} \sum_{i=1}^{n} \left( \beta_j^{(i)} - \bar{\beta}_j \right)^2 \right]$$

Between-chain variance:

$$B = \frac{n}{m-1} \sum_{j=1}^{m} \left( \bar{\beta}_j - \bar{\beta} \right)^2$$

where the within-chain mean is

$$\bar{\beta}_j = \frac{1}{n} \sum_{i=1}^{n} \beta_j^{(i)}$$

and $\bar{\beta}$ is the mean of the $\bar{\beta}_j$.


Page 11: MCMC Diagnostics - Simon Fraser University

The total estimated variance:

$$\widehat{\mathrm{Var}}(\beta) = \left(1 - \frac{1}{n}\right) W + \frac{1}{n} B$$

And the Gelman-Rubin statistic:

$$R = \frac{\widehat{\mathrm{Var}}(\beta)}{W}$$

with W and B as defined on the previous slide.

R should be close to 1 when all is working well.

R > 1.05 suggests possible problems.
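A minimal sketch of these formulas in Matlab, applied to the betas matrix from the sampler (one chain per column). The burn-in length of 1000 is an assumption here, so the result need not match the toolbox function used later exactly:

burn = 1000;                              % assumed burn-in length
b = betas(burn+1:end, :);                 % post-burn-in draws, n-by-m
[n, m] = size(b);
chain_means = mean(b, 1);                 % within-chain means (1-by-m)
W = mean(var(b, 0, 1));                   % average within-chain variance
B = n/(m-1) * sum((chain_means - mean(chain_means)).^2);   % between-chain variance
Vhat = (1 - 1/n)*W + B/n;                 % total estimated variance
R = Vhat/W                                % Gelman-Rubin statistic, as defined on the slide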


Page 12: MCMC Diagnostics - Simon Fraser University

Gelman-Rubin is a univariate diagnostic.

A multivariate version exists (Brooks, S. and A. Gelman. 1998. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7: 434-55.)

We could also run one very long chain and divide it into 50 segments to perform Gelman-Rubin (see the sketch below).

A large Gelman-Rubin statistic might arise from slow mixing or multi-modality.
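A minimal sketch of the single-long-chain variant mentioned above, assuming x holds one long post-burn-in chain whose length is a multiple of 50, and using the csgelrub function introduced on the following slides:

segments = reshape(x(:), [], 50);    % each column is one consecutive segment of the chain
R = csgelrub(segments')              % csgelrub expects one segment (chain) per row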


Page 13: MCMC Diagnostics - Simon Fraser University

Gelman-Rubin in Matlab

Download the Computational Statistics Toolbox from the authors of Martinez and Martinez (2008), 'Computational Statistics Handbook with Matlab', 2nd ed.

Their book is on reserve.

Direct link to the software: http://www.pi-sigma.info/CompStatsToolboxV2.zip, or visit their webpage: http://www.pi-sigma.info/CS2E.htm


Page 14: MCMC Diagnostics - Simon Fraser University

Using the Computational Statistics Toolbox is as simple as making it visible to Matlab:

addpath('/Volumes/iamdavecampbell/CompStatsToolboxV2')

Then

open '/Volumes/iamdavecampbell/CompStatsToolboxV2/Contents'

opens the toolbox's Contents file. If you instead use

open 'Contents'

you'll open the Contents file for Matlab. Check which file you're opening by using

which 'Contents'


Page 15: MCMC Diagnostics - Simon Fraser University

The Gelman-Rubin diagnostic function takes a matrix of m rows and n columns

>> size(betas)
>> R = csgelrub(betas')
R =
    1.0056


Page 16: MCMC Diagnostics - Simon Fraser University

To show what happens when things are not working, let's contrive multi-modality:

>> betas(:,1:5) = -betas(:,1:5);
>> R = csgelrub(betas')
R =
    1.0711

Our result does suggest multimodality; because the two modes are close together, the value of R is not all that large, but it is still above the 1.05 cutoff.
