mcmc diagnostics - simon fraser university
MCMC Diagnostics: Multiple chain
© Dave Campbell 2009
Friday, June 12, 2009
hist(beta,100)
The distribution of β, the probability of getting cancer without getting vaccinated.
Correction:
In the raftery Matlab function, r is the interval width (precision).
But in this Matlab software r is relative, not an absolute width:
r = .005 is a ½% precision, not an absolute interval width as I suggested last day in class.
Gelman-Rubin
Multiple chain method
[Figure: ten trace-plot panels of x[k, ] against Index (0 to 20000).]
Multiple Chain Convergence Diagnostics: Gelman-Rubin method
If we start our Markov chain from m different places, do they all converge to the same place?
Today we have a series of independent (and separate) MCMC runs j = 1, 2, …, m, where for us m = 10. This is the usual value when using the Gelman-Rubin diagnostic.
The sampled values are:
$$\beta_j^{(i)}, \quad i = 1, 2, \ldots, n$$
RandStream.setDefaultStream(RandStream('mt19937ar','seed',sum(clock)))
niter=10000;
y=36; N=5766;
stepvar=.004;
m=10;
betas=zeros(niter,m);
betas(1,:)=(1:10)/11;
for j=1:m
    iter=1;
    log_alpha_bot=(y*log(betas(iter,j))+(N-y)*log(1-betas(iter,j))+...
        log(2-2*betas(iter,j)));
    for iter=2:niter
        X=unifrnd(betas(iter-1,j)-stepvar,betas(iter-1,j)+stepvar);
        log_alpha_top= y*log(X)+(N-y)*log(1-X)+log(2-2*X);
        if(rand<exp(log_alpha_top-log_alpha_bot))
            betas(iter,j)=X;
            log_alpha_bot=log_alpha_top;
        else
            betas(iter,j)=betas(iter-1,j);
        end
    end
end
plot(betas)
Note: I’m using the log of the acceptance ratio. It’s numerically more stable. You should always do this too.
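To see why the log scale matters here, the following is a minimal sketch using the same target expression as the sampler code, with y = 36 and N = 5766. The names target, logTarget, bOld and bNew are illustrative assumptions, not part of the lecture code, and the two β values are chosen only to trigger underflow.

```matlab
y = 36; N = 5766;

% Unnormalized target on the raw scale and on the log scale
% (same expression as in the sampler code).
target    = @(b) b.^y .* (1-b).^(N-y) .* (2-2*b);
logTarget = @(b) y*log(b) + (N-y)*log(1-b) + log(2-2*b);

bOld = 0.50;   % illustrative current value, far from the mode
bNew = 0.49;   % illustrative proposal

% Both raw densities underflow to 0 in double precision, so the ratio is 0/0.
rawRatio = target(bNew) / target(bOld);         % NaN

% The log ratio is an ordinary finite number, so the accept step still works.
logRatio = logTarget(bNew) - logTarget(bOld);   % finite and positive
```

On the raw scale the comparison rand < NaN is always false, so a chain started this far from the mode would never move.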
Note that I’m filling down columns: since that is the way Matlab indexes a matrix, it’s faster to do it this way.
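A small sketch of what "filling down columns" means for storage: Matlab stores a matrix column by column, so consecutive rows of one column are neighbours in memory. The variable names are illustrative only.

```matlab
A = reshape(1:6, 2, 3);   % A = [1 3 5; 2 4 6]

colOrder = A(:)';         % storage order runs down each column: 1 2 3 4 5 6
third    = A(3);          % linear index 3 is row 1, column 2, i.e. the value 3
```

In the sampler, betas(iter,j) with iter in the inner loop therefore walks through adjacent memory locations.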
plot(βj)
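Multiple Chain Convergence Diagnostics: Gelman-Rubin method
Run MCMC m times.
Discard a bunch for burn-in.
With what is left, compute:
Average within-chain variance:
$$W = \frac{1}{m} \sum_{j=1}^{m} \left[ \frac{1}{n-1} \sum_{i=1}^{n} \left( \beta_j^{(i)} - \bar{\beta}_j \right)^2 \right]$$
Between-chain variance:
$$B = \frac{n}{m-1} \sum_{j=1}^{m} \left( \bar{\beta}_j - \bar{\beta} \right)^2$$
where the mean of chain j is
$$\bar{\beta}_j = \frac{1}{n} \sum_{i=1}^{n} \beta_j^{(i)}$$
and $\bar{\beta}$ is the average of the m chain means.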
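The steps on this slide can be sketched on a toy matrix small enough to check by hand. The numbers, the burn-in length and the variable names (draws, burn, kept, chainMeans) are made up for illustration; only the formulas for W and B come from the slide.

```matlab
% Toy draws: rows are iterations, columns are chains (made-up numbers).
draws = [10 20;
          7  9;
          1  2;
          2  4;
          3  6];

burn = 2;                       % hypothetical burn-in length
kept = draws(burn+1:end, :);    % discard the first 'burn' iterations
[n, m] = size(kept);            % n = 3 kept draws per chain, m = 2 chains

chainMeans = mean(kept, 1);     % mean of each chain: [2 4]

% Average within-chain variance: var(...,0,1) divides by n-1 down each column.
W = mean(var(kept, 0, 1));      % (1 + 4)/2 = 2.5

% Between-chain variance: var of the m chain means divides by m-1.
B = n * var(chainMeans);        % 3 * 2 = 6
```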
The total estimated variance:
$$\widehat{\mathrm{Var}}(\beta) = \left( 1 - \frac{1}{n} \right) W + \frac{1}{n} B$$
And the Gelman-Rubin statistic:
$$R = \frac{\widehat{\mathrm{Var}}(\beta)}{W}$$
R should be close to 1 when all is working well.
R > 1.05 suggests possible problems.
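As a worked check of the arithmetic, here is a short sketch with made-up values of n, W and B (they are not results from the lecture's chains). It follows the two formulas on this slide literally.

```matlab
n = 3;      % hypothetical number of kept draws per chain
W = 2.5;    % hypothetical average within-chain variance
B = 6;      % hypothetical between-chain variance

varHat = (1 - 1/n) * W + B / n;   % (2/3)*2.5 + 6/3 = 11/3
R      = varHat / W;              % (11/3)/2.5 = 22/15, about 1.4667
```

With values this far apart, R is well above the 1.05 cutoff, which is what the diagnostic should report when the chain means disagree.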
Gelman-Rubin is a univariate diagnostic.
A multivariate version exists (Brooks, S. and A. Gelman. 1998. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7: 434-55).
We could also run one very long chain and divide it into 50 segments to perform Gelman-Rubin.
A large Gelman-Rubin statistic might arise from slow mixing or multi-modality.
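The single-long-chain variant can be sketched as follows. This is only an illustration under stated assumptions: the long chain is a stand-in of independent normal draws rather than real MCMC output, the length 50000 is arbitrary, and R is computed by hand from the W, B and variance formulas rather than with a toolbox function.

```matlab
nSeg = 50;                        % number of segments, as on the slide
long = randn(50000, 1);           % stand-in for one long chain (not real MCMC output)

% reshape fills column by column, so column k holds consecutive draws
% (k-1)*1000+1 through k*1000 of the long chain.
segs = reshape(long, [], nSeg);
n    = size(segs, 1);             % draws per segment

W = mean(var(segs, 0, 1));        % average within-segment variance
B = n * var(mean(segs, 1));       % between-segment variance
R = ((1 - 1/n) * W + B / n) / W;  % Gelman-Rubin statistic treating segments as chains
```

The length of the long chain must be divisible by the number of segments for reshape to work; a real chain would first have its burn-in removed.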
Gelman-Rubin in Matlab
Download the Computational Statistics Toolbox from the authors of Martinez and Martinez (2008), ‘Computational Statistics Handbook with Matlab’, 2nd ed.
Their book is on reserve.
Direct link to the software: http://www.pi-sigma.info/CompStatsToolboxV2.zip
Or visit their webpage: http://www.pi-sigma.info/CS2E.htm
Using the computational statistics toolbox is as simple as making it visible to Matlab:
addpath('/Volumes/iamdavecampbell/CompStatsToolboxV2')
open '/Volumes/iamdavecampbell/CompStatsToolboxV2/Contents'
If you instead use
open 'Contents'
you’ll open the contents file for Matlab.
Check which file you’re opening by using
which 'Contents'
The Gelman-Rubin diagnostic function takes a matrix of m rows and n columns, so betas (n rows, m columns) is transposed:
>> size(betas)
>> R=csgelrub(betas')
R =
    1.0056
To show it not working, let’s contrive multi-modality:
>> betas(:,1:5)=-betas(:,1:5);
>> R=csgelrub(betas')
R =
    1.0711
While our result suggests multimodality, the modes are close together, so the value of R is not all that large, but it is above the 1.05 cutoff.
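The same sign-flip experiment can be sketched without the toolbox. Everything here is an assumption made for illustration: the chains are synthetic normal draws rather than the lecture's betas, the helper gr is a hand-written version of the W, B and R formulas (not the csgelrub implementation), and the spread 0.02 is chosen so the two contrived modes sit close together relative to the within-chain variation.

```matlab
% Hand-written R for an n-by-m matrix whose columns are chains.
% Note that B/n equals the variance of the chain means.
gr = @(x) ((1 - 1/size(x,1)) * mean(var(x,0,1)) + var(mean(x,1))) ...
          / mean(var(x,0,1));

x = 0.006 + 0.02 * randn(2000, 10);   % made-up chains sharing one mode
Rsame = gr(x);                        % close to 1

x(:, 1:5) = -x(:, 1:5);               % flip the sign of the first five chains
Rflip = gr(x);                        % larger, because the chain means now disagree
```

Because the two modes are only about 0.012 apart while each chain has a standard deviation of 0.02, Rflip rises above the cutoff without becoming very large; moving the modes further apart, or shrinking the within-chain spread, would inflate it quickly.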