Introduction to WinBUGS
Olivier Gimenez
A brief history
1989: project began with a Unix version called BUGS
1998: first Windows version, WinBUGS was born
Initially developed by the MRC Biostatistics Unit in Cambridge and now joint work with Imperial College School of Medicine at St Mary's, London.
Windows Bayesian inference Using Gibbs Sampling
Software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods
Who?
Nicky Best, Imperial College Faculty of Medicine, London (UK)
Andrew Thomas, University of Helsinki, Helsinki (Finland)
David Spiegelhalter, MRC Biostatistics Unit, Institute of Public Health, Cambridge (UK)
Wally Gilks, MRC Biostatistics Unit, Institute of Public Health, Cambridge (UK)
How to obtain and install WinBUGS
1. downloadable from:
http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
(... see section Obtaining the File to download WinBUGS14.exe)
2. to install WinBUGS:
- Exit all other programs currently running (particularly if using Windows XP)
- Copy WinBUGS14.exe to your computer
- Double-click on WinBUGS14.exe and follow the instructions in the dialog box
- You should have a new directory called WinBUGS14 within Program Files
- Inside the WinBUGS14 directory is a program called WinBUGS14.exe
- Right-click on the pretty WinBUGS icon, select 'create shortcut', then drag this shortcut to the desktop
- Double-click on WinBUGS14.exe to run WinBUGS14
How to obtain and install WinBUGS
To obtain the key for unrestricted use:
Fill in the registration form
~1h later, you will receive an email from Bugs with subject
WinBUGS registration - AUTOMATIC RESPONSE!
Follow the instructions...
Congratulations, you're ready to use WinBUGS
Principle
You specify the prior and build up the likelihood
WinBUGS draws samples from the posterior by running a Gibbs sampling algorithm, based on:
$\pi(\theta \mid D) \propto L(D \mid \theta)\,\pi(\theta)$
WinBUGS computes some convergence diagnostics that you have to check
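To make the principle concrete, here is a minimal Gibbs sampler sketch in Python for a toy model, not the algorithm as WinBUGS implements it. It assumes the 23 values of chicks per pair from the stork example are normal with unknown mean `mu` and precision `tau`, and it assumes vague priors (`mu ~ N(0, precision 0.001)`, `tau ~ Gamma(0.001, 0.001)`). The model, the priors, the function name `gibbs_normal` and the seed are all illustrative choices.

```python
import random
import statistics

# Number of chicks per pair, as listed in the white stork example (23 years).
y = [2.55, 1.85, 2.05, 2.88, 3.13, 2.21, 2.43, 2.69, 2.55, 2.84, 2.47, 2.69,
     2.52, 2.31, 2.07, 2.35, 2.98, 1.98, 2.53, 2.21, 2.62, 1.78, 2.30]


def gibbs_normal(y, n_iter=6000, burn_in=1000, seed=1):
    """Gibbs sampler for y_i ~ N(mu, 1/tau) under assumed vague priors:
    mu ~ N(0, precision 0.001) and tau ~ Gamma(shape 0.001, rate 0.001)."""
    rng = random.Random(seed)
    n, total = len(y), sum(y)
    m0, p0, a, b = 0.0, 0.001, 0.001, 0.001   # hypothetical prior settings
    mu, tau = 0.0, 1.0                        # initial values
    draws = []
    for it in range(n_iter):
        # mu | tau, y is normal: precision-weighted mix of prior and data.
        prec = p0 + n * tau
        mu = rng.gauss((p0 * m0 + tau * total) / prec, prec ** -0.5)
        # tau | mu, y is gamma; gammavariate expects a scale, so invert the rate.
        rate = b + 0.5 * sum((yi - mu) ** 2 for yi in y)
        tau = rng.gammavariate(a + n / 2, 1.0 / rate)
        if it >= burn_in:                     # keep only post burn-in draws
            draws.append((mu, tau))
    return draws


draws = gibbs_normal(y)
post_mean_mu = statistics.mean(mu for mu, _ in draws)
print("posterior mean of mu:", round(post_mean_mu, 3))
```

Each iteration updates one parameter at a time from its distribution given the current value of the other, which is the idea behind Gibbs sampling. With priors this vague, the posterior mean of `mu` should sit very close to the sample mean of the data.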
A biological example throughout
White stork (Ciconia ciconia) in Alsace 1948-1970
Demographic components (fecundity, breeding success, survival, etc.)
Climate (temperature, rainfall, etc.)
WinBUGS & Linear Regression
T: Temp. May (°C); R: Rainf. May (mm); Y: number of chicks per pair

T      R    Y
15.1   67   2.55
13.3   52   1.85
15.3   88   2.05
13.3   61   2.88
14.6   32   3.13
15.6   36   2.21
13.1   72   2.43
13.1   43   2.69
15.0   92   2.55
11.7   32   2.84
15.3   86   2.47
14.4   28   2.69
14.4   57   2.52
12.7   55   2.31
11.7   66   2.07
11.9   26   2.35
15.9   28   2.98
13.4   96   1.98
14.0   48   2.53
13.9   90   2.21
12.9   86   2.62
15.1   78   1.78
13.0   87   2.30
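1. Do temperature and rainfall affect the number of chicks?
2. Regression model: $Y_i = \alpha + \beta_r R_i + \beta_t T_i + \varepsilon_i$, $i = 1, \dots, 23$, with $\varepsilon_i$ i.i.d. $\sim N(0, \sigma^2)$
3. Estimation of parameters: $\alpha$, $\beta_r$, $\beta_t$, $\sigma$
4. Frequentist inference uses t-tests
Equivalently: $Y_i \sim N(\mu_i, \sigma^2)$ independently, $i = 1, \dots, 23$, with $\mu_i = \alpha + \beta_r R_i + \beta_t T_i$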
Linear Regression using Frequentist approach

             Estimate   Std. Error  t value  Pr(>|t|)
temperature   0.031069  0.054690     0.568   0.57629
rainfall     -0.007316  0.002897    -2.525   0.02011 *

Y = 2.451 + 0.031 T - 0.007 R
Influence of Rainfall only
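As a cross-check, the fitted equation can be recomputed from the 23 observations with ordinary least squares. The sketch below uses only the Python standard library and solves the two-covariate normal equations in closed form. It assumes that the temperature, rainfall and chick values pair up in the order in which they are listed. The helper names (`cross`, `ols_two_covariates`) are illustrative.

```python
# May temperature (°C), May rainfall (mm) and chicks per pair, paired in the
# order in which they are listed in the stork example (an assumption).
T = [15.1, 13.3, 15.3, 13.3, 14.6, 15.6, 13.1, 13.1, 15.0, 11.7, 15.3, 14.4,
     14.4, 12.7, 11.7, 11.9, 15.9, 13.4, 14.0, 13.9, 12.9, 15.1, 13.0]
R = [67, 52, 88, 61, 32, 36, 72, 43, 92, 32, 86, 28,
     57, 55, 66, 26, 28, 96, 48, 90, 86, 78, 87]
Y = [2.55, 1.85, 2.05, 2.88, 3.13, 2.21, 2.43, 2.69, 2.55, 2.84, 2.47, 2.69,
     2.52, 2.31, 2.07, 2.35, 2.98, 1.98, 2.53, 2.21, 2.62, 1.78, 2.30]


def mean(v):
    return sum(v) / len(v)


def cross(u, v):
    """Sum of products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def ols_two_covariates(y, t, r):
    """Least-squares intercept and slopes for y = a + b_t * t + b_r * r."""
    s_tt, s_rr, s_tr = cross(t, t), cross(r, r), cross(t, r)
    s_ty, s_ry = cross(t, y), cross(r, y)
    det = s_tt * s_rr - s_tr ** 2          # determinant of the 2x2 system
    b_t = (s_ty * s_rr - s_tr * s_ry) / det
    b_r = (s_tt * s_ry - s_tr * s_ty) / det
    a = mean(y) - b_t * mean(t) - b_r * mean(r)
    return a, b_t, b_r


intercept, slope_t, slope_r = ols_two_covariates(Y, T, R)
print(f"Y = {intercept:.3f} + ({slope_t:.3f}) T + ({slope_r:.3f}) R")
```

The estimates should come out close to those in the frequentist table, which also supports reading the flattened data in listing order.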
Running WinBUGS: what do you need?
1 - a model giving the likelihood and the priors
2 - some data of course
3 - initial values to start the MCMC algorithm
Running WinBUGS: the model
Use the WinBUGS command 'model'
Don't forget to enclose the model in {...}
Define the likelihood: $Y_i \sim N(\alpha + \beta_r R_i + \beta_t T_i, \sigma^2)$
Note: $\sigma^2 = 1/\tau$ (WinBUGS parameterizes the normal distribution by its precision $\tau$)
Specify the priors: we use noninformative (vague, flat) priors here
Monitor any other parameter you'd like to, e.g. $\sigma^2 = 1/\tau$
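Monitoring a derived parameter such as the variance amounts to transforming every MCMC draw and summarizing the transformed draws. The Python sketch below illustrates this with stand-in values: `tau_draws` is simulated from a gamma distribution purely for illustration and is not output from WinBUGS.

```python
import random
import statistics

rng = random.Random(42)

# Stand-in for MCMC draws of the precision tau (hypothetical values).
tau_draws = [rng.gammavariate(12.0, 0.7) for _ in range(5000)]

# Derived node: apply sigma2 = 1 / tau to each draw, then summarize.
sigma2_draws = [1.0 / tau for tau in tau_draws]

post_mean_sigma2 = statistics.mean(sigma2_draws)
plug_in = 1.0 / statistics.mean(tau_draws)   # NOT the posterior mean of sigma2

print("mean of 1/tau draws:", round(post_mean_sigma2, 4))
print("1 / mean of tau draws:", round(plug_in, 4))
```

The two printed numbers differ because the mean of a transformed quantity is not the transform of the mean. That is why the derived parameter is monitored directly rather than computed afterwards from a posterior summary of the precision.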
Running WinBUGS: data and initial values
Use 'list' structures (R/S-Plus syntax)...
...and 'vector' structures (R/S-Plus syntax)
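For readers who prepare their data outside WinBUGS, the following Python sketch formats values as R/S-Plus style `list(...)` and `c(...)` text that can be pasted into a data or initial-values window. The helpers `s_vector` and `s_list` are illustrative, as are the variable names `N`, `y`, `temperature`, `rainfall` and `tau`. Only the first three years are used to keep the output short.

```python
def s_vector(values):
    """Format a Python sequence as an R/S-Plus vector, e.g. c(1, 2, 3)."""
    return "c(" + ", ".join(str(v) for v in values) + ")"


def s_list(**fields):
    """Format named scalars and vectors as an R/S-Plus list(...)."""
    parts = []
    for name, value in fields.items():
        if isinstance(value, (list, tuple)):
            parts.append(f"{name} = {s_vector(value)}")
        else:
            parts.append(f"{name} = {value}")
    return "list(" + ", ".join(parts) + ")"


# First three years of the stork data, with hypothetical variable names.
data_text = s_list(N=3, y=[2.55, 1.85, 2.05],
                   temperature=[15.1, 13.3, 15.3], rainfall=[67, 52, 88])

# One set of initial values (the starting points are arbitrary).
inits_text = s_list(intercept=0, slope_temperature=0, slope_rainfall=0, tau=1)

print(data_text)
print(inits_text)
```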
Running WinBUGS: overall
What you need:
1 - a model giving the likelihood and the priors
2 - data
3 - initial values
Steps:
1 - check model
2 - load data
3 - compile model
4 - load initial values
5 - generate burn-in values
6 - parameters to be monitored
7 - perform the sampling to generate posteriors
8 - check convergence and display results
Running WinBUGS: at last!
Running WinBUGS: 1. Check model
- Highlight 'model'
- Open the Model Specification Tool
- Click 'check model'
- Watch out for the confirmation at the foot of the screen
Running WinBUGS: 2. Load data
- Highlight the 'list' in the data window
- Click 'load data'
- Watch out for the confirmation at the foot of the screen
Running WinBUGS: 3. Compile model
- Click 'compile'
- Watch out for the confirmation at the foot of the screen
Running WinBUGS: 4. Load initial values
- Highlight the 'list' in the data window
- Click 'load inits'
- Watch out for the confirmation at the foot of the screen
Running WinBUGS: 5. Generate burn-in values
- Open the Model Update Tool
- Give the number of burn-in iterations (1000)
- Click 'update' to do the sampling
Running WinBUGS: 6. Monitor parameters
- Open the Inference Samples Tool
- Enter 'intercept' in the node box and click 'set'
- Enter 'slope_temperature' in the node box and click 'set'
- Enter 'slope_rainfall' in the node box and click 'set'
Running WinBUGS: 7. Generate posterior values
- Enter the number of samples you want to take (10000)
- Click 'update' to do the sampling
Running WinBUGS: 8. Summarize posteriors
- Enter '*' in the node box and click 'stats'
- Mean, median and credible intervals
- 95% credible intervals tell us the same story as the frequentist fit:

             Estimate   Std. Error  t value  Pr(>|t|)
temperature   0.031069  0.054690     0.568   0.57629
rainfall     -0.007316  0.002897    -2.525   0.02011 *
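The posterior summaries (mean, median, 95% credible interval) are simple functions of the MCMC draws. The Python sketch below computes them by linear interpolation between order statistics, which may differ slightly from the quantile rule WinBUGS uses. The draws `effect_a` and `effect_b` are simulated stand-ins, not output from the stork analysis.

```python
import random
import statistics


def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def summarize(draws):
    """Mean, median and equal-tailed 95% credible interval of the draws."""
    s = sorted(draws)
    return {"mean": statistics.mean(s), "median": statistics.median(s),
            "2.5%": quantile(s, 0.025), "97.5%": quantile(s, 0.975)}


def excludes_zero(summary):
    """True when the 95% credible interval lies entirely on one side of 0."""
    return summary["2.5%"] > 0 or summary["97.5%"] < 0


rng = random.Random(7)
effect_a = [rng.gauss(-0.30, 0.10) for _ in range(2000)]  # hypothetical draws
effect_b = [rng.gauss(0.05, 0.10) for _ in range(2000)]   # hypothetical draws

for name, draws in (("effect_a", effect_a), ("effect_b", effect_b)):
    s = summarize(draws)
    print(name, {k: round(v, 3) for k, v in s.items()}, excludes_zero(s))
```

A credible interval that excludes zero plays the role that a significant t-test plays in the frequentist table.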
Running WinBUGS: 8. Summarize posteriors
- Click 'history'
- Click 'auto cor'
Problem of autocorrelation
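The 'auto cor' plot shows, for each lag, the correlation between draws that many iterations apart. High values at long lags mean the chain moves slowly and carries less information than its length suggests. A minimal Python sketch of the sample autocorrelation follows, using two deterministic toy chains chosen only to make the behaviour obvious.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    m = sum(chain) / n
    denom = sum((x - m) ** 2 for x in chain)
    num = sum((chain[t] - m) * (chain[t + lag] - m) for t in range(n - lag))
    return num / denom


# Toy chains (hypothetical): one drifting slowly, one flipping every step.
drifting = [float(t) for t in range(100)]
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]

print("drifting, lag 1:", round(autocorrelation(drifting, 1), 2))
print("alternating, lag 1:", round(autocorrelation(alternating, 1), 2))
```

The drifting chain has lag-1 autocorrelation close to 1, which is the pattern to worry about in an MCMC run.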
Coping with autocorrelation: use standardized covariates
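Standardizing is taken here to mean subtracting the mean and dividing by the standard deviation, which is an assumption about what the slides did. With raw covariates the intercept and slopes are typically strongly correlated in the posterior, and a sampler that updates one parameter at a time then mixes slowly. Centring the covariates largely removes that correlation. A short Python sketch on the temperature and rainfall values from the example:

```python
import statistics

# May temperature (°C) and May rainfall (mm) from the stork example.
temperature = [15.1, 13.3, 15.3, 13.3, 14.6, 15.6, 13.1, 13.1, 15.0, 11.7,
               15.3, 14.4, 14.4, 12.7, 11.7, 11.9, 15.9, 13.4, 14.0, 13.9,
               12.9, 15.1, 13.0]
rainfall = [67, 52, 88, 61, 32, 36, 72, 43, 92, 32, 86, 28,
            57, 55, 66, 26, 28, 96, 48, 90, 86, 78, 87]


def standardize(x):
    """Centre on the mean and scale by the sample standard deviation."""
    m, s = statistics.mean(x), statistics.stdev(x)
    return [(v - m) / s for v in x]


temperature_std = standardize(temperature)
rainfall_std = standardize(rainfall)

print(round(statistics.mean(temperature_std), 6),
      round(statistics.stdev(temperature_std), 6))
```

After standardizing, a slope is the change in the response per standard deviation of the covariate, so the estimates are no longer on the scale of the frequentist table.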
Re-running WinBUGS: steps 1, 2, ..., 7, and 8. Summarize posteriors: click 'auto cor'
[Figure: autocorrelation plots for slope.rainfall and slope.temperature, lag 0 to 40, vertical axis from -1.0 to 1.0]
Autocorrelation OK
Re-running WinBUGS: steps 1, 2, ..., 7, and 8. Summarize posteriors: click 'density'
[Figure: posterior density plots for slope.rainfall and slope.temperature, sample: 1000]
Re-running WinBUGS: steps 1, 2, ..., 7, and 8. Summarize posteriors: click 'quantiles'
[Figure: plots for slope.rainfall and slope.temperature against iteration, 1041 to 1750]
Running WinBUGS: 8. Checking for convergence using the Brooks-Gelman-Rubin criterion
• A way to identify non-convergence is to simulate multiple sequences from over-dispersed starting points
• Intuitively, the behaviour of all of the chains should be basically the same.
• In other words, the variance within the chains should be the same as the variance across the chains.
• In WinBUGS, specify the number of chains after 'load data' and before 'compile' (as many sets of initial values as there are chains must then be loaded or generated)
Running WinBUGS: 8. Checking for convergence using the Brooks-Gelman-Rubin criterion
[Figure: Brooks-Gelman-Rubin diagnostic plots for slope.temperature and slope.rainfall, chains 1:2, iterations 1 to 10000]
The normalized width of the central 80% interval of the pooled runs is green
The normalized average width of the 80% intervals within the individual runs is blue
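The two plotted quantities can be compared through their ratio: when the chains have converged to the same distribution, the pooled 80% interval is about as wide as the average within-chain interval, so the ratio is close to 1. The Python sketch below computes that ratio on raw, un-normalized widths. The quantile rule and the toy chains are assumptions made for illustration.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def interval_width(draws, coverage=0.8):
    """Width of the central interval with the given coverage."""
    s = sorted(draws)
    tail = (1.0 - coverage) / 2.0
    return quantile(s, 1.0 - tail) - quantile(s, tail)


def bgr_ratio(chains):
    """Pooled 80% interval width over mean within-chain 80% interval width."""
    pooled = interval_width([x for chain in chains for x in chain])
    within = sum(interval_width(chain) for chain in chains) / len(chains)
    return pooled / within


# Toy chains (hypothetical): two that agree, and two stuck in different places.
agree = [[float(i) for i in range(100)], [float(i) for i in range(100)]]
disagree = [[float(i) for i in range(100)], [float(i) + 100 for i in range(100)]]

print("agreeing chains:", round(bgr_ratio(agree), 3))
print("disagreeing chains:", round(bgr_ratio(disagree), 3))
```

A ratio well above 1 means the chains are still exploring different regions, so more iterations are needed.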
Re-running WinBUGS: steps 1, 2, ..., 7, and 8. Summarize posteriors: others...
• Click 'coda' to produce lists of data suitable for external treatment via the Coda R package
• Click 'trace' to produce dynamic history changing in real time
Another example: logistic regression
Y: proportion of nests with success (> 0 young); T: Temp. May (°C); R: Rainf. May (mm)

T      R    Y (successes / total)
15.1   67   151 / 173
13.3   52   105 / 164
15.3   88    73 / 103
13.3   61   107 / 113
14.6   32   113 / 122
15.6   36    87 / 112
13.1   72    77 / 98
13.1   43   108 / 121
15.0   92   118 / 132
11.7   32   122 / 136
15.3   86   112 / 133
14.4   28   120 / 137
14.4   57   122 / 145
12.7   55    89 / 117
11.7   66    69 / 90
11.9   26    71 / 80
15.9   28    53 / 67
13.4   96    41 / 54
14.0   48    53 / 58
13.9   90    31 / 39
12.9   86    35 / 42
15.1   78    14 / 23
13.0   87    18 / 23
Performing a logistic regression with WinBUGS: model
Number of successes in year i ~ Bin($p_i$, total number of couples in year i)
where $p_i$ is the probability of success in year i
$\mathrm{logit}(p_i) = \alpha + \beta_r R_i + \beta_t T_i$, $i = 1, \dots, 23$
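To show what this likelihood computes, here is a Python sketch of the inverse logit and the binomial log-likelihood, with the constant binomial coefficient dropped. It uses the success counts and totals from the example, paired with rainfall and temperature in listing order, which is an assumption. The function and argument names are illustrative.

```python
import math

rain = [67, 52, 88, 61, 32, 36, 72, 43, 92, 32, 86, 28,
        57, 55, 66, 26, 28, 96, 48, 90, 86, 78, 87]
temp = [15.1, 13.3, 15.3, 13.3, 14.6, 15.6, 13.1, 13.1, 15.0, 11.7, 15.3, 14.4,
        14.4, 12.7, 11.7, 11.9, 15.9, 13.4, 14.0, 13.9, 12.9, 15.1, 13.0]
successes = [151, 105, 73, 107, 113, 87, 77, 108, 118, 122, 112, 120,
             122, 89, 69, 71, 53, 41, 53, 31, 35, 14, 18]
totals = [173, 164, 103, 113, 122, 112, 98, 121, 132, 136, 133, 137,
          145, 117, 90, 80, 67, 54, 58, 39, 42, 23, 23]


def inv_logit(x):
    """Map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def log_likelihood(alpha, beta_r, beta_t):
    """Binomial log-likelihood, up to a constant, for the logistic model."""
    ll = 0.0
    for r, t, s, n in zip(rain, temp, successes, totals):
        p = inv_logit(alpha + beta_r * r + beta_t * t)
        ll += s * math.log(p) + (n - s) * math.log(1.0 - p)
    return ll


# Intercept-only comparison: pooled success rate versus p = 0.5.
pooled = sum(successes) / sum(totals)
alpha_pooled = math.log(pooled / (1.0 - pooled))
print(round(log_likelihood(alpha_pooled, 0.0, 0.0), 1),
      round(log_likelihood(0.0, 0.0, 0.0), 1))
```

The first value is larger than the second because a success probability equal to the pooled rate fits the data far better than 0.5.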
Performing a logistic regression with WinBUGS: noninformative priors
Performing a logistic regression with WinBUGS: data & initial values
Performing a logistic regression with WinBUGS: the results
• influence of rainfall, but not temperature (see the lower and upper bounds of the credible intervals)
• additional parameters as a by-product of the MCMC samples: just add them in the model as parameters to be monitored
- geometric mean:
geom <- pow(prod(p[]),1/N)
- odds-ratio:
odds.rainfall <- exp(slope.rainfall)
odds.temperature <- exp(slope.temperature)
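The same derived quantities are easy to compute by hand, which helps when reading the output. The Python sketch below takes the geometric mean of the observed yearly success proportions, used as a stand-in for the model's `p[]`, and converts a slope on the logit scale into an odds ratio and a percentage change in the odds. The slope value is hypothetical, chosen only to illustrate the conversion.

```python
import math

successes = [151, 105, 73, 107, 113, 87, 77, 108, 118, 122, 112, 120,
             122, 89, 69, 71, 53, 41, 53, 31, 35, 14, 18]
totals = [173, 164, 103, 113, 122, 112, 98, 121, 132, 136, 133, 137,
          145, 117, 90, 80, 67, 54, 58, 39, 42, 23, 23]


def geometric_mean(p):
    """Geometric mean, computed on the log scale for numerical stability."""
    return math.exp(sum(math.log(v) for v in p) / len(p))


def odds_ratio(slope):
    """Multiplicative change in the odds per unit increase of the covariate."""
    return math.exp(slope)


def percent_change_in_odds(slope):
    return 100.0 * (odds_ratio(slope) - 1.0)


# Observed proportions stand in for the fitted probabilities p[].
observed = [s / n for s, n in zip(successes, totals)]
geom = geometric_mean(observed)
print("geometric mean of observed proportions:", round(geom, 3))

# Hypothetical slope on the logit scale (not a value from the slides).
example_slope = math.log(0.84)
print("odds ratio:", round(odds_ratio(example_slope), 2),
      "change in odds (%):", round(percent_change_in_odds(example_slope), 1))
```

In WinBUGS these transformations are applied to every MCMC draw, so each derived quantity comes with its own credible interval.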
Performing a logistic regression with WinBUGS: the results
• additional parameters as a by-product of the MCMC samples
- geometric mean probability of success around 82% [81%; 84%]
- odds-ratio: the odds of success change by -16% for an increase of rainfall of 1 unit
Climatic conditions affect survival in the wild
a) European dippers in Eastern France (1981-1987)
Practical session 1
• Estimating survival using capture-recapture models
• CJS (Cormack-Jolly-Seber) model: time-dependent survival and detection rates
• Case study a): A flood occurred in the 1983 breeding season, so survival rates are expected to differ between flood (f) and non-flood (n) years
→ use constraints to estimate parameters $\phi_f$ and $\phi_n$
Climatic conditions affect survival in the wild
b) Lapwings in UK (1980-1993)
Practical session 2
• Estimating survival using ring-recovery models
• Case study b): The number of days on which the mean temperature is below freezing in central England (fdays) is expected to affect lapwing survival
→ use fdays as a covariate
• Strong assumption: variation in survival is totally explained by the covariate alone
→ incorporate random effects