Page 1: MCQMC 2012 From inference to modelling to algorithms and back again Kerrie Mengersen QUT Brisbane

MCQMC 2012

From inference to modelling to algorithms and back again

Kerrie Mengersen, QUT Brisbane

Collaborative Centre for Data Analysis,

Modelling and Computation QUT, GPO Box 2434

Brisbane 4001, Australia


Acknowledgements: BRAG

Bayesian methods and models

+ Fast computation

+ Applications in environment, health, biology, industry


So what’s the problem?


Matchmaking 101


Plant biosecurity


• Observations and data
  – Visual inspection symptoms
  – Presence / absence data
  – Space and time

• Dynamic invasion process
  – Growth, spread

• Inference
  – Map probability of extent over time
  – Useful scale for managing trade / eradication
  – Currently use informal qualitative approach

• Hierarchical Bayesian model to formalise the information

From inference to model


Hierarchical Bayesian model for plant pest spread

• Data Model: Pr(data | incursion process and data parameters)
  – How data are observed given underlying pest extent

• Process Model: Pr(incursion process | process parameters)
  – Potential extent given epidemiology / ecology

• Parameter Model: Pr(data and process parameters)
  – Prior distribution to describe uncertainty in detectability, exposure, growth …

• The posterior distribution of the incursion process (and parameters) is related to the prior distribution and data by:

$$\Pr(\text{process}, \text{parameters} \mid \text{data}) \propto \Pr(\text{data} \mid \text{process}, \text{parameters}) \, \Pr(\text{process} \mid \text{parameters}) \, \Pr(\text{parameters})$$


Early Warning Surveillance

• Priors based on emergency plant pest characteristics
  – exposure rate for colonisation probability
  – spread rates to link sites together for spatial analysis

• Add surveillance data

• Posterior evaluation
  – modest reduction in area freedom
  – large reduction in estimated extent
  – residual “risk” maps to target surveillance


Observation Parameter Estimates

• Taking into account invasion process

• Hosts
  – Host suitability

• Inspector efficiency
  – Identify contributions


Study 2: Mixture models

Clair Alston


CAT scanning sheep


• Finite mixture model: $y_i \sim \sum_j \lambda_j N(\mu_j, \sigma_j^2)$

• Include spatial information

From inference to model

What proportions of the sheep carcase are muscle, fat and bone?


Inside a sheep


Inside a sheep


Study 3: State space models

Nicole White


Parkinson’s Disease


PD symptom data

• Current methods for PD subtype classification rely on a few criteria and do not permit uncertainty in subgroup membership.

• Alternative: finite mixture model (equivalent to a latent class analysis for multivariate categorical outcomes)

• Symptom data: Duration of diagnosis, early onset PD, gender, handedness, side of onset


1. Define a finite mixture model based on patient responses to Bernoulli and Multinomial questions.

2. Describe subgroups w.r.t. explanatory variables

3. Obtain patient’s probability of class membership

$y_{ij}$: $i$th subject’s response to item $j$
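Step 3 above (a patient's probability of class membership) is a Bayes' rule calculation once the mixture has been fitted. The following is a minimal sketch only, assuming binary (Bernoulli) items, known class weights and known item probabilities. The function name `class_membership` and all numbers are hypothetical and are not taken from the talk.

```python
import math

def class_membership(y, weights, item_probs):
    """Posterior probability of each latent class for one subject.

    y          : list of 0/1 responses, one per item j
    weights    : mixing weight of each class (sums to 1)
    item_probs : item_probs[c][j] = Pr(y_j = 1 | class c)
    """
    log_post = []
    for w, probs in zip(weights, item_probs):
        # log(weight) plus the sum of Bernoulli log-likelihoods over items
        ll = math.log(w)
        for yj, pj in zip(y, probs):
            ll += math.log(pj if yj == 1 else 1.0 - pj)
        log_post.append(ll)
    # normalise on the log scale for numerical stability
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical two-class example with three binary items
weights = [0.6, 0.4]
item_probs = [[0.9, 0.8, 0.1],
              [0.2, 0.3, 0.7]]
probs = class_membership([1, 1, 0], weights, item_probs)
```

In a full Bayesian fit these probabilities would be averaged over posterior draws of the weights and item probabilities, which is what allows uncertainty in subgroup membership to be reported.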

From inference to model


PD: Symptom data


PD Signal data: “How will they respond?”


Inferential aims

Identify spikes and assign them to an unknown number of source neurons

Compare clusters between segments within a recording and between recordings at different locations of the brain

3 depths


Microelectrode recordings

Each recording was divided into 2.5 sec. segments

Discriminating features found via PCA


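DP Model

$y_i \mid \theta_i \sim p(y_i \mid \theta_i)$

$\theta_i \sim G$

$G \sim DP(\alpha, G_0)$

With $P$ principal components (PCs), $y_i = (y_{i1}, \dots, y_{iP}) \sim MVN(\mu, \Sigma)$

$G_0 = p(\mu \mid \Sigma)\, p(\Sigma)$

$\alpha \sim Ga(2,2)$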
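One way to build intuition for the DP prior is the stick-breaking construction, which generates the mixing weights of G directly. The sketch below is illustrative only: it truncates the construction at a fixed number of atoms, treats the quantity with the Ga(2,2) prior as the DP concentration parameter, and assumes a shape/rate reading of Ga(2,2). The name `stick_breaking` and the truncation level are assumptions, and the atoms drawn from the base distribution are omitted.

```python
import random

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, alpha)   # V_k ~ Beta(1, alpha)
        weights.append(remaining * v)     # break off a fraction of the stick
        remaining *= 1.0 - v
    weights.append(remaining)             # last atom absorbs the leftover mass
    return weights

rng = random.Random(1)
# Assumed reading of Ga(2, 2): shape 2 and rate 2, i.e. scale 1/2,
# because random.gammavariate takes (shape, scale).
alpha = rng.gammavariate(2.0, 0.5)
w = stick_breaking(alpha, n_atoms=20, rng=rng)
```

Small values of the concentration parameter put most of the mass on a few atoms, which corresponds to few spike clusters; larger values spread the mass over more clusters.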

From inference to model


Average waveforms


Study 4: Spatial dynamic factor models

Chris Strickland, Ian Turner

What can we learn about land use from MODIS data?


Differentiate land use: SDFM

• 1st factor has influence on temporal dynamics in right half of image (woodlands)

• 3rd factor has influence on left half of image (grasslands)

[Figure panels: 1st trend component; 2nd trend component; common cyclical component]


Matchmaking 101


Smart models


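Example 1: Generalisation

Mixtures are great, but how do we choose k?

Propose an overfitting model ($k > k_0$)

Non-identifiable! All values of $\theta = (p_1^0, \dots, p_{k_0}^0, 0, \gamma_1^0, \dots, \gamma_{k_0}^0)$

and all values of $\theta = (p_1^0, \dots, p_j, \dots, p_{k_0}^0, p_{k+1}, \gamma_1^0, \dots, \gamma_{k_0}^0, \gamma_j^0)$

with $p_j + p_{k+1} = p_j^0$ fit equally well.

Judith Rousseau

$f_0(x) = \sum_{j=1}^{k_0} p_j^0\, g_{\gamma_j^0}(x)$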
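The non-identifiability can be checked numerically. The sketch below is a hypothetical illustration, assuming unit-variance normal components and made-up weights and means: a true two-component density is reproduced exactly by a three-component model either with one weight set to zero or with one true component split into two copies.

```python
import math

def normal_pdf(x, mean, sd=1.0):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means):
    return sum(p * normal_pdf(x, m) for p, m in zip(weights, means))

# True density with k0 = 2 components (illustrative values)
true_w, true_m = [0.3, 0.7], [0.0, 3.0]

# Overfitted k = 3 models that reproduce the true density:
empty_w, empty_m = [0.3, 0.7, 0.0], [0.0, 3.0, 10.0]   # extra weight is zero
split_w, split_m = [0.3, 0.4, 0.3], [0.0, 3.0, 3.0]    # second component split

grid = [i / 10.0 for i in range(-50, 81)]
max_gap = max(
    max(abs(mixture_pdf(x, true_w, true_m) - mixture_pdf(x, empty_w, empty_m)),
        abs(mixture_pdf(x, true_w, true_m) - mixture_pdf(x, split_w, split_m)))
    for x in grid
)
```

Here `max_gap` is zero up to floating-point rounding, so the likelihood alone cannot distinguish the two overfitted parameterisations.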


So what?

• Multiplicity of possible solutions => MLE does not have a stable asymptotic behaviour.

• Not important when f is the main object of interest, but important if we want to recover the parameters $\theta$

• It thus becomes crucial to know that the posterior distribution under overfitted mixtures gives interpretable results


Possible alternatives to avoid overfitting

Fruhwirth-Schnatter (2006): either one of the component weights is zero or two of the component parameters are equal.

• Choose priors that bound the posterior away from the unidentifiability sets.

• Choose priors that induce shrinkage for elements of the component parameters.

Problem: may not be able to fit the true model


Our result

Assumptions:
  – $L_1$ consistency of the posterior
  – Model $g$ is three times differentiable, regular, and integrable
  – Prior on $\gamma$ is continuous and positive, and the prior on $(p_1, \dots, p_k)$ satisfies $\pi(p) \propto p_1^{\alpha_1 - 1} \cdots p_k^{\alpha_k - 1}$


Our result - 1

• If $\max_j \alpha_j < d/2$, where $d = \dim(\gamma)$, then asymptotically $f(\theta \mid x)$ concentrates on the subset of parameters for which $f = f_0$, so $k - k_0$ components have weight 0.

• The reason for this stable behaviour, as opposed to the unstable behaviour of the maximum likelihood estimator, is that integrating out the parameter acts as a penalization: the posterior essentially puts mass on the sparsest way to approximate the true density.


Our result - 2

• In contrast, if $\min(\alpha_j, j \le k) > d/2$ and $k > k_0$, then 2 or more components will tend to merge, each with non-negligible weight. This will lead to less stable behaviour.

• In the intermediate case, if $\min(\alpha_j, j \le k) \le d/2 \le \max(\alpha_j, j \le k)$, then the situation varies depending on the $\alpha_j$’s, and on the difference between $k$ and $k_0$.


Implications: Model dimension

• When $d/2 > \max\{\alpha_j, j = 1, \dots, k\}$, $d k_0 + k_0 - 1 + \sum_{j \ge k_0 + 1} \alpha_j$ appears as an effective dimension of the model

• This is different from the number of parameters, dk+k-1, or from other “effective number of parameters”

• Similar results are obtained for other situations


Example 1: $y_i \sim N(0,1)$; fit $p\,N(\mu_1, 1) + (1-p)\,N(\mu_2, 1)$

$\alpha_i = 1 > d/2$


Example 2: $y_i \sim N(0,1)$

$G = p\,N_2(\mu_1, \Sigma_1) + (1-p)\,N_2(\mu_2, \Sigma_2)$, $\Sigma_j$ diagonal; $d = 3$; $\alpha_1 = \alpha_2 = 1 < d/2$


Conclusions

• The result validates the use of Bayesian estimation in mixture models with too many components.

• It is one of the few examples where the prior can actually have an impact asymptotically, even to first order (consistency) and where choosing a less informative prior leads to better results.

• It also shows that the penalization effect of integrating out the parameter, as considered in the Bayesian framework, is not only useful in model choice or testing contexts but also in estimating contexts.


Example 2: Empirical likelihoods

• Sometimes the likelihood associated with the data is not completely known or cannot be computed in a manageable time (eg population genetic models, hidden Markov models, dynamic models), so traditional tools based on stochastic simulation (eg, regular MCMC) are unavailable or unreliable.

• Eg, biosecurity spread model.

Christian Robert


Model alternative: ELvIS

• Define parameters of interest as functionals of the cdf F (eg moments of F), then use Importance Sampling via the Empirical Likelihood.

• Select the F that maximises the likelihood of the data under the moment constraint.

• Given a constraint of the form $E(\psi(Y)) = \theta$, the EL is defined as

$$L_{el}(\theta \mid y) = \max_F \prod_{i=1}^{n} \{F(y_i) - F(y_{i-1})\}$$

• For example, in the 1-D case when $\theta = E(Y)$, the empirical likelihood in $\theta$ is the maximum of $p_1 \cdots p_n$ under the constraint $\sum_{i=1}^{n} p_i y_i = \theta$
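For the 1-D mean example, the maximisation has a well-known Lagrange-multiplier form, which makes the empirical likelihood cheap to evaluate. The sketch below is one possible implementation, not the ELvIS code: it adds the usual normalisation that the weights sum to one, and solves for the multiplier by bisection. The function name `log_empirical_likelihood` and the data are hypothetical.

```python
import math

def log_empirical_likelihood(y, theta, n_bisect=200):
    """log of max prod(p_i) subject to sum(p_i) = 1 and sum(p_i * y_i) = theta.

    Returns -inf when theta is not strictly inside (min(y), max(y)).
    """
    n = len(y)
    z = [yi - theta for yi in y]
    if min(z) >= 0.0 or max(z) <= 0.0:
        return -math.inf
    # Optimal weights are p_i = 1 / (n * (1 + lam * z_i)), where lam solves
    # sum_i z_i / (1 + lam * z_i) = 0. The left side is decreasing in lam.
    lo = (1.0 / n - 1.0) / max(z)
    hi = (1.0 / n - 1.0) / min(z)
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        score = sum(zi / (1.0 + mid * zi) for zi in z)
        if score > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return -sum(math.log(n * (1.0 + lam * zi)) for zi in z)

y = [1.0, 2.0, 3.0, 4.0, 10.0]           # made-up data with sample mean 4
at_mean = log_empirical_likelihood(y, 4.0)
off_mean = log_empirical_likelihood(y, 3.0)
```

At the sample mean the weights are all 1/n, so the log empirical likelihood equals -n log n; it decreases as the candidate value moves away. In an importance sampling scheme, values like these would play the role of the likelihood when weighting draws from the prior.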


Quantile distributions

• A quantile distribution is defined by a closed-form quantile function $F^{-1}(p; \theta)$ and generally has no closed form for the density function.

• Properties: very flexible, very fast to simulate (simple inversion of the uniform distribution).

• Examples: 3/4/5-parameter Tukey’s lambda distribution and generalisations; Burr family; g-and-k and g-and-h distributions.


g-and-k quantile distribution
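Simulation by inversion can be sketched in a few lines. The code below uses the commonly quoted form of the g-and-k quantile function with the constant c fixed at 0.8; that form and the value of c are assumptions here, since the slide's formula did not survive extraction. The names `gk_quantile` and `gk_sample` are illustrative.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def gk_quantile(p, A, B, g, k, c=0.8):
    """g-and-k quantile function; z is the standard normal quantile of p."""
    z = _STD_NORMAL.inv_cdf(p)
    # tanh(g*z/2) equals (1 - exp(-g*z)) / (1 + exp(-g*z))
    skew = 1.0 + c * math.tanh(0.5 * g * z)
    return A + B * skew * (1.0 + z * z) ** k * z

def gk_sample(n, A, B, g, k, rng):
    """Draw n values by pushing uniforms through the quantile function."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:                      # inv_cdf is undefined at exactly 0
            draws.append(gk_quantile(u, A, B, g, k))
    return draws

sample = gk_sample(1000, A=3.0, B=2.0, g=1.0, k=0.5, rng=random.Random(7))
```

With (A, B, g, k) = (0, 1, 0, 0) the quantile function reduces to the standard normal quantile, which gives a quick sanity check of the implementation.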


Methods for estimating a quantile distribution

• MLE using numerical approximation to the likelihood

• Moment matching

• Generalised bootstrap

• Location and scale-free functionals

• Percentile matching

• Quantile matching

• ABC

• Sequential MC approaches for multivariate extensions of the g-and-k


ELvIS in practice

• Two values of $\theta = (A, B, g, k)$:
  – $\theta = (0, 1, 0, 0)$: standard normal distribution
  – $\theta = (3, 2, 1, 0.5)$: Allingham’s choice

• Two priors for $\theta$:
  – $U(0,5)^4$
  – $A \sim U(-5,5)$, $B \sim U(0,5)$, $g \sim U(-5,5)$, $k \sim U(-1,1)$

• Two sample sizes:
  – n=100
  – n=1000


ELvIS in practice: $\theta = (3, 2, 1, 0.5)$, n=100


Matchmaking 101


A wealth of algorithms!


From model to algorithm

Models:
• Logistic regression
• Non-Gaussian state space models
• Spatial dynamic factor models

Evaluate:
• Computation time
• Maximum bias
• sd
• Inefficiency factor (IF)
• Acceptance rate

Chris Strickland


Logistic Regression

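k = 2, 4, 8, 20 covariates; n=1000 observations

• Importance sampling (IS):
  – $E[h(\beta)] = \int h(\beta)\, [p(\beta \mid y) / q(\beta \mid y)]\, q(\beta \mid y)\, d\beta$, with $q(\beta \mid y)$ proportional to $\exp(-0.5 (\beta - \beta^*)^T V^{-1} (\beta - \beta^*))$
  – MLE $\beta^*$ (mode found by IRWLS), variance $V = [-\partial^2 \log p(y \mid X, \beta) / (\partial\beta\, \partial\beta^T)]^{-1} |_{\beta = \beta^*}$

• Random walk Metropolis-Hastings (RWMH):
  – same proposal distribution

• Adaptive RWMH (Garthwaite, Yan, Sisson)
  – Only needs starting values ($\beta^*$) – easiest candidate!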
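The importance sampling scheme can be sketched for the simplest possible case. The code below is a toy illustration under stated assumptions, not the benchmark code from the talk: a single covariate with no intercept, a flat prior, simulated data, and a normal candidate centred at the mode with curvature-based variance. All names (`log_lik`, `newton_mode`, `is_posterior_mean`) are hypothetical.

```python
import math
import random

def log_lik(beta, x, y):
    """Logistic regression log-likelihood with a single coefficient."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = beta * xi
        # yi*eta - log(1 + exp(eta)), with the log term computed stably
        total += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def newton_mode(x, y, iters=50):
    """Mode and inverse-curvature variance via 1-D Newton steps (IRWLS)."""
    beta = 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-beta * xi)) for xi in x]
        grad = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        info = sum(xi * xi * pi * (1.0 - pi) for xi, pi in zip(x, p))
        beta += grad / info
    return beta, 1.0 / info

def is_posterior_mean(x, y, n_draws, rng):
    """Self-normalised IS estimate of the posterior mean under a flat prior."""
    mode, var = newton_mode(x, y)
    sd = math.sqrt(var)
    draws = [rng.gauss(mode, sd) for _ in range(n_draws)]
    # log weight = log target - log candidate; constants cancel on normalising
    logw = [log_lik(b, x, y) + 0.5 * (b - mode) ** 2 / var for b in draws]
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]
    return sum(wi * b for wi, b in zip(w, draws)) / sum(w), mode

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-0.8 * xi)) else 0 for xi in x]
estimate, mode = is_posterior_mean(x, y, n_draws=2000, rng=rng)
```

Because the draws are independent, no autocorrelation diagnostics are needed. The price is that the candidate must cover the posterior well, which becomes harder as the number of covariates grows.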


Results

Algorithm   k    time   bias   sd     IF    acc. rate
IS          2    5.1    0.07   0.06   -     -
IS          8    5.6    0.18   0.09   -     -
IS          20   6.3    0.30   0.11   -     -
RWMH        2    8.1    0.08   0.07   12    0.56
RWMH        8    8.8    0.15   0.10   30    0.20
RWMH        20   9.2    0.20   0.12   120   0.03
ARWMH       2    10.5   0.08   0.07   10    0.24
ARWMH       8    10.8   0.20   0.07   35    0.23
ARWMH       20   12.1   0.30   0.12   50    0.21


So what?

• If k is small (e.g. 8), even a naïve candidate is ok.

• When k is larger (e.g. 20), we need something more intelligent, e.g. adaptive MH

• As models become more complicated, we need to get more sophisticated


Importance sampler

vs MCMC

vs Particle filter

vs Laplace approximation (INLA)

Non-Gaussian state space model


Non-Gaussian state space model

• Importance sampler

√ General algorithms (Durbin & Koopman 2001, global approximation) and tailored algorithms (Liesenfeld & Richard 2006, local approximation)

√ Independence sampler - don’t have to worry about correlated draws

√ Parallelisable – potentially much faster

× Difficult to come up with a good candidate distribution

× More difficult as the sample size and model complexity increase

× More complex than MCMC to obtain non-standard expectations


Non-Gaussian state space model

• MCMC

√ Very flexible w.r.t. extensions to other models

√ The same algorithms can be used as the sample size and/or parameter dimension grows (with some provisos)

× Can be slow, and complicated to achieve good acceptance and mixing

× Single move samplers perform poorly in terms of mixing (Pitt & Shephard; Kim, Shephard & Chib)

√ Very efficient (mixing) algorithms can be designed for specific problems, eg stochastic volatility (Kim, Shephard & Chib)

√ General approaches available – simple reparametrisation can lead to vastly improved simulation efficiency (Strickland et al)


Non-Gaussian state space model

• Particle filters

√ Easy to implement – at least intuitively

√ Updating estimates as the sample size grows is possibly a lot simpler and cheaper than a full MCMC or IS sampler update

× Perhaps not as easy as appears with parameter learning

× Particle approximation degenerates – may need MCMC steps, or alternatives

× To do full MCMC updates, need to store the entire history of particle approximation
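The points above about ease of implementation and degeneracy can be seen in a bootstrap particle filter. The sketch below is illustrative only, for a hypothetical non-Gaussian state space model (AR(1) latent state, Poisson counts with log link); the model, its parameter values and the data are assumptions made for this example. It tracks the effective sample size of the weights and resamples at every step.

```python
import math
import random

def bootstrap_filter(ys, n_particles, rng, phi=0.9, sigma=0.5):
    """Filtered state means and the ESS of the weights at each time step.

    Assumed model: x_t = phi * x_{t-1} + N(0, sigma^2), y_t ~ Poisson(exp(x_t)).
    """
    stationary_sd = sigma / math.sqrt(1.0 - phi ** 2)
    particles = [rng.gauss(0.0, stationary_sd) for _ in range(n_particles)]
    means, ess_path = [], []
    for y in ys:
        # propagate through the state equation (the "bootstrap" proposal)
        particles = [phi * x + rng.gauss(0.0, sigma) for x in particles]
        # Poisson log-pmf with rate exp(x): y*x - exp(x) - log(y!)
        logw = [y * x - math.exp(x) - math.lgamma(y + 1) for x in particles]
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        total = sum(w)
        w = [wi / total for wi in w]
        means.append(sum(wi * x for wi, x in zip(w, particles)))
        ess_path.append(1.0 / sum(wi * wi for wi in w))
        # multinomial resampling to counter weight degeneracy
        particles = rng.choices(particles, weights=w, k=n_particles)
    return means, ess_path

ys = [1, 0, 2, 3, 1, 0, 0, 4, 2, 1]      # made-up count data
means, ess_path = bootstrap_filter(ys, n_particles=500, rng=random.Random(5))
```

The static parameters are held fixed here. Learning them alongside the states is where the difficulties listed above arise, because resampling alone cannot rejuvenate a parameter value once its particles have died out.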


Non-Gaussian state space model

• Integrated Nested Laplace Approximation (INLA)

√ Extremely fast

√ If model complexity stays the same, then it can work for very large problems

√ R interface to code

× Can only handle a small number of hyperparameters that can be forced into the GMRF approximation, so very restrictive in the problem set


So what?

• Many algorithms: general versus specific, flexible versus tailored

• Pros and cons should be weighed against inferential aims, model and computational resources

• Blocking and reparametrisation are two good tricks, but we need to be clever about non-centred reparametrisation


Spatial Dynamic Factor Models

$Y_t = B f_t + \epsilon_t$, $\epsilon_t \sim N(0, \Sigma)$, $\Sigma$ diagonal

factor loadings $b_j \sim N(0, V(s))$

• Spatial correlation:
  – Lopes et al. use a GRF, $O((p \times k^*)^3)$
  – Strickland et al. use a GMRF (images: large discrete spatial domain) + Krylov subspace methods to sample from GMRF posterior, scales linearly, $O(p \times k^*)$
  – Rue uses Cholesky decomposition, more complex, $O((p \times k^*)^{3/2})$


So what?

Difference between (i) $O((p \times k^*)^3)$ and (ii) $O(p \times k^*)$:

If the data set becomes 100 times larger, (i) will take 1 million times longer, compared to 100 times longer for (ii). Instead of waiting 1 hour, you might have to wait more than 1 year…
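As a rough arithmetic check, the slowdown factor implied by each complexity order quoted for the GRF and GMRF samplers can be computed directly. In this sketch n stands for the product p×k*, constants are ignored, and the function name `slowdown` is illustrative.

```python
def slowdown(growth, exponent):
    """Factor by which run time grows when the problem size grows by `growth`."""
    return growth ** exponent

# Data set becomes 100 times larger
factors = {
    "O(n)": slowdown(100, 1.0),          # linear cost
    "O(n^(3/2))": slowdown(100, 1.5),    # 3/2-power cost
    "O(n^3)": slowdown(100, 3.0),        # cubic cost
}

# Gap between cubic and linear cost after the 100-fold increase
cubic_vs_linear = factors["O(n^3)"] / factors["O(n)"]
```

A 100-fold increase in size costs a factor of 100 under linear scaling, 1000 under 3/2-power scaling and one million under cubic scaling. The cubic and linear methods therefore end up a factor of 10,000 apart, and 10,000 hours is a little over a year.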


Case study

• Spatial domain: 900 pixels

• Temporal: approx 200 periods (every 16 days)

• Total 180,000 observations, ~3600 parameters

10 mins for 15000 MCMC iterations (on a laptop!)


Conclusions

• The model is too complicated and the datasets too large for IS or PF.

• There are too many parameters for INLA.

• MCMC is thus desirable, but it is extremely important to choose good algorithms!

As the model becomes more complex, the choice of smart algorithms becomes more important and can make a large difference in computation and estimation


Matchmaking 102


Smart algorithms


Hybrid algorithms

Tierney (1994)

Design an efficient algorithm that combines features of other algorithms, in order to overcome identified weaknesses in the component algorithms


Comparing algorithms

• Accuracy
  – bias (H)

• Efficiency
  – rate of convergence, rate of acceptance (A)
  – mixing speed (integrated autocorrelation time, $\tau(H)$)

• Applicability
  – simplicity of set-up, flexibility of tailoring

• Implementation
  – coding difficulty, memory storage
  – computational demand:
    • total number of iterations ($T$), burn-in ($T_0$)
    • correlation along the chain, measured by the effective sample size $ESS_H = (T - T_0) / \tau(H)$
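The effective sample size, defined here as the post-burn-in chain length divided by the integrated autocorrelation time, can be estimated from a single chain. The sketch below truncates the autocorrelation sum at the first non-positive estimate, which is one common heuristic among several and an assumption of this example. The function names and the AR(1) test chain are illustrative.

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag (x must not be constant)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / n
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var

def integrated_autocorr_time(x, max_lag=None):
    """1 + 2 * sum of autocorrelations, cut at the first non-positive term."""
    n = len(x)
    max_lag = max_lag if max_lag is not None else n // 2
    tau = 1.0
    for lag in range(1, max_lag):
        rho = autocorr(x, lag)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau

def effective_sample_size(chain, burn_in):
    kept = chain[burn_in:]                # drop the first burn_in iterations
    return len(kept) / integrated_autocorr_time(kept)

# Toy chain: AR(1) with coefficient 0.9, so successive draws are correlated
rng = random.Random(3)
chain, x = [], 0.0
for _ in range(5000):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    chain.append(x)
ess = effective_sample_size(chain, burn_in=500)
```

For this toy chain the theoretical integrated autocorrelation time is (1 + 0.9) / (1 - 0.9) = 19, so the 4500 retained draws carry roughly the information of a few hundred independent ones.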


Hybrid algorithms - 1

• Metropolis-Hastings Algorithms (MHAs)
  – Improve mixing speed via:
    • parallel chains (MHA)
    • repulsive proposal (MHARP, M&R), pinball sampler
    • delayed rejection (DRA, Tierney & Mira; DRALP; DRAPinball)
  – Improve applicability by:
    • reversible jump (Green)
    • Metropolis adjusted Langevin (MALA; Tierney & Roberts, Besag & Green)

Kate Lee, Christian Robert, Ross McVinish


Simulation study

• Model
  – mixture of 2-D normals, well separated: $\pi = 0.5\,N([0,0]^T, I_2) + 0.5\,N([5,5]^T, I_2)$
  – 100 replicated simulations
  – Each result is obtained after running the algorithm for 1200 seconds using 10 particles
  – Proposal variance = 4; value in brackets is MSE (similar results for variance = 2)

• Platform
  – Matlab version 7.0.4
  – run on an SGI Altix XE cluster containing 112 x 64-bit Intel Xeon cores.
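The baseline sampler in this study can be sketched as a plain random-walk Metropolis-Hastings chain on the two-mode target with proposal variance 4. This is a single-chain Python illustration written for this transcript, not the Matlab code used in the study; the function names, run length and seed are assumptions.

```python
import math
import random

def log_target(x):
    """log density, up to a constant, of 0.5 N([0,0], I) + 0.5 N([5,5], I)."""
    a = -0.5 * (x[0] ** 2 + x[1] ** 2)
    b = -0.5 * ((x[0] - 5.0) ** 2 + (x[1] - 5.0) ** 2)
    m = max(a, b)
    return m + math.log(0.5 * math.exp(a - m) + 0.5 * math.exp(b - m))

def rwmh(n_iter, proposal_var, rng, start=(0.0, 0.0)):
    """Random-walk MH with an isotropic normal proposal."""
    sd = math.sqrt(proposal_var)
    x, lp = list(start), log_target(start)
    chain, accepted = [], 0
    for _ in range(n_iter):
        prop = [xi + rng.gauss(0.0, sd) for xi in x]
        lp_prop = log_target(prop)
        # symmetric proposal: accept with probability min(1, target ratio)
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        chain.append(tuple(x))
    return chain, accepted / n_iter

chain, acc_rate = rwmh(20000, proposal_var=4.0, rng=random.Random(42))
# Share of iterations spent nearer the mode at [5, 5]
frac_upper = sum(1 for x in chain if x[0] + x[1] > 5.0) / len(chain)
```

Because the modes are well separated, jumps between them are rare for a single chain, so `frac_upper` can sit far from the true value of 0.5 in a short run. That is the weakness the repulsive-proposal and delayed-rejection variants aim to address.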


Results

T = no. simulations; A = acceptance rate; H = accuracy; $\sigma^2_H$ = var(H); $\tau(H)$ = autocorrelation time; $ESS_H$ = effective sample size


Results

• MHA
  – shortest CPU time per iteration and largest sample size
  – need to tune proposal variance to optimise performance

• MALA
  – can get trapped in nearest mode if the scaling parameter in the variance of the proposal is not sufficiently large

• MHA with RP
  – induces a fast mixing chain, but need to choose tuning parameter
  – expensive to compute
  – in rare cases the algorithm can be unstable and get stuck

• DRA
  – less correlated chains, higher acceptance rate
  – higher computational demand
  – Langevin proposal: improves mixing, but the loss in computational efficiency overwhelmed gain in statistical efficiency
  – Normal random walk faster to compute and improves mixing.


Hybrid algorithms - 2

• Population Monte Carlo (PMC)
  – Extension of IS by allowing importance function to adapt to the target in an iterative manner (Cappé et al.)
  – PMC with repulsive proposal: create ‘holes’ around existing particles

• Particle systems and MCMC
  – IS + MCMC (M&R)
  – parallel MCMC + SMC (Del Moral)


Simulation study repeated

• 100 replicates, run for 500 seconds using 50 particles, with first 100 iterations ignored

√ accuracy of estimation

√ fast exploration (significantly reduced integrated autocorrelation time)

√ No instability (unlike MCMC algorithms)

√ Less sensitivity to importance function

√ Repulsive effect improved mixing


Summary

              Statistical Efficiency   Computation   Applicability
Algorithm     EPM   CR   RC            CE    SP      FH   CP   Mode
MALA           1     1    1            -2    -1       0   -1   S
MHARP          0     1    0            -2    -1      -1   -2   B
DRA            1     1    0            -1    -1       0    0   B
DRALP          2     1    0            -2    -1       0    0   B
DRAPinball     2     1    0            -2    -1       0    0   B
PS             2     1    0            -3    -2      -1   -2   B
PMCR           -     2    0            -1    -1      -1    0   B

Relative performance compared to MHA and PMC

EPM = efficiency of proposal move; CR = correlation reduction of chain; RC = rate of convergence; CE = cost effectiveness; SP = simplicity of programming; FH = flexibility of hyperparameters; CP = consistency of performance; Mode = preference between a single mode and multimodal problem


Hybrid algorithms - 3


ABCel

Unlike ABC, ABCel does not require:


Conclusions

1. Combining features of individual algorithms may lead to complicated characteristics in a hybrid algorithm.

2. Each individual algorithm may have a strong individual advantage with respect to a particular performance criterion, but this does not guarantee that the hybrid method will enjoy a joint benefit of these strategies.

3. The combination of algorithms may add complexity in set-up, programming and computational expense.


Implementing smart algorithms: PyMCMC

• Python package for fast MCMC

• Takes advantage of Python libraries NumPy, SciPy

• Classes for Gibbs, M-H, orientational bias MC, slice samplers, etc.

• Linear (with stochastic search), logit, probit, log-linear, linear mixed-model, probit mixed-model, nonlinear mixed models, mixture, spatial mixture, spatial mixture with regressors, time series suite (including DFM and SDFM)

• Straightforward to optimise, extensible to C or Fortran, parallelisable (GPU)

Chris Strickland


PyMCMC



Matchmaking algorithm!

Model Algorithm

Inference

Cool problems

Past Future


Key References

• Lee, K., Mengersen, K., Robert, C.P. (2012) Hybrid models. In Case Studies in Bayesian Modelling. Eds Alston, Mengersen, Pettitt. Wiley, to appear.

• Stanaway, M., Reeves, R., Mengersen, K. (2010) Hierarchical Bayesian modelling of early detection surveillance for plant pest invasions. J. Environmental and Ecological Statistics.

• Strickland, C., Simpson, D., Denham, R., Turner, I., Mengersen, K. Fast methods for spatial dynamic factor models. CSDA.

• Strickland, C., Alston, C., Mengersen, K. (2011) PyMCMC. J. Statistical Software. Under review.

• White, N., Johnson, H., Silburn, P., Mengersen, K. Unsupervised sorting and comparison of extracellular spikes with Dirichlet Process Mixture Models. Annals Applied Statistics. Under review.