
An adaptive SMC scheme for Approximate Bayesian Computation (ABC)

Fernando Bonassi

(joint work with Prof. Mike West)

Department of Statistical Science - Duke University

April/2011

Fernando Bonassi An adaptive SMC scheme for ABC


Approximate Bayesian Computation (ABC)

Problems in which the likelihood is intractable but we can simulate the underlying stochastic model

So-called implicit statistical models

Allow great flexibility to model complex systems

Applications in evolutionary biology, epidemiology, systems biology.


ABC Algorithm

1 Draw $\theta$ from the prior $\pi(\theta)$

2 Simulate $x \sim f(x \mid \theta)$

3 Accept $\theta$ if $\rho(x, x_{obs}) < \epsilon$

The resulting distribution is $\pi(\theta \mid \rho(x, x_{obs}) < \epsilon)$

Exact posterior when $\epsilon = 0$
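The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `abc_rejection`, the uniform prior on (-10, 10), the unit-variance normal simulator, and the tolerance 0.5 are all placeholders chosen here, not taken from the slides.

```python
import random


def abc_rejection(prior_sample, simulate, distance, x_obs, eps, n_accept, rng):
    """Collect n_accept prior draws whose simulated data land within eps of x_obs."""
    accepted, n_sim = [], 0
    while len(accepted) < n_accept:
        theta = prior_sample(rng)        # step 1: draw theta from the prior
        x = simulate(theta, rng)         # step 2: simulate data given theta
        n_sim += 1
        if distance(x, x_obs) < eps:     # step 3: accept if close to the observation
            accepted.append(theta)
    return accepted, n_sim


# Hypothetical setup: uniform prior, normal simulator, absolute-difference distance.
rng = random.Random(0)
samples, n_sim = abc_rejection(
    prior_sample=lambda r: r.uniform(-10.0, 10.0),
    simulate=lambda theta, r: r.gauss(theta, 1.0),
    distance=lambda x, y: abs(x - y),
    x_obs=0.0,
    eps=0.5,
    n_accept=200,
    rng=rng,
)
acceptance_rate = len(samples) / n_sim
```

Shrinking `eps` in this sketch makes the accepted draws closer to the exact posterior, at the price of a lower `acceptance_rate`.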


ABC Illustration


Approximate Bayesian Computation (ABC)

Accuracy of the approximation is controlled by the tolerance level $\epsilon$

Ideally, $\epsilon$ should be very small, but that implies a low acceptance rate

Two kinds of methods have been proposed to improve the efficiency: automatic and post-sampling


Automatic

e.g., ABC-MCMC, ABC-SMC

Inputs before simulation steps

Post-sampling

e.g., ABC-REG, ABC-GLM

Analysis after simulation steps

Automatic methods rely on more efficient schemes to sample from $\pi(\theta \mid \rho(x, x_{obs}) < \epsilon)$

Post-sampling methods are based on some sort of regression to correct sampled values and approximate $\pi(\theta \mid x_{obs})$


A post-sampling approach: ABC-MIX

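Marginal data in bionetwork models: toggle switch model (Bonassi, You and West, 2011)

Model

$$y_u = u_T + \mu + \mu \sigma \eta_u / u_T^{\gamma}$$

$$\frac{du}{dt} = \frac{\alpha_u}{1 + v_t^{\beta_u}} - (\kappa_u + \delta_u u_t) + \sigma_u \xi_{u,t}$$

$$\frac{dv}{dt} = \frac{\alpha_v}{1 + u_t^{\beta_v}} - (\kappa_v + \delta_v v_t) + \sigma_v \xi_{v,t}$$

Independent noise processes $\xi_{\cdot}$, $\eta_{\cdot}$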

Observation


ABC-MIX for the toggle switch model

Massive prior:model simulation ⇒ large sample of $(\theta, y)$

Data characterization and dimension reduction by means of signatures $S(y)$ over a set of reference distributions

Constrain the sample $\{\theta, S\}$, keeping the 5% of synthetic datasets closest to $S_{obs}$
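The constraining step can be sketched as a simple sort-and-truncate. The sketch below is an assumption-laden illustration: `keep_closest` is a name invented here, and a scalar noisy copy of the parameter stands in for the signature $S(y)$, which in the slides is built from reference distributions.

```python
import random


def keep_closest(thetas, summaries, s_obs, fraction, distance):
    """Keep the given fraction of (theta, summary) pairs closest to s_obs."""
    dists = [distance(s, s_obs) for s in summaries]
    n_keep = max(1, round(fraction * len(dists)))
    order = sorted(range(len(dists)), key=dists.__getitem__)[:n_keep]
    tolerance = dists[order[-1]]  # largest distance among the kept pairs
    return [thetas[i] for i in order], [summaries[i] for i in order], tolerance


# Hypothetical prior:model sample with a scalar stand-in for the signature.
rng = random.Random(1)
thetas = [rng.uniform(-10.0, 10.0) for _ in range(1000)]
summaries = [rng.gauss(theta, 1.0) for theta in thetas]
kept_thetas, kept_summaries, tolerance = keep_closest(
    thetas, summaries, s_obs=0.0, fraction=0.05, distance=lambda a, b: abs(a - b)
)
```

The returned `tolerance` is the implied acceptance threshold, so fixing the kept fraction is one way of choosing the tolerance level from the simulated sample itself.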


ABC-MIX for the toggle switch model

Fit a mixture model to the constrained sample $\{\theta, S\}$ (Suchard et al. 2010, Cron and West, 2011)

The conditional mixture $g(\theta \mid S_{obs})$ yields an approximate posterior distribution


ABC-SMC

Automatic ABC approach based on Sequential Monte Carlo

Main goal: improve the acceptance rate of ABC by dividing the problem into subproblems (Sisson et al., 2007, Beaumont et al., 2009)

In each step $t$ obtain $\pi(\theta \mid \rho(x, x_{obs}) < \epsilon_t)$ for a decreasing tolerance schedule $\{\epsilon_1, \dots, \epsilon_T\}$


ABC-SMC Algorithm

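S1 Initialize $\epsilon_1 > \dots > \epsilon_T$

S2 $t = 1$

Simulate $\theta_i^{(1)} \sim \pi(\theta)$ and $x \sim f(x \mid \theta_i^{(1)})$ until $\rho(x, x_{obs}) < \epsilon_1$

Set $w_i = 1/N$

S3 $t = 2, \dots, T$

Pick $\theta_i^*$ from the $\theta_j^{(t-1)}$'s with probabilities $w_j^{(t-1)}$

Generate $\theta_i^{(t)} \sim K_t(\theta_i^{(t)} \mid \theta_i^*)$ and $x \sim f(x \mid \theta_i^{(t)})$ until $\rho(x, x_{obs}) < \epsilon_t$

Set $w_i^{(t)} \propto \dfrac{\pi(\theta_i^{(t)})}{\sum_j w_j^{(t-1)} K_t(\theta_i^{(t)} \mid \theta_j^{(t-1)})}$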
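Steps S1 to S3 can be sketched as follows. Several choices here are assumptions rather than part of the slides: the function name `abc_smc`, a Gaussian kernel $K_t$ whose variance is twice the weighted variance of the previous particles, a small particle count, and a final tolerance of 0.1 to keep the run short. The simulator is a uniform prior on (-10, 10) with an equal mixture of two normals around $\theta$.

```python
import math
import random


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def abc_smc(prior_sample, prior_pdf, simulate, distance, x_obs, eps_schedule, n, rng):
    """Return particles, normalized weights, and simulation counts per step."""
    # S2: rejection ABC at the first tolerance, with uniform weights.
    thetas, count = [], 0
    while len(thetas) < n:
        theta = prior_sample(rng)
        count += 1
        if distance(simulate(theta, rng), x_obs) < eps_schedule[0]:
            thetas.append(theta)
    weights, n_sims = [1.0 / n] * n, [count]

    # S3: move particles through the remaining tolerances.
    for eps in eps_schedule[1:]:
        mean = sum(w * th for w, th in zip(weights, thetas))
        var = sum(w * (th - mean) ** 2 for w, th in zip(weights, thetas))
        sd = math.sqrt(2.0 * var)  # assumed kernel scale, not from the slides
        new_thetas, new_weights, count = [], [], 0
        while len(new_thetas) < n:
            star = rng.choices(thetas, weights=weights)[0]  # pick theta*
            theta = rng.gauss(star, sd)                     # perturb with K_t
            if prior_pdf(theta) == 0.0:
                continue  # outside the prior support, weight would be zero
            count += 1
            if distance(simulate(theta, rng), x_obs) < eps:
                mixture = sum(w * normal_pdf(theta, tj, sd) for w, tj in zip(weights, thetas))
                new_thetas.append(theta)
                new_weights.append(prior_pdf(theta) / mixture)
        total = sum(new_weights)
        thetas, weights = new_thetas, [w / total for w in new_weights]
        n_sims.append(count)
    return thetas, weights, n_sims


def toy_simulate(theta, rng):
    # Equal mixture of N(theta, 1) and N(theta, 1/100), i.e. sd 1 or 0.1.
    sd = 1.0 if rng.random() < 0.5 else 0.1
    return rng.gauss(theta, sd)


rng = random.Random(2)
thetas, weights, n_sims = abc_smc(
    prior_sample=lambda r: r.uniform(-10.0, 10.0),
    prior_pdf=lambda th: 0.05 if -10.0 < th < 10.0 else 0.0,
    simulate=toy_simulate,
    distance=lambda x, y: abs(x - y),
    x_obs=0.0,
    eps_schedule=(5.0, 1.0, 0.1),
    n=200,
    rng=rng,
)
```

The list `n_sims` records how many data simulations each step needed, which is the cost measure used to compare the schemes.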


Toy Example

$\theta \sim \mathrm{Unif}(-10, 10)$

Likelihood: $f(x \mid \theta) = 0.5\, N(\theta, 1) + 0.5\, N(\theta, 1/100)$

Goal: Approximate the posterior of $\theta$ for $x_{obs} = 0$


ABC-SMC for the tolerance schedule: $\epsilon_1 = 5$, $\epsilon_2 = 1$, $\epsilon_3 = 0.01$

$\pi(\theta \mid \rho(x, x_{obs}) < \epsilon_t)$, where $\rho$ is the Euclidean distance:


For N=5,000 particles, number of data-generation simulations (in $10^3$) in each step

Step t   $\epsilon_t$   ABC-SMC   ABC
1        5              10        -
2        1              26        -
3        0.01           734       4,424
Total                   770       4,424

The most expensive computational step is generally the model simulation

Beaumont et al. (2009) report 95% of time spent in model simulation for their application of ABC-SMC


Idea behind ABC-SMC

$\sum_j w_j^{(t-1)} K_t(\theta_i^{(t)} \mid \theta_j^{(t-1)})$ can be seen as a mixture approximation for $\pi(\theta \mid \rho(x, x_{obs}) < \epsilon_{t-1})$

This approximation is then used as a proposal for $\pi(\theta \mid \rho(x, x_{obs}) < \epsilon_t)$ in order to achieve a better approximation

In some sense, it follows the same ideas as the adaptive importance sampling of West (1993)


An adaptive SMC scheme for ABC

Extending the mixture approximation idea, we can approximate $\pi(x, \theta \mid \rho(x, x_{obs}) < \epsilon_{t-1})$ by:

$$g(x, \theta) \sim \sum_j w_j^{(t-1)} K_{t,x}(x_i^{(t)} \mid x_j^{(t-1)})\, K_{t,\theta}(\theta_i^{(t)} \mid \theta_j^{(t-1)})$$

This is a more complete representation of the joint distribution of $(x, \theta)$, which should induce better proposals and better efficiency


An adaptive SMC scheme for ABC

The new induced approximation will be:

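$$g(\theta \mid x_{obs}) \propto \sum_j K_{t,x}(x_{obs} \mid x_j^{(t-1)})\, w_j^{(t-1)} K_{t,\theta}(\theta_i^{(t)} \mid \theta_j^{(t-1)})$$

Whereas in the ABC-SMC it was:

$$g(\theta \mid x_{obs}) \propto \sum_j w_j^{(t-1)} K_{t,\theta}(\theta_i^{(t)} \mid \theta_j^{(t-1)})$$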


Mixture approximation at step one ($\epsilon_1 = 5$) using ABC-SMC (blue) and ABC-SMC with adaptive weights (red)


ABC-SMC with Adaptive Weights

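S2 $t = 1$

Simulate $\theta_i^{(1)} \sim \pi(\theta)$ and $x \sim f(x \mid \theta_i^{(1)})$ until $\rho(x, x_{obs}) < \epsilon_1$

Set $w_i = 1/N$

S3 $t = 2, \dots, T$

Set weights $v_i^{(t-1)} \propto w_i^{(t-1)} K_{t,x}(x_{obs} \mid x_i^{(t-1)})$

Normalize the new weights $v_i^{(t-1)}$

Pick $\theta_i^*$ from the $\theta_j^{(t-1)}$'s with probabilities $v_j^{(t-1)}$

Generate $\theta_i^{(t)} \sim K_{t,\theta}(\theta_i^{(t)} \mid \theta_i^*)$ and $x \sim f(x \mid \theta_i^{(t)})$ until $\rho(x, x_{obs}) < \epsilon_t$

Set $w_i^{(t)} \propto \dfrac{\pi(\theta_i^{(t)})}{\sum_j v_j^{(t-1)} K_{t,\theta}(\theta_i^{(t)} \mid \theta_j^{(t-1)})}$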
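One pass of step S3 with adaptive weights might look like the sketch below. It differs from the plain ABC-SMC step only in that each particle keeps its simulated data and the previous weights are tilted by a data kernel before resampling. The name `aw_step`, the Gaussian form of both kernels, and the bandwidth choices in the usage lines are assumptions made for this illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def aw_step(thetas, xs, weights, x_obs, eps, sd_theta, sd_x,
            prior_pdf, simulate, distance, rng):
    """Advance a weighted population (theta_j, x_j, w_j) to the tolerance eps."""
    n = len(thetas)
    # Adaptive weights: v_j proportional to w_j * K_x(x_obs | x_j), then normalized.
    v = [w * normal_pdf(x_obs, x, sd_x) for w, x in zip(weights, xs)]
    total_v = sum(v)
    v = [vj / total_v for vj in v]

    new_thetas, new_xs, new_weights, n_sim = [], [], [], 0
    while len(new_thetas) < n:
        star = rng.choices(thetas, weights=v)[0]  # resample with v, not w
        theta = rng.gauss(star, sd_theta)         # perturb with K_theta
        if prior_pdf(theta) == 0.0:
            continue
        x = simulate(theta, rng)
        n_sim += 1
        if distance(x, x_obs) < eps:
            mixture = sum(vj * normal_pdf(theta, tj, sd_theta) for vj, tj in zip(v, thetas))
            new_thetas.append(theta)
            new_xs.append(x)
            new_weights.append(prior_pdf(theta) / mixture)
    total = sum(new_weights)
    return new_thetas, new_xs, [w / total for w in new_weights], n_sim


def toy_simulate(theta, rng):
    # Equal mixture of N(theta, 1) and N(theta, 1/100).
    sd = 1.0 if rng.random() < 0.5 else 0.1
    return rng.gauss(theta, sd)


# Step S2 by plain rejection at tolerance 5, keeping the simulated data.
rng = random.Random(3)
n, x_obs = 200, 0.0
thetas, xs = [], []
while len(thetas) < n:
    theta = rng.uniform(-10.0, 10.0)
    x = toy_simulate(theta, rng)
    if abs(x - x_obs) < 5.0:
        thetas.append(theta)
        xs.append(x)
weights = [1.0 / n] * n

# Assumed bandwidths: scaled spread of the current particles and data.
thetas2, xs2, weights2, n_sim = aw_step(
    thetas, xs, weights, x_obs, eps=1.0,
    sd_theta=math.sqrt(2.0) * statistics.pstdev(thetas),
    sd_x=statistics.pstdev(xs),
    prior_pdf=lambda th: 0.05 if -10.0 < th < 10.0 else 0.0,
    simulate=toy_simulate,
    distance=lambda a, b: abs(a - b),
    rng=rng,
)
```

Particles whose stored data were already near `x_obs` receive more resampling mass, which is the mechanism expected to reduce the number of simulations per accepted particle.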


Comparison for the Normal toy example

For N=5,000 particles, number of data-generation simulations (in $10^3$) in each step

Step t   $\epsilon_t$   ABC-SMC   ABC-SMC with AW
1        5              10        10
2        1              26        13
3        0.01           734       463
Total                   770       486


ABC-SMC with AW for the Toggle Switch Problem

Observation


ABC-SMC with AW for the Toggle Switch Problem

Each step: 10K model simulation steps and selection of the 10% closest datasets. Resulting tolerance schedule:

$\epsilon_{1:5} = (4.4, 3.5, 0.8, 0.4, 0.2)$

This way, the total number of data-generation steps was 50K.

For the previous analysis, ABC-MIX, some distance quantiles (in $10^{-5}$) for a sample of 200K:

q(10%) = 3.45, q(5%) = 2.95, q(1%) = 0.77


ABC-MIX:

ABC-SMC with AW:


Data-simulation steps in ABC-MIX: 200K.

Data-simulation steps in ABC-SMC with AW: 50K

For the regular ABC-SMC, with the same tolerance schedule, the number of generation steps was:

(10K, 10K, 23K, 23K, 27K, 35K)

Total: 128K


ABC-SMC with AW resulted in a final effective sample size of 456

Generation steps in ABC-SMC with AW depend on the particular real dataset, so the algorithm should be run separately for each one of the 10 real datasets

In ABC-MIX, the same 200K generations are used for every real dataset
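The slides do not say how the effective sample size was computed. A common choice, assumed here, is the ratio $(\sum_i w_i)^2 / \sum_i w_i^2$ of the importance weights, which a short helper (the name is hypothetical) can evaluate:

```python
def effective_sample_size(weights):
    """Kish-style effective sample size; works for unnormalized weights too."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


ess_uniform = effective_sample_size([0.25, 0.25, 0.25, 0.25])
ess_degenerate = effective_sample_size([1.0, 0.0, 0.0, 0.0])
ess_uneven = effective_sample_size([0.7, 0.1, 0.1, 0.1])
```

Under this definition equal weights give an effective sample size equal to the number of particles, and a single dominant weight drives it towards one.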


Another application (from Toni et al. (2009))

Common-cold outbreak on the island of Tristan da Cunha (1967)

day    1   2   3   4   ...   20   21
I(t)   1   2   3   7   ...    1    0
R(t)   0   0   0   0   ...   36   37


SIR Model: Susceptible (S), Infected (I) and Recovered (R)

For this case S is unobserved.

SIR Model

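Differential equations:

$$\frac{dS}{dt} = -\gamma S I$$

$$\frac{dI}{dt} = \gamma S I - \nu I$$

$$\frac{dR}{dt} = \nu I$$

Prior specification:

$\gamma \sim U(0, 3)$,

$\nu \sim U(0, 3)$,

$S(0) \sim \mathrm{Unif}\{37, \dots, 100\}$

Runge-Kutta method to approximate the solution of the ODE. The distance $\rho$ used was the Euclidean distance based on the observed time points.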
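A data-simulation step for this model can be sketched with a classical fourth-order Runge-Kutta integrator. The slides only say that a Runge-Kutta method was used, so the order, the step size, the function names, the rate names `gamma` (infection) and `nu` (recovery), and the parameter values in the usage lines are all assumptions of this sketch.

```python
import math


def sir_rhs(state, gamma, nu):
    s, i, r = state
    return (-gamma * s * i, gamma * s * i - nu * i, nu * i)


def rk4_step(state, h, gamma, nu):
    """One classical RK4 step of size h for the SIR equations."""
    k1 = sir_rhs(state, gamma, nu)
    k2 = sir_rhs(tuple(y + 0.5 * h * k for y, k in zip(state, k1)), gamma, nu)
    k3 = sir_rhs(tuple(y + 0.5 * h * k for y, k in zip(state, k2)), gamma, nu)
    k4 = sir_rhs(tuple(y + h * k for y, k in zip(state, k3)), gamma, nu)
    return tuple(
        y + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(gamma, nu, s0, i0, r0, n_days, steps_per_day=20):
    """Return (S, I, R) at days 1, ..., n_days, with day 1 as the initial state."""
    state = (float(s0), float(i0), float(r0))
    h = 1.0 / steps_per_day
    trajectory = [state]
    for _ in range(n_days - 1):
        for _ in range(steps_per_day):
            state = rk4_step(state, h, gamma, nu)
        trajectory.append(state)
    return trajectory


def euclidean_distance(trajectory, i_obs, r_obs):
    """Distance over the observed components I(t) and R(t) only (S is unobserved)."""
    return math.sqrt(sum(
        (st[1] - io) ** 2 + (st[2] - ro) ** 2
        for st, io, ro in zip(trajectory, i_obs, r_obs)
    ))


# Hypothetical parameter values; I(1) = 1 and R(1) = 0 match the first observed day.
trajectory = simulate_sir(gamma=0.02, nu=0.3, s0=40, i0=1, r0=0, n_days=21)
i_sim = [st[1] for st in trajectory]
r_sim = [st[2] for st in trajectory]
self_distance = euclidean_distance(trajectory, i_sim, r_sim)
```

In an ABC run, `euclidean_distance` would be evaluated against the observed I(t) and R(t) series and compared with the current tolerance.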


Using ABC-SMC with the tolerance schedule: $\epsilon_{1:4} = (100, 70, 40, 20)$

For N=1,000 particles, number of data generations in each step (in $10^3$):

ABC-SMC: Step 1: 29 Step 2: 49 Step 3: 706 Step 4: 63

ABC-SMC with AW: Step 1: 29 Step 2: 40 Step 3: 116 Step 4: 11


Summary

ABC methods: an interesting tool for modeling problems described by complex systems

Improvement of efficiency can be obtained by automatic and post-sampling methods

As an application and illustration of the post-sampling approach, the Toggle Switch model was studied using ABC-MIX

A new extension of ABC-SMC, based on adaptive weights, was proposed. It showed better efficiency than regular ABC-SMC

Choice of the most advantageous ABC approach is still a problem-specific question


M.A. Beaumont, J.M. Cornuet, J.M. Marin, and C.P. Robert, Adaptive approximate Bayesian computation, Biometrika 96 (2009), no. 4, 983.

M.A. Beaumont, W. Zhang, and D.J. Balding, Approximate Bayesian computation in population genetics, Genetics 162 (2002), no. 4, 2025.

F.V. Bonassi, L. You, and M. West, Bayesian learning from marginal data in bionetwork models, Department of Statistical Science, Duke University: Discussion Paper 11-07 (2011).

S.A. Sisson, Y. Fan, and M.M. Tanaka, Sequential Monte Carlo without likelihoods, Proceedings of the National Academy of Sciences USA 104 (2007), 1760–1765.

T. Toni and M.P.H. Stumpf, Simulation-based model selection for dynamical systems in systems and population biology, Bioinformatics 26 (2010), 104–110.

T. Toni, D. Welch, N. Strelkowa, A. Ipsen, and M.P.H. Stumpf, Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems, Journal of the Royal Society Interface 6 (2009), no. 31, 187.

M. West, Approximating posterior distributions by mixtures, Journal of the Royal Statistical Society (Ser. B) 54 (1993), 553–568.
