asymptotic genealogies of sequential monte carlo algorithms · asymptotic genealogies of sequential...

33
Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins 1 , Adam M. Johansen 2 , and Dario Span` o, all at Department of Statistics, University of Warwick. arxiv: 1804.01811v2 CIRM — 25.10.2018 Compiled: Wednesday 24 th October, 2018: 22:05 1 Also at Department of Computer Science, University of Warwick 2 Also at The Alan Turing Institute

Upload: others

Post on 21-Apr-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Asymptotic Genealogies ofsequential Monte Carlo algorithms

Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and DarioSpano, all at Department of Statistics, University of Warwick.

arxiv: 1804.01811v2

CIRM — 25.10.2018Compiled: Wednesday 24th October, 2018: 22:05

1Also at Department of Computer Science, University of Warwick2Also at The Alan Turing Institute

Page 2: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

CoSInES Launch Day, 2/11/18 at Warwick

.

CoSInES a collaboration of researchers from Warwick, Bristol,Lancaster, Oxford and the Alan Turing Institute totackle fundamental challenges in Computational andBayesian Statistics.

Dates The project will run 1st October 2018 till 30thSeptember 2023, is primarily funded by EPSRC.

Launch We’d like to invite anyone who would like to attendto register their interest by emailing Shital Desai([email protected]).

Jobs! Soon 5 4-year PDRA positions associated with theproject based at any of the 5 institutions involved inthe grant. Informal enquiries to Gareth Roberts([email protected]) are very welcome.

http://www.cosines.org

Page 3: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Outline

Sequential Monte Carlo

Path degeneracy

The genealogical process and convergence

A numerical example

Conclusions and outlook

Page 4: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Importance sampling

I Intractable target:

Eπ[f (X)] :=

∫f (x)π(dx).

I Monte Carlo: let X(1:N) ∼ π⊗N . Then

Eπ[f (X)] ≈ 1

N

N∑i=1

f (X(i)).

I Change of measure:

Eπ[f (X)] = Eµ[dπ

dµ(X)f (X)

]=

∫dπ

dµ(x)f (x)µ(dx).

I Importance sampling: let X(1:N) ∼ µ⊗N . Then

Eπ[f (X)] ≈ 1

N

N∑i=1

dµ(X(i))f (X(i)).

Page 5: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Sequential Monte Carlo / Interacting Particle System

1: for i ∈ {1, . . . ,N} Sample X(i)0 ∼ µ0.

2: for i ∈ {1, . . . ,N} Set

w(i)0 ←

π0(X(i)0 )/µ0(X

(i)0 )∑N

j=1 π0(X(j)0 )/µ0(X

(j)0 )

.

3: for t ∈ {1, . . . ,T} do4: Sample (a

(1)t , . . . , a

(N)t ) ∼ Categorical(w

(1)t−1, . . . ,w

(N)t−1).

5: for i ∈ {1, . . . ,N} Sample X(i)t ∼ µt(·|X(a

(i)t )

t−1 ).6: for i ∈ {1, . . . ,N} Set

w(i)t ←

πt(X(i)t |X

(a(i)t )

t−1 )/µt(X(i)t |X

(a(i)t )

t−1 )∑Nj=1 πt(X

(j)t |X

(a(j)t )

t−1 )/µt(X(j)t |X

(a(j)t )

t−1 ).

Page 6: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Example: Hidden Markov Model

I Let {Xt}t≥0 be a Markov process with transition densityp(x, x′) and initial density π(x).

I Suppose a noisy observation Yt with density g(y|x) is madeof each state Xt .

I SMC algorithms with

π0(x0) ∝ π(x0)g(y0|x0),

πt(xt |x1:(t−1)) ∝ p(xt−1, xt)g(yt |xt)

target P(X0:T ∈ dx0:T |Y0:T = y0:T ).

I E.g. the boostrap particle filter: µ0 = π,µt(xt |x1:(t−1)) = p(xt−1, xt), and wt ∝ g(yt |xt).

Page 7: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Example: Bootstrap Particle Filter (Gordon et al., 1993)

1: for i ∈ {1, . . . ,N} Sample X(i)0 ∼ π.

2: for i ∈ {1, . . . ,N} Set

w(i)0 ←

g(y0|X (i)0 )∑N

j=1 g(y0|X (j)0 )

.

3: for t ∈ {1, . . . ,T} do4: Sample (a

(1)t , . . . , a

(N)t ) ∼ Categorical(w

(1)t−1, . . . ,w

(N)t−1).

5: for i ∈ {1, . . . ,N} Sample X(i)t ∼ p(X

(a(i)t )

t−1 , ·).6: for i ∈ {1, . . . ,N} Set

w(i)t ←

g(yt |X (i)t )∑N

j=1 g(yt |X (j)t )

.

Page 8: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Path degeneracy

I Suppose T � 1, and f (X0:T ) depends on every time point.

I Mergers due to resampling mean that times t � T areestimated from m� N paths.

I ⇒ High variance estimators.

I Loss of paths also means that fewer than N × T particlesneed to be stored, reducing memory cost.

I Aim: a priori estimates of:

E[TMRCA], Var(TMRCA), P(TMRCA > t),

etc.

Page 9: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Reasons to Characterize Path Degeneracy include. . .

I Qualitative understanding of methods.

I Calibrating fixed-lag techniques, e.g. Doucet and Senecal(2004).

I Relationship with estimator variance (Chan et al., 2013; Leeand Whiteley, 2015).

I Understanding storage requirements (Jacob et al., 2015).

Page 10: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

The coalescent process (Kingman, 1982)

I Let {R(n)t }t≥0 be a partition-valued process.

I R(n)0 = {{1}, . . . , {n}}.

I Each pair of blocks {i}, {j} merge at rate 1.

I A “death” process of rate(k

2

)where k is the number of

blocks.

Example: n = 4

T1 ∼Exp(6) T2 ∼Exp(3) Exp(1)

T1 T2 T3

Page 11: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

The genealogical process

I It is convenient to reverse the direction of time. . .

I Let {G (n,N)t }t∈N0 be the

genealogy of n ≤ N particlessampled randomly from anN-particle SMC algorithm ofinterest.

I G(n,N)0 = {{1}, . . . , {n}}.

I i ∼ j in G(n,N)t ⇒ particles i and

j have a common ancestor tgenerations ago.

I G (2,7) illustrated.

Page 12: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Objective: Establish conditions under which

As N →∞:

T1 T2 T3

Page 13: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Rescaling time

I For i ∈ {1, . . . ,N} and t ∈ N, let ν(i)t be the number of

children of particle i , t generations ago.

I Define

cN(t) :=1

(N)2

N∑i=1

(ν(i)t )2 ≈ E[ESS(t)−1],

τN(t) := inf

{s ≥ 1 :

s∑r=1

cN(r) ≥ t

},

DN(t) :=1

N(N)2

N∑i=1

(ν(i)t )2

(i)t +

1

N

N∑j 6=i

(ν(j)t )2

).

Page 14: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Convergence theorem

Suppose that all assignments of offspring to parents are equallylikely, and that

limN→∞

E

[τN(t)∑

r=τN(s)+1

DN(r)

]= 0,

limN→∞

E[cN(t)] = 0,

limN→∞

E

[τN(t)∑

r=τN(s)+1

cN(r)2

]= 0,

E[τN(t)− τN(s)] ≤ Ct,sN.

Then (G(n,N)τN(t) )t≥0 converges to the Kingman coalescent in the

sense of finite dimensional distributions.

Page 15: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Proof outline

I Consider finite dimensional distributions.

I Apply straightforward, but intricate counting arguments,

I together with bounds on elementary transition probabilities,

I to upper and lower bound the elements of the FDDs.

I Compare these with those of the coalescent.

Page 16: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Sketch proof

I Let ξ and η be partitions of {1, . . . , n}, with the block countsof η in terms of the blocks of ξ being b1, . . . , b|η|,i.e. b1 + . . .+ b|η| = |ξ|.

I The conditional one-step transition probability of G(n,N)t given

family sizes is

pξη(t) :=1

(N)|ξ|

N∑i1 6=... 6=i|η|=1

(ν(i1)t )b1 . . . (ν

(i|η|)t )b|η| .

I FDDs:

P(G(n,N)τN(t1) = η1, . . . ,G

(n,N)τN(tk ) = ηk |G

(n,N)τN(t0) = η0)

= E

[k∏

d=1

{τN(td )∏

r=τN(td−1)+1

PN(r)

}ηd−1ηd

].

Page 17: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Sketch proof II

I For a single time interval{τN(td )∏

r=τN(td−1)+1

PN(r)

}ηd−1ηd

=∑ξ

τN(td )∏s=τN(td−1)+1

pξs−1ξs (s),

where ξ = (ηd−1, ξτN(td−1)+1, . . . , ξτN(td )−1, ηd).

I Each partition in ξ is either equal to its predecessor, orobtained from its predeceror by merging some subset(s) ofblocks.

Page 18: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Sketch proof III

pξξ(t) ≈ 1−(|ξ|2

)1

(N)2−(|ξ|2

)cN(t).

If η is formed by merging two blocks of ξ,

cN(t)−(|ξ| − 2

2

)DN(t) . pξη(t) . cN(t).

If η is formed by merging more than two blocks of ξ,

pξη(t) .

(|ξ| − 2

2

)DN(t).

Page 19: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Sketch proof IV

∑ξ

τN(td )∏s=τN(td−1)+1

pξs−1ξs (s) ≈|ηd−1|−|ηd |∑

α=1

∑(λ,µ)∈Π2([α])

C

×τN(td )∑

s1<...<sα=τN(td−1)+1

{∏r∈λ

cN(sr )

}{∏r∈µ

DN(sr )

},

DN(t) ≈ cN(t)

N,

τN(td )∑s<...<sα=τN(td−1)+1

α∏r=1

cN(sr ) ≈ (td − td−1)α

α!.

Page 20: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Sketch proof V

When ξ consists of binary mergers only, i.e. α = |ηd−1| − |ηd |,

∑ξ

τN(td )∏s=τN(td−1)+1

pξs−1ξs (s)

≈ C ′τN(td )∑

s1<...<sα=τN(td−1)+1

{α∏

r=1

cN(sr )

}τN(td )∏

r=τN(td−1)+1

{1− C ′′cN(r)}

≈τN(td )−τN(td−1)−α∑

β=0

C ′′′τN(td )∑

s1<...<sα+β=τN(td−1)+1

α+β∏r=1

cN(sr ).

Page 21: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Sketch proof VI

It turns out that the constant C ′′′ is exactly (Qα+β)ηd−1ηd , whereQ is the Kingman coalescent generator!

∑ξ

τN(td )∏s=τN(td−1)+1

pξs−1ξs (s)

≈τN(td )−τN(td−1)−α∑

β=0

C ′′′τN(td )∑

s1<...<sα+β=τN(td−1)+1

α+β∏r=1

cN(sr )

≈τN(td )−τN(td−1)−α∑

β=0

(Qα+β)ηd−1ηd

(td − td−1)α+β

(α + β)!

≈ (eQ(td−td−1))ηd−1ηd .

Page 22: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Corollary 1

The genealogy of n particles sampled uniformly at random from anN-particle bootstrap particle filter with multinomial resamplingconverges to a Kingman coalescent under the time-scaling τN(t)whenever

1

a≤ g(yt |xt) ≤ a,

εh(xt) ≤ p(xt−1, xt) ≤1

εh(xt),

for some 0 < ε ≤ 1 ≤ a <∞, and some probability density h(x)uniformly in time, space, and the observations.

Page 23: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Sketch proof

I Conditional on weights, the offspring counts have multinomial

distributions with parameters (N;w(1)t , . . . ,w

(N)t ).

I Upper and lower bounds on observation densities imply

ε2

a2N≤ w

(i)t ≤

a2

ε2N.

I The required upper and lower bounds follow from thesebounds, standard moment-calculations for multinomialdistributions, and the local dependence structure of particlefilters.

Page 24: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Corollary 2

Let Tn be the total Kingman coalescent tree height of n particles.Under the preceding assumptions,

2ε4N

a4

(1− 1

n

)≤ E[τN(Tn)] ≤ 2ε4N

a4

(1− 1

n

)+

a8

ε4,

N2ε8

a8

(4π2

3− 12 + O(n−1)

)≤Var(τN(Tn))

≤N2a8

ε8

(4π2

3− 12 + O(n−1)

)+ O(N).

Page 25: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

A numerical example

I Take the earlier HMM to be

Xt+1 = (1−∆)Xt +√

∆ξt

X0 ∼ N(0, 1),

Yt |Xt ∼ N(Xt , σ2),

where ξt ∼ N(0, 1).

I This model violates the assumed lower bounds.

I Nevertheless, simulations using a boostrap particle filter showthat the Kingman scalings hold, even when n = N.

Page 26: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Mean tree height

●● ● ● ●

0 20 40 60 80 100 120

0.0

0.1

0.2

0.3

0.4

N = 128

n

E[−

T/N

]

+

++ + + + +

x

xx

x x x x

*

** * * * *

o+x

*

multinomialresidualstratifiedsystematic

●●●●● ● ● ● ● ●

0 2000 4000 6000 8000

0.0

0.1

0.2

0.3

0.4

N = 8192

n

E[−

T/N

]

+

+

++++++ + + + + +

x

x

x

xxxxx x x x x x

*

*

*

***** * * * * *

o+x

*

multinomialresidualstratifiedsystematic

Page 27: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Mean tree height II

●●

● ●

● ● ●

0 2000 4000 6000 8000

0.0

0.1

0.2

0.3

0.4

n = 2

N

E[−

T/N

]

++

++

+ + +

xx x

x xx x

**

*

* * * *

o+x

*

multinomialresidualstratifiedsystematic

●●

0 2000 4000 6000 8000

0.0

0.1

0.2

0.3

0.4

n = 2048

N

E[−

T/N

]

++

+xx

x

* * *

o+x

*

multinomialresidualstratifiedsystematic

Page 28: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Tree height variance

● ● ● ● ●

0 20 40 60 80 100 120

0.00

0.02

0.04

0.06

0.08

0.10

N = 128

n

Var

(−T

/N)

+

++ +

+ + +

x

x

xx x x x

*

*

** * * *

o+x

*

multinomialresidualstratifiedsystematic

●●●●●●●● ● ● ● ● ●

0 2000 4000 6000 8000

0.00

0.02

0.04

0.06

0.08

0.10

N = 8192

n

Var

(−T

/N)

++++++++ + + + + +xxxxxxxx x x x x x

******** * * * * *

o+x

*

multinomialresidualstratifiedsystematic

Page 29: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Tree height variance II

●● ● ●

0 2000 4000 6000 8000

0.00

0.02

0.04

0.06

0.08

0.10

n = 2

N

Var

(−T

/N)

+

+

++ + + +

x

xx x

x

x x

*

** * * *

*

o+x

*

multinomialresidualstratifiedsystematic

●●

0 2000 4000 6000 8000

0.00

0.02

0.04

0.06

0.08

0.10

n = 2048

N

Var

(−T

/N)

++ +

x

x

x**

*

o+x

*

multinomialresidualstratifiedsystematic

Page 30: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Conclusions

I Genealogies of n� N particles from N-particle SMCalgorithms converge to the Kingman coalescent when time ismeasured in units of N, as N →∞.

I Strong technical assumptions (i.e. a compact state space)which do not seem necessary in practice.

I Predicted scalings observed in experiments for finite N, seemto hold even when n ≈ N.

I Result holds for multinomial resampling, but other schemesagree with predictions empirically.

I This result also demonstrates that the domain of attraction ofthe Kingman coalescent includes certain non-Markoviangenealogies.

Page 31: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Outlook

I Some areas in which genealogical results might be interesting:I Variance estimation from SMC output (Lee and Whiteley,

2015).I Smoothing and static parameter estimation.I Mixing in particle Gibbs/iterated cSMC.

I Room for improvement (selected topics. . . )I Relaxing strong assumptions.I Incorporating other resampling schemes.I Obtaining stronger modes of convergence.I (Formal analysis of n ≈ N.)I Incorporating conditional SMC.

Page 32: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

ReferencesH. P. Chan, T. L. Lai, A general theory of particle filters in hidden

Markov models and some applications. Annals of Statistics, 41(6):2877–2904, 2013.

A. Doucet and Stephane Senecal. Fixed-lag sequential MonteCarlo. In EURASIPCO, 2004.

N. J. Gordon, S. J. Salmond, and A. F. M. Smith. Novel approachto nonlinear/non-Gaussian Bayesian state estimation. IEEProceedings-F, 140(2):107–113, April 1993.

P. E. Jacob, L. Murray, and S. Rubenthaler. Path storage in theparticle filter. Statistics and Computing, 25(2):487–496, 2015.

J. F. C. Kingman. The coalescent. Stochastic Processes and theirApplications, 13(3):235–248, 1982.

J. Koskela, P. Jenkins, A. M. Johansen, and D. Spano. Asymptoticgenealogies of interacting particle systems with an application tosequential Monte Carlo. ArXiv mathematics e-print 1804.01811,ArXiv Mathematics e-prints, April 2018.

A. Lee and N. Whiteley. Variance estimation and allocation in theparticle filter. Technical Report 1509.00394, Arxiv, 2015.

Page 33: Asymptotic Genealogies of sequential Monte Carlo algorithms · Asymptotic Genealogies of sequential Monte Carlo algorithms Jere Koskela, Paul Jenkins1, Adam M. Johansen2, and Dario

Acknowledgments

I Paul Jenkins was supported in part by EPSRC grantEP/L018497/1.

I Adam Johansen was partially supported by funding from theLloyd’s Register Foundation — Alan Turing InstituteProgramme on Data-Centric Engineering.

I Jere Koskela was supported by EPSRC as part of theMASDOC DTC at the University of Warwick (GrantNo. EP/HO23364/1), and by DeutscheForschungsgemeinschaft (DFG) grant BL 1105/3-2.