Bayesian Inversion: Algorithms

Andrew M. Stuart¹
¹Mathematics Institute and Centre for Scientific Computing
University of Warwick

Woudshoten Lectures 2013, October 4th 2013
Work funded by EPSRC, ERC and ONR
http://homepages.warwick.ac.uk/∼masdr/

A.M. Stuart. Inverse problems: a Bayesian perspective. Acta Numerica 19 (2010). ∼masdr/BOOKCHAPTERS/stuart15c.pdf
M. Dashti, K.J.H. Law, A.M. Stuart and J. Voss. MAP estimators and posterior consistency.... Inverse Problems 29 (2013), 095017. arXiv:1303.4795.
F. Pinski, G. Simpson, A.M. Stuart and H. Weber. Kullback-Leibler approximation for probability measures on infinite dimensional spaces. In preparation.
S.L. Cotter, G.O. Roberts, A.M. Stuart and D. White. MCMC methods for functions.... Statistical Science 28 (2013). arXiv:1202.0709.
M. Hairer, A.M. Stuart and S. Vollmer. Spectral gaps for a Metropolis-Hastings algorithm.... arXiv:1112.1392.
Outline
1 SETTING AND ASSUMPTIONS
2 MAP ESTIMATORS
3 KULLBACK-LEIBLER APPROXIMATION
4 SAMPLING
5 CONCLUSIONS
The Setting
Probability measure µ on Hilbert space H. Reference measure µ0 (often a prior). µ is related to µ0 (often via Bayes' Theorem) by

    \frac{d\mu}{d\mu_0}(u) = \frac{1}{Z_\mu}\exp\bigl(-\Phi(u)\bigr).

Another way of saying the same thing:

    \mathbb{E}^{\mu} f(u) = \frac{1}{Z_\mu}\,\mathbb{E}^{\mu_0}\Bigl(\exp\bigl(-\Phi(u)\bigr)\,f(u)\Bigr).
How do we get information from µ if we know µ0 and Φ?
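The reweighting identity above already suggests one naive computational answer: draw samples from µ0 and weight them by exp(−Φ). A minimal finite-dimensional sketch (all names illustrative: H = R^d, a factor C0_sqrt with C0 = C0_sqrt · C0_sqrtᵀ, and a user-supplied potential phi), assuming nothing beyond the identity itself:

```python
import numpy as np

def posterior_expectation(f, phi, C0_sqrt, n_samples=100_000, seed=0):
    """Estimate E^mu[f] when dmu/dmu0(u) = exp(-Phi(u)) / Z_mu, mu0 = N(0, C0).

    Self-normalized importance sampling: Z_mu = E^{mu0}[exp(-Phi)] is
    estimated from the same Gaussian samples as the numerator.
    """
    rng = np.random.default_rng(seed)
    d = C0_sqrt.shape[0]
    u = rng.standard_normal((n_samples, d)) @ C0_sqrt.T   # u ~ N(0, C0)
    log_w = -np.array([phi(ui) for ui in u])              # log weights: -Phi(u)
    log_w -= log_w.max()                                  # stabilize exponentials
    w = np.exp(log_w)
    f_vals = np.array([f(ui) for ui in u])
    return np.sum(w * f_vals) / np.sum(w)                 # E^mu0[e^{-Phi} f] / Z_mu
```

Such direct reweighting degenerates when exp(−Φ) concentrates away from the bulk of µ0, which is one motivation for the MAP, Kullback-Leibler and MCMC approaches in the rest of the talk.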
The Talk In One Picture

[Figure: panels labelled Gaussian, MAP, and Samples.]
The Assumptions
µ0 = N(0, C0), a centred Gaussian measure on H; µ0(X) = 1, with X (Banach) continuously embedded in H.
Let E = D(C0^{-1/2}) (the Cameron-Martin space).
Then E ⊂ X ⊆ H, with E (Hilbert) compactly embedded in X. The function Φ ∈ C(X; R+).
For all u, v with ‖u‖_X ≤ r, ‖v‖_X ≤ r there are constants M_i(r) such that

    |\Phi(u)| \le M_1(r), \qquad |\Phi(u) - \Phi(v)| \le M_2(r)\,\|u - v\|_X.
Probability Maximizers and Tikhonov Regularization
Define the Tikhonov-regularized least-squares functional I : E → R+ by

    I(u) := \frac{1}{2}\bigl\|C_0^{-1/2} u\bigr\|^2 + \Phi(u).

Let Bδ(z) be a ball of radius δ in X centred at z ∈ E = D(C0^{-1/2}).

Theorem (Dashti, Law, Stuart and Voss, 2013). The probability measure µ and the functional I are related by

    \lim_{\delta \to 0} \frac{\mu\bigl(B_\delta(z_1)\bigr)}{\mu\bigl(B_\delta(z_2)\bigr)} = \exp\bigl(I(z_2) - I(z_1)\bigr).

Thus probability maximizers are minimizers of the Tikhonov-regularized functional I.
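Since probability maximizers are minimizers of I, the MAP point can be computed by numerical optimization of the discretized functional. A minimal sketch, assuming a finite-dimensional discretization with a prior precision matrix C0_inv and a gradient grad_phi of Φ (both illustrative names; in practice one would prefer a Newton-type method):

```python
import numpy as np

def map_estimate(grad_phi, C0_inv, d, step=1e-2, n_iter=5_000):
    """Gradient descent on I(u) = 0.5 * u @ C0_inv @ u + Phi(u)."""
    u = np.zeros(d)                              # start from the prior mean
    for _ in range(n_iter):
        u -= step * (C0_inv @ u + grad_phi(u))   # grad I(u) = C0^{-1} u + grad Phi(u)
    return u
```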
Existence of Probability Maximizers
The minimization is well-defined:
Theorem (Stuart, Acta Numerica, 2010). There exists ū ∈ E such that

    I(\bar u) = \bar I := \inf\{\, I(u) : u \in E \,\}.

Furthermore, if {u_n} is a minimizing sequence satisfying I(u_n) → Ī, then there is a subsequence {u_{n'}} that converges strongly to ū in E.
Example: Navier-Stokes Inversion for Initial Condition
[Figures: stream function in the (x1, x2) plane and spectral coefficients |u_k| in the (k1, k2) plane, shown for two fields.]
Incompressible Navier-Stokes equations on ΩT = T² × (0, ∞):

    \partial_t v - \nu\Delta v + v\cdot\nabla v + \nabla p = f \quad \text{in } \Omega_T,
    \nabla\cdot v = 0 \quad \text{in } \Omega_T,
    v|_{t=0} = u \quad \text{in } \mathbb{T}^2.

Observations: y_{j,k} = v(x_j, t_k) + η_{j,k}, with η_{j,k} ∼ N(0, σ²I_{2×2}); concatenated, y = G(u) + η, η ∼ N(0, σ²I).

Prior covariance C0 = (−Δ_stokes)^{−2}; potential Φ(u) = \frac{1}{10^3\sigma^2}\,|y - G(u)|^2.
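In code the potential is just this scaled data misfit. A minimal sketch, with G an illustrative stand-in for the Navier-Stokes observation operator and the misfit normalization exposed as a parameter (the 10³σ² factor is as read off the slide; the standard Gaussian likelihood would use 2σ²):

```python
import numpy as np

def make_potential(G, y, sigma, scale=1.0e3):
    """Return Phi(u) = |y - G(u)|^2 / (scale * sigma^2).

    G : callable mapping the unknown initial condition to the stacked
        velocity observations (stand-in for the PDE solve + observation).
    """
    def phi(u):
        r = y - G(u)                         # stacked observation misfit
        return float(r @ r) / (scale * sigma**2)
    return phi
```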
Example: Navier-Stokes Inversion for Initial Condition
[Figure: relative errors |u⋆ − u†|/|u†|, |G(u⋆) − G(u†)|/|G(u†)| and |G(u⋆) − y|/|y| against 1/σ on log-log axes. Here u⋆ is the MAP estimator and u† the truth.]
The Objective Functional
Recall µ0 = N(0, C0) and µ(du) ∝ exp(−Φ(u)) µ0(du). Let A denote a set of simple measures on H (usually Gaussian).

Problem. Find ν ∈ A that minimizes I(ν) := D_KL(ν‖µ).

Here D_KL is the Kullback-Leibler divergence (relative entropy):

    D_{KL}(\nu\|\mu) = \begin{cases} \displaystyle\int_H \frac{d\nu}{d\mu}(x)\,\log\Bigl(\frac{d\nu}{d\mu}(x)\Bigr)\,\mu(dx) & \text{if } \nu \ll \mu, \\ +\infty & \text{else.} \end{cases}

We note, for intuition, the inequality

    d_{Hell}(\nu, \mu)^2 \le 2\,D_{KL}(\nu\|\mu).
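Between two finite-dimensional Gaussians D_KL is available in closed form, which is part of what makes the Gaussian family computationally attractive. A minimal sketch of that special case (here µ itself is not Gaussian, so in practice D_KL(ν‖µ) must be estimated, e.g. by sampling; this is for intuition only):

```python
import numpy as np

def kl_gauss(m1, C1, m0, C0):
    """Closed-form D_KL( N(m1, C1) || N(m0, C0) ) on R^d."""
    d = len(m1)
    C0_inv = np.linalg.inv(C0)
    dm = m0 - m1
    _, logdet0 = np.linalg.slogdet(C0)
    _, logdet1 = np.linalg.slogdet(C1)
    return 0.5 * (np.trace(C0_inv @ C1)      # tr(C0^{-1} C1)
                  + dm @ C0_inv @ dm         # Mahalanobis term
                  - d + logdet0 - logdet1)   # log det ratio
```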
Existence of Minimizers
The minimization is well-defined:
Theorem (Pinski, Simpson, Stuart, Weber, 2013). If A is closed under weak convergence and there is ν ∈ A with I(ν) < ∞, then there exists ν̄ ∈ A such that

    I(\bar\nu) = \bar I := \inf\{\, I(\nu) : \nu \in A \,\}.

Furthermore, if {ν_n} is a minimizing sequence satisfying I(ν_n) → Ī, then there is a subsequence {ν_{n'}} that converges to ν̄ in the Hellinger metric:

    d_{Hell}(\nu_{n'}, \bar\nu) \to 0.
Example: A := G = Gaussian measures on H.
Parameterization of G
The Gaussian case is equivalent to ν = N(m, (C0^{-1} + Γ)^{-1}), with

    J(m, \Gamma) = \begin{cases} D_{KL}(\nu\|\mu) & \text{if } (m, \Gamma) \in E \times HS(E, E^*), \\ +\infty & \text{else.} \end{cases}

Theorem (Pinski, Simpson, Stuart, Weber, 2013). There exists (m̄, Γ̄) ∈ E × HS(E, E*) such that

    J(\bar m, \bar\Gamma) = \bar J := \inf\{\, J(m, \Gamma) : (m, \Gamma) \in E \times HS(E, E^*) \,\}.

Furthermore, if {(m_n, Γ_n)} is a minimizing sequence satisfying J(m_n, Γ_n) → J̄, then there is a subsequence {(m_{n'}, Γ_{n'})} such that

    \|m_{n'} - \bar m\|_E + \|\Gamma_{n'} - \bar\Gamma\|_{HS(E, E^*)} \to 0.
Cautionary Examples
[Figure: ϕ(nx) and nϕ(nx) against x ∈ [−1, 1], for n = 1, 4, 16.]
µ0 is the Brownian bridge on [−1, 1]:

    \nu_n := N\bigl(0, (C_0^{-1} + \Gamma_n)^{-1}\bigr).

Either:

    (\Gamma_n u)(x) = \varphi(nx)\,u(x), \quad \varphi \in C^\infty_{per} \text{ with mean } \bar\varphi,
    \nu_n \Rightarrow \nu := N\bigl(0, (C_0^{-1} + \bar\varphi\,\mathrm{Id})^{-1}\bigr).

Or:

    (\Gamma_n u)(x) = n\,\varphi(nx)\,u(x), \quad \varphi \in C^\infty_0,\ \|\varphi\|_{L^1} = 1,
    \nu_n \Rightarrow \nu := N\bigl(0, (C_0^{-1} + \delta_0\,\mathrm{Id})^{-1}\bigr).
Regularization of J I
Let (S, ‖·‖_S) be compact in L(E, E*). Define

    J_\delta(m, \Gamma) = \begin{cases} J(m, \Gamma) + \delta\|\Gamma\|_S^2 & \text{if } (m, \Gamma) \in E \times S, \\ +\infty & \text{else.} \end{cases}

Theorem (Pinski, Simpson, Stuart, Weber, 2013). There exists (m̄, Γ̄) ∈ E × S such that

    J_\delta(\bar m, \bar\Gamma) = \bar J_\delta := \inf\{\, J_\delta(m, \Gamma) : (m, \Gamma) \in E \times S \,\}.

Furthermore, if ν_n = ν(m_n, Γ_n) is a minimizing sequence satisfying J_δ(m_n, Γ_n) → J̄_δ, then there is a subsequence {(m_{n'}, Γ_{n'})} such that

    d_{Hell}(\nu_{n'}, \bar\nu) + \|\Gamma_{n'} - \bar\Gamma\|_S \to 0.
Regularization of J II
Let H = L²(Ω), Ω ⊂ R^d, and C0 = (−Δ)^{−α} with α > d/2. Choose (Γu)(x) = B(x)u(x) and S = H^r, r > 0. Thus ν = N(m, C) with C^{-1} = C0^{-1} + B(·) Id, for a potential B.

    J_\delta(m, B) = \begin{cases} J(m, B) + \delta\|B\|_{H^r}^2 & \text{if } (m, B) \in H^\alpha \times H^r, \\ +\infty & \text{else.} \end{cases}

Theorem (Pinski, Simpson, Stuart, Weber, 2013). There exists (m̄, B̄) ∈ H^α × H^r such that

    J_\delta(\bar m, \bar B) = \bar J_\delta := \inf\{\, J_\delta(m, B) : (m, B) \in H^\alpha \times H^r \,\}.
Example: Conditioned Diffusion in a Double Well
[Figures: the double-well potential V(x) = (x² − 1)² against x, and a sample path X_t against t.]
Consider the conditioned diffusion

    dX_t = -\nabla V(X_t)\,dt + \sqrt{2\varepsilon}\,dW_t, \qquad X_{-T} = x^- < 0, \quad X_{+T} = x^+ > 0.

For x^- = −x^+, by symmetry, we can study paths satisfying X_0 = 0, X_T = x^+. The path-space distribution is approximated by N(m(t), C), with

    C^{-1} = \frac{1}{2\varepsilon}\Bigl(-\frac{d^2}{dt^2} + B\,I\Bigr).

B is either a constant or B = B(t); m and B are obtained by minimizing D_KL.
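A minimal sketch of this Gaussian path-space approximation after discretization (assumptions of the sketch, not of the slide: a uniform grid of n interior points on (0, T) with the path pinned at both ends, so −d²/dt² becomes the standard second-difference matrix):

```python
import numpy as np

def path_precision(B, T, n, eps):
    """Discretize C^{-1} = (1/2 eps)(-d^2/dt^2 + B*I) on n interior points
    of (0, T); B is a scalar or a length-n array B(t_i)."""
    dt = T / (n + 1)
    diag = 2.0 / dt**2 + B * np.ones(n)          # -d^2/dt^2 diagonal plus B
    off = -np.ones(n - 1) / dt**2                # Laplacian off-diagonals
    P = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return P / (2.0 * eps)

def sample_path(m, P, rng):
    """Draw from N(m, P^{-1}) using a Cholesky factor of the precision P."""
    L = np.linalg.cholesky(P)
    return m + np.linalg.solve(L.T, rng.standard_normal(len(m)))
```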
Stochastic Root Finding & Optimization
Robbins-Monro with Derivatives and Iterate Averaging
Functions Estimated Via Sampling

Assume f(x) (the target function) can be estimated via F(y; x) as

    f(x) = \mathbb{E}_Y[F(Y; x)], \qquad f'(x) = \mathbb{E}_Y[\partial_x F(Y; x)].

Iteration Scheme (see, for instance, Asmussen & Glynn)

    x_{n+1} = x_n - a_n\Bigl(\frac{1}{M}\sum_{i=1}^{M}\partial_x F(Y_i; x_n)\Bigr)^{-1}\Bigl(\frac{1}{M}\sum_{i=1}^{M} F(Y_i; x_n)\Bigr),

with a_n ∼ n^{−γ}, γ ∈ (1/2, 1), and the Y_i i.i.d. Also let the iterate average be x̄_n ≡ (1/n)∑_{j=1}^{n} x_j.

Then x̄_n → x⋆, with f(x⋆) = 0, in distribution at rate n^{−1/2}.
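A one-dimensional sketch of this scheme (the sampler sample_F, returning paired samples of F and ∂ₓF, is an assumed interface; all names illustrative):

```python
import numpy as np

def robbins_monro(sample_F, x0, n_iter=10_000, M=1_000, gamma=0.75, seed=0):
    """Stochastic root finding for f(x) = E[F(Y; x)], with iterate averaging.

    sample_F : callable(rng, x, M) -> (F_vals, dF_vals), two length-M arrays
               of i.i.d. samples of F(Y; x) and dF/dx(Y; x).
    """
    rng = np.random.default_rng(seed)
    x, x_sum = x0, 0.0
    for n in range(1, n_iter + 1):
        F_vals, dF_vals = sample_F(rng, x, M)
        a_n = n ** (-gamma)                            # step size a_n ~ n^{-gamma}
        x -= a_n * F_vals.mean() / dF_vals.mean()      # Newton-like RM update
        x_sum += x
    return x_sum / n_iter                              # Polyak-Ruppert average
```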
Numerical Results With Constant Potential B
T = 5, ε = 0.15; 10⁴ iterations, 10³ samples per iteration, 10² points in (0, T) per sample
[Figures: the mean m_n(t) over the iterations n, and the constant potential B_n against n.]
Numerical Results With Variable Potential B
T = 5, ε = 0.15, δ = (1/2) × 10⁻⁴, r = 1 (so H¹ regularization); 10⁴ iterations, 10⁴ samples per iteration, 10² points in (0, T) per sample
[Figures: the mean m_n(t) and the variable potential B_n(t) over the iterations n.]
Model Comparison
95% Confidence Intervals about the Mean Path

[Figures: X_t against t with 95% confidence bands about the mean path, for constant B (left) and variable B (right).]
MCMC
MCMC: create an ergodic Markov chain u^{(k)} which is invariant for the approximate target µ (or µ^N, the approximation on R^N) so that

    \frac{1}{K}\sum_{k=1}^{K} f\bigl(u^{(k)}\bigr) \to \mathbb{E}^{\mu} f.

Recall µ0 = N(0, C0) and µ(du) ∝ exp(−Φ(u)) µ0(du).

Recall the Tikhonov functional I(u) = \frac{1}{2}\|C_0^{-1/2}u\|^2 + \Phi(u).
Standard Random Walk Algorithm
Metropolis, Rosenbluth, Rosenbluth, Teller and Teller, J. Chem. Phys. 1953.

Set k = 0 and pick u^{(0)}.
Propose v^{(k)} = u^{(k)} + β ξ^{(k)}, ξ^{(k)} ∼ N(0, C0).
Set u^{(k+1)} = v^{(k)} with probability a(u^{(k)}, v^{(k)}).
Set u^{(k+1)} = u^{(k)} otherwise.
k → k + 1.

Here a(u, v) = min{1, exp(I(u) − I(v))}.
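A minimal finite-dimensional sketch of this chain (illustrative names: a covariance factor C0_sqrt with C0 = C0_sqrt · C0_sqrtᵀ, and the Tikhonov functional I supplied as a callable):

```python
import numpy as np

def srwmh(I, u0, C0_sqrt, beta, n_steps, seed=0):
    """Standard random-walk Metropolis for the target exp(-I(u)) on R^d."""
    rng = np.random.default_rng(seed)
    u, Iu = u0, I(u0)
    chain = [u0]
    for _ in range(n_steps):
        xi = C0_sqrt @ rng.standard_normal(len(u))   # xi ~ N(0, C0)
        v = u + beta * xi                            # random-walk proposal
        Iv = I(v)
        if np.log(rng.uniform()) < Iu - Iv:          # accept w.p. min{1, e^{I(u)-I(v)}}
            u, Iu = v, Iv
        chain.append(u)
    return np.array(chain)
```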
New Random Walk Algorithm
Cotter, Roberts, Stuart and White, Stat. Sci. 2013.

Set k = 0 and pick u^{(0)}.
Propose v^{(k)} = √(1 − β²) u^{(k)} + β ξ^{(k)}, ξ^{(k)} ∼ N(0, C0).
Set u^{(k+1)} = v^{(k)} with probability a(u^{(k)}, v^{(k)}).
Set u^{(k+1)} = u^{(k)} otherwise.
k → k + 1.

Here a(u, v) = min{1, exp(Φ(u) − Φ(v))}.
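Only the proposal and the acceptance ratio change relative to the standard walk: the proposal preserves µ0 = N(0, C0) exactly, so the Gaussian part of the target cancels and only Φ appears. A sketch under the same illustrative assumptions (this is the pCN, preconditioned Crank-Nicolson, proposal of the Cotter et al. paper):

```python
import numpy as np

def pcn(phi, u0, C0_sqrt, beta, n_steps, seed=0):
    """pCN random walk: prior-reversible proposal, acceptance uses Phi only."""
    rng = np.random.default_rng(seed)
    u, pu = u0, phi(u0)
    chain = [u0]
    for _ in range(n_steps):
        xi = C0_sqrt @ rng.standard_normal(len(u))     # xi ~ N(0, C0)
        v = np.sqrt(1.0 - beta**2) * u + beta * xi     # pCN proposal
        pv = phi(v)
        if np.log(rng.uniform()) < pu - pv:            # min{1, e^{Phi(u)-Phi(v)}}
            u, pu = v, pv
        chain.append(u)
    return np.array(chain)
```

This prior-reversibility is the structural reason for the dimension-independent spectral gap quoted below.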
Example: Navier-Stokes Inversion for Forcing
[Figure: average acceptance probability against β for SRWMH (left) and RWMH (right), at mesh sizes Δx = 0.100, 0.050, 0.020, 0.010, 0.005, 0.002.]
Incompressible Navier-Stokes equations on ΩT = T² × (0, ∞):

    \partial_t v - \nu\Delta v + v\cdot\nabla v + \nabla p = u \quad \text{in } \Omega_T,
    \nabla\cdot v = 0 \quad \text{in } \Omega_T,
    v|_{t=0} = v_0 \quad \text{in } \mathbb{T}^2.

Observations: y_{j,k} = v(x_j, t_k) + ξ_{j,k}, with ξ_{j,k} ∼ N(0, σ²I_{2×2}); concatenated, y = G(u) + ξ, ξ ∼ N(0, σ²I).

Prior: OU process; Φ(u) = \frac{1}{\sigma^2}\,|y - G(u)|^2.
Spectral Gaps
Theorem (Hairer, Stuart, Vollmer, arXiv 2012).

For the standard random walk algorithm the spectral gap is bounded above by C N^{−1/2}.
For the new random walk algorithm the spectral gap is bounded below independently of dimension.
What We Have Shown
We have shown that:
Common Structure: a range of problems require extracting information from a probability measure on a Hilbert space, having density with respect to a Gaussian.
Algorithmic Approaches: we have laid the foundations of a range of computational methods related to this task.
MAP Estimators: maximum a posteriori estimators can be defined on Hilbert space; there is a link to Tikhonov regularization.
Kullback-Leibler Approximation: Kullback-Leibler approximation can be defined on Hilbert space, and finding the closest Gaussian results in a well-defined problem in the calculus of variations.
Sampling: MCMC methods can be defined on Hilbert space. This results in new algorithms that are robust to discretization.