Bayesian Inference for Inverse Problems Using Surrogate Model
Dongbin Xiu
Department of Mathematics, Purdue University
Supported by AFOSR, DOE, NNSA, NSF
Overview
• Uncertainty quantification (UQ) and stochastic modeling
• Forward problem: uncertainty propagation
  – Brief introduction of generalized polynomial chaos (gPC)
  – Key issues: efficiency, curse of dimensionality
• Inverse problem: Bayesian inference
  – Use generalized polynomial chaos (gPC)
• Back to the forward problem
  – Issues and challenges
Illustrative Example: Burgers' Equation
• Burgers' equation:
  ∂u/∂t + u ∂u/∂x = ν ∂²u/∂x²,   u(−1) = 1, u(1) = −1
• Steady-state solution:
  u(x) = −A tanh[ (A/2ν)(x − z) ],   where u(z) = 0 and A = −∂u/∂x |_{x=z}
[Figure: perturbed vs. unperturbed steady-state solution and mesh distribution; u(−1) = 1 vs. u(−1) = 1.01, ν = 0.05]
Effects of Uncertainty – "Supersensitivity"
• Burgers' equation:
  ∂u/∂t + u ∂u/∂x = ν ∂²u/∂x²,   x ∈ [−1, 1]
• Boundary conditions:
  u(−1) = 1 + δ,   u(1) = −1,   δ ~ U(0, 0.1)
(Xiu & Karniadakis, Int. J. Numer. Meth. Eng., vol. 61, 2004)
[Figure: solution envelope showing lower bound, upper bound, mean, and standard deviation]
(Re-)Formulation of PDE: Input Parameterization
  ∂u/∂t (t, x) = L(u) + boundary/initial conditions
• Goal: characterize the random inputs by a set of random variables
  – Finite number
  – Mutual independence
• If the inputs are parameters:
  – Identify the (smallest) independent set
  – Prescribe probability distributions
• Else if the inputs are fields/processes:
  – Approximate the field by a function of a finite number of RVs
  – Well studied for Gaussian processes; under-developed for non-Gaussian processes
  – Examples: Karhunen-Loeve expansion, spectral decomposition, etc.
  a(x, ω) ≈ μ_a(x) + Σ_{i=1}^{d} a_i(x) Z_i(ω)
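As a concrete sketch of the last bullet, a truncated Karhunen-Loeve expansion can be built from the eigendecomposition of a discretized covariance matrix. The zero mean, exponential covariance kernel, correlation length 0.5, and grid below are illustrative choices, not from the talk:

```python
import numpy as np

def kl_expansion(x, mean, cov_fn, d):
    """Truncated Karhunen-Loeve expansion on a grid x:
    a(x, w) ~ mean(x) + sum_{i=1}^d sqrt(lam_i) phi_i(x) Z_i(w).
    Simple matrix version: eigendecompose the discretized covariance
    (quadrature weights omitted, so eigenvalues scale with the grid)."""
    C = cov_fn(x[:, None], x[None, :])        # covariance matrix C_ij = C(x_i, x_j)
    lam, phi = np.linalg.eigh(C)              # eigenvalues in ascending order
    lam = lam[::-1][:d]                       # keep the d largest modes
    phi = phi[:, ::-1][:, :d]
    sample = lambda Z: mean(x) + phi @ (np.sqrt(np.maximum(lam, 0.0)) * Z)
    return lam, sample

# Illustrative: zero-mean process, exponential covariance, correlation length 0.5
x = np.linspace(0.0, 1.0, 200)
lam, sample = kl_expansion(x, lambda t: np.zeros_like(t),
                           lambda s, t: np.exp(-np.abs(s - t) / 0.5), d=10)
a = sample(np.random.default_rng(0).standard_normal(10))  # one realization with d = 10 RVs
```

The rapid decay of `lam` is what justifies representing the field with a small, finite set of independent random variables.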
The Reformulation
• Uncertain inputs are characterized by n_Z random variables Z
• Probability distribution of Z is prescribed (a non-trivial task):
  F_Z(s) = Pr(Z ≤ s),   s ∈ R^{n_Z}
• Stochastic PDE:
  ∂u/∂t (t, x, Z) = L(u) + boundary/initial conditions
• Solution:
  u(t, x, Z): [0, T] × D × R^{n_Z} → R
Generalized Polynomial Chaos (gPC)
• Focus on the dependence on Z:  u(·, Z): R^{n_Z} → R
• Nth-order gPC expansion:
  u_N(t, x, Z) = Σ_{k=0}^{N} u_k(t, x) Φ_k(Z)
• Orthogonal basis:
  E[Φ_i(Z) Φ_j(Z)] = ∫ Φ_i(z) Φ_j(z) ρ(z) dz = δ_ij
• u_N(Z) ∈ P_N = {space of n_Z-variate polynomials of degree up to N},
  dim P_N = C(n_Z + N, N)
• Multi-index notation: i = (i_1, …, i_{n_Z}),  |i| = i_1 + ⋯ + i_{n_Z}
• Expectation: E[g(Z)] = ∫_R g(z) ρ(z) dz
• The distribution of Z (Gaussian, Gamma, Beta, …) determines the basis
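The binomial dimension formula above is the source of the curse of dimensionality; a few lines (illustrative, not from the talk) make the growth concrete:

```python
from math import comb

def gpc_dim(nz, N):
    """dim P_N = C(nz + N, N): number of nz-variate polynomials of total degree <= N."""
    return comb(nz + N, N)

# Basis size at fixed order N = 3 as the number of random variables grows:
# 1 -> 4, 2 -> 10, 10 -> 286, 50 -> 23426
sizes = {nz: gpc_dim(nz, 3) for nz in (1, 2, 5, 10, 20, 50)}
```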
gPC: Basis
• Orthogonality:
  ∫ Φ_i(z) Φ_j(z) ρ(z) dz = E[Φ_i(Z) Φ_j(Z)] = δ_ij
• Hermite polynomials:  ∫_{−∞}^{∞} Φ_i(z) Φ_j(z) e^{−z²} dz = δ_ij
• Laguerre polynomials:  ∫_{0}^{∞} Φ_i(z) Φ_j(z) e^{−z} dz = δ_ij
• Legendre polynomials:  ∫_{−1}^{1} Φ_i(z) Φ_j(z) dz = δ_ij
(Xiu & Karniadakis, SISC, 2002)
• Example: approximating a uniform random variable
  – Convergence, but non-optimal
  – First-order Legendre is exact
gPC Basis: the Choices
• Orthogonality:
  ∫ Φ_i(z) Φ_j(z) ρ(z) dz = E[Φ_i(Z) Φ_j(Z)] = δ_ij
• Example: Hermite polynomials,
  ∫_{−∞}^{∞} Φ_i(z) Φ_j(z) e^{−z²} dz = δ_ij
• The polynomials, for Z ~ N(0,1):
  Φ_0 = 1,  Φ_1 = Z,  Φ_2 = Z² − 1,  Φ_3 = Z³ − 3Z,  …
• Approximation of an arbitrary random variable requires L² integrability
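The listed polynomials are the probabilists' Hermite polynomials He_n. A quick quadrature check (a sketch using NumPy's `hermite_e` module, not from the talk) confirms they are orthogonal under the N(0,1) density, with norms E[He_n²] = n!, so the δ_ij form holds for the normalized basis He_n/√(n!):

```python
import numpy as np
from math import sqrt, pi
from numpy.polynomial import hermite_e as He

# Gauss quadrature for the weight exp(-z^2/2); dividing the weights by
# sqrt(2*pi) turns quadrature sums into expectations under Z ~ N(0,1).
nodes, weights = He.hermegauss(20)
weights = weights / sqrt(2.0 * pi)

def E_He_He(i, j):
    """E[He_i(Z) He_j(Z)] for Z ~ N(0,1), by quadrature."""
    ci = np.eye(i + 1)[i]                    # coefficient vector selecting He_i
    cj = np.eye(j + 1)[j]
    return float(np.sum(weights * He.hermeval(nodes, ci) * He.hermeval(nodes, cj)))

gram = np.array([[E_He_He(i, j) for j in range(4)] for i in range(4)])
# gram is diag(0!, 1!, 2!, 3!) = diag(1, 1, 2, 6) up to quadrature roundoff
```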
Stochastic Galerkin
• Galerkin method: seek
  u_N(t, x, Z) = Σ_{k=0}^{N} u_k(t, x) Φ_k(Z)
such that
  E[ (∂u_N/∂t)(t, x, Z) Φ_m(Z) ] = E[ L(u_N) Φ_m(Z) ],   ∀ m ≤ N
• The result:
  – The residual is orthogonal to the gPC space
  – A set of deterministic equations for the coefficients
  – The equations are usually coupled: a new solver is required
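A minimal intrusive example (hypothetical, not from the talk): Galerkin projection of the scalar ODE du/dt = −k(Z)u with u(0) = 1, k(Z) = 1 + 0.5Z, Z ~ U(−1,1), in a normalized Legendre basis. The projected coefficient equations are coupled through the matrix A, which is exactly the "new solver" point above:

```python
import numpy as np
from numpy.polynomial import legendre as L

N = 8                                        # gPC order
nodes, w = L.leggauss(20)
w = w / 2.0                                  # quadrature for E[.] with rho(z) = 1/2

def phi(k, z):
    """Normalized Legendre: Phi_k = sqrt(2k+1) P_k, so E[Phi_i Phi_j] = delta_ij."""
    return np.sqrt(2 * k + 1) * L.legval(z, np.eye(k + 1)[k])

P = np.array([phi(k, nodes) for k in range(N + 1)])   # basis values at the nodes
k_vals = 1.0 + 0.5 * nodes                            # k(Z) at the nodes
A = (P * (k_vals * w)) @ P.T                          # A[m,k] = E[k(Z) Phi_k Phi_m]

# Galerkin system: du_m/dt = -sum_k A[m,k] u_k  -- coupled deterministic ODEs
u = np.zeros(N + 1)
u[0] = 1.0                                   # u(0) = 1: only the mean mode is nonzero
f = lambda v: -A @ v
dt, T = 1e-3, 1.0
for _ in range(int(round(T / dt))):          # RK4 time stepping
    k1 = f(u); k2 = f(u + 0.5*dt*k1); k3 = f(u + 0.5*dt*k2); k4 = f(u + dt*k3)
    u += (dt / 6.0) * (k1 + 2*k2 + 2*k3 + k4)

mean = u[0]                                  # E[u(T)] is the zeroth coefficient
std = np.sqrt(np.sum(u[1:]**2))              # orthonormal basis: Parseval
```

Note the statistics fall out of the coefficients directly, one of the "post-processing" payoffs mentioned in the summary.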
Stochastic Collocation
• Sampling (solution statistics only):
  – Random (Monte Carlo)
  – Deterministic (lattice rules, tensor grids, cubature)
• Collocation: satisfy the governing equations at selected nodes
  – Allows one to reuse existing deterministic codes repetitively
• Stochastic collocation: construct polynomial approximations
  – Node selection is critical to efficiency and accuracy
  – More than sampling
• Let Z¹, Z², …, Z^Q ∈ R^{n_Z} be a set of nodes/samples; then solve
  ∂u/∂t (t, x, Z_j) = L(u),   j = 1, …, Q
• Tensor grids: inefficient; sparse grids: more efficient
(Xiu & Hesthaven, SIAM J. Sci. Comput., 2005)
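A toy non-intrusive example (hypothetical, not from the talk): for du/dt = −k(Z)u, u(0) = 1, k(Z) = 1 + 0.5Z, Z ~ U(−1,1), run an unmodified deterministic solver once per Gauss-Legendre node and get statistics by quadrature:

```python
import numpy as np

def deterministic_solve(k, T=1.0):
    """Stand-in for an existing deterministic code: du/dt = -k u, u(0) = 1,
    whose exact solution at time T is exp(-k*T)."""
    return np.exp(-k * T)

# Q = 10 Gauss-Legendre nodes for Z ~ U(-1, 1); weights halved for rho(z) = 1/2
nodes, w = np.polynomial.legendre.leggauss(10)
w = w / 2.0

# One deterministic run per node -- the solver itself is untouched
ensemble = np.array([deterministic_solve(1.0 + 0.5 * z) for z in nodes])
mean = np.sum(w * ensemble)
std = np.sqrt(np.sum(w * ensemble**2) - mean**2)
```

This is the "reuse existing codes" point: the only interaction with the solver is choosing its inputs and collecting its outputs.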
Stochastic Collocation: Interpolation
• Definition: given a set of nodes and the solution ensemble, find u_N in a proper polynomial space such that u_N ≈ u in a proper sense.
• Interpolation approach:
  u_Q(Z) = Σ_{j=1}^{Q} u(Z_j) L_j(Z),   L_i(Z_j) = δ_ij,  1 ≤ i, j ≤ Q
• Optimal nodal distribution in high dimensions…
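The node-placement issue is visible already in one dimension (an illustrative sketch, not from the talk): Lagrange interpolation of Runge's function diverges on equispaced nodes but behaves on Chebyshev nodes.

```python
import numpy as np

def lagrange_interp(zj, uj, z):
    """Evaluate u_Q(z) = sum_j u(Z_j) L_j(z), where L_i(Z_j) = delta_ij."""
    out = np.zeros_like(z, dtype=float)
    for j in range(len(zj)):
        L = np.ones_like(z)
        for i in range(len(zj)):
            if i != j:
                L *= (z - zj[i]) / (zj[j] - zj[i])
        out += uj[j] * L
    return out

g = lambda z: 1.0 / (1.0 + 25.0 * z**2)            # Runge's function
z = np.linspace(-1.0, 1.0, 1001)

zj_equi = np.linspace(-1.0, 1.0, 15)                     # 15 equispaced nodes
zj_cheb = np.cos(np.pi * (2 * np.arange(15) + 1) / 30)   # 15 Chebyshev nodes

err_equi = np.max(np.abs(lagrange_interp(zj_equi, g(zj_equi), z) - g(z)))
err_cheb = np.max(np.abs(lagrange_interp(zj_cheb, g(zj_cheb), z) - g(z)))
# Equispaced nodes oscillate wildly near the endpoints; Chebyshev nodes do not
```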
Stochastic Collocation: Discrete Projection
• Orthogonal projection:
  u_N(t, x, Z) = P_N u = Σ_{k=0}^{N} u_k(t, x) Φ_k(Z),
  u_k = E[u(Z) Φ_k(Z)] = ∫ u(z) Φ_k(z) ρ(z) dz
• Discrete projection:
  w_N(t, x, Z) = Σ_{k=0}^{N} w_k(t, x) Φ_k(Z),
  w_k = Σ_{j=1}^{Q} u(t, x, Z_j) Φ_k(Z_j) α_j ≈ ∫ u(z) Φ_k(z) ρ(z) dz,
  w_k(t, x) → u_k(t, x) as Q → ∞
• Aliasing error:
  ε_Q = ‖u_N − w_N‖_{L²_ρ}
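A one-dimensional sketch of the discrete projection (hypothetical target u(Z) = exp(Z), Z ~ U(−1,1), not from the talk): quadrature recovers the gPC coefficients from node evaluations alone.

```python
import numpy as np
from numpy.polynomial import legendre as L

N, Q = 6, 20
nodes, alpha = L.leggauss(Q)                 # nodes Z_j and weights alpha_j
alpha = alpha / 2.0                          # rho(z) = 1/2 on [-1, 1]

def phi(k, z):
    """Normalized Legendre basis Phi_k = sqrt(2k+1) P_k."""
    return np.sqrt(2 * k + 1) * L.legval(z, np.eye(k + 1)[k])

u_nodes = np.exp(nodes)                      # "solution ensemble" at the nodes
w_coef = np.array([np.sum(u_nodes * phi(k, nodes) * alpha) for k in range(N + 1)])

# Reconstruct w_N and measure how well it approximates u(z) = exp(z)
z = np.linspace(-1.0, 1.0, 201)
w_N = sum(w_coef[k] * phi(k, z) for k in range(N + 1))
err = np.max(np.abs(w_N - np.exp(z)))
```

With Q large relative to N, the aliasing error is negligible and w_k is effectively the exact projection coefficient u_k.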
Inverse Parameter Estimation
• Solution of a stochastic system:
• Error:
• Need: probability distribution of the parameters Z
• Prior distribution:
  – Information on the prior distribution is critical: it requires direct measurements of the parameters
  – No/not enough direct measurements? (Use experience/intuition …)
  – How to take advantage of measurements of other variables?
Bayesian Inference for Parameter Estimation
• Setup:
  – Prior distribution of the parameters Z: π(z)
  – Forward problem simulation: y = G(Z)
  – Measurement/data: d
• Goal: estimate the distribution of Z, i.e., the posterior distribution
• Posterior distribution:
  π(Z | d) ∝ π(d | Z) π(Z),
  where π(d | Z) is the likelihood function
• Notes:
  – Difficult to manipulate
  – Classical sampling approaches can be time-consuming (MCMC, etc.)
Surrogate-based Bayesian Estimation
• Surrogate: y_N = G_N(Z)
• Approximate Bayes rule:
  π_N(Z | d) ∝ π_N(d | Z) π(Z),
  where π_N(d | Z) = Π_{i=1}^{n_d} π_{e_i}(d_i − G_{N,i}(Z))
• Properties:
  – Allows direct sampling with arbitrarily large samples
  – No additional simulations: the forward problem solver only
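A one-parameter sketch (hypothetical: G(z) = exp(z), uniform prior on [−1,1], Gaussian noise; none of these choices are from the talk): once the gPC surrogate is built, evaluating the posterior costs only polynomial evaluations.

```python
import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(1)
G = np.exp                                   # forward model (illustrative)
sigma, z_true = 0.05, 0.3
d = G(z_true) + sigma * rng.standard_normal()   # one noisy measurement

# Build the surrogate G_N by discrete projection: u_k = (2k+1) E[G(Z) P_k(Z)]
N = 6
nodes, alpha = L.leggauss(20)
alpha = alpha / 2.0
coef = np.array([(2 * k + 1) *
                 np.sum(G(nodes) * L.legval(nodes, np.eye(k + 1)[k]) * alpha)
                 for k in range(N + 1)])
G_N = lambda z: L.legval(z, coef)

# Approximate posterior pi_N(z|d) ~ pi_e(d - G_N(z)) * pi(z) on a dense grid:
# no new forward solves are needed, however finely we resolve it
z = np.linspace(-1.0, 1.0, 2001)
dz = z[1] - z[0]
post = np.exp(-(d - G_N(z))**2 / (2.0 * sigma**2)) * 0.5  # likelihood * uniform prior
post = post / (np.sum(post) * dz)                         # normalize
z_map = z[np.argmax(post)]
```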
Convergence Analysis
• Kullback-Leibler divergence: D(π ‖ π̃) = ∫ π(z) log(π(z)/π̃(z)) dz
• Basic assumption: the observation error is i.i.d. Gaussian

Theorem. If the gPC expansion G_N converges to G in L²_{π_Z}, then the posterior density π_N^d converges to π^d in the sense
  D(π_N^d ‖ π^d) → 0,   N → ∞.
Moreover, if
  ‖G_i(Z) − G_{N,i}(Z)‖_{L²_{π_Z}} ≤ C N^{−α},   1 ≤ i ≤ n_d, α > 0, C independent of N,
then for sufficiently large N,
  D(π_N^d ‖ π^d) ≲ N^{−α}.

• Notes: the fast (exponential) convergence rate is retained; so is slow convergence
(Marzouk & Xiu, Comm. Comput. Phys., vol. 6, 2009)
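The theorem can be checked numerically on a toy problem (hypothetical: G(z) = exp(z), Z ~ U(−1,1), Gaussian noise, uniform prior; illustrative choices, not from the paper): the KL divergence between the surrogate posterior and the exact posterior drops rapidly with N.

```python
import numpy as np
from numpy.polynomial import legendre as L

sigma, d = 0.05, float(np.exp(0.3))          # data taken at the noise-free value
z = np.linspace(-1.0, 1.0, 4001)
dz = z[1] - z[0]

def posterior(Gz):
    """Grid posterior for Gaussian likelihood N(d - G(z); 0, sigma^2), uniform prior."""
    p = np.exp(-(d - Gz)**2 / (2.0 * sigma**2))
    return p / (np.sum(p) * dz)

pi_exact = posterior(np.exp(z))
nodes, alpha = L.leggauss(40)
alpha = alpha / 2.0

kl = {}
for N in (2, 4, 8):
    # Surrogate G_N by discrete projection onto Legendre polynomials
    coef = np.array([(2 * k + 1) *
                     np.sum(np.exp(nodes) * L.legval(nodes, np.eye(k + 1)[k]) * alpha)
                     for k in range(N + 1)])
    pi_N = posterior(L.legval(z, coef))
    kl[N] = float(np.sum(pi_exact * np.log(pi_exact / pi_N)) * dz)
# kl[2] >> kl[4] >> kl[8]: the surrogate's convergence rate carries over
```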
Parameter Estimation: Supersensitivity Example
• Burgers' equation:
  ∂u/∂t + u ∂u/∂x = ν ∂²u/∂x²,   x ∈ [−1, 1]
• Boundary conditions:
  u(−1) = 1 + δ(Z),   u(1) = −1,   0 < δ ≪ 1
• Data: noisy observation of the transition layer location (contrived numerically)
• Measurement noise: e ~ N(0, 0.05²)
• Prior distribution: uniform, δ ~ U(0, 0.1)
Parameter Estimation: Step Function
• Assume the forward model is a step function
• The posterior distribution is discontinuous
• Gibbs oscillations appear
• Slow convergence with global gPC basis functions
[Figure, left: exact forward solution G(z) and its gPC approximation G_N(z); right: exact posterior density π^d(z) and its gPC approximation]
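The Gibbs effect is easy to reproduce (an illustrative sketch, not the talk's forward model): the Legendre coefficients of the step G(z) = 1_{z>0} are c_0 = 1/2 and c_k = (P_{k−1}(0) − P_{k+1}(0))/2 for k ≥ 1, and the truncated expansion overshoots the jump by a roughly constant amount no matter how large N is.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def P_at_0(k):
    """P_k(0)."""
    return legval(0.0, np.eye(k + 1)[k])

def step_coeffs(N):
    """Exact Legendre coefficients of G(z) = 1_{z > 0} on [-1, 1]:
    c_0 = 1/2, c_k = (P_{k-1}(0) - P_{k+1}(0)) / 2 for k >= 1
    (from integrating (2k+1) P_k = P'_{k+1} - P'_{k-1} over [0, 1])."""
    c = np.zeros(N + 1)
    c[0] = 0.5
    for k in range(1, N + 1):
        c[k] = (P_at_0(k - 1) - P_at_0(k + 1)) / 2.0
    return c

z = np.linspace(-1.0, 1.0, 4001)
# Overshoot above the upper state G = 1 for increasing truncation order
overshoot = {N: float(legval(z, step_coeffs(N)).max() - 1.0) for N in (9, 19, 39)}
# The oscillation narrows with N but its amplitude does not decay:
# slow, non-uniform convergence of a global basis at a discontinuity
```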
Biological Application
• Example: estimate kinetic parameters in a genetic toggle switch
  – Differential-algebraic equation model from [Gardner et al., Nature, 2000]
  – Real experimental data: steady-state expression levels of one gene (v)
• 6 uncertain parameters, assumed to be independent and uniform in the forward problem
• Good agreement with measurement, but let's now use the data again
[Figure: normalized GFP expression vs. log10(IPTG)]
Using the Data Again – Parameter Estimation
1- and 2-parameter marginal posterior densities (dim = 6, N = 4; 6-level sparse grid forward problem solver)
An (almost) Industrial Application: Free-Cantilever Damping
• Gas damping of a freely vibrating cantilever
• MEMOSA for damping simulation
• 3-parameter marginal posterior densities: level-2 sparse grid in thickness, gas density, and frequency
Summary
• Uncertainty analysis: to provide improved prediction
  – Input characterization
  – Uncertainty propagation
  – Post-processing
• Generalized polynomial chaos (gPC)
  – Multivariate approximation theory
  – A highly efficient uncertainty propagation strategy
  – Makes many UQ tasks "post-processing"
• Data, any data, can help
  – UQ simulation needs to be dynamically data-driven
Support:
• NSF: DMS-0645035, IIS-0914447, IIS-1028291
• AFOSR: FA9550-08-1-0353
• DOE: DE-FC52-08NA28617, DE-SC0005713