Bayesian Inference for Inverse Problems Using Surrogate Model
Dongbin Xiu
Department of Mathematics, Purdue University
Supported by AFOSR, DOE, NNSA, NSF
Overview
• Uncertainty quantification (UQ) and stochastic modeling
• Forward problem: uncertainty propagation
  – Brief introduction of generalized polynomial chaos (gPC)
  – Key issues: efficiency, curse of dimensionality
• Inverse problem: Bayesian inference
  – Use generalized polynomial chaos (gPC)
• Back to the forward problem
  – Issues and challenges
Illustrative Example: Burgers' Equation
• Burgers' equation:
  ∂u/∂t + u ∂u/∂x = ν ∂²u/∂x²,   u(−1) = 1, u(1) = −1
• Steady-state solution:
  u(x) = −A tanh[ (A/2ν)(x − z) ],   where u(z) = 0 and A = −∂u/∂x |_{x=z}
[Figure: perturbed vs. unperturbed steady-state solution and mesh distribution; u(−1) = 1 vs. u(−1) = 1.01, ν = 0.05]
Effects of Uncertainty – "Supersensitivity"
• Burgers' equation:
  ∂u/∂t + u ∂u/∂x = ν ∂²u/∂x²,   x ∈ [−1, 1]
• Boundary conditions:
  u(−1) = 1 + δ,   u(1) = −1,   δ ~ U(0, 0.1)
(Xiu & Karniadakis, Int. J. Numer. Meth. Eng., vol. 61, 2004)
[Figure: solution envelope showing lower bound, upper bound, mean, and standard deviation]
(Re-)Formulation of PDE: Input Parameterization
  ∂u/∂t (t, x) = L(u) + boundary/initial conditions
• Goal: characterize the random inputs by a set of random variables
  – Finite number
  – Mutual independence
• If the inputs are parameters:
  – Identify the (smallest) independent set
  – Prescribe probability distributions
• Else if the inputs are fields/processes:
  – Approximate the field by a function of a finite number of RVs
  – Well studied for Gaussian processes; under-developed for non-Gaussian processes
  – Examples: Karhunen-Loeve expansion, spectral decomposition, etc.
  a(x, ω) ≈ μ_a(x) + Σ_{i=1}^{d} a_i(x) Z_i(ω)
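As a concrete sketch of the last bullet, a truncated Karhunen-Loeve expansion can be built from the eigendecomposition of a discretized covariance matrix. The zero mean, exponential covariance kernel, correlation length 0.5, and grid below are illustrative choices, not from the talk:

```python
import numpy as np

def kl_expansion(x, mean, cov_fn, d):
    """Truncated Karhunen-Loeve expansion on a grid x:
    a(x, w) ~ mean(x) + sum_{i=1}^d sqrt(lam_i) phi_i(x) Z_i(w).
    Simple matrix version: eigendecompose the discretized covariance
    (quadrature weights omitted, so eigenvalues scale with the grid)."""
    C = cov_fn(x[:, None], x[None, :])        # covariance matrix C_ij = C(x_i, x_j)
    lam, phi = np.linalg.eigh(C)              # eigenvalues in ascending order
    lam = lam[::-1][:d]                       # keep the d largest modes
    phi = phi[:, ::-1][:, :d]
    sample = lambda Z: mean(x) + phi @ (np.sqrt(np.maximum(lam, 0.0)) * Z)
    return lam, sample

# Illustrative: zero-mean process, exponential covariance, correlation length 0.5
x = np.linspace(0.0, 1.0, 200)
lam, sample = kl_expansion(x, lambda t: np.zeros_like(t),
                           lambda s, t: np.exp(-np.abs(s - t) / 0.5), d=10)
a = sample(np.random.default_rng(0).standard_normal(10))  # one realization with d = 10 RVs
```

The rapid decay of `lam` is what justifies representing the field with a small, finite set of independent random variables.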
The Reformulation
• Uncertain inputs are characterized by n_Z random variables Z
• Probability distribution of Z is prescribed (a non-trivial task):
  F_Z(s) = Pr(Z ≤ s),   s ∈ R^{n_Z}
• Stochastic PDE:
  ∂u/∂t (t, x, Z) = L(u) + boundary/initial conditions
• Solution:
  u(t, x, Z): [0, T] × D × R^{n_Z} → R
Generalized Polynomial Chaos (gPC)
• Focus on the dependence on Z:  u(·, Z): R^{n_Z} → R
• Nth-order gPC expansion:
  u_N(t, x, Z) = Σ_{k=0}^{N} u_k(t, x) Φ_k(Z)
• Orthogonal basis:
  E[Φ_i(Z) Φ_j(Z)] = ∫ Φ_i(z) Φ_j(z) ρ(z) dz = δ_ij
• u_N(Z) ∈ P_N = {space of n_Z-variate polynomials of degree up to N},
  dim P_N = C(n_Z + N, N)
• Multi-index notation: i = (i_1, …, i_{n_Z}),  |i| = i_1 + ⋯ + i_{n_Z}
• Expectation: E[g(Z)] = ∫_R g(z) ρ(z) dz
• The distribution of Z (Gaussian, Gamma, Beta, …) determines the basis
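The binomial dimension formula above is the source of the curse of dimensionality; a few lines (illustrative, not from the talk) make the growth concrete:

```python
from math import comb

def gpc_dim(nz, N):
    """dim P_N = C(nz + N, N): number of nz-variate polynomials of total degree <= N."""
    return comb(nz + N, N)

# Basis size at fixed order N = 3 as the number of random variables grows:
# 1 -> 4, 2 -> 10, 10 -> 286, 50 -> 23426
sizes = {nz: gpc_dim(nz, 3) for nz in (1, 2, 5, 10, 20, 50)}
```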
gPC: Basis
• Orthogonality:
  ∫ Φ_i(z) Φ_j(z) ρ(z) dz = E[Φ_i(Z) Φ_j(Z)] = δ_ij
• Hermite polynomials:  ∫_{−∞}^{∞} Φ_i(z) Φ_j(z) e^{−z²} dz = δ_ij
• Laguerre polynomials:  ∫_{0}^{∞} Φ_i(z) Φ_j(z) e^{−z} dz = δ_ij
• Legendre polynomials:  ∫_{−1}^{1} Φ_i(z) Φ_j(z) dz = δ_ij
(Xiu & Karniadakis, SISC, 2002)
• Example: approximating a uniform random variable
  – Convergence, but non-optimal
  – First-order Legendre is exact
gPC Basis: the Choices
• Orthogonality:
  ∫ Φ_i(z) Φ_j(z) ρ(z) dz = E[Φ_i(Z) Φ_j(Z)] = δ_ij
• Example: Hermite polynomials,
  ∫_{−∞}^{∞} Φ_i(z) Φ_j(z) e^{−z²} dz = δ_ij
• The polynomials, for Z ~ N(0,1):
  Φ_0 = 1,  Φ_1 = Z,  Φ_2 = Z² − 1,  Φ_3 = Z³ − 3Z,  …
• Approximation of an arbitrary random variable requires L² integrability
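The listed polynomials are the probabilists' Hermite polynomials He_n. A quick quadrature check (a sketch using NumPy's `hermite_e` module, not from the talk) confirms they are orthogonal under the N(0,1) density, with norms E[He_n²] = n!, so the δ_ij form holds for the normalized basis He_n/√(n!):

```python
import numpy as np
from math import sqrt, pi
from numpy.polynomial import hermite_e as He

# Gauss quadrature for the weight exp(-z^2/2); dividing the weights by
# sqrt(2*pi) turns quadrature sums into expectations under Z ~ N(0,1).
nodes, weights = He.hermegauss(20)
weights = weights / sqrt(2.0 * pi)

def E_He_He(i, j):
    """E[He_i(Z) He_j(Z)] for Z ~ N(0,1), by quadrature."""
    ci = np.eye(i + 1)[i]                    # coefficient vector selecting He_i
    cj = np.eye(j + 1)[j]
    return float(np.sum(weights * He.hermeval(nodes, ci) * He.hermeval(nodes, cj)))

gram = np.array([[E_He_He(i, j) for j in range(4)] for i in range(4)])
# gram is diag(0!, 1!, 2!, 3!) = diag(1, 1, 2, 6) up to quadrature roundoff
```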
Stochastic Galerkin
• Galerkin method: seek
  u_N(t, x, Z) = Σ_{k=0}^{N} u_k(t, x) Φ_k(Z)
such that
  E[ (∂u_N/∂t)(t, x, Z) Φ_m(Z) ] = E[ L(u_N) Φ_m(Z) ],   ∀ m ≤ N
• The result:
  – The residual is orthogonal to the gPC space
  – A set of deterministic equations for the coefficients
  – The equations are usually coupled: a new solver is required
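A minimal intrusive example (hypothetical, not from the talk): Galerkin projection of the scalar ODE du/dt = −k(Z)u with u(0) = 1, k(Z) = 1 + 0.5Z, Z ~ U(−1,1), in a normalized Legendre basis. The projected coefficient equations are coupled through the matrix A, which is exactly the "new solver" point above:

```python
import numpy as np
from numpy.polynomial import legendre as L

N = 8                                        # gPC order
nodes, w = L.leggauss(20)
w = w / 2.0                                  # quadrature for E[.] with rho(z) = 1/2

def phi(k, z):
    """Normalized Legendre: Phi_k = sqrt(2k+1) P_k, so E[Phi_i Phi_j] = delta_ij."""
    return np.sqrt(2 * k + 1) * L.legval(z, np.eye(k + 1)[k])

P = np.array([phi(k, nodes) for k in range(N + 1)])   # basis values at the nodes
k_vals = 1.0 + 0.5 * nodes                            # k(Z) at the nodes
A = (P * (k_vals * w)) @ P.T                          # A[m,k] = E[k(Z) Phi_k Phi_m]

# Galerkin system: du_m/dt = -sum_k A[m,k] u_k  -- coupled deterministic ODEs
u = np.zeros(N + 1)
u[0] = 1.0                                   # u(0) = 1: only the mean mode is nonzero
f = lambda v: -A @ v
dt, T = 1e-3, 1.0
for _ in range(int(round(T / dt))):          # RK4 time stepping
    k1 = f(u); k2 = f(u + 0.5*dt*k1); k3 = f(u + 0.5*dt*k2); k4 = f(u + dt*k3)
    u += (dt / 6.0) * (k1 + 2*k2 + 2*k3 + k4)

mean = u[0]                                  # E[u(T)] is the zeroth coefficient
std = np.sqrt(np.sum(u[1:]**2))              # orthonormal basis: Parseval
```

Note the statistics fall out of the coefficients directly, one of the "post-processing" payoffs mentioned in the summary.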
Stochastic Collocation
• Sampling (solution statistics only):
  – Random (Monte Carlo)
  – Deterministic (lattice rules, tensor grids, cubature)
• Collocation: satisfy the governing equations at selected nodes
  – Allows one to reuse existing deterministic codes repetitively
• Stochastic collocation: construct polynomial approximations
  – Node selection is critical to efficiency and accuracy
  – More than sampling
• Let Z¹, Z², …, Z^Q ∈ R^{n_Z} be a set of nodes/samples; then solve
  ∂u/∂t (t, x, Z_j) = L(u),   j = 1, …, Q
• Tensor grids: inefficient; sparse grids: more efficient
(Xiu & Hesthaven, SIAM J. Sci. Comput., 2005)
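A toy non-intrusive example (hypothetical, not from the talk): for du/dt = −k(Z)u, u(0) = 1, k(Z) = 1 + 0.5Z, Z ~ U(−1,1), run an unmodified deterministic solver once per Gauss-Legendre node and get statistics by quadrature:

```python
import numpy as np

def deterministic_solve(k, T=1.0):
    """Stand-in for an existing deterministic code: du/dt = -k u, u(0) = 1,
    whose exact solution at time T is exp(-k*T)."""
    return np.exp(-k * T)

# Q = 10 Gauss-Legendre nodes for Z ~ U(-1, 1); weights halved for rho(z) = 1/2
nodes, w = np.polynomial.legendre.leggauss(10)
w = w / 2.0

# One deterministic run per node -- the solver itself is untouched
ensemble = np.array([deterministic_solve(1.0 + 0.5 * z) for z in nodes])
mean = np.sum(w * ensemble)
std = np.sqrt(np.sum(w * ensemble**2) - mean**2)
```

This is the "reuse existing codes" point: the only interaction with the solver is choosing its inputs and collecting its outputs.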
Stochastic Collocation: Interpolation
• Definition: given a set of nodes and the solution ensemble, find u_N in a proper polynomial space such that u_N ≈ u in a proper sense.
• Interpolation approach:
  u_Q(Z) = Σ_{j=1}^{Q} u(Z_j) L_j(Z),   L_i(Z_j) = δ_ij,  1 ≤ i, j ≤ Q
• Optimal nodal distribution in high dimensions…
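The node-placement issue is visible already in one dimension (an illustrative sketch, not from the talk): Lagrange interpolation of Runge's function diverges on equispaced nodes but behaves on Chebyshev nodes.

```python
import numpy as np

def lagrange_interp(zj, uj, z):
    """Evaluate u_Q(z) = sum_j u(Z_j) L_j(z), where L_i(Z_j) = delta_ij."""
    out = np.zeros_like(z, dtype=float)
    for j in range(len(zj)):
        L = np.ones_like(z)
        for i in range(len(zj)):
            if i != j:
                L *= (z - zj[i]) / (zj[j] - zj[i])
        out += uj[j] * L
    return out

g = lambda z: 1.0 / (1.0 + 25.0 * z**2)            # Runge's function
z = np.linspace(-1.0, 1.0, 1001)

zj_equi = np.linspace(-1.0, 1.0, 15)                     # 15 equispaced nodes
zj_cheb = np.cos(np.pi * (2 * np.arange(15) + 1) / 30)   # 15 Chebyshev nodes

err_equi = np.max(np.abs(lagrange_interp(zj_equi, g(zj_equi), z) - g(z)))
err_cheb = np.max(np.abs(lagrange_interp(zj_cheb, g(zj_cheb), z) - g(z)))
# Equispaced nodes oscillate wildly near the endpoints; Chebyshev nodes do not
```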
Stochastic Collocation: Discrete Projection
• Orthogonal projection:
  u_N(t, x, Z) = P_N u = Σ_{k=0}^{N} u_k(t, x) Φ_k(Z),
  u_k = E[u(Z) Φ_k(Z)] = ∫ u(z) Φ_k(z) ρ(z) dz
• Discrete projection:
  w_N(t, x, Z) = Σ_{k=0}^{N} w_k(t, x) Φ_k(Z),
  w_k = Σ_{j=1}^{Q} u(t, x, Z_j) Φ_k(Z_j) α_j ≈ ∫ u(z) Φ_k(z) ρ(z) dz,
  w_k(t, x) → u_k(t, x) as Q → ∞
• Aliasing error:
  ε_Q = ‖u_N − w_N‖_{L²_ρ}
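A one-dimensional sketch of the discrete projection (hypothetical target u(Z) = exp(Z), Z ~ U(−1,1), not from the talk): quadrature recovers the gPC coefficients from node evaluations alone.

```python
import numpy as np
from numpy.polynomial import legendre as L

N, Q = 6, 20
nodes, alpha = L.leggauss(Q)                 # nodes Z_j and weights alpha_j
alpha = alpha / 2.0                          # rho(z) = 1/2 on [-1, 1]

def phi(k, z):
    """Normalized Legendre basis Phi_k = sqrt(2k+1) P_k."""
    return np.sqrt(2 * k + 1) * L.legval(z, np.eye(k + 1)[k])

u_nodes = np.exp(nodes)                      # "solution ensemble" at the nodes
w_coef = np.array([np.sum(u_nodes * phi(k, nodes) * alpha) for k in range(N + 1)])

# Reconstruct w_N and measure how well it approximates u(z) = exp(z)
z = np.linspace(-1.0, 1.0, 201)
w_N = sum(w_coef[k] * phi(k, z) for k in range(N + 1))
err = np.max(np.abs(w_N - np.exp(z)))
```

With Q large relative to N, the aliasing error is negligible and w_k is effectively the exact projection coefficient u_k.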
Inverse Parameter Estimation
• Solution of a stochastic system:
• Error:
• Need: probability distribution of the parameters Z
• Prior distribution:
  – Information on the prior distribution is critical: it requires direct measurements of the parameters
  – No/not enough direct measurements? (Use experience/intuition …)
  – How to take advantage of measurements of other variables?
Bayesian Inference for Parameter Estimation
• Setup:
  – Prior distribution of the parameters Z: π(z)
  – Forward problem simulation: y = G(Z)
  – Measurement/data: d
• Goal: estimate the distribution of Z, i.e., the posterior distribution
• Posterior distribution:
  π(Z | d) ∝ π(d | Z) π(Z),
  where π(d | Z) is the likelihood function
• Notes:
  – Difficult to manipulate
  – Classical sampling approaches can be time-consuming (MCMC, etc.)
Surrogate-based Bayesian Estimation
• Surrogate: y_N = G_N(Z)
• Approximate Bayes rule:
  π_N(Z | d) ∝ π_N(d | Z) π(Z),
  where π_N(d | Z) = Π_{i=1}^{n_d} π_{e_i}(d_i − G_{N,i}(Z))
• Properties:
  – Allows direct sampling with arbitrarily large samples
  – No additional simulations: the forward problem solver only
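A one-parameter sketch (hypothetical: G(z) = exp(z), uniform prior on [−1,1], Gaussian noise; none of these choices are from the talk): once the gPC surrogate is built, evaluating the posterior costs only polynomial evaluations.

```python
import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(1)
G = np.exp                                   # forward model (illustrative)
sigma, z_true = 0.05, 0.3
d = G(z_true) + sigma * rng.standard_normal()   # one noisy measurement

# Build the surrogate G_N by discrete projection: u_k = (2k+1) E[G(Z) P_k(Z)]
N = 6
nodes, alpha = L.leggauss(20)
alpha = alpha / 2.0
coef = np.array([(2 * k + 1) *
                 np.sum(G(nodes) * L.legval(nodes, np.eye(k + 1)[k]) * alpha)
                 for k in range(N + 1)])
G_N = lambda z: L.legval(z, coef)

# Approximate posterior pi_N(z|d) ~ pi_e(d - G_N(z)) * pi(z) on a dense grid:
# no new forward solves are needed, however finely we resolve it
z = np.linspace(-1.0, 1.0, 2001)
dz = z[1] - z[0]
post = np.exp(-(d - G_N(z))**2 / (2.0 * sigma**2)) * 0.5  # likelihood * uniform prior
post = post / (np.sum(post) * dz)                         # normalize
z_map = z[np.argmax(post)]
```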
Convergence Analysis
• Kullback-Leibler divergence: D(π ‖ π̃) = ∫ π(z) log(π(z)/π̃(z)) dz
• Basic assumption: the observation error is i.i.d. Gaussian

Theorem. If the gPC expansion G_N converges to G in L²_{π_Z}, then the posterior density π_N^d converges to π^d in the sense
  D(π_N^d ‖ π^d) → 0,   N → ∞.
Moreover, if
  ‖G_i(Z) − G_{N,i}(Z)‖_{L²_{π_Z}} ≤ C N^{−α},   1 ≤ i ≤ n_d, α > 0, C independent of N,
then for sufficiently large N,
  D(π_N^d ‖ π^d) ≲ N^{−α}.

• Notes: the fast (exponential) convergence rate is retained; so is slow convergence
(Marzouk & Xiu, Comm. Comput. Phys., vol. 6, 2009)
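The theorem can be checked numerically on a toy problem (hypothetical: G(z) = exp(z), Z ~ U(−1,1), Gaussian noise, uniform prior; illustrative choices, not from the paper): the KL divergence between the surrogate posterior and the exact posterior drops rapidly with N.

```python
import numpy as np
from numpy.polynomial import legendre as L

sigma, d = 0.05, float(np.exp(0.3))          # data taken at the noise-free value
z = np.linspace(-1.0, 1.0, 4001)
dz = z[1] - z[0]

def posterior(Gz):
    """Grid posterior for Gaussian likelihood N(d - G(z); 0, sigma^2), uniform prior."""
    p = np.exp(-(d - Gz)**2 / (2.0 * sigma**2))
    return p / (np.sum(p) * dz)

pi_exact = posterior(np.exp(z))
nodes, alpha = L.leggauss(40)
alpha = alpha / 2.0

kl = {}
for N in (2, 4, 8):
    # Surrogate G_N by discrete projection onto Legendre polynomials
    coef = np.array([(2 * k + 1) *
                     np.sum(np.exp(nodes) * L.legval(nodes, np.eye(k + 1)[k]) * alpha)
                     for k in range(N + 1)])
    pi_N = posterior(L.legval(z, coef))
    kl[N] = float(np.sum(pi_exact * np.log(pi_exact / pi_N)) * dz)
# kl[2] >> kl[4] >> kl[8]: the surrogate's convergence rate carries over
```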
Parameter Estimation: Supersensitivity Example
• Burgers' equation:
  ∂u/∂t + u ∂u/∂x = ν ∂²u/∂x²,   x ∈ [−1, 1]
• Boundary conditions:
  u(−1) = 1 + δ(Z),   u(1) = −1,   0 < δ ≪ 1
• Data: noisy observation of the transition layer location (contrived numerically)
• Measurement noise: e ~ N(0, 0.05²)
• Prior distribution: uniform, δ ~ U(0, 0.1)
Parameter Estimation: Step Function
• Assume the forward model is a step function
• The posterior distribution is discontinuous
• Gibbs oscillations appear
• Slow convergence with global gPC basis functions
[Figure, left: exact forward solution G(z) and its gPC approximation G_N(z); right: exact posterior density π^d(z) and its gPC approximation]
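The Gibbs effect is easy to reproduce (an illustrative sketch, not the talk's forward model): the Legendre coefficients of the step G(z) = 1_{z>0} are c_0 = 1/2 and c_k = (P_{k−1}(0) − P_{k+1}(0))/2 for k ≥ 1, and the truncated expansion overshoots the jump by a roughly constant amount no matter how large N is.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def P_at_0(k):
    """P_k(0)."""
    return legval(0.0, np.eye(k + 1)[k])

def step_coeffs(N):
    """Exact Legendre coefficients of G(z) = 1_{z > 0} on [-1, 1]:
    c_0 = 1/2, c_k = (P_{k-1}(0) - P_{k+1}(0)) / 2 for k >= 1
    (from integrating (2k+1) P_k = P'_{k+1} - P'_{k-1} over [0, 1])."""
    c = np.zeros(N + 1)
    c[0] = 0.5
    for k in range(1, N + 1):
        c[k] = (P_at_0(k - 1) - P_at_0(k + 1)) / 2.0
    return c

z = np.linspace(-1.0, 1.0, 4001)
# Overshoot above the upper state G = 1 for increasing truncation order
overshoot = {N: float(legval(z, step_coeffs(N)).max() - 1.0) for N in (9, 19, 39)}
# The oscillation narrows with N but its amplitude does not decay:
# slow, non-uniform convergence of a global basis at a discontinuity
```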
Biological Application
• Example: estimate kinetic parameters in a genetic toggle switch
  – Differential-algebraic equation model from [Gardner et al., Nature, 2000]
  – Real experimental data: steady-state expression levels of one gene (v)
• 6 uncertain parameters, assumed to be independent and uniform in the forward problem
• Good agreement with measurement, but let's now use the data again
[Figure: normalized GFP expression vs. log10(IPTG)]
Using the Data Again – Parameter Estimation
1- and 2-parameter marginal posterior densities (dim = 6, N = 4; 6-level sparse grid forward problem solver)
An (almost) Industrial Application: Free-Cantilever Damping
• Gas damping of a freely vibrating cantilever
• MEMOSA for damping simulation
• 3-parameter marginal posterior densities: level-2 sparse grid in thickness, gas density, and frequency
Summary
• Uncertainty analysis: to provide improved prediction
  – Input characterization
  – Uncertainty propagation
  – Post-processing
• Generalized polynomial chaos (gPC)
  – Multivariate approximation theory
  – A highly efficient uncertainty propagation strategy
  – Makes many UQ tasks "post-processing"
• Data, any data, can help
  – UQ simulation needs to be dynamically data-driven
Support:
• NSF: DMS-0645035, IIS-0914447, IIS-1028291
• AFOSR: FA9550-08-1-0353
• DOE: DE-FC52-08NA28617, DE-SC0005713