approximation errors, inverse problems and model reductionapproximation errors, inverse problems and...

Post on 09-Feb-2020

5 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Approximation errors, inverse problemsand model reduction

Ville Kolehmainen1

1Department of Applied Physics,University of Eastern Finland (UEF), Kuopio, Finland

IMA workshop “Sensor Location in Distributed ParameterSystem”, Minneapolis, MN, September 6-8, 2017

Joint work with:

I Antti LipponenI Meghdoot MozumderI Tanja TarvainenI Simon ArridgeI Jari Kaipio

Outline:

Approximation Error Model (AEM)

Approximate marginalization of auxiliary unknowns ininverse problems using the AEM

Construction of low cost predictor model using the AEM

Approximation Error Model (AEM)

Let y ∈ Rm and x ∈ Rn, and let us have:

I An accurate modely = f (x)

I A reduced modely ≈ f (x)

I In the approximation error model(AEM), we write the accurate modelas

y = f (x) + [f (x)− f (x)]︸ ︷︷ ︸ε(x)

= f (x) + ε(x)

where ε(x) is the approximationerror.

I We discuss how the model can beused for

1) Marginalization of auxiliaryunknowns in inverse problems

2) Construction of a low costpredictor model for f (x).

AEM was originally proposed in:

Approximate marginalization of auxiliaryunknowns in inverse problems using the AEM

I Consider the inverse problem of estimating x ∈ Rn fromnoisy observation y ∈ Rm, given the model

y = f (x , z) + e

wherex ∈ Rn: primary unknownz ∈ Rd : uninteresting, auxiliary unknowns (in this talk, sensor

locations & coupling coefficients)I Complete Bayesian solution: Posterior density modelπ(x , z|y). In many practical applications

I estimation of all parameters (x , z) orI marginalization π(x |y) =

∫ ∫π(x , z|y)d z

is infeasible due to computation time limitations.

I Classical “ignorance” solution: treat z as fixed conditioningvariables, and estimate x from

π(x |y , z = z0)

→ large errors if z0 is incorrect.I Figure: 1D-marginal posterior π(x`|y):

I exact marginal π(x`|y) (black line)I π(x`|y , z = z0) with incorrect z0 (blue)I true value of x` (vertical).

Conventional measurement error model (CEM)

I Consider the conventional measurement model

y = f (x) + e (1)

I Joint densityπ(y , x ,e) = π(y | x ,e)π(e | x)π(x) = π(y ,e | x)π(x)

I In case of (1), we have π(y | x ,e) = δ(y − f (x)− e), and

π(y | x) =

∫π(y ,e | x) d e

=

∫δ(y − f (x)− e)π(e | x) d e

= πe | x (y − f (x) | x)

I In the (usual) case of mutually independent x and e, wehave πe | x (e | x) = πe(e) and

π(y |x) = πe(y − f (x))

I Furthermore, if π(e) = N (e∗, Γe) and π(x) = N (x∗, Γx ), wehave

π(x | y) ∝ exp(−1

2

(‖Le(y − f (x)− e∗)‖2 + ‖Lx (x − x∗)‖2

)),

where LTe Le = Γ−1

e and LTx Lx = Γ−1

x .I MAP estimate with the CEM:

minx

‖Le(y − f (x)− e∗)‖2 + ‖Lx (x − x∗)‖2

Approximation error model (AEM)

I Accurate measurement model

y = f (x , z) + e (2)

I Instead of using f (x , z) and treating (x , z) as unknowns,we fix z ← z0 and use a possibly drastically reducedforward model

x 7→ f (x , z0)

I We write the accurate measurement model as

y = f (x , z) + e= f (x , z0) +

[f (x , z)− f (x , z0)

]+ e

= f (x , z0) + ε(x , z) + e (3)

where ε(x , z) = f (x , z)− f (x , z0) is the approximation error.I The objective is to formulate posterior model

π(x |y) ∝ π(y |x)π(x)

using measurement model (3).I We consider e independent of (x , z).

I Using Bayes formula repeatedly, we get

π(y , x , z,e, ε) = π(y | x , z,e, ε)π(x , z,e, ε)

= δ(y − f (x , z0)− e − ε)π(e, ε | x , z)π(z | x)π(x)

= π(y , z,e, ε | x)π(x)

I Hence

π(y | x) =

∫∫∫∫π(y , z,e, ε | x)de dεdz

=

∫πe(y − f (x , z0)− ε)πε|x (ε | x) dε

(note: convolution integral w.r.t. ε)I To get a computationally useful and efficient form, πe andπε|x are approximated with Gaussian distributions.

I Let the Gaussian approximation of π(ε, x) be

π(ε, x) ∝ exp

−1

2

(ε− ε∗x − x∗

)T (Γε ΓεxΓxε Γx

)−1(ε− ε∗x − x∗

)

I Hence π(e) = N (e∗, Γe), π(ε | x) = N (ε∗|x , Γε|x ), where

ε∗|x = ε∗ + Γεx Γ−1x (x − x∗), Γε|x = Γε − Γεx Γ−1

x Γxε

I Define ν | x = e + ε | x , π(ν | x) = N (ν∗|x , Γν|x ), where

ν∗|x = e∗+ε∗+ Γεx Γ−1x (x−x∗), Γν|x = Γe + Γε−Γεx Γ−1

x Γxε

I Approximate likelihood

π(y | x) = N (y − f (x , z0)− ν∗|x , Γν|x )

I Posterior model

π(x | y) ∝ π(y | x)π(x) ∝ exp(−1

2V (x)

)where V (x)

V (x) = (y − f (x , z0)− ν∗|x )T Γ−1ν|x (y − f (x , z0)− ν∗|x )

+ (x − x∗)T Γ−1x (x − x∗)

= ‖Lν|x (y − f (x , z0)− ν∗|x )‖2 + ‖Lx (x − x∗)‖2

where Γ−1ν | x = LT

ν|xLν|x and Γ−1x = LT

x Lx .I MAP estimate with the AEM:

minx‖Lν|x (y − f (x , z0)− ν∗|x )‖2 + ‖Lx (x − x∗)‖2

Diffuse Optical Tomography (DOT)

qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq

qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq

qqqqqqqqqqqqqqqqqqq

qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq

qqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq

∂Ω

Ω

µa(r), µs (r)

s1

s2

s3

s4

s5

s6

d1

d2

d3

d4

d5

d6

I Imaging of biological tissues usingNIR light (turbid medium).

I Target is illuminated at locationsj ⊂ ∂Ω, response measured atlocations dk ⊂ ∂Ω.

I Sinusoidally modulated source→Amplitude & phase of the modulatedwave are measured.

I Inverse problem; Estimateabsorption and scatteringcoefficients (µa(r), µs (r)).

Mathematical model

I Diffusion approximation (DA) to the RTE;

−∇ · κ(r)∇Φ(r , ω) + µa(r)Φ(r , ω) +iωc

Φ(r , ω) = 0, r ∈ Ω

where κ(r) = (3(µa(r) + µs (r)))−1.I Boundary condition

Φ(r , ω) + 2ζκ(r)∂Φ(r , ω)

∂ν= g(r , ω), r ∈ ∂Ω,

I Measurable quantity (exitance);

φ =

∫d−κ(r)

∂Φ(r , ω)

∂νd S, d ⊂ ∂Ω

I Numerical solution by FEM. Notation;

y = f (x , z), x = (µa , µs )T ∈ Rn

Computational Examples

I We consider estimation of x = (µa , µs )T from

y = f (x , z) + e

I Three cases. Auxiliary parameters z are:a) Coupling losses (amplitude & phase shift) of the source and

detector fibres.b) Locations sj ,dk of the sources and detectors.c) Combination of a) & b)

I We study the approximate marginalization over a)-c) by theAEM with 2D simulations

Estimates

I REF: x estimated using correct realization of z with theconventional error model y = f (x , z) + e:

minx

‖Le(y − f (x , z)− e∗)‖2 + ‖Lx (x − x∗)‖2

(4)

I CEM: x estimated using incorrect realization z = z0 andconventional error model y = f (x , z0) + e:

minx

‖Le(y − f (x , z0)− e∗)‖2 + ‖Lx (x − x∗)‖2

(5)

I AEM: x estimated using incorrect z = z0 and theapproximation error model y = f (x , z0) + ε+ e:

minx

‖Lν|x (y − f (x , z0)− ν∗|x )‖2 + ‖Lx (x − x∗)‖2

(6)

Estimation of approximation error statistics

I Approximation error

ε(x , z) = f (x , z)− f (x , z0)

I Draw sets of samples x (`) andz(`) from π(x) and π(z)

I Compute realizations

ε(`) = f (x (`), z(`))− f (x (`), z0)

I Estimate ε∗ and Γε|x as sampleaverages from x (`), ε(`).

a) source and detector coupling coefficients

I Top: µa . Bottom: µs .I Columns from left to right: 1) True target, 2) REF, 3) CEM,

4) AEM

b) source and detector locations

I Top: µa . Bottom: µs .I Columns from left to right: 1) True target, 2) REF, 3) CEM,

4) AEM

c) source and detector locations & couplingcoefficients

I Top: µa . Bottom: µs .I Columns from left to right: 1) True target, 2) REF, 3) CEM,

4) AEM

Robustness w.r.t. the prior model π(z) (couplingcoefficients)

I Variance of the prior π(z) increases from left to right.I Magnitude of the error ‖z − z0‖ increases from top to

bottom.

Robustness w.r.t. the prior model π(z) (optodelocations)

I Variance of the prior π(z) increases from left to right.I Magnitude of the error ‖z − z0‖ increases from top to

bottom.

Construction of low cost predictor model usingthe AEM

A low cost predictor model using the AEM:

I Accurate simulation model model

y = f (x)

I Reduced simulation model model

y ≈ f (x)

I (AEM) model

y = f (x) + [f (x)− f (x)]︸ ︷︷ ︸ε(x)

= f (x) + ε(x)

A low cost predictor model using the AEM:

I We construct a low cost predictor model

ε(x) ≈ Pε(x)

I Approximate the accurate model by

y ≈ f (x) + Pε(x)

I Pε(x) constructed using statistical learning.I Remark: Conventional approach is to construct predictor

directly for f (x):y ≈ Pf (x)

Algorithm:

1. Construct prior model for x2. Construct training set x (`), ε(x (`))3. Train/construct the predictor model Pε(x) for ε4. The final simulation model f (x) + Pε(x)

We evaluate different regressor models in the numericalexample.

Simulation models in the numerical example:

I REF: Accurate simulation model f (x) used as thereference model.

I RED: Reduced simulation model f (x) .I REG-ACC: Regressor model

Pf (x)

that predicts the output of the accurate model.I REG-AE: AE model

f (x) + Pε(x)

that consists of the reduced model + predictor of theapproximation error.

Non-linear heat equation

I Let x ∈ [0 1]. We consider solution of

∂u∂t− ∂

∂x

(κ(u)

∂u∂x

)= 0, (7)

where u = u(x , t) is temperature and κ := κ(u) is thethermal conductivity.

I The initial and boundary conditions

u(x ,0) = u0, (8)

u(0, t) = u(1, t) = 0 (9)

where u0 is the initial temperature distribution.I Thermal conductivity

κ(u) = 0.2− 0.1/exp(

u2).

I Letu`+1 = f (u`)

denote the accurate model (REF) for the solution of theheat equation over an (sampling) interval ∆t = 0.01s fromtime index ` to `+ 1.

I Accurate model f (u`):I FD discretization with element size h = 1

99 .I Implicit Euler with a time step of δt = ∆t

100 .I Reduced model f (u`):

I FD discretization with element size h = 111 .

I Implicit Euler with a time step of δt = ∆t .

Training set for the regressor models

I u(k)0

Nk=1 drawn from u0 ∼ N (0, Γ) where

Γi,j = exp

−|xi − xj |2

2r2

(10)

I To emulate temperature distributions at different timeinstants `, each training sample was scaled asu(k) = au(k)

0 , where a ∼ Gamma(0.5,0.75)

Three random samples u0.

Regressor models

I Regressor models:I Linear regressionI Gaussian processesI LassoI K nearest neighborsI Support Vector MachineI Random Forest

I Error metric: the median of the relative error

‖u`+1 − f (u`) ‖/‖u`+1‖

Training sample size: Error wrt N using 10-foldcross validation.

RED: f (x)REG-ACC: Pf (x)REG-AE: f (x) + Pε(x)

Accuracy of u(x , t) using Gaussian processes(N = 7500)

RED: f (x)REG-ACC: Pf (x)REG-AE: f (x) + Pε(x)

Computation times

Table: Average computation times for different models in heatequation test case corresponding to a simulation of 50 timesteps.

Model Overall computation timeREF 18.0 sRED 0.16 sREG-ACC 0.40 sREG-AE 0.60 s

top related