Lecture 1
Describing Inverse Problems
Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton’s Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon’s Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems
Purpose of the Lecture
distinguish forward and inverse problems
categorize inverse problems
examine a few examples
enumerate different kinds of solutions to inverse problems
Part 1
Lingo for discussing the relationship between observations and the things
that we want to learn from them
three important definitions
data, d = [d1, d2, …, dN]T: things that are measured in an experiment or observed in nature
model parameters, m = [m1, m2, …, mM]T: things you want to know about the world
quantitative model (or theory): the relationship between data and model parameters

examples
data: gravitational acceleration; travel time of seismic waves
model parameters: density; seismic velocity
quantitative model: Newton’s law of gravity; the seismic wave equation
Forward Theory (estimates → predictions):
m^est → Quantitative Model → d^pre

Inverse Theory (observations → estimates):
d^obs → Quantitative Model → m^est
Even starting from the true model, m^true → Quantitative Model → d^pre, the observed data differ: d^obs ≠ d^pre, due to observational error. In turn, m^est ≠ m^true, due to error propagation.
Understanding the effects of observational error is central to Inverse Theory.
Part 2
types of quantitative models (or theories)
A. Implicit Theory
the L relationships f(d,m) = 0 between the data and the model parameters are known
Example
mass = density ⨉ length ⨉ width ⨉ height: M = ρ ⨉ L ⨉ W ⨉ H
(equivalently, mass = density ⨉ volume)
measure: mass, d1; size, d2, d3, d4
want to know: density, m1
d1 = m1 d2 d3 d4, or f1(d,m) = d1 − m1 d2 d3 d4 = 0
d = [d1, d2, d3, d4]T and N = 4; m = [m1]T and M = 1; f1(d,m) = 0 and L = 1
note: there is no guarantee that f(d,m) = 0 contains enough information for a unique estimate of m; determining whether or not it does is part of the inverse problem
B. Explicit Theory
the equations can be arranged so that d is a function of m:
d = g(m), or d − g(m) = 0
L = N: one equation per datum
Example: a rectangle with length L and height H
circumference = 2 ⨉ length + 2 ⨉ height: C = 2L + 2H
area = length ⨉ height: A = LH
measure: C = d1, A = d2
want to know: L = m1, H = m2
d = [d1, d2]T and N = 2; m = [m1, m2]T and M = 2
d1 = 2m1 + 2m2 and d2 = m1 m2, i.e. d = g(m)
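As a numerical check on this example, here is a minimal sketch in Python (the lecture’s own snippets use MatLab; the helper names g and g_inverse are ours). The forward step evaluates d = g(m); the inverse step uses the fact that L and H are the roots of x² − (C/2)x + A = 0, since L + H = C/2 and LH = A.

```python
import math

def g(m):
    # forward problem for the rectangle: m = [L, H] -> d = [C, A]
    L, H = m
    return [2 * L + 2 * H, L * H]

def g_inverse(d):
    # invert d = g(m): L and H are roots of x^2 - (C/2) x + A = 0
    C, A = d
    s = C / 2.0
    disc = s * s - 4.0 * A
    if disc < 0:
        raise ValueError("no real rectangle fits these data")
    r = math.sqrt(disc)
    return [(s + r) / 2.0, (s - r) / 2.0]

d = g([3.0, 2.0])      # data predicted by the true model L=3, H=2
m_est = g_inverse(d)   # model recovered from the data
```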
C. Linear Explicit Theory
the function g(m) is a matrix G times m
d = Gm
G, the “data kernel”, has N rows and M columns
Example: a rock made of gold and quartz
total mass = density of gold ⨉ volume of gold + density of quartz ⨉ volume of quartz: M = ρg Vg + ρq Vq
total volume = volume of gold + volume of quartz: V = Vg + Vq
measure: V = d1, M = d2
want to know: Vg = m1, Vq = m2 (assume ρg and ρq are known)
d = [d1, d2]T and N = 2; m = [m1, m2]T and M = 2

d = [ 1    1  ] m
    [ ρg   ρq ]
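This 2⨉2 system can be solved directly. A small Python sketch (solve_2x2 is our helper; the densities, roughly 19.3 g/cm³ for gold and 2.65 g/cm³ for quartz, are assumed values) builds G, makes synthetic data from a known model, and recovers it:

```python
rho_g, rho_q = 19.3, 2.65   # assumed densities of gold and quartz

def solve_2x2(G, d):
    # Cramer's rule for a 2 x 2 linear system G m = d
    (a, b), (c, e) = G
    det = a * e - b * c
    return [(d[0] * e - b * d[1]) / det,
            (a * d[1] - d[0] * c) / det]

G = [[1.0, 1.0],        # total volume row: V = Vg + Vq
     [rho_g, rho_q]]    # total mass row:   M = rho_g*Vg + rho_q*Vq

# synthetic data from a known model Vg = 0.5, Vq = 2.0
V = 0.5 + 2.0
M = rho_g * 0.5 + rho_q * 2.0
m_est = solve_2x2(G, [V, M])   # recovers [0.5, 2.0]
```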
D. Linear Implicit Theory
the L relationships between the data and the model parameters are linear:
Fx = 0, with x = [d, m]T, where F has L rows and N+M columns
in all these examples m is discrete: discrete inverse theory
one could have a continuous m(x) instead: continuous inverse theory
in this course we will usually approximate a continuous m(x) as a discrete vector m:
m = [m(Δx), m(2Δx), m(3Δx), …, m(MΔx)]T
but we will spend some time later in the course dealing with the continuous problem directly
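As a small illustration of this discretization, a Python sketch (the continuous model m(x) = x² and the grid spacing are arbitrary choices):

```python
# Discretize a continuous model m(x) by sampling on a regular grid
# x = k*dx for k = 1..M, giving m = [m(dx), m(2*dx), ..., m(M*dx)]^T
dx, M = 0.5, 8

def m_of_x(x):
    return x * x          # hypothetical continuous model m(x)

m = [m_of_x(k * dx) for k in range(1, M + 1)]
```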
Part 3
Some Examples
[Figure: global temperature anomaly, Ti (deg C), vs. calendar year t, 1965-2010]

A. Fitting a straight line to data
T(t) = a + bt
each data point is predicted by a straight line: Ti = a + b ti
matrix formulation: d = Gm with M = 2, where row i of G is [1, ti]
B. Fitting a parabola
T(t) = a + bt + ct²
each data point is predicted by a quadratic curve: Ti = a + b ti + c ti²
matrix formulation: d = Gm with M = 3, where row i of G is [1, ti, ti²]
note the similarity between the straight line and the parabola: each column of G is a power of t
in MatLab
G=[ones(N,1), t, t.^2];
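For the straight-line case, the least-squares solution of d = Gm can be sketched in plain Python by solving the 2⨉2 normal equations GᵀGm = Gᵀd in closed form (fit_line is our helper name; least squares itself is the subject of Lecture 4):

```python
def fit_line(t, d):
    # least-squares fit of d_i = a + b*t_i via the normal equations
    # [N   St ] [a]   [Sd ]
    # [St  Stt] [b] = [Std]
    N = len(t)
    St, Sd = sum(t), sum(d)
    Stt = sum(x * x for x in t)
    Std = sum(x * y for x, y in zip(t, d))
    det = N * Stt - St * St
    a = (Sd * Stt - St * Std) / det
    b = (N * Std - St * Sd) / det
    return a, b

t = [0.0, 1.0, 2.0, 3.0]
d = [1.0, 3.0, 5.0, 7.0]     # exactly d = 1 + 2 t
a, b = fit_line(t, d)
```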
C. Acoustic Tomography
[Figure: a 4⨉4 grid of pixels numbered 1-16, each of size h ⨉ h, with acoustic sources S on one side and receivers R on the other]
travel time = length ⨉ slowness
collect data along rows and columns
matrix formulation: d = Gm with N = 8 and M = 16
In MatLab:
G = zeros(N,M);
for i = [1:4]
  for j = [1:4]
    % measurements over rows
    k = (i-1)*4 + j;
    G(i,k) = 1;
    % measurements over columns
    k = (j-1)*4 + i;
    G(i+4,k) = 1;
  end
end
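An equivalent construction in Python (0-based indexing; this sketch is ours) also makes it easy to check that every ray crosses exactly four pixels and every pixel is crossed by exactly two rays:

```python
# Build the 8 x 16 acoustic-tomography data kernel: ray i runs along
# row i of the 4 x 4 pixel grid, ray i+4 runs along column i.
N, M, n = 8, 16, 4
G = [[0.0] * M for _ in range(N)]

for i in range(n):
    for j in range(n):
        G[i][i * n + j] = 1.0        # measurement over row i
        G[i + n][j * n + i] = 1.0    # measurement over column i

row_hits = [sum(row) for row in G]                             # 4 per ray
col_hits = [sum(G[i][k] for i in range(N)) for k in range(M)]  # 2 per pixel
```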
D. X-ray Imaging
[Figure: (A) an X-ray source S and receivers R1-R5 imaging an enlarged lymph node; (B) the discretized geometry]
theory
I = intensity of X-rays (data)
s = distance along the beam
c = absorption coefficient (model parameters)
dI/ds = −c I
with a Taylor series approximation, followed by a discrete pixel approximation:
d = Gm, where Gij is the length of beam i in pixel j
matrix formulation: d = Gm with N ≈ 10⁶ and M ≈ 10⁶
note that G is huge (10⁶ ⨉ 10⁶) but sparse (mostly zero), since a beam passes through only a tiny fraction of the total number of pixels
in MatLab
G = spalloc( N, M, MAXNONZEROELEMENTS);
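The idea behind sparse storage can be mimicked in plain Python with a dictionary mapping (row, column) to value, a stand-in for MatLab’s spalloc (matvec is our helper); multiplying by m then only touches the stored nonzeros:

```python
# Sparse storage of a huge, mostly-zero G: keep only nonzero entries
# in a dict keyed by (row, col).
G = {}
G[(0, 5)] = 1.0       # e.g. beam 0 crosses pixels 5 and 6
G[(0, 6)] = 1.0

def matvec(G, m, N):
    # d = G m, visiting only the stored nonzero entries
    d = [0.0] * N
    for (i, j), val in G.items():
        d[i] += val * m[j]
    return d

m = [2.0] * 10
d = matvec(G, m, N=3)   # only rows touched by nonzeros are changed
```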
E. Spectral Curve Fitting
[Figure: a single spectral peak, counts vs. velocity (mm/s); histograms of the counts d and their probability density p(d)]
each peak is a “Lorentzian” curve with area A, position f, and width c
with q spectral peaks, the data depend nonlinearly on these model parameters: d = g(m)
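One common Lorentzian parametrization (an assumption on our part; the slide’s exact formula is not reproduced here) makes the nonlinearity explicit: the predicted counts depend nonlinearly on each peak’s position f and width c, so d = g(m) cannot be written as d = Gm.

```python
import math

def lorentzian(z, A, f, c):
    # Lorentzian of area A, center f, full width c (assumed form)
    return (A / math.pi) * (c / 2) / ((z - f) ** 2 + (c / 2) ** 2)

def g(m, z):
    # m packs q peaks as [A1, f1, c1, A2, f2, c2, ...];
    # the predicted datum at each z_i sums the q peaks
    q = len(m) // 3
    return [sum(lorentzian(zi, m[3*k], m[3*k+1], m[3*k+2])
                for k in range(q))
            for zi in z]
```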
F. Factor Analysis

[Figure: ocean sediment samples s1, s2, each a mixture of sources containing elements e1-e5]

d = g(m)
Part 4
What kind of solution are we looking for ?
A: Estimate of model parameters
meaning numerical values
m1 = 10.5, m2 = 7.2
But we really need confidence limits, too
m1 = 10.5 ± 0.2, m2 = 7.2 ± 0.1
or
m1 = 10.5 ± 22.3, m2 = 7.2 ± 9.1
completely different implications!
B: probability density functions
if p(m1) is simple, this is not so different from confidence intervals
[Figure: three probability density functions p(m): a single narrow peak near m = 5, a bimodal distribution with peaks near m = 3 and m = 8, and a broad, nearly flat distribution]

m is about 5, plus or minus 1.5
m is either about 3 plus or minus 1, or about 8 plus or minus 1, but the latter is less likely
we don’t really know anything useful about m
C: localized averages
A = 0.2 m9 + 0.6 m10 + 0.2 m11 might be better determined than either m9 or m10 or m11 individually
Is this useful? Do we care about A = 0.2 m9 + 0.6 m10 + 0.2 m11?
Maybe …
Suppose m is a discrete approximation of m(x)

[Figure: a continuous curve m(x) sampled at discrete points, with m9, m10, m11 near x10; the weights 0.2, 0.6, 0.2 are localized in the vicinity of x10]

A = 0.2 m9 + 0.6 m10 + 0.2 m11 is then a weighted average of m(x) in the vicinity of x10, the average “localized” there by the weights.
A localized average means we can’t determine m(x) at x10 itself, but we can determine the average value of m(x) near x10. Such a localized average might very well be useful.
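A localized average is just an inner product aᵀm between an averaging vector a and the model vector. A Python sketch (the model values here are made up):

```python
# Localized average A = a^T m, with averaging vector a carrying
# weights 0.2, 0.6, 0.2 on m9, m10, m11 (0-based indices 8, 9, 10).
M = 20
m = [float(k) for k in range(M)]      # hypothetical model vector

a = [0.0] * M
a[8], a[9], a[10] = 0.2, 0.6, 0.2     # weights sum to 1

A = sum(ak * mk for ak, mk in zip(a, m))
```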