Geology 5670/6670 Inverse Theory
16 Mar 2015
© A.R. Lowry 2015
Last time: Review of Inverse Assignment 1
Expect Assignment 2 on Wed or Fri of this week!
Assignment 1: Key to the assignment

Q1: OLS inversion for mantle heat flow Qm & radiogenic length lrad:

1.i. Use ordinary least squares to find the model parameters for Qs = Qm + A0 lrad: Let $d = Q_s$ from column 1; $G = [\,1 \;\; A_0\,]$; $G^{+} = (G^T G)^{-1} G^T$; and $m = G^{+} d$. If the units are correctly scaled, this gives m1 = Qm = 43.701 mW/m² and m2 = lrad = 19.457 km.

1.ii. Assume variance from given values and estimate formal parameter uncertainties: Note there are three issues in what I gave you! (1) Should use the mean variance, not the mean of […]! (2) […] from column 4, not column 3! (3) And these errors are expressed in mW m⁻², so they must also be converted to MKS units. But, using that $\sigma^2$ and

$$C_m = \sigma^2 (G^T G)^{-1},$$

the uncertainty in Qm is $\sigma_{Q_m} = \sqrt{[C_m]_{1,1}} = 0.462$ mW/m² and in lrad is $\sigma_{l_{rad}} = \sqrt{[C_m]_{2,2}} = 0.399$ km.
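A minimal Python sketch of steps 1.i and 1.ii, assuming the two-column design matrix G = [1 A0] so that the 2x2 normal equations can be inverted by hand. The function names and the data values are made up for illustration; they are not the assignment file and will not reproduce the numbers above.

```python
import math

def ols_line(a0, qs):
    """Fit qs = qm + a0 * lrad by ordinary least squares (G = [1 A0])."""
    n = len(qs)
    # Elements of the 2x2 normal matrix G^T G and of the vector G^T d.
    s1, sa, saa = float(n), sum(a0), sum(a * a for a in a0)
    sq, saq = sum(qs), sum(a * q for a, q in zip(a0, qs))
    det = s1 * saa - sa * sa
    # (G^T G)^{-1} for a symmetric 2x2 matrix.
    inv = [[saa / det, -sa / det], [-sa / det, s1 / det]]
    qm = inv[0][0] * sq + inv[0][1] * saq
    lrad = inv[1][0] * sq + inv[1][1] * saq
    return (qm, lrad), inv

def formal_uncertainties(inv, sigma2):
    """Square roots of the diagonal of C_m = sigma^2 (G^T G)^{-1}."""
    return math.sqrt(sigma2 * inv[0][0]), math.sqrt(sigma2 * inv[1][1])

# Made-up, noise-free data (NOT the assignment file): Qm = 40, lrad = 10.
a0 = [1.0, 2.0, 3.0, 4.0]
qs = [40.0 + 10.0 * a for a in a0]
(qm, lrad), inv = ols_line(a0, qs)
sig_qm, sig_lrad = formal_uncertainties(inv, sigma2=4.0)
print(qm, lrad, sig_qm, sig_lrad)
```

With real data, A0 and Qs must be in consistent units before forming G, as noted in 1.i.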
1.iii. Assume measurement error is unknown and estimate from the misfit: We calculate a residual

$$e = d - Gm$$

and use statistics to estimate the data variance via:

$$\tilde{\sigma}^2 = \frac{e^T e}{N - M}$$

This yields $\tilde{\sigma}$ = 21.1 mW m⁻² (as compared to 37.1 mW m⁻² if we use the mean of variance from the inverse of values in column 4 of the file). Using that estimate of data variance, the model uncertainties are $\sigma_{Q_m} = \sqrt{[C_m]_{1,1}} = 0.26$ mW m⁻² and $\sigma_{l_{rad}} = \sqrt{[C_m]_{2,2}} = 0.23$ km. If we use the a priori estimate of data variance, the parameter of misfit is

$$\chi^2 = \frac{e^T e}{\sigma^2 (N - M)} = 0.323.$$

This probably reflects a “correlation” that has been introduced by interpolating heat flow to a large number of locations from a small number of measurements (and that is not reflected in our representation of error).
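The numbers in 1.iii are mutually consistent, which can be checked with a few lines of Python. This sketch uses only the rounded values quoted on the slides, so agreement is to the quoted precision; the variable names are mine.

```python
sigma_misfit = 21.1   # mW m^-2, estimated from the residuals
sigma_prior = 37.1    # mW m^-2, from the mean a priori variance

# chi^2 = e^T e / (sigma^2 (N - M)) = sigma_misfit^2 / sigma_prior^2
chi2 = (sigma_misfit / sigma_prior) ** 2

# C_m = sigma^2 (G^T G)^{-1}, so parameter uncertainties scale
# linearly with the assumed data standard deviation.
scale = sigma_misfit / sigma_prior
sig_qm = 0.462 * scale     # mW m^-2
sig_lrad = 0.399 * scale   # km

print(round(chi2, 3), round(sig_qm, 2), round(sig_lrad, 2))
# prints: 0.323 0.26 0.23
```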
1.iv. We’ve already looked at the square root of the model parameter variances; in addition, the model covariance matrix exhibits a negative covariance. It is difficult to interpret what that means (other than a negative cross-correlation) without normalizing. The correlation matrix is:

$$[R_m]_{ij} = \frac{[C_m]_{ij}}{\sqrt{[C_m]_{ii}\,[C_m]_{jj}}} \;\rightarrow\; R_m = \begin{bmatrix} 1 & -0.93 \\ -0.93 & 1 \end{bmatrix}$$

That is, the parameter estimates are highly cross-correlated (and so, very suspect!)

From these results, I would say that (1) the parameter variances are surprisingly small (primarily because of the very large N) but (2) that doesn’t help us much, because the parameters are so strongly cross-correlated that the parameter variance is only a conditional uncertainty (i.e., the uncertainty of one parameter is conditioned on the premise that the other parameter estimate is correct!)
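The normalization from covariance to correlation is short enough to sketch in Python. The covariance values below are hypothetical, chosen only so the arithmetic is easy to follow; they are not the assignment's C_m.

```python
import math

def correlation_from_covariance(cm):
    """Return R with R[i][j] = C[i][j] / sqrt(C[i][i] * C[j][j])."""
    n = len(cm)
    return [[cm[i][j] / math.sqrt(cm[i][i] * cm[j][j]) for j in range(n)]
            for i in range(n)]

# Hypothetical 2x2 model covariance (NOT the assignment's C_m).
cm = [[4.0, -2.7],
      [-2.7, 2.25]]
rm = correlation_from_covariance(cm)
print(rm)
```

The diagonal of R is always 1, and the off-diagonal entries lie between -1 and 1, which is what makes them easier to interpret than raw covariances.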
Q2: WLS inversion for mantle heat flow Qm & radiogenic length lrad:

2.i. Use weighted least squares to find Qm & lrad: Now the pseudoinverse is given by

$$G^{+} = (G^T C_\epsilon^{-1} G)^{-1} G^T C_\epsilon^{-1}$$

where the inverse of the measurement covariance matrix, $C_\epsilon^{-1}$, has $1/\sigma_i^2$ (column 4 of the data file) along its main diagonal. (This can be done in Matlab/Octave with smaller memory & computation using C_eps=sparse(diag(dat(:,4)))*1e6; where dat is the name of the variable the file was loaded in. Note that C_eps as built here holds the inverse covariance, converted to MKS units.) This gives m1 = Qm = 45.476 mW/m² and m2 = lrad = 19.049 km (compared with 43.7 mW/m² and 19.5 km for OLS!)

2.ii. Using

$$C_m = (G^T C_\epsilon^{-1} G)^{-1},$$

the uncertainty in Qm is ±0.311 mW/m² and in lrad is ±0.263 km. The parameter of misfit is now 0.778, much closer to 1 than the 0.323 we got using an averaged estimate of variance, showing that an accurate representation of data uncertainty is important to getting trustworthy results.
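A minimal Python sketch of the weighted solution, assuming a diagonal measurement covariance so that the weights are just 1/σ_i². The function name and data are hypothetical, and the misfit parameter is computed here as the weighted sum of squared residuals divided by N − M, which is my reading of the slides.

```python
import math

def wls_line(a0, qs, inv_var):
    """Fit qs = qm + a0 * lrad with weights inv_var[i] = 1 / sigma_i^2."""
    # Elements of G^T C^-1 G and G^T C^-1 d for G = [1 A0].
    sw = sum(inv_var)
    swa = sum(w * a for w, a in zip(inv_var, a0))
    swaa = sum(w * a * a for w, a in zip(inv_var, a0))
    swq = sum(w * q for w, q in zip(inv_var, qs))
    swaq = sum(w * a * q for w, a, q in zip(inv_var, a0, qs))
    det = sw * swaa - swa * swa
    # Model covariance C_m = (G^T C^-1 G)^{-1}.
    cm = [[swaa / det, -swa / det], [-swa / det, sw / det]]
    qm = cm[0][0] * swq + cm[0][1] * swaq
    lrad = cm[1][0] * swq + cm[1][1] * swaq
    # Misfit parameter: weighted squared residuals over N - M (M = 2).
    resid = [q - (qm + a * lrad) for a, q in zip(a0, qs)]
    chi2 = sum(w * e * e for w, e in zip(inv_var, resid)) / (len(qs) - 2)
    return (qm, lrad), cm, chi2

# Made-up, noise-free data with unequal weights (NOT the assignment file).
a0 = [1.0, 2.0, 3.0, 4.0]
qs = [40.0 + 10.0 * a for a in a0]
inv_var = [1.0, 0.5, 0.25, 2.0]
(qm, lrad), cm, chi2 = wls_line(a0, qs, inv_var)
sig_qm, sig_lrad = math.sqrt(cm[0][0]), math.sqrt(cm[1][1])
print(qm, lrad, sig_qm, sig_lrad, chi2)
```

Because only sums over the data are needed, the N x N weight matrix is never formed, which is the same saving the sparse diagonal matrix gives in Matlab/Octave.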
Although the changes resulting from using WLS instead of OLS were small (~1.8 mW m⁻² and 0.4 km for Qm and lrad respectively), they were large relative to the formal uncertainties in each estimate. The improvement of the χ² parameter (along with basic theory) suggests the WLS is a better estimate, but the problematic high parameter cross-correlations are about the same in both cases (–0.9327 for OLS; –0.9325 for WLS).
Q3: The scatter is very large on the plot and the lines don’t pass through the densest part of the cloud, suggesting model inaccuracies and outlier effects… But there does appear to be a significant relationship!
Q4: Grid-search analysis of parameter error (using WLS):
The figures at left color-contour the WRMS misfit as a function of model parameters near the best-fit case. The grey contours show […] and […] confidence intervals for (top) the likelihood ratio method:

$$E_\alpha^{WLS} = E_{min}^{WLS}\left[1 + \frac{M}{N-M}\, F^{-1}_{1-\alpha}(M, N-M)\right]$$

and (bottom) the model length:

$$(m - \tilde{m})^T G^T W G\, (m - \tilde{m}) = \frac{M}{N-M}\, E_{min}^{WLS}\, F^{-1}_{1-\alpha}(M, N-M)$$

for […] & […]. The contours are identical (as expected). Also shown are the bounds from Cm as white bars (top); interestingly the lrad bounds stop at the contour but the Qm bound is larger… Probably reflecting the coordinate transformation from the error ellipse axes to the parameter coordinate axes. Regardless, these closely match the expected elliptical form.
These plots were made by first plotting EWRMS using surf, setting “hold on”, plotting EWRMS or the model length using contour within the same axes, printing to an epsc file format and editing in Illustrator (to remove the opaque background of the contour plot). The matlab/octave shell script is provided on the course website for reference.
Constrained Optimization:

Suppose we have inequality constraints $l \le m \le u$ on a nonlinear problem (!). Quadratic programming applied to an (iterative) Taylor-series approximation can be both computationally expensive and subject to breakdown! Instead, we may adopt a nonlinear programming approach:

(1) Projection Methods: If a model update would cross a constraint, “reflect” back… When the minimum is outside the permissible region, this converges to the minimum on the boundary.

[Figure: sketch in the (mi, mj) plane showing the bound ui and model iterates m0 and m1.]
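A minimal Python sketch of keeping updates inside the bounds. Two helpers are shown: `reflect`, which mirrors an update that crossed a bound as described above, and `clip`, which projects it onto the nearest bound. The descent loop uses clipping, a common projection variant that I have swapped in because it converges cleanly with a fixed step length. The toy misfit, step length, and function names are all assumptions for illustration.

```python
def reflect(value, lower, upper):
    """Mirror a value that crossed a bound back into [lower, upper]."""
    if value > upper:
        value = 2.0 * upper - value
    elif value < lower:
        value = 2.0 * lower - value
    # Guard against an overshoot larger than the interval width.
    return min(max(value, lower), upper)

def clip(value, lower, upper):
    """Project a value onto the nearest point of [lower, upper]."""
    return min(max(value, lower), upper)

def projected_descent(grad, m0, lower, upper, step=0.1, n_iter=500):
    """Gradient descent in which every update is projected onto the bounds."""
    m = list(m0)
    for _ in range(n_iter):
        g = grad(m)
        m = [clip(mi - step * gi, lo, up)
             for mi, gi, lo, up in zip(m, g, lower, upper)]
    return m

def toy_gradient(m):
    """Gradient of the toy misfit E(m) = (m1 - 3)^2 + (m2 - 1)^2."""
    return [2.0 * (m[0] - 3.0), 2.0 * (m[1] - 1.0)]

# The unconstrained minimum (3, 1) lies outside the bounds 0 <= m <= 2.
m_best = projected_descent(toy_gradient, [0.5, 0.5], [0.0, 0.0], [2.0, 2.0])
print(m_best, reflect(2.3, 0.0, 2.0))
```

Here the first parameter ends on its upper bound while the second reaches its unconstrained optimum, which is the boundary minimum for this separable misfit.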
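(2) Penalty Functions (“Barrier Functions”)

Modify the objective function such that

$$E'(m) = E(m) + \phi(m)$$

where e.g.

$$\phi(m) = \sum_{i=1}^{M} \frac{\alpha_i}{(u_i - m_i)^k}$$

[Figure: sketch in the (mi, mj) plane showing the bound ui.]

Minimization becomes:

$$\nabla E' = \nabla E + \nabla \phi = 0$$

Suppose k = 1. Taking $E = (\Delta d - G\Delta m)^T(\Delta d - G\Delta m)$,

$$\nabla E' = -2\, G^T (\Delta d - G \Delta m) + \beta$$

where

$$\beta = \begin{bmatrix} \dfrac{\alpha_1}{(u_1 - m_1)^2} & \dfrac{\alpha_2}{(u_2 - m_2)^2} & \cdots & \dfrac{\alpha_M}{(u_M - m_M)^2} \end{bmatrix}^T$$

Then

$$\Delta m = (G^T G)^{-1} \left( G^T \Delta d - \tfrac{1}{2}\beta \right)$$

(the factor of 1/2 can be absorbed into the $\alpha_i$).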
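A minimal one-parameter Python sketch of the k = 1 barrier, assuming a toy misfit E(m) = (m − 3)² with upper bound u = 2. Because E′ is convex for m < u, the zero of its gradient is found here by bisection rather than by the linearized update; that choice, the toy misfit, and the function names are mine. The sketch requires α > 0.

```python
def penalized_gradient(m, alpha, upper):
    """d/dm of E'(m) = (m - 3)^2 + alpha / (upper - m), the k = 1 barrier."""
    return 2.0 * (m - 3.0) + alpha / (upper - m) ** 2

def minimize_with_barrier(alpha, upper, lo=-10.0, n_iter=200):
    """Bisect for the zero of the gradient on lo < m < upper (alpha > 0)."""
    hi = upper
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if penalized_gradient(mid, alpha, upper) > 0.0:
            hi = mid   # minimum lies to the left of mid
        else:
            lo = mid   # minimum lies to the right of mid
    return 0.5 * (lo + hi)

# The unconstrained minimum (m = 3) violates the bound u = 2.
solutions = [minimize_with_barrier(alpha, upper=2.0)
             for alpha in (1.0, 1e-2, 1e-4)]
print(solutions)
```

As α shrinks, the penalized minimum moves toward the bound from the inside, which is the usual reason for reducing the barrier weight over successive solves.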
Can also use parameter transformations:

(1) Positivity constraints $m_i \ge 0$

Transform as, e.g.,

$$m'_i = \log_{10}(m_i) \;\Rightarrow\; m_i = 10^{m'_i}$$

Then, with predicted data

$$d_i^{pred} = F_i(m_1, m_2, \ldots, m_M) = F_i\left(10^{m'_1}, 10^{m'_2}, \ldots, 10^{m'_M}\right),$$

minimize the misfit error norm E with respect to $m'_1, m'_2, \ldots, m'_M$ and transform the m’ vector back to get m.
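A minimal Python sketch of minimizing in the log-transformed variable, assuming a hypothetical one-parameter misfit E(m) = (m − 0.5)² and plain gradient descent. The chain rule supplies the gradient with respect to m′; the step length and iteration count are arbitrary choices for this toy case.

```python
import math

LN10 = math.log(10.0)

def misfit_gradient(m):
    """dE/dm for the toy misfit E(m) = (m - 0.5)^2 (hypothetical)."""
    return 2.0 * (m - 0.5)

def minimize_in_log_space(m_start, step=0.1, n_iter=500):
    """Descend in m' = log10(m), so m = 10**m' can never become negative."""
    mp = math.log10(m_start)   # requires m_start > 0
    for _ in range(n_iter):
        m = 10.0 ** mp
        # Chain rule: dE/dm' = dE/dm * dm/dm' = dE/dm * ln(10) * m
        mp -= step * misfit_gradient(m) * LN10 * m
    return 10.0 ** mp          # transform back to m

m_best = minimize_in_log_space(1.0)
print(m_best)
```

Note that the transformed parameter can approach zero only as m′ tends to minus infinity, so the estimate stays strictly positive.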
(2) Bounds $l \le m \le u$

Transform as

$$m_i = l_i + (u_i - l_i)\, P(m'_i)$$

where:

$$P(m'_i) = \frac{1 + \mathrm{erf}(m'_i)}{2}, \qquad \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x \exp(-t^2)\, dt$$

P(m’i) has the property:

$$0 \le P(m'_i) \le 1 \quad \text{for} \quad -\infty \le m'_i \le \infty$$

Again, we minimize the misfit error E with respect to $m'_1, m'_2, \ldots, m'_M$ and transform back to m after we’ve found m’.
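A minimal Python sketch of the bounded transformation, using `math.erf` from the standard library. The inverse mapping is my addition (useful for turning a starting model into a starting m′); it relies on the identity P(m′) = Φ(√2 m′), where Φ is the standard normal CDF, and on `statistics.NormalDist.inv_cdf`. The function names and bounds are hypothetical.

```python
import math
from statistics import NormalDist

def to_bounded(mp, lower, upper):
    """Map an unconstrained m' to m = l + (u - l) * (1 + erf(m')) / 2."""
    p = 0.5 * (1.0 + math.erf(mp))
    return lower + (upper - lower) * p

def from_bounded(m, lower, upper):
    """Inverse map; requires lower < m < upper strictly."""
    p = (m - lower) / (upper - lower)
    # P(m') = Phi(sqrt(2) * m'), so m' = Phi^{-1}(P) / sqrt(2).
    return NormalDist().inv_cdf(p) / math.sqrt(2.0)

lower, upper = 1.0, 5.0   # hypothetical bounds
print(to_bounded(0.0, lower, upper))      # midpoint of the interval
print(to_bounded(-10.0, lower, upper), to_bounded(10.0, lower, upper))
print(from_bounded(to_bounded(0.3, lower, upper), lower, upper))
```

In floating point the mapping saturates at the bounds for large |m′|, where the gradient with respect to m′ vanishes, so a starting model well inside the interval is the safer choice.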