Mathematics of Data Assimilation
Melina Freitag
Department of Mathematical Sciences
University of Bath
Numerical Analysis Seminar, 8th February 2008
Outline
1 Introduction
2 Basic concepts
3 Variational Data Assimilation: Least squares estimation, Kalman Filter
4 Problems
5 Plan
What is Data Assimilation?
Loose definition
Estimation and prediction (analysis) of an unknown, true state by combining observations and system dynamics (model output).
Some examples
Navigation
Medical imaging
Numerical weather prediction
Data Assimilation in NWP
Estimate the state of the atmosphere x_i.
A priori information x_B
background state (usually the previous forecast)
Models
a model for how the atmosphere evolves in time (imperfect): x_{i+1} = M(x_i)
a function linking model space and observation space (imperfect): y_i = H(x_i)
Observations y
Satellites
Ships and buoys
Surface stations
Aeroplanes
Assimilation algorithms
used to find an (approximate) state of the atmosphere x_i at times i (usually i = 0)
using this state a forecast for future states of the atmosphere can be obtained
x_A: Analysis (estimation of the true state after the DA)
Outline
1 Introduction
2 Basic concepts
3 Variational Data Assimilation: Least squares estimation, Kalman Filter
4 Problems
5 Plan
Schematics of DA
Figure: Background state xB
Schematics of DA
Figure: Observations y
Schematics of DA
Figure: Analysis xA (consistent with observations and model dynamics)
Data Assimilation in NWP
Underdeterminacy
Size of the state vector x: 432 × 320 × 50 × 7 = O(10^7)
Number of observations (size of y): O(10^5 − 10^6)
Operator H (nonlinear!) maps from state space into observation space: y = H(x)
Some easy schemes
Cressman analysis
Figure: Copyright: ECMWF
At each time step i
x_A(k) = x_B(k) + ( Σ_{l=1}^{n} w_{lk} (y(l) − x_B(l)) ) / ( Σ_{l=1}^{n} w_{lk} )
w_{lk} = max( 0, (R^2 − d_{lk}^2) / (R^2 + d_{lk}^2) )
d_{lk} measures the distance between points l and k.
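A minimal numerical sketch of one Cressman correction, assuming a 1-D setting with scalar positions and a background already interpolated to the observation points; all names and numbers below are illustrative, not part of the talk:

import numpy as np

def cressman_step(x_b_grid, x_b_obs, y_obs, grid_pts, obs_pts, R):
    # One Cressman pass: correct each grid value k by a distance-weighted
    # average of the innovations y(l) - x_B(l) within the influence radius R.
    x_a = x_b_grid.copy()
    for k, gk in enumerate(grid_pts):
        d2 = (obs_pts - gk) ** 2                           # d_lk^2
        w = np.maximum(0.0, (R**2 - d2) / (R**2 + d2))     # Cressman weights
        if w.sum() > 0.0:
            x_a[k] += np.sum(w * (y_obs - x_b_obs)) / w.sum()
    return x_a

In practice the correction is often repeated with decreasing radii R (successive correction).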
Outline
1 Introduction
2 Basic concepts
3 Variational Data Assimilation: Least squares estimation, Kalman Filter
4 Problems
5 Plan
Data Assimilation in NWP
Estimate the state of the atmosphere x_i.
A priori information x_B
background state (usually the previous forecast) has errors!
Models
a model for how the atmosphere evolves in time (imperfect): x_{i+1} = M(x_i) + error
a function linking model space and observation space (imperfect): y_i = H(x_i) + error
Observations y have errors!
Satellites
Ships and buoys
Surface stations
Aeroplanes
Assimilation algorithms
used to find an (approximate) state of the atmosphere x_i at times i (usually i = 0)
using this state a forecast for future states of the atmosphere can be obtained
x_A: Analysis (estimation of the true state after the DA)
Error variables
Modelling the errors
background error ε_B = x_B − x_Truth, with mean ε̄_B and covariance B = E[(ε_B − ε̄_B)(ε_B − ε̄_B)^T]
observation error ε_O = y − H(x_Truth), with mean ε̄_O and covariance R = E[(ε_O − ε̄_O)(ε_O − ε̄_O)^T]
analysis error ε_A = x_A − x_Truth, with mean ε̄_A and covariance A = E[(ε_A − ε̄_A)(ε_A − ε̄_A)^T]
measure of the analysis error that we want to minimise: tr(A) = E‖ε_A − ε̄_A‖^2
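As a small aside, a sketch of how such an error covariance could be estimated from samples of the background error; the sample size, state dimension and synthetic samples are assumptions for illustration only:

import numpy as np

rng = np.random.default_rng(1)
eps_B = rng.standard_normal((500, 3))      # 500 sampled background errors, state dimension 3 (synthetic)
eps_mean = eps_B.mean(axis=0)
B = (eps_B - eps_mean).T @ (eps_B - eps_mean) / (len(eps_B) - 1)   # sample covariance estimate of B
# equivalently: B = np.cov(eps_B, rowvar=False)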
Assumptions
Linearised observation operator: H(x) − H(x_B) = H(x − x_B), where H on the right denotes the linearisation of H
Nontrivial errors: B, R are positive definite
Unbiased errors: E[x_B − x_Truth] = E[y − H(x_Truth)] = 0
Uncorrelated errors: E[(x_B − x_Truth)(y − H(x_Truth))^T] = 0
Optimal least-squares estimator
Cost function
Solution of the variational optimisation problem x_A = arg min J(x), where
J(x) = (x − x_B)^T B^{-1} (x − x_B) + (y − H(x))^T R^{-1} (y − H(x)) = J_B(x) + J_O(x)
Interpolation equations
x_A = x_B + K(y − H(x_B)), where
K = B H^T (H B H^T + R)^{-1} . . . gain matrix
Analysis error covariance
A = (I − KH) B (I − KH)^T + K R K^T
If K is the optimal least-squares gain:
A = (I − KH) B
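A minimal sketch of these interpolation equations for a toy linear problem (2-dimensional state, one observation); H is taken linear and all numbers are illustrative assumptions:

import numpy as np

B = np.array([[1.0, 0.5],
              [0.5, 1.0]])          # background error covariance
R = np.array([[0.25]])              # observation error covariance
H = np.array([[1.0, 0.0]])          # linear observation operator
x_b = np.array([1.0, 2.0])          # background state
y = np.array([1.4])                 # observation

K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain matrix
x_a = x_b + K @ (y - H @ x_b)                  # analysis
A = (np.eye(2) - K @ H) @ B                    # analysis error covariance for the optimal K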
Conditional probabilities
Non-Gaussian PDFs (probability density functions)
P(x) is the a priori PDF (background)
P(y|x) is the observation PDF (likelihood of the observations given background x)
P(x|y) is the conditional probability of the model state given the observations; Bayes' theorem:
arg max_x P(x|y) = arg max_x P(y|x) P(x) / P(y)
Gaussian PDFs
P(x|y) = c_1 exp( −(x − x_B)^T B^{-1} (x − x_B) ) · c_2 exp( −(y − H(x))^T R^{-1} (y − H(x)) )
x_A is the maximum a posteriori estimator of x_Truth. Maximising P(x|y) is equivalent to minimising J(x).
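To make the stated equivalence explicit (a short derivation, keeping the slide's convention of writing the Gaussian exponents without the usual factor 1/2): taking the negative logarithm of the posterior above gives

−log P(x|y) = (x − x_B)^T B^{-1} (x − x_B) + (y − H(x))^T R^{-1} (y − H(x)) − log(c_1 c_2) = J(x) + const,

so the x that maximises P(x|y) is exactly the x that minimises J(x).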
A simple scalar illustration
Room temperature
T_O observation with standard deviation σ_O
T_B background with standard deviation σ_B
T_A = T_B + k(T_O − T_B) with error variance σ_A^2 = (1 − k)^2 σ_B^2 + k^2 σ_O^2
optimal k which minimises the error variance: k = σ_B^2 / (σ_B^2 + σ_O^2)
equivalent to minimising J(T) = (T − T_B)^2 / σ_B^2 + (T − T_O)^2 / σ_O^2
and then 1/σ_A^2 = 1/σ_B^2 + 1/σ_O^2
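A quick numerical check of these scalar formulas, with illustrative values for the standard deviations and temperatures (not values from the talk):

import numpy as np

sigma_B, sigma_O = 2.0, 1.0          # assumed standard deviations
T_B, T_O = 19.0, 21.0                # assumed background and observed temperatures

k = sigma_B**2 / (sigma_B**2 + sigma_O**2)            # optimal weight
T_A = T_B + k * (T_O - T_B)                           # analysis temperature
var_A = (1 - k)**2 * sigma_B**2 + k**2 * sigma_O**2   # analysis error variance
assert np.isclose(1.0 / var_A, 1.0 / sigma_B**2 + 1.0 / sigma_O**2)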
Optimal interpolation
Optimal interpolation
Computation of
x_A = x_B + K(y − H(x_B))
K = B H^T (H B H^T + R)^{-1} . . . gain matrix
expensive!
Three-dimensional variational assimilation (3D-Var)
Minimisation of
J(x) = (x − x_B)^T B^{-1} (x − x_B) + (y − H(x))^T R^{-1} (y − H(x)) = J_B(x) + J_O(x)
avoids computation of K by using a descent algorithm
working in control variable space
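A minimal 3D-Var sketch for the same toy problem as above, avoiding the explicit gain matrix by handing J and its gradient to a generic quasi-Newton descent (scipy's L-BFGS-B here, an illustrative choice of descent algorithm, not the one used operationally):

import numpy as np
from scipy.optimize import minimize

B = np.array([[1.0, 0.5], [0.5, 1.0]])
R = np.array([[0.25]])
H = np.array([[1.0, 0.0]])            # linear observation operator (assumption)
x_b = np.array([1.0, 2.0])
y = np.array([1.4])
Binv, Rinv = np.linalg.inv(B), np.linalg.inv(R)

def J(x):
    db, do = x - x_b, y - H @ x
    return db @ Binv @ db + do @ Rinv @ do            # J_B(x) + J_O(x)

def gradJ(x):
    return 2.0 * Binv @ (x - x_b) - 2.0 * H.T @ Rinv @ (y - H @ x)

x_a = minimize(J, x_b, jac=gradJ, method="L-BFGS-B").x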
Four-dimensional variational assimilation (4D-Var)
Minimise the cost function
J(x_0) = (x_0 − x_B)^T B^{-1} (x_0 − x_B) + Σ_{i=0}^{n} (y_i − H_i(x_i))^T R_i^{-1} (y_i − H_i(x_i))
subject to the model dynamics x_i = M_{0→i} x_0
Figure: Copyright: ECMWF
4D-Var analysis
Model dynamics
Strong constraint: model states x_i are subject to x_i = M_{0→i} x_0
nonlinear constrained optimisation problem (hard!)
Simplifications
Causality (forecast expressed as product of intermediate forecast steps):
x_i = M_{i,i−1} M_{i−1,i−2} . . . M_{1,0} x_0
Tangent linear hypothesis (H and M can be linearised):
y_i − H_i(x_i) = y_i − H_i(M_{0→i} x_0) ≈ y_i − H_i(M_{0→i} x_{B,0}) − H_i M_{0→i} (x_0 − x_{B,0})
M is the tangent linear model.
This gives an unconstrained quadratic optimisation problem.
Minimisation of the 4D-Var cost function
Efficient implementation of J and ∇J:
forecast state x_i = M_{i,i−1} M_{i−1,i−2} . . . M_{1,0} x_0
normalised departures d_i = R_i^{-1} (y_i − H_i(x_i))
cost function J_{O,i} = (y_i − H_i(x_i))^T d_i
∇J_O is calculated by
−(1/2) ∇J_O = −(1/2) Σ_{i=0}^{n} ∇J_{O,i} = Σ_{i=0}^{n} M_{1,0}^T . . . M_{i,i−1}^T H_i^T d_i
            = H_0^T d_0 + M_{1,0}^T [ H_1^T d_1 + M_{2,1}^T [ H_2^T d_2 + . . . + M_{n,n−1}^T H_n^T d_n ] . . . ]
initialise the adjoint variable x̃_n = 0 and then x̃_{i−1} = M_{i,i−1}^T (x̃_i + H_i^T d_i), etc.; finally x̃_0 = −(1/2) ∇J_O (a small numerical sketch of this reverse sweep follows below)
Further simplifications
preconditioning with B = L L^T (transform into control variable space) so that x̂ = L^{-1} x
Incremental 4D-Var
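A hedged sketch of the forward/reverse (adjoint) computation of −(1/2)∇J_O for toy linear operators; the matrices M_i, H_i, R_i and the data are illustrative assumptions, and the i = 0 observation term is simply added after the reverse sweep:

import numpy as np

rng = np.random.default_rng(0)
n, dim = 3, 2
M = [None] + [np.eye(dim) + 0.01 * rng.standard_normal((dim, dim)) for _ in range(n)]  # M[i]: step i-1 -> i
Hs = [np.eye(dim) for _ in range(n + 1)]          # linear observation operators H_i
Rinv = [np.eye(dim) for _ in range(n + 1)]        # R_i^{-1}
x0 = rng.standard_normal(dim)
y = [rng.standard_normal(dim) for _ in range(n + 1)]

# forward sweep: trajectory x_i and normalised departures d_i = R_i^{-1}(y_i - H_i x_i)
x = [x0]
for i in range(1, n + 1):
    x.append(M[i] @ x[i - 1])
d = [Rinv[i] @ (y[i] - Hs[i] @ x[i]) for i in range(n + 1)]

# reverse sweep with the transposed (adjoint) operators
x_adj = np.zeros(dim)                              # adjoint variable, x~_n = 0
for i in range(n, 0, -1):
    x_adj = M[i].T @ (x_adj + Hs[i].T @ d[i])
half_grad_JO = x_adj + Hs[0].T @ d[0]              # equals -(1/2) * grad J_O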
Example - Three-Body Problem
Motion of three bodies in a plane, two position (q) and two momentum (p) coordinates for each body α = 1, 2, 3
Equations of motion
H(q, p) = (1/2) Σ_α |p_α|^2 / m_α − Σ_{α<β} m_α m_β / |q_α − q_β|
dq_α/dt = ∂H/∂p_α,   dp_α/dt = −∂H/∂q_α
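A small sketch of these equations of motion for the planar three-body system, stepped with symplectic Euler (the simplest partitioned Runge-Kutta scheme); the masses and the step implementation are illustrative assumptions, not the exact setup used in the experiments below:

import numpy as np

m = np.array([1.0, 1e-3, 1e-5])        # assumed masses for "sun", "planet", "moon"

def grad_q_H(q):
    # dH/dq_a = sum_{b != a} m_a m_b (q_a - q_b) / |q_a - q_b|^3, for q of shape (3, 2)
    g = np.zeros_like(q)
    for a in range(3):
        for b in range(3):
            if a != b:
                r = q[a] - q[b]
                g[a] += m[a] * m[b] * r / np.linalg.norm(r) ** 3
    return g

def step(q, p, h=0.001):
    # symplectic Euler: dp/dt = -dH/dq with the old q, then dq/dt = dH/dp = p/m with the new p
    p = p - h * grad_q_H(q)
    q = q + h * p / m[:, None]
    return q, p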
Example - Three-Body problem
solver: partitioned Runge-Kutta scheme with time step h = 0.001
observations are generated by adding noise to the truth trajectory
background is given from a perturbed initial condition
assimilation window is taken as 300 time steps
minimisation of the cost function J using a Gauss-Newton method (neglecting all second derivatives): solve ∇J(x_0) = 0 via
∇∇J(x_0^j) Δx_0^j = −∇J(x_0^j),   x_0^{j+1} = x_0^j + Δx_0^j
subsequent forecast is taken over 5000 time steps
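A minimal sketch of the Gauss-Newton iteration written out above, assuming callables grad_J(x0) and gn_hessian(x0) are available (the latter approximating ∇∇J with the second derivatives of the model neglected); both names are hypothetical placeholders:

import numpy as np

def gauss_newton(x0, grad_J, gn_hessian, n_iter=10):
    # iterate: solve gn_hessian(x0^j) dx = -grad_J(x0^j), then x0^{j+1} = x0^j + dx
    for _ in range(n_iter):
        dx = np.linalg.solve(gn_hessian(x0), -grad_J(x0))
        x0 = x0 + dx
    return x0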
Example - Three-Body problem
Figure: Truth trajectory
Example - Three-Body problem
Figure: Truth trajectory with observations and background (x position of the sun and x position of the moon vs. time step; legend: Truth, Observation, Background)
Example - Three-Body problem
Figure: Analysis (x position of the sun and x position of the moon vs. time step; legend: Truth, Observation, Background, Analysis (after EKF))
Example - Three-Body problem
Figure: RMS error (before and after assimilation, vs. time step)
Example - Three-Body problem
Figure: RMS error (before and after assimilation; second panel broken down by body: sun, planet, moon)
Example - Three-Body problem - Cycled 4D-Var
Figure: First cycle (RMS error before and after assimilation, vs. time step)
Example - Three-Body problem - Cycled 4D-Var
Figure: Second cycle (RMS error before and after assimilation, vs. time step)
Example - Three-Body problem - Cycled 4D-Var
Figure: Third cycle (RMS error before and after assimilation, vs. time step)
The Kalman Filter Algorithm
Sequential data assimilation; the background is provided by the forecast that starts from the previous analysis
covariance matrices B^F, B^A
forecast/model error: x^Truth_{i+1} = M_{i+1,i} x^Truth_i + η_i, where η_i ∼ N(0, Q_i), assumed to be uncorrelated with the analysis error of the previous forecast
State and error covariance forecast
State forecast x^F_{i+1} = M_{i+1,i} x^A_i
Error covariance forecast B^F_{i+1} = M_{i+1,i} B^A_i M_{i+1,i}^T + Q_i
State and error covariance analysis
Kalman gain K_i = B^F_i H_i^T (H_i B^F_i H_i^T + R_i)^{-1}
State analysis x^A_i = x^F_i + K_i (y_i − H_i x^F_i)
Error covariance of analysis B^A_i = (I − K_i H_i) B^F_i
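A minimal sketch of one forecast/analysis cycle of these equations, assuming linear operators M and H supplied as matrices; all names and shapes are illustrative:

import numpy as np

def kalman_filter_step(x_a, B_a, y, M, H, Q, R):
    # state and error covariance forecast
    x_f = M @ x_a
    B_f = M @ B_a @ M.T + Q
    # state and error covariance analysis
    K = B_f @ H.T @ np.linalg.inv(H @ B_f @ H.T + R)   # Kalman gain
    x_a_new = x_f + K @ (y - H @ x_f)
    B_a_new = (np.eye(len(x_a)) - K @ H) @ B_f
    return x_a_new, B_a_new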
Example - Three-Body Problem
solver: partitioned Runge-Kutta scheme with time step h = 0.001
observations are generated by adding noise to the truth trajectory
background is given from a perturbed initial condition
assimilation window is taken as 300 time steps
minimisation of the cost function J using a Gauss-Newton method (neglecting all second derivatives)
application of 4D-Var
Compare using B = I with using a flow-dependent matrix B which was generated by a Kalman Filter before the assimilation starts (see G. Inverarity (2007))
Example - Three-Body Problem
Figure: 4D-Var with B = I (RMS error before and after assimilation, vs. time step)
Figure: 4D-Var with B = P^A (RMS error before and after assimilation, vs. time step)
Outline
1 Introduction
2 Basic concepts
3 Variational Data Assimilation: Least squares estimation, Kalman Filter
4 Problems
5 Plan
Problems with Data Assimilation
DA is computationally very expensive; one assimilation cycle is much more expensive than the actual forecast
estimation and storage of the B-matrix is hard:
in operational DA, B is about 10^7 × 10^7
B should be flow-dependent, but in practice it is often static
B needs to be modelled and diagonalised, since B^{-1} is too expensive to compute ("control variable transform")
many assumptions are not valid:
errors are non-Gaussian, data have biases
the forward model operator M is not exact and also non-linear, and the system dynamics are chaotic
minimisation of the cost function needs a close initial guess and a small assimilation window
model error is not included
Outline
1 Introduction
2 Basic concepts
3 Variational Data Assimilation: Least squares estimation, Kalman Filter
4 Problems
5 Plan
Model error and perturbation error
Figure: One assimilation window (6 hours)
x^F_1 − x^Truth_1 = M_appr(x^A_0) − M(x^Truth_0)
                  = [ M_appr(x^A_0) − M_appr(x^Truth_0) ]   (perturbation error)
                  + [ M_appr(x^Truth_0) − M(x^Truth_0) ]    (model error)
Model error and perturbation error
Figure: Model error and perturbation error
x_j − x_C → perturbation error (after 6/12 hours)
x_j − x_A → model error
Model error
Figure: Perturbation error after 7 hours (Copyright: MetOffice)
Model error
Figure: Perturbation error after 12 hours (Copyright: MetOffice)
Model error
Figure: Model error after 12 hours (Copyright: MetOffice)
Met Office research and plans
design a simple chaotic model of reduced order (Lorenz model)
include several time scales (to model the atmosphere)
identify and analyse model error, and analyse the influence of this model error on the DA scheme
analyse the influence of the error made by the numerical approximation (part of the model error) on the error in the DA scheme
compare assimilation algorithms and optimisation strategies to reduce existing errors
improve the representation of multiscale behaviour in the atmosphere in existing DA methods
improve the forecast of small-scale features (like convective storms)