Advanced data assimilation methods-EKF and EnKF
Alghero, Lecture 4
Hong Li and Eugenia Kalnay
University of Maryland
Optimal Interpolation for a scalar
Ta= T
b+ w(T
0! T
b)
w =!b
2
!b
2+!
o
2
Tb= T
t+ !
b
Information available in OI
background or forecast !b= 0; !
b!b= "
b
2
To= T
t+ !
oNew observations !
o= 0; !
o!o= "
o
2
analysis=forecast+optimal weight * observational increment:
Optimal weight
!a
2= (!
b
"2+!
o
"2)"1= (1" w)!
b
2 Analysis error
Recall the basic formulation in OI
• OI for a Scalar:
Optimal weight to minimize the analysis error is:
• OI for a Vector:
• B is statistically pre-estimated, and constant with time in its practicalimplementations. Is this a good approximation?
Ta= T
b+ w(T
0! T
b)
w =!b
2
!b
2+!
o
2
B = !xb•!x
b
T!x
b= x
b" x
t
True atmospheret=0 t=6ha
i
f
i M1!
= xx
xi
a= x
i
b+W(y
i
o! Hx
i
b)
BWHIP ][ !=a
1][!
+= RHBHBHWTT
OI and Kalman Filter for a scalar
• OI for a scalar:
• Kalman Filter for a scalar:
• Now the background error variance is forecasted using the lineartangent model L and its adjoint LT
Ta= T
b+ w(T
0! T
b)
w =!b
2
!b
2+!
o
2; !
a
2= (1" w)!
b
2
Tb(ti+1) = M (T
a(ti); assume !
b
2= (1+ a)!
a
2=
1
1" w!
a
2= const.
Tb(ti+1) = M (T
a(ti)); !
b
2(t) = (L!
a)(L!
a)T; L = dM / dT
Ta= T
b+ w(T
0! T
b)
w =!b
2
!b
2+!
o
2; !
a
2= (1" w)!
b
2
“Errors of the day” computed with the Lorenz 3-variablemodel: compare with rms (constant) error
Not only the amplitude, but also the structure of B is constant in 3D-Var/OI
This is important because analysis increments occur only within the subspacespanned by B
!z
2= (z
b" z
t)2
=0.08
“Errors of the day” obtained in the NCEP reanalysis(figs 5.6.1 and 5.6.2 in the book)
Note that the mean error went down from 10m to 8m because ofimproved observing system, but the “errors of the day” (on a synoptic timescale) are still large.
In 3D-Var/OI not only the amplitude, but also the structure of B is fixedwith time
Flow independent error covariance
In OI(or 3D-Var), the scalar error correlation between two points in thesame horizontal surface is assumed homogeneous and isotropic. (p162 inthe book)
!!!!!
"
#
$$$$$
%
&
=
2
,2,1
,2
2
22,1
,12,1
2
1
...covcov
.........
covcov
cov...cov
nnn
n
n
B
'
'
'
If we observe only Washington, D.C,we can get estimate for Richmond andPhiladelphia corrections through the
error correlation (off-diagonal term in B).
Typical 3D-Var error covariance
In OI(or 3D-Var), the error correlation between two mass points in thesame horizontal surface is assumed homogeneous and isotropic.(p162in the book)
!!!!!
"
#
$$$$$
%
&
=
2
,2,1
,2
2
22,1
,12,1
2
1
...covcov
.........
covcov
cov...cov
nnn
n
n
B
'
'
'
Background ~106-8 d.o.f.
Suppose we have a 6hr forecast (background) and new observationsSuppose we have a 6hr forecast (background) and new observations
The 3D-Var Analysis doesn’t know about the errors of the day
Observations ~105-7 d.o.f.
BR
Background ~106-8 d.o.f.
Errors of the day: they lieon a low-dim attractor
With Ensemble Kalman Filtering we get perturbations pointingWith Ensemble Kalman Filtering we get perturbations pointingto the directions of the to the directions of the ““errors of the dayerrors of the day””
3D-Var Analysis: doesn’t know about the errors of the day
Observations ~105-7 d.o.f.
Background ~106-8 d.o.f.
Errors of the day: they lieon a low-dim attractor
Ensemble Kalman Filter Analysis:correction computed in the low dimensemble space
Ensemble Kalman Filtering is efficient because Ensemble Kalman Filtering is efficient because matrix operations are performed in the low-dimensional matrix operations are performed in the low-dimensional
space of the ensemble perturbationsspace of the ensemble perturbations
3D-Var Analysis: doesn’t know about the errors of the day
Observations ~105-7 d.o.f.
Background ~106-8 d.o.f.
Errors of the day: they lieon the low-dim attractor
Observations ~105-7 d.o.f.
After the EnKF computes the analysis and the analysis error covarianceAfter the EnKF computes the analysis and the analysis error covarianceAA, the new ensemble initial perturbations are computed:, the new ensemble initial perturbations are computed:1
1
k
T
i i
i
! !+
=
=" a a A
i!a
These perturbations represent the analysis error covariance and areused as initial perturbations for thenext ensemble forecast
Flow-dependence – a simple example (Miyoshi,2004)
This is not appropriateThis does reflects the flow-dependence.
There is a cold front inour area…
What happens in thiscase?
Extended Kalman Filter (EKF)
• Forecast step
• Analysis step
• Using the flow-dependent , analysis is expected to be improvedsignificantly
However, it is computational hugely expensive. , n*n matrix, n~107
computing equation directly is impossible
Pi
b
xi
b= Mx
i!1
a
xia= xi
b+Ki (yi
o!Hxi
f)
Pi
a
n!n= [I "K
iH]P
i
b= [(P
i
b)"1+H
TR
"1H]
"1
Ki= P
i
bH
T[HP
i
bH
T+ R]
!1
iL
*Pi
b
Pi
b= L
i!1Pi!1
aLi!1
T+Q
* *
!i
b= L
i"1!i"1
a+ !
i"1
m
*
Ensemble Kalman Filter (EnKF)
Although the dimension of is huge, therank ( ) << n (dominated by the errors of theday)
Using ensemble method to estimate
*
Pib!1
m(xk
f
k=1
m
" # xt)(xk
f# x
t)T
!"mIdeally
K ensemble members, K<<n
f
iP
Pib!
1
K "1(xk
f
k=1
K
# " xf)(xk
f" x
f)T
=1
K "1X
b•X
bT
Problem left: How to update ensemble ?i.e.: How to get for each ensemblemember?
f
iPPi
b= L
i!1Pi!1
aLi!1
T+Q
a
ix
*
“errors of day” are the instabilities ofthe background flow. Strong instabilitieshave a few dominant shapes(perturbations lie in a low-dimensionalsubspace).
It makes sense to assume that largeerrors are in similarly low-dimensionalspaces that can be represented by alow order EnKF.
Physically,
Ensemble Update: two approaches1. Perturbed Observations method:An “ensemble of data assimilations”
It has been proven that an observationalensemble is required (e.g., Burgers et al.1998). Otherwise
is not satisfied.
Random perturbations are added to theobservations to obtain observations foreach independent cycle
However, perturbing observationsintroduces a source of sampling errors(Whitaker and Hamill, 2002).
xi
a
(k )= x
i
b
(k )+K
i(y
i
o
(k )! Hx
i
b
(k ))
Pi
a
n!n= [I "K
iH]P
i
b
xi
b
(k )= Mx
i!1
a
(k )
Ki= P
i
bH
T[HP
i
bH
T+ R]
!1
Pi
b!
1
K "1(x
k
b
k=1
K
# " xb)(x
k
b" x
b)T
noiseo
ik
o
i+= yy
)(
Ensemble Update: two approaches
2. Ensemble square root filter(EnSRF) Observations are assimilated to
update only the ensemble mean.
Assume analysis ensembleperturbations can be formed bytransforming the forecast ensembleperturbations through a transformmatrix
xi
a= x
i
b+K
i(y
i
o! Hx
i
b)
1
k !1X
aX
aT
= Pi
a
n"n= [I !K
iH]P
i
b= [I !K
iH]
1
k !1X
bX
bT
xi
b= Mx
i!1
a
Ki= P
i
bH
T[HP
i
bH
T+ R]
!1
Pi
b!
1
K "1(x
k
b
k=1
K
# " xb)(x
k
b" x
b)T
Xi
a= T
iX
i
b
xi
a= x
i
a+ X
a
xi
a= x
i
b+K
i(y
i
o! Hx
i
b)
Xi
a= T
iX
i
b
Several choices of the transform matrix
EnSRF, Andrews 1968, Whitaker and Hamill, 2002)
EAKF (Anderson 2001)
ETKF (Bishop et al. 2001)
LETKF (Hunt, 2005)Based on ETKF but perform analysissimultaneously in a local volumesurrounding each grid point
X
a= (I ! !KH)X
b, !K = "K
Xa= X
bT
Xa= AX
b
one observation at a time
An Example of the analysis corrections from3D-Var (Kalnay, 2004)
An Example of the analysis corrections fromEnKF (Kalnay, 2004)
Summary steps of LETKF
Ensemble analysis
Observations
GCM
Obs.operator
Ensembleforecast
Ensemble forecastat obs. locations
LETKF
1) Global 6 hr ensemble forecaststarting from the analysis ensemble
2) Choose the observations used foreach grid point3) Compute the matrices of forecastperturbations in ensemble space Xb
4) Compute the matrices of forecastperturbations in observation space Yb
5) Compute Pb in ensemble spacespace and its symmetric square root
6) Compute wa, the k vector ofperturbation weights
7) Compute the local grid pointanalysis and analysis perturbations.
8) Gather the new global analysisensemble. Go to 1
LETKF algorithm summary, Hunt 2005Forecast step (done globally): advance the ensemble 6 hours, global model size n
xn
b(i )= M (x
n!1
a(i )) i = 1,...,k
Analysis step (done locally). Local model dimension m, locally used obs s
Xb= X ! x
bY
b= H (X) ! H (x
b) " HX
b
in model space (mxm)
Ensemble analysis perturbations in model space (mxk)
in ensemble space (kxk)
Analysis increments in ensemble space (kx1)
xn
a= x
n
a+ X
bw
n
a Analysis (mean) in model space (mx1)
We finally gather all the analysis and analysis perturbations from each grid pointand construct the new global analysis ensemble (nxk) and go to next forecast step
xn
a= x
n
a+ X
a Analysis ensemble in model space (mxk)
(mxk) (sxk)
!Pa= (k !1)I + HX
b( )T
R!1HX
b"#
$%
!1
babTaaTaXPXXXP
~==
2/1)~( abaPXX =
)(~ 1 b
n
o
n
bTaa
nyyRYPw !=
!
References posted: www.atmos.umd.edu/~ekalnayGoogle: “chaos weather umd”, publications
Ott, Hunt, Szunyogh, Zimin, Kostelich, Corazza, Kalnay, Patil, Yorke, 2004: LocalEnsemble Kalman Filtering, Tellus, 56A,415–428.
Kalnay, 2003: Atmospheric modeling, data assimilation and predictability,Cambridge University Press, 341 pp. (3rd printing)
Hunt, Kalnay, Kostelich, Ott, Szunyogh, Patil, Yorke, Zimin, 2004: Four-dimensional ensemble Kalman filtering. Tellus 56A, 273–277.
Szunyogh, Kostelich, Gyarmati, Hunt, Ott, Zimin, Kalnay, Patil, Yorke, 2005:Assessing a local ensemble Kalman filter: Perfect model experiments with theNCEP global model. Tellus, 57A. 528-545.
Hunt, Kostelich and Szunyogh 2007: Efficient Data Assimilation forSpatiotemporal Chaos: a Local Ensemble Transform Kalman Filter.Physica D.
Whitaker and Hamill, 2002: Ensemble Data Assimilation Without PerturbedObservations. Monthly Weather Review, 130, 1913-1924.
EnKF vs 4D-Var
EnKF is simple and modelindependent, while 4D-Var requires thedevelopment and maintenance of theadjoint model (model dependent)
4D-Var can assimilate asynchronousobservations, while EnKF assimilateobservations at the synoptic time.
Using the weights wa at any time 4D-LETKF can assimilate asynchronousobservations and move them forward orbackward to the analysis time
Disadvantage of EnKF:
Low dimensionality of the ensemble inEnKF introduces sampling errors in theestimation of . ‘Covariancelocalization’ can solve this problem.
Pb
Ensemble analysis
Observations
GCM
Obs.operator
Ensembleforecast
Ensemble forecastat obs. locations
LETKF