data assimilation - european space agency · 2013. 10. 23. · data assimilation alan o’neill...

Data Assimilation

Alan O’Neill

Data Assimilation Research Centre

University of Reading

Contents

• Motivation
• Univariate (scalar) data assimilation
• Multivariate (vector) data assimilation
  – Optimal Interpolation (BLUE)
  – 3d-Variational Method
  – Kalman Filter
  – 4d-Variational Method

• Applications of data assimilation in earth system science

Motivation

What is data assimilation?

Data assimilation is the technique whereby observational data are combined with output from a numerical model to produce an optimal estimate of the evolving state of the system.

DARC

Why We Need Data Assimilation

• range of observations
• range of techniques
• different errors
• data gaps
• quantities not measured
• quantities linked

Some Uses of Data Assimilation

• Operational weather and ocean forecasting

• Seasonal weather forecasting
• Land-surface processes
• Global climate datasets
• Planning satellite measurements
• Evaluation of models and observations


Preliminary Concepts

What We Want To Know

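$$\mathbf{X}(t) = (\mathbf{x}(t), \mathbf{s}(t), \mathbf{c})$$

where $\mathbf{x}(t)$ is the atmospheric state vector, $\mathbf{s}(t)$ the surface fluxes, and $\mathbf{c}$ the model parameters.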

What We Also Want To Know

Errors in models

Errors in observations

What observations to make

DATA ASSIMILATION SYSTEM

[Figure: schematic of a data assimilation system (DAS) connecting the numerical model, observations (O), a data cache, and error statistics; boxes are labelled A, B, and F.]

The Data Assimilation Process

[Figure: observations and forecasts, together with the errors in obs. and forecasts, pass through compare, reject, and adjust steps to give estimates of state and parameters.]

[Figure: state X plotted against time t, showing the model trajectory and the observations.]

Data Assimilation: an analogy

Driving with your eyes closed: open eyes every 10 seconds and correct trajectory


Basic Concept of Data Assimilation

• Information is accumulated in time into the model state and propagated to all variables.

What are the benefits of data assimilation?

• Quality control
• Combination of data
• Errors in data and in model
• Filling in data poor regions
• Designing observational systems
• Maintaining consistency
• Estimating unobserved quantities


Methods of Data Assimilation

• Optimal interpolation (or approx. to it)

• 3D variational method (3DVar)

• 4D variational method (4DVar)

• Kalman filter (with approximations)


Types of Data Assimilation

• Sequential
• Non-sequential
• Intermittent
• Continuous

Sequential Intermittent Assimilation

[Figure: timeline in which model forecast segments alternate with analysis steps; batches of obs feed each analysis.]

Sequential Continuous Assimilation

[Figure: timeline with a stream of obs assimilated one after another.]

Non-sequential Intermittent Assimilation

[Figure: timeline of successive "analysis + model" windows, each with its own obs.]

Non-sequential Continuous Assimilation

[Figure: a single "analysis + model" window with obs spread through it.]

Statistical Approach to Data Assimilation

Data Assimilation Made Simple (scalar case)

Least Squares Method (Minimum Variance)

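Two measurements of the true value $T_t$:

$$T_1 = T_t + \varepsilon_1, \qquad T_2 = T_t + \varepsilon_2$$

$$\langle \varepsilon_1 \rangle = \langle \varepsilon_2 \rangle = 0$$

$$\langle \varepsilon_1^2 \rangle = \sigma_1^2, \qquad \langle \varepsilon_2^2 \rangle = \sigma_2^2$$

$$\langle \varepsilon_1 \varepsilon_2 \rangle = 0, \text{ the two measurements are uncorrelated}$$

Estimate $T_a$ as a linear combination of the observations:

$$T_a = a_1 T_1 + a_2 T_2$$

The analysis should be unbiased:

$$\langle T_a \rangle = T_t \;\Rightarrow\; a_1 + a_2 = 1$$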

Least Squares Method Continued

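Estimate $T_a$ by minimizing its mean squared error:

$$\sigma_a^2 = \langle (T_a - T_t)^2 \rangle = \langle (a_1 (T_1 - T_t) + a_2 (T_2 - T_t))^2 \rangle = \langle (a_1 \varepsilon_1 + a_2 \varepsilon_2)^2 \rangle = a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2$$

subject to the constraint $a_1 + a_2 = 1$.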

Least Squares Method Continued

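$$a_1 = \frac{1/\sigma_1^2}{1/\sigma_1^2 + 1/\sigma_2^2}, \qquad a_2 = \frac{1/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2}$$

$$\Rightarrow\; \frac{1}{\sigma_a^2} = \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}$$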

The precision of the analysis is the sum of the precisions of the measurements. The analysis therefore has higher precision than any single measurement (if the statistics are correct).
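The precision statement can be checked numerically. The following is a minimal Python sketch, assuming two unbiased, uncorrelated measurements with known error variances; the function name `combine_measurements` and the sample values are illustrative and do not come from the slides.

```python
def combine_measurements(t1, var1, t2, var2):
    """Minimum-variance combination of two unbiased, uncorrelated measurements.

    Weights are proportional to the precisions (inverse variances), so they
    sum to one and the analysis stays unbiased.
    """
    precision = 1.0 / var1 + 1.0 / var2   # precision of the analysis
    a1 = (1.0 / var1) / precision
    a2 = (1.0 / var2) / precision
    t_a = a1 * t1 + a2 * t2
    var_a = 1.0 / precision
    return t_a, var_a, (a1, a2)


# Illustrative values: T1 = 20 with variance 1, T2 = 22 with variance 4.
t_a, var_a, (a1, a2) = combine_measurements(20.0, 1.0, 22.0, 4.0)
print(t_a, var_a, a1, a2)  # approximately 20.4, 0.8, 0.8, 0.2
```

The more accurate measurement receives the larger weight, and the analysis variance (0.8) is smaller than either measurement variance.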

Variational Approach

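$$J(T) = \frac{1}{2} \left[ \frac{(T - T_1)^2}{\sigma_1^2} + \frac{(T - T_2)^2}{\sigma_2^2} \right]$$

$T_a$ is the value of $T$ for which $\partial J / \partial T = 0$.

[Figure: cost function $J(T)$ plotted against $T$.]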
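As a hedged illustration of the variational approach, the sketch below minimizes the quadratic cost function J(T) = ½[(T − T1)²/σ1² + (T − T2)²/σ2²] with a single Newton step. The function names and sample values are assumptions made for this example; operational systems use iterative minimizers on far larger problems.

```python
def cost(t, t1, var1, t2, var2):
    """Cost function J(T): squared misfits weighted by inverse error variances."""
    return 0.5 * ((t - t1) ** 2 / var1 + (t - t2) ** 2 / var2)


def cost_gradient(t, t1, var1, t2, var2):
    """dJ/dT, which vanishes at the analysis value."""
    return (t - t1) / var1 + (t - t2) / var2


def minimize_cost(t1, var1, t2, var2, t_start=0.0):
    """J is quadratic in T, so one Newton step from any start reaches the minimum."""
    second_derivative = 1.0 / var1 + 1.0 / var2
    return t_start - cost_gradient(t_start, t1, var1, t2, var2) / second_derivative


# Same illustrative values as a least-squares example would use.
t_min = minimize_cost(20.0, 1.0, 22.0, 4.0)
print(t_min)  # approximately 20.4, the precision-weighted mean
```

The minimizer coincides with the precision-weighted mean of the two measurements, which is why the least squares and variational formulations give the same analysis in this scalar case.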

Maximum Likelihood Estimate

• Obtain or assume probability distributions for the errors

• The best estimate of the state is chosen to have the greatest probability, or maximum likelihood

• If errors are normally distributed, unbiased and uncorrelated, then states estimated by minimum variance and maximum likelihood are the same

Maximum Likelihood Approach (Bayesian Derivation)

Assume we have already made an observation, $T_1$ (the background forecast in data assimilation). Then the probability distribution of the truth for Gaussian error statistics is:

$$p(T) \propto \exp\left[ -\frac{(T - T_1)^2}{2 \sigma_1^2} \right]$$

This is the prior pdf.

Maximum Likelihood Continued

Bayes' formula for the posterior pdf given the observation $T_2$ is:

$$p(T \mid T_2) = \frac{p(T_2 \mid T)\, p(T)}{p(T_2)} \propto \exp\left[ -\frac{(T_2 - T)^2}{2 \sigma_2^2} \right] \exp\left[ -\frac{(T - T_1)^2}{2 \sigma_1^2} \right]$$

since $p(T_2)$ is a normalising factor independent of $T$.

Maximize the posterior pdf (or $\ln p$) to estimate the truth. We get the same answer as minimizing the cost function. Equivalence holds for the multi-dimensional case (for Gaussian statistics).
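The equivalence between maximizing the posterior and the minimum-variance estimate can be illustrated with a brute-force grid search. This is only a sketch under the Gaussian assumptions stated on the slide; the grid bounds, resolution, and sample values are arbitrary choices made for the example.

```python
def log_posterior(t, t1, var1, t2, var2):
    """Unnormalised ln p(T | T2) for Gaussian prior and observation errors."""
    log_prior = -((t - t1) ** 2) / (2.0 * var1)
    log_likelihood = -((t2 - t) ** 2) / (2.0 * var2)
    return log_prior + log_likelihood


def map_estimate(t1, var1, t2, var2, lo, hi, n=12001):
    """Grid search for the value of T that maximises the posterior pdf."""
    step = (hi - lo) / (n - 1)
    grid = [lo + k * step for k in range(n)]
    return max(grid, key=lambda t: log_posterior(t, t1, var1, t2, var2))


# Prior (background) T1 = 20 with variance 1; observation T2 = 22 with variance 4.
t_map = map_estimate(20.0, 1.0, 22.0, 4.0, lo=15.0, hi=27.0)

# Precision-weighted mean, i.e. the minimum-variance estimate.
t_min_variance = (20.0 / 1.0 + 22.0 / 4.0) / (1.0 / 1.0 + 1.0 / 4.0)
print(t_map, t_min_variance)  # both close to 20.4 (to within the grid spacing)
```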

Simple Sequential Assimilation

Let $T_b = T_1$ and $T_o = T_2$. Then

$$T_a = T_b + W (T_o - T_b)$$

where $(T_o - T_b)$ is the "innovation".

The optimal weight is given by:

$$W = \sigma_b^2 (\sigma_b^2 + \sigma_o^2)^{-1}$$

and the analysis error variance is:

$$\sigma_a^2 = (1 - W) \sigma_b^2$$

Comments

• The analysis is obtained by adding the innovation, multiplied by the optimal weight, to the first guess.

• Optimal weight is background error variance multiplied by inverse of total variance.

• Precision of analysis is sum of precisions of background and observation.

• Error variance of analysis is error variance of background multiplied by (1 − optimal weight).
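The comments above can be turned into a short scalar update. This is a minimal sketch assuming known background and observation error variances; `analysis_update` and the numbers used are illustrative names and values, not taken from the slides.

```python
def analysis_update(t_b, var_b, t_o, var_o):
    """Scalar analysis step: first guess plus weighted innovation."""
    w = var_b / (var_b + var_o)      # optimal weight
    innovation = t_o - t_b
    t_a = t_b + w * innovation
    var_a = (1.0 - w) * var_b        # analysis error variance
    return t_a, var_a, w


# Background 20 (variance 1), observation 22 (variance 4).
t_a, var_a, w = analysis_update(20.0, 1.0, 22.0, 4.0)
print(t_a, var_a, w)  # approximately 20.4, 0.8, 0.2
```

With an accurate background and a noisy observation the weight is small (0.2), so the analysis moves only one fifth of the way toward the observation.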

Simple Assimilation Cycle

• Observation used once and then discarded.

• Forecast phase to update $T_b$ and $\sigma_b^2$
• Analysis phase to update $T_a$ and $\sigma_a^2$
• Obtain background as

$$T_b(t_{i+1}) = M[T_a(t_i)]$$

• Obtain variance of background as

$$\sigma_b^2(t_{i+1}) = \sigma_b^2(t_i) \quad \text{or alternatively} \quad \sigma_b^2(t_{i+1}) = a\, \sigma_a^2(t_i)$$
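A simple assimilation cycle of this kind might be sketched as follows. The forecast model `model` (a relaxation toward 15.0), the inflation factor `growth` standing in for the constant a, and all numerical values are hypothetical choices made only so the example runs.

```python
def model(t):
    """Hypothetical forecast model M: relax the state toward 15.0 by 10% per step."""
    return t + 0.1 * (15.0 - t)


def assimilation_cycle(t_a, var_a, observations, var_o, growth=1.5):
    """Alternate forecast and analysis phases; each observation is used once."""
    history = []
    for t_o in observations:
        # Forecast phase: update background and its variance (var_b = a * var_a).
        t_b = model(t_a)
        var_b = growth * var_a
        # Analysis phase: update analysis and its variance.
        w = var_b / (var_b + var_o)
        t_a = t_b + w * (t_o - t_b)
        var_a = (1.0 - w) * var_b
        history.append((t_b, var_b, t_a, var_a))
    return history


observations = [18.5, 18.0, 17.2]
history = assimilation_cycle(20.0, 1.0, observations, var_o=0.5)
for t_b, var_b, t_a, var_a in history:
    print(f"background {t_b:.3f} (var {var_b:.3f}) -> analysis {t_a:.3f} (var {var_a:.3f})")
```

In the first cycle the background is 19.5 with variance 1.5, the weight is 0.75, and the analysis is 18.75 with variance 0.375.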

Simple Kalman Filter

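Analysis step as before.

$$T_t(t_{i+1}) = M[T_t(t_i)] - \varepsilon_m, \qquad \langle \varepsilon_m \rangle = 0 \text{ (model not biased!)}, \qquad \langle \varepsilon_m^2 \rangle = Q$$

Then

$$\varepsilon_{b,i+1} = T_b(t_{i+1}) - T_t(t_{i+1}) = M[T_a(t_i)] - M[T_t(t_i)] + \varepsilon_m = \mathbf{M} \varepsilon_{a,i} + \varepsilon_m, \quad \text{where } \mathbf{M} = \partial M / \partial T$$

Forecast background error covariance is:

$$\sigma_{b,i+1}^2 = \langle \varepsilon_{b,i+1}^2 \rangle = \mathbf{M}^2 \sigma_{a,i}^2 + Q$$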
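A scalar Kalman filter step can be sketched as below, assuming a hypothetical linear model M(T) = m·T + c, so that the tangent linear model ∂M/∂T is simply m, and assuming the model error is uncorrelated with the analysis error. The parameter values m, c and q are illustrative only.

```python
def kalman_step(t_a, var_a, t_o, var_o, m=0.9, c=1.5, q=0.1):
    """One forecast + analysis step of a scalar Kalman filter.

    m, c: coefficients of the hypothetical linear model M(T) = m*T + c
    q:    model error variance Q
    """
    # Forecast step: propagate the state and its error variance.
    t_b = m * t_a + c
    var_b = m ** 2 * var_a + q       # sigma_b^2 = M^2 sigma_a^2 + Q
    # Analysis step, as in the simple sequential scheme.
    w = var_b / (var_b + var_o)
    t_a_new = t_b + w * (t_o - t_b)
    var_a_new = (1.0 - w) * var_b
    return t_b, var_b, t_a_new, var_a_new


t_b, var_b, t_a_new, var_a_new = kalman_step(20.0, 1.0, t_o=19.0, var_o=0.5)
print(t_b, var_b, t_a_new, var_a_new)
```

Unlike the simple cycle with a fixed or inflated background variance, here the background variance is forecast from the analysis variance through the model itself.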

Multivariate Data Assimilation

Multivariate Case

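$$\text{state vector } \mathbf{x}(t) = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad \text{observation vector } \mathbf{y}(t) = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}$$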

State Vectors

• $\mathbf{x}$: state vector (column matrix)
• $\mathbf{x}_t$: true state
• $\mathbf{x}_b$: background state
• $\mathbf{x}_a$: analysis, estimate of $\mathbf{x}_t$

Ingredients of Good Estimate of the State Vector (“analysis”)

• Start from a good “first guess” (forecast from previous good analysis)
• Allow for errors in observations and first guess (give most weight to data you trust)
• Analysis should be smooth
• Analysis should respect known physical laws

Some Useful Matrix Properties

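Transpose of a product:

$$(\mathbf{A}\mathbf{B})^{\mathrm{T}} = \mathbf{B}^{\mathrm{T}} \mathbf{A}^{\mathrm{T}}$$

Inverse of a product:

$$(\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1} \mathbf{A}^{-1}$$

Inverse of a transpose:

$$(\mathbf{A}^{\mathrm{T}})^{-1} = (\mathbf{A}^{-1})^{\mathrm{T}}$$

Positive definiteness for a symmetric matrix $\mathbf{A}$:

$$\forall \mathbf{x}, \text{ the scalar } \mathbf{x}^{\mathrm{T}} \mathbf{A} \mathbf{x} > 0, \text{ unless } \mathbf{x} = \mathbf{0}$$

(this property is conserved through inversion).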
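These matrix properties can be spot-checked numerically with NumPy. The matrices below are arbitrary invertible examples chosen for illustration; a numerical check on one pair of matrices is of course not a proof.

```python
import numpy as np

# Two arbitrary invertible matrices (determinants 13 and 5).
A = np.array([[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 2.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]])

# Transpose of a product: (AB)^T = B^T A^T
transpose_ok = np.allclose((A @ B).T, B.T @ A.T)

# Inverse of a product: (AB)^-1 = B^-1 A^-1
inverse_ok = np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A))

# Inverse of a transpose: (A^T)^-1 = (A^-1)^T
inv_transpose_ok = np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)

# A A^T is symmetric positive definite when A is invertible; all eigenvalues
# of it and of its inverse should be strictly positive.
S = A @ A.T
eig_S = np.linalg.eigvalsh(S)
eig_S_inv = np.linalg.eigvalsh(np.linalg.inv(S))
positive_definite_ok = bool(np.all(eig_S > 0) and np.all(eig_S_inv > 0))

print(transpose_ok, inverse_ok, inv_transpose_ok, positive_definite_ok)
```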

Observations

• Observations are gathered into a vector $\mathbf{y}$, called the observation vector.

• There are usually fewer observations than variables in the model; they are irregularly spaced and may be of a different kind from those in the model.

• Introduce an observation operator $H$ to map from model state space to observation space:

$$\mathbf{x} \rightarrow H(\mathbf{x})$$
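One common and simple choice of observation operator is linear interpolation from model grid points to observation locations. The sketch below builds such an operator as a matrix for a one-dimensional grid; the function name `interpolation_operator`, the grid, and the values are hypothetical, and real observation operators may also involve changes of variable (for example, from temperature profiles to radiances).

```python
import numpy as np


def interpolation_operator(grid, obs_locations):
    """Matrix H mapping grid-point values to values at the observation locations
    by linear interpolation (locations are assumed to lie inside the grid)."""
    H = np.zeros((len(obs_locations), len(grid)))
    for row, loc in enumerate(obs_locations):
        # Index of the grid point at or to the left of the observation.
        j = int(np.searchsorted(grid, loc, side="right")) - 1
        j = min(max(j, 0), len(grid) - 2)
        weight = (loc - grid[j]) / (grid[j + 1] - grid[j])
        H[row, j] = 1.0 - weight
        H[row, j + 1] = weight
    return H


grid = np.array([0.0, 1.0, 2.0, 3.0, 4.0])       # model grid (n = 5)
x = np.array([10.0, 12.0, 11.0, 9.0, 8.0])       # model state on the grid
obs_locations = [0.5, 2.25]                      # m = 2 irregular locations

H = interpolation_operator(grid, obs_locations)
y_model = H @ x                                  # model equivalent of the observations
print(y_model)  # approximately [11.0, 10.5]
```

The innovation would then be the difference between the actual observations and `y_model`.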

Errors

Variance becomes Covariance Matrix

• Errors in $x_i$ are often correlated
  – spatial structure in flow
  – dynamical or chemical relationships
• Variance for scalar case becomes Covariance Matrix for vector case (COV)
• Diagonal elements are the variances of $x_i$
• Off-diagonal elements are covariances between $x_i$ and $x_j$
• Observation of $x_i$ affects estimate of $x_j$

The Error Covariance Matrix

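$$\boldsymbol{\varepsilon} = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}, \qquad \boldsymbol{\varepsilon}^{\mathrm{T}} = \begin{pmatrix} e_1 & e_2 & \cdots & e_n \end{pmatrix}$$

$$\mathbf{P} = \langle \boldsymbol{\varepsilon} \boldsymbol{\varepsilon}^{\mathrm{T}} \rangle = \begin{pmatrix} \langle e_1 e_1 \rangle & \langle e_1 e_2 \rangle & \cdots & \langle e_1 e_n \rangle \\ \langle e_2 e_1 \rangle & \langle e_2 e_2 \rangle & \cdots & \langle e_2 e_n \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle e_n e_1 \rangle & \langle e_n e_2 \rangle & \cdots & \langle e_n e_n \rangle \end{pmatrix}$$

$$\langle e_i e_i \rangle = \sigma_i^2$$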

Background Errors

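• They are the estimation errors of the background state: $\boldsymbol{\varepsilon}_b = \mathbf{x}_b - \mathbf{x}_t$
• average (bias): $\langle \boldsymbol{\varepsilon}_b \rangle$
• covariance: $\mathbf{B} = \langle (\boldsymbol{\varepsilon}_b - \langle \boldsymbol{\varepsilon}_b \rangle)(\boldsymbol{\varepsilon}_b - \langle \boldsymbol{\varepsilon}_b \rangle)^{\mathrm{T}} \rangle$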

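Observation Errors

• They contain errors in the observation process (instrumental error), errors in the design of $H$, and “representativeness errors”, i.e. discretization errors that prevent $\mathbf{x}_t$ from being a perfect representation of the true state: $\boldsymbol{\varepsilon}_o = \mathbf{y} - H(\mathbf{x}_t)$
• average (bias): $\langle \boldsymbol{\varepsilon}_o \rangle$
• covariance: $\mathbf{R} = \langle (\boldsymbol{\varepsilon}_o - \langle \boldsymbol{\varepsilon}_o \rangle)(\boldsymbol{\varepsilon}_o - \langle \boldsymbol{\varepsilon}_o \rangle)^{\mathrm{T}} \rangle$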

Control Variables

• We may not be able to solve the analysis problem for all components of the model state (e.g. cloud-related variables, or need to reduce resolution)

• The work space is then not the model space but the sub-space in which we correct $\mathbf{x}_b$, called control-variable space:

$$\mathbf{x}_a = \mathbf{x}_b + \delta\mathbf{x}$$

Innovations and Residuals

• Key to data assimilation is the use of differences between observations and the state vector of the system

• We call $\mathbf{y} - H(\mathbf{x}_b)$ the innovation

• We call $\mathbf{y} - H(\mathbf{x}_a)$ the analysis residual

Both give important information.

Analysis Errors

• They are the estimation errors of the analysis state that we want to minimize:

$$\boldsymbol{\varepsilon}_a = \mathbf{x}_a - \mathbf{x}_t$$

Covariance matrix $\mathbf{A}$

Using the Error Covariance Matrix

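Recall that an error covariance matrix $\mathbf{C}$ for the error in $\mathbf{x}$ has the form:

$$\mathbf{C} = \langle \boldsymbol{\varepsilon} \boldsymbol{\varepsilon}^{\mathrm{T}} \rangle$$

If $\mathbf{y} = \mathbf{H}\mathbf{x}$, where $\mathbf{H}$ is a matrix, then the error covariance for $\mathbf{y}$ is given by:

$$\mathbf{C}_y = \mathbf{H} \mathbf{C} \mathbf{H}^{\mathrm{T}}$$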
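The rule for transforming an error covariance, C_y = H C Hᵀ, can be illustrated with a tiny hand-made error sample. The four error vectors and the matrix H below are invented for the example; the sample is zero-mean, so the average of the outer products is its covariance.

```python
import numpy as np

# Four hypothetical error samples of a two-component state (zero mean by construction).
errors = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 0.0], [-2.0, 0.0]])

# C = <eps eps^T>, estimated as the average outer product over the samples.
C = errors.T @ errors / len(errors)        # [[2.5, 0.5], [0.5, 0.5]]

# y = H x with H summing the two components.
H = np.array([[1.0, 1.0]])
C_y = H @ C @ H.T                          # [[4.0]]

# Direct check: transform each error sample, then average the squares.
y_errors = errors @ H.T
C_y_direct = y_errors.T @ y_errors / len(y_errors)

print(C, C_y, C_y_direct)
```

The diagonal of `C` holds the variances of the two components and the off-diagonal element their covariance; both routes give the same variance for y.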

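BLUE Estimator

• The BLUE estimator is given by:

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{K}(\mathbf{y} - H(\mathbf{x}_b))$$

$$\mathbf{K} = \mathbf{B}\mathbf{H}^{\mathrm{T}} (\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1}$$

• The analysis error covariance matrix is:

$$\mathbf{A} = (\mathbf{I} - \mathbf{K}\mathbf{H}) \mathbf{B}$$

• Note that:

$$\mathbf{B}\mathbf{H}^{\mathrm{T}} (\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1} = (\mathbf{B}^{-1} + \mathbf{H}^{\mathrm{T}} \mathbf{R}^{-1} \mathbf{H})^{-1} \mathbf{H}^{\mathrm{T}} \mathbf{R}^{-1}$$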
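A small NumPy sketch of the BLUE analysis is given below, assuming a linear observation operator given as a matrix H. The function name `blue_analysis` and the two-variable example (only the first variable observed, background errors correlated) are illustrative assumptions.

```python
import numpy as np


def blue_analysis(x_b, B, y, H, R):
    """BLUE analysis for a linear observation operator H (a matrix)."""
    innovation = y - H @ x_b
    S = H @ B @ H.T + R
    # K = B H^T S^-1, computed via a linear solve (B and S are symmetric).
    K = np.linalg.solve(S, H @ B).T
    x_a = x_b + K @ innovation
    A = (np.eye(len(x_b)) - K @ H) @ B     # analysis error covariance
    return x_a, A, K


x_b = np.array([20.0, 10.0])               # background state
B = np.array([[1.0, 0.5], [0.5, 1.0]])     # correlated background errors
H = np.array([[1.0, 0.0]])                 # only the first variable is observed
y = np.array([22.0])
R = np.array([[4.0]])

x_a, A, K = blue_analysis(x_b, B, y, H, R)

# Alternative form of the gain: (B^-1 + H^T R^-1 H)^-1 H^T R^-1
R_inv = np.linalg.inv(R)
K_alt = np.linalg.inv(np.linalg.inv(B) + H.T @ R_inv @ H) @ H.T @ R_inv

print(x_a)  # approximately [20.4, 10.2]
print(A)    # approximately [[0.8, 0.4], [0.4, 0.95]]
```

Because the background errors of the two variables are correlated, the observation of the first variable also corrects the unobserved second variable and reduces its error variance from 1.0 to 0.95.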

Statistical Interpolation with Least Squares Estimation

• Called Best Linear Unbiased Estimator (BLUE).

• Simplified versions of this algorithm yield the most common algorithms used today in meteorology and oceanography.

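Assumptions Used in BLUE

• Linearized observation operator: $H(\mathbf{x}) - H(\mathbf{x}_b) = \mathbf{H}(\mathbf{x} - \mathbf{x}_b)$
• $\mathbf{B}$ and $\mathbf{R}$ are positive definite.
• Errors are unbiased: $\langle \mathbf{x}_b - \mathbf{x}_t \rangle = \langle \mathbf{y} - H(\mathbf{x}_t) \rangle = 0$
• Errors are uncorrelated: $\langle (\mathbf{x}_b - \mathbf{x}_t)(\mathbf{y} - H(\mathbf{x}_t))^{\mathrm{T}} \rangle = 0$
• Linear analysis: corrections to background depend linearly on (background – obs.).
• Optimal analysis: minimum variance estimate.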

Optimal Interpolation

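$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{K}(\mathbf{y} - H(\mathbf{x}_b))$$

where $\mathbf{x}_a$ is the “analysis”, $\mathbf{x}_b$ the “background” (forecast), $\mathbf{y}$ the observation, and $H$ the observation operator.

$$\mathbf{K} = \mathbf{B}\mathbf{H}^{\mathrm{T}} (\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1}$$

• linearity: $H \approx \mathbf{H}$
• matrix inverse
• limited area

[Figure: background field $\mathbf{x}_b$ with increments $\Delta = \mathbf{y} - H(\mathbf{x}_b)$ at obs. points, and a data void.]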
