Trading space for time in Bayesian framework



Nataliya Bulygina (1), Susana L. Almeida (1), Thorsten Wagener (2), Wouter Buytaert (1), Neil McIntyre (1)

(1) Department of Civil and Environmental Engineering, Imperial College London, UK ([email protected])

(2) Department of Civil Engineering, University of Bristol, Bristol, UK

CONCLUSIONS

• Due to model structural error, more information does not necessarily lead to better predictions.

• The prior plays an important role under data-scarce conditions and, depending on the model chosen, can bias predictions.

• When only a few pieces of information are available, a prior that is uniform in the relevant output space (rather than in the parameters) is recommended for Bayesian conditioning.

REFERENCE

Almeida, S., N. Bulygina, N. McIntyre, T. Wagener, and W. Buytaert, 2012. Predicting flows in ungauged catchments using correlated information sources, BHS National Symposium Proceedings, Dundee, UK.

Five behavioural indices are considered:

- Runoff Ratio (RR),

- Base Flow Index (BFI),

- High Pulse Count (HPC),

- Slope of Flow Duration Curve (SFDC),

- Streamflow Elasticity (SE).

The indices are related to climatic indices, catchment shape characteristics, soils, and vegetation (LAI) via a step-wise regression (Almeida et al., 2012). Probability distributions are fitted to the regression residuals, including the inter-index residual covariance, and are used to define likelihoods.
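The step from regression residuals to a likelihood can be sketched as follows. This is a minimal illustration, not the poster's implementation: it assumes the residuals are jointly Gaussian (the poster only says that probability distributions are fitted), and the function names `fit_residual_covariance` and `gaussian_log_likelihood` are hypothetical.

```python
import numpy as np


def fit_residual_covariance(residuals):
    """Estimate the mean vector and full covariance matrix of residuals.

    residuals: array of shape (n_catchments, n_indices), one column per
    behavioural index (e.g. RR, BFI, SE, SFDC, HPC).
    """
    residuals = np.asarray(residuals, dtype=float)
    mean = residuals.mean(axis=0)
    # rowvar=False: columns are variables, rows are observations.
    cov = np.cov(residuals, rowvar=False)
    return mean, cov


def gaussian_log_likelihood(modelled, regionalised, mean, cov):
    """Log-likelihood of modelled indices given regionalised indices.

    ASSUMPTION: the difference (modelled - regionalised) follows a
    multivariate normal with the fitted residual mean and covariance,
    so that inter-index error correlation is taken into account.
    """
    diff = np.asarray(modelled, float) - np.asarray(regionalised, float) - mean
    k = diff.size
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (k * np.log(2.0 * np.pi) + logdet + mahalanobis)
```

Using the full covariance matrix rather than only its diagonal is what lets correlated indices contribute less than independent ones would.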

Regionalised information, as contained in regionalised streamflow response indices, is used in a Bayesian framework to constrain hydrological models, thus trading space for time when dealing with non-stationarity.

In our approach, likelihoods are derived explicitly by taking account of the inter-index error covariance structure. The prior is shown to play a significant role: the assumption of uniform parameter priors is found to be unsuitable, and a transformation is required. US catchments from the MOPEX database are used to test the methodological development.

RESULTS: LIKELIHOOD

RESULTS: PRIOR

Susana Almeida is supported under FCT - Fundação para a Ciência e a Tecnologia, Portugal (grant SFRH/BD/65522/2009).

ABSTRACT

METHOD: LIKELIHOOD

[Figure: pairwise plots of the regression residuals of the five indices (RR res, BFI res, SE res, SFDC res, HPC res). Statistics and p-values annotated on the ten pairwise panels, in extraction order: -0.03 (p-val = 0.76); -0.12 (p-val = 0.28); -0.04 (p-val = 0.73); -0.14 (p-val = 0.18); 0.34 (p-val = 0.00); -0.25 (p-val = 0.02); -0.31 (p-val = 0.00); -0.19 (p-val = 0.07); 0.08 (p-val = 0.45); 0.66 (p-val = 0.00).]

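BAYES' LAW

$$p(\Theta \mid M, I^*) \propto L(I_{M,\Theta} \mid I^*)\, p_0(\Theta \mid M)$$

where $M$ is the model, $\Theta$ the model parameters, $I^*$ the regionalised indices, $I_{M,\Theta}$ the modelled indices, $L(I_{M,\Theta} \mid I^*)$ the model parameter likelihood given the regionalised indices, and $p_0(\Theta \mid M)$ the prior model parameter distribution.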
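For a finite sample of parameter sets, Bayes' law reduces to multiplying each sample's likelihood by its prior weight and normalising. The sketch below shows that step only; the function name `posterior_weights` is hypothetical and the numbers in the usage note are illustrative, not results from the poster.

```python
import numpy as np


def posterior_weights(log_likelihoods, prior_weights):
    """Combine per-sample log-likelihoods and prior weights.

    Returns normalised posterior weights, one per sampled parameter set.
    """
    log_l = np.asarray(log_likelihoods, dtype=float)
    prior = np.asarray(prior_weights, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability;
    # the constant cancels in the normalisation.
    unnormalised = np.exp(log_l - log_l.max()) * prior
    return unnormalised / unnormalised.sum()
```

With two equally weighted samples whose likelihoods are in the ratio 1:3, the posterior weights come out as 0.25 and 0.75.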

Distribution of α from QQ plots (the closer to α = 1, the better)

Model inability to represent a pair of indices: modelled distribution of indices vs. regionalised indices (red star)

On average, prediction quality stops improving after including information from a sub-set of indices (three, as shown here).

The model structure used appears incapable of capturing sub-sets of behavioural indices.

METHOD: PRIOR

A model transforms the common uniform-in-parameters prior into a non-uniform-in-output prior, so that the highest sampling density often falls in less relevant parts of the output space. In data-scarce problems (here, using regionalised indices), where the shape of the prior strongly influences the posterior, this leads to bias.

We propose to use a uniform-in-indices distribution as a prior. Due to numerical sampling difficulties, importance sampling can be used, so that a uniform-in-indices prior $p_0$ is approximated as

$$p_0(\Theta_i \mid M) \approx w_i\, g(\Theta_i); \quad \Theta_i \sim g(\cdot); \quad w_i \propto \frac{1}{G(I_{M,\Theta_i})}$$

where the model parameters $\Theta_i$ are drawn from a proposal distribution $g(\cdot)$, and the weights $w_i$ are inversely proportional to a behavioural index distribution $G(\cdot)$, which is the transformation of the proposal distribution $g(\cdot)$ by the model $M$.
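The weighting described here can be sketched for a single index as follows. This is a minimal illustration under stated assumptions: the index density G is estimated with a simple histogram (the poster does not say how G is estimated), only one index is treated, and the function name `uniform_in_index_weights` is hypothetical.

```python
import numpy as np


def uniform_in_index_weights(indices, bins=20):
    """Importance weights that make a sample roughly uniform in one index.

    indices: modelled index values, one per parameter set drawn from the
    proposal distribution g(.).
    ASSUMPTION: G(.) is approximated by equal-width histogram bin counts.
    """
    indices = np.asarray(indices, dtype=float)
    edges = np.linspace(indices.min(), indices.max(), bins + 1)
    # Bin number of each sample; clip so the maximum lands in the last bin.
    which = np.clip(np.digitize(indices, edges) - 1, 0, bins - 1)
    counts = np.bincount(which, minlength=bins)
    # w_i proportional to 1 / G(I_i): samples in crowded bins count less.
    weights = 1.0 / counts[which]
    return weights / weights.sum()


# Toy demonstration with a hypothetical one-parameter model, index = theta**3:
# a uniform-in-parameter sample piles up near index = 0.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 1.0, size=5000)
toy_index = theta ** 3
toy_weights = uniform_in_index_weights(toy_index, bins=10)
```

After weighting, every occupied bin of the index carries the same total weight, which is the sense in which the prior becomes uniform in index space. For several indices the same idea needs a joint density estimate, which this sketch does not attempt.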

A uniform-in-index prior leads to more consistent predictions of the behavioural indices than a uniform-in-parameters prior.

Reliability of flow time series predictions is generally better when a uniform-in-index prior is used (orange box plots for α) rather than a uniform-in-parameters prior.

[Figure panels: INDICES; FLOW TIME SERIES. Indices shown: RR, BFI, SE, SFDC, HPC.]

Distribution across 84 catchments of α from QQ plots (the closer to α = 1, the better).

Closeness of line to the diagonal reflects consistency of estimated posterior with observed indices.

Uniform-in-parameters prior, likelihood, and resulting posterior pdfs in behavioural index space
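The QQ-plot check referred to in the results can be sketched as follows. This is a hedged illustration: the poster does not define α, so the summary used here (mean absolute distance of the QQ line from the diagonal) is a stand-in chosen for the example, and all function names are hypothetical.

```python
import numpy as np


def posterior_cdf_at(observed, samples, weights):
    """Weighted posterior probability that the index is <= observed."""
    samples = np.asarray(samples, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights[samples <= observed]) / np.sum(weights))


def qq_points(p_values):
    """Return (theoretical, empirical) quantiles for a predictive QQ plot.

    p_values: one posterior CDF value per catchment, evaluated at the
    observed index. If the posteriors are consistent with observations,
    these values are uniform on [0, 1] and the points lie on the diagonal.
    """
    empirical = np.sort(np.asarray(p_values, dtype=float))
    n = empirical.size
    theoretical = (np.arange(1, n + 1) - 0.5) / n
    return theoretical, empirical


def mean_distance_from_diagonal(p_values):
    """Illustrative summary (NOT the poster's alpha): 0 means on the diagonal."""
    theoretical, empirical = qq_points(p_values)
    return float(np.mean(np.abs(empirical - theoretical)))
```

In use, `posterior_cdf_at` would be called once per catchment with that catchment's weighted posterior sample, and the resulting values passed to `qq_points`.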