Ensemble Forecasting: Calibration, Verification, and use in Applications Tom Hopson

Upload: walter

Post on 25-Jan-2016


Page 1: Ensemble Forecasting: Calibration, Verification, and use in Applications

Ensemble Forecasting: Calibration, Verification, and use in Applications

Tom Hopson

Page 2: Ensemble Forecasting: Calibration, Verification, and use in Applications

Outline
I. Motivation for ensemble forecasting and post-processing
   a) Introduce Quantile Regression (QR; Koenker and Bassett, 1978) post-processing procedure
II. Ensemble forecast verification
III. Thorpex-Tigge data set
IV. Ensemble forecast examples:
   a) Southwestern African flooding
   b) African meningitis
   c) US Army test range weather forecasting
   d) Bangladesh flood forecasting

Page 3: Ensemble Forecasting: Calibration, Verification, and use in Applications

Goals of an Ensemble Prediction System (EPS)

• Predict the observed distribution of events and atmospheric states

• Predict uncertainty in the day’s prediction

• Predict the extreme events that are possible on a particular day

• Provide a range of possible scenarios for a particular forecast

Page 4: Ensemble Forecasting: Calibration, Verification, and use in Applications

1. Greater accuracy of ensemble mean forecast (half the error variance of single forecast)
2. Likelihood of extremes
3. Non-Gaussian forecast PDFs
4. Ensemble spread as a representation of forecast uncertainty
=> All rely on forecasts being calibrated

Further …
-- Argue calibration essential for tailoring to local application: NWP provides spatially- and temporally-averaged gridded forecast output
-- Applying gridded forecasts to point locations requires location-specific calibration to account for local spatial and temporal scales of variability (=> increasing ensemble dispersion)

More technically …

Page 5: Ensemble Forecasting: Calibration, Verification, and use in Applications

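Note:

Take home message:

For a “calibrated ensemble”, error variance of the ensemble mean is 1/2 the error variance of any ensemble member (on average), independent of the distribution being sampled

[Figure: forecast PDF with the observation (obs) marked; x-axis: Discharge, y-axis: Probability]

Compare (angle brackets with subscript i denote the ensemble average):

\[ \langle (f_i - o)^2 \rangle_i \quad \text{versus} \quad (\bar{f} - o)^2, \qquad \bar{f} = \langle f_i \rangle_i \]

Simplifying:

\[ \text{eq1}: \quad \langle f_i^2 \rangle_i - 2 o \bar{f} + o^2 \]
\[ \text{eq2}: \quad \bar{f}^2 - 2 o \bar{f} + o^2 \]

Setting o → f_j and averaging over j (for a calibrated ensemble the observation is statistically indistinguishable from an ensemble member):

\[ \text{eq1}: \quad 2 \left( \langle f^2 \rangle - \bar{f}^2 \right) \]
\[ \text{eq2}: \quad \langle f^2 \rangle - \bar{f}^2 \]
\[ \Rightarrow \ \text{eq1} = 2\, \text{eq2} \]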
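The factor of two on this slide can be checked numerically. The sketch below is an illustration only, not part of the talk: it assumes a synthetic "calibrated" ensemble in which the members and the observation are independent draws from the same (deliberately non-Gaussian) exponential distribution. The function name and all parameter values are hypothetical. For a finite ensemble of N members the expected ratio is (1 + 1/N)/2, which approaches 1/2 as N grows.

```python
import random

def error_variance_ratio(n_cases=5000, n_members=50, seed=0):
    """Return (ensemble-mean error variance) / (single-member error variance)."""
    rng = random.Random(seed)
    mean_sq_err = 0.0    # accumulates (f_bar - o)^2 over all cases
    member_sq_err = 0.0  # accumulates the member-average of (f_i - o)^2
    for _ in range(n_cases):
        # Calibrated by construction: members and observation share one distribution.
        members = [rng.expovariate(1.0) for _ in range(n_members)]
        obs = rng.expovariate(1.0)
        f_bar = sum(members) / n_members
        mean_sq_err += (f_bar - obs) ** 2
        member_sq_err += sum((f - obs) ** 2 for f in members) / n_members
    return mean_sq_err / member_sq_err

ratio = error_variance_ratio()
print(f"ensemble-mean / member error variance: {ratio:.3f}")
```

Swapping `expovariate` for another sampler (for example `rng.gauss`) should leave the ratio close to one half, which is the "independent of the distribution being sampled" part of the claim.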

Page 6: Ensemble Forecasting: Calibration, Verification, and use in Applications

Forecast “calibration” or “post-processing”

[Figure: two panels showing the forecast PDF and the observation (obs) before and after calibration; x-axis: Flow rate [m3/s], y-axis: Probability; annotations mark the “bias” and the “spread” or “dispersion”]

Post-processing has corrected:
• the “on average” bias
• as well as under-representation of the 2nd moment of the empirical forecast PDF (i.e. corrected its “dispersion” or “spread”)

Our approach:
• under-utilized “quantile regression” approach
• probability distribution function “means what it says”
• daily variation in the ensemble dispersion directly relates to changes in forecast skill => informative ensemble skill-spread relationship

Page 23: Ensemble Forecasting: Calibration, Verification, and use in Applications

Rank Histograms – measuring the reliability of an ensemble forecast

• You cannot verify an ensemble forecast with a single observation.

• The more data you have for verification, the more certain you are (as is true in general for other statistical measures).

• Rare events (low probability) require more data to verify => as do systems with many ensemble members.

From Barb Brown
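A rank histogram is built by counting, over many forecast cases, where the observation falls within the sorted ensemble. The following is a minimal sketch with made-up numbers; the function name and data are hypothetical, and ties between members and observation are simply ignored here.

```python
def rank_histogram(ensembles, observations):
    """Count how often the observation falls in each rank of the ensemble.

    ensembles: list of forecast cases, each a list of member values
    observations: one verifying observation per case
    Returns n_members + 1 counts (rank 0 = below all members).
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for f in members if f < obs)  # members below the observation
        counts[rank] += 1
    return counts

ensembles = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
observations = [0.5, 2.5, 10.0]
counts = rank_histogram(ensembles, observations)
print(counts)  # [1, 0, 1, 1]
```

With only three cases the histogram says nothing about reliability, which is the point of the bullets above: many cases are needed before a flat or non-flat shape can be trusted.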

Page 24: Ensemble Forecasting: Calibration, Verification, and use in Applications

From Tom Hamill

Page 25: Ensemble Forecasting: Calibration, Verification, and use in Applications

Troubled Rank Histograms

Slide from Matt Pocernic

[Figure: two rank histograms; x-axis: Ensemble # (1 to 10), y-axis: Counts (0 to 30)]

Page 26: Ensemble Forecasting: Calibration, Verification, and use in Applications

From Tom Hamill

Page 27: Ensemble Forecasting: Calibration, Verification, and use in Applications

From Tom Hamill

Page 28: Ensemble Forecasting: Calibration, Verification, and use in Applications

From Tom Hamill

Page 29: Ensemble Forecasting: Calibration, Verification, and use in Applications

From Tom Hamill

Page 30: Ensemble Forecasting: Calibration, Verification, and use in Applications

From Tom Hamill

Page 31: Ensemble Forecasting: Calibration, Verification, and use in Applications

Example of Quantile Regression (QR)

Our application

Fitting T quantiles using QR conditioned on:

1) Ranked forecast ens

2) ensemble mean

3) ensemble median

4) ensemble stdev

5) Persistence

R package: quantreg
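The slide uses the R package quantreg. As a language-neutral illustration of what a quantile-regression fit does, the sketch below fits a single quantile line against one regressor (a hypothetical ensemble mean) in plain Python. It is not the talk's implementation: it relies on the fact that the quantile-regression cost has a minimizer passing through two data points, and simply tries every pair, which is only practical for small samples. All names and data are made up.

```python
import random

def pinball_loss(residual, tau):
    """Quantile-regression cost for one residual (observation minus fitted value)."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def fit_quantile_line(x, y, tau):
    """Fit y ~ a + b*x for quantile tau by exhaustive search over point pairs.

    At least one minimizer of the summed pinball loss interpolates two data
    points, so checking every pair with distinct x finds an optimal line.
    Cost is O(n^3): for illustration on small samples only.
    """
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue  # vertical line, not a valid candidate
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = sum(pinball_loss(yk - (a + b * xk), tau) for xk, yk in zip(x, y))
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best[1], best[2]

rng = random.Random(1)
ens_mean = [280.0 + 0.5 * k for k in range(40)]         # hypothetical ensemble means [K]
observed = [m + rng.gauss(0.0, 2.0) for m in ens_mean]  # synthetic observations [K]
a90, b90 = fit_quantile_line(ens_mean, observed, 0.9)
below = sum(1 for m, o in zip(ens_mean, observed) if o <= a90 + b90 * m + 1e-9)
print(f"fraction of observations at or below the 0.9 quantile line: {below / 40:.2f}")
```

In practice a linear-programming solver (as in quantreg) handles several regressors at once; the brute-force search is used here only to keep the example dependency-free.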

Page 32: Ensemble Forecasting: Calibration, Verification, and use in Applications

[Figure: time series of forecasts and observed temperature; x-axis: Time, y-axis: T [K]]

Regressor set:
1. reforecast ens
2. ens mean
3. ens stdev
4. persistence
5. LR quantile (not shown)

Step 1: Determine climatological quantiles

[Figure: climatological PDF; x-axis: Temperature [K], y-axis: Probability/°K]

Step 2: For each quantile, use “forward step-wise cross-validation” to iteratively select best subset
Selection requirements:
a) QR cost function minimum
b) Satisfy binomial distribution at 95% confidence
If requirements not met, retain climatological “prior”

Step 3: Segregate forecasts into differing ranges of ensemble dispersion and refit models (Step 2) uniquely for each range

[Figure: forecast time series divided into dispersion ranges I, II, III, II, I; x-axis: Time, y-axis: T [K]]

Final result: “sharper” posterior PDF represented by interpolated quantiles

[Figure: forecast PDF, prior and posterior; x-axis: Temperature [K], y-axis: Probability/°K]
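The binomial requirement in Step 2 is not spelled out on the slide. One plausible reading, shown below as an assumption rather than the authors' method, is that the number of observations falling below a forecast quantile of level tau should be consistent with a Binomial(n, tau) distribution; if the count falls outside the central 95% interval, the candidate model is rejected and the climatological prior is kept. Function names and data are hypothetical.

```python
from math import comb

def binomial_bounds(n, tau, confidence=0.95):
    """Central interval for the count of n observations below a true tau-quantile."""
    alpha = 1.0 - confidence
    pmf = [comb(n, k) * tau**k * (1.0 - tau) ** (n - k) for k in range(n + 1)]
    cdf, total = [], 0.0
    for p in pmf:
        total += p
        cdf.append(total)
    # Smallest counts at which the cumulative probability crosses each tail.
    lower = next(k for k in range(n + 1) if cdf[k] > alpha / 2)
    upper = next((k for k in range(n + 1) if cdf[k] >= 1.0 - alpha / 2), n)
    return lower, upper

def passes_binomial_check(observations, quantile_forecasts, tau, confidence=0.95):
    """True if the count of observations below the forecast quantile is plausible."""
    n_below = sum(1 for o, q in zip(observations, quantile_forecasts) if o < q)
    lower, upper = binomial_bounds(len(observations), tau, confidence)
    return lower <= n_below <= upper

lower, upper = binomial_bounds(100, 0.5)
obs = [float(k) for k in range(100)]                   # made-up observations 0..99
good = passes_binomial_check(obs, [50.0] * 100, 0.5)   # 50 of 100 below the median forecast
bad = passes_binomial_check(obs, [90.0] * 100, 0.5)    # 90 of 100 below: too many
print(lower, upper, good, bad)
```

A check of this kind guards against a model that minimizes the QR cost in training but whose quantiles do not "mean what they say" on independent data.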

Page 33: Ensemble Forecasting: Calibration, Verification, and use in Applications

Rank Probability Score
for multi-categorical or continuous variables

\[ \mathrm{RPS} = \frac{1}{n-1} \sum_{i=1}^{n} \left( \mathrm{CDF}_{fc,i} - \mathrm{CDF}_{obs,i} \right)^2 \]
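As a worked instance of the Rank Probability Score for n ordered categories, the sketch below accumulates the forecast and observed CDFs category by category and averages the squared differences with the 1/(n-1) normalization. The function name and the three-category example are hypothetical.

```python
def rank_probability_score(forecast_probs, observed_category):
    """RPS for one forecast over n ordered categories.

    forecast_probs: probability assigned to each category (sums to 1)
    observed_category: index of the category that occurred
    """
    n = len(forecast_probs)
    cdf_fc = 0.0
    score = 0.0
    for i, p in enumerate(forecast_probs):
        cdf_fc += p                                       # cumulative forecast probability
        cdf_obs = 1.0 if i >= observed_category else 0.0  # step-function observed CDF
        score += (cdf_fc - cdf_obs) ** 2
    return score / (n - 1)

# Forecast CDF = [0.2, 0.7, 1.0], observed CDF = [0, 1, 1]
rps = rank_probability_score([0.2, 0.5, 0.3], observed_category=1)
print(f"RPS = {rps:.3f}")  # (0.04 + 0.09 + 0) / 2, about 0.065
```

A forecast that puts all probability in the observed category scores 0; the score grows as probability is placed in categories further from the one observed.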

Page 34: Ensemble Forecasting: Calibration, Verification, and use in Applications

Scatter-plot and Contingency Table

Does the forecast detect correctly temperatures above 18 degrees ?

Slide from Barbara Casati

Brier Score

\[ \mathrm{BS} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - o_i \right)^2 \]

y = forecast probability of event occurrence
o = observed occurrence (0 or 1)
i = sample # of total n samples

=> Note similarity to MSE
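The Brier score can be sketched for the threshold question on this slide (temperatures above 18 degrees). The example assumes the forecast probability is taken as the fraction of ensemble members exceeding the threshold, which is one common choice and not necessarily the one used in the talk; all values are made up.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members above the threshold."""
    return sum(1 for f in members if f > threshold) / len(members)

def brier_score(forecast_probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    n = len(forecast_probs)
    return sum((y - o) ** 2 for y, o in zip(forecast_probs, outcomes)) / n

threshold = 18.0
ensembles = [[17.0, 19.0, 20.0, 21.0], [15.0, 16.0, 17.0, 19.0]]  # hypothetical temperatures
observed = [19.5, 16.0]

probs = [exceedance_probability(m, threshold) for m in ensembles]  # [0.75, 0.25]
outcomes = [1 if o > threshold else 0 for o in observed]           # [1, 0]
bs = brier_score(probs, outcomes)
print(bs)  # 0.0625
```

Because the outcome is 0 or 1, this is the mean squared error of the probability forecast, which is the similarity to MSE noted on the slide.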

Page 35: Ensemble Forecasting: Calibration, Verification, and use in Applications

Other post-processing approaches …

1) Bayesian Model Averaging (BMA) – Raftery et al (1997)

2) Analogue approaches – Hopson and Webster, J. Hydromet (2010)

3) Kalman Filter with analogues – Delle Monache et al (2010)

4) Quantile regression – Hopson and Hacker, MWR (under review)

5) quantile-to-quantile (quantile matching) approach – Hopson and Webster, J. Hydromet (2010)

… many others

Page 36: Ensemble Forecasting: Calibration, Verification, and use in Applications

Quantile Matching: another approach when matched forecast-observation pairs are not available => useful for climate change studies

ECMWF 51-member Ensemble Precipitation Forecasts compared to observations

2004 Brahmaputra Catchment-averaged Forecasts
- black line: satellite observations
- colored lines: ensemble forecasts
- Basic structure of catchment rainfall similar for both forecasts and observations
- But large relative over-bias in forecasts

Page 37: Ensemble Forecasting: Calibration, Verification, and use in Applications

Forecast Bias Adjustment - done independently for each forecast grid
(bias-correct the whole PDF, not just the median)

[Figure: Model Climatology CDF and “Observed” Climatology CDF; x-axis: Quantile (25th, 50th, 75th, 100th), y-axis: Precipitation (up to Pmax); labels: Pfcst on the model CDF, Padj on the observed CDF]

In practical terms …

[Figure: ranked forecasts and ranked observations, each on a Precipitation axis from 0 to 1m]

Hopson and Webster (2010)
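The quantile-matching adjustment can be sketched as two lookups: find the quantile of the forecast value in the model climatology, then read off the value at that same quantile in the observed climatology. The code below is a simplified illustration with linear interpolation between ranked values; the function names and climatologies are hypothetical and this is not the published implementation.

```python
import bisect

def cdf_position(sorted_vals, x):
    """Quantile (0..1) of x within a ranked sample, with linear interpolation."""
    n = len(sorted_vals)
    if x <= sorted_vals[0]:
        return 0.0
    if x >= sorted_vals[-1]:
        return 1.0
    hi = bisect.bisect_right(sorted_vals, x)  # first index whose value exceeds x
    lo = hi - 1
    frac = (x - sorted_vals[lo]) / (sorted_vals[hi] - sorted_vals[lo])
    return (lo + frac) / (n - 1)

def quantile_value(sorted_vals, p):
    """Value at quantile p (0..1) of a ranked sample, with linear interpolation."""
    n = len(sorted_vals)
    pos = p * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def quantile_match(forecast, model_climatology, obs_climatology):
    """Map a forecast value to the observed value at the same climatological quantile."""
    p = cdf_position(sorted(model_climatology), forecast)
    return quantile_value(sorted(obs_climatology), p)

model_clim = [0.0, 2.0, 4.0, 6.0, 8.0]  # hypothetical model precipitation (over-biased)
obs_clim = [0.0, 1.0, 2.0, 3.0, 4.0]    # hypothetical observed precipitation
print(quantile_match(4.0, model_clim, obs_clim))  # 2.0
print(quantile_match(5.0, model_clim, obs_clim))  # 2.5
```

Because only the two climatologies are compared, no matched forecast-observation pairs are needed. Each ensemble member would be adjusted separately, so the whole forecast PDF is corrected rather than just its median.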

Page 38: Ensemble Forecasting: Calibration, Verification, and use in Applications

Bias-corrected Precipitation Forecasts

Brahmaputra Corrected Forecasts

[Figure: two panels, Original Forecast and Corrected Forecast]

=> Now observed precipitation within the “ensemble bundle”

Page 39: Ensemble Forecasting: Calibration, Verification, and use in Applications

Outline
I. Motivation for ensemble forecasting and post-processing
   a) Introduce Quantile Regression (QR; Koenker and Bassett, 1978) post-processing procedure
II. Ensemble forecast verification
III. Thorpex-Tigge data set
IV. Ensemble forecast examples:
   a) Southwestern African flooding
   b) African meningitis
   c) US Army test range weather forecasting
   d) Bangladesh flood forecasting

Page 40: Ensemble Forecasting: Calibration, Verification, and use in Applications

• TIGGE, the THORPEX Interactive Grand Global Ensemble

• component of the World Weather Research Programme

• TIGGE archive consists of ensemble forecast data from ten global NWP centers

• designed to accelerate the improvements in the accuracy of 1-day to 2 week high-impact weather forecasts for the benefit of humanity.

• starting from October 2006

• available for scientific research

• near-real time forecasts (some centers delayed)

THORPEX Interactive Grand Global Ensemble

Page 41: Ensemble Forecasting: Calibration, Verification, and use in Applications

Archive Status and Monitoring, Data Receipt

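[Diagram: data receipt from the current data providers at the archive centres. Labels: Archive Centre; Current Data Provider; NCAR, NCEP, CMC, UKMO, ECMWF, MeteoFrance, JMA, KMA, CMA, BoM, CPTEC, NCDC. Transfer methods: IDD/LDM, HTTP, FTP]

Unidata IDD/LDM: Internet Data Distribution / Local Data Manager
Commodity internet application to send and receive data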

Page 42: Ensemble Forecasting: Calibration, Verification, and use in Applications

Archive Status and Monitoring, Variability between providers

[Figure: Spatial Resolution; Model Resolution (N200, N128, 0.56x0.56, 1.00x1.00, 1.25x0.83, 1.25x1.25, 1.50x1.50) against Number of Data Providers (0 to 4)]

[Figure: Conforming parameters and Ensemble Members; # fields and # ensemble members (0 to 80) for each provider]

[Figure: Forecast Length, Initialization; Forecast Length (Days) and Forecasts/day (0 to 18) for each provider]

Providers shown: ECMWF, UKMO, JMA, NCEP, CMA, CMC, BOM, MF, KMA, CPTEC

Page 43: Ensemble Forecasting: Calibration, Verification, and use in Applications

Archive Status and Monitoring, Archive Completeness

PL = Pressure Level, PT = 320K θ Level, PV = ± 2 Potential Vorticity Level, SL = Single/Surface Level

Variable LvL ECWF UKMO JMA NCEP CMA CMC BOM MetF KMA CPTC

Geopotential Z PL

Specific H PL

T PL

U-velocity PL

V-velocity PL

Potential Vor PT

Potential T PV

U-velocity PV

V-Velocity PV

U 10m SL

V 10m SL

CAPE SL

Conv. Inhib. SL

Land-sea SL

Mean SLP SL

Orog. SL

Skin T SL

Snow D. H20 SL

Snow F. H20 SL

Page 44: Ensemble Forecasting: Calibration, Verification, and use in Applications

Archive Status and Monitoring, Archive Completeness

Variable LvL ECWF UKMO JMA NCEP CMA CMC BOM MetF KMA CPTC

Soil Moist. SL

Soil T SL

Sunshine D. SL

Surf. DPT SL

Surf. ATmax SL

Surf. ATmin SL

Surf. AT SL

Surf. P SL

LW Rad. Out SL

LH flux SL

Net Rad SL

Net Therm. Rad SL

Sensible Rad. SL

Cloud Cov SL

Column Water SL

Precipitation SL

Wilt. Point SL

Field Cap. SL

PL = Pressure Level, PT = 320K θ Level, PV = ± 2 Potential Vorticity Level, SL = Single/Surface Level

Page 45: Ensemble Forecasting: Calibration, Verification, and use in Applications

Outline
I. Motivation for ensemble forecasting and post-processing
   a) Introduce Quantile Regression (QR; Koenker and Bassett, 1978) post-processing procedure
II. Ensemble forecast verification
III. Thorpex-Tigge data set
IV. Ensemble forecast examples:
   a) Southwestern African flooding
   b) African meningitis
   c) US Army test range weather forecasting
   d) Bangladesh flood forecasting

Page 46: Ensemble Forecasting: Calibration, Verification, and use in Applications

Early May 2011, floods in southwestern Africa

Page 47: Ensemble Forecasting: Calibration, Verification, and use in Applications

Early May 2011, floods in southwestern Africa-- examine ens forecasts … ECMWF 24hr precip

Page 48: Ensemble Forecasting: Calibration, Verification, and use in Applications

Early May 2011, floods in southwestern Africa-- examine ens forecasts … NCEP GEFS 24hr precip

Page 49: Ensemble Forecasting: Calibration, Verification, and use in Applications

Early May 2011, floods in southwestern Africa-- examine ens forecasts … ECMWF 5-day precip

Page 50: Ensemble Forecasting: Calibration, Verification, and use in Applications

Early May 2011, floods in southwestern Africa-- examine ens forecasts … NCEP GEFS 5day precip

Page 51: Ensemble Forecasting: Calibration, Verification, and use in Applications

Early May 2011, floods in southwestern Africa-- examine ens forecasts … NCEP GEFS 5day precip

Page 52: Ensemble Forecasting: Calibration, Verification, and use in Applications

A Cautionary Warning about using Probabilistic Precipitation Forecasts in Hydrologic Modeling

(Importance of Maintaining Spatial and Temporal Covariances for Hydrologic Forecasting => one option: “Schaake Shuffle”)

[Diagram: river catchment A with sub-catchments subB and subC; discharges QA, QB, QC; three cases: ensemble1, ensemble2, ensemble3]

• Scenario for smallest possible QA? No.
• Scenario for largest possible QA? No.
• Scenario for average QA?
• QA same for all 3 possible ensembles
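The slide names the Schaake Shuffle without describing it. The sketch below shows the general idea as commonly described, under stated assumptions: at each location, the calibrated ensemble values are reordered so that their rank order across members follows that of a template of historical observations, which restores a realistic spatial (or temporal) covariance between locations. The function name, array layout and numbers are hypothetical.

```python
def schaake_shuffle(ensemble, template):
    """Reorder each location's members to follow the template's rank order.

    ensemble[loc][m]: calibrated forecast value of member m at location loc
    template[loc][m]: historical observation m at location loc (same shape);
                      column m holds one historical date across all locations
    """
    shuffled = []
    for members, history in zip(ensemble, template):
        sorted_members = sorted(members)
        # order[k] is the template column holding the k-th smallest historical value
        order = sorted(range(len(history)), key=lambda m: history[m])
        row = [0.0] * len(members)
        for rank, m in enumerate(order):
            row[m] = sorted_members[rank]
        shuffled.append(row)
    return shuffled

# Two sub-catchments, three members; the template dates rank low, high, middle in both.
ensemble = [[3.0, 1.0, 2.0], [10.0, 30.0, 20.0]]
template = [[5.0, 7.0, 6.0], [50.0, 70.0, 60.0]]
print(schaake_shuffle(ensemble, template))  # [[1.0, 3.0, 2.0], [10.0, 30.0, 20.0]]
```

After the shuffle, member 0 is the driest scenario in both sub-catchments and member 1 the wettest, so summing sub-catchment flows per member gives a meaningful spread in QA instead of the same total for every member.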

Page 53: Ensemble Forecasting: Calibration, Verification, and use in Applications

Dugway Proving Ground

Page 54: Ensemble Forecasting: Calibration, Verification, and use in Applications

Dugway Proving Grounds, Utah e.g. T Thresholds

• Includes random and systematic differences between members.

• Not an actual chance of exceedance unless calibrated.

Page 55: Ensemble Forecasting: Calibration, Verification, and use in Applications

Challenges in probabilistic mesoscale prediction

• Model formulation
  • Bias (marginal and conditional)
  • Lack of variability caused by truncation and approximation
  • Non-universality of closure and forcing

• Initial conditions
  • Small-scales are damped in analysis systems, and the model must develop them
  • Perturbation methods designed for medium-range systems may not be appropriate

• Lateral boundary conditions
  • After short time periods the lateral boundary conditions can dominate
  • Representing uncertainty in lateral boundary conditions is critical

• Lower boundary conditions
  • Dominate boundary-layer response
  • Difficult to estimate uncertainty in lower boundary conditions

Page 56: Ensemble Forecasting: Calibration, Verification, and use in Applications

RTFDDA and Ensemble-RTFDDA

Liu et al. 2010 AMS Annual Meeting, 14th IOAS-AOLS, Atlanta, GA. January 18 – 23

Page 57: Ensemble Forecasting: Calibration, Verification, and use in Applications

National Security Applications Program Research Applications Laboratory

3-hr dewpoint time series
[Figure panels: Before Calibration, After Calibration]

Station DPG S01

Page 58: Ensemble Forecasting: Calibration, Verification, and use in Applications

42-hr dewpoint time series
[Figure panels: Before Calibration, After Calibration]

Station DPG S01

Page 59: Ensemble Forecasting: Calibration, Verification, and use in Applications

PDFs: raw vs. calibrated

[Figure: raw and calibrated forecast PDFs with the observation (obs) marked]
Blue is “raw” ensemble
Black is calibrated ensemble
Red is the observed value

Notice: significant change in both “bias” and dispersion of final PDF
(also notice PDF asymmetries)

Page 60: Ensemble Forecasting: Calibration, Verification, and use in Applications


3-hr dewpoint rank histograms
Station DPG S01

Page 61: Ensemble Forecasting: Calibration, Verification, and use in Applications


Station DPG S01

42-hr dewpoint rank histograms

Page 62: Ensemble Forecasting: Calibration, Verification, and use in Applications

Utilizing Verification measures near-real-time …

Measures Used:
1) Rank histogram (converted to scalar measure)
2) Root Mean square error (RMSE)
3) Brier score
4) Rank Probability Score (RPS)
5) Relative Operating Characteristic (ROC) curve
6) New measure of ensemble skill-spread utility

=> Using these for automated calibration model selection by using a weighted sum of the skill scores of each

Page 63: Ensemble Forecasting: Calibration, Verification, and use in Applications

Skill Scores

• Single value to summarize performance.
• Reference forecast - best naive guess; persistence, climatology
• A perfect forecast implies that the object can be perfectly observed
• Positively oriented – Positive is good

\[ SS = \frac{A_{forc} - A_{ref}}{A_{perf} - A_{ref}} \]
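A small worked example of the skill-score formula, using RMSE as the accuracy measure A (for which the perfect value is 0). The function name and the numbers are hypothetical.

```python
def skill_score(a_forecast, a_reference, a_perfect):
    """Generic skill score: 1 = perfect, 0 = no better than the reference, < 0 = worse."""
    return (a_forecast - a_reference) / (a_perfect - a_reference)

# Hypothetical RMSE values: calibrated forecast 1.5, persistence reference 2.0, perfect 0.0
ss = skill_score(1.5, 2.0, 0.0)
print(ss)  # 0.25
```

The same function works for positively and negatively oriented measures, since the orientation is carried by the perfect and reference values; the resulting score is always positively oriented.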

Page 64: Ensemble Forecasting: Calibration, Verification, and use in Applications


Skill Score Verification

[Figure panels: RMSE Skill Score, CRPS Skill Score]

Reference Forecasts:
Black -- raw ensemble
Blue -- persistence

Page 65: Ensemble Forecasting: Calibration, Verification, and use in Applications

Thank You!