Development of a verification methods testbed at the WRF DTC
Mike Baldwin, Purdue University


Page 1: Development of a verification methods testbed at the WRF DTC

Development of a verification methods testbed at the WRF DTC

Mike Baldwin

Purdue University

Page 2: Development of a verification methods testbed at the WRF DTC

Acknowledgements

• WRF Developmental Testbed Center
– visiting scientist program

• Beth Ebert

• Barbara Casati

• Ian Jolliffe

• Barb Brown

• Eric Gilleland

Page 3: Development of a verification methods testbed at the WRF DTC

Motivation for new verification methods

• Great need within both the research and operational NWP communities for new verification methods

• High-resolution forecasts containing realistic detail/structure

• Ensembles/probabilistic forecasts

Page 8: Development of a verification methods testbed at the WRF DTC

[Figures: an OBSERVED precipitation field shown alongside FCST #1 (smooth) and FCST #2 (detailed)]

Page 9: Development of a verification methods testbed at the WRF DTC

Traditional verification measures for these forecasts

Verification measure      Smooth forecast   Detailed forecast
Mean absolute error       0.157             0.159
RMS error                 0.254             0.309
Bias                      0.98              0.98
Threat score (>0.45)      0.214             0.161
Equitable threat score    0.170             0.102
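These measures can be computed directly from the gridded fields. A minimal sketch in Python/NumPy, assuming 2-D forecast and observed precipitation arrays and the 0.45 event threshold from the table (the slide does not state its bias convention, so the ratio of field means is assumed here):

```python
import numpy as np

def traditional_scores(fcst, obs, threshold=0.45):
    """Traditional gridpoint measures for one forecast/observation pair."""
    mae = np.mean(np.abs(fcst - obs))            # mean absolute error
    rmse = np.sqrt(np.mean((fcst - obs) ** 2))   # RMS error
    bias = fcst.mean() / obs.mean()              # assumed convention: ratio of means

    # 2x2 contingency table for the event "value > threshold"
    f_event, o_event = fcst > threshold, obs > threshold
    hits = np.sum(f_event & o_event)
    false_alarms = np.sum(f_event & ~o_event)
    misses = np.sum(~f_event & o_event)

    ts = hits / (hits + misses + false_alarms)   # threat score
    # equitable threat score: discount hits expected by chance
    chance = (hits + misses) * (hits + false_alarms) / fcst.size
    ets = (hits - chance) / (hits + misses + false_alarms - chance)
    return mae, rmse, bias, ts, ets
```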

Page 10: Development of a verification methods testbed at the WRF DTC

Traditional performance measures

• Often fail to provide meaningful information when applied to realistic forecasts

• Many of the unfavorable aspects of traditional measures are well-known

• Yet such measures continue to be used extensively

Page 11: Development of a verification methods testbed at the WRF DTC

S1 score (500mb heights)

Page 12: Development of a verification methods testbed at the WRF DTC

anomaly correlation (500mb height)

Page 13: Development of a verification methods testbed at the WRF DTC

Threat score (QPF)

Page 14: Development of a verification methods testbed at the WRF DTC

Sensitivity to bias and event frequency

Page 15: Development of a verification methods testbed at the WRF DTC

Why?

• History

• Continuity

• Familiarity

• Understandable

• Comfort level

• A certain degree of credibility has been established after forecast performance has been measured over several decades

Page 16: Development of a verification methods testbed at the WRF DTC

New methods

• Plenty of new verification methods have been proposed
– Features-based
– Morphing
– Scale decomposition
– Fuzzy/neighborhood

• Why haven’t they caught on?

Page 17: Development of a verification methods testbed at the WRF DTC

Why haven’t they caught on?

• Usability has not been demonstrated

• No history

• Difficult to interpret results

• Credibility has not yet been established

Page 18: Development of a verification methods testbed at the WRF DTC

Fuzzy verification framework

[Figure: summary of fuzzy verification methods, indicating good performance vs. poor performance]

from Beth Ebert (2008)

Page 19: Development of a verification methods testbed at the WRF DTC

Weaknesses and limitations

• Less intuitive than object-based methods

• Imperfect scores for perfect forecasts for methods that match neighborhood forecasts to single observations

• Information overload if all methods invoked at once
– Let appropriate decision model(s) guide the choice of method(s)

• Even for a single method…
– there are lots of numbers to look at
– evaluation of scales and intensities with best performance depends on the metric used (CSI, ETS, HK, etc.). Be sure the metric addresses the question of interest!

from Beth Ebert (2008)
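For illustration, here is a minimal sketch of one neighborhood ("fuzzy") score, the fractions skill score; the choice of this particular score is mine, not the slide's. Event fractions are computed over n x n neighborhoods and the resulting fraction fields are compared, so a detailed forecast is no longer penalized for small displacement errors:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fractions_skill_score(fcst, obs, threshold, n):
    """Fractions skill score over n x n neighborhoods.
    1 = perfect agreement of event fractions, 0 = no skill."""
    f_frac = uniform_filter((fcst > threshold).astype(float), size=n)
    o_frac = uniform_filter((obs > threshold).astype(float), size=n)
    mse = np.mean((f_frac - o_frac) ** 2)
    mse_ref = np.mean(f_frac ** 2) + np.mean(o_frac ** 2)  # largest possible MSE
    return 1.0 - mse / mse_ref
```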

Page 20: Development of a verification methods testbed at the WRF DTC

Typical path to acceptance and adoption of new verif methods

• Develop a new technique

• Test it on a small number of cases

• Publish those results and methodology

• Apply the technique to forecasts on a routine basis

• Build up a collection of results

• Compare the new and traditional methods

• ACCEPT: when users become satisfied with the behavior of the new method

Page 21: Development of a verification methods testbed at the WRF DTC

Propose a testbed for verification methods

• Provide access to a database of operational and experimental forecasts

• Covering a period of several years

• Compare new and traditional measures

• Collaborate with users of verification information

• This will help to speed up the process of establishing credibility and eventual use

Page 22: Development of a verification methods testbed at the WRF DTC

Long-period database of forecasts

• NCEP Operational:
– GFS
– NAM
– model grid spacing
– QPF (3-h and 24-h accumulations)
– truth: Stage IV analyses
– CONUS region
– 0000 and 1200 UTC initial times
– archive period: 1999-present
– additional fields (temperature, heights) may be added

Page 23: Development of a verification methods testbed at the WRF DTC

Forecast archive

• Experimental:
– WRF runs produced to support the SPC/NSSL HWT in 2004, 2005, 2007, 2008
– 2004 and 2005 data already in hand
– used as part of the Spatial Forecast Verification Intercomparison Project (ICP)
– hourly QPF: Stage IV analyses
– additional fields (surface temps, reflectivity) to be added if feasible

Page 24: Development of a verification methods testbed at the WRF DTC

Formats

• Forecasts will be available in several standard data formats (GRIB to start with)

• Archive will be maintained by the DTC

• Software routines will be provided to read the data, along with an interpolation library

• Work with the MET verification package
– traditional scores
– some new methods currently available
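As a sketch of what the data-access side might look like, here is one way to read a QPF field with the third-party pygrib reader; the file name and field selection below are assumptions for illustration, not the archive's actual naming scheme:

```python
import pygrib  # one of several GRIB readers that could be used

# Hypothetical archive file name; the real scheme is up to the DTC.
grbs = pygrib.open("nam_2008060100_f024.grib")
# Select the 24-h accumulated precipitation message (field name assumed).
msg = grbs.select(name="Total precipitation")[0]
qpf = msg.values            # 2-D array on the model grid
lats, lons = msg.latlons()  # matching coordinate arrays
grbs.close()
```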

Page 25: Development of a verification methods testbed at the WRF DTC

Testbed

• Fits into the WRF/DTC framework

• Provides a "proving ground" for new methods

• Answers operational concerns:
– How much time does a method take to run?
– How much time/effort is required to analyze results?
– How should information be presented to users?
– How does it compare with traditional methods?
– How do results change before/after major model upgrades?

Page 26: Development of a verification methods testbed at the WRF DTC

Collaboration with users

• Subjective component

• SPC/NSSL HWT (Spring Program) has collected extensive subjective/expert ratings of experimental WRF model forecasts

• DTC facilitates transfer from research to operations

• Potential use for training

Page 27: Development of a verification methods testbed at the WRF DTC

“Show me”

• The testbed will allow researchers to demonstrate meaningful ways to apply new verification information

• Applied to current operational models
– accelerate the process of improving guidance

• Event-based errors for specific classes of phenomena

• Error scales

Page 28: Development of a verification methods testbed at the WRF DTC

NDFD-scale surface parameters

• WRF

Page 29: Development of a verification methods testbed at the WRF DTC

NDFD-scale surface parameters

• RTMA

Page 30: Development of a verification methods testbed at the WRF DTC

Possible additions

• OPeNDAP/THREDDS access

• regions beyond the U.S.

• possible WGNE QPF verification data

• ensemble forecasts

• grid-to-obs capability

Page 32: Development of a verification methods testbed at the WRF DTC

General verification framework

• Any verification method should be built upon the general framework for verification outlined by Murphy and Winkler (1987)

• New methods can be considered an extension or generalization of the original framework

• Joint distribution of forecasts and observations: p(f,o)

Page 33: Development of a verification methods testbed at the WRF DTC

general joint distribution

• p(f,o): where f and o are vectors containing all variables, matched in space and time
– o could come from data assimilation
– joint distribution difficult to analyze
– different factorizations simplify analysis
– provide information on specific aspects of forecast quality
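A minimal sketch of estimating p(f,o) and the two standard Murphy-Winkler factorizations, assuming the continuous fields are discretized into bins:

```python
import numpy as np

def joint_and_factorizations(fcst, obs, bins=10):
    """Estimate p(f,o) as a normalized 2-D histogram, then factor it:
    p(f,o) = p(o|f) p(f)  (calibration-refinement)
    p(f,o) = p(f|o) p(o)  (likelihood-base rate)
    """
    joint, _, _ = np.histogram2d(fcst.ravel(), obs.ravel(), bins=bins)
    joint /= joint.sum()        # p(f,o)
    p_f = joint.sum(axis=1)     # marginal distribution of forecasts
    p_o = joint.sum(axis=0)     # marginal distribution of observations
    # conditional distributions, guarding against empty bins
    p_o_given_f = np.divide(joint, p_f[:, None],
                            out=np.zeros_like(joint), where=p_f[:, None] > 0)
    p_f_given_o = np.divide(joint, p_o[None, :],
                            out=np.zeros_like(joint), where=p_o[None, :] > 0)
    return joint, p_f, p_o, p_o_given_f, p_f_given_o
```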

Page 34: Development of a verification methods testbed at the WRF DTC

general joint distribution

• p(G[f],H[o]): where G and H are mapping/transformation operators applied to the variable values
– morphing
– filter
– convolution
– fuzzy

• some methods perform mapping of o that is a function of f

Page 35: Development of a verification methods testbed at the WRF DTC

general joint distribution

• p(Gm[f],Hm[o]): where Gm is a specific aspect/attribute/characteristic that results from the mapping operator

• measures-oriented
– compute some error measure or score that is a function of Gm[f] and Hm[o]
– MSE

• what is the impact of these operators on the joint distribution?
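To make this concrete, one measures-oriented instance: take G = H to be a smoothing convolution (the filter/convolution case from the previous slide) and compute MSE as a function of G[f] and H[o]. The Gaussian operator here is an illustrative assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def transformed_mse(fcst, obs, sigma):
    """MSE computed on the mapped fields G[f] and H[o], with G = H chosen
    to be a Gaussian convolution of width sigma (in grid units)."""
    g_f = gaussian_filter(fcst, sigma)  # G[f]
    h_o = gaussian_filter(obs, sigma)   # H[o]
    return np.mean((g_f - h_o) ** 2)
```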

Page 36: Development of a verification methods testbed at the WRF DTC

Standardize terminology

• “feature” – a distinct or important physical object that can be identified within meteorological data

• “attribute” – a characteristic or quality of a feature, an aspect that can be measured

• “similarity” – the degree of resemblance between features

• “distance” – the degree of difference between features

• others?
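These terms map naturally onto a small data structure. A sketch, in which the attribute set and the weighting inside the distance function are purely illustrative choices:

```python
from dataclasses import dataclass
import math

@dataclass
class Feature:
    """A "feature": a distinct physical object identified in the data."""
    lat: float    # "attributes": measurable characteristics of the feature
    lon: float
    area: float

def distance(a: Feature, b: Feature) -> float:
    """"Distance": degree of difference between two features.
    The weighting of attribute differences here is arbitrary."""
    return math.hypot(a.lat - b.lat, a.lon - b.lon) + abs(a.area - b.area) / 1e4

def similarity(a: Feature, b: Feature) -> float:
    """"Similarity": degree of resemblance; here simply inverse distance."""
    return 1.0 / (1.0 + distance(a, b))
```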

Page 37: Development of a verification methods testbed at the WRF DTC

framework

• follow Murphy (1993) and Murphy and Winkler (1987) terminology

• joint distribution of forecast and observed features

• goodness: consistency, quality, value

Page 38: Development of a verification methods testbed at the WRF DTC

aspects of quality

• accuracy: correspondence between forecast and observed feature attributes– single and/or multiple?

• bias: correspondence between mean forecast and mean observed attributes

• resolution• reliability• discrimination• stratification

Page 39: Development of a verification methods testbed at the WRF DTC

Features-based process

• Identify features
[Figure: FCST and OBS fields]

Page 40: Development of a verification methods testbed at the WRF DTC

feature identification

• procedures for locating a feature within the meteorological data

• will depend on the problem/phenomena/user of interest

• a set of instructions that can (easily) be followed/programmed in order for features to be objectively identified in an automated fashion
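One simple, automatable set of instructions of this kind: threshold the field, then label connected regions. A sketch using SciPy's connected-component labeling:

```python
from scipy import ndimage

def identify_features(field, threshold):
    """Objectively identify features as connected regions above a threshold."""
    mask = field > threshold
    labels, n_features = ndimage.label(mask)  # connected-component labeling
    return labels, n_features
```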

Page 41: Development of a verification methods testbed at the WRF DTC

Features-based process

• Characterize features
[Figure: FCST and OBS fields]

Page 42: Development of a verification methods testbed at the WRF DTC

feature characterization

• a set of attributes that describe important aspects of each feature

• numerical values will be the most useful
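Continuing from the identification sketch above, attributes can be attached to each labeled feature; the particular attributes and grid spacing are illustrative:

```python
from scipy import ndimage

def characterize(field, labels, n_features, dx_km=4.0):
    """Compute numerical attributes for each labeled feature."""
    attrs = []
    for i in range(1, n_features + 1):
        mask = labels == i
        attrs.append({
            "centroid": ndimage.center_of_mass(mask),  # (row, col) in grid units
            "area_km2": mask.sum() * dx_km ** 2,       # cell count -> area
            "max_intensity": field[mask].max(),
        })
    return attrs
```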

Page 43: Development of a verification methods testbed at the WRF DTC

Features-based process

• Compare features
[Figure: FCST and OBS fields]

How to determine false alarms/missed events?
How to measure differences between objects?

Page 44: Development of a verification methods testbed at the WRF DTC

feature comparison

• similarity or distance measures

• systematic method of matching or pairing observed and forecast features

• determination of false alarms?

• determination of missed events?

Page 45: Development of a verification methods testbed at the WRF DTC

Features-based process

• Classify features
[Figure: FCST and OBS fields]

Page 46: Development of a verification methods testbed at the WRF DTC

classification

• a procedure to place similar features into groups or classes

• reduces the dimensionality of the verification problem
– similar to going from a scatter plot to a contingency table

• not necessary/may not always be used

Page 47: Development of a verification methods testbed at the WRF DTC

[Figure: SSEC MODIS archive image, 10 Apr 2003]

Page 48: Development of a verification methods testbed at the WRF DTC

feature matching

Page 49: Development of a verification methods testbed at the WRF DTC

attributes

Lake            Fcst #1   Fcst #2   Obs #1   Obs #2   Obs #3
Lat              47.7      44.0      44.8     42.2     43.7
Lon              87.5      87.0      82.4     81.2     77.9
Area (km2)      82400     58000     59600    25700    19500
Volume (km3)    12000      4900      3540      480     1640
Max depth (m)     406       281       230       64      246

Page 50: Development of a verification methods testbed at the WRF DTC

How to match observed and forecast objects?

dij = 'distance' between Fi and Oj

…for each forecast object, choose the closest observed object
If di* > dT, then false alarm

…for each observed object, choose the closest forecast object
If d*j > dT, then missed event

Objects might "match" more than once…

[Figure: schematic of forecast objects F1, F2 and observed objects O1, O2, O3, with a false alarm and missed events marked]
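A sketch of these matching rules applied to a matrix of pairwise distances:

```python
import numpy as np

def match_objects(d, d_T):
    """Match objects given d[i, j] = 'distance' between F_i and O_j.
    Returns (matches, false-alarm F indices, missed O indices)."""
    # For each forecast object, take the closest observed object;
    # if even that distance exceeds d_T, the forecast object is a false alarm.
    false_alarms = np.where(d.min(axis=1) > d_T)[0]
    # For each observed object, take the closest forecast object;
    # if even that distance exceeds d_T, the observed object was missed.
    missed = np.where(d.min(axis=0) > d_T)[0]
    # Remaining forecast objects pair with their nearest observed object
    # (note an observed object can "match" more than once).
    matches = [(i, int(d[i].argmin()))
               for i in range(d.shape[0]) if d[i].min() <= d_T]
    return matches, false_alarms, missed
```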

Page 51: Development of a verification methods testbed at the WRF DTC

Example of object verification

[Figure: ARW 2-km (CAPS) forecast and radar mosaic, with objects Fcst_1, Fcst_2, Obs_1, and Obs_2 labeled]

The object identification procedure identifies 4 forecast objects and 5 observed objects.

Page 52: Development of a verification methods testbed at the WRF DTC

Distances between objects

• Use dT = 4 as threshold

• Match objects, find false alarms, missed events

        O_34   O_37   O_50   O_77   O_79
F_25    5.84   4.16   8.94   9.03  11.53
F_27    6.35   2.54   7.18   6.32   9.25
F_52    7.43   9.11   4.15   9.19   5.45
F_81    9.39   6.35   6.36   2.77   5.24

[Figure: ARW 2-km (CAPS) forecast and radar mosaic]
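Feeding this distance table to the match_objects sketch from the previous page, with dT = 4, reproduces the matching:

```python
import numpy as np

# Rows F_25, F_27, F_52, F_81; columns O_34, O_37, O_50, O_77, O_79.
d = np.array([[5.84, 4.16, 8.94, 9.03, 11.53],
              [6.35, 2.54, 7.18, 6.32,  9.25],
              [7.43, 9.11, 4.15, 9.19,  5.45],
              [9.39, 6.35, 6.36, 2.77,  5.24]])

matches, false_alarms, missed = match_objects(d, d_T=4.0)
# matches: F_27-O_37 (2.54) and F_81-O_77 (2.77)
# false alarms: F_25 (min 4.16) and F_52 (min 4.15)
# missed events: O_34, O_50, O_79
```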

Page 53: Development of a verification methods testbed at the WRF DTC

median position errors, matching obs object given a forecast object

[Figure: median position errors for NMM4, ARW2, and ARW4; labeled values include .07, .08, .04, .22, .04, and -.07]