Topic 3.1: Effective Warning Process:
Forecast Validation
Barbara Brown (NCAR, Boulder, Colorado, USA; [email protected])
Beth Ebert (CAWCR / BOM, Melbourne, Australia; [email protected])
Working group: G. Chen, STI, China; E. Fukada, JTWC, USA; E. Gilleland, NCAR, USA;
P. Otto, BOM, Australia; A. Tyagi, Ministry of Earth Sciences, India;
Hoa Vo Van, NCHMF, Vietnam; L. Wilson, EC, Canada; Hui Yu, STI, China
International Workshop on Tropical Cyclones – 8
Jeju, South Korea
4 December 2014
Role of verification / validation
Improve forecasting models and processes (feedback into the forecast development process)
Develop understanding of prediction errors: diagnose and quantify systematic and random errors so improvements can be made to operational forecasting methodologies and NWP models
Provide uncertainty and reliability information to users of TC forecasts, so users can make better decisions
Forecasters use NWP verification information to make optimal use of NWP models and of multiple sources of guidance
Survey of methods in W. Pacific
A survey of the operational practice of TC forecast verification was carried out in 2012, covering all Members of the ESCAP/WMO Typhoon Committee, RSMC Tokyo Typhoon Center, and JTWC.
Aiming to gain an idea of:
What verification techniques and products, for which forecast elements, are currently available to TC forecasters?
What are the weak points in the verification capabilities that it would be beneficial to consider improving across the region in the future?
Yu, H., S.T. Chan, B. Brown, and Coauthors, 2012: Operational tropical cyclone forecast verification practice in the western
North Pacific region. Tropical Cyclone Res. Rev., 1(3), 361-372.
(Available from http://tcrr.typhoon.gov.cn )
Forecast elements being verified routinely at different operational centers
CMA – China; DOM – Cambodia; DMHL – Laos; JTWC – USA; MMGB – Macao; NHMSV – Vietnam; NTC/KMA – Korea; PAGASA – Philippines
Background
2013 WMO document on verification methods for TC forecasts
“Commissioned” by the World Weather Research Programme (WWRP) and the Working Group on Numerical Experimentation (WGNE)
Undertaken by WMO Joint Working Group on Forecast Verification Research (JWGFVR)
https://www.wmo.int/pages/prog/arep/wwrp/new/Forecast_Verification.html
Observations
Characteristics of observations are fundamentally important for verification activities
Sources, uncertainties, and limitations of observations and analyses need to be taken into account; uncertainties impact verification results
Lack of observations limits the ability to evaluate some forecast aspects (e.g., storm structure)
Use of observations is preferred over analyses
Variable | Suggested observations | Suggested analyses
Position of storm center | Reconnaissance flights, visible & IR satellite imagery, passive microwave imagery | Best track, IBTrACS
Intensity – maximum sustained wind | Dropwinsonde, microwave radiometer | Best track, IBTrACS, Dvorak analysis
Intensity – central pressure | Ship, buoy, synop, AWS | IBTrACS, Dvorak analysis
Storm structure | Reconnaissance flights, Doppler radar, visible & IR satellite imagery, passive microwave | H*Wind, MTCSWA, ARCHER
Storm life cycle | – | NWP model analysis
Precipitation | Rain gauge, radar, passive microwave, spaceborne radar | Blended gauge-radar, blended satellite
Wind speed over land | Synop, AWS, Doppler radar | –
Wind speed over sea | Buoy, ship reports, dropwinsondes, scatterometer, passive microwave imagers and sounders | H*Wind, MTCSWA
Storm surge | Tide gauge, GPS buoy | –
Waves – significant wave height | Buoy, ship reports, altimeter | Blended analyses
Waves – spectra | Altimeter | –
Position and intensity evaluations
Evaluation of mean track errors (including total, cross-track, and along-track components; a minimal computation sketch follows the figure credit below)
Mean and mean absolute intensity error
Computation of skill vs. a standard of comparison (e.g., a statistical model)
Examination of distributions and conditional relationships
(Figure: after J. Franklin; Cangialosi and Franklin 2013)
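The track-error decomposition listed above can be illustrated with a short computation on paired forecast and best-track positions. The following is a minimal sketch, not an operational implementation: the haversine distance is standard, but the function names and the simple flat-Earth decomposition into along- and cross-track components (using the observed storm motion as the along-track direction) are illustrative assumptions.

```python
import numpy as np

R_EARTH_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points (degrees in, km out)."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dphi = p2 - p1
    dlam = np.radians(lon2 - lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlam / 2) ** 2
    return 2 * R_EARTH_KM * np.arcsin(np.sqrt(a))

def track_errors(fcst_lat, fcst_lon, obs_lat, obs_lon, prev_obs_lat, prev_obs_lon):
    """Total, along-track, and cross-track errors (km) for one forecast position.

    The along-track direction is approximated by the observed storm motion between
    the previous and current best-track positions (local flat-Earth approximation)."""
    total = great_circle_km(obs_lat, obs_lon, fcst_lat, fcst_lon)
    # Forecast-minus-observed displacement in local km coordinates
    dx = np.radians(fcst_lon - obs_lon) * R_EARTH_KM * np.cos(np.radians(obs_lat))
    dy = np.radians(fcst_lat - obs_lat) * R_EARTH_KM
    # Unit vector of the observed storm motion
    mx = np.radians(obs_lon - prev_obs_lon) * R_EARTH_KM * np.cos(np.radians(obs_lat))
    my = np.radians(obs_lat - prev_obs_lat) * R_EARTH_KM
    norm = np.hypot(mx, my)
    ux, uy = (mx / norm, my / norm) if norm > 0 else (1.0, 0.0)
    along = dx * ux + dy * uy    # positive = forecast ahead of the observed storm
    cross = -dx * uy + dy * ux   # positive = forecast to the left of the observed motion
    return total, along, cross
```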
Rainfall and Wind
Storm-centric approaches (precipitation)
Main issue: availability of meaningful measurements / analyses; storm structure is often not evaluated due to concerns about analyses and observations
Use of “traditional” approaches: RMSE, MAE, POD, FAR (a minimal sketch follows the figure captions below)
Alternative approaches: spatial methods
Wind evaluation using QuikSCAT measurements (from Durrant and Greenslade 2011)
Storm-centric rainfall evaluation (Marchok et al. 2007)
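The “traditional” scores named above can be computed directly from matched forecast-observation pairs. A minimal sketch follows, assuming matched values on common points and a yes/no event threshold; the function names and the dictionary layout are illustrative.

```python
import numpy as np

def continuous_scores(fcst, obs):
    """Mean error (bias), MAE, and RMSE for matched forecast/observed values."""
    err = np.asarray(fcst, float) - np.asarray(obs, float)
    return {"bias": err.mean(),
            "mae": np.abs(err).mean(),
            "rmse": np.sqrt((err ** 2).mean())}

def categorical_scores(fcst, obs, threshold):
    """POD and FAR from a 2x2 contingency table for events >= threshold."""
    f = np.asarray(fcst) >= threshold
    o = np.asarray(obs) >= threshold
    hits = np.sum(f & o)
    misses = np.sum(~f & o)
    false_alarms = np.sum(f & ~o)
    pod = hits / (hits + misses) if (hits + misses) else np.nan
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else np.nan
    return {"pod": pod, "far": far}
```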
Spatial methods
Provide diagnostic information about forecast performance
Coherent structures make TCs ideal for these approaches
Require gridded observations (or an analysis); a minimal object-identification sketch follows the figure captions below
CRA method applied to Hurricane Ike eTRAP forecasts
Method for Object-based Diagnostic Evaluation (MODE) applied to TC precipitation in China (Tang et al. 2012)
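Object-based spatial methods such as MODE and CRA first identify coherent rain areas in the forecast and observed grids and then compare their attributes (location, size, intensity). The sketch below does not reproduce either method; it only illustrates the common first step under assumed gridded fields, and the threshold value and function names are placeholders.

```python
import numpy as np
from scipy import ndimage

def rain_objects(field, threshold=10.0):
    """Label contiguous areas exceeding a rain threshold and return their centroids."""
    mask = np.asarray(field) >= threshold
    labels, n = ndimage.label(mask)
    centroids = ndimage.center_of_mass(mask, labels, range(1, n + 1))
    return labels, centroids

def centroid_displacement(fcst_field, obs_field, threshold=10.0):
    """Displacement (grid units) between the largest forecast and observed rain objects."""
    def largest_centroid(field):
        labels, cents = rain_objects(field, threshold)
        if not cents:
            return None
        sizes = ndimage.sum(labels > 0, labels, range(1, labels.max() + 1))
        return cents[int(np.argmax(sizes))]
    cf = largest_centroid(fcst_field)
    co = largest_centroid(obs_field)
    if cf is None or co is None:
        return None
    return np.hypot(cf[0] - co[0], cf[1] - co[1])
```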
Probability and ensemble methods
Going beyond the ensemble mean
Percentage of best tracks falling within 67% probability circles from the ECMWF ensemble (Majumdar and Finocchio 2010)
Strike probability forecasts: reliability and POD vs. FAR (from van der Grijn 2005)
Ranked probability skill scores (TIGGE; Yu 2011)
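The reliability and probabilistic skill statistics referenced above can be computed from forecast probabilities paired with binary outcomes (e.g., strike / no strike). A minimal sketch, assuming probabilities in [0, 1] and 0/1 outcomes; the bin count and names are illustrative.

```python
import numpy as np

def brier_score(prob, outcome):
    """Mean squared difference between forecast probability and binary outcome."""
    prob, outcome = np.asarray(prob, float), np.asarray(outcome, float)
    return np.mean((prob - outcome) ** 2)

def reliability_table(prob, outcome, n_bins=10):
    """Observed relative frequency and sample count for each forecast-probability bin."""
    prob, outcome = np.asarray(prob, float), np.asarray(outcome, float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Include 1.0 in the last bin
        in_bin = (prob >= lo) & ((prob < hi) if hi < 1.0 else (prob <= hi))
        n = int(in_bin.sum())
        rows.append({
            "bin": (lo, hi),
            "n": n,
            "mean_prob": prob[in_bin].mean() if n else np.nan,
            "obs_freq": outcome[in_bin].mean() if n else np.nan,
        })
    return rows
```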
Genesis
Contingency table statistics for non-probabilistic forecasts (POD, FAR, etc.)
Probabilistic verification approaches for probabilistic forecasts (reliability, ROC, Brier score, etc.)
(A minimal ROC sketch follows the figure credit below)
(Figure: Halperin et al. 2013, Weather and Forecasting)
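For probabilistic genesis forecasts, a ROC curve can be built by sweeping a probability threshold and converting each threshold into a 2x2 contingency table. A minimal sketch under those assumptions; the threshold set and function names are illustrative.

```python
import numpy as np

def roc_points(prob, outcome, thresholds=None):
    """Hit rate (POD) and false alarm rate (POFD) for a range of probability thresholds."""
    prob, outcome = np.asarray(prob, float), np.asarray(outcome, bool)
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 11)
    points = []
    for t in thresholds:
        warn = prob >= t
        hits = np.sum(warn & outcome)
        misses = np.sum(~warn & outcome)
        false_alarms = np.sum(warn & ~outcome)
        correct_negatives = np.sum(~warn & ~outcome)
        pod = hits / (hits + misses) if (hits + misses) else np.nan
        pofd = (false_alarms / (false_alarms + correct_negatives)
                if (false_alarms + correct_negatives) else np.nan)
        points.append((pofd, pod))
    return points

def roc_area(points):
    """Trapezoidal area under the ROC curve (points sorted by false alarm rate)."""
    pofd, pod = zip(*sorted(points))
    return float(np.trapz(pod, pofd))
```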
Waves and surge
Limited evaluations
Basic approaches: contingency table, continuous, probabilistic
(A minimal lead-time bias sketch follows the figure caption below)
Figure: Predicted peak wave height (Hs) bias as a function of time lag for Atlantic TCs in 2005. Center lines represent the means; outer lines show the standard deviation. Asterisks show individual cases; solid symbols show mean values at individual buoys. (From Chao and Tolman 2010)
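A lead-time-dependent bias such as the one in the figure above can be computed by grouping forecast-minus-observed errors by lead time. A minimal sketch, assuming the matched pairs are held in a pandas DataFrame; the column names are illustrative.

```python
import pandas as pd

def bias_by_lead_time(df):
    """Mean bias, standard deviation, and sample size of (forecast - observed)
    peak wave height, grouped by forecast lead time.

    Assumes columns 'lead_hours', 'hs_forecast', and 'hs_observed'
    (names are illustrative, not from any particular dataset)."""
    err = df["hs_forecast"] - df["hs_observed"]
    return err.groupby(df["lead_hours"]).agg(["mean", "std", "count"])
```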
Seasonal predictions
Focus: cyclone counts
May be probabilistic or ensemble
Methods: simple statistics (MAE, RMSE, bias) and probabilistic measures (reliability, etc.)
(A minimal count-verification sketch follows the figure caption below)
Figure: Example evaluation of ECMWF seasonal TC frequencies from 1987-2001. The forecast being evaluated is the mean count based on the average of ensemble means from various ensemble systems. (Vitart 2006)
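Evaluations of seasonal TC counts like the one above typically reduce to a handful of simple statistics on predicted versus observed counts per season. A minimal sketch with illustrative names; the linear correlation is included alongside the bias/MAE/RMSE named on the slide.

```python
import numpy as np

def seasonal_count_scores(predicted_counts, observed_counts):
    """Bias, MAE, RMSE, and linear correlation for seasonal TC counts (one value per season)."""
    p = np.asarray(predicted_counts, float)
    o = np.asarray(observed_counts, float)
    err = p - o
    return {
        "bias": err.mean(),
        "mae": np.abs(err).mean(),
        "rmse": np.sqrt((err ** 2).mean()),
        "correlation": float(np.corrcoef(p, o)[0, 1]),
    }
```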
Warnings
Systematic evaluation is limited
Focus is typically on case-by-case subjective evaluations
Evaluation of landfall position and timing is more common
Damage assessments are done on a case-by-case basis
RSMC New Delhi official landfall forecast errors (km).
Software Tools
Model Evaluation Tools (MET) and MET-TC (http://www.dtcenter.org/met/users/)
R verification package (http://www.r-project.org/)
Ensemble Verification System (EVS; http://amazon.nws.noaa.gov/ohd/evs/evs.htm)
Example plot from MET-TC
Guidance on verification practice
Topic | Guidance
Full description of evaluation parameters | Provide all relevant information regarding the verification: model information and post-processing methods; grid domain and scale; time period and lead times; verification data source and characteristics (e.g., uncertainty if known); sample sizes
Reference comparison | Utilize and report on a meaningful standard of comparison (e.g., another forecasting system, persistence forecast, climatological value)
Confidence in verification results | When possible, uncertainty in verification results should be represented using statistical confidence intervals and hypothesis tests, and box plots or other methods to represent distributions of errors or other statistics (see the sketch after this table)
Insight through stratifying data | Stratify results to aid understanding and provide forecasters with additional insights (e.g., time of year, basin, storm speed, track characteristics, ensemble spread); maintain a large enough sample size to ensure meaningful results
Relevant metrics | Verification measures reported should be selected to be relevant for the particular users of the information (i.e., they should answer specific questions about forecast performance of interest)
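One common way to produce the confidence intervals recommended in the "Confidence in verification results" row is percentile bootstrap resampling of the per-case errors. A minimal sketch, assuming independent cases; the statistic, resample count, and confidence level are illustrative choices.

```python
import numpy as np

def bootstrap_ci(errors, statistic=np.mean, n_resamples=1000, ci=0.95, seed=0):
    """Percentile bootstrap confidence interval for a verification statistic
    (e.g., mean track error) computed from a sample of per-case errors."""
    rng = np.random.default_rng(seed)
    errors = np.asarray(errors, float)
    stats = np.array([
        statistic(rng.choice(errors, size=errors.size, replace=True))
        for _ in range(n_resamples)
    ])
    alpha = (1.0 - ci) / 2.0
    return np.quantile(stats, alpha), np.quantile(stats, 1.0 - alpha)
```

Note that a simple resampling scheme like this treats cases as independent; serially correlated forecast errors (common along a single storm track) would call for a block bootstrap or similar adjustment.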
Gaps
Consistency in best track analyses
Has a profound impact on verification and the ability to intercompare forecasting systems
Need for better observations
Gridded wind fields
Observation uncertainty information
Incorporation of observation uncertainty into verification studies/analyses
Limited evaluation of ensemble predictions
Going beyond the mean (i.e., probabilistic treatment)
Recommendations: Observations
WMO should continue to encourage and facilitate greater sharing of relevant observational data for verification of TC forecasts
Best track datasets should be shared as soon as possible to facilitate inter-comparisons between international models and forecasts.
Improvement of observations for verification of landfall location, timing, and weather hazards.
Better documentation of observations of sustained wind and wind gusts, including metadata, assumptions, and (ideally) estimates of the uncertainties
Improvement of estimates of TC structure from remotely sensed and conventional data
Recommendations: Improvement of verification approaches
WMO should continue to promote good general verification practices and specific methodologies for TC verification. Mechanisms: relevant publications, online resources, workshops, and in-country training.
NMHSs and agencies should focus additional attention on verifying extremes of weather, e.g., heavy precipitation, strong wind, storm surge, and dangerous waves. Specific, relevant measures should be used to evaluate these forecasts.
Researchers should continue to develop and improve methodologies for verifying forecast aspects of tropical cyclone formation, structure, evolution, and motion
Recommendations: Coordination
Specify a basic set of TC verification metrics for use by NMHSs and partner agencies. Relevant groups to involve:
Working Group on Tropical Meteorology Research in coordination with the Tropical Cyclone Programme,
Joint Working Group on Forecast Verification Research
Public Weather Services Programme, and other relevant groups identified by the WMO.
Consider establishment of a Lead Center for Tropical Cyclone Verification to ensure consistent and timely verification of forecasts from NWP models for all basins in which TCs occur, similar to the existing lead centers for deterministic, ensemble, and long-range verification.