national weather service river forecast verification
Post on 14-Jan-2016
National Weather Service River Forecast Verification
Peter Gabrielsen
July 2006 HIC Meeting
July 10, 2006
Background
As a result of the 2005 NOAA Audit Plan: “The Assistant Administrator for Weather Services should develop, document and implement a timeline and action plan for completing a comprehensive river forecast verification program as soon as practicable”
In 1996 the NRC stated that the verification of hydrologic forecasts is inadequate
Background (cont.)
Research has shown:
Little is known about the skill of hydrologic forecasts
Forecasts depend upon imperfect mathematical descriptions governing runoff and routing
Hydrologic forecasts depend on meteorological forecasts; therefore, they include the uncertainty of meteorological forecasts
Verification leads to improved forecast skill
Background (cont.)
Team was chartered November 2005
Representatives from five NWS regions and OHD
Expert input from trusted scientists is being used: OHD, OCWWS, Universities, RFCs
Team Charter
Vision: Provide easy access to enhanced river forecast verification data which will be used to improve our scientific and operational techniques and services.
Mission: Assess forecaster, program management and user needs for verification data. Inventory current national and regional verification practices and identify unmet needs. Establish requirements for a comprehensive national system to verify hydrologic forecasts and guidance products which satisfy these needs. This system should identify sources of error and skill in the forecasts across the entire forecast process.
Charter (cont.)
Success Criteria/Deliverables: Deliver an NWS river forecast verification plan which measures skill and error in the forecast process. The plan includes a conceptualized solution and a definition of operational requirements.
Charter (cont.)
Team Membership:
Julie Demargne (OHD)
Peter Gabrielsen (ER)
Bill Lawrence (SR)
Scott Lindsey (AR)
Mary Mullusky (OCWWS)
Noreen Schwein (CR)
Scott Staggs (WR)
Kevin Werner (WR)
Tom Adams (ER)
William Marosi (NWSEO)
Verification System
Prior to proposing verification standards, the hydrologic forecast process must be described
Data Processing and Quality Control
Observed / Forecast: Precipitation / Precipitation, Temperature / Temperature, Stage / Stage, Flow / Flow, Snow depth / Freezing level, Dewpoint, Wind speed, Sky cover, Freezing level, Snow water equivalent, Potential evaporation
Short-term Deterministic Forecast
Short-term Probabilistic Forecast (Ensemble)
Long-term Statistical Forecast (Water Supply)
Long-term Probabilistic Forecast (Ensemble)
Postprocessor
Model States
Hydrologic and Hydraulic Models
Rainfall/Runoff, Snow accumulation and ablation, Unit Graph, Consumptive Use, Routing, Dynamic Routing, Rating Curves, Reservoir, Statistical Water Supply (SWS)
Final Product Issuance comparison to action-required stage; appropriate action pursued
Model Calibration
Historical data analysis, Parameter Estimation, Parameter Calibration, Operational Implementation
Data Assimilation
Forecaster Analysis Review
Quality Control, Run-time mods, Reality Check
RFC Hydrologic Forecasting Process
Model Parameters
Raw Model Hydrologic Forecast
Contribution of RFC staff correcting bad data
Contribution of hydrologic forecaster through runtime-mods
Contribution of HAS function
Contribution of forecast processing enhancements
Input errors and model errors (parameters, model states, model structure)
Data QC to correct input errors
Perfect Hydrologic Forecast
Runtime-mods to correct input and model errors (parameters, model states)
Adjustments of observed and forecast data (QPF/MPE, MAT, etc.)
Enhanced calibration to correct model parameter errors
Enhanced data assimilation process to correct initial model states errors
Enhanced/new input data to correct input and model errors
Experimental / Operational Hydrologic Forecast
Corrections of all input, model, and forecaster analysis errors
Enhanced/new hydrologic/hydraulic model to correct model deficiencies
Enhanced post-processor to correct output forecast errors
Operational Hydrologic Forecast
Role and Setup of the Verification System
Purpose
Monitor forecast quality over time
Monitor quality at various steps in the forecast process
Improve forecast quality
Assist prioritization of forecast system enhancements
Uses of Verification Results
Verification System
Describe forecast performance
Past and recent
Operational
Control (or baseline)
Experimental
Specific time periods
Verification
Model Setup: Calibration, Operation Installation
State Updating: Data Quality Control, Runtime Simulations, Data Assimilation
Forecast Computation: Hydrologic and Hydraulic Models, Postprocessor
Product Review and Issuance: Forecaster Analysis and Quality Control
An effective verification process must quantify the characteristics of the forecast system and offer a means to analyze why forecasts behave the way they do at various steps in the forecast process
Uses of Verification Results
Customers
Hydrologic program managers
Emergency managers
Scientists/Researchers
Hydrologic forecasters
Everyday customers
Use Modes
Operational
Experimental/Research
Verification System Components
Administrative: describe the efficiency
Logistical aspects: type, quantity, duration and frequency
Forecast skill
Scientific: describe the reliability
Forecast skill
Forecast system error analysis
National Baseline Verification System
Logistical
characterizing point forecasts by service type, frequency and location;
characterizing areal forecasts by service type, frequency and location;
identifying daily the number of issued forecasts by type and location;
quantifying the person effort required to set up a basin for forecasting, including data gathering, calibration, model setup and implementation efforts;
quantifying the person effort required to issue each type of forecast, including manual quality control of input data, forecaster run-time modifications and forecaster review and analysis;
quantifying the timeliness of issued forecasts
Categories of Verification Metrics
Categorical: statistics related to a predefined threshold or range of values (e.g., above flood stage, minor).
Error: statistics that measure various differences between forecast and observed values (including timing errors).
Correlation: statistics that measure the correspondence between ordered pairs (e.g., crest forecasts vs. QPF, forecast and observed stages).
Distribution Properties: statistics that summarize the characteristics of a set of values.
Categories of Verification Metrics
Skill Scores: statistics that measure accuracy relative to a standard set of reference or control forecasts.
Conditional Statistics: metrics computed based on the occurrence of a particular event or events such as a specific range of observations or forecasts.
Statistical Significance: measures the uncertainty of the computed values of verification metrics.
Verification Systems
National Baseline Verification System
Administrative in nature
Logistical measures
Skill measures
Comprehensive Verification System
Administrative
Scientific
Verification System Requirements
Selection of forecasts to be verified
time attributes (days, months, seasons, years, as well as lead time)
service attributes (national, regional, RFCs, groups, locations)
individual forecaster within guidelines agreed to by the NWS and the NWSEO
basin attributes (response time, size, slope, aspect, elevation, snow, non-snow)
forecast or observed events (crest timing, rising and falling hydrographs)
Verification System Requirements
Archiving
Time attributes (days, months, years, seasons)
Service attributes (national, regional, RFCs, forecaster, groups, locations)
Basin attributes (response time, size, slope, aspect, elevation)
Hindcasting
Different QPFs (e.g., Perfect QPF, zero, actual, persistence)
Different FMATs (e.g., Perfect FMAT, actual, persistence)
Different freezing levels
Different MAPEs
Different reservoir forecasts
Different QPEs (e.g., point based MAP, MAPX, Q2)
Different sets of model parameters
Different models, including the post-processing and state updating models
Additional recommendations
OHD should assign a program manager for verification.
Establish formal verification focal points at each RFC.
Create national river forecast performance goals. This should be accomplished once the software has been fielded and some experience gained with the metrics.
Ensure adequate hydrologic verification training, and use of the system, is captured in OSIP documentation.
Publish findings in peer-reviewed journals (e.g., BAMS, EOS) to inform the research community of our plans.
Ensure an end-to-end assessment and verification of the elements in the hydrologic forecasting process that are outside of the control of the RFC forecaster or produced by other agencies.
Additional recommendations
OHD needs to establish a team to define the raw model to enable the users to assess the impact of various steps (e.g., calibration, quality control, run-time modifications) on the forecast performance.
Archive of necessary data to support verification software should begin within 30 days of the data being defined.
Ensure continuity with other activities that support this verification plan.
Brief the National Performance Management Committee (NPMC) and ensure incorporation of the RFC hydrologic verification requirements.
Background Information
National Baseline Verification System Metrics
Categorical
Deterministic: POD, FAR, LTD
Probabilistic: Brier Score, Ranked Probability Score
Error (Accuracy)
Deterministic: RMSE, MAE, ME, Bias
Correlation
Deterministic: Pearson Correlation Coefficient
National Baseline Verification System Metrics
Skill Score
Deterministic: RMSE Skill Score
Probabilistic: Ranked Probability Skill Score, Brier Skill Score
Confidence
Deterministic: Sample size
Probabilistic: Sample size
Probabilistic forecasts should also be verified as deterministic forecasts using the mean or some predetermined exceedance level
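Reducing an ensemble to a single value, so that it can be scored with the deterministic metrics, might look like the sketch below. It is a minimal illustration, not the team's method: the function names and member values are hypothetical, and the nearest-rank rule used for the exceedance level is one of several reasonable conventions.

```python
import math

def ensemble_mean(members):
    """Single-valued forecast taken as the average of the ensemble members."""
    return sum(members) / len(members)

def value_at_exceedance(members, exceedance_probability):
    """Value exceeded or equalled by about the given fraction of members.

    Uses a nearest-rank rule on the members sorted from highest to lowest;
    this convention is an assumption, not something the slides specify.
    """
    ranked = sorted(members, reverse=True)
    rank = math.ceil(exceedance_probability * len(ranked))
    index = min(len(ranked) - 1, max(0, rank - 1))
    return ranked[index]

# Hypothetical ensemble of forecast stages (feet) for one location and lead time.
members = [3.1, 2.8, 4.0, 3.5, 2.9]

mean_forecast = ensemble_mean(members)
median_forecast = value_at_exceedance(members, 0.5)
high_forecast = value_at_exceedance(members, 0.1)
```

Either reduced value can then be paired with the observation and passed to RMSE, MAE, ME or the categorical scores.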
Verification System Requirements
Analysis of skill and error sources
Impact of input data errors
Impact of model errors
Impact of forecast analysis
Computation of verification metrics and results presentation
Dissemination and training
CATEGORIES / DETERMINISTIC FORECAST VERIFICATION METRICS / PROBABILISTIC FORECAST VERIFICATION METRICS
1. Categorical
Deterministic: Probability Of Detection (POD), False Alarm Ratio (FAR), Critical Success Index (CSI), Lead Time of Detection (LTD), Peirce Skill Score (PSS), Gerrity Score (GS)
Probabilistic: Brier Score (BS), Ranked Probability Score (RPS)
2. Error
Deterministic: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Error (ME), Bias (%), Linear Error in Probability Space (LEPS)
Probabilistic: Continuous RPS
3. Correlation
Deterministic: Pearson Correlation Coefficient, Ranked correlation coefficient, scatter plots
Probabilistic: (none listed)
4. Distribution Properties
Deterministic: Mean, variance, higher moments
Probabilistic: Wilcoxon rank sum test, variance of forecasts, variance of observations, ensemble spread, Talagrand Diagram (or Rank Histogram)
Verification metric categories and metrics for deterministic and probabilistic forecasts
CATEGORIES / DETERMINISTIC FORECAST VERIFICATION METRICS / PROBABILISTIC FORECAST VERIFICATION METRICS
5. Skill Score
Deterministic: Root Mean Squared Error Skill Score (SS-RMSE) (with reference to persistence, climatology, lagged persistence), Wilson Score (WS), Linear Error in Probability Space Skill Score (SS-LEPS)
Probabilistic: Ranked Probability Skill Score, Brier Skill Score (with reference to persistence, climatology, lagged persistence)
6. Conditional Statistics
Deterministic: Relative Operating Characteristic (ROC) and ROC Area, reliability measures, discrimination diagram, other discrimination measures
Probabilistic: ROC and ROC Area, other resolution measures, Reliability diagram, discrimination diagram, other discrimination measures
7. Confidence
Deterministic: Sample size, Confidence Interval (CI)
Probabilistic: Ensemble size, sample size, Confidence Interval (CI)
Verification metric categories and metrics for deterministic and probabilistic forecasts
Definition of Metrics for the National Baseline Verification System
Probability of Detection (POD): Percentage of (categorical) events forecast correctly.
False Alarm Ratio (FAR): Percentage of (categorical) forecast events that did not verify.
Lead Time of Detection (LTD): The average lead time of all forecasts that fall into the correct observed category.
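The three categorical definitions above can be sketched for a single yes/no event such as "stage above flood stage". This is an illustration under stated assumptions: the function names and sample data are hypothetical, scores are returned as fractions (multiply by 100 for percentages), and LTD is averaged over hits only, which is one reading of "forecasts that fall into the correct observed category".

```python
def contingency_counts(forecast_events, observed_events):
    """Count hits, misses and false alarms from paired yes/no events."""
    hits = misses = false_alarms = 0
    for forecast, observed in zip(forecast_events, observed_events):
        if forecast and observed:
            hits += 1            # event forecast and observed
        elif observed:
            misses += 1          # event observed but not forecast
        elif forecast:
            false_alarms += 1    # event forecast but not observed
    return hits, misses, false_alarms

def probability_of_detection(hits, misses):
    """Fraction of observed events that were forecast."""
    return hits / (hits + misses)

def false_alarm_ratio(hits, false_alarms):
    """Fraction of forecast events that did not verify."""
    return false_alarms / (hits + false_alarms)

def lead_time_of_detection(lead_hours, forecast_events, observed_events):
    """Average lead time over the hits (assumed interpretation of LTD)."""
    hit_leads = [lead for lead, forecast, observed
                 in zip(lead_hours, forecast_events, observed_events)
                 if forecast and observed]
    return sum(hit_leads) / len(hit_leads)

# Hypothetical flood/no-flood pairs with forecast lead times in hours.
forecast_flood = [True, True, False, True, False]
observed_flood = [True, False, True, True, False]
lead_hours = [12, 6, 24, 18, 6]

hits, misses, false_alarms = contingency_counts(forecast_flood, observed_flood)
pod = probability_of_detection(hits, misses)
far = false_alarm_ratio(hits, false_alarms)
ltd = lead_time_of_detection(lead_hours, forecast_flood, observed_flood)
```

With no observed events or no forecast events the denominators are zero, so a production version would need to report the score as undefined for that sample.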
Definition of Metrics for the National Baseline Verification System
Root Mean Square Error (RMSE): The square root of the average of the squared differences between forecasts and observations.
Mean Absolute Error (MAE): The average of the absolute value of the differences between forecasts and observations.
Mean Error (ME): The average difference between forecasts and observations.
Bias (%): The ME expressed as a percentage of the mean observation.
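The four error definitions above translate directly into a few lines of code. The sketch below assumes errors are taken as forecast minus observation; the function name and the sample values are hypothetical.

```python
import math

def error_metrics(forecasts, observations):
    """RMSE, MAE, ME and percent bias for paired forecasts and observations."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    count = len(errors)
    mean_error = sum(errors) / count
    mean_absolute_error = sum(abs(e) for e in errors) / count
    root_mean_square_error = math.sqrt(sum(e * e for e in errors) / count)
    mean_observation = sum(observations) / count
    # Bias (%) is the mean error as a percentage of the mean observation.
    bias_percent = 100.0 * mean_error / mean_observation
    return {
        "RMSE": root_mean_square_error,
        "MAE": mean_absolute_error,
        "ME": mean_error,
        "Bias (%)": bias_percent,
    }

# Hypothetical forecast and observed stages (feet).
metrics = error_metrics([10.0, 12.0, 14.0, 16.0], [11.0, 12.0, 12.0, 15.0])
```

ME and Bias keep the sign of the error, so positive and negative errors can cancel; RMSE and MAE do not cancel, and RMSE weights large errors more heavily.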
Definition of Metrics for the National Baseline Verification System
Brier Skill Score (BSS): A skill score based on BS values. The recommended reference forecasts are persistence and climatology.
Ranked Probability Skill Score (RPSS): A skill score based on RPS values. The recommended reference forecasts are persistence and climatology.
Sample Size: An enumeration of the forecasts involved in the calculation of a metric, appropriate to the type of forecast (e.g., categorical forecasts should enumerate forecasts and observations by category).
Definition of Metrics for the National Baseline Verification System
Brier Score (BS): The mean squared error of probabilistic two-category forecasts where the observations are either 0 (no occurrence) or 1 (occurrence) and forecast probability may be arbitrarily distributed between occurrence and non-occurrence.
Ranked Probability Score (RPS): The mean squared error of probabilistic multi-category forecasts where observations are 1 (occurrence) for the observed category and 0 for all other categories and forecast probability may be arbitrarily distributed between all categories.
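A short sketch may make the two probabilistic scores concrete. The RPS below follows the commonly used form that compares cumulative forecast and observed probabilities across ordered categories and is not normalized by the number of categories; the slides do not spell out either choice, so both are assumptions, as are the function names and sample values.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    squared = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)

def ranked_probability_score(category_probabilities, observed_category):
    """RPS for one forecast over ordered categories (cumulative form).

    `observed_category` is the index of the category that occurred.
    """
    cumulative_forecast = 0.0
    cumulative_observed = 0.0
    score = 0.0
    for index, probability in enumerate(category_probabilities):
        cumulative_forecast += probability
        if index == observed_category:
            cumulative_observed = 1.0
        score += (cumulative_forecast - cumulative_observed) ** 2
    return score

# Hypothetical probabilities of exceeding flood stage, with 0/1 outcomes.
bs = brier_score([0.9, 0.2, 0.6], [1, 0, 0])

# Hypothetical probabilities for three ordered stage categories;
# the middle category (index 1) was observed.
rps = ranked_probability_score([0.2, 0.5, 0.3], 1)
```

Both scores are 0 for a perfect forecast and grow as probability is placed further from what occurred; averaging the per-forecast RPS over many forecasts gives the sample RPS.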
Definition of Metrics for the National Baseline Verification System
Correlation Coefficient: A measure of the linear association between forecasts and observations.
Skill Score: In general, skill scores are the percentage difference between verification scores for two sets of forecasts (e.g., operational forecasts and climatology).
Root Mean Squared Error Skill Score (SS-RMSE): A skill score based on RMSE values. The recommended reference forecasts are persistence and climatology.
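As an illustration of the last three definitions, the sketch below computes a Pearson correlation and an RMSE skill score against a persistence reference. It assumes the usual form for scores where 0 is perfect, skill = 1 - score / reference score (multiply by 100 for a percentage); the slides do not give the formula, and the function names, data and persistence values are hypothetical.

```python
import math

def rmse(forecasts, observations):
    """Root mean square error of paired forecasts and observations."""
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return math.sqrt(sum(squared) / len(squared))

def skill_score(score, reference_score):
    """Fractional improvement over a reference; assumes 0 is a perfect score."""
    return 1.0 - score / reference_score

def pearson_correlation(x, y):
    """Linear association between two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

# Hypothetical stages (feet). The persistence reference repeats the stage
# that was observed when each forecast was issued.
observed = [2.0, 3.0, 5.0, 4.0]
operational = [2.5, 3.0, 4.0, 4.5]
persistence = [2.0, 2.0, 3.0, 5.0]

ss_rmse = skill_score(rmse(operational, observed), rmse(persistence, observed))
correlation = pearson_correlation(operational, observed)
```

A skill score of 1 means a perfect forecast, 0 means no improvement over the reference, and negative values mean the reference was better. The same `skill_score` function applies to BS and RPS values to give BSS and RPSS.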